As a math guy who loves reality tv, I was also drawn to the show and wrote a blog post [0] about how to programmatically calculate the probabilities as the show progresses. It was a lot of fun optimizing it to be performant. You can `pip install ayto` to use it to follow along with the show or try out scenarios.
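Roughly, the core idea is brute force over the space of matchings: enumerate every possible pairing, throw out the ones that contradict the truth booths and beam counts, and read probabilities off what's left. Here's a toy sketch with 4 couples (not the package's actual API, and the evidence is made up):

```python
from itertools import permutations
from fractions import Fraction

N = 4  # toy season; the real show has 10-11 couples
candidates = list(permutations(range(N)))  # candidate[i] = woman matched to man i

# hypothetical evidence: truth booth said (man 0, woman 2) is a "no",
# and seating man i with woman seating[i] at a ceremony earned exactly 1 beam
seating = (1, 2, 3, 0)
candidates = [c for c in candidates if c[0] != 2]
candidates = [c for c in candidates
              if sum(w == s for w, s in zip(c, seating)) == 1]

def prob(man, woman):
    """P(man is matched with woman | evidence so far)."""
    return Fraction(sum(c[man] == woman for c in candidates), len(candidates))

print(prob(0, 1), len(candidates))
```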
The linked post is a very thorough treatment of AYTO and a great read. I really like the "guess who" bit on how to maximize the value of guesses. It's a shame the participants aren't allowed to have pen and paper—it makes optimization a lot trickier! I'm impressed they do as well as they do.
Thanks! Optimization was something I'd played with in previous rounds of coding up AYTO simulations, but not in the most recent version. (See the bottom section of this notebook [0]). There's also a very thorough treatment of the problem in a blog post from 2018 by SAS (the software company) [1]. It's surprising how many people have been drawn in by the allure of AYTO!
And sometimes they just don't do better, as a plot point: staying together an extra week after finding out they're not the one, because of the intensity of their love (they met four days before).
Giving them more credit than they probably deserve but: when you're solving "by hand" like they are in the show, keeping a known non-match couple together may actually be helpful for interpreting the results of a matchup ceremony because you'll know that that couple didn't contribute to the beams.
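A toy illustration of why: if the known non-match couple sits together at the ceremony, they can't account for any of the beams, so every beam you see must come from the remaining couples, which cuts down the hypothesis space much harder. (4 couples, made-up numbers:)

```python
from itertools import permutations

hypotheses = list(permutations(range(4)))          # all possible matchings
hypotheses = [p for p in hypotheses if p[0] != 0]  # truth booth: (0, 0) is a no

def beams(truth, seating):
    return sum(t == s for t, s in zip(truth, seating))

# ceremony: seat the known non-match (0, 0) together anyway, observe 2 beams
seating = (0, 1, 2, 3)
consistent = [p for p in hypotheses if beams(p, seating) == 2]

# since couple (0, 0) can't have produced a beam, both beams must come from
# couples 1-3, which pins down the remaining possibilities much more tightly
print(len(hypotheses), "->", len(consistent))
```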
I've been building a little toy computer and assembly language that's interpreted in Python. It's pretty close to the first release (and an introductory blog post), and it's been a lot of fun to build (and a chance to learn a bit more about real assembly as I go).
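For anyone curious what "interpreted in Python" means in practice, the heart of something like this is just a fetch-decode-execute loop. This isn't my project's actual instruction set, just the general shape:

```python
# a minimal fetch-decode-execute loop for a made-up 3-instruction machine
def run(program):
    regs = {"A": 0, "B": 0}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "LOAD":       # LOAD reg, immediate
            regs[args[0]] = args[1]
        elif op == "ADD":      # ADD dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "PRINT":    # PRINT reg
            print(regs[args[0]])
        pc += 1

run([("LOAD", "A", 2), ("LOAD", "B", 3), ("ADD", "A", "B"), ("PRINT", "A")])
```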
I gave a very short talk (now a blog post) about embeddings and how we use them to bridge the gap between human notions of understanding and digital representations. It might be of interest to people who enjoyed this post: https://danturkel.com/2025/03/10/ignite-machine-understandin...
> I doubt that any recommendation system is capable of providing meaningful results in absence of the "awareness" about the actual content (be it music, books, movies or anything else) of what it's meant to recommend.
Years of experience have proven that you can get quite far with pure collaborative filtering—no user features, no content features. It's a very hard baseline to beat. A similar principle applies to language modeling: from word2vec to transformers, language models never rely on any additional information about what a token "means," only how the tokens relate to each other.
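To make "pure collaborative filtering" concrete, here's a toy matrix factorization that sees nothing but the interaction matrix itself (no user attributes, no item content) and still fills in the blanks; the ratings are made up:

```python
import numpy as np

# toy user-item rating matrix (0 = unobserved); no user or item features anywhere
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 0, 5, 4]], dtype=float)
observed = R > 0

k, lr, reg = 2, 0.01, 0.05
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(R.shape[0], k))  # latent user factors
V = rng.normal(scale=0.1, size=(R.shape[1], k))  # latent item factors

for _ in range(5000):  # plain gradient descent on the observed cells only
    err = observed * (R - U @ V.T)
    U, V = U + lr * (err @ V - reg * U), V + lr * (err.T @ U - reg * V)

print(np.round(U @ V.T, 1))  # predictions, including the previously empty cells
```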
I recently had a letter published in The New Yorker in response to Andrew Marantz's (excellent) story "Among the A.I. Doomsayers." In particular, I wanted to highlight that extremist doomers and accelerationists are not the only ones concerned about AI's future. In the post, I elaborate a little on the letter and provide links to further reading.
Hey Francois, congrats to you and the team on the launch! I've generally chosen PyTorch over TensorFlow for my day to day, but now that Keras is framework agnostic I'm excited to revisit it.
One thing I'm wondering about is whether it's possible (or necessary?) to use Keras in concert with PyTorch Lightning. In some ways, Lightning evolved to be "Keras for PyTorch," so what is the path forward in a world where both exist as options for PyTorch users—do they interoperate or are they competitors/alternatives to each other?
Both Keras models/layers (with the PyTorch backend) and Lightning Modules are PyTorch Modules, so they should be able to interoperate with each other in a PyTorch workflow. We have not tried this with Lightning, but we've had a good experience with custom PyTorch Modules.
More broadly, it's feasible to use Keras components with any framework built on PyTorch or JAX in the sense that it's always possible to write "adapter layers" that wrap a Keras layer and make it usable by another framework, or the other way around. We have folks doing this to use Flax components (from JAX) as Keras layers, and inversely, to use Keras layers as Flax Modules.
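A minimal sketch of the PyTorch direction (untested as written; it assumes the torch backend, under which Keras layers are torch.nn.Module instances, and since a LightningModule is also just a torch Module the same composition should carry over there):

```python
import os
os.environ["KERAS_BACKEND"] = "torch"  # must be set before importing keras

import keras
import torch

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # with the torch backend, this Keras layer is itself a torch.nn.Module
        self.dense = keras.layers.Dense(16, activation="relu")
        self.head = torch.nn.Linear(16, 1)  # plain torch layer alongside it

    def forward(self, x):
        return self.head(self.dense(x))

net = Net()
print(net(torch.randn(8, 4)).shape)  # torch.Size([8, 1])
```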
The biggest immediate useful difference that I see is that Annoy uses read-only index files (from the docs: "you can not add more items once the tree has been created" [0]), while Voyager allows you to call `.add_item` at any time (I just pip installed to double check and yes -- it's great).
The great thing about Annoy is that you can write the index to disk and thus do big data work on tiny workers at the edge. I've never really seen anything else do the same.
Oh yeah, Annoy is definitely mmap'ed i.e. you can use the index without loading the index file into memory. And that's very useful.
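For reference, the build-once / mmap-on-load workflow looks roughly like this (following the pattern in Annoy's README; sizes and data are made up):

```python
from annoy import AnnoyIndex
import random

dim = 40

# build once (e.g. on a big machine)
index = AnnoyIndex(dim, "angular")
for i in range(1000):
    index.add_item(i, [random.gauss(0, 1) for _ in range(dim)])
index.build(10)            # 10 trees; the index is read-only from here on
index.save("vectors.ann")

# on the tiny edge worker: load() mmaps the file, so the whole
# index never has to be resident in memory
query_index = AnnoyIndex(dim, "angular")
query_index.load("vectors.ann")
query = [random.gauss(0, 1) for _ in range(dim)]
print(query_index.get_nns_by_vector(query, 5))
```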
As far as I can see, Voyager requires you to load the index into memory and doesn't (yet?) do mmap. Which... would make sense since you can change the data after loading the index. So, Voyager index files are fully loaded in memory..? Do I have this right?
To this point, if you're releasing minor-version changelogs, expect them to be posted on sites like HN, and make sure you link at the top of the post to the announcement of the latest major release, for those who might have missed it! It's an easy marketing win!
Minor releases (and plenty of not-so-minor ones) are moderated away as mostly off-topic on HN, since they turn into generic discussion of the project itself, which is more often than not a dupe. The saner thing is to just not post them/flag them.
I imagine that will pick up somewhat in about 20 minutes, or rather new articles will start to hit over the next 2 hours (the Apple event starts at 1pm ET).
Personally, I appreciate the fix for "Reveal in Finder" hanging a system app (Finder). I also wanted to see what people think of the Properties feature, which hasn't been discussed on HN yet.
This looks really cool. One thing I've wondered about with, e.g., the OpenAI API is whether JSON is really a good format for passing embeddings back and forth. I'd think that passing floats as text over the wire wastes a ton of space that could add up, and might even sacrifice some precision. Would it be better to encode at least the vectors as binary blobs, or else use something like protobuf to more efficiently handle sending tons of floats around?
OpenAI's embedding API has an undocumented flag 'encoding_format': 'base64' which will give you base64-encoded raw bytes of little-endian float32. As it is used by the official python client, it is unlikely to go away.
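Decoding that on the client side is only a couple of lines, assuming the format is as described (little-endian float32); the round trip below uses a made-up vector rather than an API call:

```python
import base64
import numpy as np

def decode_embedding(b64: str) -> np.ndarray:
    # base64 -> raw bytes -> little-endian float32 array
    return np.frombuffer(base64.b64decode(b64), dtype="<f4")

# round-trip demo
vec = np.array([0.1, -0.2, 0.3], dtype="<f4")
encoded = base64.b64encode(vec.tobytes()).decode("ascii")
print(decode_embedding(encoded))
```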
I totally agree when you're talking about a bunch of embeddings at once-- that's why the document level endpoint (and the token-level embedding endpoint) can optionally return a link to a zip file containing the JSON. For a single embedding, not sure it matters that much, and the extra convenience is nice.
Edit: One other thing is that you can store the JSON in SQLite using the JSON data type and then use the nice querying constructs directly at the database level, which is nice for the token-level embeddings and document embeddings. This is built in to my project.
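For example (the table and column names here are just illustrative, not necessarily how my project lays things out; this needs a SQLite build with the JSON functions, which modern Python ships with):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, embedding TEXT)")
conn.execute(
    "INSERT INTO docs (embedding) VALUES (?)",
    (json.dumps([0.12, -0.5, 0.33]),),
)

# pull a single component out of the JSON at the database level
print(conn.execute(
    "SELECT json_extract(embedding, '$[1]') FROM docs"
).fetchone())  # (-0.5,)

# or explode the array with json_each, e.g. to find the largest component
print(conn.execute(
    "SELECT MAX(value) FROM docs, json_each(docs.embedding)"
).fetchone())  # (0.33,)
```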
[0]: https://danturkel.com/2023/01/25/math-code-are-you-the-one.h...