Hacker News new | past | comments | ask | show | jobs | submit login

> One obvious drawback is that you have to flip the coin a 1000 times to produce the first unbiased random bit, while Neumann's method starts producing bits much earlier.

Definitely. Though you could fix that problem relatively easily.

I think you might even be able to run von Neumann's method first, but store the coin flips; and then once you've got enough stored, extract a few more bits from the already used flips.

Perhaps like this:

When you do two flips, you add one of three possible tokens to your list:

'double-heads', 'double-tails' or 'mixed'.

Crucially, you only store 'mixed' and not whether it was 'head-tails' or 'tails-heads' because that information was already used to produce the von-Neuman-bit.

After your list has 1000 entries, you run an algorithm a bit like what I originally described to extract bits. The complication is that the table you construct has all possibilities of shuffling a fixed number of 'double-heads', 'double-tails' or 'mixed' tokens.




It feels like an awkward middle ground. Like this method has a limited entropy output for the first 2000 coin flips (first 1000 double-heads/double-tails/mixed entries), and then suddenly it adds back a ton of lost entropy.

An other commenter linked to some papers for asymptotically optimal entropy generation, I wonder if there is more of a streaming method there. It feels like there has to be, even maybe after a slow start. My naive intuition is that after 1000000 coin flips you have a good idea what p is, and then you can basically do arithmetic coding from there. Of course a theoretically correct method can't do exactly this, but it might asymptotically approach it.


Oh, if you want something practical, the approach that Linux's /dev/random takes is probably the one to go.

/dev/random being unbiased relies on some assumptions about cryptography; but in practice these assumptions are at least as well-founded as our assumption that our coin flips are independent.

Look at some of the papers mentioned in other comments on this submission. There are (near) optimal streaming methods.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: