This hit me bad once bad. I tested the regular dict and it _looked_ like it was ordered. Turned out, 1 out of about 100000 times it was not. And I had a lot of trouble identifying the reason 3 weeks later, when the bug was buried deep in complex code, and it appeared mostly what looked like random.