When I was at Uber back in 2015, my org was trying to replace zip-code-based geo partitioning with a hexagon-based scheme. Instead of partitioning a city into tens of zip codes on average, we would partition it into potentially hundreds of thousands of hexagons and create areas dynamically. The first launch was in Phoenix, and the team responsible for it stayed up all night for days because they could barely scale our demand-pricing systems. The global launch of the feature was then delayed, first by days, then by weeks, and then by months.

It turned out Uber engineers just loved Redis. Need to distribute your work? Throw it at Redis. I remember debating with some infra engineers why we couldn't just throw in more Redis/memcached nodes to scale our telemetry system, but I digress. So, the price service we built was based on Redis. The service fanned out millions of requests per second to Redis clusters to get information about the individual hexagons of a given city, and then computed dynamic areas. We needed dozens of servers just to compute a single city. I forget the exact number, but let's say it was 40 servers for an average-sized city. Now multiply that by the 200+ cities we had. It was just prohibitively expensive, not to mention the other scalability bottlenecks that could come with managing such scale.

The solution was actually pretty simple. I took a look at the algorithms we used, and all we really needed was to compute multiple overlapping shapes. So, I wrote an algorithm that used work-stealing to compute the shapes in parallel per city on a single machine, and used Elasticsearch to retrieve hexagons by a number of attributes -- it was actually a perfect use case for a search engine because the retrieval requires boolean queries over multiple attributes. The rationale was pretty simple too: we needed to compute repeatedly on the same set of data, so we should retrieve the data only once for multiple computations. The algorithm was merely dozens of lines long, and was implemented and deployed to production over the weekend by this amazing engineer Isaac, who happens to be the author of the library H3. As a result, we were able to compute dynamic areas for 40 cities, give or take, on a single machine, and the launch was unblocked.
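
To make the "fetch once, compute many shapes" idea concrete, here is a minimal sketch, not the code that ran at Uber. Everything in it is an assumption for illustration: the hexagon records, the attribute names, the three shape queries, and the toy `run_work_stealing` scheduler. An in-memory list stands in for the Elasticsearch retrieval, and each "shape" is just a boolean filter over hexagon attributes.

```python
import threading
from collections import deque
from functools import partial

# Hypothetical hexagon records, fetched ONCE per city (Elasticsearch in the story).
HEXAGONS = [
    {"id": "a", "demand": 9, "supply": 2, "airport": False},
    {"id": "b", "demand": 7, "supply": 6, "airport": True},
    {"id": "c", "demand": 1, "supply": 5, "airport": True},
    {"id": "d", "demand": 8, "supply": 1, "airport": False},
]

# Hypothetical shapes: boolean queries over attributes. Shapes may overlap.
SHAPE_QUERIES = {
    "high_demand": lambda h: h["demand"] >= 7,
    "undersupplied": lambda h: h["demand"] > h["supply"] and h["supply"] <= 2,
    "airport_busy": lambda h: h["airport"] and h["demand"] >= 5,
}


def select(hexagons, predicate):
    """Return the sorted ids of the hexagons matching one shape query."""
    return sorted(h["id"] for h in hexagons if predicate(h))


def run_work_stealing(tasks, n_workers=3):
    """Toy work-stealing scheduler: one deque per worker, idle workers steal."""
    queues = [deque() for _ in range(n_workers)]
    locks = [threading.Lock() for _ in range(n_workers)]
    for index, task in enumerate(tasks):
        queues[index % n_workers].append((index, task))
    results = [None] * len(tasks)

    def next_item(me):
        # Prefer our own queue, taking the most recently added task.
        with locks[me]:
            if queues[me]:
                return queues[me].pop()
        # Otherwise steal the oldest task from another worker's queue.
        for victim in range(n_workers):
            if victim != me:
                with locks[victim]:
                    if queues[victim]:
                        return queues[victim].popleft()
        return None  # no work left anywhere; tasks never spawn new tasks

    def worker(me):
        while (item := next_item(me)) is not None:
            index, task = item
            results[index] = task()

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def compute_shapes(hexagons, queries, n_workers=3):
    """Evaluate every shape query against the same in-memory hexagon list."""
    names = list(queries)
    tasks = [partial(select, hexagons, queries[name]) for name in names]
    return dict(zip(names, run_work_stealing(tasks, n_workers)))


shapes = compute_shapes(HEXAGONS, SHAPE_QUERIES)
```

The point of the sketch is the data flow: the hexagons are loaded once and every shape reuses them, instead of each computation fanning out its own lookups. Note that CPython threads will not give real CPU parallelism here; a production version would presumably sit on a runtime with a native work-stealing pool.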




I love H3. Isaac and Uber did a real service to the geospatial community with that one.


To me H3 looked over-engineered and unnecessarily complex. Hexagons don't tile nicely at multiple resolutions, for one! Just overcoming that is decidedly non-trivial.

Implementing Google's S2 is simpler, but it has the same overall benefits as H3 such as a hierarchical data structure.


H3's algorithms involve some intricate maths, but the library itself is conceptually simple. Check this page out for some really fun and neat ideas: https://www.redblobgames.com/grids/hexagons/.

Uber did extensive internal research on what kind of grid system to use. In fact, we started with S2 and geohash, but found H3 superior. Long story short, hexagons are like discretized circles, and therefore offer more symmetry than S2 cells[1]. Consequently, hexagons offer more uniform shapes when we compose hierarchical structures. Besides, H3 cells have more consistent sizes at different latitudes, which is very important for Uber when computing supply and demand of cars.
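
One way to see the symmetry point is to compare neighbor distances on flat grids. This is a small sketch under simplifying assumptions: it uses planar grids rather than cells on a sphere, a plain square grid as a stand-in for quadrilateral cells (it is not S2), and the axial-coordinate layout described on the Red Blob Games page. All names are hypothetical.

```python
import math

# Axial (q, r) offsets of the six neighbors of a hexagon.
HEX_NEIGHBORS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

# (dx, dy) offsets of the eight cells touching a square cell (edges and corners).
SQUARE_NEIGHBORS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]


def hex_center(q, r, size=1.0):
    """Center of a pointy-top hexagon given in axial coordinates."""
    return (size * math.sqrt(3) * (q + r / 2), size * 1.5 * r)


def distinct_distances(centers):
    """Distinct center-to-origin distances, rounded to absorb float noise."""
    return sorted({round(math.hypot(x, y), 9) for x, y in centers})


hex_distances = distinct_distances(hex_center(q, r) for q, r in HEX_NEIGHBORS)
square_distances = distinct_distances(SQUARE_NEIGHBORS)
```

Every hexagon neighbor sits at the same distance from the center, while a square cell has two kinds of neighbors (edge-adjacent and corner-adjacent) at different distances.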

[1] One of the complications is that H3 has to have pentagons to tile the entire world, just like a soccer ball. We can easily see why by Euler's characteristic formula.
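
A sketch of that argument, assuming the grid covers a sphere, is made of P pentagons and H hexagons, and exactly three cells meet at every vertex. With V vertices, E edges and F faces, counting edges per face and edges per vertex gives:

```latex
\begin{aligned}
F &= P + H, \qquad 2E = 5P + 6H, \qquad 3V = 2E,\\
V - E + F &= \frac{5P + 6H}{3} - \frac{5P + 6H}{2} + (P + H) = \frac{P}{6}.
\end{aligned}
```

Euler's formula requires V - E + F = 2 on a sphere, so P = 12 no matter how many hexagons there are. With hexagons alone (P = 0) the sum is 0, so a hexagon-only tiling of the globe is impossible. Twelve is also the number of pentagons on a soccer ball.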


Funnily enough, it was exactly the same at Lyft. Redis everywhere. The original version of the dynamic pricing system was a series of cron jobs reading from and writing to Redis, before it was replaced with a Flink pipeline (which still wrote to Redis for serving).


The aforementioned hex mapping tool: https://h3geo.org/

For anyone doing geo queries it's a powerful tool.


I'm firmly in the "you only need Postgres" camp, so I went into your story thinking it was going to end with you saying that you used PostGIS.

Err, now that I think more about that, IIRC Uber is a monster MySQL shop, so it may cause them to break out in hives if someone installed Postgres there.


Cool anecdote, thanks for sharing!



