
This brings to mind the Viterbi algorithm, which calculates the most likely sequence of hidden states in a hidden Markov model (the Markov model itself is just a state machine on a graph with weighted edges, where each edge weight is the probability of that state transition).

It essentially boils down to the same algorithm: you can eliminate a partial sequence if it leads to the same state as another partial solution with higher probability. The number of paths would be exponential, but the ability to eliminate them keeps the algorithm polynomial. That really brings out the beauty of dynamic programming.
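
A minimal sketch of that elimination step (toy model, my own names):

    # keep only the single best path into each state at each step; any
    # other path into the same state can never become part of the optimum
    def viterbi(states, start_p, trans_p, emit_p, observations):
        best = {s: (start_p[s] * emit_p[s][observations[0]], [s])
                for s in states}
        for obs in observations[1:]:
            nxt = {}
            for s in states:
                p, path = max(((best[prev][0] * trans_p[prev][s],
                                best[prev][1]) for prev in states),
                              key=lambda t: t[0])
                nxt[s] = (p * emit_p[s][obs], path + [s])
            best = nxt
        return max(best.values(), key=lambda t: t[0])

    # toy example: which weather sequence best explains the observations?
    states = ['rain', 'sun']
    start = {'rain': 0.5, 'sun': 0.5}
    trans = {'rain': {'rain': 0.7, 'sun': 0.3},
             'sun':  {'rain': 0.3, 'sun': 0.7}}
    emit  = {'rain': {'umbrella': 0.9, 'none': 0.1},
             'sun':  {'umbrella': 0.2, 'none': 0.8}}
    print(viterbi(states, start, trans, emit, ['umbrella', 'umbrella', 'none']))

Each step only keeps |states| paths alive, so the work is O(T * |states|^2) rather than exponential in T.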




I literally just published an article on that yesterday :) https://avikdas.com/2019/06/24/dynamic-programming-for-machi...

Agreed, it's basically the same algorithm.


One really nice application of the Viterbi algorithm is map-matching, i.e. taking a GPS trail and "snapping" it to the road network to determine the actual route taken. It's difficult because GPS trails are often sparse and noisy, so you can't do something simple like "project onto the closest road segment". If you take the "states" of the model to be road segments and derive the transition probabilities from the shortest route between segments, you can apply Viterbi and get a very accurate result most of the time. Of course, calculating shortest routes involves Dijkstra, another famous dynamic programming algorithm.


Cool! Has anyone written this up and made cool visualizations? It's one of those many things I'd like to do but don't have time for :)


Two of the more well-known routing engines have this capability and have written about it on their blogs with some visualisations:

- OSRM (the main routing engine behind openstreetmap.org) https://blog.mapbox.com/matching-gps-traces-to-a-map-7373019...

- Graphhopper https://www.graphhopper.com/blog/2016/08/08/releasing-the-ma...

Unfortunately it's not a simple project to get started on because you already need to have a way to import road network data (e.g. from openstreetmap), build an internal graph model, do some geographic indexing and perform fast many-to-many routing calculations.


yep - look up 'Hidden Markov Map Matching through Noise and Sparseness' by Newson and Krumm

it calculates the final probability by combining an 'emission' probability (the probability that a particular road produced a given GPS observation) with a 'transition' probability (given that the previous observation was on a particular road, the probability that the vehicle is now on this other road segment). By combining these two, the final probability incorporates both the nearness of the GPS signals to the roads and the connectivity of the road network itself.

I've found the formulas applied in this paper work well in practice only if the GPS updates are relatively frequent.
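
in code terms, one Viterbi step of that combination looks roughly like this (a sketch; the helper names are mine, not the paper's):

    def match_step(best_scores, candidates, log_emit, log_trans):
        # best_scores: {segment: best log-probability of any matching
        #   that ends on that segment so far}
        # candidates: road segments near the new GPS fix
        # log_emit(seg): how near the fix is to seg
        # log_trans(prev, seg): how plausible the prev -> seg route is
        # in log space, "combining" the probabilities is just addition
        nxt = {}
        for seg in candidates:
            nxt[seg] = log_emit(seg) + max(
                best_scores[prev] + log_trans(prev, seg)
                for prev in best_scores)
        return nxt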


I've also found that it can be tricky to map "convoluted" routes that don't necessarily go directly from point A to point B.

If you don't mind me asking, roughly at what sampling interval have you found the algorithm to start performing badly, and are you aware of any algorithms or formulae that perform better in these situations?


it should be in seconds - the problem is that the paper assumes that the 'great circle' (straight-line) distance between two points should be almost the same as the 'route' distance between those points, with an exponential probability distribution over the difference.

This means that if the path between two points is not simple (e.g. it goes around a corner), the probability drops off very quickly. If the time between measurements is in minutes, this heuristic is pretty useless (and you should really use a log scale for your numbers!)

edit: this is actually shown in figure 8 of the paper where they explore different 'sampling periods'

edit 2: I have not explored other methods yet, but it would probably make sense to start by deriving the formula the way they do, by exploring ground-truth data.

edit 3: I just noticed that my comments are largely repeating what you're saying - sorry!
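
edit 4: for concreteness, the transition term I'm describing looks roughly like this in log space (my paraphrase of the paper, not its exact notation):

    import math

    def log_transition(great_circle_dist, route_dist, beta):
        # exponential density over the absolute difference between the
        # straight-line and routed distances: p(d) = (1/beta) * exp(-d/beta)
        d = abs(great_circle_dist - route_dist)
        return -math.log(beta) - d / beta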


Ah, that rings a bell now. You can vary a parameter they call "beta" to allow for more convoluted routes, and I think a larger value gives a little leeway for less frequent fixes.

Agreed, the log scale is really important to avoid arithmetic underflow =] I believe OSRM and Graphhopper both do it that way. In my implementation I've flipped from thinking of measurement/transition "probabilities" to "disparities", and I choose the final route that has the least total disparity from the trail. It seems to handle trails with a fix every 30-60s over a 5-10hr period with decent accuracy.
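
A quick illustration of why the log scale matters over a trail that long:

    import math

    p = 1.0
    for _ in range(1000):
        p *= 1e-4          # 1000 fixes, each contributing a small probability
    print(p)               # 0.0 -- underflowed double precision

    log_p = 1000 * math.log(1e-4)
    print(log_p)           # about -9210.3, no trouble at all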


actually, beta is less useful than that! I think it represents the median difference between the two distances; it's not a tolerance (at least as far as I can recall after experimenting with tuning this value).

As with you, I have found that it still often gives OK results at slower sampling rates, as long as the transition probabilities are still roughly on the same scale as each other for a particular observation pair. However, it means that there's no point trying to 'tune' it using the gamma and beta parameters.


A lot of problems on sequences are amenable to dynamic programming. Two other famous ones are Needleman-Wunsch and Smith-Waterman for finding optimal global and local alignments of sequences, a typical problem in genetics.
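
A minimal sketch of the global (Needleman-Wunsch) recurrence with toy unit scores; Smith-Waterman is nearly identical, you just clamp each cell at zero and take the best cell anywhere in the table:

    def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
        # dp[i][j] = best score for aligning a[:i] with b[:j]
        dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(1, len(a) + 1):
            dp[i][0] = i * gap
        for j in range(1, len(b) + 1):
            dp[0][j] = j * gap
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                dp[i][j] = max(
                    dp[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch),
                    dp[i-1][j] + gap,   # gap in b
                    dp[i][j-1] + gap)   # gap in a
        return dp[-1][-1]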


The Levenshtein distance, which finds the fewest edits needed to transform one string into another, is also a classic example.
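
It fills in essentially the same kind of table, minimizing an edit count instead of maximizing an alignment score:

    def levenshtein(a, b):
        # dp[i][j] = fewest edits turning a[:i] into b[:j]
        dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            dp[i][0] = i
        for j in range(len(b) + 1):
            dp[0][j] = j
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                dp[i][j] = min(dp[i-1][j] + 1,                     # delete
                               dp[i][j-1] + 1,                     # insert
                               dp[i-1][j-1] + (a[i-1] != b[j-1]))  # substitute
        return dp[-1][-1]

    print(levenshtein("kitten", "sitting"))  # 3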



