But yeah, I tackled the same problem (for Flickr tags) and did not use the "obvious" algorithm at first; I did something slower and suboptimal instead.