Very neat and creative approach, but I'm honestly conflicted about whether the country/map metaphor is the best choice. In many cases the names are not that clear, so one has to zoom in to understand what they represent. It would perhaps be more interesting to do hierarchical clustering and show something like the average connectivity between the (super)clusters with lines, possibly with more descriptive/faithful LLM-generated labels for each cluster.
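To make the "average connectivity between clusters" idea concrete, here is a minimal sketch. It assumes the graph is a plain edge list and that each node already has a cluster label; the function name, the repo names, and the two cluster labels are made up for illustration and are not taken from the actual map.

```python
from collections import defaultdict
from itertools import combinations


def inter_cluster_connectivity(edges, cluster_of):
    """Return {(a, b): density} for every pair of clusters with a < b.

    Density = edges running between the two clusters, divided by the
    number of node pairs that could have been connected (|a| * |b|).
    """
    sizes = defaultdict(int)
    for cluster in cluster_of.values():
        sizes[cluster] += 1

    cross = defaultdict(int)
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu != cv:  # edges inside one cluster do not count here
            cross[tuple(sorted((cu, cv)))] += 1

    return {
        (a, b): cross[(a, b)] / (sizes[a] * sizes[b])
        for a, b in combinations(sorted(sizes), 2)
    }


# Hypothetical toy data: five repos in two clusters.
edges = [("numpy", "pandas"), ("numpy", "scipy"),
         ("react", "vue"), ("pandas", "react")]
cluster_of = {"numpy": "data", "pandas": "data", "scipy": "data",
              "react": "web", "vue": "web"}

connectivity = inter_cluster_connectivity(edges, cluster_of)
```

Each value could then drive the thickness of a line drawn between two (super)clusters; running the same function on coarser labels would give the superclusters' view.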
I was pleasantly surprised that it wasn’t a line-heavy drawing. As someone who first made those in the ’90s and almost immediately learned their limits, I think this is nice because it doesn’t overclaim. It’s just a view, not a thesis.
I like diagrams where the axes mean something. Lines, shape, boxes/groups, distance, X vs Y, colour, thickness, texture, background, foreground. I also like simple. So often it’s lines to be fancy with no meaning. This one is just a pic, with some grouping, and it has personality. Yay?
I haven't found a universal clustering algorithm yet: frequently there is more than one way to group data that still makes sense, so whichever final clustering we choose will not be perfect.
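A tiny illustration of that ambiguity, as a sketch only: on a ring of six nodes, two different ways of pairing up neighbours get exactly the same modularity score, so a modularity-based method has no reason to prefer one over the other. The ring and both splits are made-up toy data, and the `modularity` helper below is the standard Newman formula written by hand, not the code used for the map.

```python
def modularity(edges, communities):
    """Newman modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2
    """
    m = len(edges)
    community_of = {node: i
                    for i, group in enumerate(communities)
                    for node in group}
    internal = [0] * len(communities)
    degree = [0] * len(communities)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[i] / m - (degree[i] / (2 * m)) ** 2
               for i in range(len(communities)))


# Toy graph: a ring 0-1-2-3-4-5-0.
ring = [(i, (i + 1) % 6) for i in range(6)]

# Two equally sensible groupings, shifted by one node.
split_a = [{0, 1}, {2, 3}, {4, 5}]
split_b = [{1, 2}, {3, 4}, {5, 0}]

q_a = modularity(ring, split_a)
q_b = modularity(ring, split_b)
```

Real graphs are less symmetric than a ring, but near-ties like this are common, which is one reason different runs or algorithms can return different yet defensible clusters.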
Hm... unless maybe we do some sort of quantum clustering, which could be a fun project to explore!
It's a bit hazy now, but I remember trying the HDBSCAN algorithm (hierarchical clustering), and on a graph the size of GitHub I just couldn't fit it in memory.
I did end up using something similar to hierarchical clustering (a mix of Louvain, Leiden, and my own code), and that's what we see in the final map.