I wish I could get this on Linux </3 Any ideas how to get there (without reaching out to network APIs, like OpenAI)? I assume since it's built on Apple's Core ML framework it's not possible?
Not the Core ML version itself, no - but there are ways you could approach it with various tradeoffs.
You could start with something naive using pattern matching - common formats, common words, etc. - and apply the most relevant emoji. It'd be fast and small, but not as flexible, and no cool ML in there to enjoy. It's probably how I'd go about it though - you could actually use AI at the dev stage to get the dataset for this pretty tight, I reckon.
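Something like this, very roughly - the keyword table here is just made up for illustration, a real one would be the curated/AI-generated dataset:

```python
import re

# Toy keyword -> emoji table; a real dataset would have thousands of entries.
KEYWORD_EMOJI = {
    "birthday": "🎂",
    "party": "🎉",
    "coffee": "☕",
    "bug": "🐛",
    "release": "🚀",
    "love": "❤️",
}

def suggest_emoji(text: str, default: str = "✨") -> str:
    """Return the first emoji whose keyword appears in the text."""
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in KEYWORD_EMOJI:
            return KEYWORD_EMOJI[word]
    return default

print(suggest_emoji("Happy birthday to you!"))          # 🎂
print(suggest_emoji("Finally fixed that nasty bug"))    # 🐛
```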
Or... you could use a tiny embedding model (which might even be fast on CPU alone), embed a short description for each emoji, then pick the 'closest' emoji for the embedding of fresh text. Potentially inaccurate, but could be fast-ish.
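A sketch of that, assuming sentence-transformers is installed - the model name and the emoji descriptions are just my picks, any small CPU-friendly model would do:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small, runs OK on CPU

# One short description per emoji; embed these once and cache them.
emoji_labels = {
    "🎂": "birthday cake celebration",
    "☕": "coffee drink morning",
    "🐛": "software bug problem",
    "🚀": "rocket launch ship release",
    "❤️": "love heart affection",
}
emojis = list(emoji_labels.keys())
label_embeddings = model.encode(list(emoji_labels.values()), convert_to_tensor=True)

def closest_emoji(text: str) -> str:
    """Embed the text and return the emoji with the highest cosine similarity."""
    text_embedding = model.encode(text, convert_to_tensor=True)
    scores = util.cos_sim(text_embedding, label_embeddings)[0]
    return emojis[int(scores.argmax())]

print(closest_emoji("just pushed the new release"))  # probably 🚀
```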
Or... go all the way up to something like TensorFlow and training a classifier, but it'd probably take a fair bit of disk space and wouldn't be super fast running on CPU only. Would be a fun experiment! (I haven't done this on Linux before so I might be wrong in 2025.)
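A toy version of that route might look like this - the training data and layer sizes are placeholders, you'd want thousands of labelled (text, emoji) pairs for anything useful:

```python
import tensorflow as tf

texts = ["happy birthday", "fixed the bug", "coffee time", "we shipped it"]
labels = [0, 1, 2, 3]  # indices into an emoji list like ["🎂", "🐛", "☕", "🚀"]

# Turn raw text into fixed-length integer sequences.
vectorize = tf.keras.layers.TextVectorization(max_tokens=10_000, output_sequence_length=16)
vectorize.adapt(texts)
x = vectorize(tf.constant(texts))

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10_000, 32),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),  # one output per emoji
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x, tf.constant(labels), epochs=10, verbose=0)

pred = model.predict(vectorize(tf.constant(["just shipped the release"])))
print(int(pred.argmax()))  # hopefully 3 -> 🚀
```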
One big benefit of Core ML is it Just Works™ if you meet the hardware and OS requirements. On Linux you'd potentially need to fall back to CPU, or lean on a pile of dependencies (maybe ONNX Runtime takes care of all this nowadays?).
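If you did end up exporting a model to ONNX, the Linux inference side can be fairly small with onnxruntime's CPU provider - the model path, input shape and dtype here are assumptions that depend entirely on how you export:

```python
import numpy as np
import onnxruntime as ort

# Hypothetical exported model; the CPU provider avoids any GPU/Apple-specific setup.
session = ort.InferenceSession("emoji_classifier.onnx", providers=["CPUExecutionProvider"])

input_name = session.get_inputs()[0].name
# Dummy already-vectorized input; shape/dtype depend on the exported graph.
dummy = np.zeros((1, 16), dtype=np.int64)

outputs = session.run(None, {input_name: dummy})
print(int(outputs[0].argmax()))  # index of the predicted emoji
```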