Bad data on graphs, demos that would have been impressive a year ago, vibe coding the easiest requests (a financial dashboard), running out of talking points while Cursor is looping on a bug, marginal benchmark improvements. At least the models are kind of cheaper to run.
Feedback:
- At the end of the game, show correct/incorrect statistics.
- At the end of the game, show which particular color weaknesses were identified.
- Show a progress meter (1 of 20).
- I have a red/green weakness and I expected to run into issues. One of the greens was hard to differentiate, but I still got it right. I expected it to be harder. Perhaps look up the exact palettes for different color perception issues and use those.
LLMs are very good at knowledge extraction and following instructions, especially if you provide examples. What you see in the prompt you linked is an example of in-context learning, in particular a method called few-shot prompting [1]. You provide the model with some specific examples of input and desired output, and it will follow the examples as best it can. Which, with the latest frontier models, is pretty darn well.
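To make that concrete, here is a minimal sketch of how a few-shot prompt can be put together. The function name, the "Input:/Output:" labels, and the sentiment task are my own made-up illustration, not taken from the prompt you linked, and no particular model API is assumed; it only builds the text you would send.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, worked examples, then the new input.

    Hypothetical helper for illustration; the labels and layout are assumptions.
    """
    parts = [instruction.strip(), ""]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}")
        parts.append(f"Output: {example_output}")
        parts.append("")
    # End with the new input and an open "Output:" slot for the model to fill in.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Made-up examples showing the desired input -> output mapping.
examples = [
    ("The battery died after two days.", "negative"),
    ("Setup took thirty seconds and it just works.", "positive"),
]

prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    examples,
    "The screen cracked the first time I dropped it.",
)
print(prompt)
```

The model sees two completed pairs and one unfinished pair, so the most natural continuation is a single label in the same format. Adding or swapping examples is usually the easiest way to steer the output.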
I've been looking forward to playing with this since reading the paper. I was considering implementing it myself based on the paper but I figured the code would just be a few weeks behind and patience did indeed pay off :)
When I go fossiling in Florida almost every large find is from the Pleistocene or Holocene, when Florida was not covered by the ocean. Identifying the bone fragments is always a big puzzle that involves a good bit of study and reference books. A fun puzzle for sure, but sometimes you end up with a find that you just can't place. It would be great to have a technique such as ZooMS to positively identify the unidentifiable.