I'm really confused by "Anecdotally, most people use LLMs for ~4 basic natural language tasks" and "Most LLM applications use some combination of these four".
I'm not sure about the `ELI5` use case, and it feels like this is only true for a fairly narrow slice of what people currently use LLMs for.
For conversational, FAQ-type use cases like the ones described by OP, a few basic prompts may suffice (although anything requiring the agent to have "agency" in its replies would still need real prompt engineering) - but what about all the other ways people use LLMs to analyze, transform and generate data?
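For example, here's the kind of small data-transformation task I mean, the sort of thing that doesn't reduce to a handful of canned prompts. This is a rough sketch assuming the OpenAI Python client (openai>=1.0); the model name, prompt and field names are just illustrative:

```python
# Rough sketch of an LLM used for data transformation rather than Q&A.
# Assumes the OpenAI Python client (openai>=1.0) and an OPENAI_API_KEY in the
# environment; the model name and the extracted fields are illustrative only.
from openai import OpenAI

client = OpenAI()

def extract_invoice_fields(raw_text: str) -> str:
    """Turn unstructured invoice text into a JSON object with a few fields."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Extract vendor, date, and total from the text. "
                        "Reply with a single JSON object."},
            {"role": "user", "content": raw_text},
        ],
    )
    return response.choices[0].message.content

print(extract_invoice_fields("ACME Corp invoice dated 2024-03-01, total $1,240.50"))
```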
You say this as if the goal of building a business (and the measure of its success) were to raise money?
From what I can see, Front has 360+ employees and generates $64M in revenue, which works out to ~$177K in revenue per employee. Missive generates $2M with a headcount of 4, which comes out to $500K per employee.
IMO, generating almost 3x the revenue per employee, without the headaches of managing a 360+ person team (and while maintaining complete ownership of your company), is the type of success more founders should aspire to.
You can't pull their revenue from one of those estimate sites and pretend it's remotely reliable. It's like those celebrity net worth sites: a total guess.
Congrats on the launch guys! Fellow Canadian from Montreal here :) I'm curious, what tools/methodology did you follow to generate the high-quality labels, and how many different labels did you end up generating? I'm also very curious whether you view the discovery and generation of new labels (and the accompanying high-quality training datasets) as a continuing, core part of your development going forward?
Ooh I love MTL. That's the first place I lived in Canada. Great q. We used SEC enforcement actions related to fraud as our gold labels (fairly common practice in academia). The really key thing is that you need to be careful about which years you use for training: if you include years that are too late in the fraud cycle, you end up with significant target leakage, e.g. the filings will say, "we're being investigated for fraud". We ended up manually reviewing all of our data/labels; it took over a month. We also use settled class action lawsuits as silver labels, plus a few other more frequent labels as bronze labels.
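To make the leakage point concrete, here's a simplified sketch of the kind of year-based filter involved. This is illustration only, not our actual pipeline; the column names (`cik`, `filing_year`, `enforcement_year`) and the two-year cutoff are placeholders:

```python
# Simplified illustration of guarding against target leakage: drop filings
# made close to (or after) the SEC enforcement action, since those filings can
# mention the investigation itself. Column names and the cutoff are
# placeholders, not the real pipeline.
import pandas as pd

def drop_leaky_filings(filings: pd.DataFrame, enforcement: pd.DataFrame,
                       years_before: int = 2) -> pd.DataFrame:
    """Keep only filings made well before the enforcement action."""
    merged = filings.merge(enforcement[["cik", "enforcement_year"]],
                           on="cik", how="left")
    is_fraud_firm = merged["enforcement_year"].notna()
    too_late = is_fraud_firm & (
        merged["filing_year"] > merged["enforcement_year"] - years_before
    )
    return merged[~too_late].copy()
```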
Thanks for the quick answer! Follow-up on this, as it's a space I'm actively working in: did you use or build any tools for the labeling process, or was it Excel? :D Also, do you ultimately see/position your solution as an AI-powered exploration tool that lets humans derive better insights faster (where the NLP side of things simply assists in that discovery process), or do you see the models (and the resulting flags) eventually being able to completely replace human intuition?
Our annotation process has been manual so far, but we are working on building something to make it more efficient :-)
We see our solution as an AI-powered assistant for qualitative research that makes the job of an analyst much easier, and don't see it as 'replacing' humans for the foreseeable future.
Two years ago some friends and I started writing one song each per week (and met every Thursday to listen to our respective masterpieces). We mostly ended up composing and recording the songs on iPhones the Wednesday night before (thank god for GarageBand), and after 3-4 weeks we were producing more creative content in this compressed timeframe than we'd been able to with no deadline at all. A few months in we skipped a Thursday or two, and suggested the solution was to write one song per month instead. That was definitely not the solution: we didn't find more time to write, and the lack of schedule killed the momentum. My biggest regret of the last year is not sticking to it - however, a friend just moved to our city on the condition that we'd get it started again at the weekly frequency, so I'm optimistic :)
Agreed - we've built extensions for Chrome, Firefox, IE & Safari; Firefox and IE are on the same level with regards to debugging, that is: not pleasant. It's a shame because browser extensions have great potential: I'm increasingly turning towards them for a variety of use cases, but I'm currently limiting my efforts to Chrome due to the availability of tools.
A few months ago, I started a project with a few friends, whereby we each commit to writing and recording one original song per week, then meet on Thursday to listen to everyone's creations and critique them. As far as I'm concerned, this experience persuaded me that taking more time certainly does not equate to producing higher-quality material; up until now, I had taken months and sometimes years (!!) to finish songs, and sometimes never ended up recording them at all, always unhappy with the final result and, with no deadline, unable to simply let go of my ego and release.
The goal with la Chanson Du Jeudi (www.lachansondujeudi.com) was to adopt a more "Bob Dylan" approach, i.e. do one-take recordings if that's all we had time for (none of us are professional musicians in any capacity), and respect the act of creation above that of perfecting. In most cases, we'd end up writing, composing and recording the night before or the day of, which led us down creative paths that might otherwise have been discarded.
Interestingly, I was talking last night with a friend who is applying pretty much the same concept to photography: a group of friends decide on a topic, submit pictures to a Flickr channel every week, and vote on them. Much like the need to produce songs on a regular basis forced me to start recording any little idea that came to me, just to be sure I had something to close the week, she now consciously takes her camera everywhere she goes for this project.
Music composition is definitely subject to creeping featuritis. Uwe Schmidt (Atom Heart) combats this by having an idea for a track/song and executing it, never going back, and never spending tons of time wibbling with details. If you know what you want to make, it's really more a matter of implementation. If you don't, well, then you start down the path of Axl Rose and every other band that takes 8 years to make an album.