Ask HN: How do you stay on top of advances in AI?

PheonixPharts · on June 13, 2023

I work in AI for a living, my advice would be worry less about what's the latest and greatest and focus on solving your problem, and learning the things you're interested in well.

Precisely because "the rate of change in the AI space is super fast", there really isn't much of a point to keeping up, even if you're an academic researcher (for those you only need to keep up with your piece of the puzzle, and you likely know everyone in that space already).

For example, I wasn't working in the NLP space for a few years. I kept an eye on what was getting mentions in various circles but basically ignored everything until I had a problem to solve. I work with LLMs everyday now, and honestly, even though I do understand them pretty deeply, there's no real need. Prior to the rise of LLMs I spent some time building LSTMs because I felt that I needed to understand them better. Lots of fun projects, but if I had skipped all that it wouldn't have really mattered at all.

Even more dramatically, I was never particularly specialized in computer vision, but currently build things (for fun) with Staple Diffusion every night. I've spent a fair bit of time really understanding the underlying model well (still have much to learn) but, because the space moves so fast, it's not that big a deal that I didn't also spend nights building GANs. Even though Stable Diffusion is also perpetually changing, the community has largely stuck with 1.5 and is very focused on squeezing as much juice out of that as they can.

Most important: the fundamentals have never changed. GPT4 still projects information into some highly non-linear latent space and samples from that. Diffusion models are probably the most novel thing happening now, but more so for their engineering (they're essentially 3 models all trained together with differentiable programming). If you really understand the fundamentals then catching up when you need to is fairly easy.

JoshuaDavid · on June 14, 2023

Slightly off-topic, but I am just starting to learn about the field, and saw

> GPT4 still projects information into some highly non-linear latent space and samples from that

Do you have more information/know where I can find things to read about the "highly non-linear" part of that? I've been reading some stuff[1][2][3] about smaller models, and my impression has been that the latent space is shockingly linear for those ones. But counterexamples would be informative, especially if it's something that only happens with larger models.

-----

[1] https://arxiv.org/abs/2209.15162

[2] https://www.neelnanda.io/mechanistic-interpretability/othell...

[3] https://arxiv.org/pdf/1610.01644.pdf

psyklic · on June 14, 2023

The space must be non-linear, as a consequence of the non-linear activation functions. A neural net is just a big math equation, and deeply embedded throughout are non-linear transforms which necessarily make the entire transform non-linear. Just like the presence of 1/x makes an equation no longer linear (at least with rare exception!). Here is a deeper explanation/visualization: https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/

The three references don't seem to conclude that the latent space is linear. The first seems to mention "linear" only since they add an additional linear layer to project an image encoding into a generative language model. So I'm not sure this applies here, since the encoding itself is already richly complex by the time it is mapped.

The second and third are using a "linear probe" in an attempt to gain insights about each layer. This feeds the output of a given layer into a linear classifier that attempts to predict the correct output labels. This doesn't perform well in early layers, but it improves monotonically until the final layer is reached. The researchers conclude this happens entirely as a consequence of the final layer being a linear classifier. So, the features eventually must become linearly separable since that's what the network was trained to do.

This doesn't conclude that each layer's feature space is linear. Instead, they are just using a linear projection to examine how "easily" the net at that layer can make correct predictions. Even if a layer's output is decently predictive in this way, the actual representation could still be richer and contain additional information.

JoshuaDavid · on June 14, 2023

> So, the features eventually must become linearly separable since that's what the network was trained to do.

This clarifies things a lot, thanks!

jiggawatts · on June 14, 2023

Hilariously, the best explanations for how AI models like GPT work I got from GPT-4 itself!

There’s a lot of jargon and assumed shared knowledge in research papers that make them hard for a beginner to parse.

I’ve found GPT can “translate” and explain in context.

I know neither Python nor PyTorch, so I give it snippets and tell it to show me the Mathematica equivalent, which I can read.

mbm · on June 13, 2023

Yes of course, but let’s assume the goal is to be conversational in what’s changing/new at the level of a Marc Andreesen, without actually having to be Marc Andreesen. I work in the space as well, but because I’m not in the Valley, find it more challenging to always be abreast of the changes. While it might not be essential for the actual product engineering to know what’s new, as a founder talking to customers it’s highly beneficial to be on top of things (depending of course on your vertical).

predictabl3 · on June 14, 2023

I got at least one glance when the "ew what" came out audibly.

What does this even mean, anyway, I've kept up with bleeding edge tech from a 3g connection miles from civilization. Do Marcs farts convey some AI-hype-prescience that filters into the SV air that you're missing out on?

What a "producty" take too. Knowing the hype-y buzzwords is more important than your engineers knowing what is cutting edge? Idk, I'm not in sales, maybe I'm way off base.

PheonixPharts · on June 14, 2023

> at the level of a Marc Andreesen [sic], without actually having to be Marc Andreesen [sic].

Oh, I guess we are in a AI bubble then. There are so many wonderful thinkers right now in the AI space to aspire to be like, but the only reason to aspire to be like Marc Andreessen is because you want AI to be the next crypto.

Better find a good coat, since AI winter will be here sooner than I'd thought.

d4rkp4ttern · on June 14, 2023

AI/ML researcher + builder.

Mainly HN and ML subreddit, and a bunch of newsletters. My overarching goal is to completely stay away from Twitter (which I have stopped going to 2 months ago). I instead rely on the above to bubble up interesting things.

Also as others said, find something to build and work on and don’t keep looking sideways (aka Twitter, random news), just keep going. It can be discouraging and depressing to see that someone else is doing something similar. Instead go deep into what you’re doing and only once in a while check around.

Follow your own “train of thought”.

If you think back at the great advances in science or computing, deep work was done by people relentless pursuing their ideas — not bombarded constantly by “news”.

In the LLM era the barrier to build useful/interesting things has gotten very low, leading to a ton of distracting noise.

wanderingmind · on June 15, 2023

Sorry which ML subreddit do you follow? There are many when I search and most are pretty noisy

d4rkp4ttern · on June 15, 2023

Only MachineLearning was worthwhile but now it’s dark. But honestly the signal/noise is far better on HN even for ML related topics.

muzani · on June 14, 2023

Build stuff. There's a lot of untested stuff. There's a lot of research that goes nowhere, or things like Bard, which are really good, but will end up like Windows Mobile.

When you build stuff, it's clearer what the strengths and weaknesses are, and which innovations matter.

There was that hype with AutoGPT and similar iterative GPTs. But when AutoGPT is tested hard, it ends up getting confused on what to do, and the price tag ends up too much. It was a nice attempt, but didn't go too far.

Langchain was extremely good at first, but then the new GPT-3.5 API chat format just allowed you to include "memory" without needing a third party to track all of it.

If you follow most of the AI thought leaders, people are still talking about prompt engineering, but that's becoming more unnecessary with something like GPT-4, which can read your mind better than your spouse. A lot of people with experience are left behind because they overinvest in these things that are just transcended months later.

PaulHoule · on June 13, 2023

I leave it to YOShInOn, the smart RSS reader I have been dreaming about since 2005. YOShInOn ingested 2054 articles (not all about A.I.) and picked out 300 to show to me based on a transformer model. I make thumbs up, thumbs down or favorite jugements on the articles in an interface that looks like TikTok or Stumbleupon. One feed it ingests is

https://arxiv.org/list/cs/new

and it's learned that I really like anything about reccomendation systems, text classifiers and many kinds of deep network but that I'm not so interested in reinforcement learning, theoretical CS, etc.

I had a fight with it early on when it struggled to understand that I liked the NFL but not the Premier League but in the process of understanding why I read so many articles about football and I started thinking "Wow, they won that game 1-0 and it was an own goal" and wondering "How would I feel if it was my team that got relegated?" and before I knew it I cared if Man City or Arsenal came out on top...

nevon · on June 14, 2023

I'm really interested in your RSS reader - is it open source or available anywhere?

aerio · on June 14, 2023

I'm also interested. They've posted about their reader before, but didn't leave any links then either.

scottwick · on June 14, 2023

Me too! I've seen PaulHoule mention it a few times and it sounds really interesting.

mindcrime · on June 13, 2023

Twitter / Mastodon - I follow a lot of researchers and other people in this space.

(Perhaps surprisingly) Facebook - I'm in a few AI/ML/DL related groups, and there's one individual in particular who does a really good job of identifying interesting new papers and articles that show up and then submits the links or one or more of those groups.

Google News - mostly popsci articles show up here, but a few are of interest now and then.

HN - A few really interesting AI related links show up here from time to time.

/r/machinelearning on reddit

/r/artificial on reddit

Manually browsing ArXiv, JMLR, JAIR, and a handful of similar sites.

Email newsletters - I am subscribed to a few newsletters that surface interesting items

Youtube - some conferences put all (or most, or some) of their presentations up on Youtube. The AGI conference, for example, usually has their presentations there. NeurIPS also posts a lot (if not all) of their sessions. And so on...

mbm · on June 13, 2023

Thanks for your feedback! I wish there was something that could aggregate all this so it could be browsed quickly a couple times per week.

jmitcheson · on June 13, 2023

There is a newsletter called 'The Rundown' but it's more on the consumer side of gen AI; rather than hard academic research

https://www.therundown.ai/

tkgally · on June 14, 2023

Matt Wolfe’s Future Tools site is a good compilation of consumer-facing AI products and news:

https://www.futuretools.io/

https://www.futuretools.io/news

He also posts videos that I find informative:

https://youtube.com/@mreflow

I like the AI Explained video channel as well:

https://youtube.com/@ai-explained-

jetnew · on June 14, 2023

I am working on a LLM-powered newsletter (https://distilleddaily.news/) that aims to solve this exact problem for myself (and targeting ML engineers, data scientists and AI researchers).

I've only started last week and have launched among friends, and so only source HackerNews for now but will quickly include curated sources: Twitter popular users' tweets (@hwchase17, @hardmaru), popular blogs (@chipro, @lilianweng), popular research papers and more.

I'm currently prioritising features to build and so would love to hear your thoughts!

guiambros · on June 14, 2023

Answering from the POV of someone who works with AI, but not a researcher:

1) Matt Wolfe's YT channel https://www.youtube.com/@mreflow/videos (he has also the Future Tools site -- FutureTools.io)

2) Two Minute Papers YT channel https://www.youtube.com/@TwoMinutePapers/videos

3) The Neuron Daily newsletter https://www.theneurondaily.com/

4) From a business perspective -- Stratechery https://stratechery.com/ (not AI related, but naturally touches on what key players are doing in this area. Worth the subscription price)

5) On Twitter, here's a list of AI/ML/Math folks I've been collecting over the years https://twitter.com/i/lists/230704954

alexpls · on June 14, 2023

I try to direct as much news to RSS as possible, where I can group feeds by topic, and not get distracted by a constant compulsion to refresh the page (which I feel strongly on HN and Twitter). Some tools I use:

* https://www.freshrss.org/ for subscribing to feeds

* https://netnewswire.com/ for reading them

* https://hnrss.github.io/ to get Hacker News into RSS

* https://mailgrip.io/ (project I'm building) to get email newsletters into RSS

adides · on June 14, 2023

I run a weekly email newsletter that highlights the latest industry news (company announcements, regulations, relevant research, venture capital funding, and product launches). Link here: https://astrofeather.beehiiv.com

I also do deep dives (~700 - 800 words) on broader topics such as humanoid robots in the workforce, the recent U.S. Senate hearing on AI regulation, autonomous agents, and generative AI in advertising.

You can check out my latest issue, where I take a deeper look at generative AI in healthcare and the recent OpenAI defamation case, as well as a roundup of other recent events: https://astrofeather.beehiiv.com/p/improving-clinician-and-p....

Finally, I have a site called AstroFeather (https://www.astrofeather.com) that tracks and summarizes the latest AI news headlines on a daily basis.

malwrar · on June 13, 2023

I scrape arxiv once a day and have a small pipeline that tries to identify all AI-related submissions and summarize them. It’s hit-or-miss currently, but a.) the papers it finds are sometimes pretty interesting when I skim them to check the summaries and b.) it’s a great project to focus my AI learning on and I’m improving it all the time. One day I hope to get it good enough that it’s suitable to post to HN!

tin7in · on June 14, 2023

Mostly reading the OpenAI/Anthropic updates and following a few people that I find interesting on Twitter/Github/Newsletters. What also helps is that I build a small circle of people around me who are researchers/data scientists or are building things.

tmaly · on June 14, 2023

There are a few email newsletters that give different levels of updates on what is going on.

There is the sort of higher low-tech level, others get into new models and techniques at the research level.

Here are some I like:

* The AI Exchange * AI Brews * The Cognitive Revolution * The Gradient * Data Machina

theage · on June 14, 2023

https://www.emergentmind.com

https://www.ideamindai.com

https://news.bensbites.co

https://allainews.com

https://huggingface.co/papers

paulddraper · on June 13, 2023

muzani · on June 14, 2023

HN and Reddit is often a whole generation behind, but it's only by one generation.

quickthrower2 · on June 15, 2023

My current thinking is this:

* I work full time, not in AI, and don't have tonnes of free time. It is unlikely I will get to even Masters level, let alone PhD, unless something changes drastically. Therefore I don't need to go super deep into papers and suchlike.

* On the other hand I want some tactile understanding, so doing Andrej Kaparthy's course. Fiddling with some PyTorch, even if I know my model will be shittier than the cheapest of chips OpenAI models, just to get a feel.

^ From this I really got a sense of what an embedding is, and why we have those and not human picked features, for example. I feel this helps me understand stuff I read online a lot more easily. So speeds things up in general even if I am just hustling with various APIs from JS code or whatever.

* Once I understand how transformers work, roughly, I will move on to the FastAI course which I think will give me a broad but shallow view of lots of models, and sort of make the top part of the T shape.

* I really like microsoft's "guidance" api, so I will probably focus on building out quick apps using that to solve everyday problems. There is also YC company doing something similar called Glass.

* I will ignore the latest LLaMA or other fluffy animal derivatives. They come out at a crazy rate, all seem to run on different platforms, some Python, some C++, need a decent machine etc. I feel that is moving to fast to even keep up on the download it and try it cycle. Probably the most I will do is play with them in a HuggingFace interface.

* Generally will ignore all of that stuff you mention i.e.: press releases/research papers/news/blogs/library updates/videos and tweets. Too much of a barrage of information.

* Most important: find a problem to solve.

I see myself enjoying being an applier, a glue coder, rather than the person who keeps training model and writing math hoping that their idea will be the next breakthrough (sort of like an Edison I guess), but kudos to those who do it. At the same time I want to understand a little of what happens under the hood!

quickthrower2 · on June 16, 2023

Edit: I sounded harsh I think: I love that there are open source LLaMA derivatives, and am grateful people give their time to work on them. I just don't feel I have the capacity to keep up with that, but if they increase their rate of research by 100x and make better stuff than OpenAI I would be happy. At some point I might loop back into this world.

kai-dev · on June 15, 2023

Builder in the LLM space.

Arxiv + google + papers with code is how I do it. As another user pointed out, focus on solving your problem. By doing this, you'll search and then naturally find relevant papers. After a while, a lot of papers become super easy to read because they're conceptually easy to grasp. Once you hit this point it becomes extremely easy to skim the content that you're searching for. Once I identify something interesting in my problem space or adjacent then getting through it is the hard part.

TLDR: Focus on a problem domain and search within it. Don't just shotgun it. If its relevant enough to you it will show up. Use filters like HN and whatnot.

fjkdlsjflkds · on June 14, 2023

No mentions of https://arxiv-sanity-lite.com/ ?

nocodebcn · on June 14, 2023

Mainly https://news.bensbites.co

mbm · on June 15, 2023

Fantastic input, thanks all! Extremely helpful.

j_not_j · on June 14, 2023

TLDR newsletter at tldr.tech has an AI sub-list, which covers most of what I see elsewhere.

"grep" arxiv.org in the "computation and language" section for interesting articles. There is a lot going on. Some is lightweight but much is insightful.

wizzerking · on June 14, 2023

arxiv, paperswithcode, NeuroHive

nektro · on June 15, 2023

it inevitably comes across my feed because people wont shut up about it

TroyZ · on June 14, 2023

- Top ML papers of the week: https://github.com/dair-ai/ML-Papers-of-the-Week

- Daily Newsletter on AI: https://tldr.tech/ai

- Subreddit about open-source LLM advancements: https://www.reddit.com/r/LocalLLaMA/

d4rkp4ttern · on June 14, 2023

I used to rely on labml daily/weekly trending papers which was informed by Twitter trends, to help me stay out of Twitter. But sadly they recently stopped updating it due to Twitter API costs.

https://papers.labml.ai/papers/weekly/