Show HN: I built an LLM-powered Ask HN: like Perplexity, but for HN comments

sprobertson · on May 16, 2024

Well done. Matching the HN style with Tailwind is a nice touch.

One thing I've been wanting from search in general, especially LLM-powered search, is some kind of date relevance, particularly given the fast-moving world of technology.

For example I want to know how ProductX and ProductY compare. Last year ProductY didn't have FeatureZ, but they implemented and announced it last month. There might be several comments lamenting the lack of FeatureZ from 2 months ago, but they shouldn't be considered with the same weight now that it does exist.

I don't have any ideas for how this should be done but it's something I'd like to see tackled in RAG systems in general, and wanted to put it out there.

jnnnthnn · on May 17, 2024

Yeah, that'd be a really nice thing to have! My current (and naive) approach is to just limit to the last 3 years of data. That said, it should be easy to add a date filter as a first step, and potentially have a more ingenious approach for letting the LLM reconcile contradictory statements based on the time at which they were posted.
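A minimal sketch of the "date filter as a first step" idea, combined with one possible way to down-weight older comments. This is not how Hacker Search works today: the `Comment` type, the `similarity` score, and the 180-day half-life are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class Comment:
    text: str
    posted_at: datetime
    similarity: float  # hypothetical retrieval score in [0, 1]


def recency_weight(posted_at: datetime, now: datetime,
                   half_life_days: float = 180.0) -> float:
    """Exponential decay: a comment loses half its weight every half-life."""
    age_days = max((now - posted_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rank(comments: List[Comment], now: datetime,
         max_age_days: int = 3 * 365,
         half_life_days: float = 180.0) -> List[Comment]:
    """Drop comments older than the cutoff, then sort by decayed score."""
    cutoff = now - timedelta(days=max_age_days)
    recent = [c for c in comments if c.posted_at >= cutoff]
    return sorted(
        recent,
        key=lambda c: c.similarity * recency_weight(c.posted_at, now, half_life_days),
        reverse=True,
    )
```

With a scheme like this, a two-year-old comment lamenting a missing feature would need a much higher similarity score to outrank last month's announcement that the feature shipped. Letting the LLM reconcile contradictions would still require passing the timestamps into the prompt.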

jsunderland323 · on May 17, 2024

My UX feedback is that I had a natural inclination to click on one of the response comments to my query, and I expected to be redirected to that HN post and, if possible, scrolled to the comment. I know why you might not want that, but as long as you’re using query params and not doing anything weird with browser history, I’d almost certainly toggle back.

I could definitely use this as an alternative to HN Algolia for some things, but it’s going to be hard for me to remember. I’d recommend doing this as a browser extension or something so it’s not 100% lost in the void.

jnnnthnn · on May 17, 2024

Thanks for the feedback! That feature is there, just not discoverable enough!

Try clicking on the relative time (e.g. "over 1 year ago"). Should take you immediately to that comment on HN, and you can then hit back in your browser to return to Hacker Search.

Agreed on the extension. That's coming next!

Valk3_ · on May 19, 2024

I'm kind of late to this post, but really awesome initiative and well executed project! Thank you for bringing this to people!

I tested it a bit and it seems pretty decent, although for some really niche theoretical questions it didn't retrieve the answers I wanted, even if a lot of the results were really good in other respects. It could simply be that the answer isn't available anywhere on Hacker News.

I'm wondering if someone were to build a similar project but for other sites, what would your advice be? For instance what technical difficulties did you stumble on that you think would be good to be aware of?

Thanks in advance and once again congratulations on the project!

jnnnthnn · on May 19, 2024

Thanks for the kind words!

Yes, the underlying dataset very much conditions the quality of the responses. The retrieval strategy is also a really important factor (and something I haven't had time to optimize extensively).

I'm writing a blog post that will answer your questions! Will post it here when it's fully baked.

Valk3_ · on May 20, 2024

Awesome, looking forward to it!

jnnnthnn · on May 22, 2024

Blog post is now live:
- https://news.ycombinator.com/item?id=40442039
- https://jnnnthnn.com/how-to-build-your-own-perplexity-for-an...

r0fl · on May 17, 2024

I tried a few different searches and I'm blown away by how good the summaries are! Incredible execution!

Personally I like that it doesn’t look like a cookie-cutter 2024 template site and has a more raw feel, but I’m not sure that’s the best way to gain mass users these days.

10/10 for me

I’m impressed, and most of the time I’m disappointed in the Show HN section.

jnnnthnn · on May 17, 2024

I’m so glad you like it! Thank you for taking the time to write this kind note.

coloneltcb · on May 16, 2024

This is very cool. I wouldn't make buying decisions off of this, but it is a good starting point to get a pulse on the developer zeitgeist on any given topic.

joshi4 · on May 16, 2024

I think you hit the nail on the head when you say, "get a pulse on the developer zeitgeist on a particular topic." I really enjoyed using it to get a sense of what developers feel about a particular topic. I also learned about new products I wasn't aware of.

jnnnthnn · on May 16, 2024

Yay! Thank you for trying it out, so glad it's been useful!

netsroht · on May 17, 2024

Always cool seeing stuff in this space.

Regarding "zeitgeist", about a year ago I built something similar called https://zeitgaist.ai which also incorporates other sources like Mastodon, Bluesky, some subreddits etc.

jnnnthnn · on May 16, 2024

Thank you for trying it out!

aritraghosh007 · on May 17, 2024

Very cool application of LLMs to supercharge the HN search experience. I’m bookmarking it!

jnnnthnn · on May 17, 2024

Thank you!

spikey_sanju · on May 16, 2024

Well done! I was also working on something similar and created a POC, but this is super nice.

jnnnthnn · on May 16, 2024

Thank you! And... got a link to share? :)

jnnnthnn · on May 16, 2024

I got some feedback that GPT-3.5 was letting people down so improved my caching strategy and defaulted everything to GPT-4o!

sebastiennight · on May 17, 2024

Great to hear! Would you be willing to share a bit on what changes you made to your caching strategy to optimize costs?

jnnnthnn · on May 17, 2024

I made the caching window substantially longer (2 days instead of a few minutes) upon noticing most queries weren't about current events, and fixed a bug in the cache resolution logic that'd lead to a substantial number of cache misses.
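A rough sketch of what a two-day cache window could look like. The real cache and the actual bug are not described in this thread, so everything here is an assumption: the `TTLCache` class is hypothetical, and normalizing the query before using it as a key is only one common way to cut down on avoidable cache misses.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TWO_DAYS_SECONDS = 2 * 24 * 60 * 60


def normalize_query(query: str) -> str:
    """Lowercase and collapse whitespace so near-identical queries share a key."""
    return " ".join(query.lower().split())


class TTLCache:
    """In-memory cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl_seconds: float = TWO_DAYS_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, query: str) -> Optional[str]:
        key = normalize_query(query)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, answer = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[key]  # expired: evict and report a miss
            return None
        return answer

    def put(self, query: str, answer: str) -> None:
        self._entries[normalize_query(query)] = (self._clock(), answer)
```

A longer window trades freshness for cost, which fits the observation that most queries weren't about current events.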

sebastiennight · on May 18, 2024

Cool, thanks!

gsuuon · on May 17, 2024

This is awesome! The comment filtering feature is a nice touch. Do I need to sign up to try local inferencing?

jnnnthnn · on May 17, 2024

No. It will get offered to you once you exceed the monthly GPT-4o limit (currently 5 to avoid breaking the bank).

GPT-4o outperforms all local models so I figured using it as a fallback was the right approach :)
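For illustration only, the quota-then-fallback behavior could be sketched like this. The `ModelRouter` class, the per-user monthly counter, and the `"local"` label are hypothetical, and it assumes the limit of 5 counts individual queries.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

MONTHLY_GPT4O_LIMIT = 5  # figure quoted above; assumed to mean queries per month


class ModelRouter:
    """Serve GPT-4o until a user's monthly quota runs out, then fall back."""

    def __init__(self, limit: int = MONTHLY_GPT4O_LIMIT) -> None:
        self._limit = limit
        # (user_id, "YYYY-MM") -> number of GPT-4o queries used that month
        self._used: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def choose(self, user_id: str, month: str) -> str:
        key = (user_id, month)
        if self._used[key] < self._limit:
            self._used[key] += 1
            return "gpt-4o"
        return "local"  # quota exhausted: offer local inference instead
```

Keying the counter by month means the quota resets without any cleanup job.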

J_Shelby_J · on May 18, 2024

Hey, great work! Definitely will look forward to using.

Can I ask what the tech stack is? Is it open source?

jnnnthnn · on May 18, 2024

Hey! Not open source, largely because I don't have time to make it good enough for me to feel comfortable sharing what's otherwise pretty scrappy code, but I'm planning a blog post detailing how it was built.

Tech stack is https://news.ycombinator.com/item?id=40238913 + the addition of turbopuffer for the new functionality.

jnnnthnn · on May 22, 2024

Blog post is now live:
- https://news.ycombinator.com/item?id=40442039
- https://jnnnthnn.com/how-to-build-your-own-perplexity-for-an...

retox · on May 16, 2024

From the YC legal page

>Except as expressly authorized by Y Combinator, you agree not to modify, copy, frame, scrape, rent, lease, loan, sell, distribute or create derivative works based on the Site or the Site Content, in whole or in part,

>In connection with your use of the Site you will not engage in or use any data mining, robots, scraping or similar data gathering or extraction methods.

Who owns the posts that you are trying to profit from?

jnnnthnn · on May 17, 2024

Hey! This leverages the official HN API (https://github.com/HackerNews/API), no scraping involved. I don't think it's my place to opine on "who owns the posts".
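For anyone curious what reading from that API looks like, here is a minimal sketch that fetches a single comment by ID from the item endpoint documented in the linked repository. The `fetch_comment` helper and the fields it returns are illustrative choices, not Hacker Search's actual ingestion code.

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen

API_BASE = "https://hacker-news.firebaseio.com/v0"


def item_url(item_id: int) -> str:
    """URL of a single item (story, comment, etc.) in the official HN API."""
    return f"{API_BASE}/item/{item_id}.json"


def http_get(url: str) -> str:
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def fetch_comment(item_id: int,
                  get: Callable[[str], str] = http_get) -> Optional[dict]:
    """Return a small dict for a live comment, or None otherwise.

    The API answers `null` for unknown IDs and marks removed items
    with `deleted: true`; both cases are skipped here.
    """
    item = json.loads(get(item_url(item_id)))
    if not item or item.get("type") != "comment" or item.get("deleted"):
        return None
    return {
        "id": item["id"],
        "by": item.get("by"),
        "time": item.get("time"),  # Unix timestamp in seconds
        "text": item.get("text", ""),  # comment body as HTML
    }
```

The `get` parameter is only there so the function can be exercised without network access.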

retox · on May 17, 2024

Thanks, the site is blackholed here at the office.

pgillian · on May 16, 2024

This is cool; I like that you used it continuously while building it / built it for your own needs. Are there any kinds of searches it does particularly well at? Have any fun hot takes come up in your summaries?

jnnnthnn · on May 16, 2024

I feel that it performs really well on queries like those on the landing page, that generally have to do with understanding HN's sentiment about something or finding resources to learn about a topic.

As to your second question, someone looked up “What are some famous Google office pranks?” (https://hackersearch.net/ask?q=What%20are%20some%20famous%20...?) earlier today and I found some gold in there. Some of the hot takes on Gary Marcus have cracked me up too (https://hackersearch.net/ask?q=what%20do%20you%20think%20of%...?).

maguito · on May 17, 2024

I dig! Something like this would be cool to see for Reddit too.

jnnnthnn · on May 19, 2024

Perplexity offers a Reddit "focus" which does exactly that! https://www.perplexity.ai

android521 · on May 19, 2024

This is cool. Which tech stack do you use? Is it open source?

jnnnthnn · on May 19, 2024

Hey! Thank you! See https://news.ycombinator.com/item?id=40402935

testmasterflex · on May 19, 2024

jnnnthnn · on May 19, 2024

Thanks for flagging! Actively debugging, looks like one of my DB vendors is having some uptime issues. If you retry a couple times you should (eventually) get lucky.