Show HN: Find AI – Perplexity Meets LinkedIn

ben_jones · 2024-06-26T19:44:04 1719431044

Hi Philip! This tool looks amazing!

My company specializes in selling to underserved communities in the AMEA region. Can your product help me connect with LGBT customers specifically in the Sudan and Saudi Arabia? Thanks!

PS can I get their exact addresses as well as the addresses of their family members and community leaders?

johndhi · 2024-06-26T20:11:04 1719432664

Is this OP's problem or LinkedIn's problem? If he's just searching their data, can he really be blamed for making that work better?

jsemrau · 2024-06-26T19:55:25 1719431725

This shouldn't be downvoted as it raises an important point about privacy and safety.

philip1209 · 2024-06-26T17:45:32 1719423932

Thanks everybody for trying this out. I just looked at our logs and we're doing >2k requests per minute to OpenAI right now.

Free users only get partial search results. If there are any you want to see run to completion, reply here or email me and I'll mark it to run to completion. (The code PRODUCTHUNT is also available this week for a free month of access).

Palmik · 2024-06-26T19:37:50 1719430670

Congrats on the traffic, I hope you get a high conversion rate :)

I assume you do more than one GPT call per search, e.g. rewriting user's query into multiple search queries, summarization of the results, etc.?

philip1209 · 2024-06-26T22:40:49 1719441649

It's complex. We make a lot of GPT calls. Lots of them. And, for lots of different purposes.

jsemrau · 2024-06-27T03:53:16 1719460396

How are you handling the unit costs, rate limits, and model risk?

philip1209 · 2024-06-27T12:22:23 1719490943

That’s a deep question. Under the hood we use some LLM gateways like Helicone and https://usevelvet.com to track cost per request and optimize it.

philip1209 · 2024-06-26T18:26:38 1719426398

Update: >5k requests per minute to OpenAI right now

esafak · 2024-06-26T18:57:40 1719428260

Don't forget to rate limit every end point.
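For anyone wondering what per-endpoint rate limiting can look like, here is a minimal token-bucket sketch. It is not Find AI's code: the `TokenBucket` class, the `allow_request` helper, and the rate and capacity numbers are all hypothetical, and a real deployment would likely keep the counters in a shared store rather than in process memory.

```python
import time


class TokenBucket:
    """Hypothetical in-memory token bucket: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        # Refill based on elapsed time, then spend one token if available.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per (client, endpoint) pair, so every endpoint is limited separately.
buckets = {}


def allow_request(client_id, endpoint, rate=1.0, capacity=5):
    key = (client_id, endpoint)
    if key not in buckets:
        buckets[key] = TokenBucket(rate, capacity)
    return buckets[key].allow()
```

Keying on both client and endpoint means a burst against one expensive route (say, search) does not use up the allowance for cheaper ones.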

philip1209 · 2024-06-26T19:02:43 1719428563

Yup, we have some decent safeguards in place. It’s technically difficult to allow logged out users to try the product, but I think people appreciate it.

7thpower · 2024-06-26T19:13:20 1719429200

> I started building Find AI to make it easier to search for people. I initially started just having GPT review people's LinkedIn profiles and websites, but it cost thousands of dollars per search (!).

Can you help me connect to what was costing >$1k/search or is that hyperbole? Genuinely interested, not patronizing.

nielsole · 2024-06-26T19:29:36 1719430176

I suppose every search was passing all indexed documents into GPT asking for a rating

philip1209 · 2024-06-26T19:32:20 1719430340

Yes, essentially this. Massive contexts and massive numbers of documents.

webappguy · 2024-06-26T20:51:04 1719435064

Cool, but after waiting 2 mins for a one-sentence prompt I got:

We have analyzed 1681 candidates and found 0 records matching the search criteria. The search was initiated 2 minutes ago and took 1 minute and 42 seconds to complete.

philip1209 · 2024-06-26T22:30:20 1719441020

That's not a good experience - sorry! What was the prompt? Feel free to reply it here or email and I can look into it.

We're starting a retro on the thousands of searches people ran today, and will tweak the system based on the results. But, an early takeaway is that some searches failed when people applied a filter that the system doesn't understand.

For example, searching "Find AI startups with 50-100 employees" returns 0 results because Find AI doesn't know headcounts yet. (We'll work on that, though).

webappguy · 2024-06-26T20:53:49 1719435229

And then a second, simpler query, ‘find me people who have posted on Mamba architecture and might be looking for jobs’, and got:

We have analyzed 695 candidates and found 0 records matching the search criteria. The search was initiated 1 minute ago and took 50 seconds to complete.

philip1209 · 2024-06-26T22:31:53 1719441113

I'd recommend going broader on this one, looking for just people who know Mamba architecture.

Find AI is built right now so that people exist only within a company. So, we don't have profiles or index people as individuals - just as employees. That probably made your search hard, because most employees aren't advertising that they are looking for a job (especially on the data sources we use).

Matticus_Rex · 2024-06-26T17:47:06 1719424026

Trying to get a sense of whether the results really are good by testing it on a query I basically know the answers to for my niche, but it looks like to get more than 3 results I've got to join the $39/mo plan. Is that the case?

philip1209 · 2024-06-26T17:48:34 1719424114

Yes, but if you share or email me the link I'll redo the search run to completion.

There's a pretty high cost to run each search. So, even offering partial searches to logged-out users is pretty expensive for us.

evashang · 2024-06-26T16:13:17 1719418397

When are you going to add non-tech companies? I'm constantly scanning bios like this for lawyers, especially on what types of law they practice and what types of cases they've done in the past.

philip1209 · 2024-06-26T16:17:18 1719418638

We're mainly focused on the tech vertical right now, but are going to expand to a second vertical soon - and law is one we've been considering.

We use public data sources, so Find AI works best in industries where people want to be found. And, lawyers spend a lot of time building websites and profiles - so I think it would work really well with our data model.

I'll follow up with you once we add lawyers!

marigoldwhale · 2024-06-26T17:35:05 1719423305

What are some of the other verticals you're considering? What types of data sources are you using?

philip1209 · 2024-06-26T17:37:17 1719423437

> What are some of the other verticals you're considering?

Thinking of VCs and finance next, because lots of the users so far have been customers. Law is high on the list. Healthcare has been something we are interested in, too.

> What types of data sources are you using?

It's all scraping and LLMs under the hood. Nothing secret, but there's some surprising sophistication to how it works. We have ~50k companies and ~100k people in the database right now, and are trying to 10x that over the next month.

marigoldwhale · 2024-06-26T17:51:29 1719424289

Thanks for the response. I think Tech/VCs/Finance are all pretty well correlated, you might want to consider branching out of that bubble of the world IMO.

evashang · 2024-06-26T17:58:54 1719424734

agree with this - other industries like finance and law are much more greenfield

armcat · 2024-06-27T08:00:41 1719475241

Amazing concept! I am a heavy user of such tools, and typically build my own bespoke search. I tried out Find AI across a number of different queries, and I think the general issue here is one of coverage. For example, when searching for "all people with PhD", it only analyzes 1700 candidates (for reference, around 2% of the US population alone holds a PhD).

Also - do you intend on providing an API for this tool, e.g. for enterprise clients?

philip1209 · 2024-06-27T12:23:37 1719491017

Yes, the data is limited. We plan to 10x it over the next month.

And, we have had some people request API access today so we are discussing it. (If you’re also interested, please email me.)

neilv · 2024-06-26T21:24:13 1719437053

The examples start out looking like recruiting, and heavy on the usual school obsession (MIT, Stanford, Harvard), with no improvement over existing simple queries that every bottom-end sourcer is doing.

Ideally, smarter tech will let us get closer to what we're really trying to do like "Find me a person, who I can hire, who will do great work at responsibilities X, Y, and Z."

akritrime · 2024-06-26T21:47:39 1719438459

So what's the AI part in this? (And I am not being snarky or condescending here, everyone assumes that's the default on internet. I have more or less missed the AI wave and just playing catchup.) Are you indexing LinkedIn API and feeding it to an algorithm? Or are you converting a natural language query to a database query using an LLM?

philip1209 · 2024-06-26T22:37:18 1719441438

Find AI is good at meandering or exploratory searches. "Companies that might sell to <X>", "People that might become future founders based on <X>". That's essentially what the AI is doing.

Traditional search engines are built for lookups, and don't handle these more subjective or natural language queries as well.

jrussino · 2024-06-26T20:20:22 1719433222

Confused by these numbers:

Software engineers Search completed: less than a minute ago • 1792 candidates analyzed • stopped after 53 matches found

We have analyzed 84 candidates and found 31 records matching the search criteria. The search was initiated 1 minute ago and took 30 seconds to complete.

philip1209 · 2024-06-26T22:43:24 1719441804

That's a bug. We'll take a look - thanks.

Long story short, some of this analysis is async. We do a lot of parallelization to make it fast, but we limit results for free searches to keep costs low. Sometimes extra matches keep getting found after we try to stop a free search.
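To illustrate the kind of race described here, below is a rough sketch of one way a match cap can be enforced when candidates are analyzed concurrently. It is an assumption-laden example, not Find AI's implementation: `capped_search`, `is_match`, and the worker count are hypothetical names and values. The key detail is re-checking the cap after each `await`, since other workers may have filled it while a call was in flight.

```python
import asyncio


async def capped_search(candidates, is_match, max_matches, workers=8):
    """Analyze candidates concurrently, never recording more than max_matches."""
    queue = asyncio.Queue()
    for candidate in candidates:
        queue.put_nowait(candidate)

    matches = []
    analyzed = 0

    async def worker():
        nonlocal analyzed
        # Stop pulling new work once the cap is reached.
        while len(matches) < max_matches:
            try:
                candidate = queue.get_nowait()
            except asyncio.QueueEmpty:
                return
            ok = await is_match(candidate)  # hypothetical stand-in for an LLM call
            analyzed += 1
            # Re-check the cap after the await: other workers may have filled it meanwhile.
            if ok and len(matches) < max_matches:
                matches.append(candidate)

    await asyncio.gather(*(worker() for _ in range(workers)))
    return matches, analyzed
```

With this shape, in-flight calls still finish (and still cost money), but their results are dropped once the cap is hit, so the reported match count cannot exceed the limit.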

robertlagrant · 2024-06-26T19:59:31 1719431971

This seems like a great idea. Organisations pay a fortune for researchers to find executives. I think you're on to something.

Now how do I pay to be at the top of the "executives I should pay a fortune for" list (-:

voiceblue · 2024-06-26T22:26:08 1719440768

Reminds me of Buildspace [0].

[0] https://sage.buildspace.so/chat/i-need-help-with-my-idea

mritchie712 · 2024-06-26T17:49:54 1719424194

What vectordb are you using?

Guessing you're just slammed with traffic right now, but it says it searched 1.4k records and it took over 2 minutes. It should be able to run subsecond.

philip1209 · 2024-06-26T17:52:17 1719424337

We're using PGVector.

The way it works is that we use some heuristics to find candidates for your search. That's the 1.4k number you see - and that does take milliseconds. Then, we go through and analyze each candidate individually with an LLM. So, those 2 minutes were spent running ~1.4k calls to OpenAI.
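A rough sketch of that two-stage shape, for readers who want something concrete. Everything here is a stand-in: `find_candidates` fakes the fast heuristic step with keyword overlap (in place of a real PGVector query), and `llm_matches` fakes the per-candidate LLM call. The function names, the concurrency limit, and the matching rules are assumptions for illustration only.

```python
import asyncio


def find_candidates(query, records, limit=1400):
    """Stage 1 (fast): hypothetical heuristic filter standing in for a vector search."""
    words = query.lower().split()
    return [r for r in records if any(w in r.lower() for w in words)][:limit]


async def llm_matches(query, candidate):
    """Stage 2 (slow): hypothetical stand-in for one LLM call per candidate."""
    await asyncio.sleep(0)  # placeholder for network latency
    return all(w in candidate.lower() for w in query.lower().split())


async def search(query, records, max_concurrency=50):
    candidates = find_candidates(query, records)
    semaphore = asyncio.Semaphore(max_concurrency)  # bound the number of in-flight calls

    async def check(candidate):
        async with semaphore:
            return candidate, await llm_matches(query, candidate)

    results = await asyncio.gather(*(check(c) for c in candidates))
    return candidates, [c for c, ok in results if ok]


def run_search(query, records):
    return asyncio.run(search(query, records))
```

The first stage is cheap and decides how many candidates get analyzed; the second stage dominates both latency and cost because it makes one model call per candidate.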

janalsncm · 2024-06-26T18:30:42 1719426642

Typically with an LLM you would tokenize strings in a batch and use attention masks to run inference in parallel.

OpenAI must have some similar capability. Looks like they have a batch API.

philip1209 · 2024-06-26T18:33:22 1719426802

Yes, but the batch API takes up to 24 hours to respond. We use it, but not for user-facing search queries.

rahimnathwani · 2024-06-26T18:53:39 1719428019

I don't think janalsncm is talking about the batch API, but just putting multiple profiles in a single API call.

If your prompt is short, it won't make much difference to the cost. But if your prompt is, say, about the same length as a single profile, you could save almost half your inference costs by analysing ten at a time.
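A quick back-of-the-envelope version of that arithmetic, counting input tokens only. The token counts (500 for the prompt, 500 per profile) and the 1,400 profiles are made-up illustrative numbers, and `input_tokens` is a hypothetical helper; output tokens and any accuracy loss from longer contexts are ignored.

```python
def input_tokens(n_profiles, prompt_tokens, profile_tokens, batch_size):
    """Total input tokens when the prompt is repeated once per call."""
    calls = -(-n_profiles // batch_size)  # ceiling division
    return calls * prompt_tokens + n_profiles * profile_tokens


# Assumed sizes: prompt about as long as a single profile.
one_at_a_time = input_tokens(1400, 500, 500, batch_size=1)   # 1,400,000 tokens
ten_at_a_time = input_tokens(1400, 500, 500, batch_size=10)  # 770,000 tokens
savings = 1 - ten_at_a_time / one_at_a_time                  # 0.45, i.e. 45%
```

With a prompt much shorter than a profile, the repeated prompt is a small share of each call, so the same batching saves far less.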

philip1209 · 2024-06-26T19:04:35 1719428675

Ah I’ll look into that. Thanks.

alach11 · 2024-06-26T23:42:36 1719445356

I don't believe OpenAI supports batch calls for inference... only for embeddings. If you're interested in cost optimization, however, you're likely better off using Claude 3.5 Sonnet (as a stand-in for gpt-4o) or Claude 3 Haiku (as a stand-in for gpt-3.5-turbo).

johndhi · 2024-06-26T20:09:44 1719432584

I'm actually trying to hire someone with a complex situation right now, but looks like the site is getting the HN hug? Will try tomorrow.

philip1209 · 2024-06-26T20:11:21 1719432681

Ah, please try again. Some intermittent errors but not total outage.

rco8786 · 2024-06-27T23:13:28 1719530008

This sort of stuff is a great use case match for AI, where even a human would end up with imperfect results.

ilrwbwrkhv · 2024-06-26T20:59:14 1719435554

"LinkedIn-type data". What does this mean? Are you scraping LinkedIn? How fresh is the data?

whiplash451 · 2024-06-26T21:40:30 1719438030

Great stuff! If a query returns zero results, would it return non-zero results with the paid plan?

philip1209 · 2024-06-26T22:42:11 1719441731

No, but on the paid plan you can mark it as "live-updating", and you'd get alerts when new matches happen.

We're doing a postmortem on all the searches with 0 results, but please email me what you are looking for and I can give a better answer.

rvba · 2024-06-26T20:17:36 1719433056

So can I pay you, so you set me as a top candidate for any CFO job?

Or we pay you to even be in the results?

mbrain · 2024-06-26T18:50:52 1719427852

Interesting concept. Just wondering, how does it stay updated with the latest info?

philip1209 · 2024-06-26T18:54:05 1719428045

Honestly, the platform is so new that we haven't started working on that problem yet. It is an addressable problem based on our stack, though.

howon92 · 2024-06-26T20:33:30 1719434010

Where does the data come from?

philip1209 · 2024-06-26T22:47:35 1719442055

Scraping

neilv · 2024-06-26T21:40:07 1719438007

Multiple examples are querying for "female" (which could be fine), and this prompted a thought...

What happens when a customer is searching for hiring purposes, and searches specifically for "male"? Or "young", or "unmarried", or "childless", or "straight", or "white", or "non-disabled", or "non-veteran"?

The data is out there, and is bought and sold heavily. (You mention that dog ownership is something you query over.)

What queries are you going to permit, and what not?

Even with current tools, it seems a lot of people do this casually. Besides the many biases that people will openly admit on HN, I'm reminded of when someone told me to use one of the popular hiring sites to filter out candidates who weren't in early/mid-20s. (They spoke of it as if it was clever to use graduation year, since the site didn't let you filter by age directly.) Aaaannnndddd... the hiring sites surely have that search history information that recruiters and hiring managers for numerous employers are doing, unless they're intentionally discarding it against all data-appetite industry convention, so should be easy fodder for some energetic regulators/lawyers.

hidelooktropic · 2024-06-26T17:53:08 1719424388

Search was "Find companies that are looking for designers with a print background."

Result was: "We have analyzed 1072 candidates and found 0 records matching the search criteria."

Why would you analyze _any_ candidates if the query was to find companies?

philip1209 · 2024-06-26T17:55:50 1719424550

We have a database of ~50k companies right now. When you initiate a search, we identify "candidates" out of there to analyze for you. So, "candidate" is more of an internal word - we can update the designs to make that clearer. Then, we go through each candidate one-by-one to find matches.

mcmcmc · 2024-06-26T20:58:04 1719435484

Why use this over LinkedIn Sales Navigator? Zoominfo/DiscoverOrg? How much time have you actually spent prospecting? If you have to ask vague questions to find your prospects, you probably don't know your ICP and need to refine your GTM strategy

After a little research I'd be frankly surprised if this product ever made back the ~6M in funding you guys have. The whole bet is predicated on the incumbents not adding the most basic of AI features

toomuchtodo · 2024-06-26T17:37:47 1719423467

How do you opt your LinkedIn or other personal info out?

philip1209 · 2024-06-26T17:41:36 1719423696

Send an email to the support email at the bottom of the page.

Like Google, all the data here is from public sources - we're just indexing it.

ThrowawayR2 · 2024-06-26T17:51:20 1719424280

Are people on LinkedIn with profiles set to private indexed?

philip1209 · 2024-06-26T17:58:24 1719424704

We don't scrape LinkedIn.

We use some basic tools to infer the LinkedIn profile link for each person's page (e.g. [1]), but we don't actually scrape LinkedIn.

[1] https://usefind.ai/companies/contraption-company/people/phil...

lolinder · 2024-06-26T18:16:13 1719425773

> we don't actually scrape linkedin

LinkedIn intentionally made it basically impossible to do so after they lost hiQ Labs v. LinkedIn [0], so this is generally a good assumption for any product.

[0] https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn

derek12345derek · 2024-06-26T18:13:41 1719425621

Can you please elaborate on this? If you don't scrape LI, where do the profile details come from (once you inferred the URL)? Is this publicly available data that can be bought as a bundle? Or is there any LI API that allows you to retrieve the profiles? thanks!

leobg · 2024-06-26T19:56:53 1719431813

You are using data from that LinkedIn leak from a while ago?

philip1209 · 2024-06-26T23:27:37 1719444457

KostyaNst · 2024-06-28T15:57:28 1719590248

ran a couple searches and this looks good! any plans for outbound? i'm using luna ai right now

philip1209 · 2024-06-28T18:32:14 1719599534

Yes, definitely. Send us an email and we'll give you beta access - we're building it right now.

abrichr · 2024-06-26T17:52:43 1719424363

> We're sorry, but something went wrong.

> If you are the application owner check the logs for more information.

Console:

cable_stream_source_element.js:22

       GET https://usefind.ai/searches/are-unhappy-with-robotic-process-automation-tools 500 (Internal Server Error)

philip1209 · 2024-06-26T17:53:20 1719424400

Sorry, under some serious load right now. Just increased all the server sizes and boosted the DB size.

Can you reload or try again?

abrichr · 2024-06-26T17:57:45 1719424665

Thanks! Unfortunately same result

johndhi · 2024-06-26T18:34:32 1719426872

brilliant idea!