I'm a little biased (I used to work on the Google speech team), but it seems very hard for a startup to compete on the basis of accuracy, and for wearables like watches it's pretty clear both Google and Apple are putting third-party APIs for voice interfaces (including command-like syntaxes) front and center. A lot of earlier speech/NLP startups have struggled with this dynamic--although an aggressive, well-executing team can get a year or so ahead of the platform, if you do something too close to its core competency, Google/Apple will eventually build the same feature directly into the operating system, and then you're stuck competing with a team of 100+ PhDs that has a 1000x distribution advantage. At least, that's what would give me hesitation about building a speech/NLP API startup in 2014.
I also noticed you're running a conference on voice interfaces (http://listen.ai/). I'm not sure how well-connected you are to the speech folks at Google/Microsoft/Apple, but if you decide you want somebody from Google to speak, I'd be happy to ping some of my former colleagues on your behalf. Looking at the agenda, I think the area where they could provide the most coverage is the core technology--acoustic modeling, deep learning, hotword detection, or embedded recognition.
We differentiate ourselves from the Android Speech API in several ways:
1) As a developer, Google gives you no way to customize the speech engine by providing your own language model. Wit.ai builds a specific, customized configuration for each app. If your app is domain-specific and you cannot tell Google what kind of input to expect, accuracy will be poor, especially in noisy environments. Wit.ai builds a specific language model for each app automatically and in real time (the model is updated every time Wit.ai learns new examples from your app), and it queries several speech engines in parallel. To do this it uses not only your data, but also relevant data from the community. This is the core of our value proposition and not something Google provides today.
2) Google keeps its Natural Language Understanding layer (the part that translates text into structured, actionable data) to itself. Developers cannot access it; they're left with free text, but they often need actionable data.
3) Wit.ai is cross-platform. We have SDKs for iOS, Android, Linux, etc. [1], or you can just stream raw audio to our API. The Android Speech API is only available on Android (well, you could hack it and use it from elsewhere, but you're not supposed to, and you can be shut down at any time). More and more wearables and smart devices will run Linux. For instance, hundreds of developers use Wit.ai on the Raspberry Pi.
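For concreteness, here's a rough sketch of what "stream raw audio to our API" could look like from any platform that can make an HTTP request. The endpoint path, headers, and response shape below are my assumptions, not something confirmed in this thread, so treat it as illustrative and check the Wit.ai docs for the real contract:

```python
import requests  # assumes the third-party 'requests' library is installed

WIT_TOKEN = "YOUR_SERVER_ACCESS_TOKEN"  # hypothetical placeholder, not a real token

# Post a short WAV recording and (assuming a /speech endpoint) get back
# structured data such as intent and entities rather than plain free text.
with open("command.wav", "rb") as audio:
    resp = requests.post(
        "https://api.wit.ai/speech",  # assumed endpoint
        headers={
            "Authorization": "Bearer " + WIT_TOKEN,
            "Content-Type": "audio/wav",
        },
        data=audio,
    )

resp.raise_for_status()
print(resp.json())  # assumed to contain the recognized text plus intent/entities
```

The same request works from a Raspberry Pi, a Linux wearable, or a server, which is exactly the cross-platform argument in point 3 above.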
As for the Apple doc you linked, it's Mac only (no iOS), and it only recognizes a few phrases you provide in advance. I think it's a very old API that's still around :)
Regarding listen.ai: yes, please, we would love to have Google (especially the Google Now team) there. We have the Siri founder, the top Cortana guy, the former CEO of Nuance... but nobody from Google yet.
Having had RSI for a while, I can't tell you how much I've wished for point-(or look-)and-speak interfaces. I literally haven't found a single case in which I couldn't quickly dream up a superior point-and-speak version of an existing UI.
Overall I came away with the conclusion that look-and-speak is probably the most deeply ingrained user interface there is--perhaps the only one you could argue is truly intuitive, since it seems to be genetically hard-wired.
On top of that, modelling UIs as hierarchical state machines is an astoundingly simple and elegant approach; it even allows you to leverage persistent data structures to do amazing things. I've explored that to some degree in https://speakerdeck.com/mtrimpe/graphel-the-meaning-of-an-im...
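To make the hierarchical-state-machine idea concrete, here's a minimal illustrative sketch (mine, not taken from the linked talk): each screen is a state that can nest inside a parent state, and a spoken command is an event that bubbles up the hierarchy until some ancestor handles it.

```python
class State:
    """A UI screen modelled as a state in a hierarchical state machine."""

    def __init__(self, name, parent=None, transitions=None):
        self.name = name
        self.parent = parent
        self.transitions = transitions or {}  # spoken command -> target state name

    def handle(self, command):
        """Walk up the hierarchy until some ancestor knows this command."""
        state = self
        while state is not None:
            if command in state.transitions:
                return state.transitions[command]
            state = state.parent
        return None  # nobody handled it


# "home" is the root screen; "settings" nests inside it and inherits its commands.
home = State("home", transitions={"open settings": "settings"})
settings = State("settings", parent=home, transitions={"toggle wifi": "wifi_dialog"})

print(settings.handle("toggle wifi"))    # -> "wifi_dialog" (handled locally)
print(settings.handle("open settings"))  # -> "settings" (inherited from the parent)
```

The nesting is what keeps it simple: "global" commands live on ancestor states, so every child screen inherits them for free.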
Any chance listen.ai will either be livestreamed or videos made available later (a la confreaks or similar)? I can't make it, but I'm super interested in ALL of this and really, really want to learn.
A week ago Mark Suster wrote an interesting article about the definition of a "seed round":
> If it looks like an A-round, smells like an A-round & tastes like an A-round … it’s an A-round. My personal definition? It is less about actual money and more about structure of your Cap Table. If you have raised $2-4 million from a bunch of high-net-worth individuals I simply don’t see it as an A-round. If you raised $2 million from two small seed funds I probably don’t either (although in the past I would have). But if you raised $3-5 million from well-known seed funds or from a VC and you’re asking for $8-10 million in your next round … that next round is a B-round no matter what we collectively decide to call it when we VCs fund you.
My personal definition of a seed round is a round where you don't give up any board seat or special power to investors. After a seed round you should basically work as usual (product, users, product, users, ... nothing else). By this definition, our round qualifies as seed. Managing a board takes time and energy and the more you can delay this, the better (from my experience).
That being said, this is a very subjective notion and everybody is free to have their own.
I think the piece you quoted shows quite well that the definition of a round doesn't matter to anyone but VCs. The important stuff is: an awesome product!
Congrats to the Wit.ai team! Not sure there are any other companies laser-focused like this on NLP + IoT.
@ar7hur The pricing page[1] shows that the Community (free) plan allows unlimited queries, but the Starter plan is limited to 250 queries per day.
Did you mean that queries to open instances remain unlimited, while the query limit applies only to the three private Wit instances? If so, I'd recommend another footnote on your pricing page to clarify this distinction.
Yes, open instances are free and unlimited. This is the cornerstone of our approach: we want developers to work together and share their training data. Natural language is very hard, and we need to join forces to crack it.
Thanks for the feedback, we'll try to make this easier to understand on the pricing page (yeah, natural language generation is hard for humans, too!)
Awesome, thanks! Also, the other ambiguity is whether the private-instance query limits are per instance or an aggregate total across all private instances.
Fun fact: "Wit.ai" can be pronounced just like "Witaj" in Polish, which means "Welcome/Hello". Dunno if this is intentional or even acknowledged by founders. ;-)
One of the founders is named Laurent Landowski. Could this be a Polish name? Polish also looks to be in beta for them, which is one of only a handful of languages they support.
Wow, congrats on the round! Looks like an amazing service. Will be trying it soon for a project I'm working on, Android speech APIs aren't quite cutting it.
Can this be run continuously from a Service on Android? Didn't see a mention of it in your docs, but I've yet to play with it.
I'm curious how you differentiate yourself from the built-in speech APIs on iOS and Android? https://developer.apple.com/library/mac/documentation/Cocoa/... http://developer.android.com/reference/android/speech/Speech... http://developer.android.com/reference/android/speech/Recogn...