IBM halting sales of Watson AI tool for drug discovery (statnews.com)
257 points by hprotagonist on April 18, 2019 | 169 comments



I have a friend who is an OB/GYN Oncologist in Indianapolis. Her hospital had IBM Watson Health on campus last year, and she mentioned to me in passing last month that they finally had kicked IBM off of their (learning) university's campus. When I asked her why, she said: "Often times, Watson would recommend courses of treatment that would be completely incorrect, if not detrimental -- or even sometimes lethal -- to the patient. It became more of a hassle than a learning tool."


These are only a few data points in an almost half-decade of promising healthcare solutions and failing to meet those expectations.

The MD Anderson audit is particularly bad: https://www.utsystem.edu/sites/default/files/documents/UT%20...


Wow, they didn't waste any time in that executive summary. Those bullet points are insane on their own.


Even ignoring the lack of deliverables, consider the sheer amount of financial trickery and lack of project oversight: money was pumped into it via philanthropy (instead of primary funding channels), which meant it bypassed multiple checks and balances, including review by anyone knowledgeable about software development.

Sounds like every other failed $50+ million government software project ever, which IBM is apparently an expert at. Except this money donated by private citizens was redirected from other cancer research projects and siphoned into a billion dollar company’s coffers.

Really shameful stuff.


Reminds me of this:

https://techcrunch.com/2009/09/02/netbase-thinks-you-can-get...

... granted this happened a decade ago:

> "Several of our readers tested out the site and found that healthBase’s semantic search engine has some major glitches (see the comments). One of the most unfortunate examples is when you type in a search for “AIDS,” one of the listed causes of the disease is “Jew.”


This reminds me of a bit in the first The Expanse book I read recently. The onboard medical computer had to be overridden because it calculated coldly that the correct course of treatment was palliative care.


Given the context of that scene, it was more an indication of how much characters A and B had been affected by the situation.


Yeah. Not trying to draw a parallel. Just reminded me of sci fi.


...in the TV adaptation anyway, it is one of the best bits of dark humor.


It's good we still have doctors to provide oversight for this sort of thing. Imagine a future where machines take over because they are thought to be smarter than doctors and then start killing people and their mistakes are written off because the people were already sick. Like an Elizabeth Holmes AI.


Does it make a difference if they perform better than human docs? The same thing happens now with human docs.


The difference is the expectation of fallibility with human docs. Which is why checks are built into the system.

I have zero trust in IBM to market their ML products correctly so that proper checks are maintained.

Especially since technically, explainability is still an active area of ML research.


>I have zero trust in IBM to market their ML products correctly so that proper checks are maintained.

That's a problem, yes, but it's not a new one. Vendor management has been around for decades, centuries, millennia maybe. If I contract out part of my job, it's still my responsibility to make sure the contractors are doing their job right. "But their marketing said..." or "but their sales guys said..." is not an excuse and everyone knows that.

Doctors noticing that Watson is wrong is expected. Doctors missing the fact that Watson is wrong is a failure of that doctor, and the doctor who didn't check the results is the responsible party. The checks don't come from Watson; the checks come from humans who oversee Watson.

If Watson is wrong often enough that it's hindering the doctors, then kicking it out is the right call. But there can never be an argument of "Watson got the diagnosis wrong and that's why the patient died" because ultimately IBM is still just a vendor and Watson is still just a contractor.


The difference is like:like vs like:unlike, and seems to be one of the more dangerous ML application challenges.

If I as a medical provider hire a remote vendor, who has medical teams in India look over initial results to flag issues, those humans will fail in human ways. I can anticipate that: I'm a human.

If I use a similar ML product, it's very difficult for me to anticipate (or even understand) the ways in which it might / does fail. Which makes it unlike my previous experience. Which gives it a fundamentally different risk profile.

It's the Boeing issue in a nutshell: the failure case that unfolded was unlike the scenarios the pilots were trained for. Unfortunately, in the two crashes they were unable to dynamically RCA quickly enough to solve the problem.

My point was that coupled with IBM's inept and inaccurate marketing, it seems unlikely the appropriate risk information is in the hands of those responsible for managing risk.

And honestly, if a system has unlimited failure modes, and I can't learn and limit them in practice, it's useless.

Because in that scenario I should be duplicating all the work it's done to ensure it didn't go off the rails. In practice (and guided by labor cost savings promised in the contract signed with management), that full verification doesn't happen (because the vendor is incentivized to recommend it doesn't), and people die.


It's a good point that it's a failure of management. It almost always is in situations like this.

I design and deploy customized automation systems for customers, and it's part of the standard process that we run the automation side-by-side with the old process for several months in order to learn the new failure modes and synchronize the process. Yes, for a few months we're duplicating the machine's work, but without the machine we'd be doing the work anyway. And no one is going to die if my automation fails, but we still do this anyway. It's crazy to think anyone would believe they didn't need to do side-by-side verification no matter what sales and marketing told them.

I don't know enough about Watson or IBM sales to say if Watson is good or bad, but I'm not trying to defend Watson or IBM. Watson may very well be a complete failure. But that aside, it's not the only failure in this story. No one should expect to implement a new tool and never verify if it's working correctly.


> Doctors noticing that Watson is wrong is expected.

Well as a doctor if I’m still ultimately responsible then nothing has fundamentally changed, this is just another tool, possibly one I’ll be forced to use by someone with not one day of medical training.

And medicine in the US is in a precarious position. Software engineers are generally not licensed, and they're not sued (it's virtually unheard of). It's different for doctors. There's a complex dynamic of removing autonomy from providers to ostensibly improve outcomes (and revenue cycle) while still holding those same providers personally liable.

So much of what I do is indirectly dictated by wonks in IT and billing, but they have virtually no liability. And one wonders why burnout in medicine continues climbing with no end.


>Well as a doctor if I’m still ultimately responsible then nothing has fundamentally changed

If your stethoscope fails and you mistakenly pronounce the patient dead because of that, who is responsible? Do you blame the stethoscope salesman for claiming it's an accurate medical instrument that you never need to second-guess? Do you stop using stethoscopes altogether because they're "just another tool"?

If your x-ray machine fails and you say the patient's leg isn't broken because of that, who is responsible? It's "just another tool", do you stop using x-ray machines?

If your physician's assistant measures the patient's blood pressure wrong and you never double check their work, do you fire all of your PAs? And go back to seeing every patient for every procedure yourself?

Everything you have is "just another tool" and as with any tool, it's up to the human doctor to interpret the output. The idea is tools make you faster and more accurate, but everyone knows tools fail so you need to be able to double check their work. If the tool is consistently inaccurate, sure, throw it away. But if your argument is "if the tool can't completely replace me it's worthless" I think you're selling yourself a little short there.

Of course you're ultimately responsible. Your stethoscope didn't go to college for 8 years, it's just there to make your job easier.


> the expectation of fallibility with human docs

Not just the expectation but the understanding. A doctor might very well forget which leg to amputate, so we know to Sharpie "NOT THIS LEG" on the one being kept. But a doctor is very unlikely to see a patient with a broken wrist and prescribe antipsychotics, so we don't do much to prevent that error. Human fallibility happens along fairly predictable channels, and we've spent a very long time committing resources to controlling those channels.

Watson, though, thought Toronto is a city in the USA. Anyone who's dealt with ML output knows that the errors are often quite surprising, even before dealing with adversarial inputs. Even in a system where Watson's outputs are subject to checks, the checks we have today are human-specific and developed at a significant human cost. ML answers can't just outperform individual human doctors to add value, they need to either be gracefully integrated with them or be able to outperform the entire system which keeps those doctors on track.


I don’t understand this argument (which comes up often). Isn’t the answer an unqualified yes? It’s obvious the current system has fuck ups. Are we supposed to be happy if a new system has double or triple the fuckups because “technology”? Like where are you going with this? These clinical decision systems have not proven they improve outcomes and we have no reason to believe they won’t make things worse.


"We successfully automated lethal incompetence" is perhaps not the most appealing of all possible USPs for a medical system.


The purpose of such a system should be to give leads for further research. If you treat them as leads and only leads, then bad leads don't matter much.

Unless, the leads are so bad on average that too much effort is spent on vetting them that could be put to better use elsewhere.


>The purpose of such a system should be to give leads for further research. If you treat them as leads and only leads, then bad leads don't matter much.

If the leads are worse than a random toss of a coin, then you can come up with leads on your own.

Not all leads are good. Inviting the village idiot to a brainstorming session won't be of much value -- and Watson is more like the village idiot than a valid lead-generation engine.


I doubt that's how Watson is being marketed to customers, though. IBM tends to oversell and underdeliver and has grown increasingly desperate with its shrinking market share in both hardware and services.


You can only live on spin for so long.


Unless you work as a consultant. Or in Washington.


Slightly over a year back, we were contacted by IBM to try out Watson under some program for startups. Starting out, they asked us to give them a try. We gave them an NLP-related task and they were confident they would be able to do a good job of it. After multiple meetings over the next 2 months, they barely produced any useful output. In the meantime, an intern with us got a pretty decent solution using just Python. Our management did get invited to lunches and meetings for selling them some more products. During one such invitation, since our CEO couldn't make it, I was sent. I was surprised by how shoddy their presentation was; I had thought that even if they weren't doing a good job on the problem we gave them, they would at least be slick on the selling side. Their presentation seemed like it was out of 2005 or something.


This is a common pattern everywhere unfortunately. Snake oil salesmen and their victims.

I have daily skirmishes at my company with our CTO and his subordinates who constantly want to believe snake oil salesmen because they’re looking for simple ways outs of hard problems. It’s really sad.

For some it seems easier to _trust_ a dude with a PowerPoint presentation and no real experience over good engineers telling the truth and trying to help them for real.

I now think it’s madness for anyone to be in charge of anything if they don’t have relevant experience in the relevant domain. Nice people or not, they’re susceptible to bullshit.


> I now think it’s madness for anyone to be in charge of anything if they don’t have relevant experience in the relevant domain.

This.

In support of that idea, here's C.A.R. Hoare, from 'The Emperor's Old Clothes', the 1980 Turing Award lecture:

'At last, there breezed into my office the most senior manager of all, a general manager of our parent company, Andrew St. Johnston. I was surprised that he had even heard of me. "You know what went wrong?" he shouted -- he always shouted -- "You let your programmers do things which you yourself do not understand." I stared in astonishment. He was obviously out of touch with present day realities. How could one person ever understand the whole of a modern software product like the Elliott 503 Mark II software system? I realized later that he was absolutely right; he had diagnosed the true cause of the problem and he had planted the seed of its later solution.'


What was the solution here? To learn the Elliott 503 Mark II more deeply?


Oversimplifying, but quoting from the speech, that particular challenge was overcome as follows: "Thus we muddled through by common sense and compromise to something approaching success."

He then discusses how the problems there led to his thinking on CSP, and, eventually, formal methods.

But, go read the speech, it's much more wonderful than any summary thereof: http://zoo.cs.yale.edu/classes/cs422/2010/bib/hoare81emperor...


Given the speaker, I'm guessing the suggestion is that the incident sowed the seed for him to develop Hoare Logic[0]

[0] https://en.wikipedia.org/wiki/Hoare_logic


You know, it would be very profitable for someone to profess an expertise in X, reach out to various companies to help them use X to solve their problems, then simultaneously gain experience in X while patenting the heck out of any problems solved with X.


That's roughly how I broke in to technical work, modulo the patents. 25 years later I actually do feel at home in particular domains. But I was a young college dropout in the early 90s, and while I never outright misrepresented my experience, I did tell myself that nobody else had any idea how that stuff worked either. (Which I tend to believe more now than I did then, but for different reasons...)

As programming and systems engineering slowly formalizes, it seems to me that's getting harder to do. I see fewer self-taught folks in their early-mid 20s these days at least, although I could be observing my personal bias from aging. Anyone else have an idea?


You’re describing a person who has a huge amount of expertise in sales.

But a person like that can very lucratively sell actual products that already exist, so why would they want to sell this mysterious X?


Because the person is actually a business that:

a) employs enough scientists who know what they're doing and lawyers who know what they're doing to file relevant and defensible patents; and

b) has enough entrenched relationships that they can pay for a sales force that knows what they're doing; but

c) doesn't employ enough engineers who know what they're doing to develop actual competitive products around those patentable ideas.


but products that actually exist are X-1 not X


This is exactly what IBM and Oracle etc have been doing for years.


What do you think of your CEO now :)?


Well, he was pretty good, he let us take the decisions completely so we never paid anything for it other than in time. Moreover, we weren't flush with cash or anything so they wouldn't have made much anyway.


Wow, I have been in this exact situation as an intern


This sounds like this is a common problem for Watson's go-to market strategy.


Python is included in the most recent versions of Watson Studio ML. Were Jupyter notebooks included when you originally tried the tasks? I’m curious why the included python didn’t work and the standalone did.


That's like saying book A is in Spanish and book B is also in Spanish, so why did you like book B better? Python is just a language, what you do with it is what matters. Sounds like they built something completely distinct from Watson and it did the job.


And that’s fair. I’m always curious with how people are using the same or similar tools that my company uses. It’s good to hear common struggles or differences in workflows.


I suspect it's not that the included Watson Python didn't or wouldn't work but rather that the Watson-Python costs money (via paying for Watson) and Python-Python does not.


Given your perspective, a more important question might be what underlying algorithms are being used? I’m only a rookie data scientist but knowing that (for example) a random forest outperformed a neural net in this case, or even just a set of heuristics, is solid information. These can be built in Python, R, even C depending on the application, developer experience and a bunch of other stuff.
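
For what it's worth, here's a minimal sketch of that kind of comparison in Python with scikit-learn, using a stock sklearn dataset as a stand-in (nothing here is specific to Watson or to any particular medical dataset; the hyperparameters are placeholders):

    # Rough sketch: compare a random forest against a small neural net via
    # cross-validation. Dataset and hyperparameters are placeholders.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    models = {
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
        # MLPs generally want scaled inputs, hence the pipeline.
        "neural_net": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0),
        ),
    }

    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")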


I'm looking at it from a couple of levels to support my data scientists. They're primarily python-based but we are adding newer data scientists who have experience with R. Algorithmically, we built a neural network for phenotyping.


The intern probably had direct access to the data and access to the people with domain knowledge inside the company.

The engineers at IBM were probably separated from the problem by many layers of bureaucracy (meetings, project managers, technical specs).

Even working inside a company, without access to the right data and the right people you aren’t going to get anywhere.


Python written by someone new to the industry and eager to please and make a place for themselves VS Python written by body shop consultants who work at a company famous for failing projects that go well over budget and often simply don't work.


I sincerely hope IBM's failures do not deter further progress in trying to innovate in healthcare technology. I believe healthcare vitally needs GOOD/USEFUL technology and AI applications because medical progress is not advancing quickly enough by several metrics.

I hope IBM continues to fail and becomes a case study in the failure of massive bureaucracy.


IBM will not fail; Oracle will fail before IBM, and that's unlikely to happen.

You see, they are NOT tech companies. They have something you would call "tech", they sell something that looks like "tech" but they really solve problems people with access to large amounts of money have.

In other words, they produce cushy and warm beds for C-level execs to have a good night of sleep, and everyone needs some bed to sleep in every night.

Technology is a risky and volatile business and IBM has more than 100 years in business.


Spot on.

IBM really does capitalize on the adage of nobody being fired for buying IBM. Their biggest strength has been in their marketing rather than actual technology.

Note that, being a large org, they do have teams doing excellent work too. IBM research, some of the cloud, design etc. teams seem to be doing good stuff. But most of it is still overpromising and underdelivering while charging clients a lot of cash.


This is less true than it used to be. Oracle/SAP are largely recognized as multi-billion dollar technology quagmires these days; integration has become so much easier that cobbling together a bunch of SaaS solutions to replace a centralized ERP is a viable approach.

Likewise, IBM has become known as a vaporware vendor. They combine the worst parts of management consulting with an offshore / outsourced development process, so I guess it’s not a surprise they ended up like all the other companies that sell business tech solutions (Infosys, Wipro, etc). They’re often mentioned in the same breath these days.


IBM has been banned from our data center for several years. The decision was before my time, but the reason I've been told is that they were overly expensive, unreliable, and provided poor support compared to other vendors.


In my experience, IBM's support is excellent, bordering on fanatical. It's just that their hardware (in this case, Spectrum Scale and Power9) is preposterously overpriced. Same for their software, if not more so.


I think these two statements can be reconciled:

- IBM's hardware support is excellent to the point of being obsessive

- IBM's support for anything that is not a hardware problem would be better-implemented by a 5 year old eating paste


As a former IBMer I really like the POWER hardware. It is just a shame that it is made by IBM. Like you said IBM stuff is just too damn expensive.


Right, I should have pointed that out: the Power hardware is truly incredible, especially the fast I/O to GPUs. There's a reason why the national labs and Google have gone with that architecture.


I know about it from the Raptor PC team and their FOSS-compatible microcode. Would you say it holds its own performance/reliability-wise with competitors?


That's why the average company on the S&P 500 only survives 10 years, so 75% will be wiped out in 10 years. At the time IBM was founded, the average company's stock lifetime was 70+ years. So maybe IBM won't die soon, but their customers will with this attitude. Every company is a tech company today.


It almost feels like these two corporations (IBM and Oracle) are giant marketing, consulting and legal firms more than technology firms.


In the startup world there is a lot of customer obsession. What is it about these big companies that lets them de-prioritize that and still thrive? Honest question.


They have customer semi-obsession, but only for a select set of big customers (most of the S&P 500 and governments), and I can't think of a company that offers what IBM does (the full spectrum, not just a subset).

So after getting in, they don't need to be good, they just need to not be too awful.


I think there has to be tougher scrutiny for the tech we apply to healthcare. Those of us in the healthcare technology community have to have the guts to stand up and call out companies like Theranos. If we don't, investors will come away thinking that healthcare is overly complicated and won't invest, and healthcare stakeholders will come to the conclusion that we don't actually care about their business practice or, more importantly, patients.

We need to prove them wrong on both these accounts.


How about easier barriers to entry with tougher reporting requirements? I personally fear the FDA is too optimized for bad headline avoidance to the detriment of allowing more experimental leeway. Not saying it should be the Wild West, just suggesting that perhaps the risk tolerance is currently too low. What if it were easier to perform more experiments, but if you hid/covered up any bad data there would be hell to pay?


I think the current system is working now, it just needs enforcement. 510(k) applications (where you need a preexisting predicate for a class 2, i.e. low/moderate-risk, device) certainly seem to be doing their job. Lots of healthcare startups (including those using AI algorithms) have used that path to successfully submit to the FDA for clearance.

That said, there's a lot of bunk science marketed to consumers within healthcare. Your average consumer hears microbiome and genomics and AI and thinks that the product will cure them of disease. I don't think it's the role of the FDA (until they make specific health claims), but we need to remind consumers to be smart purchasers of these services. And those of us in the healthcare tech space need to be careful who we include within our communities, to ensure we share the value of positive patient outcomes.


I work at a startup that uses machine learning for medical purposes (junior position). My biased impression is that the interest in this tech is growing rapidly and not even IBM can ruin it. IBM's problems might even be good in a sense if they lead hospital networks to experiment more with the offerings of up-and-coming companies that are more reliable.


EconTalk from last week[1] shed some light on the convoluted world of drug pricing and pharmaceutical patents. While I agree with your point that AI could help with medical progress, I would also emphasize that we're being held back by meat-space institutions as well (in this case, drug patents). One of the key takeaways was looking at the return on marginal investment in gaming the patent system vs developing new drugs. As firms achieve success, they become conservative, wishing to hold onto current profit levels. The patent system, as formulated, enables the drug company to make a meaningless change to a drug and receive an extended period of patent monopoly. This keeps out generics, raising prices, and reducing investment in drug research.

[1] http://www.econtalk.org/robin-feldman-on-drugs-money-and-sec...


All the preclinical work is actually the cheap part of pharmaceutical development. Most of the costs come from the clinical trials for the FDA. Now a lot of the reason for this is that the US spends a pretty penny (and a lot of elbow grease from grad students) on basic biomedical research that's published into journals for "free" from the perspective of the pharmaceutical corporation.

That said, if you can make clinical trial reporting data collection more efficient, there's a LOT of money to be made.

https://www.nature.com/articles/nrd3078


It is a half-baked truth. Pharma wants you to believe trials are expensive, so they inflate their costs crazily to price their drugs high. But in the end, when you try to sell them software, they don't look at the inflated costs of trials, be sure of that.


Costs don't drive prices. Demand does. I'll be the first to admit that pharmaceutical economics is not simple (lots of externalities, asymmetry of information, etc.), but in the long run costs do matter, and so does price. The best example is Sovaldi, which initially sold for ~$100,000 per treatment. It has come down a lot (~$65,000) since Merck came online with their hep C competitor. Prices do matter, and while the market is easily distorted, it's not immune to the laws of economics.


The week before that was even more incisive http://www.econtalk.org/jacob-stegenga-on-medical-nihilism/


AI has been applied to healthcare technology since the 1970s, like https://en.wikipedia.org/wiki/Mycin and https://en.wikipedia.org/wiki/CADUCEUS_(expert_system) .

One failure by one company, even IBM, won't stop future attempts.


There was even an Artificial Intelligence in Medicine Symposium held in 1979 -

https://blog.jacob.vi/an-80s-throwback-artificial-intelligen...

In response to the above blog - Watson's early small-scale success clearly hasn't scaled to bigger and broader applications.


Watson is more a marketing term than a technical connection between products.

I invite you to test "Watson Tone Analyzer" https://tone-analyzer-demo.ng.bluemix.net/ :

- "I like this product." => "this is an analytical opinion with neutral emotion."

- "I like it" => "Tentative, 50% happy answer."

- "It's not a bad product." => "analytical".

- "It's not a bad tool" => "joyful answer".


> Watson is more a marketing term than a technical connection between products.

As far as I can tell there's absolutely no connection between the products being labeled as 'Watson' and often times very little machine learning taking place either.

I've interviewed several candidates from IBM over the last ~3 years and almost all of them worked on something with 'Watson' in the name. When we whiteboarded out the architecture of what they worked on it was mostly automation with _maybe_ a touch of machine learning thrown in by a module written by someone else.

Very few of them were able to pass the technical interview.


I tried it with the first phrase on the page "This service uses linguistic analysis to detect joy, fear, sadness, anger, analytical, confident and tentative tones found in text."

response: Fear: A response to impending danger. It is a survival mechanism that is a reaction to some negative stimulus. It may be a mild caution or an extreme phobia.


- "Spiders are not insects." => "fear"

Exactly what I expected, and also flat wrong.

- "There's a monster under my bed." => "joy"

Fascinating.


You'd think they'd pair it up with a naive Bayes classifier at least.
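
For reference, a naive Bayes baseline of the sort suggested here is only a few lines with scikit-learn. This is just a sketch: the tiny training set is made up for illustration, a real baseline would use a proper labeled sentiment corpus, and it says nothing about how Tone Analyzer actually works internally:

    # Minimal bag-of-words naive Bayes sentiment baseline (illustrative only).
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train_texts = [
        "I like this product.", "This tool is great.", "It's not a bad product.",
        "This is terrible.", "I hate this tool.", "Completely useless.",
    ]
    train_labels = ["positive", "positive", "positive",
                    "negative", "negative", "negative"]

    # Include bigrams so simple negations like "not bad" have a chance of
    # being captured as a single feature.
    clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
    clf.fit(train_texts, train_labels)

    print(clf.predict(["It's not a bad tool", "I like it"]))

Even a baseline like this mishandles negation unless you add n-grams or more data, which is part of why the sentences upthread make such a useful smoke test.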


Thanks. I tried it myself with text examples similar to yours. The answers are a bit bewildering.


Slowly, I am noticing that people are starting to question the abilities of machine learning systems and, in particular, seeing a lot of the transparent AI-washing that is now going on for systems that aren't AI at all. This is one of the most positive changes in the tech industry, honestly.


Machine learning and neural net systems seem to be the inverse of the "uncanny valley", call it the "miraculous hill" if you will.

They're just good enough to, with insane amounts of computing resources, churn out economically viable results, but in my opinion they're the biggest setback in actually developing intelligence or cognition in a long time.

There's very little insight to be gained from what they pick up, and they function purely in a stochastic sense. Even the best ML algorithm has no ability to reason at a high level or produce counterfactuals, it works purely by correlation and still, in those 1% edge cases, will be as helpless as anything else.

That might be good enough when selling advertisements, but in automated cars and healthcare treatment, this sort of failure is not an option.


ML and neural nets work surprisingly well on a lot of boring but important use cases. Optimizing online ad placement might be humdrum but it's the foundation of a $100B market in the US alone. [0]

In my opinion the real hype is that ML has become so popular that new practitioners tend to forget other analytic techniques like SQL data warehouses. Interestingly, these are starting to absorb ML capabilities like logistic regression, which are now accessible through SQL and can benefit from MPP and vectorized query execution in DBMS types like ClickHouse, Vertica, and Google BigQuery.

[0] https://www.forbes.com/sites/danafeldman/2018/03/28/u-s-tv-a...

Disclaimer: I work on ClickHouse.


https://i.imgur.com/FXi6NXb.jpg

According to the trend, we're approaching another AI Winter


I've seen startups with zero AI (beyond rule based algorithms) call their product AI because it attracts investors. So I can vouch for that.



Slightly unrelated, but I hope IBM doesn't ruin Red Hat and CoreOS. I would love to see the CoreOS tools to gain more adoption.


I'm a long time Fedora user. I love the sane defaults. It's just a solid workhorse. But I've been feeling pretty uneasy about using it lately...


Alternatively, a lot more IBMers will be able to use Fedora and it may improve even more. I don't see them having a strong reason to do any damage there.


Maybe? IBM is mainly a Windows and Mac house for dev machines, with Ubuntu being a very, very small portion of workstations.

I’m not sure the acquisition will create a mass exodus from the status quo there.


I work with a group using machine learning for drug discovery (I'm not a biologist/chemist) but the bio people around me loved to talk poorly about Watson's drug discovery tool.

Lots of focus on the algorithms in the comments here, but from what I could glean they generally lacked domain experts when developing the datasets... we spend 90% of our time just finding the best data... and even then it's tricky. I think they may have had lower standards for the input into the system... garbage in, garbage out.


A lot of these AI companies' products are really terrible. Has anyone ever tried the AI API models from Clarifai? Just so inaccurate. It seems like a scam. I've also had a really bad experience with Watson's speech-to-text APIs.


I was an early clarifai employee and although I can't speak for what's there now, up until 2-3 years ago they had one of the most accurate models available (with google being the only real competitor).

The generally available models will almost always be suboptimal due to the difference in the data that they're trained on vs the data that clients use it on. That's why most of these AI companies end up doing a ton of consulting and build specialized models for larger customers.


An AI bubble poppage is around the corner. The actual revenue of AI companies does not justify their stock price.


Watson’s is awful. Google’s default models are by far the best followed by Microsoft at a distance. Then, many of them are so bad that it is laughable.


Do you have any insight about AWS AI?


I tested most of the solutions a year or two ago before AWS rolled out Amazon Comprehend. I found that Google's API combined with some code and open source packages ended up being additive for entity recognition, classification, and semantic text. Unsurprisingly, Google's custom search APIs are also worlds better than the other places, as is its Maps API. I haven't tried any of the image classification APIs, but I won't be surprised if Google wins here too.


Haven’t tried it actually yet.


Deep learning and machine learning don’t work. Quantitative math will always prevail, as it always has. Unfortunately, mathematical research isn’t there yet. We don’t have models for vision, audition and linguistics. Neuroscience and psychology are in their infancy, a good analogy would compare these fields to where physics was pre-Newton, Galileo era of understanding. I suspect that in the decades to come, these fields will influence mathematics the same way physics influenced calculus. Physics historically had a huge influence on math, in the coming century it will be neuroscience and psychology, in linking brains to behavior, and the quantitative laws that allow brains to give rise to minds.


I guess it depends on... what "work" means. So I worked on Deep Networks for quantum chemistry (I'm not a physicist or chemist but) I can tell you people were ecstatic about the possibility that the approximations that the neural nets come up with might get closer to real physics than current approximations right now. This w/o any needed advancements. Some challenges are so difficult in these areas that approximations are the best that are possible. It's kind of similar to drug discovery now.. like if there are models which can help narrow down potential molecules / targets, that has tremendous potential even if the system needs to be double checked by a person. So it's hard to see "don't work" as anything but buzzy. BUT I will agree with you neuroscience will help develop our understanding of cognition.

I just wanted to blast these other applications, because I think people get this idea that AI has to be "real" AI for anything interesting to happen... but there are really niche applications where people don't treat these tools as experimental. And what you describe may already be happening; Geoff Hinton's critique of modern deep nets seems to be a call to get more biological (thinking of capsule nets).


>Deep learning and machine learning don’t work. Quantitative math will always prevail,

I have a neural net onboard my phone which automatically detects songs offline and tells me what they are. Is that semantically 'quantitative math' and not machine learning?


Which app is that? Most of the well known music identification apps like Shazam use acoustic fingerprinting to identify songs. They work well without using neural nets or deep learning. What benefits does your neural net based app offer over this well known approach?

https://en.m.wikipedia.org/wiki/Acoustic_fingerprint


It uses both - the neural network creates its own acoustic fingerprint database, which it then uses to perceive sound. This shrinks the traditional acoustic fingerprinting data enough to be stored on mobile devices, while allowing for a low-power "always-on" identification offline.

It's analogous to a human being able to identify songs by remembering the chorus, just that the NN uses its own features for both the memory and offline perception.

>In 2017 we launched Now Playing on the Pixel 2, using deep neural networks to bring low-power, always-on music recognition to mobile devices. In developing Now Playing, our goal was to create a small, efficient music recognizer which requires a very small fingerprint for each track in the database, allowing music recognition to be run entirely on-device without an internet connection.

https://ai.googleblog.com/2018/09/googles-next-generation-mu...


The biggest issue I have is: "To do this we developed an entirely new system using convolutional neural networks to turn a few seconds of audio into a unique “fingerprint.”

Why did you pick a neural network? What mathematical properties does a neural network have that make it appealing for this problem? How were the networks trained? Back propagation? It doesn't converge, and worse, learning weights for a new batch can cause you to forget previous batches. This isn't a desirable property of neural networks or back propagation. You probably had a lot of heuristics on top, fine. How do you know that the weights you ended up with will always work in practice? Given an arbitrary track, can you encode it? What about growing the database? Does the neural network get updated for new songs, or do you use the same neural network to fingerprint new songs and update the database?

Here's how I would have done it:

A song file is just a sequence of amplitudes. I would do some kind of interpolation with a piece-wise trig function. Trig functions have very desirable properties: they are continuous everywhere, and infinitely differentiable. Moreover, a sine basis decomposition will be able to reconstruct the original signal very well. This is great, because now you can use theories from DSP and Fourier analysis. So we take the entire song and do a continuous-time discrete cosine transform, in a block size of 32. Now you compute the square norm of all feature vectors, sort them, eliminate the vectors that are within a 1e-3 radius of each other (they are too similar to each other; there's no point in keeping them) and only store the top 25% of feature vectors by the square norm. The 25% cut-off threshold and 1e-3 radius of similarity are heuristics, and adjustable parameters.

Now you have a database. For a new song, repeat the procedure, and get a feature vector for every block of 32. There are probably theories in DSP you can use to get a better similarity measure, but for now we'll just use the L2 norm of the difference. Do a nearest-neighbour search in your database for all feature vectors, and rank the results based on hits. I can run all of this on a computer from the 2000s, which is crappier than modern phones, and have the entire backend run on equally crappy hardware too. All parts of what I'm doing are fully deterministic, updating the DB is incredibly fast, the CTDCT is super fast, there are no questions of convergence, no need for training. You can probably increase the accuracy and speed by doing some DSP and doing the nearest-neighbour search based on different voice, bass, instrumental etc. features.

In practice, how would it compare to your neural network? No idea, but I imagine it should be very competitive. The big benefits are that you have only 3 parameters (radius of similarity, cut-off threshold and block size). This seems very easy to benchmark against; it should take like a week to implement. I'm not sure about the compression of the fingerprint, however. Not sure how much space 1,000,000 songs will take (probably 25%, since that was our cutoff). You can probably borrow psychoacoustics to make a better database and get a more compressed representation. Another alternative would be to downsample the song to 64kbps beforehand.
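
To make the proposal concrete, here is a literal (and untested) Python/NumPy sketch of the above, reading the "continuous-time discrete cosine transform" as a plain block-wise DCT. The constants and helper names are mine, not from any real system, and note that Shazam's published design uses spectrogram peak pairs rather than this scheme:

    # Sketch of the idea above: block-wise DCT features, near-duplicate
    # removal, a top-25% norm cutoff, and nearest-neighbour matching.
    import numpy as np
    from scipy.fft import dct

    BLOCK = 32          # samples per block
    SIM_RADIUS = 1e-3   # drop vectors closer than this to one already kept
    KEEP_FRAC = 0.25    # keep the top 25% of vectors by squared norm

    def fingerprint(samples):
        """Return the retained DCT feature vectors for one track."""
        n_blocks = len(samples) // BLOCK
        blocks = np.asarray(samples[:n_blocks * BLOCK]).reshape(n_blocks, BLOCK)
        feats = dct(blocks, axis=1, norm="ortho")
        # Keep the strongest quarter of blocks by squared norm.
        order = np.argsort(-np.sum(feats ** 2, axis=1))
        feats = feats[order][: max(1, int(KEEP_FRAC * n_blocks))]
        # Greedy removal of near-identical vectors.
        kept = []
        for f in feats:
            if all(np.linalg.norm(f - k) > SIM_RADIUS for k in kept):
                kept.append(f)
        return np.array(kept)

    def best_match(query_samples, database):
        """database maps track name -> fingerprint array; rank by mean L2 distance."""
        q = fingerprint(query_samples)
        scores = {}
        for name, fp in database.items():
            # Distance from each query vector to its nearest stored vector.
            d = np.linalg.norm(q[:, None, :] - fp[None, :, :], axis=2).min(axis=1)
            scores[name] = float(d.mean())
        return min(scores, key=scores.get)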


I agree with spaced-out. A neural net can capture all those smaller eigenvectors in the signal that are routinely thrown away during traditional feature engineering, like what you describe. When the number of training samples grows big enough, those factors with marginal contribution become significant and allow higher levels of accuracy in prediction or classification than are possible when curating features manually.

Deep nets are here to stay. They're just not magic bullets that solve all problems equally well, especially those when training data is minimal.


> A neural net can capture all those smaller eigenvectors in the signal that are routinely thrown away during traditional feature engineering

What on earth are you talking about?

>Deep nets are here to stay.

Maybe in silicon valley for consumer products in things like snapchat and siri. They won't work for industrial problems.


You'll never be able to develop features with the heuristic methods you described that will work as well as the features learned by a neural net.


Huh, a quick Google search gave me: https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf

This was a paper from Shazam from 2003. This is essentially what I proposed, there is no training. Shazam works pretty well. It's not even going into the mathematical consideration I went into.

>You'll never be able to develop features with the heuristic methods you described that will work as well as the features learned by a neural net.

False.


Quantitative math, or applied math isn't based on fitting data to an arbitrary mathematical structure. It's looking at real life, and deriving the mathematical laws that govern what you see. You could have a neural net predict planetary motion. However, it doesn't know jack shit about physics.

>I have a neural net onboard my phone which automatically detects songs offline and tells me what they are.

MP3 uses something called psychoacoustics, which is a quantitative model of human perception, used to eliminate frequencies that can't be heard according to that model.

Your neural network doesn't tell you what features make songs distinct; it's not a quantitative model at all, but a black-box heuristic about what the important features are, superficially. If actual mathematicians worked on this problem, I guarantee you they'd do a better job, and their models would work on a Commodore 64, with real-time training. Moreover, it would tell you things like who is singing, whether it's a live performance, and which concert it was.


" If actual mathematicians worked on this problem, I guarantee you they'd do a better job"

No, this is wrong.

Some of the most brilliant people in the world have been working on image recognition, voice recognition etc. and AI is crushing all of their work.

"Your neural network doesn't tell you what features make songs distinct, it's not a quantitative model at all" - it doesn't matter at all if our objective is detecting the song. Neither does the mp3 compression algorithm.


>Some of the most brilliant people in the world have been working on image recognition, voice recognition etc. and AI is crushing all of their work.

This is very true. I take my stronger statements back, MAINSTREAM mathematicians attempting this problem are all wrong, and have been wrong for 50 years. But you do need the right theory, and the right math that realizes this theory.

"AI" is superficially beating the work in computer vision. Computer vision is complete bogus. The gabor filters, fourier transfroms etc. are all wrong conceptually. The known methods do abysmally on basic tasks like object recognition, texture segmentation etc. But they keep trying it.

I would take this one step further: computer vision, audio and NLP researchers have been stuck in a rut for the past 50 years. DL is beating THEIR math, but this is because of data and computation speed, not because of any insights. But DL is also wrong, and giving you an illusion of progress. Both of these things are doomed to go the way of GOFAI.

I can go into great detail and carefully explain why MAINSTREAM contemporary ideas in math for vision, audition and language are completely wrong, and have been wrong for 50 years. What is the right model? Like I mentioned before, the right ideas are emerging, neural networks will dominate, just not DL.


Ok so who are the real, non-mainstream mathematicians who would do better?


> It's looking at real life

Collecting observations aka data.

> deriving the mathematical laws that govern what you see

Fitting a model.

> Your neural network doesn't tell you what features make songs distinct

It literally learns better features than you could ever come up with by hand. This is why CNNs do better in computer vision than hand-engineered filters.

> I guarantee you they'd do a better job, and their models would work on a commadore64, with real time training.

LOL if you think that a room full of people can listen to TBs of audio data, decide what mathematical functions when combined together are better descriptors of that data than a DL model learning its features.

You don't have the slightest clue what you're talking about.


>If actual mathematicians worked on this problem,

This is a No True Scotsman. Actual mathematicians did work on this problem, training the neural network to achieve its target task of identifying songs using minimal power and storage consumption - which works.


> Deep learning and machine learning don’t work. Quantitative math will always prevail, as it always has.

What do you think machine learning is, if not “quantitative math”? Deep learning is just linear algebra and calculus, and things like random forests are even simpler mathematically.


Machine learning is glorified curve fitting. DL isn't even mathematically sound; back propagation has no proof of convergence. Quantitative math is about extracting natural laws and mapping them to mathematical structures. You could use DL to predict planetary motion, and get pretty good at it. But this isn't a quantitative understanding of the world. You didn't learn anything. Physics, in contrast, has the laws of motion and gravitation. You can directly model arbitrary planets. Moreover, you can model arbitrary rigid bodies, from cars to space shuttles. Your ML, DL, random forests etc. all use math, sure. But so did the Keplerian models of motion. You aren't qualitatively deducing the math that governs the world, but forcing an arbitrarily chosen mathematical structure onto your data.


If we’re throwing out anything that doesn’t have a proof of convergence as “not mathematically sound,” you can kiss fluid mechanics goodbye, as well as lots of other subfields of physics that rely on partial differential equations.


If you mean for medical diagnosis, maybe, but you do realize that NN/AI is totally state of the art for many tasks? NLP, image recognition etc. ?

AI is in a hype bubble right now surely, but it's a 'very real' thing that's going to infiltrate a lot of areas.


>you do realize that NN/AI is totally state of the art for many tasks?

Being state of the art doesn't imply that these things will solve these problems. In ML terms, how do you know that NN/AI isn't a local maximum that we need to jump out of? All NLP systems are a joke. Sure, replace Watson with DL; it might perform better on Jeopardy. But in real conversations? Forget it.

I wouldn't bet on these things. NN will win, but not the back propagation, ReLu, sigmoid or whatever pseudo science that is the current buzzword. There is 50 years worth of understanding in actual neuroscience and cognitive modelling that no one has paid attention to, and new design principles are emerging that will influence mathematics.


Because DL isn't great at NLP (and may never solve "real conversations" aka AGI), it's worthless?

It's the best performing tool we have for NLP, image recognition, etc. Is it a local maxima? Probably. But it's out there solving real problems nonetheless. We'll capture all the gains we can and then move on after.


"Being state of the art doesn't imply that these things will solve these problems. "

I suggest you are misinformed about the state of AI.

AI is currently ahead of all other approaches in many fields.

It's led to quite a number of practical advances and breakthroughs.

The 'best examples' are those that I described, but there are many more.

Your comments indicate I think some ignorance on the issue - I think I see the point you are trying to make but I also submit that you're not aware of what AI is doing today.

'Self driving cars' would be impossible without AI today, for example. The vision systems depend on AI; it's a breakthrough without which we simply wouldn't have the tech.


What if I don't care about general intelligence? Honestly it sounds more like a burden than a benefit. A local maxima might actually be exactly the type of solution that I need, especially in cases where no satisfying solution currently exists.


I agree that DL/ML is destined to fail in domains like this but can you expand on this reasoning? What exactly do you mean by "quantitative math" (I haven't heard this phrase used in this way before)? And what were the equivalents of DL/ML for physics before calculus?


Quantitative math might be the wrong word; maybe applied math? In quantitative finance, you make quantitative models about the world, and build math that realizes those assumptions and understanding. A simple example: options are a great financial trading instrument that you can model mathematically, the simplest model being Black-Scholes. You can imply things like the volatility of the stock price from the price of the option, to get a better understanding of what a risk-neutral market is thinking, and compare that to the actual market distribution.
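
As a concrete illustration of that last step, here is a minimal sketch (with made-up numbers, not real market data) of backing out implied volatility from an option price using the Black-Scholes call formula and a simple bisection search:

    # Black-Scholes European call price, and implied vol via bisection.
    from math import log, sqrt, exp
    from statistics import NormalDist

    N = NormalDist().cdf  # standard normal CDF

    def bs_call(S, K, T, r, sigma):
        """Black-Scholes price of a European call option."""
        d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
        d2 = d1 - sigma * sqrt(T)
        return S * N(d1) - K * exp(-r * T) * N(d2)

    def implied_vol(price, S, K, T, r, lo=1e-4, hi=5.0, tol=1e-8):
        """Find the sigma at which the model price matches the observed price."""
        # The call price is increasing in sigma, so bisection converges.
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if bs_call(S, K, T, r, mid) > price:
                hi = mid
            else:
                lo = mid
            if hi - lo < tol:
                break
        return 0.5 * (lo + hi)

    # e.g. a call on a $100 stock, $105 strike, 6 months out, 1% rate, quoted at $4
    print(implied_vol(4.0, S=100, K=105, T=0.5, r=0.01))

That implied number is the "risk-neutral market" reading mentioned above; comparing it to the actual market distribution is where the modelling judgment comes in.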

>And what were the equivalents of DL/ML for physics before calculus?

This is a good question. Before Newtonian calculus and the laws of gravitation, people were building very complicated conic models (i.e. ellipses, parabolas etc.) to get better and better predictions of planetary motion. A lot of parametric math came out of this, with many sophisticated models getting better and better, giving these astronomers an illusion of progress. However, Newton's insight was that motion is connected to mass, and this insight was the basis for deriving the laws of motion, which gave us the law of gravitation (F = Gm1m2/r^2). This insight eliminated the previous Keplerian models of motion, because you were now able to predict the motion of arbitrary rigid bodies using very simple math (we teach this in high school). Of course, Newtonian motion has its limitations; that's why we have quantum physics and Einstein's relativity theory. But for practical technological applications, Newtonian physics on its own gets you incredibly far.

Where is ML/DL? It would be akin to Keplerian elliptical motion. More realistically, however, it's closer to the aether theory of light, and will go the way of GOFAI. This stuff isn't grounded in modelling any scientific observation. Moreover, it is mathematically useless. Back propagation doesn't converge, and why should you fit your data to an arbitrary mathematical structure? In practice, DL/ML doesn't work at all; you will be much more successful by modelling your problem mathematically. For example, consider an automobile manufacturer, which has all kinds of moving parts in its products. They typically model each part mathematically (i.e. gear x undergoes exponential time decay), and imply its parameters using rigorous test data. Then you use some sort of empirical statistical model to predict the failure.

I've seen deep learning companies come and fall flat on their face trying to beat the accuracy of these deterministic systems. Those guys needed a lot of data, and GPUs. I'm not even criticizing the fact that DL is a black box. It's worse, it's inferior to everything out there on every metric imaginable. These mathematical models in contrast have been in production for decades, with yearly updates, and they run in real time with little historical data, they are fully understandable and they beat every method we know of.

This isn't the first time multilayer perceptrons gained hype. They didn't work in the 80s, or the 90s, or the 2000s; they don't work now. The math behind DL is the same we had in the 80s, they just called it the multilayer perceptron. None of the ideas in modern ML/DL are new, not reinforcement learning, GANs, or the rest.


1. Black-Scholes works in a lot of cases but is an approximation: it has edge cases where it does not fare well...

2. Likewise Newtonian physics is also an approximation: it does not fare well near relativistic speeds or high gravity. But at least we have models which seem to be accurate to many decimal places today. Who knows what the future may hold.

3. Not all useful problems can be represented by simple equations, but they can still be computed numerically (e.g. the N-body problem).

4. Ultimately DL is popular because it works better than anything else in some very specific domains like speech recognition and image recognition. It is overapplied I'll admit, but if you can do better then feel free to publish a paper.


I'm pretty sure it's all a big scam, at least the crap that comes out of IBM.


+1


Yeah their speech recognition API was by far the worst of the ~5 I tested a few years ago. Like, almost as bad as Sphinx.


It seems that, just like crypto, the 'AI' hype cycle is coming to an end with zero achievements to show for itself beyond puffery and premature speculation about radiologists, drivers and others losing their jobs.

And it's not just hype; like crypto, the AI hype stepped well beyond the line into outright fraud and deception, with tech folks trying to pass off backward-looking pattern matching as 'intelligence' and hoping no one notices. Every single commentator here knows there is nothing in computing or software engineering today that will allow one to 'create' an 'AI' as the world understands the term, yet no one questioned pushing intentionally deceptive communication.

The end result of this dystopian scaremongering amounting to nothing is that there is now zero credibility and extreme suspicion of problematic in-built bias. At the minimum there must be some standards for machine learning solutions to be thoroughly transparent, open to verification and exhaustively tested for racist and sexist bias before any rollout, for anyone who cares about the impact of their work in the real world.


There is no "ai", it's the same bull crap we had in the 80s. We haven't had true innovation in this space for a very long time, the current hype was caused by GPU, cloud, and abundance of data for training.


How is Watson not the butt of jokes in the tech industry by now? Seriously, never heard anything that didn’t end up being smoke and mirrors.


I can’t read most of that article because it’s behind a paywall, but wow, is Watson just the largest attempt to sell vaporware of all time?? They had a super bowl ad!

A few friends from university were hired into the Watson team as hardly technical PMs. I have never heard any of them describe what Watson is or does.


I've commented this before, but Watson is nothing but branding for anything IBM does that is somewhat intelligent. There's pretty much no common technology between the different Watson products, so there's no 'core Watson' like many people believe.

Some products with Watson branding are great and industry leading. Others suck. The latter tend to appear in headlines and severely impact the brand they established with Jeopardy and their ads.


I suspect what happened was that Watson started out as a single product or focused suite of products. I'm just guessing, but then as the marketing induced hype started and growth/results were not on target , they realized there was a lot of interest in Watson so IBM pivoted and started rolling everything under the Watson brand - products, consulting, cloud stuff, etc. Basically that way they could say Watson was a "success" and internally/externally it would be opaque as to where exactly the profits and losses were.

I say all this because I'm at a place now where this is happening. Huge company with a supposedly game-changing product that is mediocre (at best). Massive marketing campaign for "Product X" that is 100% buzzwords and 200% BS. After constant missed revenue targets and product disappointments internally and externally, the company is clearly pivoting (but not saying so) to rolling everything under "Brand X."


I worked there (Watson Health) a few years back. All that happened is: IBM. That's right. They bought us (a small start-up), and this is what they did to us in a year:

1. Replaced our managers with their own, who didn't get anything, kind of old-guard executives taken away from the mainframe side.
2. Banned remote work (this cost them a few brilliant engineers).
3. Opened one huge open space full of noise and chatter, I mean the sales team next to the development team, etc. (this cost them even more headcount among experienced developers).
4. For months we did nothing! There were whole teams doing nothing for months on end. I wrote zero lines of code over a 6-month period. Why? Because management didn't know which direction should be taken... and they kept hiring too! They kept hiring new developers when the ones already in place had absolutely nothing to do!

This is from my (simple developer) perspective. Not sure how it looked in sales, among executives, etc. But it was a very weird environment.

I had a great manager who asked me to learn React and take courses in React (mind you, this was two years back!), as we "might want to do something in React in the future". So I basically spent my last six months there learning React, aka preparing for job interviews... they even got us paid courses and all. I mean... IBM.

And once they fired me (these were lay-offs, thousands affected, many of them, like myself, just hired in the past year), I was paid severance too. I went there, worked a year, spent the last 6 months learning for job interviews... fantastic pay too. IBM is crazy.


This is the normal big-corp thought process. The CEO decides on a growth area, gets some yes-men to agree on a growth trajectory, and applies some standard investment/hiring metrics to make sure they stay on top of the designated curve.

This all rolls down a few levels of management to the first-line guys, who are told: you're in a growth area, and we expect to need to hire X people over the next 24 months, who will be tasked with XYZ (frequently fancy words which, when analyzed, boil down to "support the product we are going to sell").

This goes on for a couple of years until the divergence between projected and real revenue is so large that even a CEO can't ignore the lack of growth. At which point the plug gets pulled and the next adventure starts somewhere else.

Sadly though, IMHO none of that is the real problem. The real problem is that the CEOs can't actually tell, or make strategic decisions about, why these projects are failing to produce exponential growth (see Intel and mobile chips/wireless connectivity; those are so strategically necessary for their growth that they need to keep trying until they die). So they ax them, sometimes just as they are getting a solid product portfolio together. But they don't know that, because they have been fed the same line for the past 24-36 months.


>Because management didn't know which direction should be taken... and they kept hiring too! They kept hiring new developers when the ones already in place had absolutely nothing to do!

I think big companies trying to do things like "capture a market segment that will be a trillion (I made this number up) dollars in 2025" suffer terrible analysis paralysis. The five-year plan has an extreme revenue ramp-up and insane targets. If you combine that with politics, there is a lot of business and financial justification that has to go into every decision, and many decisions will be safe ones that look innovative (we're going to build on Insert Latest Cloud and use AI!) but have no real value to many customers.

The end result is crazy hiring (and firing a few years later) and groups with opposite experiences. Some groups have no work to do, and other groups are working 80-hour weeks trying to make it seem like the marketing and growth curves are all true.

It's a comedy (if you are able to stay out of the mess and politics) or a tragedy if you have a manager who wants to be the shining star who supports this mad rush.


What did you end up doing in a period of non-work work like that? Read books?


I'd advise you to read the comment again :).

Spoiler: (s)he learned React.


Not sure if I stopped reading early or just didn't comprehend what they had written. Thanks, lol.


You can sort of tell this from the style of IBM's TV ads, which have a condescending and smarmy tone that is quite distinctive.

The ads are filled with lofty buzzwords. There's no talk of actual technology at all, because the ads are targeted at non-technical management, the kinds of people who might be euphemistically called "decision-makers". The ads make all kinds of promises to these "decision-makers" about how their business will be utterly transformed. Actual implementation of the systems and business-critical changes is left to the IT department. The non-technical management writes a cheque and then washes their hands of the problem.


My last employer was an IBM customer, so of course we received all kinds of material in the mail and marketing calls trying to sell us on 'Watson'. It got so bad that our president (who's out of touch with tech) called a meeting to see "how we are going to use Watson".


> It got so bad that our president (who's out of touch with tech) called a meeting to see "how we are going to use Watson".

Sounds like it was working perfectly!

I'm not even at an IBM shop and the Watson sales pitches are starting to reach C-level folks.


Watson is a constellation of consulting services and software. It's basically a marketing meme.


I chatted with the lead over lunch when they were kicking off a game AI initiative. I shied away, given that it appeared to be exactly as you say. My impression was that Watson includes everything from RNNs to linear regression, with an attempt to craft domain-specific taxonomies.


I miss Chef Watson.


I'm not sure I'd call it vaporware. I've not used it myself, but the more objective reports I've seen indicate that Watson is very good at what it does. The problem seems more to be that what it does - accurately interpreting and then answering queries written in natural language - isn't actually that useful.

For example, for drug discovery: I'm guessing, off the cuff, that there are other drug discovery tools out there, and that, while they don't allow you to frame your searches as natural language queries, that's probably not actually a problem in practice, because the querying methods they've developed are presumably highly tuned to their real task, which is enabling a skilled and knowledgeable practitioner to specify what they're looking for with great precision.

It's sort of like programming languages: The ones that are designed to be the closest to natural language (e.g., AppleScript) do have a gentle learning curve for absolute beginners, but they also turn out to be some of the very worst languages for trying to do any sort of serious work.
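To make the point concrete, here's a contrived illustration (not based on any real drug-discovery API; every field name below is made up): the structured query pins down exactly what the practitioner means, while the natural-language front end has to guess.

    # Contrived sketch: "structured_query" mimics a domain-specific search
    # interface; none of these field names come from a real tool.
    structured_query = {
        "target": "EGFR",                  # protein target
        "activity": {"ic50_nM": "< 100"},  # potency cutoff with explicit units
        "exclude_moieties": ["nitro"],     # structural filter
        "max_molecular_weight": 500,
    }

    # The natural-language version forces the system to infer all of that:
    nl_query = "small, potent EGFR inhibitors without nitro groups"
    # Is "potent" < 100 nM or < 1 uM? Is "small" about molecular weight or dose?
    # The skilled practitioner already knows; the NL layer just adds a guessing step.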


Which is why data scientists use Python instead of AppleScript.


This is what IBM has always done. 20 years ago they pretended that their Deep Blue chess engine was in all their IT products. It's similar to how DeepMind AlphaGo is PR for Google's AI effort, but IBM takes it to comical extremes.


> 20 years ago they pretended that their Deep Blue chess engine was in all their IT products

[citation needed] IBM did not do this, though funding the Deep Thought team from CMU was cheaper than a Super Bowl commercial and had a more durable effect.


I agree with you that IBM is full of shit, and Google is too, to a degree, but Google has used AI in a variety of its products, even down to data center cooling processes. I don't think IBM and Google are comparable in this respect.


Watson is the branding term for an army of outsourced consultants (mostly in India) plugging away at the raw data.

Watson is not very good, and largely doesn't exist outside of slick marketing campaigns.


That's really interesting. I work in the emergency room, and pretty much our entire approach to patient care is algorithm-driven.

As soon as the patient checks in, their demographics (age, sex, etc.) and vitals are fed into a mysterious program, which suggests an acuity level and basically drives the whole course of treatment. The patient is required to be tested for a variety of disease processes based on the AI-generated differential diagnoses.

Honestly, I thought/still think medicine is headed toward this, which is why I decided to go into research.
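For the curious: the real program is a black box to me, but a bare-bones rule-based acuity scorer, in the spirit of ESI triage levels, might look something like the sketch below. Every field name and vital-sign threshold here is invented purely for illustration.

    # Hypothetical rule-based sketch, NOT the hospital's real algorithm;
    # every field name and vital-sign threshold here is invented.
    from dataclasses import dataclass

    @dataclass
    class Patient:
        age: int
        heart_rate: int      # beats per minute
        systolic_bp: int     # mmHg
        spo2: float          # oxygen saturation, percent
        chief_complaint: str

    def acuity_level(p: Patient) -> int:
        """Return 1 (most urgent) through 5 (least urgent), ESI-style."""
        # Hard red flags first: unstable vitals trump everything else.
        if p.spo2 < 90 or p.systolic_bp < 90 or p.heart_rate > 130:
            return 1
        # Age plus a worrying complaint bumps the patient up a level.
        if p.age > 65 and "chest pain" in p.chief_complaint.lower():
            return 2
        if "chest pain" in p.chief_complaint.lower():
            return 3
        return 4 if p.heart_rate > 100 else 5

    print(acuity_level(Patient(72, 95, 150, 97.0, "Chest pain for two hours")))  # -> 2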


What program is that? Is that something Epic/Cerner/whatever does, or some add-on?


It's MedHost with a lot of modifications specific to our hospital.


I’m running on-prem instances of Watson, and it’s far from vaporware, but Watson has become more of a brand name. The Watson product I use is more of a wrapper for doing AI and ML work: notebooks, containers, and related tools for data science.


Same here. And so far I have to say I'm quite happy with it. Is it the best tool for ML? Probably not. But it's certainly the easiest I've found so far and it works for what I'm using it for.


There are lots of interesting things in Watson, but it was still overmarketed relative to what it can do.


Basically, Triumph of Sales over Engineering.


And now Google is pushing DeepMind Healthcare in England?! Do they not learn from these examples? Anyone who uses an EMR knows very well the beauty of copy-paste, and this single little convenience is creating more disinformation than the Mount Everest piles of illegible scrawl that used to be medical records. At least in those days, a HUMAN had to write something. And how did everybody forget this little gem from the 1960s: 'GIGO' = garbage in, garbage out. I don't care how many layers your RNN has, it will NEVER get accurate data. Sick people are not reliable sources of info, and it takes a HUMAN to strip off all the emotional distortion to get at the facts.


No surprise here. IBM stopped producing usable engineering solutions a long time ago.


Watson's applications outside of a quick search of existing datasets are vaporware. I'm amazed not a single customer has gone public with it when so many of them are whispering about it on LinkedIn.


Even Watson wasn't a general AI, judging by all the anecdotes from folks who tried to use it. Narrow AI has the attribute that it can't be ported from one field to another: it might win at Jeopardy but suck at oncology and drug discovery. Not all AI is the same. We are general AI, and we can solve problems in asymmetrically bad situations. Narrow AI is not portable.


There have been a couple of articles suggesting that a lot of the issues with Watson were in how it was sold. Watson was being sold as sort of a magic drop-in solution that could do oh so many things.

In reality, Watson took a great deal of work: carefully structuring and processing data, evaluating the output, and taking more thoughtful approaches to further refine the information and evaluate the results. It also required a fair amount of involvement from the individual customer's staff. And there was always the possibility that, because each use case was different... it simply wouldn't work out.

To some extent, it seems like it should have been sold as a journey... not a cookie-cutter solution for things that Watson had never encountered before.

How do you sell that? I don't know; if I knew, I'd probably be a pretty good salesman.


Good. Focus on the use cases that actually work, namely a chatbot for handling customer service questions and an enterprise search engine.


Can’t tell if this is sarcasm, but IBM has had massive headaches and trouble in this segment too. Good customer service agents are expensive. Bad ones are cheaper. But both still have a far higher success rate than conversational bots do. And a hybrid approach is, again, harder than one would initially assume (you know why it’s painful and infuriating to press fifteen numbers on the phone to get to a real person? Same concept).
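The handoff itself is conceptually trivial; a toy sketch follows, where the dummy classify_intent and the threshold are stand-ins rather than anything IBM actually ships. The hard part is tuning that threshold without recreating the fifteen-button phone tree.

    # Toy bot-to-human handoff; classify_intent below is a dummy stand-in
    # for whatever intent classifier a real bot would use.
    ESCALATION_THRESHOLD = 0.85

    CANNED_ANSWERS = {
        "reset_password": "You can reset your password at example.com/reset.",
        "billing_date": "Invoices go out on the 1st of each month.",
    }

    def classify_intent(text):
        # Dummy classifier: real systems return (intent, confidence) from a model.
        if "password" in text.lower():
            return "reset_password", 0.95
        return "unknown", 0.30

    def handle_message(text, human_queue):
        intent, confidence = classify_intent(text)
        if confidence >= ESCALATION_THRESHOLD and intent in CANNED_ANSWERS:
            return CANNED_ANSWERS[intent]
        # Low confidence or unknown intent: hand off to a person.
        human_queue.append(text)
        return "Let me connect you with an agent."

    queue = []
    print(handle_message("I forgot my password", queue))      # bot answers
    print(handle_message("my order arrived broken", queue))   # escalates to a human
    # Tuning ESCALATION_THRESHOLD is the whole game: too low and the bot
    # annoys customers, too high and you're just paying for agents anyway.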


'And the trough is coming...' - Gartner


For reference, here is the IBM Watson VP hustling it [1] on Charlie Rose. Funny in retrospect. Especially that haircut ...

[1] https://charlierose.com/videos/29530

