Hacker News
Data-Visualization Firm’s New Software Autonomously Finds Abstract Connections (wired.com)
49 points by edmaroferreira on Jan 20, 2013 | 11 comments



> The power of Ayasdi is its unique ability to automatically discover insights — regardless of complexity — without asking questions. Ayasdi’s customers can finally learn the answers to questions that they didn’t know to ask in the first place. Simply stated, Ayasdi is ‘digital serendipity’.”

It’s a bold statement, however by using algebraic topology Ayasdi has managed to totally remove the human element that goes into data mining — and, as such, all the human bias that goes with it.

---

Don't companies with massive datasets already do this in some way? Google's system for piecing together what people "really want" is such a large, open-ended inquiry that some automated insight discovery must already be going on.

Also, wouldn't a machine that generates automated insights require another layer of meta-analysis to be able to sift through which insights are actually usable? Using the example of sports statistics, you could generate an infinite number of trivial relations between players and plays and teams...it seems that at some point, a human with real domain knowledge has to go in and program a filter system, which seems to be about the same amount of work as the inquiry-generation that this software automates.

Finally, why all the emphasis on visualization? Visualization helps to illustrate to humans the possibilities of investigation...for a machine that can supposedly ask (and answer) all the correct questions, isn't visualization merely eye candy?

It'd be great to see a concrete example of this software in action. Perhaps feed it the NFL play-by-play data that was posted on HN a few weeks ago and see if it can generate usable strategy.


It's a bit of an odd article, but my guess is that it's a gloss of a press release, and that kind of writing is par for the course for AI/ML press releases.

Gunnar Carlsson, the researcher mentioned, seems to work mainly on an approach related to manifold learning (roughly, finding lower-dimensional structure in high-dimensional data) that's based on algebraic topology. He co-ran a workshop on that last year at NIPS, one of the main machine-learning conferences: https://sites.google.com/site/nips2012topology/ . He's written some highly cited papers on the subject, though it'd be a bit of a stretch to claim he invented the area, since workshops on it go back at least to 2007; that one was organized by a group of French researchers: http://topolearnnips2007.insa-rouen.fr/description.html

I would guess the part about removing the human element from data mining is putting an optimistic PR spin on the basic idea of automatically extracting lower-dimensional structure, which, if it works, should allow for less feature engineering. The emphasis on visualization makes sense in that light, if they're planning to sell it as a no-expertise-necessary system: feed it data and get interpretable-by-non-experts viz as output, with any complexity that would normally require "data scientists" being handled automatically.
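To make "structure from topology" a little more concrete: the article doesn't describe Ayasdi's algorithm, so this is not it. It's a minimal Python sketch of the simplest topological summary of a point cloud, the number of connected pieces at a given distance scale. The names `count_components`, `points` and `eps`, and the toy data, are made up for illustration.

```python
import math
from itertools import combinations


def count_components(points, eps):
    """Count connected components of the graph linking points within eps."""
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair of points that lie within eps.
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Hypothetical toy data: two well-separated clumps in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]

for eps in (0.05, 0.5, 10.0):
    print(eps, count_components(points, eps))
```

The count that stays stable over a wide range of `eps` (two clumps here) is the kind of structure you'd hope to get without hand-picking features. Someone still has to choose the distance function, though, so the human element isn't entirely gone.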


Automatically finding connections in your data is easy. The problem is finding connections that are 1) non-obvious and 2) actionable. And once they are found, they still have to be filtered so that they stand out from all the obvious, non-actionable connections.

In other words, if I'm trying to find out why a plane crashed, and the first thing this system tells me is gravity, then it isn't a practical advancement.
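A rough Python sketch of that filtering step, assuming "connection" just means a strong pairwise correlation and that a human supplies the list of already-known pairs. The column names, the data, the 0.8 threshold and the function names are all invented for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def surprising_connections(columns, obvious, min_abs_r=0.8):
    """Strongly correlated column pairs that are not on the 'obvious' list."""
    found = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= min_abs_r and frozenset((a, b)) not in obvious:
            found.append((a, b, r))
    # Strongest relationships first.
    return sorted(found, key=lambda item: -abs(item[2]))


# Hypothetical flight-log columns (made-up numbers).
columns = {
    "fuel_burned": [1, 2, 3, 4, 5],
    "distance_flown": [2, 4, 6, 8, 10],
    "icing_sensor": [0, 1, 0, 1, 1],
    "engine_vibration": [1, 3, 1, 3, 3],
}

# Domain knowledge supplied by a human: the "gravity"-level facts.
obvious = {frozenset(("fuel_burned", "distance_flown"))}

for a, b, r in surprising_connections(columns, obvious):
    print(a, b, round(r, 3))
```

The correlation loop is the easy part. The `obvious` set is where the real work is, and nothing in the code can produce it.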


So far what I see is an "innovative" service/tool with a non-innovative, old (e.g. biotech) software business plan. I have to jump through hoops to get my hands on this tool: have a phone call with them, talk about my data, etc. So what is the point of the easy UI and the no-humans-needed pitch? Not to mention that I have no idea what their pricing model is.

I don't understand what it would cost them to let people try the tool before committing to it, unless of course it runs on their machines. Put it up somewhere and let me try it with some simple data (e.g. iris in R).


Looks very exciting, but what does it do that R does not do? For example, a lot of the graphics on nytimes.com are done using R, and I assume they use massive data sets. If a platform like Hadoop is used to get the speed, won't R get the same results? How is Ayasdi different?


"...has developed data visualization software it says uses big data to answer the questions you never thought to ask..."

Replace "questions" with "hypotheses" and "thought" with "cared".

I'm all for having a tool that can highlight a ton of interesting things in my data... I'd make a business case to buy that any day. But finding all the connections is so noisy compared to finding the right connections. That's the cynic in me.

The optimist in me is looking forward to when these kinds of techniques are just another R package I can load and turn loose on my data. :)


You could probably already do that with the R package caret.

It can train models for you automatically, so you could loop over every model available and return the top ten predictions. Of course it wouldn't get around the need for feature engineering, but it's theoretically possible.

Note - do not attempt this unless you have lots and lots of machines, as it will take a very, very long time.
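The loop described above, as a minimal sketch. This is not caret's API. It's a standard-library Python stand-in with three toy "models" and k-fold cross-validation, and every name in it (`fit_mean`, `fit_line`, `fit_nearest`, `cv_error`, `rank_models`) is hypothetical.

```python
def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Ordinary least squares on a single feature."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    """1-nearest-neighbour: copy the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def cv_error(fit, xs, ys, folds=4):
    """Mean squared error of `fit` under k-fold cross-validation."""
    total = 0.0
    for k in range(folds):
        train = [i for i in range(len(xs)) if i % folds != k]
        held_out = [i for i in range(len(xs)) if i % folds == k]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((model(xs[i]) - ys[i]) ** 2 for i in held_out)
    return total / len(xs)


def rank_models(candidates, xs, ys, top=10):
    """Score every candidate and return the `top` best as (error, name)."""
    scored = [(cv_error(fit, xs, ys), name) for name, fit in candidates.items()]
    return sorted(scored)[:top]


# Made-up data with an exactly linear relationship.
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 for x in xs]
candidates = {"mean": fit_mean, "line": fit_line, "nearest": fit_nearest}

for err, name in rank_models(candidates, xs, ys):
    print(f"{name}: {err:.3f}")
```

Every candidate gets refit once per fold, so the cost grows as models × folds × training time. That's why the warning about needing lots of machines applies as soon as the candidate list is a real model zoo and not three toy functions.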


Email me - we can do that :)


How does this differ from Quid's offering? (http://www.quid.com)


If you visualize the data properly, you rarely need automated discovery. The answers literally jump out at you. We've been doing this with AlphaVision for years:

https://aqumin.fogbugz.com/default.asp?W45


Shame about the name. 'Iris' was the name for a range of SGI's 3D workstations and for their scientific visualization application 'IRIS Explorer' (an AVS clone), which was later sold to NAG Ltd:

http://www.nag.com/Welcome_iec.asp

Mik



