Deep learning meets vector-symbolic AI (ibm.com)
91 points by sayonaraman on Sept 23, 2021 | 16 comments



Seems like a variant of a Siamese network which uses binarized embedding vectors for predictions instead of the raw embedding vectors. What exactly is the novelty presented here?
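
Roughly, here is how I read "binarized embedding vectors" (a guess on my part, not necessarily the paper's actual method): take the sign of each embedding dimension and compare pairs by Hamming similarity instead of cosine on the raw vectors. A toy sketch in numpy:

    import numpy as np

    rng = np.random.default_rng(0)
    e1, e2 = rng.standard_normal(512), rng.standard_normal(512)  # stand-in embeddings

    cosine = e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2))  # raw-vector similarity

    b1, b2 = np.sign(e1), np.sign(e2)   # binarize to +/-1
    hamming_sim = np.mean(b1 == b2)     # fraction of agreeing signs

    print(round(float(cosine), 3), round(float(hamming_sim), 3))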


I would like to see deep learning working with embodied cognition somehow: http://www.jtoy.net/blog/grounded_language.html


Can any experts chime in? Is this any good?


The fundamental idea is "operating in high dimensions", and this does have some solid footing, e.g. see Cover's Theorem (https://en.wikipedia.org/wiki/Cover's_theorem). In fact, I recently gave a presentation on another paper (from FB AI Research and Carnegie Mellon) exploring the concept of projections to higher dimensions for sentence encoding tasks; see here: https://github.com/acatovic/paper-lunches/blob/main/fb-rands....
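
To give a flavour of the idea (my own toy illustration, not the linked notebook or the FB paper's code): pool fixed random word vectors for a sentence and project the result into a much higher-dimensional space, with no training at all, then feed the output to a simple downstream classifier.

    import numpy as np

    rng = np.random.default_rng(0)
    d_word, d_high = 128, 4096

    word_vecs = {}                       # lazily assigned, fixed random word embeddings
    def embed_word(w):
        if w not in word_vecs:
            word_vecs[w] = rng.standard_normal(d_word)
        return word_vecs[w]

    P = rng.standard_normal((d_high, d_word)) / np.sqrt(d_word)  # fixed random projection

    def encode(sentence):
        pooled = np.mean([embed_word(w) for w in sentence.lower().split()], axis=0)
        return np.maximum(P @ pooled, 0.0)   # project up, simple nonlinearity

    v1 = encode("vector symbolic architectures look promising")
    v2 = encode("vector symbolic architectures look interesting")
    print(v1.shape, float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))))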

There is a fair amount of research in the area of high-dimensional computing, as well as in sparse representations, which seem to be grounded in neuroscience. As others have pointed out, a number of commercial research labs exist: there is Numenta with their Hierarchical Temporal Memory, Vicarious (whose founder was one of Numenta's co-founders), and Cortical.io (who borrow sparse binary coding concepts from Numenta, combine them with self-organizing maps, and apply them to document understanding tasks).


This is awesome, thanks for the link! There seems to be mutual cross-pollination between NLP and CV, and I'm wondering if there is a visual counterpart to these "Random Encoders". The SOTA for image/text embeddings currently seems to be CLIP [1].

[1] https://openai.com/blog/clip/


You are welcome :-).

With text we have fairly discrete and granular units, i.e. "words", and we see sentences as compositions of words. For images, however, it's not so simple. We present the entire image (an array of pixels times the number of channels), and the various layers within the CNN automatically extract "discrete" units like edges, bends, etc., but what we get at the end is a semantic encoding of the entire image.

But you could in principle take the encoding of an image from e.g. VGG16 or ResNet, project that encoding onto a randomly initialized higher-dimensional space (such as a very high-dimensional Echo State Network), and then use that output in your downstream task, like classifying obstacles on the road. I suspect the same principles as what I wrote in the Jupyter Notebook would also apply here, and you would get a performance boost. Not sure if this has been done already (computer vision is not really my focus) ;-)
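
A minimal sketch of what I mean, assuming the CNN feature vector has already been computed (the VGG/ResNet part is left out):

    import numpy as np

    rng = np.random.default_rng(0)
    d_cnn, d_high = 512, 8192                    # e.g. a pooled ResNet feature -> 8192-dim projection

    W = rng.standard_normal((d_high, d_cnn)) / np.sqrt(d_cnn)  # fixed random weights, never trained

    def random_project(cnn_feature):
        # Nonlinear projection into the higher-dimensional space; the output
        # would then feed a cheap linear classifier for the downstream task.
        return np.tanh(W @ cnn_feature)

    cnn_feature = rng.standard_normal(d_cnn)     # stand-in for a real VGG16/ResNet encoding
    print(random_project(cnn_feature).shape)     # (8192,)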


Different words, same concept as sparse distributed representations in a hierarchical arrangement of neural networks.

Automating the process of learning is non-trivial and making it efficient is an ongoing question.

Vicarious.ai, Hawkins's Numenta, Cortical.io, and various other projects have been chasing this in various guises for many years.

The lottery-ticket effect in very large networks can make contrasting different architectures difficult, and this result looks suspect. IBM isn't necessarily a powerhouse in this arena, so it'd make sense not to get excited until they verify and extend their theory. It could be that their initial success is entirely down to lottery effects and that the particulars of the design are a dead end.


You might also be interested in the recent work on the "resonator networks" VSA architecture [1-4] by the Olshausen lab at Berkeley (P. Kanerva, who created the influential SDM model [5], is one of the lab members).

It's a continuation of Plate's [6] and Kanerva's work in the '90s, and of Olshausen's groundbreaking work on sparse coding [7], which inspired the popular sparse autoencoders [8].

I find it especially promising that they found this superposition-based approach to be competitive with the optimization that is so prevalent in modern neural nets. Maybe backprop will die one day and be replaced with something more energy-efficient along these lines.
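
For anyone curious what that looks like mechanically, here is a toy two-factor version of the resonator dynamics as I understand them from [1-3] (my own simplification, not the authors' code): it recovers the factors of a bound hypervector purely by unbinding, superposition and codebook cleanup, with no gradients anywhere.

    import numpy as np

    rng = np.random.default_rng(0)
    D, M = 2000, 50                        # hypervector dimension, codebook size

    def bsign(x):                          # sign with ties broken to +1
        return np.where(x >= 0, 1, -1)

    A = rng.choice([-1, 1], size=(D, M))   # codebook for factor a
    B = rng.choice([-1, 1], size=(D, M))   # codebook for factor b

    ia, ib = 7, 31                         # hidden ground-truth factors
    s = A[:, ia] * B[:, ib]                # bound composite (elementwise product)

    # Start each estimate as the superposition of its entire codebook.
    a_hat = bsign(A.sum(axis=1))
    b_hat = bsign(B.sum(axis=1))

    for _ in range(20):
        # Unbind with the current guess of the other factor, then clean up
        # by projecting onto the codebook and re-thresholding.
        a_hat = bsign(A @ (A.T @ (s * b_hat)))
        b_hat = bsign(B @ (B.T @ (s * a_hat)))

    # Should report 7 and 31 once the dynamics settle.
    print(int(np.argmax(A.T @ a_hat)), int(np.argmax(B.T @ b_hat)))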

[1] https://redwood.berkeley.edu/wp-content/uploads/2020/11/frad...

[2] https://redwood.berkeley.edu/wp-content/uploads/2020/11/kent...

[3] https://arxiv.org/abs/2009.06734

[4] https://github.com/spencerkent/resonator-networks

[5] https://en.wikipedia.org/wiki/Sparse_distributed_memory

[6] https://www.amazon.com/Holographic-Reduced-Representation-Di...

[7] http://www.scholarpedia.org/article/Sparse_coding

[8] https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf


Thank you for the great reading material! From a skim, my take is that resonator networks are able to sift through data and suss out features (factors) from the noise, and even decode data structures like vectors and mappings. And RNs can be made to iterate on a problem, much like a person might concentrate on a mental task. Is that a fair summary of their capabilities?


What makes us humans intelligent and able to learn so quickly is our reasoning faculty, especially our conceptual reasoning capabilities. There is no intelligence or learning without that, just sophisticated ML/DL pattern matching / perception. Symbolic AI led to the first AI winter because a symbol is just an object that represents another object. That's not a lot to work with.

The AI industry needs to finally discover conceptual reasoning to actually achieve any understanding. In the meantime, huge sums of money, energy, and time are being wasted on ML/DL on the idea that, given enough data and processing power, intelligence will magically happen.

This IBM effort doesn't even remotely model how the human brain works.


I think old AI led to the first AI winter because it had poor mechanisms to deal with uncertainty. However, lots of mechanisms in old expert systems will make a comeback once we know how to combine symbolic with neural and probabilistic systems.

Just take a look at The Art of Prolog. Many ideas there are getting reused in modern inductive logic and answer-set programming systems.


> ...conceptual reasoning...

> ...a symbol is just an object that represents another object.

Those are the exact same thing. I mourn the defeat of the linguists more than most, but the brute-force method undeniably beats out the purpose-built one on pretty modest time scales. We are well past the point where ML development is better measured in megawatts than in megaflops - whoever builds the most nuclear power plants wins. The prize? Somewhere between superpower-level cat photo sorting and an economy that enjoys perfect efficiency, built on the back of autonomous software agents.

Obligatory link for this topic: http://www.incompleteideas.net/IncIdeas/BitterLesson.html


I'm not sure how you equate a concept with a symbol. Sure, you use a symbol to represent a concept or an instance of one: the words in this sentence, once you convert them to the concepts in your mind, let you understand the model of the world I'm talking about. But the symbol just gives you a reference to something else. That something else, the concept, is what they forgot about with Symbolic AI.


A concept is simply a pattern used to generalize the underlying data, and such patterns are regularly represented by symbols. For example, [x..x+y] is a range pattern: it generalizes any set of numbers that falls within the pattern's bounds. That data could be [0,1,2] or [4,5] or... it goes on forever.
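
A throwaway illustration of the same point (just my own toy, nothing more):

    # The symbol "[x..x+y]" names a concept; the concept itself is the
    # predicate that decides which concrete data it generalizes.
    def range_concept(x, y):
        def covers(data):
            return all(x <= v <= x + y for v in data)
        return covers

    in_0_to_5 = range_concept(0, 5)    # the concept [0..5]
    print(in_0_to_5([0, 1, 2]))        # True
    print(in_0_to_5([4, 5]))           # True
    print(in_0_to_5([7, 8]))           # False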

You might want to spend some time reading up on formal logic; it should only take a few minutes for you to recognize how bad your take on symbolic logic is.

https://en.wikipedia.org/wiki/First-order_logic


Nice hypothesis. Why do you think that approach has never worked well?


You mean the approach using symbols, or combining symbols with ML?




