A Beginner’s Guide to Restricted Boltzmann Machines (deeplearning4j.org)
90 points by vonnik on Aug 6, 2015 | hide | past | favorite | 11 comments


Asking here because I don't really know of a better place. Does anybody involved in deep learning have any use cases, other than image-based ones, where deep learning has been a measurably better option than other techniques?

I don't do any image-related tasks, but I use machine learning extensively and have never been able to successfully break away from my big three of Naive Bayes, SVMs, and Random Forests. I've tried some of the more established deep learning techniques (LeNet, RBMs, etc.), but can't seem to find anything that works better than one of those three. Is deep learning an exclusively image-related technique?
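For context, the commenter's "big three" baselines can be sketched with scikit-learn in a few lines; the dataset and hyperparameters below are illustrative assumptions, not anything from the thread:

```python
# Hedged sketch: cross-validating the "big three" tabular baselines.
# The dataset choice (breast_cancer) and settings are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

models = {
    "naive_bayes": GaussianNB(),
    # SVMs are sensitive to feature scale, so standardize first
    "svm": make_pipeline(StandardScaler(), SVC()),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {results[name]:.3f}")
```

On tabular data like this, all three tend to land within a few points of each other, which matches the commenter's experience.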


You can think of deep learning as a way to automate feature extraction on very high-dimensional data, where the features are complex enough that they're difficult or impossible to specify by hand. This includes things like:

- objects in an image/video

- phonemes in raw audio

- events in multidimensional time series

- grammatical/semantic structures in unstructured text

- strategic features in a board game position (e.g. Go)

These kinds of inputs are difficult to handle with traditional ML techniques. But if your data already looks more like rows in a table, with simple, semantically meaningful features, deep learning isn't likely to buy you that much.


My understanding is that deep learning has the largest benefit with perceptual data (e.g., images, speech, and to a lesser extent natural language and control). The driving force behind deep learning is the ability to "compress" data by learning hierarchical representations. For example, your images of cats and dogs are very high dimensional and encode a lot of redundant information, making them especially suitable for such compression.

On the other hand, deep nets probably won't outperform more traditional machine learning algorithms on less high dimensional, uncompressed data. This problem is exacerbated by the amount of tuning needed by deep nets. Architectures have been fine tuned for common data types such as images and speech, but if the dataset doesn't fall into one of these categories, you have to tune the algorithms yourself.


Deep learning is setting new records in accuracy for most major data types including sound, video, etc. Neural networks like word2vec are useful for text analysis.

These links may be helpful:

http://deeplearning4j.org/accuracy.html

http://deeplearning4j.org/use_cases.html

http://deeplearning4j.org/word2vec.html


Deep Learning Tips from the Road | SciPy 2015 | Kyle Kastner https://www.youtube.com/watch?v=TBBtOeY2Q78


It is good at voice recognition and machine translation too, and has recently been applied to robotics.

My experience is that DL methods are not as smart as a real human; you can't simply put data in and wait.

The trainer has to understand the data first, and then choose the best model for it. Sometimes it is black magic: you run many experiments, and in the end you can't explain why this or that gives the best result. Lots of manual work is involved.

But I should say I'm not that experienced with DL.


Discussion of a recent review here (you may want to check out the links): https://news.ycombinator.com/item?id=9613810



The literature is mostly image based, but as other users have said, voice, text, and video have also seen substantial gains. It has also been used to great effect for taxi destination prediction (heterogeneous information) [1], drug discovery [2], and general operations on graphs [3].

There are also lots of examples beyond classification - structured prediction such as image segmentation [4], predicting depth, surface normals, and pixels from only RGB [5], text to text translation [6], image captioning [7], Turing machines [8], speech synthesis [9], question answering (Q/A) [10][11], handwriting generation [12], playing Go [13], playing Atari [14], speech recognition [15], drawing things [16] and many more.

[1] http://blog.kaggle.com/2015/07/27/taxi-trajectory-winners-in...

[2] http://blog.kaggle.com/2012/11/01/deep-learning-how-i-did-it...

[3] http://arxiv.org/abs/1403.6652

[4] http://arxiv.org/abs/1502.03240

[5] http://arxiv.org/abs/1411.4734

[6] http://arxiv.org/abs/1409.0473

[7] http://arxiv.org/abs/1502.03044

[8] http://arxiv.org/abs/1410.5401

[9] http://arxiv.org/abs/1506.02216

[10] http://arxiv.org/abs/1507.04808

[11] http://arxiv.org/abs/1503.08895

[12] http://arxiv.org/abs/1308.0850

[13] http://arxiv.org/abs/1412.3409

[14] https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf

[15] http://www.sciencedirect.com/science/article/pii/S0893608014...

[16] https://www.youtube.com/watch?v=Zt-7MI9eKEo


I feel like this guide comes about 5 years too late. RBMs as density models have been shown to be relatively weak, except in the case of binary data; for continuous data, you can often do better even with a simple Gaussian mixture model. Other than that, they are cumbersome to train (the gradient needs to be approximated), and for the continuous variants, training can be unstable unless you use very low learning rates.
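As a hedged illustration of that Gaussian-mixture baseline: a GMM fit with scikit-learn recovers the structure of continuous data with none of the RBM's training headaches (the synthetic data here is my own assumption, purely for demonstration):

```python
# Hedged sketch: a Gaussian mixture as a simple density model for
# continuous data. Synthetic two-cluster data, chosen for illustration.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# 1000 points in 2-D: one cluster near (-2, -2), one near (2, 2)
X = np.vstack([
    rng.normal(-2, 0.5, (500, 2)),
    rng.normal(2, 0.5, (500, 2)),
])

# Fitting is a closed-form EM loop -- no approximate gradients,
# no learning-rate tuning, unlike contrastive divergence for RBMs.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

print(gmm.means_.round(1))   # component means, near the true centers
print(gmm.score(X))          # average log-likelihood per sample
```

The point is not that GMMs are state of the art, just that they set a surprisingly high bar for continuous density modeling at a fraction of the effort.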

They were very popular for unsupervised pre-training a while ago, but the utility of pre-training has greatly diminished. Unless you have a ton of unlabeled data and very few labels, it's not worth the effort. And if it is, you are better off using autoencoders for pre-training anyway. They are conceptually much simpler and easier to understand, and you'll get roughly the same results.
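To make the "conceptually much simpler" claim concrete: an autoencoder is just a feed-forward net trained to reconstruct its own input. A minimal NumPy sketch (toy data and sizes are my own assumptions, not a tuned implementation):

```python
# Hedged sketch: a one-hidden-layer autoencoder trained by plain
# gradient descent on squared reconstruction error. Toy example only.
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(200, 20)                 # toy data: 200 samples, 20 features in [0, 1]

n_hidden = 5                          # bottleneck forces a compressed representation
W1 = rng.randn(20, n_hidden) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.randn(n_hidden, 20) * 0.1
b2 = np.zeros(20)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
losses = []
for _ in range(500):
    H = sigmoid(X @ W1 + b1)          # encode to the bottleneck
    R = sigmoid(H @ W2 + b2)          # decode back to input space
    err = R - X
    losses.append((err ** 2).mean())
    # Backprop through the squared reconstruction error
    dR = 2 * err * R * (1 - R) / X.shape[0]
    dH = (dR @ W2.T) * H * (1 - H)
    W2 -= lr * (H.T @ dR); b2 -= lr * dR.sum(0)
    W1 -= lr * (X.T @ dH); b1 -= lr * dH.sum(0)

print(f"reconstruction loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Everything here is an exact gradient of a plain loss, which is the contrast with RBMs: no sampling, no approximate contrastive-divergence gradient, and the learned hidden layer can still be used as a pre-trained feature extractor.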

If you want to get started with deep learning, focus on feed-forward and recurrent neural nets instead; you'll get much more useful knowledge out of that. For most common deep learning use cases there is no need to bother with RBMs anymore.


The recent work at Berkeley using deep nets for domestic robot control is interesting.

Deep nets turn camera pixels into motor torques.

The robots are quickly trained to a new task, and once trained, the solutions are robust to changes; no camera calibration is required.

http://rll.berkeley.edu/deeplearningrobotics/



