Didn't expect to get on the front page. Just want to make sure we clarify what the intent was here.
A lot of people in our user group have been asking us about predicting continuous values with neural networks. This page arose out of the need to explain that a neural network itself is a universal approximator that solves for different objectives/loss functions.
Two instantiations of that problem are regression and classification, which differ only in the activation function and the objective function.
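To make that concrete, here is a minimal numpy sketch (not code from the page; all names and shapes are made up): the hidden layers are identical for both tasks, and only the output activation and the loss change.

    import numpy as np

    def hidden(x, W, b):
        # Shared feature extractor: identical for regression and classification.
        return np.tanh(x @ W + b)

    def regression_head(h, V, c, y):
        pred = h @ V + c                                   # linear (identity) output activation
        return pred, np.mean((pred - y) ** 2)              # mean squared error objective

    def classification_head(h, V, c, y_onehot):
        z = h @ V + c
        z = z - z.max(axis=1, keepdims=True)               # stabilize the softmax
        p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)   # softmax output activation
        return p, -np.mean(np.sum(y_onehot * np.log(p + 1e-12), axis=1))  # cross-entropy objective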
With no ill intent: I'm a bit skeptical about the software quality / test coverage after finding a bug in the linear algebra implementation; I would not use it in production right now.
One example: in my parser I use tag embeddings. There are roughly two ways to do this (since Caffe does not have a lookup table a la Torch yet, though there is a pull request for it):
1. You train the embeddings on a large corpus and give them as part of the input.
2. You train the embeddings while training the classifier. You basically encode each tag (at the various positions in the parser configuration) as a one-hot vector and feed this input into a linear layer (one per position) whose number of neurons corresponds to the requested embedding size, sharing weights between all such tag layers. After training, the n-th float of the embedding for the m-th tag is the m-th weight of the n-th neuron.
If you want the best of both worlds (train tag embeddings on a large corpus, fine-tune them while training the classifier), you set up the topology for (2) and then initialize the weights of the linear layer yourself, using the learned embeddings as the initial weights (see the sketch below).
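Roughly, the shared linear layer in (2) is a lookup table in disguise; this plain-numpy sketch (made-up sizes, not Caffe code) shows why, and how you would seed the weights with corpus-trained embeddings for the fine-tuning variant:

    import numpy as np

    n_tags, emb_size = 50, 32          # made-up tag vocabulary size and embedding width

    # Shared weight matrix of the per-position linear layers:
    # each of the emb_size neurons has one weight per tag.
    W = 0.01 * np.random.randn(emb_size, n_tags)

    def embed(tag_id):
        onehot = np.zeros(n_tags)
        onehot[tag_id] = 1.0
        # W @ onehot just selects column tag_id: the m-th tag's embedding is
        # the m-th weight of each neuron, exactly as described above.
        return W @ onehot

    # "Best of both worlds": seed W with embeddings trained on a large corpus
    # (a random stand-in here), then let backprop fine-tune it with the classifier.
    pretrained = np.random.randn(emb_size, n_tags)   # stand-in for corpus-trained embeddings
    W[...] = pretrained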
There is also a completely different example for image recognition here:
Edit for clarification: Caffe is quite different from the other libraries --- you do not set up a network programmatically, but specify the topology as data (via a protocol buffer file).
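To illustrate what "topology as data" means (this mimics the idea only, not Caffe's actual prototxt schema; the layer names, spec format, and builder function are made up):

    import json

    net_spec = json.loads("""
    {
      "layers": [
        {"name": "ip1",   "type": "inner_product", "num_output": 50},
        {"name": "relu1", "type": "relu"},
        {"name": "ip2",   "type": "inner_product", "num_output": 10},
        {"name": "loss",  "type": "softmax_loss"}
      ]
    }
    """)

    def build_net(spec):
        # A real framework would instantiate layer objects here; we just echo the plan.
        return [(layer["name"], layer["type"]) for layer in spec["layers"]]

    print(build_net(net_spec))   # the topology comes from data, not from code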
It's a great model. We're embedding Caffe as well. One thing I'd like to clarify, though, is that other neural network frameworks also let you set up networks as data, via YAML, JSON, and the like.
Ours supports YAML/JSON. That's how we let people set up workers on Spark/Hadoop (which then delegate the numerical workloads to GPUs or a lower-level BLAS).
Pylearn2 also supports YAML.
Caffe has a lot of good ideas that are worth extending, especially into industry.
Our intent is for people not to have to touch Java at all, only pure bash.
Happy to clarify/update anything.
Thanks!