Didn't expect to get on the front page. Just want to make sure we clarify what the intent was here.
A lot of people in our user group have been asking us about predicting continuous values with neural networks. This page arose out of the need to explain that a neural network itself is a universal approximator that solves for different objectives/loss functions.
Two instantiations of that problem are regression and classification, which differ only in the activation function and the objective function.
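To make that concrete, here is a minimal numpy sketch (not code from the page; all names and shapes are made up): the hidden layers are identical for both tasks, and only the output activation and the loss change.

    import numpy as np

    def hidden(x, W, b):
        # Shared feature extractor: identical for regression and classification.
        return np.tanh(x @ W + b)

    def regression_head(h, V, c, y):
        pred = h @ V + c                                   # linear (identity) output activation
        return pred, np.mean((pred - y) ** 2)              # mean squared error objective

    def classification_head(h, V, c, y_onehot):
        z = h @ V + c
        z = z - z.max(axis=1, keepdims=True)               # stabilize the softmax
        p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)   # softmax output activation
        return p, -np.mean(np.sum(y_onehot * np.log(p + 1e-12), axis=1))  # cross-entropy objective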
With no ill intent: I'm a bit skeptical about the software quality / test coverage after finding a bug in the linear algebra implementation; I would not use it in production right now.
One example: in my parser I use tag embeddings. There are roughly two ways to do this (since Caffe does not have a lookup table a la Torch yet, though there is a pull request for it):
1. You train the embeddings on a large corpus and give them as part of the input.
2. You train the embeddings while training the classifier. You basically encode each tag (at the various positions in the parser configuration) as a one-hot vector and feed this input into a linear layer (one per position) whose number of neurons corresponds to the requested embedding size, sharing weights between all such tag layers. After training, the n-th float of the embedding for the m-th tag is the m-th weight of the n-th neuron.
If you want the best of both worlds (train tag embeddings on a large corpus, fine-tune them while training the classifier), you set up the topology for (2) and then initialize the weights of the linear layer yourself, using the learned embeddings as the initial weights (see the sketch below).
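Roughly, the shared linear layer in (2) is a lookup table in disguise; this plain-numpy sketch (made-up sizes, not Caffe code) shows why, and how you would seed the weights with corpus-trained embeddings for the fine-tuning variant:

    import numpy as np

    n_tags, emb_size = 50, 32          # made-up tag vocabulary size and embedding width

    # Shared weight matrix of the per-position linear layers:
    # each of the emb_size neurons has one weight per tag.
    W = 0.01 * np.random.randn(emb_size, n_tags)

    def embed(tag_id):
        onehot = np.zeros(n_tags)
        onehot[tag_id] = 1.0
        # W @ onehot just selects column tag_id: the m-th tag's embedding is
        # the m-th weight of each neuron, exactly as described above.
        return W @ onehot

    # "Best of both worlds": seed W with embeddings trained on a large corpus
    # (a random stand-in here), then let backprop fine-tune it with the classifier.
    pretrained = np.random.randn(emb_size, n_tags)   # stand-in for corpus-trained embeddings
    W[...] = pretrained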
There is also a completely different example for image recognition here:
Edit for clarification: Caffe is quite different from the other libraries --- you do not set up a network programmatically, but specify the topology as data (via a protocol buffer file).
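To illustrate what "topology as data" means (this mimics the idea only, not Caffe's actual prototxt schema; the layer names, spec format, and builder function are made up):

    import json

    net_spec = json.loads("""
    {
      "layers": [
        {"name": "ip1",   "type": "inner_product", "num_output": 50},
        {"name": "relu1", "type": "relu"},
        {"name": "ip2",   "type": "inner_product", "num_output": 10},
        {"name": "loss",  "type": "softmax_loss"}
      ]
    }
    """)

    def build_net(spec):
        # A real framework would instantiate layer objects here; we just echo the plan.
        return [(layer["name"], layer["type"]) for layer in spec["layers"]]

    print(build_net(net_spec))   # the topology comes from data, not from code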
It's a great model. We're embedding Caffe as well. One thing I'd like to clarify, though, is that other neural network frameworks also let you set up networks as data, via YAML, JSON, and the like.
Ours supports YAML/JSON. That's how we let people set up workers on Spark/Hadoop (which then delegate the numerical workloads to GPUs or a lower-level BLAS).
Pylearn2 also supports YAML.
Caffe has a lot of good ideas that are worth extending, especially into industry.
Our intent is for people not to have to touch Java at all, only pure bash.
Happy to clarify/update anything.
Thanks!