At the time I worked in Machine Learning (94-95) I was unaware of AD, and my professor, who built the objective function, also derived its analytic derivative by hand. I didn't learn about AD until a few years ago and was amazed, because I spent much of the late 90s learning enough Mathematica to produce my own analytic derivatives.
I think this goes back to "The complex-step derivative approximation" from 2003 by J. Martins, P. Sturdza and J. Alonso. [0] That paper is a great read!
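The trick in that paper fits in one line: for a real-analytic f, the Taylor expansion of f(x + ih) gives f'(x) ≈ Im f(x + ih) / h, and since no subtraction of nearly equal numbers is involved, h can be made far smaller than with finite differences. A rough Python sketch (names are mine; the test function is the one commonly used in the complex-step literature):

    import cmath

    def complex_step_derivative(f, x, h=1e-30):
        # Im f(x + ih) / h ~ f'(x); no subtractive cancellation,
        # so h can sit far below machine epsilon.
        return f(complex(x, h)).imag / h

    # Test function from the complex-step literature:
    # f(x) = exp(x) / sqrt(sin(x)^3 + cos(x)^3), differentiated at x = 1.5.
    f = lambda x: cmath.exp(x) / cmath.sqrt(cmath.sin(x)**3 + cmath.cos(x)**3)
    print(complex_step_derivative(f, 1.5))   # ~4.0534, matching the analytic value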
Autodiff computes a derivative by examining a computational graph (either built up-front all at once, or implicitly by tracing each operation as it runs) and producing a new graph. The person defines the forward pass (the graph), and the computer figures out the backward pass.
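Concretely, here is a toy reverse-mode version (not any particular library, all names made up): each operation records its inputs and the local partial derivatives while the forward pass runs, and backward() replays that record in reverse topological order, so every node's gradient is complete before it is pushed to its parents.

    import math

    class Var:
        def __init__(self, value, parents=()):
            self.value = value        # forward value
            self.parents = parents    # list of (parent Var, local derivative)
            self.grad = 0.0           # filled in by backward()

        def __add__(self, other):
            return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

        def __mul__(self, other):
            return Var(self.value * other.value,
                       [(self, other.value), (other, self.value)])

        def sin(self):
            return Var(math.sin(self.value), [(self, math.cos(self.value))])

        def backward(self):
            # Topologically order the graph, then propagate gradients in
            # reverse so each node is finished before its parents are updated.
            order, seen = [], set()
            def visit(node):
                if node not in seen:
                    seen.add(node)
                    for parent, _ in node.parents:
                        visit(parent)
                    order.append(node)
            visit(self)
            self.grad = 1.0
            for node in reversed(order):
                for parent, local in node.parents:
                    parent.grad += local * node.grad

    x, y = Var(2.0), Var(3.0)
    z = x * y + x.sin()        # forward pass: z = x*y + sin(x)
    z.backward()
    print(x.grad, y.grad)      # dz/dx = y + cos(x), dz/dy = x

This sketch happens to build the graph via operator overloading; the same bookkeeping can instead be done by a source transformation that emits the backward code as text.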
Backprop is what happens when you tell the programmer to do the thing autodiff is doing. You examine the computational graph, write down all the local derivative steps that autodiff would perform, and that new code (hand-written rather than machine-generated) is a function that computes the derivative by backpropagating error terms through each edge of the computational graph.
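For comparison, the hand-written backprop version of the same toy function: the programmer walks the graph and writes out the adjoint statements that a tool would otherwise generate (again just an illustrative sketch of mine, not anything from the article):

    import math

    def forward(x, y):
        a = x * y            # local derivatives: da/dx = y, da/dy = x
        b = math.sin(x)      # db/dx = cos(x)
        z = a + b            # dz/da = 1, dz/db = 1
        return z, (x, y)     # keep whatever the backward pass will need

    def backward(saved, dz=1.0):
        # Push the error term dz back through each edge in reverse order.
        x, y = saved
        da = dz * 1.0            # through z = a + b
        db = dz * 1.0
        dx = da * y              # through a = x * y
        dy = da * x
        dx += db * math.cos(x)   # through b = sin(x)
        return dx, dy

    z, saved = forward(2.0, 3.0)
    print(backward(saved))       # (3 + cos(2), 2), same as the autodiff sketch above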
Many compsci people have been captivated by it and have written introductions trying to put the technique into a wider perspective. Here is mine, including a "poor man's variant" of automatic differentiation that does without operator overloading and uses complex numbers instead:
https://pizzaseminar.speicherleck.de/automatic-differentiati...