> Generative Manim is a prototype of a web app that uses GPT-4 to generate videos with Manim. The idea behind this project is to take advantage of GPT-4's programming ability and understanding of human language, together with Manim's animation capabilities, to build a tool that anyone can use to create videos, regardless of their programming or video-editing skills.
import unittest

import sympy as sy

test = unittest.TestCase()
x = sy.symbols("x")

# i^(4x) = e^(2*pi*i*x) on the principal branch, but sympy compares
# expressions structurally, so it does not treat the two as equal:
with test.assertRaises(AssertionError):
    test.assertEqual(sy.I**(4*x), sy.E**(2*sy.I*sy.pi*x))
The flip and slide thing that electrical engineers do has always struck me as unintuitive; it makes more sense to think of it as a blurring operation.
Consider a function f: R^2 -> R (or Z^2 -> Z if you like) that represents a grayscale image. So f(0,0) is the pixel at the origin, f(1,0) is the pixel at (1,0), etc. Think of g: R^2 -> R as a blurring function, e.g. a gaussian.
What convolution does is it turns every pixel of f into a copy of the blur g, weighted by and centered on each pixel being blurred. So f(0,0) gets turned into a blurred image h(x,y) = f(0,0)g(x,y). f(1,0) gets turned into a blurred image h(x,y) = f(1,0)g(x-1,y). Note that the subtraction is just recentering g so the blur applies in the right position. In general, each pixel gets blurred into the function h(x,y) = f(a,b)g(x-a,y-b).
Now sum up all the blurred pixels, so you get the final image (f*g)(x,y) = integral_(a,b) f(a,b)g(x-a,y-b).
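The sum-of-blurred-pixels picture can be checked numerically against the textbook formula; a small discrete sketch (the arrays `f` and `g` are mine, standing in for the comment's image and blur, with g anchored at its corner rather than centered, which is just an index offset):

```python
import numpy as np

# A tiny "image" f with two bright pixels, and a toy blur kernel g.
f = np.zeros((5, 5))
f[1, 1] = 1.0
f[3, 2] = 0.5

g = np.array([[0.0, 0.1, 0.0],
              [0.1, 0.6, 0.1],
              [0.0, 0.1, 0.0]])

H, W = f.shape
kh, kw = g.shape
out = np.zeros((H + kh - 1, W + kw - 1))

# "Sum of blurred pixels": each pixel f[a, b] lays down a copy of g,
# weighted by f[a, b] and positioned at (a, b).
for a in range(H):
    for b in range(W):
        out[a:a + kh, b:b + kw] += f[a, b] * g

# Textbook definition: (f*g)(x, y) = sum_{a,b} f(a, b) g(x - a, y - b).
ref = np.zeros_like(out)
for x in range(out.shape[0]):
    for y in range(out.shape[1]):
        ref[x, y] = sum(f[a, b] * g[x - a, y - b]
                        for a in range(H) for b in range(W)
                        if 0 <= x - a < kh and 0 <= y - b < kw)

assert np.allclose(out, ref)
```

Both loops build the same array, so the "blurring" construction and the usual formula agree.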
Same thing can be done in the time domain instead of a spatial domain, or you can write it in vector form, so (f*g)(x) = integral_(r) f(r)g(x-r).
Note that you can also write it in a more symmetric way as (f*g)(c) = integral_(a+b=c) f(a)g(b). This makes it clear that you can do this for any semigroup (you recover the usual definition for a group by noting that b=c-a), makes commutativity obvious, and makes it clear why polynomial multiplication is convolution of coefficients: x^n comes from summing over all the x^i*x^j with i+j=n, so the coefficient of x^n is the sum over all coefficients indexed by i,j with i+j=n, which is exactly the symmetric way to write convolution.
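The polynomial case can be spelled out directly from the symmetric form; a small sketch (the example polynomials are mine):

```python
import numpy as np

# Coefficients of f(x) = 1 + 2x + 3x^2 and g(x) = 4 + 5x,
# lowest degree first.
f = [1, 2, 3]
g = [4, 5]

# Symmetric form: the coefficient of x^n is the sum over all
# pairs i + j = n of f_i * g_j.
n_out = len(f) + len(g) - 1
h = [sum(f[i] * g[j]
         for i in range((len(f)))
         for j in range(len(g))
         if i + j == n)
     for n in range(n_out)]

# The same thing, as the standard discrete convolution.
assert h == list(np.convolve(f, g))
assert h == [4, 13, 22, 15]   # (1 + 2x + 3x^2)(4 + 5x)
```

The double loop over i+j=n is exactly the "sum over the diagonal" view mentioned further down the thread.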
> it makes more sense to think of it as a blurring operation.
It's not a blurring operation; it's not even an operation on images. And even when applied to images, only some convolutions result in what could be described as a "blur." The description is also tautological: you can't really define why a gaussian filter "blurs" an image without first understanding the convolution of the image with the filter kernel.
imho you should throw out the notion of "intuition" and stick with the mathematical definition and then explore its properties and relationships.
The simplest and most intuitive description I've heard is "convolution is a mathematical operation that describes mixing two things together" (those things being vectors, matrices, continuous functions, etc)
I don't need to know about convolution to tell you what a Gaussian is; it's just e^(-x^2). You can look at a grayscale image with a single bright point pixel (like a star in the sky) and say it's "sharp", and an image with a gaussian and say it's a "blur" or a blob. That's just a qualitative description of what a point or a gaussian looks like. Then, as I described, you can define convolution by saying you're replacing each point pixel with an image that is your blur with its brightness weighted by the point pixel and located at that pixel, and then adding them all up.
That intuition also works fine in the time domain, where your signal is "blurred" at each instant by the impulse response.
But yeah more generally, all you get is that (f*g)(c) is the sum over all a (**) b=c f(a)g(b), where (**) is... some operation.
I guess ultimately my point is that looking at g(t-tau) and seeing -tau as "flip the signal" and t as "move to the desired time" seems absolutely bizarre to me. g(t-tau) is g(t), centered at tau. So f(tau)g(t-tau) is g(t), centered at tau, weighted by f(tau). Then sum over all tau. I can never see where the flip comes in; it seems backwards to me. Nor can I see how this flipping thing would intuitively arise in physical systems. Obviously they're equivalent, but it just doesn't make sense to me.
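The equivalence the comment concedes is easy to verify numerically: summing weighted, shifted copies of g gives the same result as the flip-and-slide formula. A 1D sketch (signal and kernel values are mine):

```python
import numpy as np

f = np.array([1.0, 2.0, 0.5])      # input signal
g = np.array([0.25, 0.5, 0.25])    # impulse response / blur

N = len(f) + len(g) - 1

# "Shifted copies" view: each sample f[tau] lays down a copy of g
# starting at tau, weighted by f[tau]; then sum over all tau.
copies = np.zeros(N)
for tau in range(len(f)):
    copies[tau:tau + len(g)] += f[tau] * g

# "Flip and slide" view: (f*g)[t] = sum_tau f[tau] * g[t - tau].
flip = np.array([sum(f[tau] * g[t - tau]
                     for tau in range(len(f))
                     if 0 <= t - tau < len(g))
                 for t in range(N)])

assert np.allclose(copies, flip)
```

The two loops are just two orders of summing the same terms f[tau]·g[t-tau]: grouped by tau (a copy of g per input sample) or grouped by t (one output sample at a time).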
OK, but when I try this, I don't get any good intuitions.
Polynomial multiplication is convolution, so what am I supposed to do here when you say "g is a blurring function"?
When I think of multiplying term by term and summing over the diagonal, I can clearly see what f(tau)g(t-tau) is doing: it's collecting terms with like exponents.
This also makes sense when I think of probability distributions: I'm multiplying the densities and summing "over the diagonal".
If you plot your coefficients as if they were a discrete-time signal, you ought to see multiplication appear as a blur, I guess, but yeah, IMO it's hard to interpret for polynomials because polynomials are very abstract.
I'd certainly look at the diagonal (or what I said is the "symmetric" way to write it) as the proof for "why" convolution appears in the sum of random variables, and I've never considered the blurring thing in that setting, but it actually makes a lot of intuitive sense for looking at "what" happens when you add distributions. Like just picturing repeated blurrings of a uniform distribution at the origin (i.e. start with a square and blur with a square over and over), it seems intuitively plausible that it will converge to something like a gaussian (i.e. that you'll get a bump of some kind), so that's neat.
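The repeated-blurring picture is easy to play with numerically; a sketch (the box width and number of repetitions are mine):

```python
import numpy as np

# A uniform distribution ("box"), repeatedly blurred with itself.
box = np.ones(5) / 5
dist = box.copy()
for _ in range(4):              # distribution of the sum of 5 iid uniforms
    dist = np.convolve(dist, box)

# The result is a symmetric, single-peaked bump: the square edges
# get smoothed out toward a gaussian-like shape, as the CLT predicts.
assert np.allclose(dist, dist[::-1])        # symmetric
assert np.argmax(dist) == len(dist) // 2    # peak in the middle
assert np.isclose(dist.sum(), 1.0)          # still a probability distribution
```

Plotting `dist` after each step shows the square turning into a triangle, then into progressively smoother bumps.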
And which you can use to do actual multiplication, using the DFT and inverse DFT, if you let $f$ and $g$ be polynomials over, say, $Z/(2^k + 1)$, where the modulus is chosen so that $2^k$ is the word size of your machine.
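A floating-point FFT version of the same idea (the modular variant described above swaps the complex roots of unity for roots of unity mod $2^k + 1$; the example polynomials are mine):

```python
import numpy as np

f = [1, 2, 3]          # 1 + 2x + 3x^2
g = [4, 5]             # 4 + 5x
n = len(f) + len(g) - 1

# Zero-pad both inputs to length n so circular convolution equals
# linear convolution, multiply pointwise in the frequency domain,
# and transform back.
F = np.fft.rfft(f, n)
G = np.fft.rfft(g, n)
h = np.rint(np.fft.irfft(F * G, n)).astype(int)

assert list(h) == [4, 13, 22, 15]
assert list(h) == list(np.convolve(f, g))
```

Reading the inputs as base-10 digit strings, this is ordinary multiplication before carrying: evaluating at x = 10 gives 321 * 54 = 4 + 13*10 + 22*100 + 15*1000 = 17334.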