This is super cool but it's unfortunate that it clamps negative result values to black.
It's probably worth mentioning that there are a lot of ways to implement convolution with a kernel, and the kernel can be of any size, not just 3×3. The explanation here shows how to implement the output-side algorithm nonrecursively; http://www.dspguide.com/ch6/3.htm gives this for the one-dimensional case. But you can implement it on the input side instead (iterating over the input samples instead of the output samples), there are kernels that have a much more efficient recursive implementation (including zero-phase kernels using time-reversal), you can implement very large kernels if you can afford to do the convolution in the frequency domain, and there's a whole class of kernels that have efficient sparse filter cascade representations, including large Gaussians.
(To say nothing of convolutions over other rings.)
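If anyone wants to play with this, here's a minimal sketch of the nonrecursive output-side approach in Python (NumPy only; the zero padding and same-size output are arbitrary choices on my part):

    import numpy as np

    def convolve2d_output_side(image, kernel):
        # For each output pixel, sum the kernel-weighted input patch.
        # O(W*H*k*k), which is why large kernels want the FFT instead.
        kh, kw = kernel.shape
        ph, pw = kh // 2, kw // 2
        # Zero-pad so the output matches the input's shape.
        padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
        flipped = kernel[::-1, ::-1]  # true convolution flips the kernel
        out = np.zeros(image.shape, dtype=float)
        for y in range(image.shape[0]):
            for x in range(image.shape[1]):
                out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * flipped)
        return out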
The standard thing to do in these cases is to make zero be a medium gray, so that you can see negative output as well as positive. If you use one of the edge-detection filters, you'll see what I mean.
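A sketch of that remap, assuming 8-bit display values and filter output roughly in [-255, 255]:

    import numpy as np

    def signed_to_gray(out):
        # Zero maps to mid-gray (128); negative responses go darker,
        # positive ones brighter. Halving keeps the range in bounds.
        return np.clip(out / 2 + 128, 0, 255).astype(np.uint8)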
Well, yeah. That has its uses, but when one mentions clamping negative results to black, I assumed that was for pictures. Obviously negative values mean something in other contexts, but I assumed we were talking about pictures.
No experience with this stuff, but it feels like you could take the abs() of the result to extract some meaning from the negative values.
Applying this to the "outline" kernel, it seems like white bordered by black would show up just as white as black bordered by white, with homogeneous regions still showing up black.
(Can't comment on much else, but just wanted to say thanks for the link. I just got an SDR and the link you sent looks incredibly useful for learning. Thanks!)
A little art and plenty of science. The kernel matrices can be broken down logically once you know what the numbers are operating on. Considering a 3x3 kernel, the center of the kernel matrix is the origin pixel, and the kernel elements around the origin are the neighboring pixels in their respective directions and distances.
The identity kernel is [0 0 0; 0 1 0; 0 0 0]. For every input pixel, the output is the original pixel value. Don't change the pixel value based on what is around it.
A simple blur kernel would be 1/9 * [1 1 1; 1 1 1; 1 1 1]. The output for each input pixel is the average of the origin pixel's value and all of its neighboring values, with even weighting. A less dramatic blur can weight the origin pixel more heavily than the neighbors, as a Gaussian blur does. This results in the output pixel being more similar to the origin pixel than to its neighbors.
Edge detection like [-1 -1 -1; -1 8 -1; -1 -1 -1] can be understood by multiplying the origin pixel value by 8 and subtracting off all 8 of its neighboring values. If the values are all fairly similar, let's say all gray, your output will be black: 8A - 8A = 0. So it "punishes" pixels that are similar to their neighbors. When a pixel is different from some or all of its neighbors, you will be left with some value > 0 at the output, which detects a change from the neighbors: an edge. Edge detection in the horizontal direction (the Sobel kernel): [1 0 -1; 2 0 -2; 1 0 -1]. It ignores the pixels above, below, and at the center, but accentuates the difference between what is on the left and what is on the right.
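A quick way to try these out yourself (a sketch using scipy.ndimage; the random array is a stand-in for whatever grayscale image you have):

    import numpy as np
    from scipy import ndimage

    identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
    box_blur = np.ones((3, 3)) / 9.0
    outline  = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)

    image = np.random.rand(64, 64)  # stand-in for a grayscale image
    blurred = ndimage.convolve(image, box_blur, mode="reflect")
    edges   = ndimage.convolve(image, outline, mode="reflect")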
In general terms, convolution is defined for two functions, which can both be continuous, and is then essentially a "weighted integral". In the 1D case (sound), when both functions have a discrete domain and the filter's domain is finite, it is the same thing as an FIR filter. Convolution with a continuous-domain kernel (e.g. sinc()) is useful, for example, for resampling/scaling.
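In code, the discrete 1D case is just np.convolve; a 3-tap moving-average FIR filter as an illustration:

    import numpy as np

    signal = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
    fir = np.ones(3) / 3.0  # 3-tap moving average
    smoothed = np.convolve(signal, fir, mode="same")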
For blurring, you can, for example:
- generate a matrix in which each element is equal to 1 (i.e. averaging)
- generate a matrix that represents a Gaussian distribution (you can sample a 2D Gaussian function)
For edge detection, you essentially have a "derivative", i.e. a rate of change: the more abrupt the change, the brighter the resulting pixel, which is why edges get highlighted. A good convolution kernel for edge detection is the Laplacian.
For sharpening, it's pretty much the Laplacian kernel + the identity kernel.
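Concretely, a sketch (using the 4-neighbor Laplacian, with the sign convention that makes edges positive):

    import numpy as np

    identity  = np.array([[0,  0,  0], [ 0, 1,  0], [0,  0,  0]], dtype=float)
    laplacian = np.array([[0, -1,  0], [-1, 4, -1], [0, -1,  0]], dtype=float)

    sharpen = identity + laplacian
    # [[ 0, -1,  0],
    #  [-1,  5, -1],
    #  [ 0, -1,  0]]  <- the familiar sharpen kernel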
The values of the Gaussian kernel matrix are determined by discretely sampling the Gaussian function. You get to choose sigma (the Gaussian's standard deviation) and the kernel size (the spatial neighborhood of the kernel, i.e. how much of the surroundings the kernel will examine).
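A sketch of that sampling (the grid centering and the normalization are the usual choices, so the weights sum to 1 and overall brightness is unchanged):

    import numpy as np

    def gaussian_kernel(size, sigma):
        # Sample the 2D Gaussian on a size x size grid centered at 0,
        # then normalize so the weights sum to 1.
        ax = np.arange(size) - (size - 1) / 2.0
        xx, yy = np.meshgrid(ax, ax)
        k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
        return k / k.sum()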
The Sobel kernel matrix is the result of composing a Gaussian smoothing with a spatial-differencing operation. Thus, Sobel estimates edges from smoothed images.
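You can see that composition in the kernel itself: Sobel factors into a smoothing vector and a differencing vector (a sketch):

    import numpy as np

    smooth = np.array([1, 2, 1])   # small binomial (Gaussian-like) smoothing
    diff   = np.array([1, 0, -1])  # central difference

    sobel = np.outer(smooth, diff)
    # [[ 1, 0, -1],
    #  [ 2, 0, -2],
    #  [ 1, 0, -1]]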
As for the sharpen kernel described in the post -- an intuitive explanation is that you want to accentuate differences in pixel intensities.
Both, but lots of science. In particular, convolution is linear, so if you know what you want a bright dot (a single pixel, a delta function) to look like, then that is your kernel. Want a dot to look like a fuzzy blob? Then your kernel should be a fuzzy blob (Gaussian blur). Want a dot to turn into a positive blip on the left and a negative one on the right (an x-derivative)? Then (-1, 0, 1) is your buddy. (I lied slightly: the kernel is the desired dot response flipped, which only matters in the non-symmetric cases.)
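You can check the "a dot becomes the kernel" claim directly; a sketch using scipy.ndimage.correlate, since most image-kernel tools (the article included, I believe) slide the kernel without flipping it:

    import numpy as np
    from scipy import ndimage

    delta = np.zeros((5, 5))
    delta[2, 2] = 1.0  # a single bright dot: the delta function

    kernel = np.array([[-1.0, 0.0, 1.0]])
    response = ndimage.correlate(delta, kernel, mode="constant")
    print(response[2])  # [ 0.  1.  0. -1.  0.] -- the kernel, flipped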
If you understand how things work in the frequency domain, you can design them there and convert them back to the time domain (or leave them in the frequency domain, because if they're of any significant size, convolution is faster in the frequency domain anyway).
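scipy.signal.fftconvolve does exactly that multiplication in the frequency domain for you; a sketch with a deliberately large kernel, where the FFT route wins:

    import numpy as np
    from scipy.signal import fftconvolve

    image  = np.random.rand(512, 512)   # stand-in image
    kernel = np.random.rand(64, 64)     # large kernel: direct would be slow

    out = fftconvolve(image, kernel, mode="same")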
Image kernels are 2D convolutions, and you can think of them as extensions of 1D convolutions. In 1D, understanding the frequency-domain behavior is much easier, since you don't have to worry about spatial frequency, and 1D frequency is something most people can intuitively understand.
Look at FIR filters; they are essentially 1D image kernels.
Nope, and in general an inverse doesn't exist. That said, there are techniques for "deconvolution" which partially reverse (or, in some special cases, completely reverse) a kernel convolution.
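A sketch of the frequency-domain flavor (a Wiener-style regularized inverse; the eps constant is my stand-in for a real noise model):

    import numpy as np

    def naive_deconvolve(blurred, kernel, eps=1e-3):
        # Divide out the kernel's spectrum where it is nonzero; where it
        # is (near) zero the information is gone for good, which is why
        # an exact inverse doesn't exist in general. eps regularizes.
        H = np.fft.rfft2(kernel, s=blurred.shape)
        B = np.fft.rfft2(blurred)
        restored = B * np.conj(H) / (np.abs(H) ** 2 + eps)
        # (Ignores the circular shift from kernel placement; fine for a sketch.)
        return np.fft.irfft2(restored, s=blurred.shape)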
I don't often see image kernels compared to cellular automata, but that's what they are. We just don't iterate more than once or twice with a kernel, and the long-term evolution (stability, chaos, or more interesting dynamics) is not the concern here.
That is to say, there is more to cellular automata than the GOL, and one bit per cell.
I'd be interested to learn more about this! I'm completely outside image processing/machine learning/whatever this is, but articles about it really fascinate me.
A Survey on Two Dimensional Cellular Automata and Its Application in Image Processing
Parallel algorithms for solving any image processing task is a highly demanded approach in the modern world. Cellular Automata (CA) are the most common and simple models of parallel computation. So, CA has been successfully used in the domain of image processing for the last couple of years. This paper provides a survey of available literatures of some methodologies employed by different researchers to utilize the cellular automata for solving some important problems of image processing. The survey includes some important image processing tasks such as rotation, zooming, translation, segmentation, edge detection, compression and noise reduction of images. Finally, the experimental results of some methodologies are presented.
Thanks for this. I have been playing with CAs recently[1], and doing a lot of GIF rendering based on the Game of Life rules. This water rendering algorithm seems like a great next step for me to experiment with.
Image kernels generalize very naturally to real values (so naturally that it hardly constitutes a generalization). CAs can be generalized but it seems less natural.
CAs are already highly generalized; we are just conditioned to think only of the famous ones like the Game of Life.
All you need for a CA is a space in which to map your cells (in an arbitrary number of dimensions, with an arbitrary mapping), some state for each cell (maybe a real, this is arbitrary) and some transition function to compute the next state of each cell from its current state and that of its "neighborhood" (which again is arbitrarily defined.)
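In code, that definition is about this small (a sketch; the 2D grid, Moore neighborhood, and toroidal wrap-around are my arbitrary choices):

    import numpy as np

    def ca_step(grid, transition):
        # One synchronous update: each cell's next state is a function of
        # its current state and its 3x3 (Moore) neighborhood.
        h, w = grid.shape
        nxt = np.empty_like(grid)
        for y in range(h):
            for x in range(w):
                nbhd = grid[np.ix_([(y - 1) % h, y, (y + 1) % h],
                                   [(x - 1) % w, x, (x + 1) % w])]
                nxt[y, x] = transition(grid[y, x], nbhd)
        return nxt

    # The Game of Life is just one choice of transition function:
    def life(cell, nbhd):
        neighbors = nbhd.sum() - cell
        return 1 if neighbors == 3 or (cell == 1 and neighbors == 2) else 0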
The problem is that CAs are too generalized. They don't have the kind of structure image kernels have (in particular, they aren't linear transformations).
Just wanted to add that this is quoting the definition of CAs as instances of dynamical systems. It is interesting to think about how a large number of filter passes would affect an image. Makes me wonder if there are very simple filters that create absolute chaos after a while (could an edge detector be one?).
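That's easy to experiment with; a sketch that iterates the outline kernel, clipping after each pass (the clipping is itself the nonlinearity, since a purely linear filter iterated forever can only decay or blow up):

    import numpy as np
    from scipy import ndimage

    outline = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)

    image = np.random.rand(128, 128)
    for _ in range(100):
        image = ndimage.convolve(image, outline, mode="wrap")
        image = np.clip(image, 0.0, 1.0)  # keep values in [0, 1] each pass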
This was an interesting article, though I am not an image-o-phile. However, what I really liked was the base site! I am a part-time instructor for business students, and I am teaching them about the power of visualization. This is an incredibly illustrative resource that explains points well, and I'll be able to use it as a teaching tool! Thanks for the post!