more oxxoxoxooo's comments

oxxoxoxooo · on May 5, 2021

At the very bottom of the subsequent post [1], is it really possible for the second `qhat` (i.e. `q0`) to be off by 2? Any examples of that?

[1] https://ridiculousfish.com/blog/posts/labor-of-division-epis...

oxxoxoxooo · on Dec 31, 2020

Not sure why this is getting downvoted; it is the correct definition of "telephoto" (i.e. "the physical length of the lens is shorter than the focal length").

pkd · on Dec 31, 2020

Eh, no? It's based on the field of view, not on the physical length of the lens.

Fuji's 27mm pancake lens is only 23mm tall. A telephoto lens it is not.

roelschroeven · on Dec 31, 2020

It's the original/technical definition. In common parlance telephoto means small field of view.

The commonly used meaning has drifted from the original/technical definition. The same has happened with other things. 'Lens' for example, is technically one single element; the whole assembly is technically an 'objective'. But in photography the latter is what is commonly referred to as 'lens'.

Language isn't always precise and is often context-dependent (I don't always like that, but there's not a lot I can do).

sprobertson · on Dec 31, 2020

GP was more correct in that it's the construction (using a telephoto group) that makes it shorter. Being shorter does not in itself make it telephoto.

oxxoxoxooo · on Sept 27, 2020

What is "Native zero padding to model open systems"? And how come it is "up to 2x faster than simply padding input array with zeros"?

gct · on Sept 28, 2020

So you can pad your input array with zeros, but the algorithm doesn't know that it's padded, and will just compute with those zeros like any other value. If you could tell it that they were zeros it could take advantage of x*0=0 and x+0=x to significantly reduce computation. That's what I think that is.
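To illustrate the idea in this comment, here is a minimal Python sketch. It uses a naive DFT rather than a real FFT, and the function names `dft_padded` and `dft_zero_aware` are made up for this example; it only shows that terms multiplied by a known zero can be dropped without changing the result.

```python
import cmath

def dft_padded(signal, n_total):
    """Naive DFT that pads with zeros and then treats them like any other value."""
    padded = list(signal) + [0] * (n_total - len(signal))
    return [
        sum(padded[k] * cmath.exp(-2j * cmath.pi * j * k / n_total)
            for k in range(n_total))
        for j in range(n_total)
    ]

def dft_zero_aware(signal, n_total):
    """Same transform, but the sum stops at the last real sample.

    Every skipped term is 0 * w, which contributes nothing (x*0=0, x+0=x).
    """
    m = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * cmath.pi * j * k / n_total)
            for k in range(m))
        for j in range(n_total)
    ]
```

With a signal of length 4 padded to 8, the zero-aware version does half the multiplications per output value and produces the same spectrum.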

DTolm · on Sept 28, 2020

That is almost the correct answer. To go even further, there are sequences that are completely full of zeros in the padded case of multidimensional FFTs and we can omit their FFTs entirely.

oxxoxoxooo · on Sept 28, 2020

Thank you for the reply! Could you be more specific? In the case of a 1D FFT, the right half (possibly zero-padded) of the signal is completely mixed up with the left half after the first pass [of a breadth-first FFT]. If the right half was all zeros, would it still be twice as fast in the 1D case? Do you have any pointers to literature which discusses this?

DTolm · on Sept 28, 2020

No, the 1D case will mostly save on the fact that it transfers 2x less data from the VRAM to the chip. The up to 2x increase in performance was mainly related to the 2D and 3D cases, where only 1/4 or 1/8 of the data is nonzero. In 2D, when doing 1D FFTs over the x-axis, we omit sequences after Ny/2 because we know they are full of zeros and thus their result will be 0. So we do 0.5Ny x-axis FFTs and the full Nx y-axis FFTs. For a square system this will mean a drop from 2N to 1.5N sequences. In 3D the drop will be even bigger, from 3N^2 to (1/4+1/2+1)N^2 = 1.75N^2 sequences (almost 2x).
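The sequence counts above can be checked with a small Python sketch. This is not the library's actual implementation: `dft`, `fft2_skip_zero_rows`, and `sequence_counts_3d` are hypothetical names, a naive O(n^2) DFT stands in for a real FFT, and the grid is assumed to be square with all nonzero data in the first rows.

```python
import cmath

def dft(seq):
    """Naive 1D DFT, adequate for a small demonstration."""
    n = len(seq)
    return [
        sum(seq[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def fft2_skip_zero_rows(grid, nonzero_rows):
    """2D transform of an n x n grid whose rows >= nonzero_rows are all zero.

    Returns (result, number of 1D transforms actually performed).
    """
    n = len(grid)
    count = 0
    rows = []
    for y in range(n):
        if y < nonzero_rows:
            rows.append(dft(grid[y]))  # x-axis pass on a row that holds data
            count += 1
        else:
            rows.append([0j] * n)      # transform of an all-zero row is all zero
    cols = []
    for x in range(n):
        # y-axis pass: every column is needed once the x-axis pass has run
        cols.append(dft([rows[y][x] for y in range(n)]))
        count += 1
    result = [[cols[x][y] for x in range(n)] for y in range(n)]
    return result, count

def sequence_counts_3d(n):
    """(full, zero-aware) counts of 1D sequences for an n^3 grid, 1/8 nonzero."""
    half = n // 2
    x_pass = half * half  # only sequences with y < n/2 and z < n/2 hold data
    y_pass = n * half     # x is full after the first pass, z is still half
    z_pass = n * n        # everything is populated by the last pass
    return 3 * n * n, x_pass + y_pass + z_pass
```

For n = 8 the 2D version performs 12 one-dimensional transforms instead of 16 (1.5N versus 2N), and the 3D count is 112 instead of 192 (1.75N^2 versus 3N^2).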

oxxoxoxooo · on June 9, 2020

If you are into integer sorting, this might be of interest as well:

https://yourbasic.org/algorithms/fastest-sorting-algorithm/

https://sorting.cr.yp.to/

oxxoxoxooo · on May 30, 2020

Josh, Gerben!

Have you tried a sorting network, instead of the bubble sort?

gerbenst · on May 30, 2020

No, and I suspect that there is room for significant improvement here.

ncmncm · on May 31, 2020

I got pretty dramatic results with just a 3-element branchless sorting network.
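For reference, a 3-element sorting network needs only three compare-exchange steps. The Python sketch below is an illustration, not the code any commenter here used: the names `compare_exchange` and `sort3` are made up, it handles integers only, and the mask trick merely mirrors what a compiled language would do with conditional moves (Python itself gives no guarantee of branch-free execution).

```python
def compare_exchange(a, b):
    """Return (min, max) of two ints without an if statement."""
    mask = -(a < b)        # -1 (all bits set) when a < b, otherwise 0
    diff = (a ^ b) & mask  # a ^ b when a < b, otherwise 0
    lo = b ^ diff          # a when a < b, otherwise b
    hi = a ^ diff          # b when a < b, otherwise a
    return lo, hi

def sort3(a, b, c):
    """Fixed sequence of three compare-exchanges: (0,1), (1,2), (0,1)."""
    a, b = compare_exchange(a, b)
    b, c = compare_exchange(b, c)  # c now holds the maximum
    a, b = compare_exchange(a, b)  # order the remaining two
    return a, b, c
```

The sequence of comparisons is the same for every input, which is what lets a compiler emit it without data-dependent branches.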