
Doesn't this imply a 4th type with alternating rotated atoms and aligned magnetic spin? Also seems like you could mix and match (making the effect continuously tunable at macro scale).

Yeah, they mentioned it in the article: antialtermagnetism.

Sorry, I should have responded to this comment, but I wrote a separate response in the parent thread. I didn't feel the pdf/paper was really trying to mimic spiking biological networks in anything but the loosest sense (there is a sequence of activations and layers of "neurons"). I think the major contribution is just using the dot product on output transpose output; the rest is just diffusion/attention on inputs. It's conceptually a combination of "input attention" and "output attention" using a kind of stepped recursive model.


In my reading of the paper, I don't feel this is really like biological/spiking networks at all. They keep a running history of inputs and use multi-headed attention to form an internal model of how the past "pre-synaptic" inputs factor into the current output (post-synaptic). This is just like a modified transformer (keep a history of inputs, use attention on them to form an output).

Then the "synchronization" is just using an inner product of all the post-activations (stored in a large, ever-growing list, with subsampling for performance reasons).

But it's still being optimized by gradient descent, except the time step at which the loss is applied is chosen to be the time step with minimum loss, or minimum uncertainty (uncertainty being described by the entropy of the output term).

I'm not sure where people are reading that this is in any way similar to spiking neuron models with time simulation (time is just the number of steps the data is cycled through the system, similar to a diffusion model or how an LLM processes tokens recursively).

The "neuron synchronization" is also a bit different from how it's meant in biological terms. It's using an inner product of the output terms (producing a square matrix), which is then projected into the output space/dimensions. I suppose this produces "synchronization" in the sense that, to produce the right answer, the different outputs being multiplied together must produce the right values on the right timestep. It feels a bit like introducing sparsity (where combining many outputs into a larger matrix makes their combination more important than the individual values). The fact that they must correctly combine on each time step is what they are calling "synchronization".
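As a concrete (and entirely hypothetical) sketch of that inner-product step, here is roughly what I mean in NumPy; the shapes and the projection matrix are invented for illustration, not taken from the paper:

```python
import numpy as np

# Hypothetical sizes: D "neurons", a history of T timesteps, out_dim outputs.
rng = np.random.default_rng(0)
D, T, out_dim = 8, 5, 3

Z = rng.standard_normal((D, T))   # post-activation history, one row per neuron

# The "synchronization" matrix: an inner product over the time axis.
# Entry (i, j) measures how strongly neurons i and j co-vary across steps.
S = Z @ Z.T                       # square (D, D) matrix

# The square matrix is then projected into the output space/dimensions.
W_out = rng.standard_normal((D * D, out_dim))
y = S.reshape(-1) @ W_out         # (out_dim,)

print(S.shape, y.shape)           # (8, 8) (3,)
```

Note that S is symmetric by construction, so no single neuron's value matters on its own; only the pairwise products on each step do, which is the "must combine correctly" property.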

Techniques like this are the basic mechanism underlying attention (produce one or more outputs from multiple subsystems, then dot product to combine).


I would say one weakness of the paper is that they primarily compare performance with an LSTM (a simpler recursion model), rather than similar attention/diffusion models. I would be curious how well a model that just has N layers of attention in/out would perform on these tasks (using a recursive time-stepped approach). My guess is performance would be very similar, and the network architecture would also be quite similar (although a true transformer is a bit different from the input attention + U-Net they employ).


The time complexity of a large matrix multiplication is still much higher than using a Fourier transform; for large matrices the FFT has superior performance.


Exactly this. I was thinking exactly like GP, but I've been doing a large amount of benchmarking on this, and the FFT rapidly overcomes direct convolution with cuDNN, cuBLAS, CUTLASS... I think I'd seen a recent PhD thesis exploring and confirming this. N log N beats N^2 or N^3 quickly, even with tensor cores. At some point complexity overcomes even the bestest hardware optimizations.

And the longer the convolution, the more the matrices look tall-skinny (less optimized). Also, you have to duplicate a lot of data to make your matrix Toeplitz/circulant and fit it into matmul kernels, of which convolution is a special case...


> for large matrices it has superior performance.

how large? and umm citation please?


You can mostly compute it for yourself or get an asymptotic feeling. Matmul is N^3, even with tensor cores eating a large part of it (with diminished precision, but ok), and FFT-based convolution is mostly N log N. At some point, not that high, even the TFLOPS available on tensor cores can't keep up. I was surprised by how far cuDNN will take you, but at some point the curves cross: matmul-based convolution time keeps growing at a polynomial rate while FFT-based stays mostly linear (until it gets memory-bound), and even that can be slashed with block-based FFT schemes. Look up the thesis I posted earlier; there's an introduction in it.
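A back-of-the-envelope sketch of that crossover; the 32x constant penalizing the FFT is made up purely for illustration (real cuDNN/cuFFT constants differ a lot), but the shape of the comparison doesn't:

```python
import numpy as np

# Rough op counts for convolving two length-N signals:
# direct convolution ~ N^2 multiply-adds; FFT-based ~ 3 * N * log2(N)
# (two forward FFTs plus one inverse). The 32x fudge factor stands in
# for the FFT's worse constants/memory behavior; it is an assumption.
for N in [64, 256, 1024, 4096]:
    direct = N ** 2
    fft = 32 * 3 * N * np.log2(N)
    winner = "FFT" if fft < direct else "direct"
    print(f"N={N:5d}  direct={direct:>10}  fft={fft:>12.0f}  -> {winner}")
```

Even with a heavy constant handicap, the N log N curve wins somewhere around N in the low thousands here; tuning the constant only moves the crossover point, it never removes it.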


> Matmul is N^3

are we talking about matmul or conv? because no one, today, is using matmul for conv (at least not gemm - igemm isn't the same thing).

like i said - the arms race has been going for ~10 years and everyone already discovered divide-conquer and twiddle factors a long time ago. if you do `nm -C libcudnn.so` you will see lots of mentions of winograd and it's not because the cudnn team likes grapes.

> Look up the thesis I posted earlier there's an introduction in it.

there are a billion theses on this, including mine. there are infinite tricks/approaches/methods for different platforms. that's why saying something like "fft is best" is silly.


You asked how large and for a citation, and when given one with sizes, you already have a billion of them. The article is about long convolutions (Winograd being still a matmul implementation for square matrices, a pain to adapt to long convolutions, and still ~N^2.3). Comparing 10-years-optimized (as you say) cuDNN against very naive, barely-running FFT-based long convolutions shows that actual O complexity matters at some point, even with the best hardware tricks and implementations.

I don't know what more to say. Don't use FFT-based convolution for long convolutions if it doesn't work for your use cases, or if you don't believe it would or should. And those of us who have benchmarked against SOTA direct convolution and found that FFT-based convolution worked better for our use cases will keep using it, and will talk about it when people ask on forums.


> I don't know what more to say ?

Do you understand that you can't magic away the complexity bound on conv and matmul by simply taking the FFT? I know that you do, so given this very obvious fact, there are only two options for how an FFT approach could beat a conventional XYZ kernel:

1. The fft primitive you're using for your platform is more highly optimized due to sheer attention/effort/years prior to basically 2010. FFTW could fall into this category on some platforms.

2. The shape/data-layout you're operating on is particularly suited to the butterfly ops in cooley-tukey

That's it. There is no other possibility because again fft isn't some magical oracle for conv - it's literally just a linear mapping right?
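On the "literally just a linear mapping" point: that's the convolution theorem, and anyone can verify it numerically in a few lines. This NumPy check assumes nothing beyond zero-padding both signals to the full output length:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(100)   # signal
k = rng.standard_normal(31)    # kernel

direct = np.convolve(x, k)     # O(N*M) direct linear convolution ("full" mode)

n = len(x) + len(k) - 1        # full linear-convolution output length
via_fft = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(k, n), n)

print(np.allclose(direct, via_fft))   # True
```

Same linear map, two factorizations; which one is faster on a given platform is entirely a question of constants and data layout, which is the whole argument.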

So taking points 1 and 2 together, you arrive at the implication: whatever you're doing/seeing/benching isn't general enough for anyone to care. I mean, think about it: do you think the cudnn org doesn't know about Cooley-Tukey? It just so happens they've completely slept on a method that's taught in every single undergrad signals-and-systems class? So it must not be a coincidence that FFT doesn't rate as highly as you think it does. If you disagree, just write your fftdnn library that revolutionizes conv perf for the whole world and collect your fame and fortune from every single FAANG that currently uses cudnn/cublas.


Yes, this is common knowledge in econ: supply/demand and the money supply.

In other currencies, 1M base notes is not a lot (e.g. 1M dinar). You can just add/remove zeros, but prices have adjusted; those people can't live like "millionaires" on 1M dinar.

There was a time when goods like meat cost pennies; now it's $10 per pound. In those times, $10,000 would be a life-altering amount of money; today most people have $10,000 in assets. The price of goods is related to the money supply: they get more expensive if people have more money.

Money has no intrinsic value, it is balanced by whatever goods and services can be bought by it. If you add money but no goods and services, money is worth less (see COVID policy, increase money and decrease goods and services).
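A toy sketch of that balance using the quantity-theory identity MV = PQ; all numbers here are invented for illustration:

```python
# Quantity theory: M (money supply) * V (velocity) = P (price level) * Q (real output).
# Holding velocity fixed, the implied price level is P = M * V / Q.
M, V, Q = 1_000, 5, 500
P = M * V / Q                      # 10.0

# "COVID policy" scenario: 40% more money, 10% fewer goods and services.
P_after = (M * 1.4) * V / (Q * 0.9)
print(P, round(P_after, 2))        # 10.0 15.56
```

More money chasing fewer goods raises the price level on both ends of the fraction, which is why the combination hits harder than either change alone.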


Thank you! The COVID policy example feels especially relevant because it illustrates the kind of sudden economic shift I'm curious about, rather than just changes in nominal currency values. To clarify my thought experiment: imagine a stable economy where suddenly every person worldwide is gifted $1 million USD. I'm interested in exploring how this kind of immediate influx would impact prices and standards of living, beyond just inflation.


If every person is gifted USD 1M, prices of all things will go up by a lot. Furthermore, prices of necessities will rise by larger fractions, because most people in the world are significantly poorer than the median of this forum.

More generally, gifting every person worldwide the same amount of money seems roughly equivalent to taxing every above-average-wealth person a fixed percentage of their surplus and giving every below-average-wealth person a fixed percentage of their deficit...


I think that's not quite true, because rich people mostly keep their wealth in assets, not cash. Stock will just rise.


Which of my two claims is not true? Why should stock rise more than basic necessities? Of course, all prices would rise a lot. But I think stocks would rise by a smaller factor.


Ah sorry, I wasn't very clear. I was talking about your second claim - that giving people lots of money would redistribute a fixed proportion of wealth from the rich to the poor. My point was that the rich have most of their wealth in stocks and the like, so the redistribution would only affect the cash portion of their wealth, which is quite small.


I see. The "taxation" scheme I had in mind was intended to apply to "any and all wealth", which I didn't make clear. Do you think it works out then?

On a more realistic note, gifting every human 1M would probably completely break the financial system and cause a global recession...


To me this is the clearest case of using lawfare to try and suppress honest competition. Yes, Palworld borrows some creative ideas from Pokemon, but it's so clearly a different game that there's no question about it (if that's not different enough, then what is?).

In this case, I think Nintendo realizes their biggest cash-cow is pokemon so they _have_ to make a play to suppress any competition. However, this is very bad for the market imo and should be disincentivized somehow.


If they succeed, imagine the chilling effect to game makers who have to pause and think "if I succeed, will I be destroyed because my idea shares some commonalities with other games?"


Yeah, this would be grim, because this is how genres are born.



From what I understand, their lawfare approach is using Japanese software patent law; they have notably not tried to do the same using trademark or copyright infringement.

If they could get away with what you're suggesting, I imagine they would have tried it on digimon decades ago: https://digi-battle.com/Content/CardScans/CP-22.png vs https://archives.bulbagarden.net/media/upload/e/ed/0098Krabb...

There's a laundry list of these comparisons actually https://preview.redd.it/what-if-digimon-adventure-was-pokemo...


No one is looking at Digimon and going "Hey, that's Pokemon!" -- looking at Palworld's creatures for the uninitiated, it looks so very much like Pokemon creatures that most people I know have confused it for Pokemon.

There's a HUGE difference between being influenced by, and blatantly copying the inspiration and design. It took Nintendo decades to come up with creature designs, and Palworld less than a year - and they could do that because they likely went through each creature one-by-one and said, "How can we make it just so slightly different?"


No one is looking at Digimon and saying it looks like Pokemon now, but in the 90s they sure as heck were. People's mothers (i.e. the key demographic for "the uninitiated") commonly confused one for the other, even the TV shows. This is no longer the case purely because Digimon is far less popular.

Bearing in mind pretty much all Pokemon designs follow the rule of "what if a mythical animal existed in our art style", it is in fact shockingly easy to accidentally ape a Pokemon design just by cartoonifying mythos.

This is incidentally also true of Pokemon, who were accused of ripping off Dragon Quest when the first games started coming out. Does anyone remember that at this point?

https://pbs.twimg.com/media/GEYrZuzXUAAjpkB?format=jpg&name=...


Similarity in design is not necessarily infringement. Invincible is obviously based on the DC comics universe, with Omni-Man very similar to Superman both in appearance and background, and there are basically one-to-one equivalents of most of the Justice League (e.g. Darkwing for Batman). Yet that doesn't mean it infringes on DC's IP.


At this point Marvel and DC actually need each other in order to have a comic book market. The more readers you get at Marvel, the more potential ones you would get in DC (because they cover different social issues) and vice versa.


Invincible is actually published by Image Comics.


It does look bad, but this is about the core design of the creatures, not the gameplay. I think we should be concerned if this meant making a game clone like how first-person shooters originated as "Doom-clones" was off the table.


could you imagine a world where only the first company to develop a game mechanic is allowed to use that mechanic in their games?

That would make games like iPhones: only the smallest change allowed between each generation. Atari would rule the game universe, and the kids would be playing Jumpman in 4K resolution (now with 256 colors and 12 unique levels!)


The creatures are the biggest part of the Pokémon IP. You don't sell plushies and merch of the gameplay.


The entire lawfare approach thus far has been via patent dispute from what I understand, so the gameplay is actually what's being argued.


Agree the creatures look similar but that isn't at issue in the case, it's gameplay mechanics (catching creatures and then using them to battle).


They probably took some inspiration, but you could also argue that there are so many Pokémon nowadays that you can't create any creature from scratch that wouldn't look like one of them.

And some of them in the Pokémon world aren't really that inspired either...


For any aspiring inventors/engineers out there, take a good look at how Hall was treated by GE. He literally invented game-changing tech with every obstacle thrown in his way by management, and was given a 10% raise and a $10 savings bond.

Had he done it on his own, he would have been extremely wealthy, being the supplier of synthetic diamonds to the world (assuming he wouldn't have faced legal challenges by former employer). He would have also been able to pursue this full time, who knows how much he could have improved the tech.

Just because the powers that be don't think it's a good idea doesn't mean it isn't (it also doesn't mean it is). And if they don't want you building it, for goodness' sake, don't just give them your amazing idea; build it so you can profit when it turns out to be a golden nugget.


There's a similar story for the inventor of the blue LED!

https://en.wikipedia.org/wiki/Shuji_Nakamura#Careers


Love the gist of this, but I just wanted to point out that there's no need to draw the line between buildings and gardening. Anyone who has built a house or done a major remodel knows that it too suffers from fractal complexity. It may not be a nail that becomes a wormhole of complexity (just as it isn't simple arithmetic operations in programming), but all kinds of things can crop up: the soil has shifted since the last survey, the pipes from the city are old, the wiring is out of date, the standards have changed, the weather got in the way, the supplies changed in price/specification, etc. Everything in the world is like that; software isn't special in that regard. In fact, software only has such complexity because it's usually trying to model some real-world data or decision. For totally arbitrary toy examples, the code is usually predictable, simple, and clean; the mess starts once we try to fit it to real-world use cases (such as building construction).


I've seen little evidence that the smartest humans are able to dominate or control society as it is now. We have 250 IQ people alive right now; they haven't caused imminent destruction, they've actually helped society. Also, gaining power/wealth/influence seems only a little connected to intelligence (see the current presidential race for the most powerful position in the world, finger on the nuke trigger).


I loved the author's example of orcas, which may have more raw intelligence than a person -- still waiting for world domination (crashing yachts doesn't count).


We have zero people with an IQ over 200, due to how IQ is defined: http://www.wolframalpha.com/input/?i=6.67%CF%83

We also can't reliably test IQ scores over 130, a fact which I wish I'd learned sooner than a decade after getting a test result of 148.

Most humans are motivated to help other humans; the exceptions often end up dead, either in combat (perhaps vs. military, perhaps vs. police), or executed for murder, or as a result of running a suicidal cult like Jim Jones. But not all of them, as seen in the hawks on both sides of the cold war. "Better dead than red" comes to mind.

> Also gaining power / wealth / influence only seems a little connected to intelligence

On the contrary, they are correlated: https://www.vox.com/2016/5/24/11723182/iq-test-intelligence

Trump inherited a lot and reportedly did worse than the market average with that inheritance, so I'm not sure you can draw inference from him using that inherited money to promote himself as the Republican candidate, beyond the fact that it means other valid rich people (i.e. not Musk, because he wasn't born in the US) don't even want to be president — given the high responsibility and relatively low pay, can you say they're wrong? They've already got power and influence. Musk can ban anyone he wants from Twitter; the actual POTUS isn't allowed to do that. And given Musk's diverse businesses, even if he were allowed to run, he'd have a hard time demonstrating that he wasn't running the country to benefit himself (an accusation that has also been made against Trump). Sure, the POTUS has a military, but how many of the billionaires are even that interested in having their own, given what they can do without one?


Why? If you knew 100% someone was guilty, why would you defend them? Isn't the point of the "strong defense" that we haven't established guilt? If someone is guilty, then give them the consequence of the crime, unless you think having some guilty people get away with it (at substantial cost) is in the best public interest?

I understand providing a way to determine guilt and innocence without bias, but if everyone knows someone is guilty, isn't trying them and/or defending them just a waste of resources, with the best outcome being "same as plea bargain" and the worst possible outcome being "goes free without punishment, despite having committed the crime"?


It’s a token effort to catch the false positives. If you put in no effort, you’d start to become overconfident in the guilty assessment, and an increasingly large percentage of defendants would be considered guilty even though more of them would in fact be innocent.

Plus there is more to it than guilty/not guilty and the state should not be permitted to take shortcuts.


Our whole concept of justice enshrined in the US Constitution is that the truth must be ascertained by a jury of your peers from a fair presentation of the evidence and arguments in a court of law. And it is entirely to prevent people from being "rubber stamped" guilty and discarded. At its heart the trial is about ascertaining the truth, with the understanding that the verdict has some powerful gravity for everyone involved.


> If you knew 100% someone was guilty, why would you defend them?

Ask defense attorneys, I think you'll be surprised at the number of "yes" responses. How do you establish that "everyone knows" someone is guilty? You're assuming the conclusion as true and then reasoning from there. Put yourself in the shoes of a defendant who "everyone knows" is guilty. Would you not still want a robust and aggressive defense?


Ok, I get what people are saying, but you are describing the common case, where there is doubt about guilt and/or someone maintains innocence. I'm talking about the parent comment, which is saying "even if someone is known to be guilty, they still deserve a defense." Do you also believe someone who admits guilt should still be defended (e.g. hide this fact from the jury and proceed as though they didn't admit to the crime)? The only reason for defending someone is that they may be innocent; there is no advantage to excusing a guilty person. Or do you believe there is an advantage to excusing the guilty?

I also understand why we have our current system, and that there may be false positives. I'm merely commenting on the fact that we should not defend those who are 100% known to be guilty (and I'm not claiming that it's easy to ascertain, but in cases where people plead/confess, maybe it's for the best).


Let's enter fantasy land for a moment and assume that we even can ascertain whether someone is "100% known to be guilty" without a trial. Say we have a crystal ball that we can just ask. That person still is entitled to representation to ensure that all the proper procedure was followed (was the law somehow broken when the crystal ball was consulted? was the crystal ball accurately calibrated and configured? is it a real crystal ball and not a knock-off that always says "guilty"?).

Even if there was no crystal ball! The defendant admitted and signed a confession. He still needs a defense. Was the confession forced or obtained under duress? Did the defendant know what he was confessing to?

And even if there was no crystal ball, the defendant confessed voluntarily and understood fully what he confessed to, cooperatively admitted everything to the point where there is zero chance of reasonable doubt. He still needs a defense. Who is going to ensure that a fair punishment is imposed? Without a defense attorney and proper procedure, what stops the judge from simply imposing the maximum sentence for everyone?


Thanks this changed my mind and I agree with your assessment.

