
A lot of innovation is stealing ideas from two domains that often don’t talk to each other and combining them. That’s how we get simultaneous invention. Two talented individuals both realize that a new fact, when combined with existing facts, implies the existence of more facts.

Someone once asserted that all learning is compression, and I'm pretty sure that's how polymaths work. Maybe the first couple of domains they learn occupy considerable space in their heads, but then patterns emerge: this school of thought has elements of those other three, with important differences; X is like Y except for Z. Shortcut is too strong a word, but recycling, perhaps.



I'm not sure whether I'm misunderstanding you or your writing's in-group!

> learning is compression

I don't think I know enough about compression to find that metaphor useful.

> occupy considerable space in their heads

I reckon this is a terribly misleading cliché. Our brains don't work like hard drives. From what I see, we can keep stuffing more in there (compression?). Much of my past learning is now blurred, but sometimes it surfaces in intuitions? Perhaps attention or interest is a better concept to use?

My favorite thing about LLMs is wondering how much of people's (or my own) conversation is just LLM-style pattern completion. I love the idea of playing games with people to see if I can predictably trigger phrases from them, but unfortunately I would feel like a heel doing that (so I don't). And catching myself giving an LLM reply is wonderful.

Some of the other sibling replies are also gorgeously vague-as (and I'm teasing myself with vagueness too). Abstractions are so soft.


If you have some probability distribution over finite sequences of bits, a stream of independent samples drawn from that distribution can be compressed so that the number of bits in the compressed stream per sample from the original stream is, in the long run, the (base 2) entropy of the distribution. Likewise if, instead of independent samples, the source is a Markov process or something like that with a well-defined average entropy rate.

The closer one can get to this ideal, the closer one is to having a complete description of the distribution.

I think this is the sort of thing they were getting at with the compression comment.
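
To make it concrete, here's a rough Python sketch (my own, not anything from the thread): it draws i.i.d. bits from a Bernoulli(p) source, with p = 0.1 as an arbitrary assumption, compresses them with zlib standing in for an ideal entropy coder, and compares the achieved rate to the entropy H(p).

    import math
    import random
    import zlib

    # Illustrative assumption: a Bernoulli(0.1) bit source. The entropy is
    # H(p) = -p*log2(p) - (1-p)*log2(1-p) bits per source bit.
    p = 0.1
    n = 1_000_000          # number of source bits; divisible by 8 for packing
    random.seed(0)

    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # Pack the bits 8 per byte so a byte-oriented compressor can see them.
    bits = [1 if random.random() < p else 0 for _ in range(n)]
    packed = bytearray()
    for i in range(0, n, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        packed.append(byte)

    compressed = zlib.compress(bytes(packed), 9)
    rate = len(compressed) * 8 / n   # compressed bits per source bit

    print(f"entropy H({p}) = {entropy:.3f} bits per source bit")
    print(f"zlib rate     = {rate:.3f} bits per source bit")
    # zlib won't reach H(p) exactly, but the rate lands well below 1 bit per
    # source bit, which is the point: knowing the distribution buys compression.

A better-matched coder (arithmetic coding with the true p, say) would get closer to the entropy; zlib is just a convenient stand-in.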



