From a different robot (Boston Dynamics' new Atlas) - the system moves at a "reasonable" speed. But watch at 1m20s in this video[1]. You can see it bump and then move VERY quickly -- with speed that would certainly damage something, or hurt someone.
This doesn’t capture work that’s happened in the last year or so.
For example, some former colleagues of mine built a timeseries foundation model (Granite TS) that was doing pretty well when we were experimenting with it. [1]
An aha moment for me was realizing that the way you can think of anomaly models working is that they’re effectively forecasting the next N steps, and then noticing when the actual measured values are “different enough” from the expected ones. This is simple to draw on a whiteboard for one signal, but when it’s multivariate, it’s pretty neat that it works.
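For anyone who wants to see it off the whiteboard, here's a minimal sketch of the forecast-then-compare idea (all names are made up, and a trivial mean forecaster stands in for a real model):

```python
import numpy as np

def detect_anomalies(history, actual, forecaster, threshold=3.0):
    """Forecast the next N steps, then flag steps that deviate too far.

    history:    (T, D) array of past multivariate observations
    actual:     (N, D) array of newly measured values
    forecaster: callable(history, n_steps) -> (N, D) predicted values
    """
    predicted = forecaster(history, len(actual))
    # Scale residuals per signal so "different enough" means the same
    # thing for signals with very different units and variances.
    scale = history.std(axis=0) + 1e-9
    z = np.abs(actual - predicted) / scale
    return z.max(axis=1) > threshold      # True where any signal deviates too much

# Toy forecaster standing in for a real model: predict the recent mean.
mean_forecaster = lambda hist, n: np.tile(hist[-50:].mean(axis=0), (n, 1))

rng = np.random.default_rng(0)
history = rng.normal(size=(500, 3))       # three well-behaved signals
actual = rng.normal(size=(10, 3))
actual[7, 2] += 8.0                       # inject a spike in one signal
print(detect_anomalies(history, actual, mean_forecaster))   # the spike at step 7 gets flagged
```

All the interesting work hides in the forecaster; the thresholding stays this boring even in the multivariate case.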
My similar recognition was when I read about isolation forests for outlier detection[0]. When a point is much easier to isolate than the rest, something is off.
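If you want to poke at that, scikit-learn ships an IsolationForest; a toy run on synthetic 2-D data (default parameters) looks something like:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 2))   # dense cluster
outliers = rng.uniform(low=-8, high=8, size=(10, 2))     # scattered points

clf = IsolationForest(random_state=0).fit(normal)
# Outliers take fewer random splits to isolate, so they score as -1.
print(clf.predict(outliers))    # mostly -1 (anomalous)
print(clf.predict(normal[:5]))  # mostly  1 (normal)
```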
> what were you thinking then before your aha moment? :D
My naive view was that there was some sort of “normalization” or “pattern matching” happening. Like - you can look at a trend line that generally has some shape, and notice when something changes or there’s a discontinuity. That’s a very simplistic view - but I assumed that stuff was trying to do regressions and notice when something was out of a statistical norm, like k-means analysis. Which works, sort of, but is difficult to generalize.
Care to share the contexts in which someone needs a zero-shot model for time series? I have just never come across one in which you don't have some historical data to fit a model and go from there.
> About "people still thinking LLMs are quite useless", I still believe that the problem is that most people are exposed to ChatGPT 4o that at this point for my use case (programming / design partner) is basically a useless toy....
and
> a key thing with LLMs is that their ability to help, as a tool, changes vastly based on your communication ability.
I still hold that the innovations we've seen as an industry with text will transfer to data from other domains. And there's an odd misbehavior I've now seen play out twice -- back in 2017 with vision models (please don't shove a picture of a spectrogram into an object detector), and today. People are trying to coerce text models to do stuff with data series, or (again!) pictures of charts, rather than paying attention to timeseries foundation models, which can work directly on the data.[1]
Further, the tricks we're seeing with encoder / decoder pipelines should work for other domains, and we're not yet recognizing that as an industry. For example, Whisper or the emerging video models are getting there, but think about multi-spectral satellite data, or fraud detection (a type of graph problem).
There's lots of value to unlock from coding models. They're just text models. So what if you were to shove an abstract syntax tree in as the data representation, or the intermediate code from LLVM or a JVM or whatever runtime and interact with that?
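As a toy illustration (just CPython's ast module, nothing model-specific), here's one way source could be turned into a structured stream instead of raw text:

```python
import ast

source = """
def add(a, b):
    return a + b
"""

tree = ast.parse(source)

# One crude "token stream" over the AST: node types in traversal order.
node_tokens = [type(node).__name__ for node in ast.walk(tree)]
print(node_tokens)
# e.g. ['Module', 'FunctionDef', 'arguments', 'Return', 'arg', 'arg', 'BinOp', ...]

# Or keep the structure and serialize the whole tree instead of raw text.
print(ast.dump(tree, indent=2))
```

Whether node-type tokens, a serialized tree, or LLVM IR turns out to be the better representation is exactly the kind of experiment worth running.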
> It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; It's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something.
> They don't care if the tokens happen to represent little text chunks. It could just as well be little image patches, audio chunks, action choices, molecules, or whatever. If you can reduce your problem to that of modeling token streams (for any arbitrary vocabulary of some set of discrete tokens), you can "throw an LLM at it".
But I need enormous amounts of training data and an enormous amount of compute to train new models, right? So it's kind of useless advice for most people, who can't just parse GitHub repositories and train their own model on AST tokens. They have to use existing open-source models or APIs, and those happen to use text.
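For what it's worth, the reduction itself - turning a problem into a stream of discrete tokens - is the cheap part; the training is what's expensive. A crude sketch of what that reduction can look like for a numeric series (uniform binning; real time-series tokenizers are fancier):

```python
import numpy as np

def series_to_tokens(values, vocab_size=256):
    """Quantize a real-valued series into a small discrete vocabulary,
    so an autoregressive token model could be trained on it."""
    edges = np.linspace(values.min(), values.max(), vocab_size - 1)
    return np.digitize(values, edges)     # integer token ids < vocab_size

signal = np.sin(np.linspace(0, 20, 1000)) + 0.05 * np.random.randn(1000)
print(series_to_tokens(signal)[:12])      # just integers - the "text" a token model sees
```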
The environmental arguments are hilarious to me as a diehard crypto guy. The ultimate answer to “waste of electricity” arguments is that energy is a free market and people pay the price if it’s useful to them. As long as the activity isn’t illegal - whether that’s training LLMs or mining bitcoins - it doesn’t matter. I pay for the electricity I use.
One argument against that line of thinking is that energy production has negative externalities. If you use a lot of electricity, its price goes up, which incentivizes more electricity production, which generates more negative externalities. It will also raise the costs for other consumers of electricity.
Now that alone is not yet an argument against cryptocurrencies, and one person's frivolous squandering of resources is another person's essential service. But you can't simply point to the free market to absolve yourself of any responsibility for your consumption.
Unintentionally, the energy demands of cryptocurrencies, and data centers in general, have finally motivated utilities (and their regulators) to start building out the massive new grid capacity needed for our glorious renewable energy future.
Acknowledging that facilitating scams (e.g. pig butchering) is cryptocurrency's primary (sole?) use case, I'm willing to look the other way if we end up with the grid we need to address the climate crisis.
To pretend romance / affinity scams and crime were created by crypto is absurd. It’s fair to argue crypto made crime more efficient, but it also made the responsible parties quicker to patch holes.
The primary use case of crypto is to protect wealth from a greedy, corrupt, money-printing state. Everything else is a sideshow
I greatly despise video games. Why is that not a waste of energy? If you are entertained by something, even if it serves no human purpose other than entertainment, is that not a valid use of electricity?
I've told this story before, but when we were doing some clean-sheet work a while ago I decided to use the C4 model and drew out the obligatory "Context" diagram with "user", "phone", "laptop", "app" sort of stuff.
I found them silly and (honestly) I still find that if I see one "in the wild" with no further elaboration, I become suspicious.
However two hours later, because of that silly context diagram, I realized that we had both an online and a semi-disconnected mobile app that could be offline for hours, and that certain things -had- to use a queue and expect an arbitrary amount of time for a task to run, and it completely changed how we thought about the core of how we implemented something pretty important.
Most of what I do these days is silly drawings in excalidraw. As a result I seem to understand more of our systems than anyone else. I'll even export the SVGs and commit them to our repos
But if you want to talk about REAL complex systems talk to a microprocessor logic owner or architect trying to shoot a bug.
A while ago we found a bug that could crash a system (fixed in a new RIT of the chip) if we did X then Y in state … we didn’t know.
Listening to the various leads for the sub-units on a phone call trying to reason about what was happening I found myself visualizing this increasingly complicated steam powered machine, with parts sprawling, tiny gears whirring, and bits zipping about whenever X happened.
Eh. A better analogy: the output would decide that there needs to be conduit between floors for chilled water, hot water, and sewage, dutifully make several 4” pipes, and then from floor to floor forget which is which.
"No, 'c_water' means 'clean_water', it has nothing to do with the temperature, so that's why you got burnt; also 'gray water' has nothing to do with a positional encoding scheme, and 'garbage collection' is just a service that goes around and picks up your discarded post-it notes - you didn't take that rotting fruit out of the bowl, so how could we be expected to know you were done with it?"
I have a theory: Back in 1996 Bugzilla worked very well. It had been designed, and honed, by a bunch of senior developers who also wrote the bug management system. So lots of dog food eaten. IIRC it was written in Perl.
Then, I believe, someone decided to make a "Bugzilla in Java", because they didn't like Perl (reasonable).
But whoever that was didn't have the deep knowledge of how the thing was supposed to be used. Lacking that insight, they created a "Swiss Army Chainsaw", simultaneously implementing everything and nothing.
Next, some MBAs got hold of the thing, and made everything 10X worse.
Meanwhile, Bugzilla is still the same and still the best software project management tool, if you know how it's intended to be used.
> We originally used Bugzilla for bug tracking and the developers in the office started calling it by the Japanese name for Godzilla, Gojira (the original black-and-white Japanese Godzilla films are also office favourites). As we developed our own bug tracker, and then it became an issue tracker, the name stuck, but the Go got dropped - hence JIRA.
TL;DR it's so completely customizable that it's more like a DIY project management toolkit. Pivotal and Linear have/had a more opinionated approach: "here's how you manage projects. Good luck and have fun!" Jira almost seems to push otherwise rational people to build the most baroque processes imaginable.
I love a good PM. Trust me, you don't want to be responsible for all the reporting and status updates and all that they have to deal with daily.
It's just that I've never worked with someone I considered a good PM who loved Jira. The great ones wouldn't care if we did all the planning on papyrus because they were more concerned with getting things done than documenting them in excruciating detail.
There are quite a few memorable words you can spell using 32 or 64 bits—like BA5EBA11. This is the story of me -not- choosing one of those.
These bit-pattern words are handy because they’re easy to recognize, especially in a random memory dump.
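To show why those constants jump out, here's a quick made-up example of one sitting in an otherwise zeroed hex dump:

```python
import struct

MARKER = 0xBA5EBA11                     # hex digits that happen to spell a word

buf = bytes(64) + struct.pack(">I", MARKER) + bytes(60)
for off in range(0, len(buf), 16):      # print a classic 16-bytes-per-row dump
    print(f"{off:04x}  " + buf[off:off + 16].hex(" "))
# The "ba 5e ba 11" row stands out immediately against the zeros.
```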
On my first “real” assignment, I was writing real-time embedded C code for a 16-bit processor that communicated with a host microprocessor on a server. We needed to run periodic assurance tests across a bus to ensure reliable communication with the host since we weren't constantly using the bus.*
We were given an unused register address on the host processor and told to write whatever we wanted to it. The idea was to periodically write a value, read it back, and if we encountered any write errors, incorrect reads, or failures, we’d declare a comm error and degrade the system in a controlled manner.
Instead of using zeros or something like 0xDEADBEEF, I decided to write 0x4D494B45 - "MIKE" in ASCII. It was unique, unlikely to be tampered with, it worked, and no one argued with me. The code shipped, the product shipped, and all was well. We even detected legitimate hardware errors, which I thought was pretty cool.
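In Python terms (the original was embedded C, and write_reg / read_reg here are hypothetical stand-ins for the real register access), the check amounted to something like:

```python
PATTERN = int.from_bytes(b"MIKE", "big")     # == 0x4D494B45

def assurance_check(write_reg, read_reg):
    """Write the pattern to the spare register, read it back, compare."""
    write_reg(PATTERN)
    if read_reg() != PATTERN:
        return False     # declare a comm error and degrade in a controlled manner
    return True
```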
Fast forward two generations of systems, and long after I’d moved on from that team, the code had been ported around but that assurance test remained unchanged. Everything was fine until they brought up a new generation of systems, flipped on the firmware for that device, and 10 seconds later, my assurance test clobbered an important register. The entire system promptly checkstopped and crashed. It took the team days to figure out what was wrong, and I had to explain myself when they found "MIKE" staring back at them from the memory dump.
That was a fun project. ;-)
* Note: It would've been bad if our device went out to lunch because we were responsible for energy management of the server. If the power budget was exceeded and we couldn't downclock and downvolt the processor, something might have crashed or been damaged.