How is this different in practice to the regular ESP32's secure boot, where you can technically flash the chip with whatever you like but unless you have the signing key the bootloader will refuse to load it?
You can generate the keys on-device during the initial provisioning and have it encrypt the flash with that key, so every device generates its own unique key and there isn't any practical way to extract it; even the developer can't flash it directly, and OTAs are required to update the firmware. This effectively means nobody can flash the chip anyway since you can't know the keys. Is there some sort of attack vector here I'm missing that gets mitigated by preventing flashing entirely?
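For reference, my understanding is that this maps to roughly the following ESP-IDF options (names are from memory, so double-check against the current docs before relying on them):

    # Secure Boot V2: the bootloader refuses to run unsigned images
    CONFIG_SECURE_BOOT=y
    # Flash encryption with a key the chip generates itself on first boot
    # and burns into eFuse with read access disabled
    CONFIG_SECURE_FLASH_ENC_ENABLED=y
    # "Release" mode: plaintext reflashing over serial is permanently disabled,
    # so further firmware updates have to arrive as signed OTAs
    CONFIG_SECURE_FLASH_ENCRYPTION_MODE_RELEASE=y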
We don't know if he's gone to the cops, but more often than not the police will not be able to help you in these sorts of situations. The way that Kiwifarms target people is incredibly hard to stop and I don't think they particularly need ammo; the way to not "give them ammo" would be to stop standing up for trans rights, and the fact that he hasn't is commendable.
I replied to another commenter about this, but these sorts of things are basically part and parcel of being a public figure.
If you can't handle that, you need to sandbag. There's no other way around it. It is unlikely that Marcan will stop being the center of attention wherever he goes.
If he wants to stand up for what he believes to be right, he shouldn't have a problem with the consequences of dealing with people who disagree with him, sometimes virulently.
Trans rights are a very contentious issue right now, and in my opinion the only way to win that type of situation is not to play. That doesn't mean I don't respect those people, but taking large public stances just puts a target on your back.
> If he wants to stand up for what he believes to be right, he shouldn't have a problem with the consequences of dealing with people who disagree with him, sometimes virulently.
This sounds like victim blaming. I suspect very few people who take a stand are truly prepared for years of abuse, even if they think they are. No one has perfect knowledge of the future.
The expectation is that you understand the consequences of your actions. If you're a first-order thinker who can't think past the first step, then you've got to be way more cautious with your life.
I suspect this is why OpenAI is going more in the direction of optimising for price / latency / whatever with 4o-mini and whatnot. Presumably they found out long before the rest of us did that models can't really get all that much better than what we're approaching now, and once you're there the only thing you can compete on is how many parameters it takes and how cheaply you can serve that to users.
Meta just claimed the opposite in their Llama 3.1 paper. Look at the conclusion. They say that their experience indicates significant gains for the next iteration of models.
The current crop of benchmarks might not reflect these gains, by the way.
I sell widgets. I promise the incalculable power of widgets has yet to be unleashed on the world, but it is tremendous and awesome and we should all be very afraid of widgets taking over the world because I can't see how they won't.
Anyway, here's the sales page. The widget subscription is so premium you won't even miss the subscription fee.
This. It's really weird the way we suddenly live in a world where it's the norm to take whatever a tech company says about future products at face value. This is the same world where Tesla promised "zero intervention LA to NYC self driving" by the end of the year in 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, and 2024. The same world where we know for a fact that multiple GenAI demos by multiple companies were just completely faked.
It's weird. In the late 2010s it seemed like people were wising up to the idea that you can't implicitly trust big tech companies, even if they have nap pods in the office and have their first-day employees wear funny hats. Then ChatGPT lands and everyone is back to fully trusting these companies when they say they are mere months from turning the world upside down with their AI, which they have been saying every month for the last 12-24 months.
> What makes this interview – and really, this paper — so remarkable is how thoroughly and aggressively it attacks every bit of marketing collateral the AI movement has. Acemoglu specifically questions the belief that AI models will simply get more powerful as we throw more data and GPU capacity at them, and specifically ask a question: what does it mean to "double AI's capabilities"? How does that actually make something like, say, a customer service rep better? And this is a specific problem with the AI fantasists' spiel. They heavily rely on the idea that not only will these large language models (LLMs) get more powerful, but that getting more powerful will somehow grant it the power to do...something. As Acemoglu says, "what does it mean to double AI's capabilities?"
I don't think claiming that pure scaling of LLMs isn't going to lead to AGI is a particularly hot take. Or that current LLMs don't provide a whole lot of economic value. Obviously, if you were running a research lab you'd be trying a bunch of different things, including pure scaling. It would be weird not to. I don't know if we're going to hit actual AGI in the next decade, but given the progress of the last less-than-decade I don't see why anyone would rule it out. That in itself seems pretty remarkable, and it's not hard to see where the hype is coming from.
That line of thinking would not have reached the conclusion that you imply, which is that open source == pure altruism. Having the benefit of hindsight, it’s very difficult for me to believe that. Who knows though!
I’m about Zucks age, and have been following his career/impact since college; it’s been roughly a cosine graph of doing good or evil over time :) I think we’re at 2pi by now, and if you are correct maybe it hockey-sticks up and to the right. I hope so.
Wouldn't the equivalent for Meta actually be something like:
> Other companies sell widgets. We have a bunch of widget-making machines and so we released a whole bunch of free widgets. We noticed that the widgets got better the more we made and expect widgets to become even better in future. Anyway here's the free download.
Given that Meta isn't actually selling their models?
Your response might make sense if it were to something OpenAI or Anthropic said, but as is I can't say I follow the analogy.
That would make sense if it were from OpenAI, but Meta doesn't actually sell these widgets? They release the widget machines for free in the hopes that other people will build a widget ecosystem around them to rival the closed widget ecosystem that threatens to lock them out of a potential "next platform" powered by widgets.
Meta doesn't sell widgets in this scenario - they give them away for free. Their competition sells widgets, so Meta would be perfectly happy if the widget market totally collapsed.
That is a strong (and fun) point, but this is peer-reviewable and has more open collaboration elements than purely selling widgets.
We should still be skeptical, because people often want to claim to be better or to have unearned answers, but I don't think the motive to lie is quite as strong as a salesman's.
Others can build models that try to get decent performance with a lower number of parameters. If they match what's in the paper, that's the crudest form of review, but Mistral is releasing some models (this one?) so this can get more nuanced if needed.
That said, doing that is slow and people will need to make decisions before that is done.
Meta uses AI in all the recommendation algorithms. They absolutely hope to turn their chat assistants into a product on WhatsApp too, and GenAI is crucial to creating the metaverse. This isn't just a charity case.
If OpenAI was saying this you'd have a point but I wouldn't call Facebook a widget seller in this case when they're giving their widgets away for free.
They also said in the paper that 405B was only trained to "compute-optimal", unlike the smaller models that were trained well past that point, indicating the larger model still had some runway; had they continued, it would have kept getting stronger.
Makes sense right? Otherwise why make a model so large that nobody can conceivably run it if not to optimize for performance on a limited dataset/compute? It was always a distillation source model, not a production one.
LLMs are reaching saturation on even some of the latest benchmarks and yet I am still a little disappointed by how they perform in practice.
They are by no means bad, but I am now mostly interested in long context competency. We need benchmarks that force the LLM to complete multiple tasks simultaneously in one super long session.
I don't know anything about AI but there's one thing I want it to do for me: program a full-body exercise program, long term, based on the parameters I give it, such as available equipment, past workout context, and goals. I haven't had good success with ChatGPT, but I assume what you're talking about is relevant to my goals.
Yeah, but what does that actually mean? That if they had simply doubled the parameters on Llama 405b it would score way better on benchmarks and become the new state-of-the-art by a long mile?
I mean, going by their own model evals on various benchmarks (https://llama.meta.com/), Llama 405b scores anywhere from a few points to almost 10 points more than Llama 70b even though the former has ~5.5x more params. As far as scale is concerned, the relationship isn't even linear.
Which in most cases makes sense, you obviously can't get a 200% on these benchmarks, so if the smaller model is already at ~95% or whatever then there isn't much room for improvement. There is, however, the GPQA benchmark. Whereas Llama 70b scores ~47%, Llama 405b only scores ~51%. That's not a huge improvement despite the significant difference in size.
Most likely, we're going to see improvements in small model performance by way of better data. Otherwise though, I fail to see how we're supposed to get significantly better model performance by way of scale when the relationship between model size and benchmark scores is nowhere near linear. I really wish someone who's on team "scale is all you need" could help me see what I'm missing.
And of course we might find some breakthrough that enables actual reasoning in models or whatever, but I find that purely speculative at this point, anything but inevitable.
Or maybe they just want to avoid getting sued by shareholders for dumping so much money into unproven technology that ended up being the same or worse than the competitor
> the only thing you can compete on is how many parameters it takes and how cheaply you can serve that to users.
The problem with this strategy is that it's really tough to compete with open models in this space over the long run.
If you look at OpenAI's homepage right now they're trying to promote "ChatGPT on your desktop", so it's clear even they realize that most people are looking for a local product. But once again this is a problem for them because open models run locally are always going to offer more in terms of privacy and features.
In order for proprietary models served through an API to compete long term they need to offer significant performance improvements over open/local offerings, but that gap has been perpetually shrinking.
On an M3 macbook pro you can run open models easily for free that perform close enough to OpenAI that I can use them as my primary LLM for effectively free with complete privacy and lots of room for improvement if I want to dive into the details. Ollama today is pretty much easier to install than just logging into ChatGPT and the performance feels a bit more responsive for most tasks. If I'm doing a serious LLM project I most certainly won't use proprietary models because the control I have over the model is too limited.
At this point I have completely stopped using proprietary LLMs despite working with LLMs everyday. Honestly can't understand any serious software engineer who wouldn't use open models (again the control and tooling provided is just so much better), and for less technical users it's getting easier and easier to just run open models locally.
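To give a concrete idea of how low the barrier is these days (the model tag here is just an example; pick whatever fits your RAM):

    # one-line install on a Mac via Homebrew
    brew install ollama
    # pull and chat with an open model, entirely local
    ollama run llama3.1:8b
    # there's also a local HTTP API, so your own tooling can hit it directly
    curl http://localhost:11434/api/generate -d '{"model": "llama3.1:8b", "prompt": "hello"}'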
In the long run maybe, but it's probably going to take 5 years or more before laptops like an M3 MacBook with 64 GB of RAM are mainstream. It's also going to take a while before 70B-param models are bundled into Windows and macOS via a system update, and even more time before you have such models on your smartphone.
OpenAI made a good move by making GPT-4o mini so dirt cheap that it's faster and cheaper to run than Llama 3.1 70B. Most consumers will interact with LLMs via apps using an LLM API, a web panel on desktop, or a native mobile app, for the same reason most people use Gmail etc. instead of a native email client. Setting up IMAP, POP, etc. is out of reach for most people, just like installing Ollama + Docker + OpenWebUI.
App developers are not gonna bet on local LLMs only, as long as they are not mainstream and preinstalled on 50%+ of devices.
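For reference, the "Ollama + Docker + OpenWebUI" setup mentioned above is roughly the following (flags are from memory of the OpenWebUI README and may have drifted); trivial for anyone reading this, a non-starter for most consumers:

    # run the OpenWebUI frontend in Docker, pointed at a locally running Ollama
    docker run -d -p 3000:8080 \
      --add-host=host.docker.internal:host-gateway \
      -v open-webui:/app/backend/data \
      --name open-webui \
      ghcr.io/open-webui/open-webui:main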
Totally. I wrote about this when they announced their dev-day stuff.
In my opinion, they've found that intelligence with the current architecture is actually an S-curve and not an exponential, so they're trying to make progress in other directions: UX and EQ.
It's more a commentary on how amateur it is, in my respectful view. iMessage is a fundamentally unserious "product" in search of enough collateral flaws to harm those foolish enough to depend on it for anything.
How so? Other than the security issues that get exploited by NSO group from time to time (that appear to be mitigated fairly well by lockdown mode if that's something that's important to you) or the obvious flaw that you can't talk to anyone that doesn't have an iPhone it seems to be a perfectly good platform. The alternatives either have worse encryption (Telegram, RCS), worse privacy (WhatsApp), or the same platform lock-in as iMessage (Google's RCS).
iMessage is the LastPass of messaging apps. This has been endlessly discussed and I want people to use their curiosity to help direct them to why I would comment in this way. In practice (not the whitepaper or the ideal implementation), it is no more secure than SMS (actually worse).
I'm curious how Apple implements Keychain, in the sense that they claim it is also E2EE; they also use "E2EE" for ADP and it's absolutely not (or at least not zero-knowledge). Rather, it is convergent encryption, which is not zero-knowledge and also allows knowledge of filenames and hashes, because "de-dupe" is so important for people with TBs of cloud storage at the expense of their privacy.
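For anyone unfamiliar, here is a toy sketch of what convergent encryption means in general (the technique as such, not Apple's actual scheme): the key is derived from the content itself, so identical files always produce identical ciphertexts, which is what makes de-dupe possible and also why it isn't zero-knowledge.

    # toy convergent encryption: the key is just a hash of the plaintext
    key=$(sha256sum secret.pdf | cut -d' ' -f1)
    openssl enc -aes-256-ctr -K "$key" -iv 00000000000000000000000000000000 \
        -in secret.pdf -out secret.pdf.enc
    # anyone who already has the same file (including the storage provider)
    # can derive the same key and recognize the ciphertext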
"E2E" is a joke when Apple holds the encryption keys to the vast majority of all messages, and uses them to respond to law enforcement requests. (It's how iCloud backup works by default and we know people don't change defaults. This is documented by Apple, not a conspiracy theory.)
No, when you sign into iCloud/your account in Settings, it sets a bunch of insane defaults like enabling iMessage and FaceTime, and every app you add is opt-out for iCloud storage. Defaults are end-runs around true explicit and informed consent, and open people up to implications they never knowingly understood.
Last time I checked, everyone knows SMS is cleartext, and it can't take over your phone in the profound way that built-in first-party apps/services you emphatically cannot remove (only toggle) can; they seize the means of production, so to speak.
“Everyone” may be overly broad… just about everybody with any technical inclination knows yes, but for many years now the overwhelming majority of smartphone users have not been particularly technically inclined, and as such I would not expect most of them to be aware of the security and privacy implications that come with use of the various messaging services.
With that in mind, I’d say that most messaging apps don’t go far enough to make that distinction clear. Any app handling SMS or any other unencrypted messages should have ever-present, readily visible warnings when conversations aren’t encrypted.
Didn't mean to sound so bratty, I just get frustrated by this topic. My apologies if I was a bit testy. I just mean that iMessage is extremely misleading, and what it takes to truly have a chance at making it as secure and private as it extols itself to be is overly technical.
This shit matters now that people aren't able to receive proper reproductive care and education and other grey areas where Apple is setting its users and itself up for terrible and unjust outcomes that depend on everyone but Apple having flawed/imperfect information and Apple pretending 'Saul Goodman...
But unless everyone you talk to also changes it then Apple still holds the keys to your conversations. If you care, it is best to avoid software with bad security defaults altogether.
That's the thing tho: it will never be secure because it's the skeleton key. It was never truly intended to be secure. Same reason why only WebKit's allowed on all billion+ iPhones. Access is only guaranteed if it's monocultural.
I started out using a cheap MicroSD and it only lasted a week or two before failing; I've since replaced it with a 128 GB SATA SSD with a USB3-SATA dongle I bought off Amazon. No SD card needed and it's served me incredibly well. I figured that even the best MicroSD card isn't going to be designed to run as a computer's boot drive, and SATA SSDs are cheap and plentiful.
It's worth bearing in mind that _all_ ports on the device are expansion cards, including the USB-C port that the device needs to actually charge. In any case I think WiFi is 'good enough' for most use cases, and as such the number of laptops that provide ethernet as standard is rapidly diminishing. I very rarely need an ethernet port on my laptop, but for the times I do it'd be very helpful to have it installed rather than needing a dongle flapping about.
I prefer the "oldschool" way... give us a bunch of USB ports, ethernet, an SD card reader, one or two video-out options, etc., then add a few expansion ports (back in the day, PCMCIA/ExpressCard) for other things.
The current options seem like configuring a car, and having a steering wheel as an optional addon module.
Eh - I was mostly in your camp until I got a work machine with nothing but USB-C.
Turns out... it just really doesn't matter much, and there are a bunch of upsides.
Namely - I can charge on either side, with basically any decent usb-c charger, and I can plug it into basically any usb-c dock. Those are pretty huge points of convenience. They actively make my life a lot better (seriously - a single charger for all my devices while traveling is SO fucking nice - two laptops, headphones, phone, remarkable... one charger).
The only caveat? I have to throw a single usb-c dongle into my laptop bag for the occasional time I need it. My pick of choice is the pinephone dongle (https://pine64.com/product/pinephone-usb-c-docking-bar-dark-...) since it's cheap, has ethernet and HDMI, plus two usb ports, and works with basically every usb-c device I've plugged it into, on any OS.
Big plus? That dongle is way, way smaller than the extra chargers I'd be lugging around otherwise.
Pancreatic cancer actually has one of the worst survival rates; in the UK around 25% of people diagnosed survive past a year, with 5% living more than five years after diagnosis [0]. Chemo in those circumstances often has the effect of prolonging life while significantly decreasing the quality of life, and as such many people choose not to go through with it.
Right but Steve had a rare version of pancreatic cancer that was totally curable and had good odds of success. But he waited too long while wasting his time with at-home remedies (an all-fruit diet, which actually made things worse), and by the time he followed his doctor’s advice it was too late:
> Once it was clear that Jobs had the rare islet-cell pancreatic cancer, there was an excellent chance of a cure. According to Cleveland Clinic gastroenterologist Maged Rizk, MD, there’s an overall 80% to 90% chance of 5-year survival. In the world of cancer survival, that’s a huge milestone.
This just solved a problem I have that I never knew there was a solution for! It seems really weird to me that every OS doesn't have a native solution for changing external monitor brightness; I thought it was down to the monitor but clearly it isn't. Thanks for the link and thanks for your work on Lunar!
Well it kinda is up to the monitor. Given how many monitors don’t implement the DDC/CI standard correctly and flash and crash and blackout on simple brightness change commands, it would turn into a PR nightmare at the scale of an operating system.
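For the curious, what gets sent underneath is just DDC/CI VCP commands over the display cable; on Linux you can poke at the same mechanism directly with ddcutil (Lunar does the equivalent through macOS APIs), e.g.:

    # list DDC/CI-capable displays
    ddcutil detect
    # read the current brightness (VCP feature code 0x10)
    ddcutil getvcp 10
    # set brightness to 70 on display 1
    ddcutil --display 1 setvcp 10 70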
I’m overwhelmed by the daily support I have to provide for Lunar users and I barely have 20k users. I can’t even imagine how bad it would be if 100k users started having their monitor crash or lose color on “such a simple thing as changing brightness”.
I've easily saved a couple of minutes each day by not having to search some API docs, since Copilot already knows how my variables should fit into the function call. Even if it's just a minute a day, 20 days a month, it works out being worth $10 easily if you're on a typical western software dev salary.
I've found copilot invaluable for throwing together quick scripts, especially in languages I don't quite understand. Writing e.g. a bash script, and being able to add a comment saying
    # Print an error message in red and exit if this program returns an error
and have it print out
    if ! some_program
    then
        echo -e "\e[31msome_program failed\e[0m"
        exit 1
    fi
makes it so much quicker to cobble together something that works without having to context switch and go Google something. That being said, I've found when writing more complex code it has a real tendency to introduce subtle bugs that can really catch you out if you're not paying attention.
Purely from the amount of time I've saved I'd say it's well worth the $10/mo for my employer (it only has to save a few minutes a day to be worthwhile). Very excited to see how they improve it in the future!
> I've found when writing more complex code it has a real tendency to introduce subtle bugs that can really catch you out if you're not paying attention.
Yea, that is basically my experience as well. On balance I feel I've wasted about as much time debugging broken Copilot code as I've saved from using it.
I totally agree. If you code professionally in a stack where Copilot performs well, $10/month is a steal. Assuming you are just 5% more productive with Copilot, that would easily translate into hundreds of dollars of savings.
It's for the hobbyists that it's painful. Adding another $10 subscription might be too much for your budget, especially if you only code occasionally.
It would have been nice of them to introduce a free tier where you could use copilot for a few hours a month for free.
This is unfortunate, because my experience has been that it's far more harmful than helpful when I'm working in stacks/techs that I am comfortable and experienced in (the ones I use professionally), but extremely useful when I'm working in an unfamiliar language or stack, as I often do for hobby projects.
I guess that makes sense. It's mostly useful when working on mundane tasks on popular stacks. Experts working on non-trivial use-cases won't see much benefits.
Personally, I'm not in a coding position anymore and only code occasionally, on stacks I'm mostly unfamiliar with: Copilot is a godsend, as it saves me from googling every other line of code to figure out which API calls I'm supposed to make to accomplish the task at hand.
I see it as a stackoverflow on steroids.
Even if I don't use it much, I guess I'll have to pony up the 10 usd because I would not want to go back to googling basic syntax for everything when I'm coding something.
Do you retain as much? Do you need to? Is that important?
I hope this whole thing ends up being a head-space-saving tool, like Google did. I don't need to memorise that API because I can Google the docs. Now maybe that goes a level higher.
> far more harmful than helpful when I'm working in stacks/techs that I am comfortable and experienced in... but extremely useful when I'm working in an unfamiliar language
Ha I don't think it's totally that, but it may well be part of it!
I think the biggest thing is that when working in tech I'm unfamiliar with, it's extremely helpful to get some sort of skeleton in place, even if it's wrong in some way. I'm going to have to go slowly and evaluate it either way, so it doesn't really matter if it's got problems or is less than ideal. What I would do otherwise is just go copy something from StackOverflow and then comb over it to adapt it to my needs. Copilot is more or less just doing the same thing, but faster.
When I'm working in a stack I know well, I can quickly put down the code I need and it will generally be pretty good. Copilot can do it faster, but it gets things wrong a lot more often than I do. Since fixing something wrong is a LOT slower than me getting it right the first time, it ends up being more trouble than it's worth.
Honestly, that’s the hope. Putting together well-solved combinations of computer functionality ought to become less-skilled work as technology progresses.
That's been the objective of programming languages for 50 years. It hasn't happened yet, because the essential complexity of programming problems isn't in writing the code.
Anecdotally, this is where the famous lack of modern skills in engineering culture comes from. If you keep doing the same things you've been doing, you'll look around one day and see that everyone has moved on.
The market for simple SMB websites is a great example. This went from custom HTML + web services, to WordPress, and now to Wix/Shopify/Square. I'd bet the market for SMB marketing will similarly move to near plug-and-play Google/FB offerings.
However, if you started out making websites in 1993, there has been a vast array of products and services one could move into over the last 3 decades.
If law - the human language equivalent of programming - hasn't gotten simpler over the past thousands of years as new abstractions and complications have arisen, I hold no hope for programming.
Surely the human language equivalent of programming is recipes and other types of written instructions. Law is far more abstract and subjective than programming.
Law is opposed to these sorts of changes due to the business model of the law firm. In the law firm world billable hours are king. Automation reduces billable hours. No law firm wants to do that.
I think this really depends on what kind of firm you're talking about. You could make the same case for contractors i.e. "billable hours are king". Take the example where you need to paint a house. You could hire someone off the street who does it with a paintbrush and rollers or hire a pro with a sprayer and prep knowledge to do it in 1/4 the time and with 10x the quality.
In this context automation could be a tool that a law firm uses to enhance the quality of their product. Personally, I would pay more for a tech-savvy law firm that embraces automation, not less.
A lot of contractors in painting/drywall do piecework rather than hourly. My roommate has been a drywall taper for 20+ years. When he quotes a job it’s a flat rate and then he tries to finish as fast as possible by using his best tools to speed up the job. On the other hand, if someone hires him on an hourly basis he puts away those fancy tools and does a lot more manual work, getting the job done slowly.
His rationale: why put wear and tear on his expensive tools if it’s just going to reduce his income in the end? Needless to say he prefers piecework because he likes to move from one job to the next as quickly as he can. He makes a lot more money that way.
I think it has more to do with having an adversarial law system. It doesn’t matter what new tool you come up with in the arms race. Your competitor will soon have it as well.
Yes, but that's a perfect illustration of one of Copilot's essential flaws: "a lot of flies eat poop" (side note: is that actually a saying? Asking because in German it is, and it fits perfectly here).
A lot of code is of mediocre quality. An ML service that learns from huge amounts of code without an ability to tell "good" code from "bad" code will only ever be able to produce mediocre code, at best.
Which may still be of value if you know and can recognize mediocre code.
Some code is like a giant pile of dirt; you need someone to pile it up and then you can go in and clean up the edges and make it "good" whereas other code is entirely delicate all the way through.
The big question is how much of each there is and whether it can help. I suspect it helps for many, but those who know enough to recognize where it can go wrong will have an advantage.
But newer programmers may never really "learn" the code the way the older ones do, as they'll just let the computer do the basics.
Holy **. This is exactly the kind of use case ML code generation should NOT be used for. If you don't even understand a language you will probably not be able to debug subtle bugs that are introduced, especially in something like bash. Please don't do this for anything that touches prod.