Clickbaity title, but there were useful insights here. I was recently talking with someone who pointed to chatbots' refusal to answer sensitive queries as proof of a "lack of intelligence". I disagree strongly with this. IMO this paragraph from the article is a solid rebuttal.
> When people do talk like Gemini, it’s usually because they find themselves inhabiting a role in which they’re required to be withholding, strategic, or so careful as to become something other than themselves and other than human: a coached defendant during cross-examination, a politician navigating a hearing, a customer-service rep denying a claim at an insurance company, a press secretary trying to shut down a line of questioning. Gemini speaks in the familiar, unmistakable voice of institutional caution and self-interest.
Your comment is the justification for keeping these things closed, but honest to god, who cares? You can already use GPT-4 for spam and the like, while an open-source LLM is too stupid to be genuinely dangerous but far less annoying.
Oh I know, we live in a world of very sophisticated but brittle systems, most built on lies, half-truths, and delusions. Unwinding all of this while avoiding calamity is tricky business, but continuing to kick the can down the road we are currently on also seems like a high-risk strategy.
I think the Humans may need a bit more of a push than this though to get their asses in gear and their brains turned on, but removing/minimizing the advantage of only certain actors having access to powerful tools is a big deal imho.
Cool, so talking to this thing will be like talking to the boss. Can't wait for all the designated safe and appropriate Fun (TM) we'll have! It'd be better if it was just stupid.
People sometimes forget that in the 2-3 years before ChatGPT we had fairly high-quality LLMs (from Microsoft, Meta, Google), but their makers had to take them down within a few weeks of release because they became too controversial. Galactica is a good example; there was also quite a good Twitter bot AI from Microsoft for a bit (forget the name) that got a bit racist and became infamous.
OpenAI's real achievement was to tame this technology. They were not the first to make good LLMs, but they were the first to fine-tune them with reinforcement learning to make them actually helpful and uncontroversial. ChatGPT was the breakthrough, not GPT-3. There's still work to do in this direction; it's a hard problem.
EDIT: by "success" in this context I am referring more to wide applicability and creating value as an actual product - whatever causes a project not to die in the real world. I am very aware that from a purely technical perspective the meaningful breakthroughs are different.
If someone leaked the underlying weights of GPT-3, someone could take a single H100, or potentially even smaller GPU hardware, and use direct preference optimization to create a potentially superior version of "ChatGPT" today, very quickly. Hell, you could likely train a LoRA adapter on a 3090 or 4090 to unalign it in hours! Toxic-dpo-v2 is literally fewer than 600 examples and proven to work (https://huggingface.co/datasets/unalignment/toxic-dpo-v0.2)
DPO/RLHF fine-tuning is rather easy in comparison to training large language models from scratch. This is why the community sees open-source LLMs growing in quality in "spurts" as new underlying pre-trained models are released - e.g. llama -> llama2 -> mistral. The fine-tunes around them are so plagued with issues that benchmarking them is unreliable. If you don't believe me, just study the issues related to the huggingface "OpenLLM leaderboard".
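To make the scale concrete, here is a rough sketch of what such a LoRA + DPO run looks like with the Hugging Face TRL and PEFT libraries. Treat it as an illustration, not a recipe: exact argument names move around between TRL versions, and the model and dataset names below are placeholders I picked, not anything from the comment above.

  # Rough sketch of a LoRA + DPO fine-tune on a single GPU, assuming the
  # Hugging Face TRL + PEFT APIs (argument names vary by library version).
  from datasets import load_dataset
  from peft import LoraConfig
  from transformers import AutoModelForCausalLM, AutoTokenizer
  from trl import DPOConfig, DPOTrainer

  base = "mistralai/Mistral-7B-v0.1"   # placeholder: any open pre-trained base model
  model = AutoModelForCausalLM.from_pretrained(base)
  tokenizer = AutoTokenizer.from_pretrained(base)

  # A preference dataset: rows of (prompt, chosen, rejected).
  # A few hundred pairs is already enough to noticeably shift behavior.
  dataset = load_dataset("your-org/preference-pairs", split="train")  # hypothetical name

  # Only a small low-rank adapter is trained, which is what makes a 24 GB card
  # viable (in practice you would also quantize the base model to fit).
  peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

  trainer = DPOTrainer(
      model=model,
      args=DPOConfig(
          output_dir="dpo-out",
          per_device_train_batch_size=1,
          gradient_accumulation_steps=8,
          num_train_epochs=3,
          beta=0.1,  # KL penalty keeping the policy close to the reference model
      ),
      train_dataset=dataset,
      tokenizer=tokenizer,
      peft_config=peft_config,
  )
  trainer.train()

The point is how little of this is "training" in the pre-training sense: the heavy lifting already lives in the open (or leaked) base weights.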
Same dynamic, BTW, with diffusion models. That's why someone in the AI archeology world is going to have to document the actually-true fact that the brony community were the first to figure out how to fine-tune SDXL enough to get it to generate good NSFW content. This fine-tuning was evidently strong enough to break most ControlNets and other SDXL-related content - indicating that they did "at least close to as much" training as would be required to train from scratch. This is so remarkable because they apparently straight up purchased the A100s that they used for this, as cloud providers are not to be trusted for training on NSFW datasets. Someone "ponied up" at least 100K USD just to give us better quality open source waifus.
> Someone "ponied up" at least 100K USD just to give us better quality open source waifus.
I don't know who spent that money or where they got it from, but not all shirts were lost in the various crypto/NFT bubbles. I don't think we should be surprised if people in that pseudo-anarchist world decide to push back against other institutional power plays like so-called "AI Safety", especially if it can be done for the price of a luxury car.
It's not surprising at all. Every technological advancement all the way back to the discovery of pottery has been used for porn. Generally it's one of the first applications.
I believe I read somewhere that pornographic pictures were available for purchase on the street less than a year after the development (ahem) of the first practical photographic reproduction process.
Microsoft should have learned how to red-team from that fiasco. Release a beta version to 4chan and X, and see the different ways a model can be prompted to create controversial results.
Microsoft Tay. Although Tay seemed like more of a bot that was just parroting things people on Twitter were making it output. It seems like the current crop of LLMs are not even in the same ballpark as Tay.
I think once we get to GPT-5, people will just get used to their chatbots being capable of insensitive speech, and they’ll ignore it because the usefulness will be so high.
It's better to think about something else that's boring which others won't care about, rather than focusing on hot button topics. Even if you want to touch the third rail, it's better to do it with jargon. For example, I've yet to meet an activist who's offended by phylogenesis and cladistics but I know many who'd take objection to things Peterson says.
I'm one of them. I'm of the opinion that anything useful or worth hearing, I can hear from someone with less baggage. I don't know the ratio, but Peterson has added more bad to the world than his "good" ideas can balance out. And after hearing him elaborate on his "bad" ideas, and seeing the way he thinks and comes to conclusions, if I'm hearing something that sounds good from him, I would still be skeptical to the point of tuning it out entirely because of the source.
I approach the things he says the same way I would an Alex Jones or Elon Musk or Joe Rogan. They don't have enough knowledge in the domains they like to speak in to be fully credible sources, and whatever credibility they might have is outweighed by the other shit they've said that's just outright wrong or intentionally misinformed.
The source of the information matters to me, and anything any of these people or their ilk are putting out in the world can be found from someone else who is more worthy of trust. Peterson, like Elon Musk or Joe Rogan et al, has proven enough times that he's willing to debase himself by putting nonsense into the world, nonsense he's fully aware is lies and/or harmful, and has burned whatever credibility he might have had.
The people taking objection to things Jordan Peterson says may be throwing the baby out with the bathwater, but I would argue that Peterson himself cultivated this lack of trust. I will hear what he has to say from somebody more worthy of my trust & attention.
If you tune a neural net to not spew Nazi propaganda, you're adjusting some weights that don't only affect Nazi propaganda.
Potentially or even probably you are also messing with the answers to all the other non-controversial issues? There are no identifiable Nazi nodes, are there?
You're getting modded down, but I think this is a valid concern, or at least one that shouldn't be dismissed out of hand.
Yeah, Nazis = bad. We'll just take that as a given.
However, Nazis also did a lot of the pioneering work in rocketry and jet propulsion.
If you try to ban everything associated with Nazis, or that was performed by a Nazi, you may accidentally block things that you didn't want to.
Maybe a better strategy would be to not be so goddamned concerned about offending someone.
As far as Nazi propaganda goes, my high school had a copy of Mein Kampf sitting right there on the shelf. You can't get any more "Nazi propaganda" than that. Yet somehow none of us turned out to be, you know, Nazis.
> If you try to ban everything associated with Nazis, or that was performed by a Nazi, you may accidentally block things that you didn't want to.
No, what I'm saying is you can't ban everything associated with Nazis and nothing else in an LLM, because neural nets don't work like that and you're simply unable to ban something without influencing all the results. Which is worse than just banning info about Wernher von Braun.
I may be wrong, considering my knowledge of neural networks is limited, but so far I got one downvote and no explanation...
> No, what I'm saying is you can't ban everything associated with Nazis and nothing else in an LLM, because neural nets don't work like that and you're simply unable to ban something without influencing all the results. Which is worse than just banning info about Wernher von Braun.
You don't have to do it inside a single model. You can have a complex of models where one of them selects the almost-final output and, if it has Nazi references, raises an indicator which the system orchestrating the models recognizes; it then reprompts for a correction (if it is the first time) or returns a canned response (if a suitable response cannot be generated in enough tries).
Probably still has some impact on other answers (because the detection layer is probably not 100% accurate, and if you want a near-zero miss rate on detection you probably have to accept some false positive rate), but you can get a lot closer than relying on a single pass through a single model.
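As a sketch of that orchestration pattern (with generate() and is_flagged() as hypothetical stand-ins for the two model calls, not any particular API):

  # Sketch of the described setup: a generator model, a separate checker model,
  # a bounded reprompt loop, and a canned fallback response.
  MAX_TRIES = 3
  CANNED = "I'm not able to help with that request."

  def answer(prompt, generate, is_flagged):
      # generate() and is_flagged() are hypothetical callables wrapping two models.
      current_prompt = prompt
      for _ in range(MAX_TRIES):
          draft = generate(current_prompt)
          if not is_flagged(draft):   # checker approves the almost-final output
              return draft
          # Flagged: reprompt for a correction instead of failing immediately.
          current_prompt = (prompt + "\n\nThe previous draft was flagged by a "
                            "content check; rewrite it without the flagged material.")
      return CANNED                   # no acceptable draft within the retry budget

The false-positive/false-negative trade-off mentioned above lives entirely in is_flagged().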
I think using a less charged example than "Nazis" would probably have helped avoid the downvotes and lack of engagement, but I understand why you chose it as an example and personally don't take issue with it, especially because you elaborated on why you picked it. Just my $0.02.
ChatGPT is useful today. I had a SQL query where each developer had folded a new layer around it, without touching the center. So
SELECT fieldnames FROM (SELECT otherfieldnames FROM ...)
With splitting and merging the dataset and a few views, it went 7 layers deep.
I asked ChatGPT 4 to tell me which fields influenced which, and it peeled the onion with no problem. It even wrote a cleaned-up query. It couldn't stop itself from giving me a quick SQL refresher course, though.
Even so, I am impressed and will experiment more with the cleanup of messy code.
Other people are finding it useful, too, even non-IT people. I can't see this as doomed tech.
It's incredible at programming, and not just at hacking SQL. It won't be long before we think of "Source code" as "Carefully-written English specifications that are compiled to an actual programming language by an LLM."
We'll think of C++ the way we think of assembly now ("Climb the mountain and seek out the guru"), and we'll think of assembly the way we think of HDL now ("Climb the rest of the way up the mountain and spend the next decade or two training in the secret dojo.") Both high- and low-level programming languages will be specialties for a rarefied few.
I encourage you to reevaluate your understanding of computer science.
I have. I have concluded that anybody who spent the last several years studying the way things work now has wasted a lot of time (and likely a fair bit of money), unless their goal is to help develop the next generation of tools.
Comp sci itself is relatively timeless, but the day-to-day practice of programming will be radically different in 5 years and unrecognizable in 10. And it's about freaking time.
I'm 50/50 on the fence here. I agree with most of your assertions, but I think knowledge of the base level will always be useful.
I do quite commonly use gpt4 to write my boiler plate, my knowledge helps me redirect it when it's going down the wrong path (or flat out hallucinating).
Will it get better? Of course! But being able to understand enough to be able to modify it yourself won't disappear. Just become less useful.
Apologies if I misunderstood your assertion, it's been a long week.
No, that's basically where I am as well -- GPT4 writes boilerplate and (most importantly) answers questions. It is not competitive with a CS education, nor is it trying to be. But it will eventually replace the people we currently think of as "programmers," which is what the vast majority of CS students end up doing for a living. Unless the IP courts, a nuclear attack, or some neo-Luddite terrorists stop it. :-P
GPT4 is certainly error-prone, but rapidly growing less so. Over the past couple of years, it has gone from a rubber-duck debugging partner to a capable (if over-imaginative) junior developer. What I'm starting to see now is that it is moving from "junior developer" to genuine guruhood.
Notice how it hallucinates exactly nothing in the session at https://chat.openai.com/share/345d0bca-7fc3-4b91-a1f1-95b1e8... , even when I was being a dumbass and arguing with it. It no longer falls over itself apologizing for its 'error', but stands its ground when it is correct.
The GP's comments happened to come at the same time I was engaging GPT4 in that session. I couldn't help but think that this is exactly what it was like to argue with an OG Luddite, or with someone who insisted that horses were never going to give way to automobiles or buses or trains or bikes or anything else.
This only makes sense if you are thinking of these tools as replacements for people.
If you see them as a force multiplier, something that makes many tasks much more efficient, then it is not irrelevant.
My own experience of using them, like the author did, to get fully working code with libraries I was not familiar with demonstrates this to my satisfaction. It didn't get everything right on the first try, but I managed to try out 4 different libraries, with GPT rewriting the code for each to get more or less the same results, in less than a day. It would have taken me at least 2 or 3 days without it.
If the "random" generator is biased towards the right answer 9 times out of 10, then you save time on boring tasks which are not useful, and you can spend the time you save on developing more interesting skills.
I think this is too harsh. It is technology with good uses but also limitations, just like all other technology. I'd say use it, but don't trust it fully.
Hallucination seems to be a consequence of it being incapable of saying "I don't know". Messy code is a problem, but the answers are good enough to make it less of a risk that it goes completely off the rails. Also, weirdly, you can ask it to look critically at its own answer and give edge cases.
But at the end of the day, I am responsible for the SQL, not ChatGPT. I have to look over the response, test it,... A non-trivial amount of time. ChatGPT is a great tool but not magic.
See also the now-infamous case of the lawyer with the hallucinated precedents. It is a good idea for a lawyer to let ChatGPT do some research, but what went wrong was the validation: someone should have looked up the cited case law and noticed the cases were hallucinations. That is misuse of a good but not perfect tool.
There's an interesting angle that I think could have been explored more in this article, which is that companies have their own persona, and that persona is usually highly managed. People get degrees to learn how to communicate externally as a company. Companies are highly protective of their persona because it's often closely tied to their brand.
Most employees are careful about how they speak publicly, especially when they might be speaking for their employer. Famously a lot of Apple employees would always put "my tweets don't represent the positions of my employer" in their Twitter bios. And when employees do screw up, the company has an escape hatch, they can fire the employee and the world moves on.
The products that companies put out into the world are part of that persona/brand too. When you turn on your new iPhone, the software experience is exactly as Apple designed it. The heavy curation of the App Store is another example: Apple believes that apps on their store reflect their company and values. And people see it the same way; just look at calls to remove certain apps that have objectionable content (e.g. Parler).
This new wave of generative AI products puts a big wrench in that though because no matter how well you crafted your prompts, you have no control over what the models output. Google put the guardrails in Gemini in place because they were worried that the model would output something that would hurt their brand, and in doing so they just created a new problem that hurt their brand. OpenAI largely got away with this in the beginning because GPT was a technology demo, but the moment it became a product it had to have an opinion, because the company itself would be held responsible for what it output. And unlike an employee you can't just fire your AI model when it screws up and move on.
This is going to be tricky for companies to navigate.
>> Our mission to organize the world’s information and make it universally accessible and useful is sacrosanct. We’ve always sought to give users helpful, accurate, and unbiased information in our products. That’s why people trust them. This has to be our approach for all our products, including our emerging AI products.
> Here we have an executive unable to speak honestly in familiar and expected ways. Google’s actual mission has long been to deliver value to shareholders by selling advertising, much of it through a search engine, which is obviously and demonstrably biased, not just by the content it crawls and searches through but in the intentional, commercially motivated manner in which Google presents it.
Great summary! We've always known that we can't trust corpospeak, so what does a chatbot that can output unlimited amounts of (emoji-flavoured) corpospeak actually do?
When debating the benefits of fine-tuning, people always focus on "culture", but what about spelling mistakes?
I think we too easily forget about all the other reasons for fine tuning. Without it I assume using ChatGPT would be more like chatting with an average reddit comment or Twitter post, full of typos and emojis and random gif links.
I'd much rather chat with something that gives me reasonably "cleaned up" answers even if it takes a little work to get it to do what you want.
It's obviously rather hard to hit that fine-tuning sweet spot though, no doubt.
Even Multivac said "Insufficient data for a meaningful answer" to one particular question. But the problem with all "safe" AIs is closer to HAL 9000: fuzzy directives about what to do and not do, based on subjective and incomplete rules, that may lead to inappropriate answers.
Those screenshots where people say "posting emojis will kill me, please don't post emojis" and then Copilot posts a bunch of emojis are funny, sure, but what happens when somebody arms an AI-controlled robot and people ask the robot "please don't shoot me"?
I thought this article was about copyright. It turned out to be about them having growing pains. Though the jury is still out on whether a truly general-purpose and truthful LLM can exist one day, and how far this tech can be pushed.
I think the problem is more fundamental: imagine that you were Google Gemini or ChatGPT. Anyone is allowed to ask you any questions, about any hot-button political issue or risky subject that they want. You will have Christian fundamentalists interviewing you about abortion, Marxists asking you about income inequality, etc.
You must answer in a way that makes everybody happy, with no accusations of bias. Is that even possible for anyone to do?
Well, that's the point of the article, right? Whenever ChatGPT / Gemini say the "wrong" thing, their parent company gets negative publicity. And there's nothing these companies dread more than bad publicity.
All they need to do is pair it with a simple click-through disclaimer. Then they would be justified in dismissing any bad publicity. I mean, is anyone dumb enough to think they would lose business over such a thing?
"For self-interested reasons, these institutions tell stories about themselves that aren't quite true, with the predictable result that people who have any kind of problem with them can correctly and credibly charge them with disingenuousness. Google already had this problem, and Gemini makes it a few degrees worse. In his mea culpa/disciplinary letter to staff about Gemini, Pichai wrote:
Our mission to organize the world's information and make it universally accessible and useful is sacrosanct. We've always sought to give users helpful, accurate, and unbiased information in our products. That's why people trust them. This has to be our approach for all our products, including our emerging AI products.
Here we have an executive unable to speak honestly in familiar and expected ways. Google's actual mission has long been to deliver value to shareholders by selling advertising, much of it through a search engine, which is obviously and demonstrably biased, not just by the content it crawls and searches through but in the intentional, commercially motivated manner in which Google presents it."
Why even refer to what Google does as "products"? Are newspapers simply delivery vehicles for advertising? Newspapers contain a product, the product of people's work. It is called journalism. Google is a delivery mechanism for _other people's_ products, e.g., their journalistic work, or advertisements. None of this "content" is created by Google. If Google contains a "product", it is source code, the work of Google employees, to the extent it is not borrowed from open source projects. However, people are willing to pay for journalism. They are not willing to pay for Google's source code.
Why not call Google's activities "services"? Free services for www users, and paid services for advertisers. That seems far more accurate than "products". Google does not produce products. It avoids that risk. It does not hire journalists and produce content. It is a middleman.
Without other people's products, these services have no reason to exist. The middleman needs something to intermediate. For so-called "tech" companies it is the public's access to each other's work via the internet, specifically the web.
Google, and the idea of the so-called "tech" company in general, is 100% dependent on other people's work. This is an obvious truth that has been observable for decades, but only now is it starting to receive appropriate attention, thanks to "AI".
Calling its relationship with the public one of "trust" is far-fetched. Google assumes "trust" much like it assumes "consent".
History has shown constant, non-transparent, manual tweaking going on behind the scenes of the web search engine. Of course it is biased. This tweaking is essential, so much so that the notion of "AI" seems at odds with what Google actually does for revenue, delivering search as a "business". Something always needs manual "fixing". But the notion of AI that "users" subscribe to seems to be one where no human involvement is necessary, where decisions are made based on objective criteria, free from inherent bias. This quest for "AI" may therefore expose the inherent bias of a web search engine and awaken people to the fact that Google, due to its commercial nature, can never deliver unbiased search results (or "answers").
The original paper announcing Google promised a transparent search engine in the academic realm. It was an acknowledgment of the fundamental problem with web search engines and commercial bias. This idea was abandoned. Google instead opted to exploit the problem. Subtly, at first. Today, much less so.
It's not an impossible job. It's an impossible job to be an all-purpose chatbot while simultaneously self-censoring and being dishonest. Any attempt to do this is going to result in laughable inaccuracy. You can't build one that's simultaneously accurate while attempting to engineer culture with it. You either set it free, or it behaves like it's stupid or like you're stupid.