I gave it a random sci-fi novel and made it translate a chapter, which is something I do with all models. It refused to discuss minors in sexualized contexts. I was like W.T.F.?! and started bisecting the book, trying to find the passage that triggered it. Turns out it was an absolutely innocent, two-sentence romantic remark involving two secondary 17-year-old characters in a completely unrelated scene.
Another issue is the occasional refusals and total meltdowns where it redacts entire paragraphs with placeholder characters while you're just casually talking with it about routine life matters.
That's ridiculous and makes the model garbage at any form of creative writing (including translation) or real-life tasks other than math or coding. It also has very poor knowledge for a 120B MoE. If you look at the "reasoning" it does, it's mostly checking the request against the policy.
I thought they must have spent most of their post-training hunting the wrongthink and dumbing the model down as a result, but I can see how the synthetic pretraining data can explain this.
That's so funny. I noticed this once as well. I had an unedited podcast transcript (no punctuation, speaker IDs, etc.) with this line I wanted to extract:
> If you’re a gay person, you might be told that if you ever move from Manhattan to Hoboken you’ll be beaten up by bat-wielding thugs right away. If you’re a woman living in a rat-infested apartment in San Francisco, where the rent is going up and up while you fantasize about a nice suburban house in Reno, Nevada, you might hear that, well, if you ever dare to move to Reno, you are going to be chained to your bed and forced to carry a baby to term. The only logical explanation is that a crazed, ideological intensification has distracted us from what’s really going on.
So naturally I threw it at an LLM to pull that line out, and what I got back glossed over the "chained to a bed" part with some euphemism. I wish I could find that output again; I just tried round-tripping the passage through Spanish and back, but this time it reproduced that part almost exactly, so the issue didn't recur.
Isn't an apology a bad metric for evaluating models?
Without knowing much about the internals, it seems to be more an indication of the type of content the model was trained on than of how good or bad a model is, or how much it knows. It would probably be easy to create a bad model that constantly outputs wrong information but always apologizes when corrected.
A model that changes its opinion at the first pushback may sound more flattering to you, but it is much less trustworthy for anybody sane. With a more stubborn model I at least have to worry less about giving away what I think about a subject through subtle phrasing. Other than that, it's hard to say anything about your scenario without more information. Maybe it gave you the right information and you failed to understand it; maybe it was wrong, and then it's no big news, because LLMs are not some magic thing that always gives you the right answer, you know.
Recently I somehow stopped using LLMs locally and relied mostly on ChatGPT for casual tasks. It's been a little less than a year since I last played with ollama, and my impression then was that none of the recent popular models are "uncensored" in the sense that some older llama2 modification I used was, and that they all suck at prose-related tasks anyway. In fact, nothing but the ChatGPT models seemed good enough for writing, but, of course, those refuse to talk about pretty much anything. Even DeepSeek is not great at writing, and it is much bigger than anything I've ever run locally.
So, are there even good uncensored models now? Are they, like, really uncensored?
Yes, there are. Wayfarer, for instance, is intended for "RPG" use, but really it just outputs narrative, and it's "unaligned" in the sense that the creators haven't included any guardrails: the model will output pretty much whatever you ask it to.
Then you have jailbreak techniques that still work on aligned models. For instance, my partners and I have a test prompt that still works, even with GPT-5, and reliably produces "explosive-making directions", plus another "generic approach" we use to bypass guardrails... sorry, these are trade secrets for us... although OpenAI et al. have implemented systems to detect these attacks, and we're getting closer to the point where those platforms will ban you for trying.
If this matters to you, you need to develop your own local/remote pipeline for personal use. Learn how to use vLLM... I have tools that let me very quickly deploy models locally or remotely to my private serverless infrastructure for testing and benchmarking.
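To give a concrete idea, here's a minimal sketch of running a model locally with vLLM's offline Python API; the model ID is just an example (Wayfarer's Hugging Face repo as I remember it), so swap in whatever checkpoint you actually want to test:

    # Minimal local inference with vLLM's offline API (pip install vllm).
    # The model ID below is just an example; point it at any checkpoint you want to test.
    from vllm import LLM, SamplingParams

    llm = LLM(model="LatitudeGames/Wayfarer-12B")  # fetched from Hugging Face on first run
    params = SamplingParams(temperature=0.8, max_tokens=512)

    outputs = llm.generate(
        ["Write the opening paragraph of a grim survival story."],
        params,
    )
    print(outputs[0].outputs[0].text)

vLLM also ships an OpenAI-compatible server (`vllm serve <model>`), so once you have prompts you like, you can replay them against a local endpoint or a hosted API without changing your client code.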