
I'm having a lot of fun experimenting with stuff like this. I'm trying to put together an Unreal Engine Blueprints-style graph editor to let people design workflows like this: the user prompt goes to one agent, which makes an initial attempt; that conversation history then gets passed to another "agent" with a different system prompt telling it to be a harsh critic, but also to give a pass/fail signal; it loops back until the critic judges pass, and the result is sent back to the user as output. Ideally as a little website that can call your own LLM endpoints and save/load/share workflow graphs.
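The loop itself is easy to prototype even before the graph editor exists. A minimal sketch, assuming an OpenAI-compatible local endpoint (the URL and model name are placeholders, not anything specific):

    # Generator/critic loop: one agent drafts, a second agent with a harsh-critic
    # system prompt grades it PASS/FAIL, and we loop until it passes or we give up.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")  # placeholder local endpoint
    MODEL = "mistral-small-3.1"  # placeholder model name

    def chat(system, messages):
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "system", "content": system}] + messages,
        )
        return resp.choices[0].message.content

    def run(user_prompt, max_rounds=4):
        history = [{"role": "user", "content": user_prompt}]
        draft = chat("You are a helpful assistant. Answer the user's request.", history)
        for _ in range(max_rounds):
            critique = chat(
                "You are a harsh critic. List every flaw, then end with the single word PASS or FAIL.",
                history + [{"role": "assistant", "content": draft}],
            )
            if critique.strip().endswith("PASS"):
                break
            history += [
                {"role": "assistant", "content": draft},
                {"role": "user", "content": "A reviewer said:\n" + critique + "\nRevise your answer."},
            ]
            draft = chat("You are a helpful assistant. Revise your answer per the critique.", history)
        return draft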

Mistral Small 3.1 and Gemma 3 feel like the first semi-competent models that can be run locally, but that competence is just a seed, and they still need to be guided by a framework that keeps them on track.

Try giving it Python execution in a loop and telling it to explore the world. It'll start trying to download and read news and stuff.
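For anyone wondering what "Python execution in a loop" looks like concretely, here's a rough sketch. The endpoint, model name, and the one-code-block-per-turn convention are my assumptions, and exec'ing model output is only sane inside a sandbox:

    # REPL loop: ask the model for a Python block, run it, feed stdout back as
    # the next user message, repeat.
    import io, re, contextlib
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")  # placeholder local endpoint

    messages = [
        {"role": "system", "content": "You can run Python. Reply with exactly one ```python block per turn."},
        {"role": "user", "content": "Explore the world. Start wherever you like."},
    ]
    for _ in range(10):  # cap the number of turns
        reply = client.chat.completions.create(model="gemma-3-27b", messages=messages).choices[0].message.content
        messages.append({"role": "assistant", "content": reply})
        match = re.search(r"```python\n(.*?)```", reply, re.S)
        if not match:
            break
        out = io.StringIO()
        with contextlib.redirect_stdout(out):
            try:
                exec(match.group(1), {})  # throwaway namespace; sandbox this for anything real
            except Exception as e:
                print("error:", e)
        messages.append({"role": "user", "content": "Output:\n" + out.getvalue()})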




I am thinking the same thing! Multiple "personalities", in parallel, or in series. For example, I have approximated, in GPT, some of Gemini's ability to call out nonsense, sloppy thinking, by telling GPT to be mean! (The politeness seems to filter out much that is of great value!)

However, the result is not pleasant to read. Gemini solved this in their training, by doing it in two phases... and making the first phase private! ("Thinking.")

So I thought, what I need is a two-phase approach, where that "mean" output gets humanized a little bit. (It gets harsh to work in that way for more than short intervals.)

As a side note, I think there would be great value in a UI that allows a "group chat" of different LLM personalities. I don't know if such a thing exists, but I haven't seen it yet, although the message object format seems to have been designed with it in mind (e.g. every message has a name, to allow for multiple users and multiple AIs).

Even better if it supports multiple providers, since they have different strengths. (It's like getting a second opinion.)
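To make the message-format point concrete: the chat schema's optional "name" field is enough to represent a group chat in one transcript. A tiny sketch (the personas and dialogue are made up):

    # One shared transcript; the optional "name" field on each message is what lets
    # several AIs (and several humans) coexist in the same conversation.
    history = [
        {"role": "user",      "name": "alice",    "content": "Is this architecture over-engineered?"},
        {"role": "assistant", "name": "critic",   "content": "Yes. Three queues for one producer is absurd."},
        {"role": "assistant", "name": "diplomat", "content": "The critic has a point; one queue likely suffices."},
        {"role": "user",      "name": "alice",    "content": "OK, what's the simplest version that still scales?"},
    ]
    # On each persona's turn, send this history plus that persona's own system prompt,
    # then append its reply under its name. Different providers can back different personas.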


I disagree.

If anything, telling GPT to be blunt seems to downgrade its IQ; it hallucinates more and makes statements without considering priors or context. I jokingly call it Reddit mode.


why would that be a joke? there's a ton of Reddit comments in the training data, and the output is of similar quality. LLMs are literally outputting average Reddit comments.


I have heard similar things, but I think that's an exaggeration. When I tell o3 or o4-high to assume a professional air, it stops acting like the meat-based AIs on r/politics; specifically, it stops making inane assumptions about the situation and becomes useful again.

For example, I had a question from a colleague that made no sense, and I was trying to understand it. After feeding the question to o3, it aggressively told me that I had made a major mistake in a quote and had to make major changes. (That would have been fine if it were what the colleague had actually said, but it wasn't.) In reality the colleague had misunderstood something about the scope of the project, and GPT had picked up the other person's opinion as the "voice of reason" and just projected what it thought he was saying in a stronger form.

I changed its instructions to "Be direct; but polite, professional and helpful. Make an effort to understand the assumptions underlying your own points and the assumptions made by the user. Offer outside-of-the-box thinking as well if you are being too generic." The aggro was immediately gone, and instead it actually tried to clarify what my colleague was saying and was useful again.

I agree with those who say the vanilla version is sycophantic, but the plain-talk version has far too many bad habits from the wrong crowd. It's a bit like Monday: lots of aggro, little introspection of assumptions.


Reddit works hard to make comments accessible only to Google. However, MS + OAI might have grabbed something before the Reddit-Google contract.


See, he's not joking, he's "joking" ...


> As a side note, I think there would be great value in a UI that allows a "group chat" of different LLM personalities.

This is the basic idea behind AutoGen. They also have a web UI now in AutoGen Studio, and it's gotten a bit better. You can create "teams" of agents (with different prompts, themes, tools, etc.) and have them discuss / cooperate. I think they even added memory recently. Have a look at it, it might be what you need.
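A rough sketch of what a "team" looks like in code, assuming the classic pyautogen (0.2-style) API; AutoGen Studio wraps the same idea in a UI, and the model name and key here are placeholders:

    import autogen

    llm_config = {"config_list": [{"model": "gpt-4o-mini", "api_key": "sk-..."}]}  # placeholder credentials

    # Two personas plus a proxy for the human; the manager routes turns between them.
    writer = autogen.AssistantAgent("writer", system_message="Draft answers to the user's request.", llm_config=llm_config)
    critic = autogen.AssistantAgent("critic", system_message="Harshly critique the writer. Say APPROVED when satisfied.", llm_config=llm_config)
    user = autogen.UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)

    group = autogen.GroupChat(agents=[user, writer, critic], messages=[], max_round=8)
    manager = autogen.GroupChatManager(groupchat=group, llm_config=llm_config)
    user.initiate_chat(manager, message="Explain CRDTs to a junior dev.")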


MoE, but an abstraction deeper?


I think you can do most of this already with llm-consortium (it may need the llm-openrouter plugin with my PR merged).

A consortium sends the same prompt to multiple models in parallel, and the responses are all sent to one arbiter model, which judges them. The arbiter decides whether more iterations are required. It can also be forced to keep iterating until a confidence threshold or a minimum number of iterations is reached.

Now, using the PR I made to llm-openrouter, you can save an alias to a model that includes lots of model options. For example:

    llm openrouter save -m qwen3 -o online -o temperature 0 --system "research prompt" --name qwen-researcher

And now you can build a consortium where one member is an online research specialist. You could make another that uses JSON mode for entity extraction, and a third that writes a blind draft. The arbiter would then make use of all of that and synthesize a good answer.


Any links or names of example implementations of this?


https://github.com/irthomasthomas/llm-consortium

Also, you aren't limited to the CLI. When you save a consortium, it creates a model. You can then interact with the consortium as if it were a normal model (albeit slower and higher quality). You can also serve your custom models on an OpenAI-compatible endpoint and use them with any chat client that supports custom OpenAI endpoints.

The default behaviour is to output just the final synthesis, and this should conform to your user prompt. I recently added the ability to continue conversations with a consortium. In this case it only includes your user prompt and final synthesis in the conversation, so it mimics a normal chat, unlike running multiple iterations in the consortium, where full iteration history and arbiter responses are included.

    uv tool install llm
    llm install llm-consortium
    llm install llm-model-gateway
    llm consortium save qwen-gem-sonnet -m qwen3-32b -n 2 -m sonnet-3.7 -m gemini-2.5-pro --arbiter gemini-2.5-flash --confidence-threshold 95 --max-iterations 3
    llm serve qwen-gem-sonnet

In this example I used -n 2 on the Qwen model: since it's so cheap, we can include multiple instances of it in the consortium.

Gemini Flash works well as the arbiter for most prompts. However, if your prompt has complex formatting requirements, embedding those within an already complex consortium prompt often confuses it; in that case, use gemini-2.5-pro as the arbiter.
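Once it's served, any client that speaks the OpenAI API should be able to use the consortium like a normal model. Something along these lines (the port is an assumption; check what llm serve prints):

    from openai import OpenAI

    # Point a standard OpenAI-compatible client at the locally served consortium.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")  # port is an assumption
    resp = client.chat.completions.create(
        model="qwen-gem-sonnet",  # the consortium saved above, behaving like a single model
        messages=[{"role": "user", "content": "Summarize the tradeoffs of CRDTs vs OT."}],
    )
    print(resp.choices[0].message.content)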


Have you tried n8n? It allows you to build flows like that - you can run the community version in a Docker container within a few minutes and share the configurations for the flows you have built very easily.


The "_#_" numeronym scheme has to be one of the worst word-shortening schemes I've ever seen get widespread. It only works for a very small number of long-lived technologies, in which case they basically just get a nickname, like "k8s" or "i18n". It does not work at all in broader contexts: you're basically making someone solve a crossword clue (2 across, 10 letters with two filled in) just to parse your sentence.


I just googled it and it looks like “n8n” is the name of the service. The op wasn’t abbreviating anything so I don’t think it’s the same phenomenon as what you’re describing.


Well, the service is doing the same thing, though. The part I don't understand is that I assume n8n is short for "Nation", but literally every single person I've seen talk about it on YouTube (which is quite a lot) says "En Eight En" every time.


nation is too short for 8 - maybe navigation?


Looks like n8n is short for nodemation


Why do we do this to ourselves?


Techno-flagellation is the only way to atone


So the 8 stands for "odematio"? That sounds about right.



The app is actually called n8n - https://n8n.io/


It's just another form of any other jargon - unknown until you know it, and usually specific to the use case. I see k8s and i18n or a11y and I know exactly what they mean because at some point I learned it and it's part of the world I live in. Searching for stuff is how we learn, not solving crosswords.


I kind of get k8s and can live with i18n (at least it's a long word). But a11y just shouldn't exist. "Oh look, it looks like ally, what a cute play on words." Yeah, but for a dumb joke and 9 saved keystrokes you literally made the word "accessibility" less accessible. That's exactly the opposite of what accessibility is about.


Right, my complaint is that it only works like jargon, where you are just giving something a context-specific nickname. As a word shortening scheme, it's terrible. A world where many projects have names like s11g is a nightmare.


No, it's not just part of the world, some fatality we have to live with like gravity. Abbreviations can on rare occasions have a net benefit, but only in very narrow, unusual contexts do they bring any general benefit. More often than not they just obfuscate the message for newcomers, raising an artificial barrier to entry.


I had not, but that looks awesome. Microsoft put out something called "agent flows" that also fits this category.[1] I'm working on more of an "at home" version - no "talk to sales" button.

[1] https://www.microsoft.com/en-us/microsoft-copilot/blog/copil...



