Awesome-totally-open-ChatGPT: A list of open alternatives to ChatGPT (github.com/nichtdax)
343 points by pallas_athena on March 21, 2023 | hide | past | favorite | 48 comments


Comparing search engines has been my passion for a while now. I built various tools around it (https://www.gnod.com/search/)

With the advent of LLMs, I also started doing it for AI engines recently:

https://www.gnod.com/search/ai

Will keep it updated with new LLMs as they come on the scene.

If anybody runs one of these open ChatGPT alternatives as a service, let me know. I would love to include your service in the AI comparison.


Good idea. A couple more non-AI engines: https://search.marginalia.nu/ and my own https://highbrow.se/


Something in Marginalia's query processing is almost comically broken right now. Only really works well for single-term queries. Got a fix in the pipe, but it'll be a week or so.


Do you have any suggestions on how to feed different search engines into each other, along with random prompt injections, to watch their dialog?


Prompt:

What would happen if the Eiffel tower was submerged in a massive tub of liquid nitrogen?

    ----
YouChat:

Based on the provided search results, there does not appear to be any reliable information on what would happen if the Eiffel Tower was submerged in a massive tub of liquid nitrogen. It is important to note that submerging the Eiffel Tower in such extreme conditions could potentially cause damage or destruction to the tower, and it is not recommended to attempt such an experiment. If you have any other questions or concerns, please let me know and I will do my best to assist you.

    ----
Phind:

Liquid nitrogen is a colorless, odorless, and non-flammable substance that is used in many industrial processes. It has a boiling point of −196 °C (−321 °F), which is much lower than the freezing point of water. If the Eiffel Tower were submerged in a massive tub of liquid nitrogen, several things would happen [4].

The liquid nitrogen would quickly evaporate into gas since the Eiffel Tower is much warmer than −196 °C. This would create a cloud of nitrogen gas around the tower, which could obscure its visibility.

The tower's metal structure would initially contract due to the extreme cold, which could cause it to make cracking or popping sounds. However, the contraction would be minimal since the tower is made of wrought iron, which has a low coefficient of thermal expansion [5].

The tower's paint would become brittle and could crack due to the sudden change in temperature. However, the paint layer is not thick enough to affect the overall structural integrity of the tower [1].

The tower's micro-movements due to wind and sun exposure would be reduced since the metal structure would be much colder and less prone to expansion and contraction. However, this effect would be temporary since the tower would eventually warm up again [1].

The tower's lights would still be visible since they are powered by electricity, which is not affected by the temperature of the tower [3].

In summary, submerging the Eiffel Tower in a massive tub of liquid nitrogen would not cause any significant damage to the tower's structure. However, it could create a temporary cloud of nitrogen gas and cause the tower's paint to become brittle.

[1]: https://www.toureiffel.paris/en/news/130-years/how-long-can-...


[3]: https://www.toureiffel.paris/en/news/history-and-culture/15-...

[4]: https://quizlet.com/501418121/assignment-4-eiffel-flash-card...

[5]: https://www.snexplores.org/article/how-science-saved-eiffel-...

    ----
Perplexity:

Submerging the Eiffel Tower in a massive tub of liquid nitrogen would not be possible as it would require an enormous amount of liquid nitrogen, which is not readily available. However, if such a scenario were to occur, the tower would become extremely cold and brittle due to the low temperature of liquid nitrogen (-196°C). The metal structure of the tower would contract and shrink, causing it to become more fragile and prone to breaking. Additionally, any water or moisture on the tower would freeze instantly, leading to potential damage or destruction of the structure[1][2].

[1]: https://youtube.com/watch?v=u4gobiTvFg0

[2]: https://insh.world/science/what-if-we-poured-a-pool-of-liqui...
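
A quick sanity check on the contraction claims above: linear thermal contraction is roughly ΔL = α·L·ΔT. Plugging in my own rough numbers (α ≈ 12×10⁻⁶/K for wrought iron, a ~330 m tower, cooling from ~15 °C to −196 °C; none of these come from the answers above):

```python
# Back-of-envelope: linear thermal contraction dL = alpha * L * dT
ALPHA_WROUGHT_IRON = 12e-6   # per kelvin, approximate linear expansion coefficient
HEIGHT_M = 330.0             # rough height of the Eiffel Tower in metres
DELTA_T = 15.0 - (-196.0)    # cooling from ~15 C to liquid-nitrogen temperature

contraction_m = ALPHA_WROUGHT_IRON * HEIGHT_M * DELTA_T
print(f"Estimated contraction: {contraction_m:.2f} m")  # -> Estimated contraction: 0.84 m
```

So roughly 0.8 m of shrink over the full height; whether that counts as "minimal" depends on your standards.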


Phind is pretty impressive. I wonder what proprietary models it uses. GPT?


Nice, I've found engines that our corporate firewall doesn't block, woot =)


It's absolutely fantastic that we have so many runtimes, so quickly, to the point where we have an awesome list.

However, given that the usefulness of chatbots depends more on the model being used, what I would find a lot more useful is a ranking of the various models that are available. Currently I'm having to rely on comments on the internet to find out whether Alpaca 7B or LLaMA 65B is genuinely productive to use. As new models come out, I'd love to know how well each one tells jokes, answers complicated questions, or generates code.
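
The kind of ranking I have in mind could be sketched like this (everything here is a placeholder: the prompts, the scorer, and the model callables are stubs, not real APIs; the hard part in practice is the scoring itself):

```python
# Hypothetical sketch of a tiny model-ranking harness.
from statistics import mean

PROMPTS = {
    "jokes": "Tell me a joke about compilers.",
    "reasoning": "If Alice is older than Bob and Bob is older than Carol, who is youngest?",
    "code": "Write a Python function that reverses a string.",
}

def score_reply(category, reply):
    # Placeholder scorer: in practice this needs human rating,
    # unit tests for code tasks, etc.
    return 1.0 if reply else 0.0

def rank_models(models):
    """models: mapping of name -> callable(prompt) -> reply (stubs here)."""
    return sorted(
        ((name, mean(score_reply(cat, ask(p)) for cat, p in PROMPTS.items()))
         for name, ask in models.items()),
        key=lambda kv: kv[1], reverse=True)

# Stub "models" so the sketch runs end to end.
models = {"alpaca-7b": lambda p: "stub reply", "llama-65b": lambda p: ""}
print(rank_models(models))
```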


We have a whole team of folks just watching for these to come out and then go evaluate them.

Short answer: none of them do as well as the OG Davinci-003. Not even close. Even the 3.5 Turbo models from OpenAI don’t do as well.

We throw some sophisticated prompts at them to attempt chain of thought reasoning.
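
For a concrete flavor (a generic example, not our actual eval set): one widely used zero-shot chain-of-thought probe just appends a step-by-step cue to the question and checks whether intermediate steps appear before the answer.

```python
# Generic zero-shot chain-of-thought prompt construction.
question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")
cot_prompt = question + "\nLet's think step by step."
print(cot_prompt)
```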


That's quite a confusing comment. `davinci-003` is from OpenAI, whereas ChatGPT is some sort of variant more "optimized" for chatting. Said differently, ChatGPT is a customized version of GPT-3/3.5 (`davinci-003`), made for chatting. Please don't ask me about the details, I don't know, but `davinci-003` is not an alternative to ChatGPT.


>but `davinci-003` is not an alternative to ChatGPT

What makes you believe that? In my testing davinci does better than gpt-3.5-turbo for most tasks.


I think people, and this article, are about suggesting alternatives (competitors) to ChatGPT. `davinci` is obviously not an alternative; ChatGPT is `davinci` made for chatting. As to whether davinci produces better responses than ChatGPT ... maybe? But that's a different question.


It is an alternative. It’s just more expensive.


Do you have a citation for that?


would be interested in that as well


What kind of things have you seen davinci-003 do better than 3.5 turbo?


We need open benchmarks, clearly. Know any projects in that space?


Could you expand on this a bit more? What types of prompts? What are your evaluation criteria?

This actually sounds fascinating. Not unlike birdwatching! ))


That’s interesting - what about 4?


No! We can’t have open alternatives! That would be “unsafe”. We need regulatory capt— I mean, responsible government regulation now! As a MSFT shareholder I demand it.


OpenAI: This technology is too powerful to be put in the hands of mere mortals!

Also OpenAI: ... unless they pay us $20 per month!


Sam Altman: "We definitely need more regulation on AI"

Regulation could prevent indie AI projects.


Just like regulation stopped piracy, 3D-printed weaponry, and prostitution, and the war on drugs and Prohibition stopped the drug and alcohol trade in America.


Regulation can do bad things / regulation can do good things? Is this what you meant?


I suspect that gp meant that, for a lot of things, regulation is ineffective and costly.


ChatGLM is the only one on the list whose performance feels similar to GPT-3. I tried out Alpaca; it wasn't as good yet, but I've heard on Twitter that someone has a better dataset and has finetuned it. Let's see if they release the model weights.


I posted this yesterday too but I've found OpenAssistant to be the best of the OSS ones so far (Alpaca and LLaMA are non-commercial so not truly open): https://huggingface.co/spaces/olivierdehaene/chat-llm-stream...


What's nice about Open Assistant is that it's continuously improving and the community can contribute to the training data. It is potentially MUCH better than ChatGPT.


I've tried most of these and I gotta say, ChatGLM is actually pretty good. The Chinese responses are sometimes indistinguishable from those of ChatGPT, and it's only a 6B model.

Sadly, whenever I use harder questions or logical ones, it performs terribly. It also occasionally uses Chinese adjectives (albeit correctly) when responding in English.


I wonder if their censorship means that the Chinese corpus has less spam than the rest of the internet? Would be interesting if that turns out to be a huge advantage for making AI.


Chinese censorship is both subtractive and additive (“flood the zone with shit”). There will be plenty of spam in the corpus.


I feel like we need to invest in open alternatives otherwise we are gonna create something worse than IBM used to be.


What we need is a crowdsourced project where we dedicate GPU resources to build a model from scratch. There are thousands of us in here, and thousands more in the wider community.


Open-assistant.io is what you want. They've collected more than 100k RLHF training samples and are actively tuning up a chat model based on GPT-NeoX (?) from what I've heard.


What would be great: A list of performance benchmarks: What's the model good at?


Those exist - the trouble is that they're not for the average mortal...


Cool, but a pity there are no benchmarks so we can compare them.


If OpenAI would release more than masturbation papers, yes, we could compare them. Right now, we only know GPT-4 does things, and looks smart when you ask it stuff.


What do you mean? The paper [1] is full of benchmarks?

[1] GPT-4 Technical Report, https://cdn.openai.com/papers/gpt-4.pdf


Problem being it's not a research paper, which is what they were doing previously. This is a very bad state, as they're not detailing anything that external parties can recreate or verify via the scientific method.

They can claim the model says 40% fewer "xbox live gamer words", which people outside the company couldn't validate.

tl;dr: OpenAI is now a business.

Worth watching Yannic talk about the problem and other cool ML topics too: https://www.youtube.com/watch?v=2zW33LfffPc


It's not like this is a closed model only available to scientists that you can't benchmark yourself. Benchmarking should also be done by a 3rd party, otherwise we have a conflict of interest.


If this were a CPU or a graphics card, sure, let's benchmark it; worst case you're getting fewer frames.

Here we'd need to see more about its design and safety, else you may be getting recipes for veggie dishes when what you really wanted was fried chicken.


How would knowing the architecture or safety mechanisms help you decide if it’s going to give incorrect results more than actual testing would?

I’m no LLM expert, but I don’t think you can eyeball the arch and say “that’s going to confuse veggies for fried chicken”.


https://aisnakeoil.substack.com/p/gpt-4-and-professional-ben...

"GPT-4 and professional benchmarks: the wrong answer to the wrong question OpenAI may have tested on the training data. Besides, human benchmarks are meaningless for bots."


Just looking at the pictures and graphs in that paper is enough to be amazed by what they're achieving. The example where they show 3 pictures of an old monitor plug being connected to an iPhone to recharge it, and GPT-4 is asked what's funny about it and answers incredibly accurately, is amazing.


Since we don't have access to this feature, let's be skeptical. It feels like "leading the witness" if you ask what's funny here. Also, if the image is from a forum or sub full of funny images, does that give it away?

Running multiple tests would be stronger, say with example prompts: "what's going on in this picture", "what would a person think seeing this image", etc.

GPT-4 is cool as a numbers box, but this isn't reasoning logic, and without papers it hasn't been proven either.


There are plenty of people with access to GPT-4 through the API. They can benchmark it however they want.


Your suggestion made me wonder: are the OpenAI API responses deterministic? I would expect you get a different reply to the same prompt each time (given the probabilistic nature of LLMs), which makes "benchmarking" kind of weird. Or maybe the solution is just to run the prompt 100 times and take the mean across whatever metric is being used.
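
For what it's worth, the API samples with a nonzero temperature by default, so replies to the same prompt do vary (setting temperature to 0 makes output mostly deterministic). The repeat-and-average idea would look something like this sketch, where `ask_model` is a stand-in stub rather than a real API call:

```python
# Sketch: average a score over repeated runs of the same prompt.
import random
from statistics import mean, stdev

def ask_model(prompt, temperature=0.7):
    # Placeholder standing in for a real chat-completion call; with nonzero
    # sampling temperature, each call may return a different reply.
    return random.choice(["right answer", "wrong answer"])

def score(reply):
    return 1.0 if reply == "right answer" else 0.0

runs = [score(ask_model("What is 2+2?")) for _ in range(100)]
print(f"mean={mean(runs):.2f} stdev={stdev(runs):.2f}")
```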



