
(I work at OpenAI.)

Thanks for the report — these are not actually messages from other users, but instead the model generating something ~random due to hitting a bug on our backend where, rather than submitting your question, we submitted an empty query to the model.

That's why you see just the answers and no question upon refresh — the question has been effectively dropped for this request. Team is fixing the issue so this doesn't happen in the future!




While I have your ear, please implement some way to do third party integrations safely. There’s a tool called GhostWrite which autocompletes emails for you, powered by ChatGPT. But I can’t use it, because that would mean letting some random company get access to all my emails.

The same thing happened with code. There’s a ChatGPT integration for PyCharm, but I can’t use it since it’ll be uploading the code to someone other than OpenAI.

This problem may seem unsolvable, but there are a few reasons to take it seriously. E.g. you’re outsourcing your reputation to third party companies. The moment one of these companies breaches user trust, people will be upset at you in addition to them.

Everyone’s data goes to Google when they use Google. But everyone’s data goes to a bunch of random companies when they use ChatGPT. The implications of this seem to be pretty big.


I can't speak for every company, but I've seen a lot of people claiming that they're leveraging "chat GPT" for their tech stack when underneath the covers they're just using the standard open text-davinci-003 model.

Still wrong obv but for a different reason.


Welcome to marketing copy. ChatGPT has the name recognition. text-davinci-003 does not.


GPT-3 surely does too, but ChatGPT is undeniably the new hotness.


I don't really see the issue. You are using a service called GhostWrite which uses ChatGPT under the hood. OpenAI/ChatGPT would be considered a sub-processor of GhostWrite. What am I missing?


On properly designed, privacy-respecting systems, the client sends the request to the trusted server directly, with whatever API keys are needed to make it work.

But that would break the server lock-in subscription model, so it would only work for downloadable software.
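For the curious, here's a minimal sketch of that "bring your own key" pattern: the downloadable client assembles the request to the trusted API itself, so the third-party vendor never touches your data. The function name is made up; the payload shape follows the public GPT-3 completions endpoint.

```python
def build_direct_request(user_api_key, prompt):
    """Assemble a request straight to the trusted API, bypassing the vendor.

    The user supplies their own API key; no third-party server is involved.
    (Illustrative sketch -- names and payload shape are assumptions.)
    """
    return {
        "url": "https://api.openai.com/v1/completions",
        "headers": {"Authorization": f"Bearer {user_api_key}"},
        "json": {"model": "text-davinci-003", "prompt": prompt, "max_tokens": 256},
    }

req = build_direct_request("sk-...", "Draft a reply to this email:")
```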


How are they using ChatGPT - is there an API? Or is this simply abuse of TOS?


They're not using ChatGPT, they're using GPT-3, which has an API. There is a ChatGPT API coming but it's not available yet.

It is infuriating how everyone is describing all GPT models as "ChatGPT". It's very misleading.


Supposedly there is a hidden model that you can use via the API that actually is ChatGPT. One of the libraries mentioned in these comments is using it.

Edit: this one https://github.com/transitive-bullshit/chatgpt-api


In case anyone wants to replace davinci-003 with the ChatGPT model, the name is `text-chat-davinci-002-20230126`
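If it helps, swapping it in is just a one-field change to the completions payload. A hedged sketch (the helper is made up, the payload shape follows the public API; no guarantee the leaked model name keeps working):

```python
def build_completion_payload(prompt, use_chatgpt_model=False):
    """Build the JSON body for POST https://api.openai.com/v1/completions.

    Only the `model` field differs between GPT-3 and the leaked ChatGPT
    model name mentioned above. (Illustrative sketch, not official docs.)
    """
    model = "text-chat-davinci-002-20230126" if use_chatgpt_model else "text-davinci-003"
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": 256,
        "temperature": 0.7,
    }

payload = build_completion_payload("Hello", use_chatgpt_model=True)
```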



> Everyone’s data goes to Google when they use Google. But everyone’s data goes to a bunch of random companies when they use ChatGPT.

No, their data goes to random companies when they use random companies. And these services also exist for google.


> But I can’t use it, because that would mean letting some random company get access to all my emails.

That's because they do it to get access to your e-mails, not to give you AI powered email autocomplete.


Honestly, they’ll probably offer some enterprise offering where data sent to the model will be contained and abide by XYZ regulations. But for hobbyist devs, I think this won’t be around for a while.


Isn't this what the Azure OpenAI service is for? Sure it's technically "Microsoft", but at some point you have to trust someone if you want to build on the modern web.


Tl;dr

"Dear CTO, let me leech onto this unrelated topic to ask you to completely remove ways you gather data (even though it's the core way you create any of your products)."

Some people man..


I think you may have misread. The goal is to protect end users from random companies taking your data. OpenAI themselves should be the ones to get the data, not the other companies.

That wouldn't remove anything. Quite the contrary, they'd be in a stronger position for it, since the companies won't have access to e.g. your email, or your code, whereas OpenAI will.

I'm fine trusting OpenAI with that kind of sensitive info. But right now there are several dozen new startups launching every month, all powered by ChatGPT. And they're all vying for me to send them a different aspect of my life, whether it's email or code or HN comments. Surely we can agree that HN comments are fine to send to random companies, but emails aren't.

I suspect that this pattern is going to become a big issue in the near future. Maybe I'll turn out to be wrong about that.

It's also not my choice in most cases. I want to use ChatGPT in a business context. But that means the company I work for needs to also be ok with sending their confidential information to random companies. Who would possibly agree to such a thing? And that's a big segment of the market lost.

Whereas I think companies would be much more inclined to say "Ok, but as long as OpenAI are the only ones to see it." Just like they're fine with Google holding their email.

Or I'm completely wrong about this and users/companies don't care about privacy at all. I'd be surprised, but I admit that's a possibility. Maybe ChatGPT will be that good.


Sketch of a design to solve this:

Company can upload some prompts to OpenAI, and be given 'prompt tokens'.

Then the company's client-side app can run a query with '<prompt_token>[user data]<other_prompt_token>'. They may have a delegated API key which has limits applied - for example, may only use this model, must always start with this prompt.

That really reduces the privacy worries of using all these third party companies.
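A toy sketch of the scheme, to make it concrete. Everything here is hypothetical (nothing like this exists in the real OpenAI API): the company registers a prompt and gets back an opaque token, and the end user's client later expands that token server-side, so the company never sees the user data.

```python
import secrets


class PromptRegistry:
    """Server side: companies register prompts, get back opaque tokens.

    Hypothetical design sketch for the 'prompt token' idea above.
    """

    def __init__(self):
        self._prompts = {}

    def register(self, prompt_text):
        # Issued to the company once, at integration time.
        token = secrets.token_hex(8)
        self._prompts[token] = prompt_text
        return token

    def expand(self, token, user_data):
        # Called by the end user's client directly, so the company
        # only ever handles the opaque token, never the user data.
        prompt = self._prompts[token]
        return f"{prompt}\n{user_data}"


registry = PromptRegistry()
tok = registry.register("Summarize the following email politely:")
query = registry.expand(tok, "Hi, can we move the meeting to Friday?")
```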


ChatGPT has sparked the imagination of the industry, but the fire will be lit by offline models that can accept private data.


Bad take. He's actually asking for them to directly gather data as he trusts them more than the random middle-men who are currently providing the services he's interested in.

As someone working for a random middle-man, I hope OpenAI maintain the status quo and continue to focus on the core product.


Funny how gdb is helping debug openAI!


Quick question. Will ChatGPT be fine-tunable from the API?

PS: You should really do an AMA!


I fully agree with the AMA request.

I'd especially like to know why it was "generating something ~random" instead of "generating something random" when given an empty question.

If it's random, how does it come up with the topic, and if it is "~random", how is it not other (random) user's data? The former case being the interesting one, since the second one would appear to be more of a caching or session management bug.


The most amusing thing about that bug is that if you ask it what question it was answering, it will conjure one that made sense given the answer.


Is OpenAI hiring software engineers without a background in academic machine learning these days? Seems like a super exciting place to work.


Is the inability to "continue" a long answer also a bug? (Please say yes :)


Should a proper large language model be able to generate arguments for and against any side of any debate?


Can you help me understand why the ChatGPT model has an inherent bias towards Joe Biden and against Donald Trump? This is not really what I would expect from a large language model .......


It's a uniquely American perspective that the two political parties should be treated equally. From a global perspective, one is far more problematic than the other, and GPT reflects that accurately.


And yet you’d almost certainly complain if an American company meddled in your country’s politics.


Should the language model treat every political party in every country equally?


Yes. Do you want to use a tool or an automated mouth piece for the regime?


An unbiased tool would never treat two parties completely equally.

If Trump and Biden both claim to have won the election, who should ChatGPT say is president? Should it flip a coin?


Godwin's Law in 3, 2, 1...


[flagged]


We've banned this account. Please don't create accounts to break HN's rules with.

https://news.ycombinator.com/newsguidelines.html


It's probably been fed a lot of lefty propaganda about how it's "bad" to support insurrectionist riots and "wrong" to lie about losing an election.


Reality has an inherent liberal bias.

In all honesty though, the dataset it was trained on may have a liberal bias. This is _precisely_ the sort of bias you should expect from a large language model .............................


Weren't Reddit posts part of the core data set used to train the model?

That alone probably explains the bias.


Yes. And it probably wouldn't have a bias if Reddit weren't heavily censored, with anyone right-leaning being banned. It's practically a left-wing propaganda website now.


What do you mean about liberal bias.... Reality is by its very nature unbiased. It just...is


It was a joke. I mean, it's a joke I personally happen to believe is true, but not something I will state as factual.

Somewhere on the political spectrum lies objective facts, truth, and logic. My priors tell me this side tends to be left-of-center. My priors also tell me that the majority of people's political beliefs are decided for them by their parents and their upbringing. So I'm happy to admit that plenty of liberals are in it for the wrong reasons. That doesn't detract from it being the side on the correct side of history.

But again, it was a joke.


I also used to believe that facts and truth were left of center. But after the whole "get vaccinated or you will be killing someone's grandparents" propaganda turned out to be false, I have a hard time believing the left.


Okay.


A large data set will be biased if the sum of data is leaning towards some direction.

I'm not sure you can produce a truly unbiased model without actively interfering with it.

Just consider the fact that you'll find fewer Republicans among scientists. (source: https://www.pewresearch.org/politics/2009/07/09/section-4-sc...)

Now the research-based data in ChatGPT will be biased. It takes no active "inserting" by OpenAI. The model may create the bias all by itself.


(Psst. You're the broken one, mate.)


Don't question the matrix on HN. Agents are already on their way. . .


It's also more hateful towards men than women.


[flagged]


Didn't someone just go to jail for this? They were sending invoices to google, fb, and a bunch of other companies, who did actually pay it. Then one day they realized the invoices were for nothing, no services rendered.

So, be careful with your trolling. It might come back to bite you someday, sir or ma'am.

https://www.npr.org/2019/03/25/706715377/man-pleads-guilty-t...

Not quite the same, but it's in the same ballpark. It's a big deal to send fake invoices to companies, even if you believe they're legit.


"included" is a loaded word here. Nobody is getting your content, unaltered, as ChatGPT responses, and if they are it's a bug that'll get fixed.

Besides, the law is far from resolved on this issue; there are a number of pending cases that would need to be decided before you could make such an unambiguous claim.



"Subscribe to WIRED to continue reading."

Besides, it looks like an opinion article suggesting a course of action, not factually claiming, as you are doing here, that one idea or opinion is objectively correct.


Do humans also have to pay you money if they read your publications, learn from them and use them in their jobs?


Do you send an invoice for 700K to everyone who looks at and reads your "works" and then remembers them?

If you don't, you're supremely hypocritical.


wow this copyright thing really sounds broken


Yeah, it would be great if people followed the law instead of being digital thieves.

But you know...

Goals.


Oh no, not another copyright flamewar on HN.


While I have your ear, please tell your team not to inject their political biases into this tool. Thanks


This is like asking water not to be wet.



