100% yes. We can (because people are going to ask how to do X in Postgres, and the AI can reply with how to do that in Postgres, plus a side note that this sponsored product does it out of the box). However, can you sell more ads by tagging them as "ads" or "Sponsored"? That is where the gray area shows up (IMO).
Obviously, this is still evolving, but I guess something along those lines could be done if one is serious about monetizing it. In my personal opinion, right now companies are focused more on capturing market share than on monetization.
I don't think so. Google had LaMDA internally for quite some time. If it were a slam dunk for selling more ads, it would have been out way before ChatGPT.
Google has maximized their ad revenue with their current model. You don't have the same ad real estate in your example, and that is a problem. Right now they get money for impressions on a whole page, plus clicks from people trying to find the right fit among the SEO results. I believe Google wants information filtering, but not too much of it, as it needs the current real estate for ads.
You know how some people used to use Google as a glorified spell checker? I use ChatGPT as a glorified stupidity checker. What I mean is, I ask it the silliest of questions: how to set an environment variable in Windows (because we are all used to export, aren't we?), whether or not we can do X in K8s YAML, and so on, all within a given conversation.
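To give a concrete example of the kind of answer that saves me a search (a minimal sketch; MY_VAR is just a placeholder name):

  export MY_VAR=value      # bash / Linux / macOS
  set MY_VAR=value         # Windows cmd, current session only
  setx MY_VAR value        # Windows cmd, persists for new sessions
  $env:MY_VAR = "value"    # PowerShell, current session only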
Obviously I use it for other purposes as well, but it has definitely saved me a lot of hours by getting the basic things right, right there in a prompt.
The thing is - our ability is limited by our understanding. AI may already be doing things we do not understand (which could be classified as AGI). Think of it like how dogs do not have the cognitive ability to understand the concept of the "future" or tomorrow - there is a good chance AI is already doing things that are beyond our cognitive ability (including that of the smartest people working on the tech).
But I am sure that if we let two fairly good LLMs talk to each other, they'd soon start saying things that look like hallucinations to us, but which the two LLMs would understand and take further.
Again, this is just my opinion, and I have never worked on any LLMs. So, an outsider's view.
I agree with that - private data (in the best-case scenario) should not and would not be included in the training, but some parts of internal documents would be public (say, on a public website) - it is expected that ChatGPT would know at least those.
Assuming that bit about robots.txt is true (and I feel it is), what can one do? Are there no regulations (implemented or in the planning stage) from any of the bodies which decide the standards?
At some point, content owners should - technically - have some control over who accesses their content.
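Today the only lever I know of is the voluntary robots.txt convention - a minimal sketch, assuming a site that wants to opt out of AI crawlers (GPTBot is OpenAI's documented crawler, CCBot is Common Crawl's):

  User-agent: GPTBot
  Disallow: /

  User-agent: CCBot
  Disallow: /

But nothing technically forces a crawler to honor it, which is exactly the gap any regulation would have to fill.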