As someone who struggles to make anything that looks good, I am fascinated by designers' ability to take a brief and bring it to life using their own unique artistic voice.
> major concern that server hosters are on the hook to implement authorization
Doesn't it make perfect sense for server hosters to implement that? If Claude wants access to my Jira instance on my behalf, and Jira hosts a remote MCP server that aids in exposing the resources I own, isn't it obvious Jira should be responsible for authorization?
That GitHub issue is closed because it's been mostly completed. As of https://github.com/modelcontextprotocol/modelcontextprotocol..., the latest draft specification does not require the resource server to act as, or proxy to, the IdP. It just hasn't made its way into a ratified spec yet, but SDKs are already implementing the draft.
The authorization server and resource server can be separate entities, meaning that the Jira instance can validate the token without being the one that issues it or handles credentials.
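A rough sketch of that split, in case it helps. This is not the MCP spec's wire format or any SDK's API: `issue_token`, `validate_token`, and `SIGNING_KEY` are hypothetical names, and a shared HMAC key stands in for the asymmetric keys a real authorization server would more typically publish. The point is only that the resource server checks signature, audience, and expiry, and never touches credentials.

```python
import base64
import hashlib
import hmac
import json
import time

# Assumption: a shared demo key. Real deployments usually verify asymmetric
# signatures against keys published by the authorization server instead.
SIGNING_KEY = b"demo-key-not-for-production"


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    digest = hmac.new(SIGNING_KEY, signing_input.encode(), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(subject: str, audience: str, ttl_seconds: int = 300) -> str:
    """Authorization server side: authenticate the user however it likes
    (SSO, password, carrier pigeon), then sign a set of claims."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "aud": audience, "exp": int(time.time()) + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}')}"


def validate_token(token: str, expected_audience: str) -> dict:
    """Resource server side: verify the token, never see credentials."""
    parts = token.split(".")
    if len(parts) != 3:
        raise PermissionError("malformed token")
    header, payload, signature = parts
    if not hmac.compare_digest(signature, _sign(f"{header}.{payload}")):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    if claims.get("aud") != expected_audience:
        raise PermissionError("wrong audience")
    if claims.get("exp", 0) < time.time():
        raise PermissionError("token expired")
    return claims
```

Here the Jira-side code would only ever call `validate_token`; swapping the identity provider changes who calls `issue_token`, not the resource server.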
Yes, this is true of OAuth, which is exactly what the latest Model Context Protocol is using. What's the concern again?
I guess maybe you are saying the onus is NOT on the MCP server but on the authorization server.
Anyway while technically true this is mostly just distracting because:
1. in my experience the resource server and the authorization server are almost always maintained by the same company -- Jira/Atlassian being an example
2. the resource server still minimally has the responsibility of identifying and integrating with some authorization server, and *someone* has to be the authorization server, so I'm not sure deferring the responsibility to that unidentified party is a strong defense against the critique anyway. The strong defense is: of course the MCP server should have these responsibilities.
I think the pain points will be mostly for enterprise customers who want to integrate servers into their auth systems.
For example, say you have a self-hosted Jira instance with SSO to Entra ID. You can't just install an MCP server off the shelf because authZ and resources are tightly coupled and implementation-specific. It would be much easier if the server only handled providing resources, and authZ was offloaded to a provider of your choosing.
I'm under the impression that what you described is exactly how the new Model Context Protocol works, since it's using OAuth and is therefore unaware of any of the authentication (e.g. SSO) details. Your authentication process could be done via carrier pigeon and Claude would be none the wiser.
For better or worse, this has become the de facto standard in LLM evaluation research papers since the LLM-as-a-Judge paper [0] came out. It's also heavily embedded in frameworks like LangChain and LlamaIndex for evaluating RAG pipelines.
it's for the better, and i'm actually serious about this. it's just that Subbarao is ALSO right: it is neither perfect nor human-level. but it -DOES- improve results measurably and consistently.
so what i'm saying is don't throw the baby out with the bathwater. LLM as judge doesn't replace human judgement, but it's a pretty darn good first pass for how cheap it is. and you can imagine that it will get better over time.
Cursor has a neat feature where you can upload custom docs and then reference them with @Docs. I find this prevents hallucinations, as does using a reasoning model.
To be honest, this is what I assumed this repo was doing from the title. It talks about arguing with itself, but it looks like it's just generating multiple alternative responses in parallel and selecting the best one.
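For reference, the pattern I mean (sample several candidates in parallel, keep the top-scoring one) looks roughly like this. It's a minimal sketch, not the repo's actual code: `best_of_n`, `generate`, and `score` are hypothetical names, and the two `fake_*` functions are stand-ins for a model call and a judge.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def best_of_n(
    prompt: str,
    generate: Callable[[str, int], str],
    score: Callable[[str, str], float],
    n: int = 4,
) -> str:
    """Generate n candidates concurrently and return the highest-scoring one.

    Candidates never see each other, so nothing here "argues"; it is
    plain selection over independent samples.
    """
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda seed: generate(prompt, seed), range(n)))
    return max(candidates, key=lambda candidate: score(prompt, candidate))


# Stand-ins for illustration only: a real setup would call a model here.
def fake_generate(prompt: str, seed: int) -> str:
    return f"{prompt} " + "detail " * seed


def fake_score(prompt: str, candidate: str) -> float:
    return float(len(candidate))  # pretend longer means better


if __name__ == "__main__":
    print(best_of_n("Explain OAuth", fake_generate, fake_score, n=3))
```

An actual debate setup would feed each candidate back as context for a critique round, which is a different loop from this one.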
Do you find your method handles "sycophancy" well?
I stopped using ChatGPT at some point because I disliked how cagey it became about a lot of topics. I used to enjoy making it write improbable movie mashups when GPT-3 was released, and at some point it became very touchy about IP rights and violence, which was annoying.
I generally use DeepSeek nowadays, which is not sycophantic and surprisingly doesn't seem as censored to me, especially if you use a version not hosted by DeepSeek themselves.
That's a good question. Currently, there is one way to do it. The client querying the agent receives JSON-encoded values that are returned from plugin function calls made by the agent. These values are received alongside the agent's token response stream (via SSE). So plugins can essentially emit events that the client can forward to the UI application, for example to click a button. The limitation is that there is no built-in way to send a success/error status back; it's one-way only. It works well for actions that are infallible, such as simple UI actions.
The client here would also need a way to interact with the target program, of course. For example, from JavaScript in a browser you can click buttons and manipulate the DOM, and from a VS Code plugin you can interact with the editor.
It's definitely something that can be improved though! I've been thinking about some type of MCP interoperability that could maybe assist with this.
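To make the one-way flow above a bit more concrete, here is a minimal client-side sketch. The event names `token` and `plugin_value`, the `action` field, and the `handle_stream` helper are all assumptions for illustration, not the project's actual wire format; only the SSE framing (`event:` / `data:` lines, blank line between events) is standard.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from a stream of SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):  # SSE comment / keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def handle_stream(
    lines: Iterable[str],
    on_token: Callable[[str], None],
    ui_actions: Dict[str, Callable[[dict], None]],
) -> None:
    """Route tokens to the chat view and plugin values to UI handlers."""
    for event, data in parse_sse(lines):
        if event == "token":  # assumed event name
            on_token(data)
        elif event == "plugin_value":  # assumed event name
            value = json.loads(data)
            handler = ui_actions.get(value.get("action"))
            if handler is not None:
                handler(value)  # fire-and-forget: no status goes back
```

The `handler(value)` call is where the limitation shows up: whatever the handler returns or raises stays on the client, so this only suits actions that can't meaningfully fail.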
A programmer's job is to provide value to the business. Thinking is certainly a part of the process, but not the job in itself.
I agree with the initial point he's making here - that code takes time to parse mentally, but that does not naturally lead to the conclusion that this _is_ the job.
There was some interesting research published by Anthropic recently [0] which showed how university students used Claude, and it largely supports the hypothesis here. Claude was being used to complete higher-order cognitive tasks 70% of the time.
> ...it does point to the potential concerns of students outsourcing cognitive abilities to AI. There are legitimate worries that AI systems may provide a crutch for students, stifling the development of foundational skills needed to support higher-order thinking. An inverted pyramid, after all, can topple over
The second part to this is a fine example - https://goodsniff.substack.com/p/creating-bluey-tales-from-t...
I've always wondered how they managed to make the show look and feel Brisbane, and this delivers.