meander_water · 2025-05-03T04:29:20 1746246560

As someone who struggles to make anything that looks good, I am fascinated by designers' ability to take a brief and bring it to life using their own unique artistic voice.

The second part to this is a fine example - https://goodsniff.substack.com/p/creating-bluey-tales-from-t...

I've always wondered how they managed to make the show look and feel like Brisbane, and this delivers.

meander_water · 2025-05-01T22:25:19 1746138319

Looks like this is possible due to the relatively recent addition of OAuth2.1 to the MCP spec [0] to allow secure comms to remote servers.

However, there's a major concern that server hosters are on the hook to implement authorization. Ongoing discussion here [1].

[0] https://modelcontextprotocol.io/specification/2025-03-26

[1] https://github.com/modelcontextprotocol/modelcontextprotocol...

marifjeren · 2025-05-02T00:50:40 1746147040

That github issue is closed but:

> major concern that server hosters are on the hook to implement authorization

Doesn't it make perfect sense for server hosters to implement that? If Claude wants access to my Jira instance on my behalf, and Jira hosts a remote MCP server that aids in exposing the resources I own, isn't it obvious Jira should be responsible for authorization?

How else would they do it?

halter73 · 2025-05-02T19:05:29 1746212729

That GitHub issue is closed because it's been mostly completed. As of https://github.com/modelcontextprotocol/modelcontextprotocol..., the latest draft specification does not require the resource server to act as or proxy to the IdP. It just hasn't made its way to a ratified spec yet, but SDKs are already implementing the draft.

cruffle_duffle · 2025-05-02T01:14:36 1746148476

The authorization server and resource server can be separate entities. Meaning that jira instance can validate the token but not be the one issuing it or handling credentials.
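A minimal sketch of that split, assuming an introspection-style check in which the resource server asks the authorization server whether a token is active. The class names, the `issues:read` scope, and the sample data are hypothetical and not taken from the MCP spec; a real deployment would use an actual IdP and transport rather than in-process objects.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AuthorizationServer:
    """Hypothetical IdP stand-in: the only party that sees credentials."""
    users: dict[str, str]  # username -> password (plain text only for this sketch)
    _tokens: dict[str, dict] = field(default_factory=dict)

    def issue_token(self, username: str, password: str, scope: str, ttl: int = 3600) -> str:
        if self.users.get(username) != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_urlsafe(32)  # opaque bearer token
        self._tokens[token] = {"sub": username, "scope": scope, "exp": time.time() + ttl}
        return token

    def introspect(self, token: str) -> dict:
        claims = self._tokens.get(token)
        if claims is None or claims["exp"] < time.time():
            return {"active": False}
        return {"active": True, **claims}


@dataclass
class ResourceServer:
    """Hypothetical resource server: validates tokens, never issues them."""
    introspect: Callable[[str], dict]  # provided by whichever IdP is configured
    issues: dict[str, list[str]]       # owner -> issue keys

    def get_issues(self, token: str) -> list[str]:
        info = self.introspect(token)
        if not info["active"] or "issues:read" not in info["scope"].split():
            raise PermissionError("token rejected")
        return self.issues.get(info["sub"], [])
```

The resource server only holds a reference to `introspect`, so swapping in a different authorization server does not change its code.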

marifjeren · 2025-05-02T02:34:39 1746153279

Yes, this is true of OAuth, which is exactly what the latest Model Context Protocol is using. What's the concern again?

I guess maybe you are saying the onus is NOT on the MCP server but on the authorization server.

Anyway while technically true this is mostly just distracting because:

1. in my experience the resource server and the authorization server are almost always maintained by the same company -- Jira/Atlassian being an example

2. the resource server still minimally has the responsibility of identifying and integrating with some authorization server, and *someone* has to be the authorization server, so I'm not sure deferring the responsibility to that unidentified party is a strong defense against the critique anyway. The strong defense is: of course the MCP server should have these responsibilities.

meander_water · 2025-05-02T03:17:03 1746155823

I think the pain points will be mostly for enterprise customers who want to integrate servers into their auth systems.

For example, say you have a self-hosted Jira instance with SSO to Entra ID. You can't just install an MCP server off the shelf because authZ and resources are tightly coupled and implementation specific. It would be much easier if the server only handled providing resources, and authZ was offloaded to a provider of your choosing.

marifjeren · 2025-05-02T03:26:36 1746156396

I'm under the impression that what you described is exactly how the new model context protocol works, since it's using oauth and is therefore unaware of any of the authentication (eg SSO) details. Your authentication process could be done via carrier pigeon and Claude would be none the wiser.

dmarble · 2025-05-01T22:50:03 1746139803

Direct link to the spec page on authorization: https://modelcontextprotocol.io/specification/2025-03-26/bas...

Source: https://github.com/modelcontextprotocol/modelcontextprotocol...

meander_water · 2025-04-30T04:15:46 1745986546

For better or worse this has become the de facto standard in LLM evaluation research papers since the LLM-as-a-Judge paper [0] came out. It's also heavily embedded into frameworks like LangChain and LlamaIndex to evaluate RAG pipelines.

[0] https://arxiv.org/abs/2306.05685

[1] https://arxiv.org/abs/2411.15594

swyx · 2025-04-30T05:52:49 1745992369

its for the better, and i'm actually serious about this. it's just that Subbarao is ALSO right and it is not perfect nor human level. but it -DOES- improve results measurably and consistently.

so what i'm saying is don't throw the baby out with the bathwater. LLM as judge doesnt replace human judgement but its a pretty darn good first pass for how cheap it is. and you can imagine that it will get better over time.

meander_water · 2025-04-29T23:31:30 1745969490

Cursor has a neat feature where you can upload custom docs, and then reference them with @Docs. I find this prevents hallucinations, as does using a reasoning model.

meander_water · 2025-04-29T23:29:06 1745969346

To be honest, this is what I assumed this repo was doing from the title. It talks about arguing with itself, but it looks like it's just generating multiple alternative responses in parallel and selecting the best one.

Do you find your method handles "sycophancy" well?
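For reference, the "generate several alternatives in parallel and pick the best" pattern described above can be sketched roughly as follows. This is not the repo's code; `generate` and `score` are hypothetical callables standing in for a model call and a ranking function.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def best_of_n(
    prompt: str,
    generate: Callable[[str, int], str],   # hypothetical: (prompt, seed) -> response
    score: Callable[[str, str], float],    # hypothetical: (prompt, response) -> quality
    n: int = 4,
) -> str:
    """Produce n candidate responses concurrently and return the top-scoring one."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda seed: generate(prompt, seed), range(n)))
    return max(candidates, key=lambda candidate: score(prompt, candidate))
```

Nothing here makes the candidates critique each other; the only interaction between them is the final `max`.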

StopDisinfo910 · 2025-04-30T05:44:14 1745991854

I don’t really know.

I stopped using ChatGPT at some point because I disliked how cagey it became about a lot of topics. I used to enjoy making it write improbable movie mashups when GPT-3 was released, and at some point it became very touchy about IP rights and violence, which was annoying.

I generally use Deepseek nowadays which is not sycophantic and surprisingly doesn’t seem as censored to me especially if you use a version not hosted by Deepseek themselves.

lblume · 2025-04-30T19:37:46 1746041866

Which hosting service would you recommend?

meander_water · 2025-04-28T13:53:09 1745848389

Not fiction, but an amazing piece of work nonetheless which shifted my worldview somewhat to a more hopeful one.

Humankind by Rutger Bregman

He's also written Utopia for Realists, which is on my to read list.

meander_water · 2025-04-28T08:05:35 1745827535

This looks interesting. How do you plan to handle agents which operate apps with a UI, for example Playwright, Obsidian, etc.? Or is this out of scope?

rellfy · 2025-04-28T08:19:56 1745828396

Thanks!

That's a good question. Currently, there is one way to do it. The client querying the agent receives JSON-encoded values that are returned from plugin function calls made by the agent. These values are received alongside the agent token response stream (via SSE). So plugins can essentially emit events that the client can forward to the UI application, such as to click a button etc. The limitation with this is that there is no built-in way to send a success/error status back, it's one way only. It works well for actions that are infallible such as simple UI actions.

The client here would also need a way to interact with the target program of course, e.g. from a JavaScript browser you can click buttons and manipulate the DOM, or from a VSCode Plugin you can interact with the editor etc.

It's definitely something that can be improved though! I've been thinking about some type of MCP interoperability that could maybe assist with this.
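A rough client-side sketch of the one-way flow described above, assuming the stream uses standard SSE framing with `token` events for model output and `plugin` events carrying JSON values. Those event names and the payload shape are assumptions for illustration, not the project's actual wire format.

```python
import json
from typing import Callable, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from raw SSE lines; a blank line ends an event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].removeprefix(" "))


def handle_stream(
    lines: Iterable[str],
    on_token: Callable[[str], None],
    on_ui_action: Callable[[dict], None],
) -> None:
    """Route token text to the chat view and plugin values to the UI layer."""
    for event, data in parse_sse(lines):
        if event == "plugin":      # hypothetical event name
            on_ui_action(json.loads(data))
        elif event == "token":     # hypothetical event name
            on_token(data)
```

`on_ui_action` is where a browser client would click the button or a VSCode plugin would touch the editor. Its return value is discarded, which mirrors the limitation that no success/error status goes back to the agent.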

meander_water · 2025-04-27T22:15:09 1745792109

A programmer's job is to provide value to the business. Thinking is certainly a part of the process, but not the job in itself.

I agree with the initial point he's making here - that code takes time to parse mentally, but that does not naturally lead to the conclusion that this _is_ the job.

meander_water · 2025-04-27T13:09:04 1745759344

This is really neat, love the idea!

andersmyrmel · 2025-04-27T21:00:25 1745787625

Thank you!

meander_water · 2025-04-26T10:51:41 1745664701

There was some interesting research published by Anthropic recently [0] which showed how university students used Claude, and it largely supports the hypothesis here. Claude was being used to complete higher order cognitive thinking tasks 70% of the time.

> ...it does point to the potential concerns of students outsourcing cognitive abilities to AI. There are legitimate worries that AI systems may provide a crutch for students, stifling the development of foundational skills needed to support higher-order thinking. An inverted pyramid, after all, can topple over

[0] https://www.anthropic.com/news/anthropic-education-report-ho...