scottcha's comments

I think there was a clarification posted on Reddit that said Claude Agents SDK didn't apply for now.

Yes, GLM5 and KimiK2.5 are pretty close replacements for Sonnet.

Haven't really tried GLM5 much but I've used 4.7 quite a bit and it was pretty far from competing with Sonnet at the time, although I saw claims online to the contrary.

What coding harness are you using? What are some example workflows you have used either model for? Have you used them only for new/simple projects, or for more complicated refactoring or architecture design?

I use OpenCode and have just started using Nanoclaw with ClaudeCode (my coworker has a post coming on this), and sometimes ClaudeCode with Claude Code Router. I do a range of small to complex work with these, but I also drop back to Claude Opus for some really complex things where I want it to be more autonomous.

I switch between Claude Code (Opus/Sonnet) and Qwen (OpenCode, OpenClaw) multiple times throughout the day, and Qwen 3.5 is really nice. I also use KimiK2.5 and GLM5 pretty often, and I'm starting to get the sense that the agent tool is becoming a little more important than the model at this level of capability, as long as tool calling and prompt quality are all configured correctly by the provider.

What are the reasons for switching? Personally I got into the habit of doing a bit of a round robin with Codex/Claude (CLI), then DeepSeek and Qwen in web chat, and Claude in web chat. I like to switch just to learn the differences; otherwise I'd never know what the other models can do. I still feel attached to Opus, but that could just be familiarity. If I only had Qwen, maybe it would be effectively identical at the end of the day. Hard to say.

Mine are pretty unique since we optimize the energy for, and run, an inference service API, so it forces me to dogfood a lot of different options.

We offer multiple SOTA models at https://portal.neuralwatt.com at very generous pricing, since we have options to bill per kWh instead of per token. Recipes for your favorite tools are here: https://github.com/neuralwatt/neuralwatt-tools

I actually built this analysis while I worked at Microsoft, so I 100% agree. Doing the work at the platform level is the way to go, and you can actually make a significant impact with this kind of approach. The other, less obvious value is that doing it client side ends up touching all the grids/generators in the world, outside of the market-based accounting that tends to drive datacenter carbon impact analysis.


There have been a few questions about the state of Show HN lately. I was actually interested in this post, but I see all the OP's responses to questions are dead? I do see it's a new account, but I don't really see anything egregious or against policy in them.


That is a pretty good article, although one factor it doesn't mention, which we see having a huge impact on energy, is batch size. That would be hard to estimate with the data he has, though.

We've only launched to friends and family, but I'll share this here since it's relevant: we have a service which actually optimizes and measures the energy of your AI use: https://portal.neuralwatt.com if you want to check it out. We also have a tools repo we put together that shows some demonstrations of surfacing energy metadata into your tools: https://github.com/neuralwatt/neuralwatt-tools/

Our underlying technology is really about OS-level energy optimization and datacenter grid flexibility, so if you are on the pay-per-kWh plan you get additional value as we continue to roll out new optimizations.

DM me with your email and I'd be happy to add some additional credits to your account.


To add a bit more to what @scottcha is saying: overall GPU load has a fairly significant impact on the energy per result. The two are inversely related: since the idle power draw of these servers is significant, the more requests the energy gets spread across, the more efficient the system becomes. I imagine Anthropic is able to harness that efficiency, since I imagine their servers are far from idle :)
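A quick back-of-the-envelope sketch of that amortization effect. All numbers here are made up for illustration only; they are not anyone's real server figures:

```python
# Illustrative sketch: energy per result falls as utilization rises,
# because the server's idle power draw is amortized across more
# concurrent requests. All constants below are hypothetical.

IDLE_POWER_W = 2000.0            # hypothetical idle draw of a GPU server
PEAK_EXTRA_W = 6000.0            # hypothetical additional draw at full load
RESULTS_PER_SEC_AT_FULL = 100.0  # hypothetical throughput at 100% load

def energy_per_result_j(load: float) -> float:
    """Joules per result at a given utilization (0 < load <= 1)."""
    power_w = IDLE_POWER_W + PEAK_EXTRA_W * load
    throughput = RESULTS_PER_SEC_AT_FULL * load
    return power_w / throughput

for load in (0.1, 0.5, 1.0):
    print(f"load {load:.0%}: {energy_per_result_j(load):.0f} J/result")
```

Even though total power rises with load, joules per result drop sharply, which is the efficiency a heavily loaded fleet can capture.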


You can infer the discount from the pricing of the batch API, which is presumably arranged for minimum inference costs. Anthropic offers a 50% discount there, which is consistent with other model providers.


Neuralwatt | https://neuralwatt.com | REMOTE (US – Seattle/Denver/Boulder metros only) | Full-time | $180k–$220k DOE

Energy is the #1 constraint in new datacenter buildouts. Neuralwatt is reshaping AI compute around energy efficiency to maximize revenue per kilowatt. We’re a VC-backed, early-stage startup building optimization tools for AI, HPC, and datacenter workloads.

We're hiring 2 experienced founding engineers to help architect our core systems and work directly with customers.

What you'll do:

Technically:

- Architect critical datacenter infrastructure

- Write Rust and Python

- Measure real-world energy impact

- Design state-of-the-art AI-led optimizations

Non-technically:

- Help build the business and win customers

- Present at conferences

- Develop marketing and company materials

Requirements:

- 5–10+ years of software development experience

- Thrive in ambiguous, outcome-driven environments

- Experience working closely with customers

- Clear communication and strong leadership

- Familiarity with LLM/AI infrastructure

Location: Remote-first, but we meet regularly in Seattle/Denver metro areas.

To apply: email scott@neuralwatt.com with the subject "HN Hiring" and include:

- Resume

- GitHub profile

- A short note on why you're interested

Please note: At this time, we are unable to offer visa sponsorship.



I’ve asked that question on LinkedIn to the Cerebras team a couple of times and haven’t ever received a response. There are system max TDP values posted online, but I’m not sure you can assume the system is running at max TDP for these queries. If it is, the numbers are quite high (I just tried to find the figure again but couldn’t; I had it in my notes as 23 kW).
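For a sense of scale, here's a back-of-envelope conversion from a 23 kW system figure to energy per query. The throughput and response length are purely assumed, and this attributes the whole system's draw to a single query stream, so treat it as an upper-bound sketch, not a measurement:

```python
# Back-of-envelope: energy per query if a system really drew its full
# TDP while serving. Throughput and query size are assumptions.

SYSTEM_POWER_W = 23_000    # max-TDP figure mentioned above
TOKENS_PER_SEC = 2_000     # assumed aggregate output throughput
TOKENS_PER_QUERY = 500     # assumed response length

seconds_per_query = TOKENS_PER_QUERY / TOKENS_PER_SEC
energy_j = SYSTEM_POWER_W * seconds_per_query
energy_wh = energy_j / 3600

print(f"{energy_j:.0f} J (~{energy_wh:.2f} Wh) per query at full TDP")
```

If the system were serving many queries concurrently, the per-query share would be proportionally lower, which is why the max-TDP assumption matters so much.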

If someone from Cerebras is reading this, feel free to DM me, as optimizing this kind of power is what we do.


23kw gotdamn

