Hacker Newsnew | past | comments | ask | show | jobs | submit | haniehz's commentslogin

based on the article, it seems like a good reasoning model like gpt5 or opus 4.1 might be good choices for the planner. I wonder if the gpt oss reasoning models would do well


Personally been using GPT-OSS-120b locally with reasoning_effort set to `high` and it blows pretty much every other local model out of the water, but takes a lot of time for it to eventually do a proper content reply. But for fire-and-forget jobs like "Create a well-researched report on X from perspective Y" it works really well.


what machine are you running GPT-OSS-120B on? I'm currently only able to get GPT-OSS-20B working on my macbook using Ollama


Gemini 2.5 Pro is also a great reasoning model, I still prefer it over GPT 5


Gemini is great, it's just incredibly clumsy at tool use and that's why it fails so often in practice. I'm looking forward to the next version, it will for sure address it, it's a big issue internally too (I'm a recent xoogler).


Yes it really is horrible at using tools. Codex is way better (even better than Claude code ). Gemini is great at doing audits and content (though I’ve switched to codex for everything all in one).


Can you elaborate on “clumsy at tool use”?


have you ever witnessed how sometimes Gemini makes multiple attempts at writing a file only to give up and start chanting "I'm worthless...".

That's tool use failure :)


I'm excited for the next version!


pretty soon! I think it's already happening. Just a matter of time for people to adapt.


I fed ElevenLabs Music a single prompt about our open-source MCP agent framework and got back a complete song: vocals, instrumentation, arrangement, the works. Zero post-processing.

Here's what caught me off guard: the vocal phrasing. Not just the melody, but the micro-timing, breath placement, and emotional inflection. The model placed emphasis on "composable" in a way that actually reinforced the technical meaning. It added vocal runs that felt intentional, not algorithmic.

Technical details that worked:

Prompt structure: [Genre] [Mood] [Key technical terms] [Narrative structure] Generated: 2:04 track with verse/chorus/bridge structure Quality: Comparable to demo-level indie recordings

What this means: Voice synthesis was the laggard in generative AI. That's changing rapidly. We're moving from "impressive for AI" to "actually usable in production workflows." Non-English limitations: I tested it with different languages and hit a wall — very patchy results, nowhere near the English quality. Anyone have experience with non-English lyrics? Curious about phoneme handling across languages.

The gap between human and AI musical performance is shrinking faster than I expected. Worth paying attention to.


Watching LLMs mimic reasoning so well definitely makes you wonder how much of our own thinking is just pattern prediction with extra steps.


How are you thinking about scaling topic and source curation without turning into another aggregator that makes assumptions for the user?


Great question and those are solid tools! Where this differs is the orchestration. Instead of switching tabs or manually checking five sources, this bundles everything into one interactive, LLM-assisted report, plus it remembers your preferences, investment style, and context. You can even swap out GPT for a local model if you’re privacy-conscious or budget-sensitive.


I didn’t build this to beat the market...it started as a way to reduce my own decision fatigue and make sure I wasn’t missing obvious signals. That said, it’s more of a research agent than a trading bot. For ROI, it’s not auto-trading, but it has helped me avoid a few bad calls and made my reports more consistent and thorough.


Very nice.


I think big orgs are just going to start building their tools in house!


Thanks, I'll go check it out! But, totally agree. The idea of spinning up personal apps on demand is a huge shift. Why download, install, and learn a tool when you can just ask for what you need and get it instantly?


Might even end the Apple/Google hegemony. Come thinking of it more likely Google gonna be quick on this ball when they realize the potential, while Apple will take its time. Would excellent to see some alternatives though, Samsung and Huawei maybe? Maybe an up-and-comer?


Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: