The difference is that you're asking it to perform one intellectual task (write ... | Hacker News

bonzini 9 months ago | parent | context | favorite | on: Anthropic Claude 3.5 can create icalendar files, s...

The difference is that you're asking it to perform one intellectual task (write a program) instead of 100 menial tasks (parse a file). To the LLM the two are the same level of complexity, so doing less work means fewer opportunities for error.

Also, the LLM is more likely to fail spectacularly by hallucinating APIs when writing a script, and more likely to fail subtly on parsing tasks.

dbaupp 9 months ago [–]

In addition to what you say, it can also be easier for an appropriately skilled human to verify a small program than to verify voluminous parsing output. Plus, as you say, there's the semi-automated "verification" of a very wrong program failing to execute.
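
To make the "small program" side concrete, here is a rough sketch of the kind of script one might ask for. Everything in it is an assumption for illustration: the `YYYY-MM-DD HH:MM | title` input format and the names `parse_line`, `to_vevent`, and `convert` are hypothetical, and the output is only a bare VEVENT fragment, not a complete iCalendar file.

```python
from datetime import datetime


def parse_line(line: str) -> tuple[datetime, str]:
    """Parse one 'YYYY-MM-DD HH:MM | title' line (hypothetical input format)."""
    stamp, title = line.split("|", 1)  # raises ValueError if there is no '|'
    return datetime.strptime(stamp.strip(), "%Y-%m-%d %H:%M"), title.strip()


def to_vevent(start: datetime, title: str) -> str:
    """Render a minimal VEVENT fragment (a real file needs more, e.g. a UID)."""
    return "\r\n".join([
        "BEGIN:VEVENT",
        f"DTSTART:{start:%Y%m%dT%H%M%S}",
        f"SUMMARY:{title}",
        "END:VEVENT",
    ])


def convert(lines: list[str]) -> list[str]:
    """Convert every non-blank input line; malformed lines raise ValueError."""
    return [to_vevent(*parse_line(line)) for line in lines if line.strip()]
```

A reviewer only has to read these few lines once, however many events go through them, and a malformed line or a call to a function that doesn't exist stops the run with an exception instead of quietly producing one wrong entry among hundreds.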
