Number of tokens is still a useful metric, as their endpoints have Tokens Per Mi... | Hacker News


throwaway74432 on April 4, 2024 | on: Improvements to the fine-tuning API and expanding ...

The number of tokens is still a useful metric, as their endpoints have Tokens Per Minute (TPM) quotas. Decreasing the number of tokens used per request increases throughput, up until you hit the Requests Per Minute (RPM) quota.
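A rough sketch of that arithmetic, assuming each request uses about the same number of tokens. The function name and the quota numbers below are made up for illustration and are not actual OpenAI limits:

```python
def max_requests_per_minute(tpm_quota: int, rpm_quota: int, tokens_per_request: int) -> float:
    """Upper bound on requests per minute when both quotas apply.

    Assumes every request consumes roughly `tokens_per_request` tokens.
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    # The token quota allows tpm_quota / tokens_per_request requests per minute;
    # the request quota caps that number at rpm_quota.
    return min(rpm_quota, tpm_quota / tokens_per_request)


# Hypothetical quotas: 60,000 tokens per minute and 100 requests per minute.
for tokens in (2_000, 1_000, 500, 250):
    print(tokens, max_requests_per_minute(60_000, 100, tokens))
```

With these made-up numbers, halving tokens per request from 2,000 to 1,000 doubles the ceiling, but going from 500 to 250 changes nothing because the RPM quota is already the limiting factor.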

afro88 on April 4, 2024

But the sentence was "Indeed was able to improve cost..."

mechagodzilla on April 4, 2024

From OpenAI’s perspective the cost improved!

throwaway74432 on April 4, 2024

"...and latency" Higher throughput = lower latency*

*Under certain conditions
