
Its AMC 12 scores aren't awful. It's at roughly the 50th percentile for the AMC, which (given who takes the AMC) probably puts it in the top 5% or so of high school students in math ability. Its AMC 10 score being dramatically lower is pretty bad though...



> Its AMC 12 scores aren't awful.

A blank test scores 37.5 (25 unanswered questions at 1.5 points each).

The best score, 60, is 5 correct answers + 20 blank answers; or 6 correct answers plus 19 random guesses, of which 4 are correct and 15 incorrect (20% chance of a correct guess).
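To make the arithmetic above easy to check, here is a minimal sketch of the scoring rule this comment assumes: 25 questions, 6 points per correct answer, 1.5 per blank, 0 per wrong answer. The function name `amc_score` is made up for illustration.

```python
# Scoring rule assumed in this thread: 25 questions,
# 6 points per correct answer, 1.5 per blank, 0 per wrong answer.
QUESTIONS = 25
POINTS_CORRECT = 6.0
POINTS_BLANK = 1.5


def amc_score(correct: int, blank: int) -> float:
    """Score for a sheet with the given counts; the rest are wrong answers."""
    wrong = QUESTIONS - correct - blank
    if correct < 0 or blank < 0 or wrong < 0:
        raise ValueError("counts must be non-negative and sum to at most 25")
    return POINTS_CORRECT * correct + POINTS_BLANK * blank


print(amc_score(0, 25))  # blank test -> 37.5
print(amc_score(5, 20))  # 5 correct, 20 blank -> 60.0
print(amc_score(10, 0))  # 6 solved + 4 lucky guesses, 15 wrong -> 60.0
```

Both routes to 60 come out the same, which is why the score alone can't tell you whether the model skipped questions or guessed.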

The 5 easiest questions are relatively simple calculations, once the parsing task is achieved (example: https://artofproblemsolving.com/wiki/index.php/2022_AMC_12A_... ), so the main factor in that score is how good GPT is at declining to answer a question, or at doing a bit better than chance to overcome the guessing penalty.
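The "guessing penalty" can be put in numbers. This is a rough sketch under the same assumed scoring (6 points correct, 1.5 blank, 0 wrong, 5 answer choices); the function names are hypothetical.

```python
# Assumed scoring: 6 points correct, 1.5 blank, 0 wrong, 5 answer choices.
POINTS_CORRECT = 6.0
POINTS_BLANK = 1.5
CHOICES = 5


def expected_guess_points(choices: int = CHOICES) -> float:
    """Expected points from one uniformly random guess."""
    return POINTS_CORRECT / choices


def break_even_probability() -> float:
    """Chance of being right at which answering matches leaving a blank."""
    return POINTS_BLANK / POINTS_CORRECT


print(expected_guess_points())   # expected points per blind guess
print(break_even_probability())  # accuracy needed to match a blank
```

A blind guess is worth less on average than the 1.5 points for a blank, so answering only pays off when the chance of being right is above the break-even value, which is higher than the 20% you get from pure chance.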

> Its AMC 10 score being dramatically lower is pretty bad though...

All versions (scoring 30 and 36) scored worse than leaving the test blank.

The only explanation I can imagine for that is that it can't understand diagrams.

It's also unclear whether the AMC performance is based on the English version or the computer-encoded version from this benchmark set: https://arxiv.org/pdf/2109.00110.pdf https://openai.com/research/formal-math

AMC/AIME and even to some extent USAMO/IMO problems are hard for humans because they are time-limited and closed-book. But they aren't conceptually hard: they are solved by applying a subset of a known set of theorems a few times to the input data.

The hard part of math, for humans, is ingesting data into their brains, retaining it, and searching it. Humans are bad at memorizing large databases of symbolic data, but that's trivial for a large computer system.

An AI system has a comprehensive library and high-speed search algorithms.

Can someone who pays $20/month please post some sample AMC10/AMC12 Q&A?



