
My tester (who beats the average person and gets wrecked at the chess club, but is trying to change that) could easily trounce a 2-ply minimax engine. At 6 plies with the alpha-beta optimization, the engine beat him for the first time, which frustrated him greatly, but after he spent a day thinking about strategy he prevailed. (Without alpha-beta, the 6-ply search would have been far too slow to be practical.)
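For anyone wondering why alpha-beta matters so much at depth 6, here is a minimal sketch on a toy tree (nested lists with made-up leaf scores, not real chess positions). Both functions return the same value, but alpha-beta looks at fewer leaves. The function names and the `counter` argument are only for illustration, not taken from my engine.

```python
import math

def minimax(node, maximizing, counter):
    """Plain minimax; leaves are ints scored from the maximizer's point of view."""
    if isinstance(node, int):
        counter[0] += 1          # count every leaf we evaluate
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    """Same result as minimax, but skips branches that cannot change it."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:    # the minimizer above already has something better
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:        # the maximizer above already has something better
            break
    return best

# A 2-ply toy tree: three moves for us, three replies each.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

plain_count, pruned_count = [0], [0]
plain_value = minimax(TREE, True, plain_count)
pruned_value = alphabeta(TREE, True, pruned_count)
print(plain_value, plain_count[0])    # 3 9
print(pruned_value, pruned_count[0])  # 3 7
```

The saving looks small here, but it compounds with depth, and it gets better the earlier the good moves are tried.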

I got the signs wrong and it managed to fool's mate itself!
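One place where a sign error like this can creep in is the negamax formulation, where every recursive call has to negate the child's score. The sketch below assumes that setup (it may not be exactly where my bug was) and uses a toy tree with made-up scores. The `flip_sign` switch is hypothetical and exists only to reproduce the bug.

```python
import math

def negamax(node, color, flip_sign=True):
    """Return the score from the side to move's point of view.

    Leaves are ints scored from White's point of view; color is +1 for
    White to move and -1 for Black to move.
    """
    if isinstance(node, int):
        return color * node
    best = -math.inf
    for child in node:
        score = negamax(child, -color, flip_sign)
        if flip_sign:
            score = -score       # the child's score is from the opponent's view
        best = max(best, score)
    return best

def best_move(tree, flip_sign=True):
    """Index of the root move White would pick."""
    scores = []
    for child in tree:
        score = negamax(child, -1, flip_sign)
        scores.append(-score if flip_sign else score)
    return scores.index(max(scores))

# Three candidate moves; the opponent's best replies leave us with 3, 2 and 1.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 1]]

correct_choice = best_move(TREE, flip_sign=True)
buggy_choice = best_move(TREE, flip_sign=False)
print(correct_choice)  # 0
print(buggy_choice)    # 2
```

Without the negation the engine assumes the opponent will cooperate, so it picks the move with the highest best case (14), which is actually the worst move once the opponent replies properly.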

I struggled with testing for a while because when it makes bad moves you don't know if you correctly coded a bad chess engine or incorrectly coded a bad chess engine. Eventually I started using chess puzzles

https://www.chessprogramming.org/Test-Positions

which are not unit tests because they take seconds to run, but unlike a real game, where there is often no single right move, a puzzle really does have a right solution. BK.01 from

https://www.chessprogramming.org/Bratko-Kopec_Test

is a particularly nice one because it runs quickly!
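A rough sketch of what such a puzzle test can look like, assuming the positions come as EPD lines with a `bm` (best move) operation. The EPD string below is BK.01 as I remember it from that page, so check it against the source before relying on it. `pick_move` is a hypothetical stand-in for whatever your engine exposes, and the parser is deliberately naive (it does not handle semicolons inside quoted operands).

```python
def parse_epd(line):
    """Split an EPD line into its 4-field position and a dict of operations."""
    fields = line.split(None, 4)
    position = " ".join(fields[:4])   # board, side to move, castling, en passant
    operations = {}
    for op in fields[4].split(";"):
        op = op.strip()
        if not op:
            continue
        opcode, _, operand = op.partition(" ")
        operations[opcode] = operand.strip().strip('"')
    return position, operations

def solves_puzzle(line, pick_move):
    """True if the move chosen by pick_move(position) is one of the bm moves."""
    position, operations = parse_epd(line)
    best_moves = operations["bm"].split()
    return pick_move(position) in best_moves

BK_01 = '1k1r4/pp1b1R2/3q2pp/4p3/2B5/4Q3/PPP2B2/2K5 b - - bm Qd1+; id "BK.01";'

position, operations = parse_epd(BK_01)

# Stand-in "engines" that ignore the position, just to exercise the harness.
good_result = solves_puzzle(BK_01, lambda pos: "Qd1+")
bad_result = solves_puzzle(BK_01, lambda pos: "Qd4")
```

In a real harness `pick_move` would run the search to a fixed depth or time limit and return the move in the same notation the EPD file uses.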



Yeah, a 2-ply engine is pretty terrible at chess. Especially with no quiescence search.
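For readers who haven't met quiescence search: the idea is to keep searching captures at the leaves so the engine doesn't stop evaluating in the middle of an exchange. A minimal negamax-style sketch follows, using hypothetical dict positions with made-up centipawn values rather than a real board. The `"eval"` and `"captures"` keys are assumptions made for the example.

```python
import math

def quiesce(pos, alpha, beta):
    """Fail-hard quiescence search over capture moves only.

    pos["eval"] is the static evaluation from the side to move's point of
    view; pos["captures"] lists the positions reachable by a capture.
    """
    stand_pat = pos["eval"]          # option of not capturing anything
    if stand_pat >= beta:
        return beta
    alpha = max(alpha, stand_pat)
    for child in pos.get("captures", []):
        score = -quiesce(child, -beta, -alpha)
        if score >= beta:
            return beta
        alpha = max(alpha, score)
    return alpha

# Our queen has just grabbed a defended pawn. The opponent is to move and,
# statically, is a pawn down (-100), but can recapture the queen.
after_recapture = {"eval": -800, "captures": []}   # our view: queen lost for a pawn
after_pawn_grab = {"eval": -100, "captures": [after_recapture]}

static_score = after_pawn_grab["eval"]
quiet_score = quiesce(after_pawn_grab, -math.inf, math.inf)
print(static_score, quiet_score)  # -100 800
```

A fixed-depth search that stops right after the pawn grab sees the opponent at -100; with quiescence the same leaf is scored +800 for the opponent, which is why the engine stops hanging its queen.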

I know what you're describing well; I've dabbled quite a bit in chess engine dev myself, and I'm planning to get back into it soon. I've recently had some interesting new ideas I wanna try out (once they're fleshed out enough to actually be implemented; right now they're just fanciful ideas I'm kicking around in my head).

Testing is a bitch though, for sure. I know that Stockfish is constantly being playtested against itself, with a new instance spawned for every pull request etc., and then given an Elo rating. That way they can tell if a potential change makes it weaker or stronger.
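As a rough illustration of turning self-play results into a rating difference, here is a sketch using the standard logistic Elo formula. This is not a claim about how Stockfish's own testing infrastructure does its statistics; the function name and the win/draw/loss inputs are made up for the example.

```python
import math

def elo_difference(wins, draws, losses):
    """Estimated Elo difference of the new version over the old one.

    Uses the standard logistic model: expected score = 1 / (1 + 10^(-d/400)).
    """
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("score of 0% or 100% gives an unbounded Elo difference")
    return -400.0 * math.log10(1.0 / score - 1.0)

even_match = elo_difference(wins=40, draws=20, losses=40)
strong_patch = elo_difference(wins=3, draws=0, losses=1)
print(round(strong_patch, 1))  # 190.8
```

A point estimate like this says nothing about noise, so in practice you need a lot of games (or a proper statistical stopping rule) before trusting a small difference.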

Debugging isn't easy either. Forget about stepping through code in the debugger. You have no idea whether the bug is only triggered after billions of nodes. That's a lot of stack frames to step through. And forget about debug prints too, for the most part, because putting an unconditional debug print in your search(), qsearch() or eval() will quickly lead to gigabytes and gigabytes of output...
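The "for the most part" is because a print gated on some condition can still be workable. A minimal sketch, assuming you keep a node counter and only log inside a small window of node numbers; the `NodeTracer` class and the toy loop are hypothetical, not from any real engine.

```python
class NodeTracer:
    """Collects debug lines only for nodes numbered in [start, stop)."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.nodes = 0
        self.lines = []

    def visit(self, message):
        self.nodes += 1
        if self.start <= self.nodes < self.stop:
            self.lines.append(f"node {self.nodes}: {message}")

tracer = NodeTracer(start=500, stop=503)

# Stand-in for a search: pretend every loop iteration is one visited node.
for depth in range(1000):
    tracer.visit(f"depth={depth % 6}")

for line in tracer.lines:
    print(line)
```

This only helps once you already know roughly which node numbers are interesting, for example from a failed assert.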

Only helpful thing I found was to use asserts. Find invariants, and in your debug version check them every node, die if they don't hold and barf out your stack frame or a core dump. If you're lucky the bug is somewhere near where the assert failed in the call tree. Even that isn't guaranteed though.
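A small sketch of the kind of invariant I mean, assuming the engine keeps an incrementally updated material score that should always match a from-scratch recount. The `Board` class, piece values and square names here are made up for the example. Note that Python strips `assert` statements under `python -O`, which conveniently plays the role of the release build.

```python
PIECE_VALUES = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}

class Board:
    """Toy board: square -> piece letter (uppercase White, lowercase Black)."""

    def __init__(self, pieces):
        self.pieces = dict(pieces)
        self.material = self.recompute_material()   # White minus Black

    def recompute_material(self):
        total = 0
        for piece in self.pieces.values():
            value = PIECE_VALUES[piece.upper()]
            total += value if piece.isupper() else -value
        return total

    def capture(self, src, dst):
        captured = self.pieces.pop(dst)
        self.pieces[dst] = self.pieces.pop(src)
        value = PIECE_VALUES[captured.upper()]
        # Losing a Black piece raises the score; losing a White piece lowers it.
        self.material += value if captured.islower() else -value
        return captured

    def undo_capture(self, src, dst, captured):
        self.pieces[src] = self.pieces.pop(dst)
        self.pieces[dst] = captured
        value = PIECE_VALUES[captured.upper()]
        self.material -= value if captured.islower() else -value

    def check_invariants(self):
        # Call this at every node in the debug build.
        assert self.material == self.recompute_material(), (
            f"material drifted: {self.material} != {self.recompute_material()}"
        )

board = Board({"e4": "Q", "d5": "p", "a8": "r"})
start_material = board.material
captured = board.capture("e4", "d5")
board.check_invariants()
after_capture = board.material
board.undo_capture("e4", "d5", captured)
board.check_invariants()
```

The recount is slow, but that is fine in a debug build; the point is to die at the first node where the incremental state and the ground truth disagree.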



