Dmitriy Dzhanhirov: LLM-based AI used in its pure form appears to face a fundamental limit on playing strength, and at a relatively modest level.
One simple consideration is that hallucination rates in multi-step, long-horizon reasoning, which includes chess, remain significant in current models. Even if we eliminate hallucinations that break the rules, such as impossible moves or pieces appearing on nonexistent squares, the number of severe mistakes made within the rules, as seen throughout the Kaggle tournament, is unlikely to fall drastically.
I believe an important milestone for LLM-based AI will be the moment when a chatbot can defeat Turochamp — the first chess program in history, developed by Alan Turing and David Champernowne between 1948 and 1950.
Turochamp used a search depth of two ply (one move by each side), with occasional extensions, and relied on a simple but original heuristic evaluation function.
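To make the idea of a two-ply search concrete, here is a minimal sketch in Python. It is not a reconstruction of Turochamp: the function name `two_ply_search`, the toy game tree, and the scores are all hypothetical, and the occasional search extensions are not modeled. It shows only the core principle: for each candidate move, assume the opponent picks the reply that is worst for us, then choose the move whose worst case is best.

```python
# Hypothetical toy position: each state maps move names to the resulting state.
TREE = {
    "root": {"grab pawn": "a", "develop": "b"},
    "a": {"recapture": "a1", "ignore": "a2"},
    "b": {"trade": "b1", "attack": "b2"},
}
# Made-up static evaluations of the positions reached after two ply,
# always from the point of view of the side to move at "root".
SCORES = {"a1": -3, "a2": 1, "b1": 0, "b2": 2}


def legal_moves(state):
    """Return the moves available in a state (none for leaf positions)."""
    return list(TREE.get(state, {}))


def apply_move(state, move):
    """Return the state reached by playing `move` in `state`."""
    return TREE[state][move]


def evaluate(state):
    """Static heuristic score of a position; unknown positions score 0."""
    return SCORES.get(state, 0)


def two_ply_search(state, legal_moves, apply_move, evaluate):
    """Look one move ahead for each side and return (best_move, best_score)."""
    best_move, best_score = None, float("-inf")
    for move in legal_moves(state):
        after_move = apply_move(state, move)
        replies = legal_moves(after_move)
        if replies:
            # The opponent is assumed to choose the reply that minimises our score.
            score = min(evaluate(apply_move(after_move, r)) for r in replies)
        else:
            # No reply available: fall back to the static score after our move.
            score = evaluate(after_move)
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score


result = two_ply_search("root", legal_moves, apply_move, evaluate)
print(result)
```

In this toy tree, "grab pawn" looks tempting because one reply leaves a score of 1, but the opponent's best reply leaves -3, so the search prefers "develop", whose worst case is 0. A real engine would plug in chess move generation and a chess-specific evaluation in place of the dictionary lookups.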
Even this program proved too complex for the computers of the early 1950s, and later, more advanced programs overshadowed it. Today, Turochamp has been reconstructed as a playable engine. In 2012, it played a demonstration game against Garry Kasparov, in which the former world champion, playing Black, delivered checkmate on move sixteen.
Its Elo rating can be roughly estimated at 1000 or a little higher, which corresponds to an advanced beginner: a player who no longer makes the most basic mistakes.