Solving Chess with Regex

William Forty

The challenge: given an 8x8 chess board, with the black king somewhere on it and some number of white pieces scattered around, determine whether the king is in check. The constraint I imposed on myself: solve it with a single regular expression.

This is the kind of constraint that should not work. Regex matches strings. A chess board is a 2D structure with edges, diagonals, knight jumps, and direction-dependent threats. The sensible solution is to parse the board into a grid, find the king, then walk outward in each of the eight directions checking for the appropriate piece type. A hundred lines of code, a switch statement, done.

I did not do that.

The premise

Flatten the board into a single string, row by row. The black king is k. White pieces are uppercase. Empty squares are spaces. The board becomes a 64-character string (plus padding, which we'll get to). Question: can a regex detect threat geometry inside that string?
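As a minimal sketch of the flattening step, assuming the board arrives as eight 8-character strings (the `flatten` helper and the input format are illustrative, not taken from the original solution):

```javascript
// Assumed input format (not from the original post): eight strings of
// eight characters, one per row, in the order the rows are concatenated.
function flatten(rows) {
  if (rows.length !== 8 || rows.some((row) => row.length !== 8)) {
    throw new Error("expected 8 rows of 8 squares");
  }
  return rows.join("");
}

// Build an empty board, then drop a king on row 3, column 4.
const rows = Array.from({ length: 8 }, () => " ".repeat(8).split(""));
rows[3][4] = "k";
const flat = flatten(rows.map((row) => row.join("")));

// Square (row, col) always lands at index row * 8 + col.
console.log(flat.length);       // 64
console.log(flat.indexOf("k")); // 28
```

That index formula is what turns board geometry into string offsets in everything that follows.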

It can. Once you spot the trick, every piece type collapses into a fixed-offset pattern.

Pawns: the easy case that breaks first

A white pawn can only threaten the black king from two squares: diagonally forward-left and forward-right. Lay the board out in a single line and the forward-left pawn sits with 8 characters between it and the king, the forward-right with 6.

First attempt:

P.{8}k

A P, any eight characters, then the king. Match means check.

This works in the middle of the board. It breaks at the edges. Because we've concatenated everything into one string with no notion of rows or columns, a pawn on the far edge of an earlier row still matches as a "diagonal" threat. The regex doesn't know where one row ends and the next begins. As far as it's concerned, character 7 of row 1 is right next to character 0 of row 2.
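The failure is easy to reproduce. The sketch below uses a hypothetical `flatBoard` helper that places pieces by (row, column) on an unpadded 64-character string, then runs the naive pattern against a genuine check and against a wrapped one:

```javascript
// Hypothetical helper: build a flat 64-character board from
// [row, col, piece] triples, with spaces for empty squares.
function flatBoard(pieces) {
  const squares = Array(64).fill(" ");
  for (const [row, col, piece] of pieces) squares[row * 8 + col] = piece;
  return squares.join("");
}

const naivePawn = /P.{8}k/;

// Pawn one row back and one column left of the king: a real check.
const realCheck = flatBoard([[2, 3, "P"], [3, 4, "k"]]);

// Pawn on the right edge, king on the left edge two rows later:
// nowhere near each other on the board, but 9 indices apart in the string.
const wrapped = flatBoard([[1, 7, "P"], [3, 0, "k"]]);

console.log(naivePawn.test(realCheck)); // true
console.log(naivePawn.test(wrapped));   // true (false positive)
```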

The padding trick

Insert artificial padding between rows. If every row is followed by some non-board characters — call them # — then a pawn trying to threaten the king from the wrong side of the board gets a # in the way and the regex fails to match.

You only need padding on one side of each row. When the rows concatenate, the padding from the right of row N sits directly adjacent to where the left of row N+1 would be. Same #s double as both right-edge and left-edge guard.

Seven characters is enough. Not eight, because a diagonal threat that would have wrapped exactly eight characters misses the corner anyway — the wrap point doesn't land on a real square. Seven holds.
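A rough sketch of the padding on its own, before any mirroring. The `paddedBoard` helper is hypothetical, and the 15-character offset is derived for this intermediate layout (8 squares plus 7 guard characters per row) rather than taken from the original solution:

```javascript
const PAD = "#".repeat(7);

// Hypothetical helper: every row is followed by 7 '#' guard characters,
// so each row occupies 15 characters of the string.
function paddedBoard(pieces) {
  const rows = Array.from({ length: 8 }, () => Array(8).fill(" "));
  for (const [row, col, piece] of pieces) rows[row][col] = piece;
  return rows.map((row) => row.join("") + PAD).join("");
}

// With a 15-character stride, a pawn one row back and one column left
// has 15 characters between it and the king.
const paddedPawn = /P.{15}k/;

const realCheck = paddedBoard([[2, 3, "P"], [3, 4, "k"]]);
const wrapped = paddedBoard([[1, 7, "P"], [3, 0, "k"]]);

console.log(paddedPawn.test(realCheck)); // true
console.log(paddedPawn.test(wrapped));   // false: the wrap now lands on '#'
```

For a king on the left edge, the square the pattern inspects is a guard character, so no piece can ever occupy it.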

Mirror the board to halve the work

Two pawn-check positions means two patterns. Annoying.

So mirror the board. Each row becomes: row + padding + row-reversed + padding, which is 8 + 7 + 8 + 7 = 30 characters. Now every "left" threat in the original appears as a "right" threat in the mirror, and the regex only needs to match one direction. The other direction is the mirror's problem.

This is the move that makes everything else collapse. Every piece's threat pattern now only needs checking in one direction.

With the mirror in place, the pawn check becomes a single fixed-offset pattern. A pawn one row back and one column over sits with 30 characters between it and the king:

P.{30}k

One regex, both pawn threat squares, edge cases handled.
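Here is a sketch of the mirrored encoding. The layout is inferred from the offsets in the text: row, 7 padding characters, reversed row, 7 padding characters, for a 30-character stride. The `encode` helper and the choice of `#` are assumptions of the sketch:

```javascript
const PAD = "#".repeat(7);

// Assumed layout: row + padding + reversed row + padding = 30 characters.
function encode(pieces) {
  const rows = Array.from({ length: 8 }, () => Array(8).fill(" "));
  for (const [row, col, piece] of pieces) rows[row][col] = piece;
  return rows
    .map((row) => row.join("") + PAD + [...row].reverse().join("") + PAD)
    .join("");
}

const pawnCheck = /P.{30}k/;

const fromLeft = encode([[2, 3, "P"], [3, 4, "k"]]);      // found in the original half
const fromRight = encode([[2, 5, "P"], [3, 4, "k"]]);     // found in the mirrored half
const straightAhead = encode([[2, 4, "P"], [3, 4, "k"]]); // pawns don't capture forward
const wrapped = encode([[1, 7, "P"], [3, 0, "k"]]);       // opposite edges

console.log(pawnCheck.test(fromLeft));      // true
console.log(pawnCheck.test(fromRight));     // true
console.log(pawnCheck.test(straightAhead)); // false
console.log(pawnCheck.test(wrapped));       // false
```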

Bishops and queens: optional repetition

A bishop on the king's diagonal could be one square away, two, or all the way across the board. In flat-string terms, that's the pawn pattern with the offset block optionally repeating:

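[BQ](.{30} )*.{30}k

Wrap the offset chunk and the empty square that follows it in a group, repeat it zero or more times, and the regex matches a bishop or queen anywhere along the diagonal, provided every square in between is a space and not a blocking piece. The space inside the group has to be mandatory: make it optional and the regex can skip 60 characters without checking the square in between, which lands off the diagonal. The same pattern serves both pieces because along a diagonal they're functionally identical.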

The mirror handles one diagonal axis. Diagonals also run upward from the king, so you need a second variant with the king first and the piece after.
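One way to write the two diagonal variants, as a sketch. It assumes the 30-character mirrored layout (row, 7 `#`, reversed row, 7 `#`) and a hypothetical `encode` helper. Each repeated chunk skips 30 characters and then requires a space, so a blocking piece breaks the chain:

```javascript
const PAD = "#".repeat(7);

// Assumed layout: row + padding + reversed row + padding = 30 characters.
function encode(pieces) {
  const rows = Array.from({ length: 8 }, () => Array(8).fill(" "));
  for (const [row, col, piece] of pieces) rows[row][col] = piece;
  return rows
    .map((row) => row.join("") + PAD + [...row].reverse().join("") + PAD)
    .join("");
}

// Piece earlier in the string than the king, and the reverse.
const diagonalBefore = /[BQ](.{30} )*.{30}k/;
const diagonalAfter = /k(.{30} )*.{30}[BQ]/;
const onDiagonal = (s) => diagonalBefore.test(s) || diagonalAfter.test(s);

const clear = encode([[0, 1, "B"], [3, 4, "k"]]);
const viaMirror = encode([[0, 7, "Q"], [3, 4, "k"]]);
const kingFirst = encode([[6, 7, "B"], [3, 4, "k"]]);
const blocked = encode([[0, 1, "B"], [2, 3, "N"], [3, 4, "k"]]);

console.log(onDiagonal(clear));     // true
console.log(onDiagonal(viaMirror)); // true
console.log(onDiagonal(kingFirst)); // true
console.log(onDiagonal(blocked));   // false: the knight sits on the diagonal
```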

Rooks, queens, knights

Rooks and queens along a rank are trivial: [RQ] followed by any number of spaces followed by k. Mirror handles left/right.

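Vertically, it's the same trick as diagonals but without the diagonal offset: 29 characters between vertically adjacent squares, wrapped in an optional repetition group. The mirror doesn't flip rows, so this needs one variant with the piece before the king and one with the piece after.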

Knights are just offsets. No line to scan, no blocking. A knight threat is the knight at one of eight specific offsets from the king; the mirror halves it to four. Enumerate them with | alternation and you're done.
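A sketch of these patterns under the same assumed 30-character layout (row, 7 `#`, reversed row, 7 `#`). The knight offsets are derived from that stride for this example: one row away and two columns over leaves 31 characters between the squares, two rows away and one column over leaves 60. The `encode` helper is hypothetical.

```javascript
const PAD = "#".repeat(7);

// Assumed layout: row + padding + reversed row + padding = 30 characters.
function encode(pieces) {
  const rows = Array.from({ length: 8 }, () => Array(8).fill(" "));
  for (const [row, col, piece] of pieces) rows[row][col] = piece;
  return rows
    .map((row) => row.join("") + PAD + [...row].reverse().join("") + PAD)
    .join("");
}

const rankOrFile = new RegExp(
  [
    "[RQ] *k",             // same rank; the mirror covers the other side
    "[RQ](.{29} )*.{29}k", // same file, piece earlier in the string
    "k(.{29} )*.{29}[RQ]", // same file, piece later in the string
  ].join("|")
);

// Four knight patterns; the mirror supplies the other four squares.
const knight = /N.{31}k|N.{60}k|k.{31}N|k.{60}N/;

console.log(rankOrFile.test(encode([[3, 7, "R"], [3, 4, "k"]])));              // true
console.log(rankOrFile.test(encode([[0, 4, "Q"], [3, 4, "k"]])));              // true
console.log(rankOrFile.test(encode([[3, 0, "R"], [3, 2, "P"], [3, 4, "k"]]))); // false
console.log(knight.test(encode([[1, 5, "N"], [3, 4, "k"]])));                  // true
console.log(knight.test(encode([[2, 4, "N"], [3, 4, "k"]])));                  // false
```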

The reveal

Stitch every piece's pattern together with |, run a single .test() against the padded mirrored string, and that's check detection. One regex. No board parser, no direction loops, no piece-specific functions. Far shorter than the conventional grid-walking solution to the same problem.
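Put together, the whole thing might look like the sketch below. The alternatives are reconstructed from the offsets described in the text, not copied from the original regex. The layout (row, 7 `#`, reversed row, 7 `#`) and the `encode` and `isCheck` names are assumptions:

```javascript
const PAD = "#".repeat(7);

// Assumed layout: row + padding + reversed row + padding = 30 characters.
// Pieces are given as [row, col, piece] triples for readability.
function encode(pieces) {
  const rows = Array.from({ length: 8 }, () => Array(8).fill(" "));
  for (const [row, col, piece] of pieces) rows[row][col] = piece;
  return rows
    .map((row) => row.join("") + PAD + [...row].reverse().join("") + PAD)
    .join("");
}

const CHECK = new RegExp(
  [
    "P.{30}k",                                  // pawn
    "[BQ](.{30} )*.{30}k",                      // diagonal, piece first
    "k(.{30} )*.{30}[BQ]",                      // diagonal, king first
    "[RQ] *k",                                  // rank
    "[RQ](.{29} )*.{29}k",                      // file, piece first
    "k(.{29} )*.{29}[RQ]",                      // file, king first
    "N.{31}k", "N.{60}k", "k.{31}N", "k.{60}N", // knight
  ].join("|")
);

const isCheck = (pieces) => CHECK.test(encode(pieces));

console.log(isCheck([[3, 4, "k"]]));                           // false
console.log(isCheck([[0, 7, "Q"], [3, 4, "k"]]));              // true
console.log(isCheck([[2, 4, "P"], [3, 4, "k"]]));              // false
console.log(isCheck([[5, 5, "N"], [3, 4, "k"]]));              // true
console.log(isCheck([[3, 0, "R"], [3, 2, "B"], [3, 4, "k"]])); // false
```

Keeping each alternative as its own array entry leaves the final regex readable enough to audit one piece type at a time.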

Whether it's good code is up for debate. Whether it's concise is not.

The lesson

Regex is not a string-matching tool. Regex is a constraint-encoding language that happens to operate on strings. Once you accept that, the question stops being "can I match this text" and becomes "can I lay out my data so the constraints I care about become local positional facts in a flat string?"

For chess, the answer is yes. Flatten the board, pad the rows so edges don't wrap, mirror to halve the directional cases, and every threat type — pawn, bishop, rook, queen, knight — collapses into a fixed-offset pattern or an optional-repetition variant of one.

The board parser is the obvious answer. The regex is the better demonstration of what the tool actually is. Most of the time you should write the board parser. Some of the time you should write the regex, just to remember you could.