The implementation is using Python, but can easily be ported to other languages:

The idea is to “flatten” the number (e.g. 3 becomes “111” and 5 becomes “11111”), and then try to group the 1’s into multiple equal groups. If we succeed to group, the number isn’t prime, otherwise it’s prime (with the exception of the number 1).

It works because the so-called “regular expression”, isn’t actually regular.

import re def is_prime(n): return re.match(r'^1?$|^(11+?)\1+$', "1" * n) == None;

The first part of the regexp “^1?$” matches a string with an optional “1″, that is, it matches the empty string, and the string “1″, which are the “flat” representations of 0 and 1 respectively. This is because Zero and One are special cases, and are not prime.

The second part tries to match a string of “start of line”, then two 1’s or more ^(11+?), then it tries to find one or more times the previous match (\1+). then end of line $.

i.e. try to divide the 1’s into two or more groups of two, if failed, try two or more groups of three, etc… in general: try to divide the sticks into two or more groups of “two or more”. Which is exactly the definition of a non-prime number (a.k.a composite number).

So, if the regexp succeeds to match the pattern, the number is not prime. If it fails, the number is prime.

For example, consider the string “aaaa” and the regexp ‘(a+)\1+’ the regexp will match the string, the value of \1 will be “aa”, since the a+ pattern matched the longest string which allows the regexp to match the string.

The regexp ‘(a+?)\1+’ will also match the string, but the value of \1 will be “a”, because this time (a+?) looked for the minimal match which will allow the regexp to match the string.

In our case, the use of (+?) instead of (+) does not affect correctness. The only difference is related to performance, because it tries small groups first, rather than big groups first.