Sudoku: Given Isomorphisms.

Hi folks! So, I have been a busy little bee, and have been trying to work out how many isomorphisms there are for a certain number of givens (pre-existing placements) – that is, initial configurations which are the same, modulo row, col, stack, band permutations, and reflection. This is just the given locations, not the contents of those cells. I get the following numbers:

Order2 board ( 4×4 cells):

 There are 1 isomorphisms with 0 givens, order2
There are 1 isomorphisms with 1 givens, order2
There are 5 isomorphisms with 2 givens, order2
There are 10 isomorphisms with 3 givens, order2
There are 33 isomorphisms with 4 givens, order2
There are 53 isomorphisms with 5 givens, order2
There are 101 isomorphisms with 6 givens, order2
There are 122 isomorphisms with 7 givens, order2
There are 153 isomorphisms with 8 givens, order2
There are 122 isomorphisms with 9 givens, order2
There are 101 isomorphisms with 10 givens, order2
There are 53 isomorphisms with 11 givens, order2
There are 33 isomorphisms with 12 givens, order2
There are 10 isomorphisms with 13 givens, order2
There are 5 isomorphisms with 14 givens, order2
There are 1 isomorphisms with 15 givens, order2
There are 1 isomorphisms with 16 givens, order2

Order3 board ( 9×9 cells, “standard” sudoku board):


 There are 1 isomorphisms with 0 givens, order3
There are 1 isomorphisms with 1 givens, order3
There are 5 isomorphisms with 2 givens, order3
There are 21 isomorphisms with 3 givens, order3
There are 109 isomorphisms with 4 givens, order3
There are 548 isomorphisms with 5 givens, order3
There are 3074 isomorphisms with 6 givens, order3
There are 16847 isomorphisms with 7 givens, order3
There are 92393 isomorphisms with 8 givens, order3

Now. That took me 12 CPU core hours. The magic number I want to get to with order3 boards is of course 17, the lowest number of givens in any standard sudoku puzzle. All of a sudden this looks like it will take some serious CPU grunt. I’m not exactly sure, but for small numbers of givens (less than 1/4 the cells on the board), it looks like adding one extra placement multiplies the number of isomorphisms by a factor of roughly 5.5.

With respect to computational cost, determining isomorphisms is reasonably fast (at least 200k per CPU core second on average), but since there’s no easy way of “sorting” that I know about, there’s no “log N” shortcut, and determining uniqueness requires an N^2 comparison cost, so the computational cost is proportional to the square of the number of isomorphisms.

Plugging these numbers into a spreadsheet, assuming (conservatively) 4 CPU cores per machine (your average laptop), I can then calculate the costs in thousand machine years:

GivensCPU HoursThousand machine years
01.35899570417687E-093.87841239776503E-17
11.35899570417687E-093.87841239776503E-17
23.39748926044217E-089.69603099441258E-16
35.99317105541998E-071.71037986741438E-14
41.61462279613254E-054.60794176978463E-13
50.0004081118459471.16470275669843E-11
60.0128417972907223.66489648707831E-10
70.3857120755844261.10077647141674E-08
811.6010212330413.31079373089068E-07
9350.9308922994891.00151510359443E-05
1010615.65949205950.000302958318837
11321123.6996348010.009164489144829
129713991.913952730.277225796631071
13293848255.397078.3860803480899
148888909725.76137253.67893052972
15268889519204.2817673.78764852402
168133907955929.51232132.076367851
172460507156668687021995.31012751

Now, as it turns out, distributed computing is my sort of thing, so I reckon getting up to about 12 givens (277 machine years) is pretty doable, but 17 looks somewhat out of the question, unless I can find a way to answer these two questions:

  • Currently I compare all boards against all other boards, with a couple of “tricks” for rapid comparison that removes the need to consider permutations in most cases. I could possibly re-apply some of those tricks to reduce the cost from N^2, but I’m not sure it’ll give anything more than a constant cost improvement. Probably won’t reduce the cost to N log N. (Details available after some discussion).
  • My math assumes a constant 5.5 increase for each extra given. Do we have any reasons to assume that that that number might decrease above 9 or 10 givens?

More Sudoku.

Little sudoku update. So, I’ve been working on various algorithms to try and work towards solving a pretty knotty problem with respect to Sudoku. It’s turning out to be much more difficult than expected.

I have a “naive” algorithm which gets correct results for smaller board sizes, but which will not complete for a regular sudoku board in my lifetime.

I have a “clever” algorithm, which is slower for small boards, but bigger for fast ones. Unfortunately, it gets wrong results (for reasons I understand), which is not the end of the world, and fixable – it misses some cases.

However, the clever algorithm also seems to get inconsistent results: if I swap a board where the squares are filled with an equivalent board where the squares are empty, I should get the same answer, and I don’t. Debugging in progress: I have probably taken a shortcut somewhere that I shouldn’t have done.

I have also attempted a modest improvement at improving the speed of the exact cover algorithm by improving how it picks constraints. To my astonishment, unless the number of constraints is very large, and the subset of them which change is quite small, it’s actually not any quicker than the naive algorithm, which is n/2 cost.

Regardless, both algorithms are too slow, and I also have some more unusual-ish ideas about how to make a faster algorithm. The only problem it takes so much code to try out each new idea and then find whether it works or is bust. Perseverance will be required!

Once I have a good algorithm, then I can start thinking about distributing it across lots of CPU’s.

A rather faster Sudoku solver.

So, wanting to improve my knowledge of sudoku, I have now implemented a “proper” DLX solver, which solves all sudoku problems in the blink of an eye, yes it’s about a thousand times faster than a naive algorithm.

You can download a copy of the testapp here.

Rest assured, this is just the start of some rather more interesting problem solving with respect to sudoku.