Hi folks! So, I have been a busy little bee, and have been trying to work out how many isomorphisms there are for a certain number of givens (pre-existing placements) – that is, initial configurations which are the same, modulo row, col, stack, band permutations, and reflection. This is just the given locations, not the contents of those cells. I get the following numbers:

**Order2 board ( 4×4 cells):**

There are 1 isomorphisms with 0 givens, order2

There are 1 isomorphisms with 1 givens, order2

There are 5 isomorphisms with 2 givens, order2

There are 10 isomorphisms with 3 givens, order2

There are 33 isomorphisms with 4 givens, order2

There are 53 isomorphisms with 5 givens, order2

There are 101 isomorphisms with 6 givens, order2

There are 122 isomorphisms with 7 givens, order2

There are 153 isomorphisms with 8 givens, order2

There are 122 isomorphisms with 9 givens, order2

There are 101 isomorphisms with 10 givens, order2

There are 53 isomorphisms with 11 givens, order2

There are 33 isomorphisms with 12 givens, order2

There are 10 isomorphisms with 13 givens, order2

There are 5 isomorphisms with 14 givens, order2

There are 1 isomorphisms with 15 givens, order2

There are 1 isomorphisms with 16 givens, order2

**Order3 board ( 9×9 cells, “standard” sudoku board):**

There are 1 isomorphisms with 0 givens, order3

There are 1 isomorphisms with 1 givens, order3

There are 5 isomorphisms with 2 givens, order3

There are 21 isomorphisms with 3 givens, order3

There are 109 isomorphisms with 4 givens, order3

There are 548 isomorphisms with 5 givens, order3

There are 3074 isomorphisms with 6 givens, order3

There are 16847 isomorphisms with 7 givens, order3

There are 92393 isomorphisms with 8 givens, order3

Now. That took me 12 CPU core hours. The magic number I want to get to with order3 boards is of course 17, the lowest number of givens in any standard sudoku puzzle. All of a sudden this looks like it will take some serious CPU grunt. I’m not exactly sure, but for small numbers of givens (less than 1/4 the cells on the board), it looks like adding one extra placement multiplies the number of isomorphisms by a factor of roughly 5.5.

With respect to computational cost, determining isomorphisms is reasonably fast (at least 200k per CPU core second on average), but since there’s no easy way of “sorting” that I know about, there’s no “log N” shortcut, and determining uniqueness requires an N^2 comparison cost, so the computational cost is proportional to the square of the number of isomorphisms.

Plugging these numbers into a spreadsheet, assuming (conservatively) 4 CPU cores per machine (your average laptop), I can then calculate the costs in thousand machine years:

Givens | CPU Hours | Thousand machine years |

0 | 1.35899570417687E-09 | 3.87841239776503E-17 |

1 | 1.35899570417687E-09 | 3.87841239776503E-17 |

2 | 3.39748926044217E-08 | 9.69603099441258E-16 |

3 | 5.99317105541998E-07 | 1.71037986741438E-14 |

4 | 1.61462279613254E-05 | 4.60794176978463E-13 |

5 | 0.000408111845947 | 1.16470275669843E-11 |

6 | 0.012841797290722 | 3.66489648707831E-10 |

7 | 0.385712075584426 | 1.10077647141674E-08 |

8 | 11.601021233041 | 3.31079373089068E-07 |

9 | 350.930892299489 | 1.00151510359443E-05 |

10 | 10615.6594920595 | 0.000302958318837 |

11 | 321123.699634801 | 0.009164489144829 |

12 | 9713991.91395273 | 0.277225796631071 |

13 | 293848255.39707 | 8.3860803480899 |

14 | 8888909725.76137 | 253.67893052972 |

15 | 268889519204.281 | 7673.78764852402 |

16 | 8133907955929.51 | 232132.076367851 |

17 | 246050715666868 | 7021995.31012751 |

Now, as it turns out, distributed computing is my sort of thing, so I reckon getting up to about 12 givens (277 machine years) is pretty doable, but 17 looks somewhat out of the question, unless I can find a way to answer these two questions:

- Currently I compare all boards against all other boards, with a couple of “tricks” for rapid comparison that removes the need to consider permutations in most cases. I could possibly re-apply some of those tricks to reduce the cost from N^2, but I’m not sure it’ll give anything more than a constant cost improvement. Probably won’t reduce the cost to N log N. (Details available after some discussion).
- My math assumes a constant 5.5 increase for each extra given. Do we have any reasons to assume that that that number might decrease above 9 or 10 givens?