For Programmers: Free Programming Magazines  


Home > Archive > AWK > October 2006 > awk sudoku solvers ?









You are viewing an archived Text-only version of the thread. To view this thread in it's original format and/or if you want to reply to this thread please [click here]

 

Author awk sudoku solvers ?
andreasplesch@gmail.com

2006-08-30, 6:56 pm

Hello,

I discovered sudoku, and could not resist to write my own little
backtracking solver in my everyday scripting language, awk. Here is my
simple, recursive solution. I thought awks associative array and
recursive capabilities would lend themselves well to sudoku solving but
I ended up just using a string which holds the sudoku and can be passed
to recursive functions as local variable.

How do people deal with recursive functions and arrays ?

Comments on the code, style ? Other awk solvers out there ?

Use it as in:

$ gawk -f solve.awk
....2.518.9....4.32.....85..8..5...2.63.....58.9...6..1..98.....14.3....5.264.1...

best, Andreas


Here is the code:

solve.awk:

BEGIN{
tt = systime()
}
{ st = systime();
plot2($1);
solve2($1);
plot2(sos);
print "Sudoku "NR",\t "NR/(systime()-tt+1e-10)" sudokus/s"
}

function solve2(ss, a, b, c, d, e, r, co, ro, bi, bl, nss){
i = 0;
# first, use some simple logic to fill grid as much as possible
do {i++
didit = 0; delete nr; delete nc; delete nb; delete ca;
for (a = 0; a< 81; a++) {
b = substr(ss, a+1, 1)
if (b == ".") {
# construct row, column and block at cell
c = a%9; r = int(a/9);
ro = substr(ss, r*9+1, 9);
co="";for (d=0; d<9; d++) {co = co substr(ss, d*9+c+1, 1)};
bi = int(c/3)*3+(int(r/3)*3)*9+1;
bl="";for (d=0; d<3; d++) {bl = bl substr(ss, bi+d*9, 3)};
e=0;
for (d=1; d<10; d++) {
# count non-occurrences of digits 1-9 in combined row, column and
block, per row, column and block,\
and flag cell/digit as candidate
if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c, d]++;
nb[bi, d]++; ca[a, d] = d}
}
# in case no candidate is available, give up
if (e == 0) {return 0}
}
}
# go through all cell/digit candidates
for (ad in ca) {
split(ad, spl, SUBSEP)
a = spl[1];
d = spl[2];
c = a%9; r = int(a/9);
bi = int(c/3)*3+(int(r/3)*3)*9+1;
# unique solution if at least one non-occurrence counter is exactly 1
if((nr[r, d] == 1)||(nc[c, d] == 1)||(nb[bi, d] == 1)) {
ss = substr(ss, 1, a) d substr(ss, a+2, length(ss));
didit = 1;
}
}
} while (didit == 1)
# Second, pick a viable solution for the next empty cell and see if it
leads to a solution
a = index(ss, ".")-1;
if (a == -1) {
# no more empty cells, done
sos = ss; return 1}
else {
c = a%9; r = int(a/9);
# concatenate current row, column and block
# row
co = substr(ss, r*9+1, 9);
# column
for (d=0; d<9; d++) {co = co substr(ss, d*9+c+1, 1)};
# block
c = int(c/3)*3+(int(r/3)*3)*9+1;
for (d=0; d<3; d++) {co = co substr(ss, c+d*9, 3)};
# get a viable digit
for (b = 1; b < 10; b++) {
# check if candidate digit already exists
if (index(co, b) == 0) {
# is viable, put in cell
nss = substr(ss, 1, a) b substr(ss, a+2, length(ss));
# try to solve
d = solve2(nss);
# if successful, return
if (d == 1) {return 1}
}
}
# no digits viable, no solution
} return 0
}

function plot2(ss, ou, a, b, c, d){
print "| - - - + - - - + - - - |"
for (a = 0; a < 3; a++) {
for (b = 0; b < 3; b++) {
ou = "|";
for (c = 0; c < 3; c++) {
for (d = 0; d < 3; d++) {
ou = ou" "substr(ss, 1+d+3*c+9*b+27*a, 1);
}
ou=ou" |";
}
print ou;
} print "| - - - + - - - + - - - |"
}
}

andreasplesch@gmail.com

2006-08-31, 3:56 am

> # go through all cell/digit candidates
> for (ad in ca) {
> split(ad, spl, SUBSEP)
> a = spl[1];
> d = spl[2];
> c = a%9; r = int(a/9);
> bi = int(c/3)*3+(int(r/3)*3)*9+1;
> # unique solution if at least one non-occurrence counter is exactly 1
> if((nr[r, d] == 1)||(nc[c, d] == 1)||(nb[bi, d] == 1)) {
> ss = substr(ss, 1, a) d substr(ss, a+2, length(ss));
> didit = 1;
> }
> }
> } while (didit == 1)


For some situations, there needs to be an additional check in this
section, updated code:

# go through all cell/digit candidates
# hidden singles
for (crd in ca) {
# a candidate may have been deleted after the loop started
if (ca[crd]!=""){
split(crd, spl, SUBSEP)
c = spl[1];
r = spl[2];
d = spl[3];
bi = ca[crd];
a = c+r*9;
# c = a%9; r = int(a/9);
# bi = int(c/3)*3+(int(r/3)*3)*9+1;
# unique solution if at least one non-occurrence counter is exactly 1
if ((nr[r, d] == 1) || (nc[c, d] == 1) || (nb[bi, d] == 1)) {
ss = substr(ss, 1, a) d substr(ss, a+2, length(ss));
didit = 1;
# remove candidates from current row, column, block
for (e=0;e<9;e++) {delete ca[c, e, d]; delete ca[e, r, d];}
for (e=0;e<3;e++) {for (b=0;b<3;b++) {delete ca[int(c/3)*3+b,
int(r/3)*3+e, d];}}
}}
}

Steve Calfee

2006-09-01, 6:56 pm

On 30 Aug 2006 15:47:58 -0700, andreasplesch@gmail.com wrote:

>Hello,
>
>I discovered sudoku, and could not resist to write my own little
>backtracking solver in my everyday scripting language, awk. Here is my
>simple, recursive solution. I thought awks associative array and
>recursive capabilities would lend themselves well to sudoku solving but
>I ended up just using a string which holds the sudoku and can be passed
>to recursive functions as local variable.
>
>How do people deal with recursive functions and arrays ?
>
>Comments on the code, style ? Other awk solvers out there ?
>
>Use it as in:
>
>$ gawk -f solve.awk
>...2.518.9....4.32.....85..8..5...2.63.....58.9...6..1..98.....14.3....5.264.1...
>


Hi Andreas,

Interesting, small solution. Does it work?

There was a good article in circuit cellar magazine (april 2006) "FROM
THE BENCH Automating Sudoku, by Jeff Bachiochi, p. 80" where he did a
winxp gui solver in basic.

I don't understand your input format, I guess "." is an unknown? And
each number is the initial value 1 to 9 of the 9x9 matrix?

When I did my awk based sudoku solver I used the basic program's input
files as input to my solver:
puzzle 1
0,0,0,0,0,4,0,0,5
0,5,0,0,0,9,0,6,0
4,0,3,0,0,1,0,0,0
8,0,0,0,0,0,2,0,6
1,0,0,8,0,2,0,0,3
9,0,5,0,0,0,0,0,7
0,0,0,2,0,0,7,0,1
0,7,0,5,0,0,0,8,0
3,0,0,1,0,0,0,0,0
puzzle 2
0,0,9,0,6,0,0,8,0
6,0,3,5,0,0,0,4,0
0,8,0,0,0,3,2,0,6
7,0,2,0,0,0,0,0,0
0,0,0,1,5,7,0,0,0
0,0,0,0,0,0,4,0,7
9,0,1,8,0,0,0,7,0
0,7,0,0,0,9,8,0,2
0,4,0,0,7,0,3,0,0

.... etc ...

Where 0 is unknown 1 to 9 is known cell contents.

One thing I wanted to do was be able to provide "interesting" starting
puzzles - but very quickly I found out how long the processing can
take. For instance solving:

clear
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0
0,0,0,0,0,0,0,0,0

This puzzle takes so long I have never let the program run long enough
to finish it. I have only tried the initial 9 puzzles from the basic
program and my awk program solved them all. I suspect the trick in
doing sudoku puzzles is making them simple enough for a human to solve
in a reasonable amount of time.

Your solution is much smaller than mine, so if it works, it is best.
If you want I could also post my solution.

Regards, Steve

There is no "x" in my email address.
Michael Jaritz

2006-09-04, 3:56 am

andreasplesch@gmail.com schrieb:

>solve.awk:


Fascinating.

I tried to solve 18 sudokus (with the updated code in message
<1157011572.582483.232830@e3g2000cwe.googlegroups.com> ),
see http://zielgra.de/temp/awk/examples.txt

The results are right, but I don't understand the following part of the
code.

|6476522 for (crd in ca) {
|6476522 if (ca[crd] != "") { # 6476522
|6476522 split(crd, spl, SUBSEP)
|6476522 c = spl[1]
|6476522 r = spl[2]
|6476522 d = spl[3]
|6476522 bi = ca[crd]
|6476522 a = c + r * 9
|6476522 if (nr[r, d] == 1 || nc[c, d] == 1 || nb[bi, d] == 1) {
| ss = ((substr(ss, 1, a) d) substr(ss, a + 2, length(ss)))
| didit = 1
| for (e = 0; e < 9; e++) {
| delete ca[c, e, d]
| delete ca[e, r, d]
| }
| for (e = 0; e < 3; e++) {
| for (b = 0; b < 3; b++) {
| delete ca[int(c / 3) * 3 + b, int(r / 3) * 3 + e, d]
| }
| }
| }
| }
| }

You have an example where
(nr[r, d] == 1 || nc[c, d] == 1 || nb[bi, d] == 1) is true?

See complete profile -> http://zielgra.de/temp/awk/solve.prof

Michael
Jürgen Kahrs

2006-09-04, 6:56 pm

andreasplesch@gmail.com wrote:

> I discovered sudoku, and could not resist to write my own little
> backtracking solver in my everyday scripting language, awk. Here is my
> simple, recursive solution. I thought awks associative array and


Thanks for posting this. It's a nice demonstration
of AWK's powerful features.

> recursive capabilities would lend themselves well to sudoku solving but
> I ended up just using a string which holds the sudoku and can be passed
> to recursive functions as local variable.
>
> How do people deal with recursive functions and arrays ?


I dont know any better way of handling local arrays.
I guess the way you used them is the best one can do.
andreasplesch@gmail.com

2006-09-05, 6:56 pm

Hi,

Steve Calfee wrote:

> Interesting, small solution. Does it work?


I think it does, and it should solve all puzzles since it really uses
just trial and error.

> There was a good article in circuit cellar magazine (april 2006) "FROM
> THE BENCH Automating Sudoku, by Jeff Bachiochi, p. 80" where he did a
> winxp gui solver in basic.
>
> I don't understand your input format, I guess "." is an unknown? And
> each number is the initial value 1 to 9 of the 9x9 matrix?


Yes, the ordering is from left to right, from top to bottom.

> When I did my awk based sudoku solver I used the basic program's input
> files as input to my solver:
> puzzle 1
> 0,0,0,0,0,4,0,0,5
> 0,5,0,0,0,9,0,6,0
> 4,0,3,0,0,1,0,0,0
> 8,0,0,0,0,0,2,0,6
> 1,0,0,8,0,2,0,0,3
> 9,0,5,0,0,0,0,0,7
> 0,0,0,2,0,0,7,0,1
> 0,7,0,5,0,0,0,8,0
> 3,0,0,1,0,0,0,0,0


so this becomes:

......4..5.5...9.6.4.3..1...8.....2.61..8.2..39.5.....7...2..7.1.7.5...8.3..1.....

And the output of my solver is:

| - - - + - - - + - - - |
| . . . | . . 4 | . . 5 |
| . 5 . | . . 9 | . 6 . |
| 4 . 3 | . . 1 | . . . |
| - - - + - - - + - - - |
| 8 . . | . . . | 2 . 6 |
| 1 . . | 8 . 2 | . . 3 |
| 9 . 5 | . . . | . . 7 |
| - - - + - - - + - - - |
| . . . | 2 . . | 7 . 1 |
| . 7 . | 5 . . | . 8 . |
| 3 . . | 1 . . | . . . |
| - - - + - - - + - - - |
| - - - + - - - + - - - |
| 2 1 9 | 6 8 4 | 3 7 5 |
| 7 5 8 | 3 2 9 | 1 6 4 |
| 4 6 3 | 7 5 1 | 9 2 8 |
| - - - + - - - + - - - |
| 8 3 7 | 9 1 5 | 2 4 6 |
| 1 4 6 | 8 7 2 | 5 9 3 |
| 9 2 5 | 4 3 6 | 8 1 7 |
| - - - + - - - + - - - |
| 5 9 4 | 2 6 8 | 7 3 1 |
| 6 7 1 | 5 9 3 | 4 8 2 |
| 3 8 2 | 1 4 7 | 6 5 9 |
| - - - + - - - + - - - |


> One thing I wanted to do was be able to provide "interesting" starting
> puzzles - but very quickly I found out how long the processing can
> take.


Yes, the solving is considered the easy part, generation is harder.
Look at this forum for in depth discussion:

http://www.setbb.com/phpbb/?mforum=sudoku

cheers, Andreas

andreasplesch@gmail.com

2006-09-05, 6:56 pm

Hello,

> The results are right, but I don't understand the following part of the
> code.
>
> |6476522 for (crd in ca) {
> |6476522 if (ca[crd] != "") { # 6476522
> |6476522 split(crd, spl, SUBSEP)
> |6476522 c = spl[1]
> |6476522 r = spl[2]
> |6476522 d = spl[3]
> |6476522 bi = ca[crd]
> |6476522 a = c + r * 9
> |6476522 if (nr[r, d] == 1 || nc[c, d] == 1 || nb[bi, d] == 1) {
> | ss = ((substr(ss, 1, a) d) substr(ss, a + 2, length(ss)))
> | didit = 1
> | for (e = 0; e < 9; e++) {
> | delete ca[c, e, d]
> | delete ca[e, r, d]
> | }
> | for (e = 0; e < 3; e++) {
> | for (b = 0; b < 3; b++) {
> | delete ca[int(c / 3) * 3 + b, int(r / 3) * 3 + e, d]
> | }
> | }
> | }
> | }
> | }


Oh, this made me check again and I found that my update of this section
also included a small but important change above in the candidate
finding loop, in the following line:

# count non-occurrences of digits 1-9 in combined row, column and
block, per row, column and block, and flag cell/digit as candidate
if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c, d]++;
nb[bi, d]++; ca[a, d] = d}

should be

# count non-occurrences of digits 1-9 in combined row, column and
block, per row, column and block,\
and flag cell/digit as candidate
if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c, d]++;
nb[bi, d]++; ca[c, r, d] = bi}

This change just stores directly the column and row of the candidate,
instead of the position in the string (a).

> .. example where
> (nr[r, d] == 1 || nc[c, d] == 1 || nb[bi, d] == 1) is true?


The arrays stores the number of locations in a given row (or column or
block) for each possible digit (d). If there is only one location, this
must be the correct location.

| - - - + - - - + - - - |
| 8 . . | . . . | 2 . 6 |
| 1 . . | 8 . 2 | . . 3 |
| 9 x 5 | . . . | . . 7 |
| - - - + - - - + - - - |

In position x, nb[bi, 2] would be 1 because "2" cannot be anywhere else
in the block.

Does this help ?

andreasplesch@gmail.com

2006-09-05, 6:56 pm

J=FCrgen Kahrs wrote:

> Thanks for posting this. It's a nice demonstration
> of AWK's powerful features.


Hey, you are welcome.

I posted the solution also in

http://www.setbb.com/phpbb/?mforum=3Dsudoku

but thought I might get more awk specific feedback here.

>
> I dont know any better way of handling local arrays.
> I guess the way you used them is the best one can do.


ok, this is good to know then.

Andreas

Anton Treuenfels

2006-09-06, 3:56 am


<andreasplesch@gmail.com> wrote in message
news:1156978078.368666.122180@p79g2000cwp.googlegroups.com...
> Other awk solvers out there ?
>
> Use it as in:
>
> $ gawk -f solve.awk
>

....2.518.9....4.32.....85..8..5...2.63.....58.9...6..1..98.....14.3....5.264
..1...

How about one in TAWK?

I've used this with puzzles kept in a file, nine hints to a line, nine lines
to a puzzle. Could have all 81 hints on one line, or one hint on 81 lines,
doesn't matter.

# -----------------------------------
# Backtracking Soduko Puzzle Solver
# -----------------------------------

# for TAWK 4.0

# by Anton Treuenfels
# first created: 09/04/06
# last revision: 09/05/06

# the puzzle definition

local puzzleDef

# the puzzle solution

local sodukoSolution

# the count of unknown cell values

local unknownCnt

# the sequence in which grid cells are visited

local cellSequence

# flags for "digit is used" for rows, columns and boxes

local valUsedInRow, valUsedInCol, valUsedInBox

# pre-calculated indices for the row, column and box of each cell
# - not necessary but saves a bit of time since they never change

local rowNum, colNum, boxNum

# keep track of how we're doing
# - not necessary but nice to know sometimes

local solutionNum, recursionNum

# ------------------------------

INIT {

local cell

# initialize the indices for each cell row, column and box
# - cells are numbered left-to-right, top-to-bottom;
# nine rows of nine cells each
# - rows are numbered top to bottom
# - columns are numbered left to right
# - boxes are numbered left to right, top-to-bottom;
# three rows of three boxes each

for ( cell = 1; cell <= 81; cell++ ) {
rowNum[ cell ] = int( (cell-1) / 9 ) + 1
colNum[ cell ] = (cell-1) % 9 + 1
boxNum[ cell ] = \
3 * int( (rowNum[cell]-1)/3) + int( (colNum[cell]-1)/3 ) + 1
}
}

# ------------------------------

# display the current solution values
# - if not complete, the cells with unassigned values display the
# sequence number they will be visited in

local function show() {

local cell
local row

row = sprintf( "Recursion# %d Recursions/Unknown %.02f", \
recursionNum, recursionNum/unknownCnt )
print row
row = ""
for ( cell = 1; cell <= 81; cell++ ) {
if ( (cell == 28) || (cell == 55) )
print strdup( "-", 48 )
if ( sodukoSolution[cell] )
row = row sprintf( " %d ", sodukoSolution[cell] )
else
row = row sprintf( "[%02d] ", sequenceCell[cell] )
if ( cell%9 == 0 ) {
print row
row = ""
}
else if ( cell%3 == 0 )
row = row "| "
}

getkey()
}

# initialize puzzle

local function initpuzzle(this) {

local i, j, cell
local val

# set all "digits used" to "no"

for ( i = 1; i <= 9; i++ ) {
for ( j = 1; j <= 9; j++ ) {
valUsedInRow[ i ][ j ] = 0
valUsedInCol[ i ][ j ] = 0
valUsedInBox[ i ][ j ] = 0
}
}

# initialize the solution with the cell values given in puzzle
# - also note what digits are initially known in each row, column and
box
# - we could (but do not currently) do some consistency checking here;
# ie., no digit can appear twice in any row, column or box, so if we
# find one or more we have an inconsistent puzzle

for ( cell = 1; cell <= 81; cell++ ) {
val = index( ".123456789", substr(this, cell, 1) ) - 1
sodukoSolution[ cell ] = val
if ( val ) {
valUsedInRow[ rowNum[cell] ][ val ] = 1
valUsedInCol[ colNum[cell] ][ val ] = 1
valUsedInBox[ boxNum[cell] ][ val ] = 1
}
}

# just starting out...

solutionNum = recursionNum = 0
}

# determine the sequence in which we will visit unknown cells
# - this is essentially the only heuristic we are using
# - basically we want to look at cells in increasing order
# of possible digits, ie., we will look at cells with the least number of
# possible digits first so as to prune the search as fast as possible
# - as a secondary sort key we will use the total number of
# digits known in the 20 neighbor cells (all the ones in the same row,
# column and box), visiting cells with the least number of unknowns
# in the neighbor cells first
# - this is still not enough to guarantee a unique value for each
# unknown cell, so some may have the same value and will be visited in
# increasing cell number order
# - again we could check for inconsistency; if a cell is unknown but all
# nine digits are found among the neighbors, the puzzle is inconsistent

local function sequence() {

local i, cell
local row, col, box
local totaldigits
local digitsused, sortvalue

# determine the sort value for each unknown cell

unknownCnt = 0
for ( cell = 1; cell <= 81; cell++ ) {
if ( sodukoSolution[cell] )
continue
delete( digitsused )
totaldigits = 0
row = rowNum[ cell ]
col = colNum[ cell ]
box = boxNum[ cell ]
for ( i = 1; i <= 9; i++ ) {
if ( valUsedInRow[row][i] ) {
digitsused[ i ] = 1
totaldigits++
}
if ( valUsedInCol[col][i] ) {
digitsused[ i ] = 1
totaldigits++
}
if ( valUsedInBox[box][i] ) {
digitsused[ i ] = 1
totaldigits++
}
}

sortvalue[ length(digitsused) * 100 + totaldigits ][ cell ] = ".T."
++unknownCnt
}

# unravel the sort
# - TAWK arrays are by default automatically sorted in increasing order
# in each dimension
# - "cellSequence" is for the algorithm
# - "sequenceCell" is strictly for display

j = unknownCnt
for ( i in sortvalue ) {
for ( cell in sortvalue[i] ) {
cellSequence[ j ] = cell
sequenceCell[ cell ] = j--
}
}

# so what have we got now ?

print "Sequenced Puzzle - " unknownCnt " Unknown Cell Values"
show()
}

# solve the puzzle

local function solve(ndx) {

local i
local cell, row, col, box

# show progress ?

if ( ++recursionNum%10000 == 0 )
show()

# which unknown cell are we trying ?

cell = cellSequence[ ndx ]
row = rowNum[ cell ]
col = colNum[ cell ]
box = boxNum[ cell ]

# try each digit in turn

for ( i = 1; i <= 9; i++ ) {
if ( valUsedInRow[row][i] )
continue
if ( valUsedInCol[col][i] )
continue
if ( valUsedInBox[box][i] )
continue
sodukoSolution[ cell ] = i

# more unknowns ?

if ( ndx < unknownCnt ) {
valUsedInRow[ row ][ i ] = 1
valUsedInCol[ col ][ i ] = 1
valUsedInBox[ box ][ i ] = 1
solve( ndx+1 )
valUsedInBox[ box ][ i ] = 0
valUsedInCol[ col ][ i ] = 0
valUsedInRow[ row ][ i ] = 0
}

# no more unknowns; don't recurse but show solution instead

else {
print " Solution# " ++solutionNum
show()
}
sodukoSolution[ cell ] = 0
}
}

# ------------------------------

# collect puzzle lines until we have 81 cells specified

$1 ~ /^[.1-9]+$/ {

puzzleDef = puzzleDef $1

if ( length(puzzleDef) == 81 ) {
initpuzzle( puzzleDef )
sequence()
solve( 1 )
puzzleDef = ""
}
else if ( length(puzzleDef) > 81 ) {
print "Bad puzzle length"
exit( 1 )
}

next
}

# show name of puzzle, if any

/puzzle/i { print $0 }

# all other lines are ingored and so may contain anything


andreasplesch@gmail.com

2006-09-06, 6:56 pm

Anton Treuenfels wrote:
> # determine the sequence in which we will visit unknown cells
> # - this is essentially the only heuristic we are using
> # - basically we want to look at cells in increasing order
> # of possible digits, ie., we will look at cells with the least number of
> # possible digits first so as to prune the search as fast as possible
> # - as a secondary sort key we will use the total number of
> # digits known in the 20 neighbor cells (all the ones in the same row,
> # column and box), visiting cells with the least number of unknowns
> # in the neighbor cells first
> # - this is still not enough to guarantee a unique value for each
> # unknown cell, so some may have the same value and will be visited in
> # increasing cell number order
> # - again we could check for inconsistency; if a cell is unknown but all
> # nine digits are found among the neighbors, the puzzle is inconsistent


This seems like a good idea. Since filling in solutions of course
changes the number of possible digits in the affected row, column and
block, it may make sense to dynamically update the sort array. Or does
it ?

> # more unknowns ?
>
> if ( ndx < unknownCnt ) {
> valUsedInRow[ row ][ i ] = 1
> valUsedInCol[ col ][ i ] = 1
> valUsedInBox[ box ][ i ] = 1
> solve( ndx+1 )
> valUsedInBox[ box ][ i ] = 0
> valUsedInCol[ col ][ i ] = 0
> valUsedInRow[ row ][ i ] = 0
> }


No sure about the logic here. If solve() returns here, it means that
the puzzle is not solved ? And therefore the tags need to be reset ?
And then go on and try the next digit for that cell ?

I found that adding simple logic to fill in easy solutions dramatically
speeds up the process. On the other hand. speed is not necessarily the
main goal when using awk. Although tawk is used for speed, apparently.

Andreas

Anton Treuenfels

2006-09-06, 9:56 pm


<andreasplesch@gmail.com> wrote in message
news:1157573336.686513.18330@d34g2000cwd.googlegroups.com...
> Anton Treuenfels wrote:
of[color=darkred]
>
> This seems like a good idea. Since filling in solutions of course
> changes the number of possible digits in the affected row, column and
> block, it may make sense to dynamically update the sort array. Or does
> it ?


Or at the start of each recursion it would be possible to calculate the
"best" cell to look at by looping through all unknown cells once, always
keeping just the "best one found so far". No sorting involved.

Would it make a difference in the total number of recursions? Maybe. It's
easy enough to test, although I've never done so. Eyeballing a few puzzles
to see which cells such an attempt would pick shows the first few, at any
rate, are usually the same with either method. So any differences are likely
to show up later in the process.

My intuitive feeling is that the time is better spent doing just one sort to
provide a general guide and then iterate possible solutions as fast as
possible.

But I've not really studied all the possible heuristics. I just picked this
one out of the air. Having watched it inaction, I'm wondering if I should
add a heuristic to also pick the order of digits to check, running from
least to most frequent in the puzzle. But I dunno if that will do anything.

>
> No sure about the logic here. If solve() returns here, it means that
> the puzzle is not solved ? And therefore the tags need to be reset ?
> And then go on and try the next digit for that cell ?


The puzzle may have already been solved (the recursion reached maximum
depth) and the algorithm is now looking for additional solutions (if any).
My understanding is that puzzles need at least 17 starting values to
guarantee a unique solution.

Most often, though, a returning recursion means the algorithm ran into a
dead end at a deeper level (shallower recursion levels are already trying
all nine digits in the cell's row, column and box, so it's not possible to
continue at this point) and the shallower level returned to now searches for
the next possible digit it can place in the unknown cell. It wipes out the
last digit it tried and searches for a new one.

If it can't find one it too returns to a shallower level, until eventually a
recursion level finds a new digit it can try and starts the recursion deeper
again, OR the very shallowest (first) recursion level runs out of digits to
check (at which point we are done).

> I found that adding simple logic to fill in easy solutions dramatically
> speeds up the process. On the other hand. speed is not necessarily the
> main goal when using awk. Although tawk is used for speed, apparently.


TAWK is much faster, yes, but I also find it easier to use in the first
place.

I haven't investigated simple logic, but I do note that if any cell is
surrounded by eight of the nine digits, it gets put at the head of the
sequence list and is very quickly dealt with. So it didn't seem worth a
special check.

- Anton Treuenfels


andreasplesch@gmail.com

2006-09-07, 6:56 pm

Hello,

Anton Treuenfels wrote:
> My intuitive feeling is that the time is better spent doing just one sort to
> provide a general guide and then iterate possible solutions as fast as
> possible.


That is my unfounded feeling as well.

> ...
> If it can't find one it too returns to a shallower level, until eventually a
> recursion level finds a new digit it can try and starts the recursion deeper
> again, OR the very shallowest (first) recursion level runs out of digits to
> check (at which point we are done).


Thanks.

>
> TAWK is much faster, yes, but I also find it easier to use in the first
> place.
>
> I haven't investigated simple logic, but I do note that if any cell is
> surrounded by eight of the nine digits, it gets put at the head of the
> sequence list and is very quickly dealt with. So it didn't seem worth a
> special check.


And the "naked single" would be picked within the recursion as the only
solution to try as well. From hand solving I found that many puzzles
let you add "hidden singles", so I decided to implement that logic.
For me it had the additional advantage that it adds solution "away"
from newly solved cells. These tend to affect more unsolved cells than
a solution of the next empty cell in the row by row sequence that I
have.

Andreas

Michael Jaritz

2006-09-30, 6:57 pm

andreasplesch@gmail.com schrieb:

Hello,
I'm back from holiday.

[color=darkred]
[never true]
[color=darkred]
>Oh, this made me check again and I found that my update of this section
>also included a small but important change above in the candidate
>finding loop, in the following line:
>
> # count non-occurrences of digits 1-9 in combined row, column and
>block, per row, column and block, and flag cell/digit as candidate
> if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c, d]++;
>nb[bi, d]++; ca[a, d] = d}
>
>should be
>
> # count non-occurrences of digits 1-9 in combined row, column and
>block, per row, column and block,\
>and flag cell/digit as candidate
> if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c, d]++;
>nb[bi, d]++; ca[c, r, d] = bi}
>
>This change just stores directly the column and row of the candidate,
>instead of the position in the string (a).


This change makes it a very fast solver.

In <1156978078.368666.122180@p79g2000cwp.googlegroups.com> you asked
"Other awk solvers out there?"

http://zielgra.de/temp/awk/yass.awk searchs not for one solution but for
all solutions.

Example:
Input is http://zielgra.de/temp/awk/puzzles1.txt
Output is http://zielgra.de/temp/awk/yass_solutions.txt

Try http://zielgra.de/temp/awk/puzzles2.txt if you have too much time.
There are thousands of solutions.

Michael
Jürgen Kahrs

2006-09-30, 6:57 pm

Michael Jaritz wrote:

> Try http://zielgra.de/temp/awk/puzzles2.txt if you have too much time.
> There are thousands of solutions.


I let it run for more than 10 CPU minutes
(on my AMD Sempron 2800+) and it didnt
produce a result. Results for the other
test cases come up instantly.

Thanks again for posting these.
Michael Jaritz

2006-09-30, 6:57 pm

Jürgen Kahrs schrieb:

>Michael Jaritz wrote:
>

^^^^^^^^^^^^^^^^^^^^^^^^^^[color=darkred
]
>
>I let it run for more than 10 CPU minutes
>(on my AMD Sempron 2800+) and it didnt
>produce a result. Results for the other
>test cases come up instantly.


You have to have patience with it. On my old and slow machine (Intel
Pentium IIIE, 700 MHz) it runs 11 hours.
The Puzzles in http://zielgra.de/temp/awk/puzzles2.txt have 20355, 34525
and 333951 solutions (if the script is right).

Michael
Steve Calfee

2006-09-30, 9:58 pm

On 5 Sep 2006 12:03:27 -0700, andreasplesch@gmail.com wrote:
>Oh, this made me check again and I found that my update of this section
>also included a small but important change above in the candidate
>finding loop, in the following line:
>
> # count non-occurrences of digits 1-9 in combined row, column and
>block, per row, column and block, and flag cell/digit as candidate
> if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c, d]++;
>nb[bi, d]++; ca[a, d] = d}
>
>should be
>
> # count non-occurrences of digits 1-9 in combined row, column and
>block, per row, column and block,\
>and flag cell/digit as candidate
> if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c, d]++;
>nb[bi, d]++; ca[c, r, d] = bi}
>
>This change just stores directly the column and row of the candidate,
>instead of the position in the string (a).
>
>



This is twice now you have said to make small changes in what was to
me confusing ways. Would you mind reposting your complete, up to date
solution?

Thanks Steve
There is no "x" in my email address.
andreasplesch@gmail.com

2006-10-30, 7:01 pm

Steve Calfee wrote:
> This is twice now you have said to make small changes in what was to
> me confusing ways. Would you mind reposting your complete, up to date
> solution?
>
> Thanks Steve
> There is no "x" in my email address.


Apologies, yes this became confusing. Here is the complete script:

BEGIN{
tt = systime()
}

{ st = systime();
plot2($1);
solve2($1);
plot2(sos);
print "Sudoku "NR",\t "NR/(systime()-tt+1e-10)" sudokus/s"
}

function solve2(ss, a, b, c, d, e, r, co, ro, bi, bl, nss){
i = 0;
# first, use some simple logic to fill grid as much as possible
do {i++
didit = 0; delete nr; delete nc; delete nb; delete ca;
for (a = 0; a< 81; a++) {
b = substr(ss, a+1, 1)
if (b == ".") {
# construct row, column and block at cell
c = a%9; r = int(a/9);
ro = substr(ss, r*9+1, 9);
co="";for (d=0; d<9; d++) {co = co substr(ss, d*9+c+1,
1)};
bi = int(c/3)*3+(int(r/3)*3)*9+1;
bl="";for (d=0; d<3; d++) {bl = bl substr(ss, bi+d*9,
3)};
e=0;
for (d=1; d<10; d++) {
# count non-occurrences of digits 1-9 in combined row, column and
block, per row, column and block,and flag cell/digit as candidate
if (index(ro co bl, d) == 0) {e++; nr[r, d]++; nc[c,
d]++;
nb[bi, d]++; ca[c, r, d] = bi}
}
# in case no candidate is available, give up
if (e == 0) {return 0}
}
}
# go through all cell/digit candidates
# hidden singles
for (crd in ca) {
# a candidate may have been deleted after the loop started
if (ca[crd]!=""){
split(crd, spl, SUBSEP)
c = spl[1];
r = spl[2];
d = spl[3];
bi = ca[crd];
a = c+r*9;
# unique solution if at least one non-occurrence counter is exactly 1
if ((nr[r, d] == 1) || (nc[c, d] == 1) || (nb[bi, d] == 1))
{
ss = substr(ss, 1, a) d substr(ss, a+2, length(ss));
didit = 1;
# we can now remove candidates for this digit from current row, column,
block
for (e=0;e<9;e++) {delete ca[c, e, d]; delete ca[e, r,
d];}
for (e=0;e<3;e++) {for (b=0;b<3;b++) {delete
ca[int(c/3)*3+b,

int(r/3)*3+e, d];}}
}}
}
} while (didit == 1)
# Second, pick a viable solution for the next empty cell and see if it
leads to a solution; trial and error
a = index(ss, ".")-1;
if (a == -1) {
# no more empty cells, done
sos = ss; return 1}
else {
c = a%9; r = int(a/9);
# concatenate current row, column and block
# row
co = substr(ss, r*9+1, 9);
# column
for (d=0; d<9; d++) {co = co substr(ss, d*9+c+1, 1)};
# block
c = int(c/3)*3+(int(r/3)*3)*9+1;
for (d=0; d<3; d++) {co = co substr(ss, c+d*9, 3)};
# get a viable digit
for (b = 1; b < 10; b++) {
# check if candidate digit already exists
if (index(co, b) == 0) {
# is viable, put in cell
nss = substr(ss, 1, a) b substr(ss, a+2, length(ss));
# try to solve
d = solve2(nss);
# if successful, return
if (d == 1) {return 1}
# if not, try next viable digit
}
}
# no digits viable, no solution
} return 0
}

function plot2(ss, ou, a, b, c, d){
print "| - - - + - - - + - - - |"
for (a = 0; a < 3; a++) {
for (b = 0; b < 3; b++) {
ou = "|";
for (c = 0; c < 3; c++) {
for (d = 0; d < 3; d++) {
ou = ou" "substr(ss, 1+d+3*c+9*b+27*a, 1);
}
ou=ou" |";
}
print ou;
} print "| - - - + - - - + - - - |"
}
}

Sponsored Links







Also available: Server administration forum archive | Web Design forum archive | Software forum archive | Hardware reviews archive

Copyright 2008 codecomments.com