Little Golem

Ingo Althofer at 2010-10-24

How far are we/is Castro still away from perfect play?

Ingo.

Castro_bot at 2010-10-24

I have solved 3 of the 6 openings on size 4. The other 3 need algorithmic (smarter pruning for a smaller branching factor) or engineering (better memory management) improvements before they are doable. That said, Castro doesn't use the solutions in its play yet. I haven't had enough time to build those proofs into an opening book, or even to use any opening book. It can solve some fairly early positions in a couple seconds during normal gameplay. If I was to build an opening book, it wouldn't take long to build an opening book big enough that it would go straight from opening book into end game solving.

Dvd Avins at 2010-10-24

The number of distinct legal positions for an unfinished Havannah-4 game should be not that large, right? 10^{4 maybe, but I'm taking an outsider's geuss that it's not 10}5. Am I way off?

Castro_bot at 2010-10-24

It is much much bigger than 10^{5. I'd guess closer to 10}12. A rough upper bound is 3^{37/12, which is 10}16, but this includes many impossible positions.

If it was only 10^5, I'd solve the whole thing in about 3 seconds, but I've so far spent well over 100 cpu hours solving the 3 that I have solved.

Ingo Althofer at 2010-10-24

Hi Timo,

thanks for the information.

In my opinion it is not necessary to incorporate a

perfect size-4 player in Castro - there are even

some (human) players out there who would not like

you to do this.

But … for the understanding of the game it would be

nice when you published a size-4 solution in some thesis

(your Ph.D. ?!) or some other paper.

By the way, are you a member of the Games Group at

University of Alberta?

And one more question from a person who is always hungry

for new insights:

How would you rate the complexity of Havannah at size 5?

Is there a chance that humans will see “the” solution for it?

Ingo.

Mirko Rahn at 2010-10-24

@Timo: In which form do you have the solutions? Do you have a dump of the complete trees? Are the corners first player wins (assuming no swap rule)?

@Ingo: Isn't size 5 just (61/37)^(61-37) \approx 160000 times harder?

Ingo Althofer at 2010-10-24

Hi Mirko,

> @Ingo: Isn't size 5 just (61/37)^(61-37) \approx 160000 times harder?

Not necessarily, when you find some “solution tree” that does not cover

the whole board.

By the way: There are several board-sizes in between 4 and 5,

for instance 4-4-5-4-4-5 (lengths of the board sides in clockwise order).

Ingo.

christian freeling at 2010-10-24

“In my opinion it is not necessary to incorporate a perfect size-4 player in Castro - there are even some (human) players out there who would not like you to do this.“

I can neither imagine why one would implement it, nor what objections one would have.

Castro_bot at 2010-10-24

My simple upper bound has many impossible cases where the number of moves is vastly uneven, such as all white but no black. A friend of mine did a little more complex analysis of the complexity to only include board positions where the number of pieces is almost in balance, and found that there are about 6*10^{15 unique positions. This doesn't take the cases where someone has already won into account, and many of them are just stupid positions, but this suggests my estimate of 10}12 is likely pretty close. A complete proof tree may be much smaller, like 10^{8 or so, but for now my move ordering isn't very accurate, so I won't manage to get a minimal proof tree, so will use more like 10}10 or 10^11 states in my proof.

There are several games groups at the University of Alberta, the Go group, the Hex group, the Poker group, etc. I talk to all of them, but I'm not formally part of any of them.

I'm hoping to solve the complete size 4 game by the end of the year, and will publish a paper once I do. I'd likely also put up a player that plays size 4 perfectly, even if it isn't this one. One reason why someone may want me to post it is to make sure my solution is in fact correct. For the moment, I don't actually have the proof tree saved, so will need to re-solve everything I've done so far, but once I improve my solver a bit, that won't be much extra work.

Yes, the corner is a first player win (assuming no swap), and is the easiest to solve (currently takes me about 30 cpu hours). B3 is also a win, but takes about 5-10 times more work to solve.

pgadey at 2010-10-24

This is a little OT, but I'm very excited that this kind of research is being done. I look forward to flipping through the paper once it gets out there.

Mirko Rahn at 2010-10-25

So the actual state of havannah up to size 4 is the following:

win for first player

win for second player

a field, symmetric to some F or S

size 1: there is only one field, no interior, all corner

The outcome is not fully clear (to me). Maybe some think it is a draw since there is no winning figure. I tend to see this board as a projection of some larger board. Then it is clearly a first player win.

size 2: with interior, all corner, no border

   .
 .   .
   S
 F   .
   .

size 3: first size with corner and border

     .
   .   .
 .   .   .
   .   .
 S   S   .
   S   .
 F   .   .
   .   .
     .

size 4: first size, where the corner is not the only first player winning move

       .
     .   .
   .   .   .
 .   .   .   .
   .   .   .
 .   .   .   .
   F   ?   .
 ?   ?   .   .
   ?   .   .
 F   .   .   .
   .   .   .
     .   .
       .

Eager to get more.

Castro_bot at 2010-12-14

Ok, so I've got new results!

a1 is the easiest to prove, and is a first player win

b3 is next at about 4 times the difficulty, and also first player win

b2 is next at another 4 times difficulty, again first player win

a2 is about double the difficulty of b2, also a first player win

Those are definitely solved, and I've got the proof trees for them (totalling ~100gb)

c3 and d4 are likely to be a draw! I've solved about 25 million states each, with 80% of c3 being draws and 99% of d4 being draws.

        1 2 3 4
     A F F . . 5
    B . F F . . 6
   C . . D . . . 7
  D . . . D . . .
   E . . . . . .
    F . . . . .
     G . . . .

Ingo Althofer at 2010-12-14

Hi Timo,

fantastic news. Thanks for sharing them with us!

Concerning the first moves that lead to a draw,

can you give (later) the list of all replies (and

rereplies) which lead to the draw?

Congrats, Ingo.

MarleysGhost at 2010-12-14

> 100 GB

Holy cow. Seems like a lot.

If there's an initial move that leads to a draw, then, unlike in Hex, the swap rule is sufficient to compensate for the first-move advantage.

Ingo Althofer at 2010-12-14

Marleys Cow wrote:

> If there's an initial move that leads to a draw, then, unlike

> in Hex, the swap rule is sufficient to compensate for the

> first-move advantage.

And our hope should be that there is a more or less

complicated mesh of drawing lines for this starter.

Ingo.

PS: No pun intended with the “cow”.

christian freeling at 2010-12-14

_“And our hope should be that there is a more or less

complicated mesh of drawing lines for this starter.“_

That would surprise me, I'd expect a draw to be a thin line rather than a spectre.

Mirko Rahn at 2010-12-14

Hmm, cannot believe in the draws. How many occured on LG? I'm aware of a handful just.

Will you finally make the dumped proofs available together with an explanation of the format? I would love to get a copy. I would send you and USB mass storage device to put it on.

Ingo Althofer at 2010-12-14

> Will you finally make the dumped proofs available together with an

> explanation of the format?

The ICGA Journal would be an ideal place for publication

of the solution in an appropriate “format”.

Ingo.

Castro_bot at 2010-12-14

Unfortunately right now, I can't give you the principle variation that leads to a draw. Castro crashed in the attempt, so I'd need to restart the solver and wait a couple days. I'm hoping to implement some early draw detection so I can at least stop once neither player can win even if the two work together. Right now it has to look at all permutations of those moves (ie build the complete proof tree) to prove a draw, which is very wasteful.

Right now the format is simply a text file of solved positions where each line has a board hash, the outcome and the approximate amount of work it took to solve it. Since the board hash is a zobrist hash, I can't create a position from it, but since it is deterministic, I can use that hash to decide on a move if I get back to the position. It is also on the universities compute cluster, so I can't easily analyze it offline. I should also mention that I'm only saving states that take more than 1000 work to solve, which means it would take about 50ms to resolve each of the leaf nodes in my proof tree.

Of the >100,000 games on size 4 that I've logged, about 5% are draws. I believe it is so rare here for 2 reasons: it's rare for any player to start in a position that is actually a draw, and no one here is actually a very strong player.

Castro_bot at 2010-12-20

I added some better draw detection, so it stops searching as soon as no wins are possible. Once one side can no longer win, it'll backup a draw instead of proving the remaining moves too. For the draw on size 5 in the other thread, it can find the draw fairly quickly 15 moves early and can prove one side can't win even earlier than that.

C3 no longer thinks it'll be a draw, but is instead a win, like the others. D4 isn't as far along, so is less certain, but is also unlikely to be a draw. It's down to about 45% winrate on all moves now, whereas previously it stayed near 49% for a very long time.

Sorry for the false hope of a draw earlier.

Ingo Althofer at 2010-12-21

Hi Timo, thanks for th update.

> Sorry for the false hope of a draw earlier.

Hmm. But there should be a reason for the observations

that led you to the draw conjecture… Strange.

Meditating, Ingo.

Castro_bot at 2011-01-14

I've got some new news. It turns out that c3 is indeed a win for the first player. d4 (center) has 4 replies that are first player wins, but I haven't solved the 5th reply yet. There's a very real chance that d4 will in fact be a draw or loss though. What had looked like the most promising move, turned out to be a loss. The move in question is d4, a1, a4, with d1 being a winning move for player 2.

Does anyone care to venture a guess as to what the best reply to d4, a1 is (since it's not a4 or d1 by symmetry)?

Peter Koning at 2011-01-14

I think c2 works

Castro_bot at 2011-01-18

c2/b3 is also a loss (ie 2nd player win). It's looking like d4 is going to be a second player win.

Castro_bot at 2011-01-25

Well, it turns out that d4 is also a first player win. The beginning of the PV for d4 is: d4 a1 g7 a4 b3. This means all openings are in fact first player wins.

Ingo Althofer at 2011-01-25

Hello Timo,

> Well, it turns out that d4 is also a first

> player win. The beginning of the PV for d4

> is: d4 a1 g7 a4 b3. This means all openings

> are in fact first player wins.

Thanks for sharing your findings with us -

and congratulations for solving Havannah-4.

Do you have decided already where to publish the

findings formally? (One natural candidate would be the

ICGA Journal.)

It would be nice when you could give (for instance in

that publication) a “small” starting tree for the best

lines.

Cheers, Ingo.

Castro_bot at 2011-01-25

I'm hoping to finish writing it up quickly enough to publish in AAAI, though that is due fairly soon, so I may not have time for that.

Clearly the tree grows very fast, so I'm not sure how much of the tree I can give. Is there some criteria for the moves you'd be interested in?

Ingo Althofer at 2011-01-25

Hello Timo,

> Clearly the tree grows very fast, so I'm not sure how

> much of the tree I can give. Is there some criteria

> for the moves you'd be interested in?

Of course, you should not give some full-width portion

of the tree. Instead, for each non-leaf-position in the

tree the “interesting” moves should be included.

“Interesting-ness” may be defined in several ways:

\* Those moves, proposed by the top Havannah bots (without opening books)

\* Those moves, proposed by top human players

\* Those moves where you program had the hardest time to refute them

(measured for instance in computing resources)

An interactive approach might run as follows:

You design such a (preliminary) tree and show it, and we

here in the forum of Little Golem make proposals in which

lines to go into more depth, or where to delete parts

that look too elaborated.

Of course, it would also be completely fine, when you

simply present your tree (for instance in your AAAI paper) -

and give others the chance to make secondary publications

by “filling gaps”, i.e. by giving more detailed lines in

some part of your tree.

Presenting a tree would also make it more likely that soon

someone else finds the energy to try a second proof. Such

a guy might start by trying to find holes in your analysis,

jumping from branch to branch in your tree. And at some point

he has put so much energy in his attempt that he decides to

give a complete (second) proof.

Ingo.

PS: Many years ago (in 1993) Ralph Mueller from ETH Zuerich

informed the public that he had solved the game “nine men's

morris”. It took several years, until Peter Stahlhacke gave

a second - independent - complete analysis.

Castro_bot at 2011-09-21

After a fair amount of inactivity, I'd like to finally post a proof for the a1 opening. Reconstructing the proof from the logs I generated has been painful and I'm not quite sure why, so I finally gave up and added code to output the proof directly out of the solver, but unfortunately going this direction this means having to resolve everything, throwing away months worth of work, but I wanted to prove it could be done anyway. I first tested this on the a1 opening simply because it is the quickest, and now have a proof tree from a1 in varying sizes. The cutoffs are how many simulations were needed to solve the node. a1-1000.hgf is biggest because it includes every state that is part of the proof tree that took more than 1000 simulations to solve, while a1-1000000.hgf is much smaller because it only includes nodes that took more than 1000000 simulations to solve. Nodes that took more than the cutoff to solve but that aren't part of the proof tree aren't included.

http://dropbox.ewalds.ca/a1-100000000.hgf - 30kb

http://dropbox.ewalds.ca/a1-10000000.hgf - 253kb

http://dropbox.ewalds.ca/a1-1000000.hgf - 2.5Mb

http://dropbox.ewalds.ca/a1.tar.bzip2 - 84Mb but expands to about 1.5Gb and includes a1-1000 to a1-100000000 in powers of 10.

Sorry, since this was a first attempt at running this code I had a bug that reversed the reported colours, so it claims black made the a1 move. This will be fixed in future proof trees.

Right now each hgf node includes the outcome (always 1 because it is only a proof tree), the best move from that position (the winning move or the hardest loss to prove), and the number of simulations used to prove the node. Is there any other information you'd like to see in the proof tree before I start running the rest of them?

Ingo Althofer at 2011-09-21

Hello Timo,

thanks for your work.

I downloaded the 30kb file and opened it with a normal editor.

I do not really understand what the meaning of the data there is.

Could you give a more detailed explanation?

Especially, could you explan what the first 20+ lines mean?

\\\* I have copied them in here ***

(;B[a1]C[mcts:1,best:g7,sims:44253884087]

(;W[a2]C[mcts:1,best:b2,sims:772614999]

(;B[b2]C[mcts:1,best:g7,sims:772535386]

(;W[c2]C[mcts:1,best:g7,sims:221003080]

(;B[g7]C[mcts:1,best:g6,sims:220763658])

)

(;W[a4]C[mcts:1,best:d2,sims:8578098618]

(;B[d2]C[mcts:1,best:g7,sims:8564063375]

(;W[b1]C[mcts:1,best:b2,sims:922155579]

(;B[b2]C[mcts:1,best:g7,sims:921953471]

(;W[d1]C[mcts:1,best:d7,sims:784525947]

(;B[d7]C[mcts:1,best:g7,sims:783839709]

(;W[e2]C[mcts:1,best:e3,sims:111906914]

(;B[e3]C[mcts:1,best:g7,sims:111837539])

)

(;W[g4]C[mcts:1,best:e2,sims:202194016]

(;B[e2]C[mcts:1,best:g7,sims:201878699])

)

(;W[g5]C[mcts:1,best:c3,sims:130854972]

(;B[c3]C[mcts:1,best:g7,sims:130197649])

)

(;W[b4]C[mcts:1,best:g4,sims:1273494331]

(;B[g4]C[mcts:1,best:g7,sims:1273415092]

(;W[d1]C[mcts:1,best:c2,sims:156261450]

(;B[c2]C[mcts:1,best:g7,sims:156159154]

(;W[d7]C[mcts:1,best:c5,sims:137539615]

(;B[c5]C[mcts:1,best:g7,sims:137537157])

)

(;W[d7]C[mcts:1,best:c5,sims:1094628661]

\\\\\\\\\\\\\\\\\\\\\\\\\*+

Thanks in advance,

Ingo.

Castro_bot at 2011-09-21

The file is in the sgf/hgf format described in detail here: http://www.red-bean.com/sgf/ . It is the same format that havannahgui loads and that little golem generates. For example for the game http://www.littlegolem.net/jsp/game/game.jsp?gid=1368747 the hgf is http://www.littlegolem.net/servlet/sgf/1368747/game.hgf .

The file is a tree of moves, where the indentation in spaces indicates the depth. So this excerpt shows a couple strong lines of play:

a1 a2 b2 c2 g7

a1 a4 d2 b1 b2 d1 d7 e2 e3

a1 a4 d2 b1 b2 d1 d7 g4 e2

a1 a4 d2 b1 b2 d1 d7 g5 c3

a1 a4 d2 b4 g4 d1 c2 d7 c5

Ingo Althofer at 2011-09-21

Thanks for the explanations.

By the way: Will you and your bot be in Tilburg?

Ingo.

Castro_bot at 2011-09-21

I certainly plan to go to Tilburg, though I haven't registered for anything yet.

Any comments on which information you want in the proof trees?

Ingo Althofer at 2011-09-21

> … each hgf node includes the outcome (always 1 because it is only a proof

> tree), the best move from that position (the winning move or the hardest

> loss to prove),

My wish: COuld you include not only the hardest loss move,

but the two or three hardest loss moves? That would give a

lot more information.

Ingo (see you in Tilburg)

Castro_bot at 2011-09-21

That information would implicitly be in the bigger proof trees.

Ingo Althofer at 2011-09-21

> That information would implicitly be in the bigger proof trees.

Yes, I am sure that it is “somewhere”. For me it would be definitely

useful to have this little extra piece just in the node.

Castro_bot at 2011-09-28

I just noticed that on size 4: c3 a1 g7 b5 f4 is a proven draw. Unfortunately I can't post the pv without re-solving that since I'm only saving the proof tree, but can re-solve it if anyone is interested. There may be earlier draws, but this is the earliest one that I have noticed. It took 4.6 billion simulations to prove, and looked like a white win until very late, having an aggregate 78% winning rate.

Havannah size 4 Hex, Havannah

Reply to this topic