Scripts/AlphaBeta

From J Wiki
Jump to: navigation, search

Download old (pre J6.01) version of this script: File:Alphabeta.ijs Download script: alphabeta.ijs

NB. definitions used here
default =: [ , (}.~ #)~
NB. alpha-beta minimization

NB. This is a base class for alpha-beta minimization.  Classes
NB. that extend this class are expected to provide the following
NB. functions:
NB.
NB. suggestmoves
NB. y is position,player;sortflag
NB.  if sortflag is 1, the moves should returned best move first
NB. Result is (list of newposn;newplayer;move) ,&< (state info for next call)
NB. The monadic case is the first call; after that the dyad is called,
NB.  with x set to the state info returned by the previous call
NB. Each call returns as many moves as can be gathered quickly.
NB.
NB. evalposn
NB. y is position,<player
NB. Result is best guess at the value of the game from the point
NB. of view of that player (_1 = dead loss, 1 = forced win)
NB.
NB. checkterminal
NB. y is position,<player
NB. Result is 0 if position is not terminal, or
NB. 1;value is position is terminal
NB.
NB. and optionally:
NB.
NB. canonposnplayer
NB. y is (boxed) posnplayer
NB. result is (boxed) canonical posnplayer

coclass 'alphabeta'

NB. NOTE: in this file, 'posn' and 'move' are always boxed
NB. posnplayer is posn,<player

NB. Dummy canonposn in case none specified by user
NB. y is posn,<player
NB. result is canonical (posn,<player) ; (uncanoninst) ; (symmetries)
NB.  uncanoninst will be passed in to uncanonmove
NB.  symmetries will be passed in to suggestmoves
canonposnplayer =: ]"1

NB. Dummy posnplayertochar in case none specified by user
NB. Convert position to string, prefixed by player#
posnplayertochar =: 3 : 0"1
posn =. 0{::y
(": 1{:: y) , 5!:5 <'posn'
)

NB. x is minimum allocation for table
NB. y is mask of table entries to keep
NB. We reinit the table, and reinstall the symbols and values that
NB. were indicated to be kept
purgexposition =: 4 : 0
NB. Save the symbol names we want to keep
savesyms =. 5 s: y # xpositioninspt {. xpositionsyms
NB. Reinit the symbol table
10 s: xpositionreinitval
NB. Reallocate the tables
xpositionsyms =: x ((>. #) {. ]) s: savesyms
xpositiondata =: x ((>. #) {. ]) y # xpositioninspt {. xpositiondata
xpositioninspt =: #savesyms
NILRET
)

NB. Called to allocate xposition table.  y is size of initial allocation
initxposition =: 3 : 0
if. ifdefined 'xpositionreinitval' do. 10 s: xpositionreinitval end.
xpositionreinitval =: 0 s: 10
xpositionsyms =: y # s: <''
xpositiondata =: y # <''
xpositioninspt =: 0
NILRET
)

NB. y is list of (boxed position),player#[;other stuff]
NB. Convert the posnplayers to canonical form; then look them up
NB. to see which ones have been encountered before.  Return
NB. (old positions),&<(new positions)
NB. where each old position is (evaluation parms);posn;player;(other stuff)
NB. and each new position is handle;posn;player;(other stuff)
NB.
NB. Keep the records in order, since the best one comes first
checkxposition =: 3 : 0
NB. Convert positions to canonical form
canonpos =. canonposnplayer@(2&{.)"1 y
otherstuff =. 2 }."1 y
NB. Get a symbol for the posnplayers
sym =. s: charpp =. <@posnplayertochar canonpos
NB. Discard duplicates - but if the character value is '', it means we are
NB. not bothering with saving a symbol for this value, so don't let those values get
NB. treated as equivalent
nubmsk =. (~: sym) +. charpp = <''
sym =. nubmsk # sym
nuby =. canonpos ,.&(nubmsk&#) otherstuff
NB. Look up the symbols, see which are old
oldsym =. xpositioninspt > m =. (xpositioninspt {. xpositionsyms) i. sym
NB. Build the return
(((oldsym # m) { xpositiondata) ,. oldsym # nuby) ,&< sym ;"_1&((-. oldsym)&#) nuby
)

NB. x is (unboxed) record to save, y is handle returned by 'checkxposition'
NB. No result; the table is modified
addxposition =: 4 : 0
NB. Reallocate table if it is full
if. xpositioninspt = #xpositionsyms do.
  xpositionsyms =: ({.~ >.@(1.25&*)@#) xpositionsyms
  xpositiondata =: (#xpositionsyms) {. xpositiondata
end.
NB. Install the item at the next point in the table
xpositionsyms =: y xpositioninspt} xpositionsyms
xpositiondata =: (<x) xpositioninspt} xpositiondata
xpositioninspt =: >: xpositioninspt
NILRET
)


NB.*v-- findmove find one move of the game
NB. x is anything
NB. y is position;player#;max depth allowed
NB. Result is new position;player;max depth;game value;move that was made
findmove =: 4 : 0
movect   =: 0
valmove =. x searchtree y,_2;2;1;1
NB. If there is no move, the starting position was terminal: return it
if. 4 > #valmove do. y return. end.
NB. The move is found.  Purge the saved list of positions of
NB. everything that wasn't terminal.  We assume that the second
NB. element in each data box is the exact-evaluation flag
512 purgexposition 1&{@> xpositioninspt {. xpositiondata
qprintf 'movect '
(1 2{valmove),(2{y),(0 3{valmove)
)

NB.*v-- searchtree  alpha-beta algorithm to search game tree
NB. run alpha-beta minimization
NB. x is anything, passed along to the other functions
NB. y is position;player#;max depth allowed;alpha;beta;posn type[;fullreturn]
NB. player# is 0 or 1
NB. alpha is the negative of the best value that the moving player has been able to
NB.  guarantee so far by taking a branch of the tree that does not
NB.  lead to the parent of the move being considered
NB. beta is the best value that the moving player has been able to
NB.  guarantee by choosing among alternative moves at the current level
NB.  (i. e. it is the negative of the best value his opponent is able to get)
NB. posn type is 1 if all moves are first options (i. e. predicted best);
NB.  alternating between 2 and 3 (starting with 2) once a second choice
NB.  is tried.  Type 3 positions do not need to worry about order of successors.
NB.
NB. result is (game value,terminal flag)[;posn;player;move]
NB. the game value is a value between _1 and 1, always represented
NB.  from the point of view of the player making the move (_1 is a
NB.  dead loss, 1 a forced win).  Terminal flag is 1 if we know this
NB.  is a correct answer
NB. the move describes the move we selected.  the move may not be part of the
NB.  returned value if fullreturn is not 1 (default 0)
NB.
NB. start by calling with alpha=_1,beta=1
searchtree =: 4 : 0
movect   =: >: movect
posnplayer =. 2 {. y
'depth alpha beta postype fullreturn' =. 2 3 4 5 6 { y default '';0;0;_1;1;1;0
NB. If the position is terminal, return the value with no move, and terminal flag set
if. >{. r =. checkterminal posnplayer do.  ,<>|.r return. end.
NB. If depth is 0, we have to stop searching.  Force an evaluation and return
NB. whatever is available, with no move
if. depth = 0 do. ,<(evalposn posnplayer),0 return. end.
NB. initialize state that will be supplied to the next level
newdepth =. <: depth
NB. Successors of types 2 and 3 alternate; the first successor to type
NB. 1 is type 1, the others are type 2
newpostype =. postype { 0 1 3 2
NB. Init to 'no best move found yet'
bestmove =. 0$a:
NB. Start off indicating exact evaluation
exacteval =. 1
NB. Get a list of moves, initializing the move-finder.  This returns
NB. a few moves
'posmovelist nextstate' =. suggestmoves posnplayer,<(postype ~: 3)
NB. Now process all the moves, until we have done them all or we have found
NB. a refutation of the move that got us here
while. #posmovelist do.
  NB. Convert all the positions to canonical form
  NB. Establish a handle for each position, and see if it has been evaluated already
  NB. Each posnplayer gets a flag meaning 'previously evaluated', and either a
  NB. symbol or the results of the evaluation
  'oldposns newposns' =. checkxposition posmovelist
  NB. Looking at all the positions that have been evaluated, get the best result and
  NB. use it.  oldposns is list of evaluation ; move
  if. #prevevals =. 0 ocol oldposns do.
    NB. if all the evaluations were exact, remember that
    exacteval =. exacteval *. *./ 1 {"1 prevevals
    if. (-moveval =. <./ mv =. 0 {"1 prevevals) > alpha do.
      bestmove =. (<(mv i.!.0 moveval);1 2 3) { oldposns
      alpha =. -moveval
    end.
    if. alpha >: beta do. break. end.  NB. refutation!
  end.
  NB. Now consider the positions that have not already been evaluated.
  NB. newposns is handle;posn;player;move
  NB. For each yet-to-be-evaluated position, evaluate it, save the result,
  NB. and stop if we have hit a refutation
  for_p. newposns do.
    'moveval exact' =. >{. x searchtree (1 2{p),newdepth;(-beta);(-alpha);newpostype
    NB. Remember the value for this position so we don't have to look at it again.
    NB. But if the symbol is the empty symbol, we don't bother saving
    (moveval,exact) addxposition^:(~:/@(2&{.)@]) 0 {:: p
    exacteval =. exacteval *. exact
    if. (-moveval) > alpha do.
      alpha =. -moveval
      bestmove =. 1 2 3 { p
    end.
    if. alpha >: beta do. break. end.  NB. refutation!
    NB. After the first move, successors of type-1 positions become type 2
    newpostype =. 2 >. newpostype
  end.
  if. alpha >: beta do. break. end.
  NB. Get more moves, if there are any
  'posmovelist nextstate' =. nextstate suggestmoves posnplayer,<(postype ~: 3)
end.
NB. If a full return was called for, return the selected move
bestmove ,~^:fullreturn ,<alpha,exacteval *. alpha < beta
)


Contributed by Henry Rich


CategoryGames CategoryLiterate