Minimax with alpha-beta pruning, Objective C, iPhone

Here’s a small tutorial on how to implement game algorithm minimax and its variant alpha-beta pruning in objective C. The tutorial focuses on finding the best possible move for a 2 player turn-based game and explains the search algorithm as well as the methods used by it. You can also find a few optimization tips, important if you’re developing a game for iPhone or iPod Touch.
“Minimax (sometimes minmax) is a decision rule used in decision theory, game theory, statistics and philosophy for minimizing the maximum possible loss” - Wikipedia (minimax).

This search algorithm is well documented on the Internet but it doesn’t hurt to see the approach with Objective C. First of all, I should mention that the algorithm is resource hungry unless optimized so bear this in mind when developing for the iPhone or your CPU player won’t be able to think more than 2-3 moves in advance. You’ll find more on optimization near the end of this article. Minimax is rarely used in its original form because of the resources problem and commonly a variation is used called alpha-beta pruning. This is the modification of minimax algorithm which reduces the number of nodes that are evaluated in the search tree by the minimax algorithm. Alpha-beta pruning completely eliminates some branches in the minimax search tree which allows it to perform much better than plain minimax. The examples are based on a simple 2 player turn based game but it can be used in other search/decision cases.

Let’s see the Objective C code for alpha-beta pruning method.

-(double)alphabeta:(id)state withDept:(int)depth prevAlpha:(double)alpha prevBeta:(double)beta {
    
    NSArray* validMoves = [Logic getValidMovesForState:state];
    
    if (depth == 0 || validMoves.count == 0 || state == END_GAME)
        return [self evaluateState:state];
    
    
    for(Move* m in validMoves) {
        id newState = [Logic makeMoveOnStateWithMove:m];
        
        double val = - (double) ([self alphabeta:newState withDept:depth-1 prevAlpha:-beta prevBeta:-alpha]);
        
        if (val > alpha)
            alpha = val;
        
        if (alpha >= beta) {
            break;  // prune branch.
        }
    }
    
    return alpha;
}


This code can’t be exactly used as-is, but with a few modifications you should be able to implement it for your own needs (in a two player game etc...). So, what needs to be modified? Let’s go through the algorithm for start to find out. We go through these steps:
1. Get the valid moves for the current state
2. Check if we’ve reached the depth limit, there are no valid moves or we’ve reached the end of the game. If so, we return the evaluation value for the current state. (More on state evaluation below).
3. For each possible move, we create new game states using that move and we recursively call alphabeta method with this newly created test state, decreased depth by 1 (we go 1 move ahead == 1 step deeper in algorithm) and we switch alpha and beta values as required by the algorithm.
4. We compare the returned result with alpha, and if we’ve found a greater value (better move) we replace the old alpha with this value.
5. We check if alpha is >= than beta - we prune the branch. We don’t go any further here.

So now, we clearly see that we must implement the “state” variable (replace “id” with whatever the class name you use), we need to have a logic to make the moves on the given state and some methods to determine the end of the game as well as a getter for valid moves etc... You should be able to implement your own method based on the tips I just gave you.




Finding the best move


Now, as you probably noticed this method returns a double as in value so how the hell is that supposed to tell me which move did the algorithm pick? We need one more step which actually precedes this alphabeta method. This one is simple:
1. Get all valid moves that can be played by the player whose turn it is.
2. For each move, create a test new state, make the move and call alphabeta to give you the double evaluation of that move.
3. Remember the move which has the greatest value returned by the alphabeta method (the best move).

So, in Objective C you’ll have something like this:

-(Move*) getBestMoveForState: (id)state {
    double alpha = -1000; //game lost constant value
    double beta = 1000; //game won const value 
    
    NSArray* validMoves = [Logic getValidMovesForState:state];
    Move* bestMove;
    double bestRank;
    
    for(Move* m in validMoves) {
        
        id newState = [Logic makeMoveOnStateWithMove:m];
        
        double rank = - [self alphabeta:newState withDept:kDepth prevAlpha:alpha prevBeta:beta];
        
        Move* move;
        
        if(bestMove == nil) {
            bestMove = move;
            bestRank = rank;
            continue;
        }
        
        if(rank > bestRank) {
            bestMove = move;
            bestRank = rank;
        }
    }
    
    [bestMove autorelease];
    return bestMove;
}


Now and there you have it. This method returns the best possible move your CPU player can make.



Evaluation function for alpha-beta algorithm


The evaluation function which we called in alphabeta method (evaluateState) is actually the most important part of your game logic. This is where you tell the minimax algorithm which state is good for our player (most obvious - a win) and which state is bad for our player (most obvious - a loss) and you also give the value for this outcome in terms of a number. So basically, you say a win is worth a 1.000 and a loss is - 1.000. What’s in between? Neither win nor loss. Since your alphabeta method is limited in depth, most of the times evaluation method will be called on a state which is not an end-game state where you know who won and who lost. So, you need to come up with a number which describes this middle-game state and says “hey, this state is good for my player” or vice versa. The better the state, the greater the value you’ll return and the other way around. If the state you’re examining is very, very bad for your player, near to the loss - you’ll return a number close to -1.000 (or whatever you chose to represent the loss).

Let’s take a look at the possible alphabeta state evaluation method (example):

-(double) examineState: (id)state {
    
    double score = 0;
    
    //if the game is finished, check who won
    if(state == END_GAME) {
        
        if([Logic playerWon])
            return 1000;
        else
            return -1000;
    }
    
    //the game is not finished, so calc in all possible state parameters you can think of
    //for this example we just calc in the number of strategic fields we took
    //and the number of fields taken over by our opponent (which we calc in in a negative way)
    
    int strategicFieldsCount = [Logic countStrategicFields];
    int takenByOpponent = [Logic countTakenByOpponent];
    
    score += strategicFieldsCount;
    score -= takenByOpponent;
    
    return score;
}


So, now our CPU player will choose the best move based on how many strategic fields it took over and how many fields are taken by its opponent. Adding more parameters here will change the way your CPU plays. This method is easily modified, for example, if you want some parameters to have greater value than the others you can refactor the numbers.

Say, for your game it’s more important to gain strategic fields than to lose other fields by the opponent. You can refactor the numbers like this:

score += strategicFieldsCount * 2.5;
score -= takenByOpponent * 0.5;

Constants 2.5 and 0.5 can be modified to suit your own game logic. In our case, we gave strategic fields count 2.5 times greater importance than before and we’ve decreased the importance of taken over fields by half.



Optimization


This is the tricky part. If you’re planning to use this algorithm you must take care of performance. Also an absolute must when developing for the iPhone or iPod Touch. Minimax, although already optimized with alpha-beta pruning, has room for more improvements and this all depends on the logic of your own code. An experienced programmer should be able to find the parts of the code to optimize, I can give you a few generic tips.

  • Be extra cautious with the retain count for objects. Forgetting to release an object can lead to disaster. You’re dealing with a recursive method and if you don’t release your objects correctly you can end up with having millions of leaked objects.
  • If possible, avoid autoreleasing objects inside alphabeta method. This recursive method is a performance hog and again, you can end up using your entire available memory until the objects get autoreleased. You won’t leak (providing you play correctly #1 above) but that wouldn’t matter since you would probably make a mess on your device (computer, whatever...).
  • Cache the state evaluations. If your evaluation method uses some extensive calculating, (and even if it doesn’t - remember that it can be called millions of times by the recursive alphabeta) many times this method will be called on the same state, the one you already evaluated for the other sequence of moves (you can end up with the same game state with different sequence of moves). So, instead of evaluating the same state again, cache the evaluations somewhere and reuse them when possible. If you run into the state which is already evaluated, you don’t need to evaluate it again but instead you just use the value you cached.
0 Comments