Yesterday we started on a new challenge: building a tetrix-solving AI. Russell and Norvig give insight into how a rational agent can be structured using a state machine, a utility function, and a tree searching algorithm. We have the first two, and a failing test:
[info] Solver should
[info] + pick MoveLeft for s1
[info] x pick Drop for s3
[error] 'MoveLeft' is not equal to 'Drop' (AgentSpec.scala:32)
First we need to lay out the things we know, which are the possible moves and the corresponding state transition functions:
private[this] val possibleMoves: Seq[StageMessage] =
  Seq(MoveLeft, MoveRight, RotateCW, Tick, Drop)
private[this] def toTrans(message: StageMessage): GameState => GameState =
  message match {
    case MoveLeft  => moveLeft
    case MoveRight => moveRight
    case RotateCW  => rotateCW
    case Tick      => tick
    case Drop      => drop
  }
To implement “What if I do action A?”, use possibleMoves, toTrans, and the given state s0 to emulate the next state. We can then use the utility function to calculate the happiness of each resulting state and pick the move that maximizes it.
def bestMove(s0: GameState): StageMessage = {
  var retval: StageMessage = MoveLeft
  var current: Double = minUtility
  possibleMoves foreach { move =>
    // simulate the move on s0 and score the resulting state
    val u = utility(toTrans(move)(s0))
    if (u > current) {
      current = u
      retval = move
    } // if
  }
  retval
}
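For comparison, here is a minimal sketch of the same "simulate every move, keep the best" idea written without mutable variables, using maxBy from the standard collections. Everything in it is a toy stand-in so that it runs on its own: ToyState, the Move objects, and the made-up utility are assumptions for illustration, not the tetrix types. It also differs slightly from the loop above, since it has no MoveLeft default or minUtility floor.

```scala
// Toy stand-ins so the sketch runs on its own; the real GameState,
// StageMessage, and utility live in the tetrix code base.
object BestMoveSketch {
  sealed trait Move
  case object MoveLeft extends Move
  case object MoveRight extends Move
  case object Drop extends Move

  // Hypothetical state: horizontal position of the piece and a dropped flag.
  final case class ToyState(x: Int, dropped: Boolean)

  val possibleMoves: Seq[Move] = Seq(MoveLeft, MoveRight, Drop)

  // Maps each move to its state transition function, like toTrans.
  def toTrans(move: Move): ToyState => ToyState = move match {
    case MoveLeft  => s => s.copy(x = s.x - 1)
    case MoveRight => s => s.copy(x = s.x + 1)
    case Drop      => s => s.copy(dropped = true)
  }

  // Made-up utility: reward a dropped piece, penalize distance from column 0.
  def utility(s: ToyState): Double =
    (if (s.dropped) 10.0 else 0.0) - math.abs(s.x)

  // Simulate every move from s0 and keep the one with the highest utility.
  def bestMove(s0: ToyState): Move =
    possibleMoves.maxBy(move => utility(toTrans(move)(s0)))
}
```

With these made-up numbers, BestMoveSketch.bestMove(ToyState(0, false)) picks Drop, because dropping is the only move that earns the bonus. The var-based bestMove above remains the version used in the rest of this post.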
The bestMove implementation looks imperative, but that’s fine as long as the mutation stays within the method. We now have the first version of the solver. To prevent the agent from cheating, we need to create a GameMasterActor, which issues a BestMove(s) message to the agent actor:
sealed trait AgentMessage
case class BestMove(s: GameState) extends AgentMessage
Here are the actor implementations:
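// Imports assume Akka 2.1+ on Scala 2.10+; older Akka versions kept
// Await and the duration DSL under akka.dispatch and akka.util.
import akka.actor.{ Actor, ActorRef }
import akka.pattern.ask
import scala.concurrent.Await
import scala.concurrent.duration._

class AgentActor(stageActor: ActorRef) extends Actor {
  private[this] val agent = new Agent
  def receive = {
    case BestMove(s: GameState) =>
      val message = agent.bestMove(s)
      println("selected " + message)
      // forward the chosen move straight to the stage
      stageActor ! message
  }
}
class GameMasterActor(stateActor: ActorRef, agentActor: ActorRef) extends Actor {
  def receive = {
    case Tick =>
      val s = getState
      if (s.status != GameOver) {
        // reuse the state we just fetched instead of asking again
        agentActor ! BestMove(s)
      }
  }
  // blocks for up to one second while waiting for the current state
  private[this] def getState: GameState = {
    val future = (stateActor ? GetState)(1 second).mapTo[GameState]
    Await.result(future, 1 second)
  }
}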
This is surprisingly simple yet powerful. Since the whole point of calculating the best move is to make the move, the agent actor can send it out to the stageActor directly. Let’s hook these up:
private[this] val system = ActorSystem("TetrixSystem")
private[this] val stateActor = system.actorOf(Props(new StateActor(
  initialState)), name = "stateActor")
private[this] val playerActor = system.actorOf(Props(new StageActor(
  stateActor)), name = "playerActor")
private[this] val agentActor = system.actorOf(Props(new AgentActor(
  playerActor)), name = "agentActor")
private[this] val masterActor = system.actorOf(Props(new GameMasterActor(
  stateActor, agentActor)), name = "masterActor")
private[this] val tickTimer = system.scheduler.schedule(
  0 millisecond, 700 millisecond, playerActor, Tick)
private[this] val masterTickTimer = system.scheduler.schedule(
  0 millisecond, 700 millisecond, masterActor, Tick)