
Escape the Castle 

Introduction:

This assignment for my CS5100: Foundations in Artificial Intelligence class served as an exploration of Markov Decision Processes (MDPs), specifically through Model-Based Monte Carlo (MBMC) and Model-Free Monte Carlo (MFMC) methods.

 

MDPs, MBMC, and MFMC:

Markov Decision Process: In a stochastic environment, where an action taken from a state is not guaranteed to lead to a specific successor state, you evaluate policies, i.e., mappings from states to actions. After evaluating the candidate policies, the one that maximizes the expected cumulative reward is the best policy.
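
To make that concrete, here is a small Python sketch that scores a policy by averaging the discounted reward of many simulated episodes. The two-state "escape" MDP, its actions, probabilities, and rewards are invented for illustration and are not the assignment's actual environment.

import random

# Hypothetical MDP used only for illustration.
# transitions[state][action] -> list of (probability, next_state, reward)
transitions = {
    "hall": {"sneak": [(0.8, "gate", 1.0), (0.2, "hall", -1.0)],
             "run":   [(0.5, "gate", 2.0), (0.5, "caught", -5.0)]},
    "gate": {"sneak": [(1.0, "gate", 0.0)], "run": [(1.0, "gate", 0.0)]},
    "caught": {"sneak": [(1.0, "caught", 0.0)], "run": [(1.0, "caught", 0.0)]},
}

def simulate(policy, start="hall", gamma=0.9, steps=20):
    """Return the discounted reward of one simulated episode under a policy."""
    state, total, discount = start, 0.0, 1.0
    for _ in range(steps):
        r, cumulative = random.random(), 0.0
        for prob, nxt, reward in transitions[state][policy[state]]:
            cumulative += prob
            if r <= cumulative:
                total += discount * reward
                state = nxt
                break
        discount *= gamma
    return total

def evaluate(policy, episodes=5000):
    """Average the return over many episodes to estimate the policy's value."""
    return sum(simulate(policy) for _ in range(episodes)) / episodes

# Compare two simple policies; the one with the higher average return wins.
print(evaluate({"hall": "sneak", "gate": "sneak", "caught": "sneak"}))
print(evaluate({"hall": "run", "gate": "sneak", "caught": "sneak"}))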

 

MBMC: A reinforcement learning (RL) method where, assuming an underlying MDP, collected experience is stored and used to estimate the MDP's transition probabilities and rewards, and that estimated model is then solved to determine the optimal action at each state. This method is computationally expensive and inflexible.
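
A minimal sketch of the model-based idea, assuming logged (state, action, reward, next state) tuples; the example tuples and names below are placeholders, not the assignment's data. Observed transitions are counted to estimate probabilities and mean rewards, and a standard solver such as value iteration can then be run on the estimated model.

from collections import defaultdict

# Placeholder experience tuples: (state, action, reward, next_state).
experience = [
    ("hall", "run", -5.0, "caught"),
    ("hall", "run",  2.0, "gate"),
    ("hall", "sneak", 1.0, "gate"),
]

counts = defaultdict(lambda: defaultdict(int))   # (s, a) -> {s': visit count}
reward_sums = defaultdict(float)                 # (s, a, s') -> total reward seen

for s, a, r, s_next in experience:
    counts[(s, a)][s_next] += 1
    reward_sums[(s, a, s_next)] += r

def estimated_model(s, a):
    """Return [(probability, next_state, mean_reward)] estimated from the data."""
    total = sum(counts[(s, a)].values())
    return [(n / total, s_next, reward_sums[(s, a, s_next)] / n)
            for s_next, n in counts[(s, a)].items()]

print(estimated_model("hall", "run"))
# Once every (state, action) has an estimated model, the MDP can be solved
# (e.g., with value iteration) to pick the best action at each state.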


MFMC: Another RL method that uses a Q-table to iteratively update a running average of the reward for every state-action pair. In this environment there are 6 actions, 25 cells, 3 agent health levels, and 5 guard possibilities (one of 4 guards, or no guard), which gives 25 × 3 × 5 = 375 distinct states; with 6 actions per state, the Q-table is a hash map over 2,250 state-action pairs, each containing the running average of the reward for taking that action in that state.
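
The running-average update itself is only a few lines. This sketch assumes the state is encoded as a (cell, health, guard) triple as described above; the encodings and names are illustrative, not the assignment's exact ones.

from collections import defaultdict

q_table = defaultdict(float)   # (state, action) -> running average reward
visits = defaultdict(int)      # (state, action) -> number of updates so far

def update(state, action, reward):
    """Incrementally fold a newly observed reward into the running average."""
    key = (state, action)
    visits[key] += 1
    q_table[key] += (reward - q_table[key]) / visits[key]

# Example: on cell (2, 3) with full health and no guard in sight, the agent
# moves right and observes a reward of -1 (a step cost).
state = ((2, 3), "FULL", "NO_GUARD")   # illustrative state encoding
update(state, "RIGHT", -1.0)

def best_action(state, actions):
    """Act greedily by picking the action with the highest running average."""
    return max(actions, key=lambda a: q_table[(state, a)])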

During MFMC Training

After MFMC Training

Assignment

Code

See Disclaimer on Projects Page
