YYZ / 2013
LHR / 1993

Reinforcement Learning










Cliff Walking Env - Temporal Difference Models




Sarsa(0), Sarsamax (Q-Learning), and Expected Sarsa Algorithms







Blackjack Env 2.0 - Monte Carlo Problems







Monte Carlo Random vs. First Visit Policy Algorithm

Monte Carlo Alpha GLIE Algorithm





Writeups Coming Soon:





Banana Search Game



Ping Pong



Reacher






Some of my notes:











Data Visualisation Experiments







Using Opacity to Uncover Distribution Patterns:










An Early Attempt at Mapping a Complete Food Recommendation, CRM & Delivery Platform:






Mark