YYZ / 2013
LHR / 1993

Reinforcement Learning

Cliff Walking Env - Temporal Difference Models

Sarsa(0), Sarsamax (Q-Learning), and Expected Sarsa Algorithms

Blackjack Env 2.0 - Monte Carlo Problems

Monte Carlo Random vs. First Visit Policy Algorithm

Monte Carlo Alpha GLIE Algorithm

Writeups Coming Soon:

Banana Search Game

Ping Pong


Some of my notes:

Data Visualisation Experiments

Using Opacity to Uncover Distribution Patterns:

An Early Attempt at Mapping a Complete Food Recommendation, CRM & Delivery Platform:

