Decision Making

Note: The core codebase for environment simulation and related infrastructure in this repository comes from my assignments in the Deep Decision Making and Reinforcement Learning (DDRL) course at New York University. I implemented only the Behavior Cloning, Goal-Conditioned Behavior Cloning, DAgger, Double Q-Learning, Dueling DQN, and PPO algorithms. Their performance is shown in the GIFs below.

Environment

Environment

Expert Dataset

Changing Goal

Episode 1

Changing Goal, Episode 1

Episode 2

Changing Goal, Episode 2

Episode 3

Changing Goal, Episode 3

Fixed Goal

Episode 1

Fixed Goal, Episode 1

Episode 2

Fixed Goal, Episode 2

Episode 3

Fixed Goal, Episode 3

Behavior Cloning

Behavior Cloning

Changing Goal

Changing Goal

Fixed Goal

Fixed Goal

Multimodal

Multimodal

Goal-Conditioned Behavior Cloning

Goal-Conditioned Behavior Cloning

Changing Goal

Changing Goal

Fixed Goal

Fixed Goal

Behavior Transformer

DAgger

Q-Learning

Double Q-Learning

Dueling Double Q-Learning

Reinforce

PPO