Continual Learning

Note: Currently, the models are cheating. They memorize the past frame(s) and optical flow(s) and output them as the prediction for the next video frame(s) and optical flow(s). I am working on fixing this issue.
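One way to check for this copying behaviour is to compare each prediction with both the last observed frame and the true next frame. The sketch below is a minimal illustration using NumPy. The function names, the threshold `tol`, and the assumption that frames are arrays of equal shape are hypothetical and not part of this repository.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def copy_score(last_input, prediction, target):
    """Return (error to the last input frame, error to the true next frame).

    If the first value is close to zero while the second is not, the
    prediction is essentially a copy of the past frame.
    """
    return mse(prediction, last_input), mse(prediction, target)


def looks_like_copy(last_input, prediction, target, tol=1e-3):
    """Heuristic flag (hypothetical): the prediction nearly equals the last
    input frame even though the true next frame differs from it.
    `tol` is an arbitrary threshold chosen for illustration."""
    to_input, _ = copy_score(last_input, prediction, target)
    return to_input < tol and mse(target, last_input) > tol
```

The same comparison can be applied to optical flow fields, since it only relies on the arrays having equal shapes.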

Next Frame(s) Prediction

Next Frame(s) Prediction Gif

Optical Flow(s) Prediction

Optical Flow(s) Prediction Gif

Data

The full CLEVRER dataset can be downloaded from http://clevrer.csail.mit.edu.

Training

The training process is summarized in the figures below.

Figures: Flow Reconstruction Model and Image Reconstruction Model

Pipeline

Pipeline

After installing the libraries listed in requirements.txt, start training with the following command:

python train.py \
    --num_predictions 3 \
    --embed_dim 512 \
    --hidden_size 512 \
    --stride 1 \
    --num_frames 127 \
    --resize_img 224 \
    --patch_size 32

All of these parameters are optional, so training can also be started with the simpler command:

python train.py
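
For readers curious how optional flags like these are commonly wired up, here is a hypothetical argparse sketch. The flag names come from the command above; the integer types, the reuse of the example values as defaults, and the helpers `build_parser` and `patches_per_frame` are assumptions, so the actual train.py may differ. The patch count further assumes that frames are resized to a square and split into non-overlapping square patches.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the command above."""
    parser = argparse.ArgumentParser(description="Training options (illustrative sketch)")
    # The defaults reuse the example values; the real defaults are not stated.
    parser.add_argument("--num_predictions", type=int, default=3)
    parser.add_argument("--embed_dim", type=int, default=512)
    parser.add_argument("--hidden_size", type=int, default=512)
    parser.add_argument("--stride", type=int, default=1)
    parser.add_argument("--num_frames", type=int, default=127)
    parser.add_argument("--resize_img", type=int, default=224)
    parser.add_argument("--patch_size", type=int, default=32)
    return parser


def patches_per_frame(resize_img, patch_size):
    """Number of non-overlapping square patches in a square frame
    (assumed patching scheme, not confirmed by the repository)."""
    if resize_img % patch_size != 0:
        raise ValueError("resize_img must be divisible by patch_size")
    return (resize_img // patch_size) ** 2


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--patch_size", "16"])
print(vars(args))
print(patches_per_frame(args.resize_img, args.patch_size))
```

Because every argument has a default, calling `parse_args([])` succeeds with no flags at all, which matches the behaviour of the simpler command.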