

This repo is an exercise in integrating data version control into a machine learning/data science pipeline. The goal is to take advantage of the robust pipeline and metric comparison tools, among other features, provided by DVC.


The dataset is publicly available on Kaggle at this link.

Docs and Code

The documentation lives here.

The code lives here.

Known Bugs

  • None

Recent Changes

Please see the CHANGELOG.

Next Steps

  • Initiate data version control with DVC.
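As a rough sketch of what that first step might look like, the snippet below lists the commands typically used to start tracking a dataset with DVC: `dvc init`, then `dvc add`, then staging the generated pointer file with Git. The path `data/raw.csv` and the helper names `dvc_bootstrap_commands` and `run` are hypothetical and not taken from this repo. By default the helper only prints the commands; actually running them assumes DVC and Git are installed and that the working directory is already a Git repository.

```python
import subprocess
from pathlib import PurePosixPath


def dvc_bootstrap_commands(data_path):
    """Return the shell commands (as argument lists) to start tracking a file with DVC.

    `data_path` is a hypothetical dataset location relative to the repo root.
    """
    data = PurePosixPath(data_path)
    pointer = f"{data}.dvc"                      # small metadata file that `dvc add` writes
    gitignore = str(data.parent / ".gitignore")  # `dvc add` ignores the raw data here
    return [
        ["dvc", "init"],                         # expects an existing Git repo by default
        ["dvc", "add", str(data)],               # track the dataset with DVC
        ["git", "add", pointer, gitignore],      # version the pointer file, not the data
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: shows the commands without touching the repository.
    run(dvc_bootstrap_commands("data/raw.csv"))
```

Keeping the command list separate from execution makes it easy to review the steps before committing to them; pass `dry_run=False` once the path matches the actual dataset location.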