Final Project: Independent Model Development#
Objective#
Demonstrate your end‑to‑end mastery of the ML workflow by independently choosing a dataset, building at least three Keras models (baseline + two variations), tuning hyper‑parameters, and analyzing the results in a concise written report.
High‑Level Requirements#
Independence: You are responsible for every decision—data prep, architecture, hyper‑parameters, evaluation, and write‑up.
Dataset:
Option 1 – Source any public dataset that fits the guidelines below:
You may not use any of the datasets available from the Keras API (https://keras.io/api/datasets/) or that we’ve used in class.
Your full training loop (one model) must finish in ≤ 60 minutes on a free Google Colab CPU or GPU, or on your own machine.
Option 2 – Select one dataset from the following list:
Models: Minimum three distinct experiments (baseline + two variations). Pre‑trained models are allowed if you perform additional fine‑tuning.
Metric: Choose one performance metric appropriate for your task (e.g., Accuracy for classification, MSE for regression). Justify your choice.
Deliverable: A single Jupyter notebook (.ipynb) containing:
All executable code.
Plots / tables you generate.
A 500–1000-word written report answering the guiding questions below.
Guiding Questions (address these in your report)#
Model Goal – Clearly articulate what you are training your model to do.
Model Performance – How does each model perform on your chosen metric? Rank and explain performance differences.
Design Choices – Why did you select each architecture and hyper‑parameter configuration?
Data Preparation – What did you do to prepare and split the data, and why?
Baseline Comparison – What simple baseline did you implement, and how do your models compare?
Learning Curves – Include and interpret training vs. validation loss/metric plots. What do they say about under/over‑fitting?
Error Analysis – Can you find any patterns or systematic issues in the errors your model makes?
Future Work – If you had 10 more hours, what would you try next, and why?
Grading Rubric#
| Component | Weight |
|---|---|
| Code Quality & Reproducibility | 20 % |
| Experimental Rigor (baseline + 2 variations, clear methodology, proper validation) | 40 % |
| Written Analysis (clarity, depth, answers to guiding questions) | 40 % |
▶️ Start Your Project Below#
Replace the TODO placeholders with your own code and analysis. Feel free to add or reorder cells as needed.
# TODO: Import libraries and set random seeds
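As a starting point, one possible seeding sketch (the `SEED` value of 42 is an arbitrary assumption; any fixed integer works):

```python
import os
import random

import numpy as np

SEED = 42  # assumed seed value; any fixed integer works

# Seed every source of randomness you use so runs are reproducible.
os.environ["PYTHONHASHSEED"] = str(SEED)
random.seed(SEED)
np.random.seed(SEED)

# If using TensorFlow/Keras, also seed the framework:
# import tensorflow as tf
# tf.random.set_seed(SEED)
```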
# TODO: Load and inspect your dataset
# TODO: Clean, transform, and split the data
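One illustrative way to produce a three-way split (the 70/15/15 ratios and the placeholder `X`/`y` arrays are assumptions; substitute your real data and justify your own ratios):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for your real features and labels.
X = np.random.rand(200, 8)
y = np.random.randint(0, 2, size=200)

# Carve off the test set first, then split the remainder into train/validation,
# so the test set is never seen during model selection.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.15, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.15 / 0.85, random_state=42
)
```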
# TODO: Build and evaluate a *baseline* model
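A baseline can be deliberately trivial; a hypothetical majority-class predictor like the sketch below (the class name and toy labels are illustrative) gives later models something concrete to beat:

```python
import numpy as np

class MajorityBaseline:
    """Predicts the most frequent training label for every input."""

    def fit(self, X, y):
        values, counts = np.unique(y, return_counts=True)
        self.majority_ = values[np.argmax(counts)]
        return self

    def predict(self, X):
        return np.full(len(X), self.majority_)

# Toy usage with placeholder labels.
y_train = np.array([0, 0, 0, 1, 1])
baseline = MajorityBaseline().fit(np.zeros((5, 2)), y_train)
preds = baseline.predict(np.zeros((3, 2)))  # every prediction is the majority class
```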
# TODO: Build and evaluate **Model 1** (first variation)
# TODO: Build and evaluate **Model 2** (second variation)
# TODO: Optional – additional models or hyper‑parameter sweeps
# TODO: Plot learning curves (training vs. validation)
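One possible plotting helper, assuming `history` is the object returned by `model.fit(..., validation_data=...)` (the function name and `metric` default are illustrative):

```python
import matplotlib.pyplot as plt

def plot_learning_curves(history, metric="loss"):
    """Plot training vs. validation curves for one metric from a Keras History."""
    fig, ax = plt.subplots()
    ax.plot(history.history[metric], label=f"train {metric}")
    ax.plot(history.history[f"val_{metric}"], label=f"val {metric}")
    ax.set_xlabel("epoch")
    ax.set_ylabel(metric)
    ax.legend()
    return fig
```

A widening gap between the two curves suggests over-fitting; two high, flat curves suggest under-fitting.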
# TODO: Perform error analysis
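For classification tasks, a small helper like this hypothetical sketch can surface which classes your model gets wrong most often (the function name and inputs are illustrative; for regression you would inspect residuals instead):

```python
import numpy as np

def errors_by_class(y_true, y_pred):
    """Return a {class: error count} dict for mismatched predictions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    wrong = y_true != y_pred
    classes, counts = np.unique(y_true[wrong], return_counts=True)
    return dict(zip(classes.tolist(), counts.tolist()))
```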
📄 Written Report (500–1000 words)#
Respond to each guiding question here. Use Markdown formatting as you like.