diff --git a/README.md b/README.md
index 8216e6a..6759661 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,232 @@
+This project was a test run using Cursor and "vibe coding" to create a full object detection project. I wrote almost no code by hand to get to this point, and the result
+kind of works. The technology is definitely impressive, but it feels better suited to work that can be developed in a test-driven way. I'll update this later
+with other things I've learned along the way.
+
 # Torchvision Vibecoding Project
 
-A project demonstrating finetuning torchvision object detection models, built with the help of Vibecoding AI.
+A PyTorch-based object detection project using Mask R-CNN to detect pedestrians in the Penn-Fudan dataset. This project demonstrates model training, evaluation, and visualization with PyTorch and Torchvision.
+
+## Table of Contents
+
+- [Prerequisites](#prerequisites)
+- [Project Setup](#project-setup)
+- [Project Structure](#project-structure)
+- [Data Preparation](#data-preparation)
+- [Configuration](#configuration)
+- [Training](#training)
+- [Evaluation](#evaluation)
+- [Visualization](#visualization)
+- [Testing](#testing)
+- [Debugging](#debugging)
+
+## Prerequisites
+
+- Python 3.10+
+- [uv](https://github.com/astral-sh/uv) for package management
+- CUDA-compatible GPU (optional but recommended)
+
+## Project Setup
+
+1. Clone the repository:
+```bash
+git clone https://github.com/yourusername/torchvision-vibecoding-project.git
+cd torchvision-vibecoding-project
+```
+
+2. Set up the environment with uv:
+```bash
+# uv sync creates .venv and installs the dependencies declared in pyproject.toml
+uv sync
+```
+
+3. Install development dependencies:
+```bash
+uv add ruff pytest matplotlib
+```
+
+4. Set up pre-commit hooks:
+```bash
+pre-commit install
+```
+
+## Project Structure
+
+```
+├── configs/                          # Configuration files
+│   ├── base_config.py                # Base configuration with defaults
+│   ├── debug_config.py               # Configuration for quick debugging
+│   └── pennfudan_maskrcnn_config.py  # Configuration for Penn-Fudan dataset
+├── data/                             # Dataset directory (not tracked by git)
+│   └── PennFudanPed/                 # Penn-Fudan pedestrian dataset
+├── models/                           # Model definitions
+│   └── detection.py                  # Mask R-CNN model definition
+├── outputs/                          # Training outputs (not tracked by git)
+│   └── /                             # Named by configuration
+│       ├── checkpoints/              # Model checkpoints
+│       └── *.log                     # Log files
+├── scripts/                          # Utility scripts
+│   ├── download_data.sh              # Script to download dataset
+│   ├── test_model.py                 # Script for quick model testing
+│   └── visualize_predictions.py      # Script for prediction visualization
+├── tests/                            # Unit tests
+│   ├── conftest.py                   # Test fixtures
+│   ├── test_data_utils.py            # Tests for data utilities
+│   ├── test_model.py                 # Tests for model functionality
+│   └── test_visualization.py         # Tests for visualization
+├── utils/                            # Utility modules
+│   ├── common.py                     # Common functionality
+│   ├── data_utils.py                 # Dataset handling
+│   ├── eval_utils.py                 # Evaluation functions
+│   └── log_utils.py                  # Logging utilities
+├── train.py                          # Training script
+├── test.py                           # Evaluation script
+├── pyproject.toml                    # Project dependencies and configuration
+├── .pre-commit-config.yaml           # Pre-commit configuration
+└── README.md                         # This file
+```
+
+## Data Preparation
+
+Download the Penn-Fudan pedestrian dataset:
+
+```bash
+./scripts/download_data.sh
+```
+
+This will download and extract the dataset to the `data/PennFudanPed` directory.
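After the download it can help to confirm that the extracted directory has the expected shape before starting a training run. The following is a minimal sketch, not code from this repository: it assumes the standard Penn-Fudan archive layout, with images in `PNGImages/` and instance masks in `PedMasks/` named `<image stem>_mask.png`. The helper name `pair_images_and_masks` is hypothetical.

```python
from pathlib import Path


def pair_images_and_masks(root):
    """Return sorted (image, mask) path pairs found under a Penn-Fudan root.

    Assumes the standard archive layout: PNGImages/<stem>.png paired with
    PedMasks/<stem>_mask.png. Raises FileNotFoundError if anything is missing.
    """
    root = Path(root)
    image_dir = root / "PNGImages"
    mask_dir = root / "PedMasks"
    for directory in (image_dir, mask_dir):
        if not directory.is_dir():
            raise FileNotFoundError(f"expected directory is missing: {directory}")

    pairs = []
    for image_path in sorted(image_dir.glob("*.png")):
        mask_path = mask_dir / f"{image_path.stem}_mask.png"
        if not mask_path.is_file():
            raise FileNotFoundError(f"no mask found for {image_path.name}")
        pairs.append((image_path, mask_path))
    return pairs
```

Calling `pair_images_and_masks("data/PennFudanPed")` from the project root either returns the list of pairs or fails early with the missing path, which is usually easier to diagnose than an error raised inside a data loader.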
+
+## Configuration
+
+The project uses Python dictionaries for configuration:
+
+- `configs/base_config.py`: Default configuration values
+- `configs/pennfudan_maskrcnn_config.py`: Configuration for training on Penn-Fudan
+- `configs/debug_config.py`: Configuration for quick testing (CPU, minimal training)
+
+Key configuration parameters:
+
+- `data_root`: Path to dataset
+- `output_dir`: Directory for outputs
+- `device`: Computing device ('cuda' or 'cpu')
+- `batch_size`: Batch size for training
+- `num_epochs`: Number of training epochs
+- `lr`, `momentum`, `weight_decay`: Optimizer parameters
+
+## Training
+
+Run the training script with a configuration file:
+
+```bash
+python train.py --config configs/pennfudan_maskrcnn_config.py
+```
+
+For quick debugging on CPU:
+
+```bash
+python train.py --config configs/debug_config.py
+```
+
+To resume training from the latest checkpoint:
+
+```bash
+python train.py --config configs/pennfudan_maskrcnn_config.py --resume
+```
+
+Training outputs (logs, checkpoints) are saved to `outputs//`.
+
+## Evaluation
+
+Evaluate a trained model:
+
+```bash
+python test.py --config configs/pennfudan_maskrcnn_config.py --checkpoint outputs/pennfudan_maskrcnn_v1/checkpoints/checkpoint_epoch_10.pth
+```
+
+This runs the model on the test dataset and reports metrics.
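Both `--resume` and `--checkpoint` depend on per-epoch checkpoint files. How `train.py` actually selects the latest one is not shown here, so the following is only a sketch under the assumption that files follow the `checkpoint_epoch_<N>.pth` naming seen in the command above. The helper name `find_latest_checkpoint` is hypothetical. The sketch compares epoch numbers numerically, because a plain string sort would rank `checkpoint_epoch_10.pth` before `checkpoint_epoch_2.pth`.

```python
import re
from pathlib import Path

# Assumed naming scheme, inferred from the example checkpoint path above.
CHECKPOINT_PATTERN = re.compile(r"checkpoint_epoch_(\d+)\.pth$")


def find_latest_checkpoint(checkpoint_dir):
    """Return the checkpoint path with the highest epoch number, or None."""
    best_epoch, best_path = -1, None
    for path in Path(checkpoint_dir).glob("checkpoint_epoch_*.pth"):
        match = CHECKPOINT_PATTERN.search(path.name)
        if match is None:
            continue  # Skip files that only look similar, e.g. "..._best.pth".
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_epoch, best_path = epoch, path
    return best_path
```

The returned path could then be passed to `--checkpoint` when evaluating the most recent epoch rather than a hard-coded one.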
+
+## Visualization
+
+Visualize model predictions on images:
+
+```bash
+python scripts/visualize_predictions.py --config configs/pennfudan_maskrcnn_config.py --checkpoint outputs/pennfudan_maskrcnn_v1/checkpoints/checkpoint_epoch_10.pth --index 0 --output prediction.png
+```
+
+Parameters:
+- `--config`: Configuration file path
+- `--checkpoint`: Model checkpoint path
+- `--index`: Image index in dataset (default: 0)
+- `--threshold`: Detection confidence threshold (default: 0.5)
+- `--output`: Output image path (optional, displays interactively if not specified)
+
+## Testing
+
+Run all tests:
+
+```bash
+python -m pytest
+```
+
+Run a specific test file:
+
+```bash
+python -m pytest tests/test_data_utils.py
+```
+
+Run tests with verbose output:
+
+```bash
+python -m pytest -v
+```
+
+## Debugging
+
+For quick model testing without full training:
+
+```bash
+python scripts/test_model.py
+```
+
+This verifies:
+- Model creation
+- Forward pass
+- Backward pass
+- Dataset loading
+
+For training with minimal resources:
+
+```bash
+python train.py --config configs/debug_config.py
+```
+
+This uses:
+- CPU computation
+- Minimal epochs (1)
+- Small batch size (1)
+- No multiprocessing
+
+## Code Quality
+
+Format code:
+
+```bash
+ruff format .
+```
+
+Run linter:
+
+```bash
+ruff check .
+```
+
+Fix auto-fixable issues:
+
+```bash
+ruff check --fix .
+```
+
+Run pre-commit checks:
+
+```bash
+pre-commit run --all-files
+```