This project was a test run using Cursor and "vibe coding" to create a full object detection project. I wrote almost no lines of code to get to this point, and it kind of works. The technology is definitely impressive, but it really feels more suited to things that can be developed in a more test-driven way. I'll update this later with other things I've learned along the way.

# Torchvision Vibecoding Project

A PyTorch-based object detection project that fine-tunes a torchvision Mask R-CNN model to detect pedestrians in the Penn-Fudan dataset, built with the help of "vibe coding" AI. It demonstrates model training, evaluation, and visualization with PyTorch and Torchvision.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Project Setup](#project-setup)
- [Project Structure](#project-structure)
- [Data Preparation](#data-preparation)
- [Configuration](#configuration)
- [Training](#training)
- [Evaluation](#evaluation)
- [Visualization](#visualization)
- [Testing](#testing)
- [Debugging](#debugging)
- [Code Quality](#code-quality)

## Prerequisites

- Python 3.10+
- [uv](https://github.com/astral-sh/uv) for package management
- CUDA-compatible GPU (optional but recommended)

## Project Setup

1. Clone the repository:

```bash
git clone https://github.com/yourusername/torchvision-vibecoding-project.git
cd torchvision-vibecoding-project
```

2. Set up the environment with uv (the repository already contains a `pyproject.toml`, so `uv sync` creates the virtual environment and installs dependencies):

```bash
uv sync
```

3. Install development dependencies:

```bash
uv add ruff pytest matplotlib
```

4. Set up pre-commit hooks:

```bash
pre-commit install
```

## Project Structure

```
├── configs/                          # Configuration files
│   ├── base_config.py                # Base configuration with defaults
│   ├── debug_config.py               # Configuration for quick debugging
│   └── pennfudan_maskrcnn_config.py  # Configuration for Penn-Fudan dataset
├── data/                             # Dataset directory (not tracked by git)
│   └── PennFudanPed/                 # Penn-Fudan pedestrian dataset
├── models/                           # Model definitions
│   └── detection.py                  # Mask R-CNN model definition
├── outputs/                          # Training outputs (not tracked by git)
│   └── <config_name>/                # Named by configuration
│       ├── checkpoints/              # Model checkpoints
│       └── *.log                     # Log files
├── scripts/                          # Utility scripts
│   ├── download_data.sh              # Script to download dataset
│   ├── test_model.py                 # Script for quick model testing
│   └── visualize_predictions.py      # Script for prediction visualization
├── tests/                            # Unit tests
│   ├── conftest.py                   # Test fixtures
│   ├── test_data_utils.py            # Tests for data utilities
│   ├── test_model.py                 # Tests for model functionality
│   └── test_visualization.py         # Tests for visualization
├── utils/                            # Utility modules
│   ├── common.py                     # Common functionality
│   ├── data_utils.py                 # Dataset handling
│   ├── eval_utils.py                 # Evaluation functions
│   └── log_utils.py                  # Logging utilities
├── train.py                          # Training script
├── test.py                           # Evaluation script
├── pyproject.toml                    # Project dependencies and configuration
├── .pre-commit-config.yaml           # Pre-commit configuration
└── README.md                         # This file
```

## Data Preparation

Download the Penn-Fudan pedestrian dataset:

```bash
./scripts/download_data.sh
```

This will download and extract the dataset to the `data/PennFudanPed` directory.

## Configuration

The project uses Python dictionaries for configuration:

- `configs/base_config.py`: Default configuration values
- `configs/pennfudan_maskrcnn_config.py`: Configuration for training on Penn-Fudan
- `configs/debug_config.py`: Configuration for quick testing (CPU, minimal training)

Key configuration parameters:

- `data_root`: Path to dataset
- `output_dir`: Directory for outputs
- `device`: Computing device ('cuda' or 'cpu')
- `batch_size`: Batch size for training
- `num_epochs`: Number of training epochs
- `lr`, `momentum`, `weight_decay`: Optimizer parameters
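
To make the parameter list concrete, here is a sketch of what such a config module might look like. The keys mirror the list above, but the values (and the assumption that each module exposes a top-level `config` dict) are illustrative, not taken from `configs/base_config.py`:

```python
# Hypothetical sketch of a config module like configs/base_config.py.
# Keys mirror the parameters listed above; values are illustrative.
config = {
    "data_root": "data/PennFudanPed",  # path to dataset
    "output_dir": "outputs",           # directory for outputs
    "device": "cuda",                  # computing device ('cuda' or 'cpu')
    "batch_size": 2,                   # batch size for training
    "num_epochs": 10,                  # number of training epochs
    "lr": 0.005,                       # optimizer parameters
    "momentum": 0.9,
    "weight_decay": 0.0005,
}
```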

## Training

Run the training script with a configuration file:

```bash
python train.py --config configs/pennfudan_maskrcnn_config.py
```
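
The README doesn't show how `train.py` consumes the `.py` path, but a common pattern for a `--config` flag like this is to import the file as a module. A sketch of that pattern (not necessarily what `train.py` actually does; the `config` attribute name is an assumption):

```python
# Sketch: one common way a `--config path/to/config.py` flag is handled.
# Illustrative pattern only, not code from this repository.
import importlib.util


def load_config(path):
    """Import a Python file by path and return its top-level `config` dict."""
    spec = importlib.util.spec_from_file_location("config_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.config  # assumes the file defines `config`
```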

For quick debugging on CPU:

```bash
python train.py --config configs/debug_config.py
```

To resume training from the latest checkpoint:

```bash
python train.py --config configs/pennfudan_maskrcnn_config.py --resume
```

Training outputs (logs, checkpoints) are saved to `outputs/<config_name>/`.
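
For illustration, here is one way `--resume` might locate the newest checkpoint. The `checkpoint_epoch_N.pth` naming follows the example paths used elsewhere in this README; the helper itself is a sketch, not the repository's implementation:

```python
# Sketch: find the checkpoint with the highest epoch number under
# outputs/<config_name>/checkpoints/. Numeric comparison matters here:
# a plain string sort would rank epoch_9 after epoch_10.
import re
from pathlib import Path


def latest_checkpoint(checkpoint_dir):
    """Return the Path of the highest-epoch checkpoint, or None if absent."""
    pattern = re.compile(r"checkpoint_epoch_(\d+)\.pth$")
    best, best_epoch = None, -1
    for path in Path(checkpoint_dir).glob("checkpoint_epoch_*.pth"):
        match = pattern.search(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best = path
    return best
```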

## Evaluation

Evaluate a trained model:

```bash
python test.py --config configs/pennfudan_maskrcnn_config.py --checkpoint outputs/pennfudan_maskrcnn_v1/checkpoints/checkpoint_epoch_10.pth
```

This runs the model on the test dataset and reports metrics.
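
The README doesn't name the specific metrics, but detection evaluation is typically built on box intersection-over-union (IoU), the quantity underlying COCO-style mAP. A minimal self-contained sketch, not code from `test.py`:

```python
# Sketch: IoU between two axis-aligned boxes in (x1, y1, x2, y2) format.
def box_iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # overlap rectangle (empty if the boxes are disjoint)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```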

## Visualization

Visualize model predictions on images:

```bash
python scripts/visualize_predictions.py --config configs/pennfudan_maskrcnn_config.py --checkpoint outputs/pennfudan_maskrcnn_v1/checkpoints/checkpoint_epoch_10.pth --index 0 --output prediction.png
```

Parameters:

- `--config`: Configuration file path
- `--checkpoint`: Model checkpoint path
- `--index`: Image index in dataset (default: 0)
- `--threshold`: Detection confidence threshold (default: 0.5)
- `--output`: Output image path (optional, displays interactively if not specified)
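
To illustrate what `--threshold` does: torchvision detection models return per-image dicts with `"boxes"`, `"labels"`, and `"scores"`, and thresholding keeps only confident detections. A sketch using plain lists in place of tensors (not the script's actual code):

```python
# Sketch: filter a prediction dict so only detections with
# score >= threshold survive, keeping boxes/labels/scores aligned.
def filter_by_score(prediction, threshold=0.5):
    keep = [i for i, s in enumerate(prediction["scores"]) if s >= threshold]
    return {key: [values[i] for i in keep] for key, values in prediction.items()}
```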

## Testing

Run all tests:

```bash
python -m pytest
```

Run a specific test file:

```bash
python -m pytest tests/test_data_utils.py
```

Run tests with verbose output:

```bash
python -m pytest -v
```

## Debugging

For quick model testing without full training:

```bash
python scripts/test_model.py
```

This verifies:

- Model creation
- Forward pass
- Backward pass
- Dataset loading

For training with minimal resources:

```bash
python train.py --config configs/debug_config.py
```

This uses:

- CPU computation
- Minimal epochs (1)
- Small batch size (1)
- No multiprocessing
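
The settings above can be expressed as a dict merge over the base configuration, which is a common way a file like `configs/debug_config.py` overrides `configs/base_config.py`. The base values below are illustrative assumptions, not taken from the repository:

```python
# Sketch: a debug config overriding a (hypothetical) base config.
base_config = {"device": "cuda", "num_epochs": 10, "batch_size": 2, "num_workers": 4}

debug_overrides = {
    "device": "cpu",    # CPU computation
    "num_epochs": 1,    # minimal epochs
    "batch_size": 1,    # small batch size
    "num_workers": 0,   # no DataLoader multiprocessing
}

# Later keys win, so every override replaces its base value.
debug_config = {**base_config, **debug_overrides}
```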

## Code Quality

Format code:

```bash
ruff format .
```

Run the linter:

```bash
ruff check .
```

Fix auto-fixable issues:

```bash
ruff check --fix .
```

Run pre-commit checks:

```bash
pre-commit run --all-files
```