Created project spec and plan using LLMs

Commit 8c6de7380f by Craig, 2025-04-12 09:35:18 +01:00
3 changed files with 641 additions and 0 deletions

todo.md (new file, 133 lines)

# Project To-Do List
This list outlines the steps required to complete the Torchvision Finetuning project, derived from `prompt_plan.md`.
## Phase 1: Foundation & Setup
- [ ] Set up project structure (directories: `configs`, `data`, `models`, `utils`, `tests`, `scripts`).
- [ ] Initialize Git repository.
- [ ] Create `.gitignore` file (ignore `data`, `outputs`, `logs`, `.venv`, caches, `*.pth`).
- [ ] Initialize `pyproject.toml` using `uv init`, set Python 3.10.
- [ ] Add core dependencies (`torch`, `torchvision`, `ruff`, `numpy`, `Pillow`, `pytest`) using `uv add`.
- [ ] Create `.pre-commit-config.yaml` and configure `ruff` hooks (format, lint, import sort).
- [ ] Create `__init__.py` files in necessary directories.
- [ ] Create empty placeholder files (`train.py`, `test.py`, `configs/base_config.py`, `utils/data_utils.py`, `models/detection.py`, `tests/conftest.py`).
- [ ] Create basic `README.md`.
- [ ] Install pre-commit hooks (`pre-commit install`).
- [ ] Create `scripts/download_data.sh` script.
  - [ ] Check if data exists.
  - [ ] Create `data/` directory.
  - [ ] Use `wget` to download PennFudanPed dataset.
  - [ ] Use `unzip` to extract data.
  - [ ] Remove zip file after extraction.
  - [ ] Add informative print messages.
  - [ ] Make script executable (`chmod +x`).
- [ ] Ensure `.gitignore` ignores `data/`.
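The download-script steps above can be sketched as a shell function; the dataset URL is the one used by the torchvision PennFudan tutorial and may move, and the skip-if-present check keeps the script idempotent (a sketch, not the final script):

```shell
#!/usr/bin/env bash
# Sketch of scripts/download_data.sh -- URL and layout are assumptions.
set -euo pipefail

DATA_DIR="data"
DATASET_URL="https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip"

download_pennfudan() {
    # Check if data already exists before doing any network work.
    if [ -d "${DATA_DIR}/PennFudanPed" ]; then
        echo "PennFudanPed already present in ${DATA_DIR}/ -- skipping download."
        return 0
    fi
    mkdir -p "${DATA_DIR}"
    echo "Downloading PennFudanPed..."
    wget -q "${DATASET_URL}" -O "${DATA_DIR}/PennFudanPed.zip"
    echo "Extracting..."
    unzip -q "${DATA_DIR}/PennFudanPed.zip" -d "${DATA_DIR}"
    rm "${DATA_DIR}/PennFudanPed.zip"
    echo "Done: ${DATA_DIR}/PennFudanPed"
}
```

The real `scripts/download_data.sh` would simply end with a call to `download_pennfudan`.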
- [ ] Implement base configuration in `configs/base_config.py` (`base_config` dictionary).
- [ ] Implement specific experiment configuration in `configs/pennfudan_maskrcnn_config.py` (`config` dictionary, importing/updating base config).
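A minimal sketch of the two-file config pattern described above; every key name here is illustrative rather than prescribed by the plan:

```python
# configs/base_config.py -- shared defaults; all key names are assumptions.
base_config = {
    "num_classes": 2,       # background + pedestrian
    "batch_size": 2,
    "num_epochs": 10,
    "lr": 0.005,
    "momentum": 0.9,
    "weight_decay": 0.0005,
    "lr_step_size": 3,
    "lr_gamma": 0.1,
    "seed": 42,
    "checkpoint_freq": 1,
}

# configs/pennfudan_maskrcnn_config.py -- import the base and override as needed.
config = dict(base_config)  # shallow copy so the base dictionary stays untouched
config.update({
    "experiment_name": "pennfudan_maskrcnn",
    "data_root": "data/PennFudanPed",
})
```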
## Phase 2: Data Handling & Model
- [ ] Implement `PennFudanDataset` class in `utils/data_utils.py`.
  - [ ] `__init__`: Load image and mask paths.
  - [ ] `__getitem__`: Load image/mask, parse masks, generate targets (boxes, labels, masks, image_id, area, iscrowd), apply transforms.
  - [ ] `__len__`: Return dataset size.
- [ ] Implement `get_transform(train)` function in `utils/data_utils.py` (using `torchvision.transforms.v2`).
- [ ] Implement `collate_fn(batch)` function in `utils/data_utils.py`.
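Because each image carries a different number of boxes and masks, the default `DataLoader` collation cannot stack detection targets; the conventional detection `collate_fn` just transposes the batch:

```python
def collate_fn(batch):
    """Keep images and targets as tuples rather than stacking into tensors,
    since detection targets vary in size per image."""
    return tuple(zip(*batch))
```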
- [ ] Implement `get_maskrcnn_model(num_classes, ...)` function in `models/detection.py`.
  - [ ] Load pre-trained Mask R-CNN (`maskrcnn_resnet50_fpn_v2`).
  - [ ] Replace box predictor head (`FastRCNNPredictor`).
  - [ ] Replace mask predictor head (`MaskRCNNPredictor`).
## Phase 3: Training Script & Core Logic
- [ ] Set up basic `train.py` structure.
  - [ ] Add imports.
  - [ ] Implement `argparse` for `--config` argument.
  - [ ] Implement dynamic config loading (`importlib`).
  - [ ] Set random seeds.
  - [ ] Determine compute device (`cuda` or `cpu`).
  - [ ] Create output directory structure (`outputs/<config_name>/checkpoints`).
  - [ ] Instantiate `PennFudanDataset` (train).
  - [ ] Instantiate `DataLoader` (train) using `collate_fn`.
  - [ ] Instantiate model using `get_maskrcnn_model`.
  - [ ] Move model to device.
  - [ ] Add `if __name__ == "__main__":` guard.
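The `--config` argument and dynamic config loading might look like this; `load_config` is a hypothetical helper name, consistent with the `configs/` package layout above:

```python
import argparse
import importlib


def load_config(config_name):
    """Import configs/<config_name>.py and return its `config` dictionary."""
    module = importlib.import_module(f"configs.{config_name}")
    return module.config


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Train a detection model.")
    parser.add_argument(
        "--config", required=True,
        help="Config module name, e.g. pennfudan_maskrcnn_config",
    )
    return parser.parse_args(argv)
```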
- [ ] Implement minimal training step in `train.py`.
  - [ ] Instantiate optimizer (`torch.optim.SGD`).
  - [ ] Set `model.train()`.
  - [ ] Fetch one batch.
  - [ ] Move data to device.
  - [ ] Perform forward pass (`loss_dict = model(...)`).
  - [ ] Calculate total loss (`sum(...)`).
  - [ ] Perform backward pass (`optimizer.zero_grad()`, `loss.backward()`, `optimizer.step()`).
  - [ ] Print/log loss for the single step (and temporarily exit).
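The minimal training step above can be condensed into one helper; torchvision detection models return a dict of losses when called with targets in train mode, so the total loss is a plain `sum` over its values (`train_step` is a hypothetical helper name, a sketch rather than the final code):

```python
import torch


def train_step(model, optimizer, images, targets, device):
    """One optimisation step: forward pass returns a loss dict in train mode,
    which is summed, backpropagated, and applied."""
    model.train()
    images = [img.to(device) for img in images]
    targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
    loss_dict = model(images, targets)
    losses = sum(loss_dict.values())
    optimizer.zero_grad()
    losses.backward()
    optimizer.step()
    return losses.item()
```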
- [ ] Implement logging setup in `utils/log_utils.py` (`setup_logging` function).
  - [ ] Configure `logging.basicConfig` for file and console output.
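A minimal `setup_logging` along these lines (the format string is illustrative):

```python
import logging
import sys


def setup_logging(log_file):
    """Send log records to both a file and the console."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
        handlers=[
            logging.FileHandler(log_file),
            logging.StreamHandler(sys.stdout),
        ],
        force=True,  # reconfigure even if logging was already set up
    )
```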
- [ ] Integrate logging into `train.py`.
  - [ ] Call `setup_logging`.
  - [ ] Replace `print` with `logging.info`.
  - [ ] Log config, device, and training progress/losses.
- [ ] Implement full training loop in `train.py`.
  - [ ] Remove single-step exit.
  - [ ] Add LR scheduler (`torch.optim.lr_scheduler.StepLR`).
  - [ ] Add epoch loop.
  - [ ] Add batch loop, integrating the single training step logic.
  - [ ] Log loss periodically within the batch loop.
  - [ ] Step the LR scheduler at the end of each epoch.
  - [ ] Log total training time.
- [ ] Implement checkpointing in `train.py`.
  - [ ] Define checkpoint directory.
  - [ ] Implement logic to find and load the latest checkpoint (resume training).
  - [ ] Save checkpoints periodically (based on frequency or final epoch).
  - [ ] Include epoch, model state, optimizer state, scheduler state, config.
  - [ ] Log checkpoint loading/saving.
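Finding the latest checkpoint to resume from reduces to parsing epoch numbers out of filenames; the `checkpoint_epoch_<N>.pth` naming scheme here is an assumption, not fixed by the plan:

```python
import re
from pathlib import Path

# Naming scheme is an assumption; adjust the pattern to match what train.py saves.
CKPT_PATTERN = re.compile(r"checkpoint_epoch_(\d+)\.pth$")


def find_latest_checkpoint(ckpt_dir):
    """Return (path, epoch) of the highest-numbered checkpoint,
    or (None, -1) when training starts from scratch."""
    latest, latest_epoch = None, -1
    for path in Path(ckpt_dir).glob("checkpoint_epoch_*.pth"):
        match = CKPT_PATTERN.search(path.name)
        if match and int(match.group(1)) > latest_epoch:
            latest, latest_epoch = path, int(match.group(1))
    return latest, latest_epoch
```

Resuming would then `torch.load` the returned path and restore the model, optimizer, and scheduler state dicts plus the epoch counter stored at save time.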
## Phase 4: Evaluation & Testing
- [ ] Add evaluation dependencies (`pycocotools` - optional initially).
- [ ] Create `utils/eval_utils.py` and implement `evaluate` function.
  - [ ] Set `model.eval()` (note: torchvision detection models return the loss dict only in train mode, so loss-based validation instead keeps the model in train mode under `torch.no_grad()`).
  - [ ] Use `torch.no_grad()`.
  - [ ] Loop through validation/test dataloader.
  - [ ] Perform forward pass.
  - [ ] Calculate/aggregate metrics (start with average loss, potentially add mAP later).
  - [ ] Log evaluation metrics and time.
  - [ ] Return metrics.
- [ ] Integrate evaluation into `train.py`.
  - [ ] Create validation `Dataset` and `DataLoader` (using `torch.utils.data.Subset`).
  - [ ] Call `evaluate` at the end of each epoch.
  - [ ] Log validation metrics.
  - [ ] (Later) Implement logic to save the *best* model based on validation metric.
- [ ] Implement `test.py` script.
  - [ ] Reuse argument parsing, config loading, device setup, dataset/dataloader (test split), and model creation from `train.py`.
  - [ ] Add `--checkpoint` argument.
  - [ ] Load model weights from the specified checkpoint.
  - [ ] Call `evaluate` function using the test dataloader.
  - [ ] Log/print final evaluation results.
  - [ ] Set up logging for testing (e.g., `test.log`).
- [ ] Create unit tests in `tests/` using `pytest`.
  - [ ] `tests/test_config.py`: Test config loading.
  - [ ] `tests/test_model.py`: Test model creation and head configuration.
  - [ ] `tests/test_data_utils.py`: Test dataset instantiation, length, and item format (requires data).
  - [ ] (Optional) Use fixtures in `tests/conftest.py` if needed.
- [ ] Add `pytest` execution to `.pre-commit-config.yaml`.
- [ ] Test pre-commit hooks (`pre-commit run --all-files`).
## Phase 5: Refinement & Documentation
- [ ] Refine error handling in `train.py` and `test.py` (`try...except`).
- [ ] Add configuration validation checks.
- [ ] Improve evaluation metrics (e.g., implement mAP in `evaluate` function).
- [ ] Add more data augmentations to `get_transform(train=True)`.
- [ ] Expand `README.md` significantly.
  - [ ] Goals
  - [ ] Detailed Setup
  - [ ] Configuration explanation
  - [ ] Training instructions (including resuming)
  - [ ] Testing instructions
  - [ ] Project Structure overview
  - [ ] Dependencies list
  - [ ] (Optional) Results section
- [ ] Perform final code quality checks (`ruff format .`, `ruff check . --fix`).
- [ ] Ensure all pre-commit hooks pass.