# Project To-Do List

This list outlines the steps required to complete the Torchvision Finetuning project, derived from `prompt_plan.md`.
## Phase 1: Foundation & Setup

- Set up project structure (directories: `configs`, `data`, `models`, `utils`, `tests`, `scripts`).
- Initialize Git repository.
- Create `.gitignore` file (ignore `data`, `outputs`, `logs`, `.venv`, caches, `*.pth`).
- Initialize `pyproject.toml` using `uv init`, set Python 3.10.
- Add core dependencies (`torch`, `torchvision`, `ruff`, `numpy`, `Pillow`, `pytest`) using `uv add`.
- Create `.pre-commit-config.yaml` and configure `ruff` hooks (format, lint, import sort).
- Create `__init__.py` files in necessary directories.
- Create empty placeholder files (`train.py`, `test.py`, `configs/base_config.py`, `utils/data_utils.py`, `models/detection.py`, `tests/conftest.py`).
- Create basic `README.md`.
- Install pre-commit hooks (`pre-commit install`).
- Create `scripts/download_data.sh` script.
  - Check if data exists.
  - Create `data/` directory.
  - Use `wget` to download the PennFudanPed dataset.
  - Use `unzip` to extract the data.
  - Remove the zip file after extraction.
  - Add informative print messages.
  - Make the script executable (`chmod +x`).
  - Ensure `.gitignore` ignores `data/`.
- Implement base configuration in `configs/base_config.py` (`base_config` dictionary).
- Implement specific experiment configuration in `configs/pennfudan_maskrcnn_config.py` (`config` dictionary, importing/updating the base config); see the config sketch after this list.
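A minimal sketch of what the two config modules could look like. Every key and value below is an illustrative assumption rather than a fixed schema; the training and evaluation sketches later in this list reuse some of these keys (`lr`, `num_epochs`, and so on).

```python
# configs/base_config.py -- illustrative defaults only.
base_config = {
    "seed": 42,
    "data_root": "data/PennFudanPed",
    "num_classes": 2,          # background + pedestrian
    "batch_size": 2,
    "num_epochs": 10,
    "lr": 0.005,
    "momentum": 0.9,
    "weight_decay": 0.0005,
    "lr_step_size": 3,
    "lr_gamma": 0.1,
    "checkpoint_every": 1,     # save a checkpoint every N epochs
    "output_dir": "outputs",
}

# configs/pennfudan_maskrcnn_config.py -- experiment config built on top of the base.
# from configs.base_config import base_config
#
# config = {
#     **base_config,
#     "experiment_name": "pennfudan_maskrcnn",
#     # override or add experiment-specific values here
# }
```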
## Phase 2: Data Handling & Model

- Implement `PennFudanDataset` class in `utils/data_utils.py`; see the data-utilities sketch after this list.
  - `__init__`: Load image and mask paths.
  - `__getitem__`: Load image/mask, parse masks, generate targets (boxes, labels, masks, image_id, area, iscrowd), apply transforms.
  - `__len__`: Return dataset size.
- Implement `get_transform(train)` function in `utils/data_utils.py` (using `torchvision.transforms.v2`).
- Implement `collate_fn(batch)` function in `utils/data_utils.py`.
- Implement `get_maskrcnn_model(num_classes, ...)` function in `models/detection.py`; see the model sketch after this list.
  - Load pre-trained Mask R-CNN (`maskrcnn_resnet50_fpn_v2`).
  - Replace box predictor head (`FastRCNNPredictor`).
  - Replace mask predictor head (`MaskRCNNPredictor`).
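A condensed sketch of `utils/data_utils.py`, closely following the official torchvision object-detection finetuning tutorial. It assumes a recent torchvision (0.16+) so that `transforms.v2` and `tv_tensors` are available; helper defaults are otherwise assumptions.

```python
# utils/data_utils.py -- condensed sketch based on the torchvision finetuning tutorial.
import os

import torch
from torchvision import tv_tensors
from torchvision.io import read_image
from torchvision.ops import masks_to_boxes
from torchvision.transforms import v2 as T


class PennFudanDataset(torch.utils.data.Dataset):
    def __init__(self, root, transforms=None):
        self.root = root
        self.transforms = transforms
        self.imgs = sorted(os.listdir(os.path.join(root, "PNGImages")))
        self.masks = sorted(os.listdir(os.path.join(root, "PedMasks")))

    def __getitem__(self, idx):
        img = read_image(os.path.join(self.root, "PNGImages", self.imgs[idx]))
        mask = read_image(os.path.join(self.root, "PedMasks", self.masks[idx]))
        obj_ids = torch.unique(mask)[1:]                      # drop the background id (0)
        masks = (mask == obj_ids[:, None, None]).to(torch.uint8)
        boxes = masks_to_boxes(masks)
        num_objs = len(obj_ids)
        img = tv_tensors.Image(img)
        target = {
            "boxes": tv_tensors.BoundingBoxes(boxes, format="XYXY", canvas_size=img.shape[-2:]),
            "labels": torch.ones((num_objs,), dtype=torch.int64),   # single "pedestrian" class
            "masks": tv_tensors.Mask(masks),
            "image_id": idx,
            "area": (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0]),
            "iscrowd": torch.zeros((num_objs,), dtype=torch.int64),
        }
        if self.transforms is not None:
            img, target = self.transforms(img, target)
        return img, target

    def __len__(self):
        return len(self.imgs)


def get_transform(train):
    transforms = []
    if train:
        transforms.append(T.RandomHorizontalFlip(0.5))
    transforms.append(T.ToDtype(torch.float32, scale=True))  # uint8 -> float in [0, 1]
    transforms.append(T.ToPureTensor())
    return T.Compose(transforms)


def collate_fn(batch):
    # Detection targets vary in size per image, so batches stay as tuples of lists.
    return tuple(zip(*batch))
```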
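And a sketch of `models/detection.py`, using the standard torchvision head-swapping pattern. The `weights` parameter is an added convenience (not in the checklist) so unit tests can avoid downloading the COCO-pretrained detection weights.

```python
# models/detection.py -- standard torchvision head replacement.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor


def get_maskrcnn_model(num_classes, hidden_layer=256, weights="DEFAULT"):
    model = torchvision.models.detection.maskrcnn_resnet50_fpn_v2(weights=weights)

    # Replace the box predictor head with one sized for our classes.
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    # Replace the mask predictor head likewise.
    in_features_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_features_mask, hidden_layer, num_classes)
    return model
```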
## Phase 3: Training Script & Core Logic

- Set up basic `train.py` structure.
  - Add imports.
  - Implement `argparse` for a `--config` argument.
  - Implement dynamic config loading (`importlib`).
  - Set random seeds.
  - Determine compute device (`cuda` or `cpu`).
  - Create output directory structure (`outputs/<config_name>/checkpoints`).
  - Instantiate `PennFudanDataset` (train).
  - Instantiate `DataLoader` (train) using `collate_fn`.
  - Instantiate model using `get_maskrcnn_model`.
  - Move model to device.
  - Add `if __name__ == "__main__":` guard.
- Implement minimal training step in `train.py`.
  - Instantiate optimizer (`torch.optim.SGD`).
  - Set `model.train()`.
  - Fetch one batch.
  - Move data to device.
  - Perform forward pass (`loss_dict = model(...)`).
  - Calculate total loss (`sum(...)`).
  - Perform backward pass (`optimizer.zero_grad()`, `loss.backward()`, `optimizer.step()`).
  - Print/log loss for the single step (and temporarily exit).
- Implement logging setup in `utils/log_utils.py` (`setup_logging` function).
  - Configure `logging.basicConfig` for file and console output.
- Integrate logging into `train.py`.
  - Call `setup_logging`.
  - Replace `print` with `logging.info`.
  - Log config, device, and training progress/losses.
- Implement full training loop in `train.py`; see the training-loop sketch after this list.
  - Remove single-step exit.
  - Add LR scheduler (`torch.optim.lr_scheduler.StepLR`).
  - Add epoch loop.
  - Add batch loop, integrating the single training step logic.
  - Log loss periodically within the batch loop.
  - Step the LR scheduler at the end of each epoch.
  - Log total training time.
- Implement checkpointing in `train.py`; see the checkpoint sketch after this list.
  - Define checkpoint directory.
  - Implement logic to find and load the latest checkpoint (resume training).
  - Save checkpoints periodically (based on frequency or final epoch).
  - Include epoch, model state, optimizer state, scheduler state, config.
  - Log checkpoint loading/saving.
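A minimal sketch of the full training loop described above (the single step generalised to epoch and batch loops). Function names, config keys, and the logging format are assumptions; the only torchvision behaviour relied upon is that detection models return a loss dict when called in train mode with `(images, targets)`.

```python
# Core training loop for train.py -- a sketch, not the final structure.
import logging
import time

import torch


def train_one_epoch(model, optimizer, data_loader, device, epoch, log_every=10):
    model.train()
    for step, (images, targets) in enumerate(data_loader):
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) if isinstance(v, torch.Tensor) else v for k, v in t.items()}
                   for t in targets]
        loss_dict = model(images, targets)   # detection models return a dict of losses in train mode
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if step % log_every == 0:
            logging.info("epoch %d step %d loss %.4f", epoch, step, loss.item())


def run_training(model, train_loader, device, config, start_epoch=0):
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=config["lr"], momentum=config["momentum"],
                                weight_decay=config["weight_decay"])
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=config["lr_step_size"],
                                                gamma=config["lr_gamma"])
    start = time.time()
    for epoch in range(start_epoch, config["num_epochs"]):
        train_one_epoch(model, optimizer, train_loader, device, epoch)
        scheduler.step()                     # step the LR schedule once per epoch
    logging.info("Total training time: %.1fs", time.time() - start)
```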
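And a sketch of the checkpoint helpers. The file-naming scheme and checkpoint dict keys are assumptions; the zero-padded epoch number keeps a lexicographic sort equivalent to a numeric one.

```python
# Checkpoint helpers for train.py -- naming scheme and dict keys are assumptions.
import glob
import logging
import os

import torch


def save_checkpoint(ckpt_dir, epoch, model, optimizer, scheduler, config):
    os.makedirs(ckpt_dir, exist_ok=True)
    path = os.path.join(ckpt_dir, f"checkpoint_epoch_{epoch:04d}.pth")
    torch.save(
        {
            "epoch": epoch,
            "model_state": model.state_dict(),
            "optimizer_state": optimizer.state_dict(),
            "scheduler_state": scheduler.state_dict(),
            "config": config,
        },
        path,
    )
    logging.info("Saved checkpoint: %s", path)


def load_latest_checkpoint(ckpt_dir, model, optimizer, scheduler, device):
    """Return the epoch to resume from; 0 means no checkpoint was found."""
    paths = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_epoch_*.pth")))
    if not paths:
        return 0
    ckpt = torch.load(paths[-1], map_location=device)
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    scheduler.load_state_dict(ckpt["scheduler_state"])
    logging.info("Resumed from %s (epoch %d)", paths[-1], ckpt["epoch"])
    return ckpt["epoch"] + 1
```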
## Phase 4: Evaluation & Testing

- Add evaluation dependencies (`pycocotools`, optional initially).
- Create `utils/eval_utils.py` and implement an `evaluate` function; see the evaluation sketch after this list.
  - Set `model.eval()`.
  - Use `torch.no_grad()`.
  - Loop through validation/test dataloader.
  - Perform forward pass.
  - Calculate/aggregate metrics (start with average loss, potentially add mAP later).
  - Log evaluation metrics and time.
  - Return metrics.
- Integrate evaluation into `train.py`.
  - Create validation `Dataset` and `DataLoader` (using `torch.utils.data.Subset`).
  - Call `evaluate` at the end of each epoch.
  - Log validation metrics.
  - (Later) Implement logic to save the best model based on validation metric.
- Implement `test.py` script.
  - Reuse argument parsing, config loading, device setup, dataset/dataloader (test split), and model creation from `train.py`.
  - Add `--checkpoint` argument.
  - Load model weights from the specified checkpoint.
  - Call the `evaluate` function using the test dataloader.
  - Log/print final evaluation results.
  - Set up logging for testing (e.g., `test.log`).
- Create unit tests in `tests/` using `pytest`; see the test sketch after this list.
  - `tests/test_config.py`: Test config loading.
  - `tests/test_model.py`: Test model creation and head configuration.
  - `tests/test_data_utils.py`: Test dataset instantiation, length, and item format (requires data).
  - (Optional) Use fixtures in `tests/conftest.py` if needed.
- Add `pytest` execution to `.pre-commit-config.yaml`.
- Test pre-commit hooks (`pre-commit run --all-files`).
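A sketch of the loss-based first pass at `evaluate`. One caveat relative to the checklist items: torchvision detection models only return their loss dict while in `train()` mode, so this variant keeps the model in train mode with gradients disabled; a prediction-based metric such as mAP would use `model.eval()` instead. Names are otherwise assumptions.

```python
# utils/eval_utils.py -- loss-based evaluation sketch.
import logging
import time

import torch


@torch.no_grad()
def evaluate(model, data_loader, device):
    model.train()   # needed to obtain the loss dict; no gradients are computed here
    start = time.time()
    total_loss, num_batches = 0.0, 0
    for images, targets in data_loader:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) if isinstance(v, torch.Tensor) else v for k, v in t.items()}
                   for t in targets]
        loss_dict = model(images, targets)
        total_loss += sum(loss.item() for loss in loss_dict.values())
        num_batches += 1
    metrics = {"avg_loss": total_loss / max(num_batches, 1)}
    logging.info("Evaluation took %.1fs: %s", time.time() - start, metrics)
    return metrics
```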
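And a small example of the kind of check `tests/test_model.py` could contain; the import path and the `weights=None` shortcut follow the sketches above and are assumptions.

```python
# tests/test_model.py -- verifies the swapped heads match num_classes.
from models.detection import get_maskrcnn_model


def test_model_heads_match_num_classes():
    num_classes = 2  # background + pedestrian
    model = get_maskrcnn_model(num_classes, weights=None)  # avoid the COCO weight download in tests
    assert model.roi_heads.box_predictor.cls_score.out_features == num_classes
    assert model.roi_heads.mask_predictor.mask_fcn_logits.out_channels == num_classes
```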
## Phase 5: Refinement & Documentation

- Refine error handling in `train.py` and `test.py` (`try...except`).
- Add configuration validation checks; see the validation sketch after this list.
- Improve evaluation metrics (e.g., implement mAP in the `evaluate` function).
- Add more data augmentations to `get_transform(train=True)`.
- Expand `README.md` significantly.
  - Goals
  - Detailed Setup
  - Configuration explanation
  - Training instructions (including resuming)
  - Testing instructions
  - Project Structure overview
  - Dependencies list
  - (Optional) Results section
- Perform final code quality checks (`ruff format .`, `ruff check . --fix`).
- Ensure all pre-commit hooks pass.
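As a companion to the refinement items above, a sketch of what the configuration validation checks might look like. The helper location (`utils/config_utils.py`) and the set of required keys are assumptions tied to the earlier config sketch.

```python
# utils/config_utils.py (hypothetical location) -- required keys mirror the config sketch above.
REQUIRED_KEYS = {"data_root", "num_classes", "num_epochs", "batch_size", "lr"}


def validate_config(config: dict) -> None:
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"Config is missing required keys: {sorted(missing)}")
    if config["num_classes"] < 2:
        raise ValueError("num_classes must include the background class (>= 2 for PennFudan)")
    if config["num_epochs"] <= 0 or config["batch_size"] <= 0:
        raise ValueError("num_epochs and batch_size must be positive")
```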