docs: Add development guide for the CIFAR-100 WideResNet project

Create a detailed AGENTS.md document that gives project contributors complete development guidance. It covers project architecture, module organization, coding conventions, testing practices, and memory configuration management.

Key contents:
- Project directory structure and core module descriptions
- Recommended installation methods and run commands
- Python coding style and naming conventions
- Manual testing and validation checklist
- Commit message and PR guidelines
- Configuration presets for different GPU memory limits
- Notes on memory optimization techniques

This document helps new contributors quickly understand the project structure, follow consistent development conventions, and manage GPU resources effectively.
Author: drd_vic
Date: 2025-11-18 18:29:33 +08:00
Parent: 8548ae416e
Commit: 361f082477

AGENTS.md (new file, 130 lines)

@@ -0,0 +1,130 @@
# Repository Guidelines
This document provides coding conventions and workflow guidance for contributing to the CIFAR-100 WideResNet classification project.
## Project Structure & Module Organization
The project follows a standard `src` layout for clean packaging and distribution:
```
Cifar100/
├── src/cifar100/ # Core package modules
│ ├── config.py # Model configuration presets (2GB/4GB/8GB)
│ ├── data.py # Dataset loading and augmentation
│ ├── model.py # WideResNet architecture implementation
│ ├── trainer.py # Training and evaluation loops
│ └── visualizer.py # Training metrics visualization
├── main.py # Main training script entry point
├── test_memory.py # GPU memory usage testing utility
├── plots/ # Generated training visualization outputs
└── dist/ # Build artifacts (wheel packages)
```
**Key modules:**
- `config.py`: Memory-optimized configurations for different GPU sizes
- `model.py`: WideResNet blocks with GELU activation, dropout, and label smoothing (a hypothetical block sketch follows this list)
- `trainer.py`: Gradient accumulation, label smoothing, early stopping logic
- `visualizer.py`: Matplotlib-based training curve plotting
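`model.py` itself is not reproduced in this commit, so as orientation only, here is a minimal sketch of what a WideResNet-style pre-activation block with GELU and dropout might look like (the class name, layout, and dropout rate are all illustrative):
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WideBasicBlock(nn.Module):
    """Hypothetical pre-activation WRN block: BN -> GELU -> Conv, twice, with dropout."""

    def __init__(self, in_planes: int, planes: int, stride: int = 1, dropout: float = 0.3):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, planes, 3, stride=stride, padding=1, bias=False)
        self.drop = nn.Dropout(dropout)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, stride=1, padding=1, bias=False)
        # 1x1 projection when the residual shape changes, identity otherwise
        self.shortcut = (
            nn.Conv2d(in_planes, planes, 1, stride=stride, bias=False)
            if stride != 1 or in_planes != planes
            else nn.Identity()
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv1(F.gelu(self.bn1(x)))
        out = self.conv2(F.gelu(self.bn2(self.drop(out))))
        return out + self.shortcut(x)
```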
## Build, Test, and Development Commands
**Installation:**
```bash
# Recommended: Install with uv package manager
uv sync
# Alternative: Install with pip
pip install .
```
**Run training:**
```bash
python main.py
```
**Memory testing:**
```bash
python test_memory.py
```
**Build distribution:**
```bash
python -m build
```
## Coding Style & Naming Conventions
- **Indentation**: 4 spaces (Python PEP 8 standard)
- **Imports**: Group standard library, third-party, and local imports separately
- **Docstrings**: Use triple quotes for module and function documentation
- **Naming**:
  - Classes: `PascalCase` (e.g., `WideResNet`, `TrainingVisualizer`)
  - Functions/methods: `snake_case` (e.g., `train_epoch`, `get_config`)
  - Constants: `UPPER_SNAKE_CASE` (e.g., `CONFIG_4GB`)
  - Private methods: prefix with underscore (e.g., `_initialize_weights`)
- **File naming**: Use `snake_case` for Python modules
**Code patterns** (see the sketch after this list):
- Use `torch.device` for GPU/CPU device management
- Apply gradient accumulation via `accumulation_steps` parameter
- Include memory optimization: `torch.cuda.empty_cache()`, `gc.collect()`
- Use `non_blocking=True` for async GPU transfers
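Taken together, these patterns look roughly like the following minimal sketch (the tiny linear model and random data are stand-ins; the real code uses WideResNet and the CIFAR-100 loaders):
```python
import gc
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# torch.device for GPU/CPU device management
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(32, 100).to(device)  # stand-in for the real WideResNet
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# pin_memory makes non_blocking=True host-to-device copies truly asynchronous
data = TensorDataset(torch.randn(256, 32), torch.randint(0, 100, (256,)))
loader = DataLoader(data, batch_size=64, pin_memory=torch.cuda.is_available())

for step, (x, y) in enumerate(loader):
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    loss = nn.functional.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 100 == 0:  # periodic cleanup keeps fragmentation down on small GPUs
        gc.collect()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
```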
## Testing Guidelines
This project does not currently have a formal test suite. For validation:
- Run `test_memory.py` to verify GPU memory usage across configurations
- Monitor training metrics (loss/accuracy) to validate model changes
- Check visualizations in `plots/` directory after training runs
**Manual validation checklist:**
- Model loads without errors
- Training completes at least 10 epochs
- Memory usage stays within GPU limits (a sample check follows this list)
- Plots are generated correctly
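`test_memory.py`'s contents are not shown in this commit; for the memory-limit check in particular, a check along these lines is one way to measure peak usage with PyTorch's built-in memory stats (the batch size, input shape, and 4.0 GB budget are illustrative):
```python
import torch

def check_peak_memory(model: torch.nn.Module, batch_size: int, limit_gb: float = 4.0) -> float:
    """Run one forward/backward pass and report peak allocated GPU memory."""
    assert torch.cuda.is_available(), "this check requires a CUDA device"
    device = torch.device("cuda")
    torch.cuda.reset_peak_memory_stats(device)
    model = model.to(device)
    x = torch.randn(batch_size, 3, 32, 32, device=device)  # CIFAR-100 input shape
    model(x).sum().backward()  # include gradients in the measured footprint
    peak_gb = torch.cuda.max_memory_allocated(device) / 1024**3
    print(f"peak memory: {peak_gb:.2f} GB (budget: {limit_gb:.1f} GB)")
    assert peak_gb < limit_gb, "configuration exceeds the GPU memory budget"
    return peak_gb
```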
## Commit & Pull Request Guidelines
**Commit message format** (based on project history):
```
<type>: <brief description>
Types:
- docs: Documentation updates (README, comments)
- build: Build system or dependencies
- chore: Maintenance tasks (version bumps, cleanup)
- feat: New features
- fix: Bug fixes
```
**Examples:**
- `docs: Update README.md with installation instructions`
- `build: Add build artifacts to dist/`
- `chore: Fix version badge repository link`
**Before committing:**
1. Verify code runs without errors
2. Test memory usage if modifying model architecture
3. Update relevant documentation (README, docstrings)
4. Clean up any debug print statements
## Configuration & Memory Management
**Memory presets** in `config.py` (a possible layout is sketched after the list):
- `2gb`: WRN-16-2, batch_size=32, accumulation_steps=4
- `4gb`: WRN-22-4, batch_size=64, accumulation_steps=2 (default)
- `8gb`: WRN-28-10, batch_size=128, accumulation_steps=1
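`config.py`'s exact contents are not reproduced here; below is a hypothetical sketch of how such presets could be laid out, with `depth`/`widen_factor` implied by the WRN names above (the `learning_rate` and `dropout` keys are assumptions):
```python
# Illustrative preset layout; the real config.py may differ.
CONFIG_2GB = {"depth": 16, "widen_factor": 2, "batch_size": 32,
              "accumulation_steps": 4, "learning_rate": 0.1, "dropout": 0.3}
CONFIG_4GB = {"depth": 22, "widen_factor": 4, "batch_size": 64,
              "accumulation_steps": 2, "learning_rate": 0.1, "dropout": 0.3}
CONFIG_8GB = {"depth": 28, "widen_factor": 10, "batch_size": 128,
              "accumulation_steps": 1, "learning_rate": 0.1, "dropout": 0.3}

_PRESETS = {"2gb": CONFIG_2GB, "4gb": CONFIG_4GB, "8gb": CONFIG_8GB}

def get_config(name: str = "4gb") -> dict:
    """Return a copy so callers can override fields without mutating the preset."""
    return dict(_PRESETS[name])
```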
**To modify configuration:**
```python
config = get_config("4gb") # Choose preset
# Or override specific parameters
config["learning_rate"] = 0.05
```
**Memory optimization techniques used** (combined in the sketch below):
- Gradient accumulation to simulate larger batch sizes
- Periodic `torch.cuda.empty_cache()` calls
- `cudnn.benchmark = True` for performance
- Gradient clipping at `max_norm=1.0`
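Combined in one training step, these techniques might look like the following minimal sketch (the linear model and random data are stand-ins; `accumulation_steps=2` matches the 4gb preset):
```python
import torch
from torch import nn
from torch.backends import cudnn
from torch.utils.data import DataLoader, TensorDataset

cudnn.benchmark = True  # let cuDNN pick the fastest kernels for fixed input shapes

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(32, 100).to(device)  # stand-in for the real WideResNet
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = TensorDataset(torch.randn(512, 32), torch.randint(0, 100, (512,)))
loader = DataLoader(data, batch_size=64)
accumulation_steps = 2  # effective batch size = 2 * 64

optimizer.zero_grad()
for step, (x, y) in enumerate(loader):
    loss = nn.functional.cross_entropy(model(x.to(device)), y.to(device))
    (loss / accumulation_steps).backward()  # scale so accumulated grads match a big batch
    if (step + 1) % accumulation_steps == 0:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        optimizer.zero_grad()
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # periodic cache release
```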