docs: Add CIFAR-100 WideResNet project development guide

Create a detailed AGENTS.md document that gives project contributors complete development guidance. The document covers project architecture, module organization, coding conventions, testing practices, and memory configuration management.

Main contents:
- Project directory structure and core module responsibilities
- Recommended installation methods and run commands
- Python coding style and naming conventions
- Manual testing and validation checklist
- Commit message and PR guidelines
- Configuration presets for different GPU memory limits
- Notes on memory optimization techniques

This document helps new contributors quickly understand the project structure, follow consistent development conventions, and manage GPU resources effectively.
# Repository Guidelines

This document provides coding conventions and workflow guidance for contributing to the CIFAR-100 WideResNet classification project.
## Project Structure & Module Organization

The project follows a standard `src` layout for clean packaging and distribution:

```
Cifar100/
├── src/cifar100/          # Core package modules
│   ├── config.py          # Model configuration presets (2GB/4GB/8GB)
│   ├── data.py            # Dataset loading and augmentation
│   ├── model.py           # WideResNet architecture implementation
│   ├── trainer.py         # Training and evaluation loops
│   └── visualizer.py      # Training metrics visualization
├── main.py                # Main training script entry point
├── test_memory.py         # GPU memory usage testing utility
├── plots/                 # Generated training visualization outputs
└── dist/                  # Build artifacts (wheel packages)
```

**Key modules:**
- `config.py`: Memory-optimized configurations for different GPU sizes
- `model.py`: WideResNet blocks with GELU activation, dropout, and label smoothing
- `trainer.py`: Gradient accumulation, label smoothing, early stopping logic
- `visualizer.py`: Matplotlib-based training curve plotting
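For orientation, here is a minimal, illustrative sketch of the kind of residual block `model.py` builds on. The class name `WideBasicBlock`, the pre-activation ordering, and the default dropout rate are assumptions for this sketch, not the project's actual code:

```python
import torch
import torch.nn as nn


class WideBasicBlock(nn.Module):
    """Illustrative wide residual block with GELU and dropout (hypothetical; see src/cifar100/model.py)."""

    def __init__(self, in_planes: int, out_planes: int, stride: int = 1, dropout: float = 0.3):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_planes)
        self.dropout = nn.Dropout(dropout)
        self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1, padding=1, bias=False)
        self.act = nn.GELU()
        # 1x1 projection on the shortcut when the spatial size or channel count changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != out_planes:
            self.shortcut = nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv1(self.act(self.bn1(x)))
        out = self.conv2(self.dropout(self.act(self.bn2(out))))
        return out + self.shortcut(x)
```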
## Build, Test, and Development Commands

**Installation:**
```bash
# Recommended: Install with uv package manager
uv sync

# Alternative: Install with pip
pip install .
```

**Run training:**
```bash
python main.py
```

**Memory testing:**
```bash
python test_memory.py
```

**Build distribution:**
```bash
python -m build
```
## Coding Style & Naming Conventions

- **Indentation**: 4 spaces (Python PEP 8 standard)
- **Imports**: Group standard library, third-party, and local imports separately
- **Docstrings**: Use triple quotes for module and function documentation
- **Naming**:
  - Classes: `PascalCase` (e.g., `WideResNet`, `TrainingVisualizer`)
  - Functions/methods: `snake_case` (e.g., `train_epoch`, `get_config`)
  - Constants: `UPPER_SNAKE_CASE` (e.g., `CONFIG_4GB`)
  - Private methods: prefix with underscore (e.g., `_initialize_weights`)
- **File naming**: Use `snake_case` for Python modules
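A small, made-up module header that follows these conventions (the file name, class, and helper below are invented for illustration; only `get_config` and the package path mirror the project layout):

```python
# hypothetical file: dataset_builder.py (modules use snake_case)

# Standard library imports
import random

# Third-party imports
from torch.utils.data import DataLoader

# Local imports (package path as laid out under src/)
from cifar100.config import get_config

DEFAULT_PRESET = "4gb"  # constants in UPPER_SNAKE_CASE


class DatasetBuilder:  # classes in PascalCase
    """Builds CIFAR-100 dataloaders for a given memory preset."""

    def __init__(self, preset: str = DEFAULT_PRESET):
        self.config = get_config(preset)

    def build_loader(self, dataset) -> DataLoader:  # functions/methods in snake_case
        self._seed_workers(0)
        return DataLoader(dataset, batch_size=self.config["batch_size"], shuffle=True)

    def _seed_workers(self, worker_id: int) -> None:  # private helpers get a leading underscore
        random.seed(worker_id)
```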
**Code patterns:**
- Use `torch.device` for GPU/CPU device management
- Apply gradient accumulation via the `accumulation_steps` parameter
- Include memory optimization: `torch.cuda.empty_cache()`, `gc.collect()`
- Use `non_blocking=True` for async GPU transfers
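A condensed sketch of how these patterns typically fit together in a training step. Variable names and the cleanup cadence (every 100 steps) are assumptions, not a copy of `trainer.py`:

```python
import gc
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def train_one_epoch(model, loader, optimizer, criterion, accumulation_steps=2):
    model.train()
    optimizer.zero_grad()
    for step, (images, targets) in enumerate(loader):
        # Async host-to-device copies (requires pin_memory=True on the DataLoader)
        images = images.to(device, non_blocking=True)
        targets = targets.to(device, non_blocking=True)

        # Scale the loss so accumulated gradients match a larger effective batch
        loss = criterion(model(images), targets) / accumulation_steps
        loss.backward()

        # Only step the optimizer every `accumulation_steps` mini-batches
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()

        # Periodically release cached GPU memory on tight-memory GPUs
        if step % 100 == 0:
            torch.cuda.empty_cache()
            gc.collect()
```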
## Testing Guidelines

This project does not currently have a formal test suite. For validation:

- Run `test_memory.py` to verify GPU memory usage across configurations
- Monitor training metrics (loss/accuracy) to validate model changes
- Check visualizations in the `plots/` directory after training runs

**Manual validation checklist:**
- Model loads without errors
- Training completes at least 10 epochs
- Memory usage stays within GPU limits
- Plots are generated correctly
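One low-tech way to check the memory item on that list after a run or an architecture change, using plain PyTorch and independent of `test_memory.py`:

```python
import torch

# Report peak GPU memory after a training/eval run to confirm it fits the target preset
if torch.cuda.is_available():
    peak_alloc = torch.cuda.max_memory_allocated() / 1024**2
    peak_reserved = torch.cuda.max_memory_reserved() / 1024**2
    print(f"Peak allocated: {peak_alloc:.0f} MiB | Peak reserved: {peak_reserved:.0f} MiB")
else:
    print("CUDA not available; skipping GPU memory check")
```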
## Commit & Pull Request Guidelines

**Commit message format** (based on project history):
```
<type>: <brief description>

Types:
- docs: Documentation updates (README, comments)
- build: Build system or dependencies
- chore: Maintenance tasks (version bumps, cleanup)
- feat: New features
- fix: Bug fixes
```

**Examples:**
- `docs: Update README.md with installation instructions`
- `build: Add build artifacts to dist/`
- `chore: Fix version badge repository link`

**Before committing:**
1. Verify code runs without errors
2. Test memory usage if modifying model architecture
3. Update relevant documentation (README, docstrings)
4. Clean up any debug print statements
## Configuration & Memory Management

**Memory presets** in `config.py`:
- `2gb`: WRN-16-2, batch_size=32, accumulation_steps=4
- `4gb`: WRN-22-4, batch_size=64, accumulation_steps=2 (default)
- `8gb`: WRN-28-10, batch_size=128, accumulation_steps=1
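For reference, a preset of this shape might look roughly like the following sketch. The exact keys and values beyond `batch_size`, `accumulation_steps`, and `learning_rate` are assumptions; check `src/cifar100/config.py` for the real definitions:

```python
# Hypothetical shape of a preset in config.py
CONFIG_4GB = {
    "depth": 22,              # WRN-22-4 (depth/widen_factor keys are assumed)
    "widen_factor": 4,
    "batch_size": 64,
    "accumulation_steps": 2,  # effective batch size = 64 * 2
    "learning_rate": 0.1,
    "dropout": 0.3,
}


def get_config(preset: str = "4gb") -> dict:
    """Return a copy of the named preset so callers can override fields safely."""
    presets = {"4gb": CONFIG_4GB}  # 2gb / 8gb presets omitted in this sketch
    return dict(presets[preset])
```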
**To modify configuration:**
```python
config = get_config("4gb")  # Choose preset
# Or override specific parameters
config["learning_rate"] = 0.05
```
**Memory optimization techniques used:**
- Gradient accumulation to simulate larger batch sizes
- Periodic `torch.cuda.empty_cache()` calls
- `cudnn.benchmark = True` for performance
- Gradient clipping at `max_norm=1.0`
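The last two items boil down to a couple of lines, shown here in isolation with a placeholder model; where exactly they live in the project's code is not prescribed by this guide:

```python
import torch
from torch import nn
from torch.backends import cudnn

cudnn.benchmark = True  # let cuDNN pick the fastest conv algorithms for fixed input shapes

model = nn.Linear(8, 8)               # placeholder model just for illustration
loss = model(torch.randn(4, 8)).sum()
loss.backward()

# Clip gradients before optimizer.step(), matching the max_norm=1.0 used in training
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```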