NanoI2V: Building an Image-to-Video Model from Scratch

NanoI2V is an open-source project that implements an Image-to-Video generation model from scratch, aiming to provide a clean and educational reference for modern video generation techniques. The repository prioritizes readability and reproducibility over the complexity found in most state-of-the-art projects.

Implements core components in a modular fashion using PyTorch.
Covers Transformer-based architectures and diffusion or flow-matching training methods.
Provides independent, modifiable components for experimentation with the generation pipeline.
Focuses on explaining building blocks rather than wrapping existing black-box models.
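
As a rough illustration of what a flow-matching training step with independent, swappable components can look like in PyTorch, here is a minimal sketch. It is not taken from the NanoI2V repository: the names TinyVelocityNet and flow_matching_loss, the tensor shapes, and the use of a small 3D convolutional network in place of a Transformer are all assumptions made to keep the example short. The loss follows the common rectified-flow formulation, in which the model regresses the straight-line velocity from noise to data, conditioned here on the first frame.

```python
import torch
import torch.nn as nn


class TinyVelocityNet(nn.Module):
    """Hypothetical stand-in for a video Transformer; predicts a velocity field."""

    def __init__(self, channels: int = 3, hidden: int = 16):
        super().__init__()
        # Input channels: noisy video + conditioning image (repeated over time) + time map.
        self.net = nn.Sequential(
            nn.Conv3d(2 * channels + 1, hidden, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv3d(hidden, channels, kernel_size=3, padding=1),
        )

    def forward(self, x_t, image, t):
        # x_t: (B, C, T, H, W), image: (B, C, H, W), t: (B,)
        b, _, frames, h, w = x_t.shape
        cond = image.unsqueeze(2).expand(-1, -1, frames, -1, -1)
        t_map = t.view(b, 1, 1, 1, 1).expand(-1, 1, frames, h, w)
        return self.net(torch.cat([x_t, cond, t_map], dim=1))


def flow_matching_loss(model, video, image):
    """Rectified-flow style loss (an assumed formulation, not the repo's)."""
    noise = torch.randn_like(video)
    t = torch.rand(video.shape[0], device=video.device)
    t_b = t.view(-1, 1, 1, 1, 1)
    x_t = (1.0 - t_b) * noise + t_b * video  # point on the straight path noise -> data
    target = video - noise                   # constant velocity along that path
    pred = model(x_t, image, t)
    return torch.mean((pred - target) ** 2)


torch.manual_seed(0)
model = TinyVelocityNet()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

video = torch.randn(2, 3, 4, 8, 8)  # random stand-in for a batch of short clips
image = video[:, :, 0]              # first frame used as the conditioning image

loss = flow_matching_loss(model, video, image)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

Because the loss function only needs a callable that maps (x_t, image, t) to a velocity tensor, the network can be replaced by a Transformer-based model without touching the training step, which is the kind of modularity the list above describes.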

The project is designed to help researchers and students understand how the pieces of a video generation system fit together, without having to read through thousands of lines of framework code.