NanoI2V is an open-source project that implements an image-to-video (I2V) generation model from scratch, aiming to provide a clean, educational reference for modern video generation techniques. The repository favors readability and reproducibility over the complexity typical of state-of-the-art projects.

  • Implements core components in a modular fashion using PyTorch.
  • Covers Transformer-based architectures and diffusion or flow-matching training methods.
  • Provides independent, modifiable components for experimentation with the generation pipeline.
  • Focuses on explaining building blocks rather than wrapping existing black-box models.
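
As a rough illustration of the flow-matching option mentioned in the list above, the sketch below shows the idea on toy vectors. It is not taken from the NanoI2V code: the function names (`interpolate`, `target_velocity`, `flow_matching_loss`, `euler_sample`) are hypothetical, the straight-line noise-to-data path is an assumed choice of formulation, and plain Python lists stand in for PyTorch tensors so the example runs without dependencies. An oracle velocity replaces the Transformer that a real model would train.

```python
import random


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(predicted_v, x0, x1):
    """Mean squared error between a predicted velocity and the target."""
    target = target_velocity(x0, x1)
    return sum((p - v) ** 2 for p, v in zip(predicted_v, target)) / len(target)


def euler_sample(velocity_fn, x0, steps=8):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
x1 = [0.5, -1.0, 2.0]                    # toy "data" sample (assumed values)
x0 = [rng.gauss(0.0, 1.0) for _ in x1]   # Gaussian noise sample
t = rng.random()                         # training time drawn from [0, 1)
x_t = interpolate(x0, x1, t)             # input a model would see during training


def oracle(x, t):
    """Stand-in for a trained network: returns the exact target velocity."""
    return target_velocity(x0, x1)


loss = flow_matching_loss(oracle(x_t, t), x0, x1)
sample = euler_sample(oracle, x0, steps=8)
```

With the oracle in place the loss is zero and Euler integration carries the noise sample onto the data sample. In a trained system the oracle would be replaced by a learned network that only approximates this velocity, conditioned on the input image.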

The project is designed to help researchers and students understand how the pieces of a video generation system fit together, without having to read through thousands of lines of framework code.