The author pretrained a 500M-parameter language model and a 330M-parameter image generator from scratch, using 40B tokens from FineWeb. The image generator was inspired by ByteDance's DreamLite architecture and trained on a mixture of datasets from Midjourney, Flux, and CCW3.
I pretrained and post-trained a 500M-parameter LLM and a 330M-parameter image generator from scratch