A Blind Visual Paradigm for Testing Skill Transfer in Small Models Without Fine-Tuning

The author proposes a cross-domain, blind visual experiment to determine whether a large language model can compress its procedural planning into a reusable scaffold that improves a small model's output without fine-tuning. Using Three.js as the testbed, the study aims to show that this skill transfer is genuine rather than an artifact of overfitting to the source domain.

The baseline compares outputs from a large model (Model A) and a small 9B-parameter model (Model B) on two distinct prompts: a cinematic scene featuring Michael Jackson and other figures, and a low-poly BMPT-72 turret.
The hypothesis posits that Model A can extract a "Procedural Scaffold" containing general construction principles rather than specific answers to the source prompt.
Validation involves applying this scaffold to Model B for the second task and using a fresh instance of a large model (Model C) as a blind judge with zero context about the experiment.
Model C scores the rendered images quantitatively on visual quality, silhouette recognition, structural coherence, and detail density to determine whether the scaffolded small model's output improves relative to the large model's baseline.
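
The blind-judging step described above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the condition names (`large_baseline`, `small_baseline`, `small_scaffolded`), the anonymous `image_N` labels, and the use of an unweighted mean across the four criteria are all assumptions made for the example. The source does not specify a scoring scale or an aggregation rule.

```python
import random
from statistics import mean

# The four criteria named in the text; the identifiers are illustrative.
CRITERIA = (
    "visual_quality",
    "silhouette_recognition",
    "structural_coherence",
    "detail_density",
)


def blind_labels(conditions, seed=None):
    """Assign anonymous labels to conditions in random order.

    The judge only ever sees the labels (image_1, image_2, ...), so it
    cannot tell which model or setup produced which render.
    """
    rng = random.Random(seed)
    shuffled = list(conditions)
    rng.shuffle(shuffled)
    return {f"image_{i + 1}": cond for i, cond in enumerate(shuffled)}


def overall(scores):
    """Unweighted mean over the four criteria (an assumed aggregation)."""
    return mean(scores[c] for c in CRITERIA)


def summarize(judge_scores, labels):
    """Un-blind the judge's scores and compare the three conditions.

    judge_scores: {anonymous label: {criterion: numeric score}}
    labels:       {anonymous label: condition name}
    """
    by_condition = {labels[lbl]: overall(s) for lbl, s in judge_scores.items()}
    return {
        "by_condition": by_condition,
        # How much the scaffold helped the small model on the target task.
        "scaffold_gain": by_condition["small_scaffolded"]
        - by_condition["small_baseline"],
        # How far the scaffolded small model remains from the large model.
        "remaining_gap": by_condition["large_baseline"]
        - by_condition["small_scaffolded"],
    }
```

In use, `blind_labels` would be called once per task, the labeled renders would be passed to Model C, and `summarize` would be applied to the returned scores. A positive `scaffold_gain` together with a shrinking `remaining_gap` on the second prompt is the pattern the hypothesis predicts.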

This setup is intended to serve as a paradigm for proving post-training skill generalization by demonstrating that procedural knowledge can be transferred across semantically distinct domains within the same platform.