Concatenating LLM-generated features to graph neural networks systematically reduces accuracy on homophilous benchmarks, with PubMed accuracy dropping by -17.0 ± 0.3 pp. This degradation is linked to LLM-alone discriminability (Delta_sig), which correlates strongly with concatenation cost (r² = 0.38) and shows a power law relationship with feature dimension and node count (r² = 0.97), particularly in low-Delta_sig, low-node scenarios.
LLM Features Can Hurt GNNs via Concatenation Interference
from English