KARLA: Knowledge-base Augmented Retrieval for Language Models

The authors propose KARLA, a method enabling large language models to automatically retrieve factual knowledge from an external knowledge base during token generation. This approach allows factual updates without retraining the model and ensures that outputs are traceable to the source data.

The core mechanism involves training the model to generate special tokens that trigger queries to the knowledge base.
Factual revisions can be applied through edits to the knowledge base rather than updating model parameters.
The method enables smaller models to achieve factual accuracy comparable to that of larger models.
Experiments demonstrate improved factual grounding in both short-form and long-form text generation.
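
The mechanism described above can be illustrated with a minimal sketch. This is not the authors' implementation: the marker tokens `<kb>` and `</kb>`, the `subject | relation` query format, the dictionary-backed knowledge base, and the function names are all assumptions made for illustration, and a fixed token list stands in for the model's output. The sketch shows how a special token could trigger a lookup during generation, how each retrieved fact could be logged for traceability, and how editing the knowledge base changes the output without touching the model.

```python
from typing import Dict, List, Tuple

# Hypothetical marker tokens; the summary does not specify the real token format.
QUERY_OPEN = "<kb>"
QUERY_CLOSE = "</kb>"

# Toy knowledge base: (subject, relation) -> object. Purely illustrative.
KnowledgeBase = Dict[Tuple[str, str], str]


def lookup(kb: KnowledgeBase, query: str) -> str:
    """Resolve a 'subject | relation' query; return a placeholder if missing."""
    subject, _, relation = query.partition("|")
    return kb.get((subject.strip(), relation.strip()), "[unknown]")


def generate_with_retrieval(
    model_tokens: List[str], kb: KnowledgeBase
) -> Tuple[str, List[Tuple[str, str]]]:
    """Replay tokens standing in for model output, resolving KB queries inline.

    Returns the final text and a trace of (query, answer) pairs, so every
    inserted fact can be traced back to a knowledge-base entry.
    """
    output: List[str] = []
    trace: List[Tuple[str, str]] = []
    query_buf = None  # None means we are not currently inside a query span

    for tok in model_tokens:
        if tok == QUERY_OPEN:
            query_buf = []
        elif tok == QUERY_CLOSE and query_buf is not None:
            query = " ".join(query_buf)
            answer = lookup(kb, query)
            output.append(answer)          # the retrieved fact replaces the query span
            trace.append((query, answer))  # record provenance
            query_buf = None
        elif query_buf is not None:
            query_buf.append(tok)          # still collecting the query text
        else:
            output.append(tok)             # ordinary generated token

    return " ".join(output), trace


if __name__ == "__main__":
    kb: KnowledgeBase = {("Acme Corp", "CEO"): "Alice"}
    tokens = ["The", "CEO", "of", "Acme Corp", "is",
              QUERY_OPEN, "Acme Corp", "|", "CEO", QUERY_CLOSE, "."]

    print(generate_with_retrieval(tokens, kb))

    # A factual revision is an edit to the knowledge base, not to the model.
    kb[("Acme Corp", "CEO")] = "Bob"
    print(generate_with_retrieval(tokens, kb))
```

In this toy setup the same token stream yields a different sentence after the knowledge-base edit, which is the property the summary attributes to KARLA. A real system would interleave the lookup with the model's decoding loop rather than replaying a fixed list.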

Because each generated fact can be traced to a knowledge-base entry, the approach improves transparency and explainability, and updates remain efficient because they avoid costly retraining of model parameters.