Cluster-based semantic chunking does not outperform fixed-size or recursive chunking for academic RAG

A study evaluates whether cluster-based semantic chunking improves retrieval and answer quality in Retrieval-Augmented Generation (RAG) systems compared to fixed-size and recursive chunking strategies. The evaluation focuses on long, structured academic theses using the RAGAs framework.
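To make the two baseline strategies concrete, here is a minimal sketch of fixed-size and recursive chunking in plain Python. It is illustrative only and is not the study's implementation: the function names, the character-based size limit, the overlap parameter, and the separator order are all assumptions, since the summary does not state the configuration that was tested.

```python
from typing import List, Sequence


def fixed_size_chunks(text: str, size: int = 200, overlap: int = 0) -> List[str]:
    """Cut text into windows of at most `size` characters (hypothetical helper)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


def recursive_chunks(
    text: str,
    size: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),  # assumed order
) -> List[str]:
    """Split on the coarsest separator first and recurse into oversized parts."""
    if len(text) <= size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to hard character cuts.
        return fixed_size_chunks(text, size)
    sep, rest = separators[0], separators[1:]
    parts = text.split(sep)
    if len(parts) == 1:
        return recursive_chunks(text, size, rest)

    chunks: List[str] = []
    buffer = ""
    for part in parts:
        candidate = part if not buffer else buffer + sep + part
        if len(candidate) <= size:
            buffer = candidate  # keep merging small neighbouring parts
            continue
        if buffer.strip():
            chunks.append(buffer)
        buffer = ""
        if len(part) <= size:
            buffer = part
        else:
            # The part alone is too long: retry with finer separators.
            chunks.extend(recursive_chunks(part, size, rest))
    if buffer.strip():
        chunks.append(buffer)
    return chunks


sample = (
    "Intro paragraph.\n\n"
    "Methods paragraph one. Methods paragraph two.\n\n"
    "Results."
)
for chunk in recursive_chunks(sample, size=40):
    print(repr(chunk))
```

In this sketch the recursive splitter keeps paragraph and sentence boundaries where it can and only falls back to hard cuts when no separator fits, whereas the fixed-size splitter ignores document structure entirely. Cluster-based semantic chunking would additionally group text by embedding similarity, which is not shown here.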

Cluster-based chunking did not outperform simpler strategies under the tested configuration.
Performance differed substantially between fixed and document-specific questions, a gap likely related to document formatting and preprocessing.
RAGAs-based faithfulness showed limited reliability in this setup.

The findings suggest that, for this specific use case, more complex chunking methods may not offer an advantage over simpler approaches.