Average Rankings Mask Per-Subject Optimality: A Friedman-Nemenyi Benchmark of EEG Motor-Imagery BCI Decoders

This study tests whether any single decoding pipeline dominates across subjects in motor-imagery brain-computer interfaces (BCIs), evaluating 1,056 pipeline configurations on three public datasets with nonparametric statistical tests.

Evaluated more than 340,000 subject-level model fits across the PhysionetMI, Cho2017, and Zhou2016 datasets within the MOABB framework.
Applied Friedman omnibus tests, Nemenyi critical-difference analysis, and Wilcoxon signed-rank tests to compare feature extractors, scalers, and classifiers.
Covariance tangent-space projection (cov-tgsp) and Common Spatial Patterns (CSP) were the strongest families but showed dataset-dependent ordering.
On the PhysionetMI cohort, the best pipelines were statistically indistinguishable (Nemenyi p = 0.27; Kendall's W = 0.11).
The single best pipeline was optimal for only 35% of participants, while nonlinear descriptors were best for roughly one third.
Matching the pipeline to the participant improved accuracy by approximately seven percentage points over the best fixed choice.
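
The quantities reported above can be illustrated with a minimal sketch. It assumes a subjects-by-pipelines accuracy matrix and is not the study's actual analysis code: the function names and the toy matrix `demo_scores` are hypothetical, and the Nemenyi critical value `q_alpha` must be supplied from a published table (for example, 2.343 for three methods at alpha = 0.05). Kendall's W is derived here from the Friedman statistic as W = chi2 / (N(k - 1)), and the "oracle gap" is the mean accuracy gained by picking the best pipeline per subject instead of the pipeline with the best average.

```python
import numpy as np
from scipy.stats import friedmanchisquare


def friedman_kendall_w(scores):
    """Friedman omnibus test plus Kendall's W for an (n_subjects, n_pipelines) matrix."""
    n, k = scores.shape
    # friedmanchisquare takes one array per pipeline, each holding the per-subject scores.
    stat, p_value = friedmanchisquare(*scores.T)
    w = stat / (n * (k - 1))  # Kendall's coefficient of concordance, in [0, 1]
    return stat, p_value, w


def nemenyi_cd(n_subjects, n_pipelines, q_alpha):
    """Critical difference in mean rank for the Nemenyi post-hoc test."""
    k = n_pipelines
    return q_alpha * np.sqrt(k * (k + 1) / (6.0 * n_subjects))


def oracle_summary(scores):
    """Compare the best fixed pipeline with per-subject (oracle) selection."""
    mean_per_pipeline = scores.mean(axis=0)
    best_fixed = int(mean_per_pipeline.argmax())
    per_subject_best = scores.argmax(axis=1)
    # Fraction of subjects for whom the best-on-average pipeline is also their own best.
    share_optimal = float((per_subject_best == best_fixed).mean())
    # Mean accuracy gained by matching the pipeline to each subject.
    oracle_gap = float(scores.max(axis=1).mean() - mean_per_pipeline[best_fixed])
    return best_fixed, share_optimal, oracle_gap


# Hypothetical toy data: 4 subjects (rows) x 3 pipelines (columns), accuracies in [0, 1].
demo_scores = np.array([
    [0.80, 0.70, 0.60],
    [0.78, 0.72, 0.65],
    [0.60, 0.85, 0.70],
    [0.62, 0.66, 0.90],
])
```

In this toy matrix the pipeline with the highest mean accuracy is the best choice for only one of the four subjects, which mirrors in miniature how an average ranking can hide per-subject optimality. A low Kendall's W indicates weak agreement among subjects about the pipeline ordering.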

The findings indicate that no universal decoder exists even under favorable conditions, providing a quantitative case for participant-aware model selection rather than relying on average rankings.