Research paper — korshunov.ai

POS Tagging of Arabic-English Dictionary Senses via WordNet

The paper presents an algorithm that disambiguates Arabic-English dictionary senses and then transfers English part-of-speech tags to them from Princeton WordNet. This makes it possible, with high accuracy and at low cost, to link bilingual dictionaries to WordNet and standardize them in the WordNet-LMF format, in which synsets are the fundamental unit.

arXiv cs.CL · 1d ago

MorfFlex: Managing Rich Morphology in Czech

MorfFlex is a morphological dictionary architecture designed for languages with complex inflection and derivation. MorfFlex CZ, its primary implementation, contains over 100 million wordforms and more than 1 million lemmas, which are represented compactly through encoded inflectional and derivational patterns. It supports annotation consistency in the Prague Dependency Treebanks and powers tools like MorphoDiTa.

arXiv cs.CL · 1d ago

Stability of Prompt Ranking in LLM Evaluation

Prompt rankings in large language model evaluation are often unstable under minor variations such as different random seeds and limited evaluation subsets. A stability-aware selection strategy based on lower confidence bounds improves robustness by accounting for both performance and variance, while remaining competitive in stable settings.

arXiv cs.CL · 1d ago
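
To make the idea of selecting by a lower confidence bound concrete, here is a minimal sketch. It is not the paper's method: the scoring rule (mean minus `z` standard errors), the function names, and the example scores are all assumptions chosen for illustration.

```python
import math
import statistics


def lower_confidence_bound(scores, z=1.0):
    """Mean score minus z standard errors (an assumed, illustrative rule)."""
    mean = statistics.fmean(scores)
    if len(scores) < 2:
        return mean  # no variance estimate available from a single run
    stderr = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - z * stderr


def select_prompt(results, z=1.0):
    """Pick the prompt whose lower confidence bound is highest."""
    return max(results, key=lambda name: lower_confidence_bound(results[name], z))


# Hypothetical accuracy of two prompts over four seeds/subsets.
runs = {
    "prompt_a": [0.90, 0.60, 0.95, 0.55],  # higher mean, high variance
    "prompt_b": [0.74, 0.73, 0.75, 0.74],  # slightly lower mean, stable
}

best_by_mean = max(runs, key=lambda name: statistics.fmean(runs[name]))
best_by_lcb = select_prompt(runs)
print(best_by_mean, best_by_lcb)
```

With these made-up numbers, ranking by mean alone favors the volatile prompt, while the bound favors the stable one. Larger values of `z` penalize variance more heavily.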

AutoSpecNER: Fine-Grained NER Dataset for Vehicle Specifications

AutoSpecNER is a dataset of 659 car advertisements with over 10,000 entities annotated across 15 categories. It achieves 91.5% inter-annotator agreement and shows that DeBERTa outperforms both rule-based methods and large language models in vehicle specification extraction, reaching a 90% micro-F1 score.

arXiv cs.CL · 1d ago
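
For readers unfamiliar with the metric, micro-F1 pools true positives, false positives, and false negatives over all entity categories before computing precision and recall. The sketch below shows that calculation; the category names and counts are hypothetical and are not taken from AutoSpecNER.

```python
def micro_f1(counts):
    """Micro-averaged F1 from per-category tp/fp/fn counts."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two made-up entity categories.
example_counts = {
    "brand": {"tp": 90, "fp": 10, "fn": 10},
    "engine_power": {"tp": 40, "fp": 5, "fn": 15},
}
score = micro_f1(example_counts)
print(round(score, 4))
```

Because the counts are pooled, frequent categories weigh more than rare ones, which is the main difference from macro-averaging.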

LLM-based Two-Stage Transformer for Bearing Fault Diagnosis

A lightweight GPT-2-style Transformer enables hierarchical feature extraction from vibration signals. The framework achieves 92.61% average accuracy using only 10% labeled data, outperforming state-of-the-art methods by 17.24 percentage points in cross-domain bearing fault diagnosis.

arXiv cs.CL · 1d ago

UOL@IDEM Submits L1-Aware Vocabulary Prediction Model

UOL@IDEM presents a closed-track submission to BEA 2026 that models vocabulary difficulty prediction as a regression task for Spanish, German, and Chinese. The system integrates multilingual contextual embeddings with engineered features such as frequency and cognate similarity, and achieves lower RMSE scores than the baselines. Feature analysis highlights frequency as the most stable predictor and contextual predictability as a key L1-sensitive signal.

arXiv cs.CL · 1d ago

RaDaR: AI Model Improves Rare Disease Diagnosis

RaDaR, a compact reasoning large language model, outperformed other open-source models in rare disease diagnosis. In a randomized trial, RaDaR improved physicians' diagnostic accuracy by 21.44 percentage points over internet search alone.

arXiv cs.CL · 1d ago

Poster: Exploring Audio-Based Scam Detection in Turkish

This research introduces the first public multi-modal dataset of 100 aligned audio-transcript pairs for Turkish scam and benign calls. It evaluates seven large language models with three input types: raw audio, automatic transcripts, and human-corrected transcripts. Transcript-based inputs outperform direct audio processing, and human correction has minimal impact.

arXiv cs.CL · 1d ago

AdversaBench: Automated LLM Red-Teaming with Multi-Judge Confirmation

AdversaBench introduces an end-to-end red-teaming pipeline that generates adversarial prompts via five structured operators, evaluates target models, and confirms failures through a three-judge panel with a meta-judge tiebreaker. Experiments on 45 seed prompts across reasoning, instruction-following, and tool use show that every seed produces a confirmed failure. The analysis covers operator effectiveness, failure iteration counts, judge agreement, and cross-model transferability.

arXiv cs.CL · 1d ago

Qwen-AgentWorld: Language World Models for General Agents

Qwen-AgentWorld-35B-A3B and Qwen-AgentWorld-397B-A17B are the first language world models that simulate agentic environments across seven domains using long chain-of-thought reasoning. Trained via a three-stage pipeline (CPT, SFT, and RL), these models outperform existing frontier models on AgentWorldBench, a benchmark derived from real-world interactions of five models on nine established tasks.

arXiv cs.CL · 1d ago

SIFT and WSP Improve Fact-Checking Accuracy

SIFT introduces claim-conditioned re-scoring of evidence spans to align them better with the full claim, recovering up to 27.6 points of accuracy on FEVER, SciFact, 5PILS, and DP. WSP, an automatic NLI check, achieves an AUC of 0.92 and a precision of 0.98 when calibrated against human gold evidence.

arXiv cs.AI · 1d ago

MedLayXPlain: Benchmarking Expert-Lay Gap in Medical Vision-Language Models

MedLayXPlain introduces the first large-scale benchmark for medical lay language generation, featuring 122,789 region-grounded samples across eight imaging modalities. It evaluates medical vision-language models on expert-lay alignment using a hierarchical ontology system and a lightweight evaluator, revealing a systematic gap: expert-level performance in captioning coexists with significant degradation in lay language, while general-purpose models lack clinical precision.

arXiv cs.AI · 1d ago

QBioFusion-QSAR: Quantum Kernel Learning for Small-Data Ligand Classification

QBioFusion-QSAR integrates a quantum fidelity kernel with Morgan/Tanimoto fingerprints to improve ligand classification. On the PsychLight-A benchmark, QMKL increased accuracy and MCC compared to Morgan/Tanimoto alone, with improvements attributed to better predictions of molecules with activity cliffs, such as N-Me-5-HT and N-Me-tryptamine. Auditable analysis confirms localized quantum-kernel contributions in small-data settings.

arXiv cs.AI · 1d ago
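
As a rough illustration of the classical side of such a setup, the sketch below computes Tanimoto similarity between two fingerprints stored as sets of on-bit indices and mixes it with a second kernel value. The convex combination, the `weight` parameter, the fingerprint bits, and the placeholder quantum kernel value are all assumptions; the summary does not say how QBioFusion-QSAR actually combines the kernels.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def combined_kernel(k_classical, k_quantum, weight=0.5):
    """Convex combination of two kernel values (an assumed mixing rule)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * k_classical + weight * k_quantum


# Hypothetical on-bit indices for two molecules.
fp_x = {3, 17, 42, 128, 511}
fp_y = {3, 17, 42, 200}

k_tanimoto = tanimoto(fp_x, fp_y)  # 3 shared bits out of 6 distinct bits
# 0.9 stands in for a quantum fidelity kernel value computed elsewhere.
k_mixed = combined_kernel(k_tanimoto, k_quantum=0.9, weight=0.25)
print(k_tanimoto, k_mixed)
```

A non-negative weighted sum of two positive semidefinite kernels is itself positive semidefinite, which is why this kind of mixing is a common choice for kernel methods.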

Topological Neural Dynamics: Neuron-wise Sequence Modeling

Topological Neural Dynamics (TND) introduces a neuron-wise framework for sequence modeling, where each neuron evolves independently through a directed graph structure. In a single-player Pong behavior cloning task, TND achieves a mean of 17.47 consecutive catches per round, more than three times the result of every baseline model.

arXiv cs.AI · 1d ago

NASDAQ: Normalized Observation Space Dynamics-Augmented Q-Learning

NASDAQ addresses low-dimensional observation challenges in reinforcement learning by normalizing observation spaces to balance reconstruction losses across dimensions. The framework combines value learning with short-term value and next observation prediction, achieving competitive or superior performance with less training time compared to existing methods.

arXiv cs.AI · 1d ago
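
The sketch below illustrates why normalizing observations can balance per-dimension reconstruction losses. It uses plain per-dimension standardization (z-scoring), which is an assumption; the summary does not specify NASDAQ's normalization scheme, and the observation values and helper names are made up.

```python
import statistics


def fit_normalizer(observations):
    """Per-dimension mean and standard deviation of a batch of observations."""
    columns = list(zip(*observations))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard constant dims
    return means, stds


def normalize(obs, means, stds):
    """Standardize one observation dimension by dimension."""
    return [(x - m) / s for x, m, s in zip(obs, means, stds)]


def per_dim_mse(targets, predictions):
    """Mean squared error computed separately for each dimension."""
    n = len(targets)
    return [
        sum((t[d] - p[d]) ** 2 for t, p in zip(targets, predictions)) / n
        for d in range(len(targets[0]))
    ]


# Hypothetical two-dimensional observations on very different scales.
batch = [[0.1, 1000.0], [0.2, 3000.0], [0.3, 2000.0], [0.4, 4000.0]]
means, stds = fit_normalizer(batch)

# Baseline predictor that always outputs the batch mean.
raw_loss = per_dim_mse(batch, [means] * len(batch))
norm_batch = [normalize(o, means, stds) for o in batch]
norm_loss = per_dim_mse(norm_batch, [[0.0, 0.0]] * len(batch))
print(raw_loss, norm_loss)
```

On the raw scale the second dimension dominates the summed loss by many orders of magnitude; after standardization both dimensions contribute equally for this baseline predictor.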

Social World Model for Lifelong Social Intelligence

The Social World Model decomposes social interaction into five dimensions to enable closed-loop learning. It allows open-source models to sustainably improve and retain social capabilities, outperforming baselines and matching closed-source Gemini 3 Flash in key metrics without forgetting across difficulty levels.

arXiv cs.AI · 1d ago

Ramanujan Graph Rewiring Alleviates GNN Over-Squashing

Ramanujan Propagation uses Ramanujan graphs to reduce over-squashing in Graph Neural Networks by ensuring non-negative resistance curvature. The method preserves local connectivity while enabling efficient long-range information flow, outperforming nine state-of-the-art rewiring techniques.

arXiv cs.AI · 1d ago

SOHET: Self-Supervised Transformer for Heterogeneous Event Streams

SOHET introduces a hierarchical transformer architecture with event-type-specific tabular encoders and self-supervised pre-training objectives. It outperforms existing methods by 5.8% on Booking.com's fraud detection task and achieves faster convergence with 2.4% additional gain from pre-training. On the EBES benchmark, bidirectional SOHET matches or exceeds the best published results on six out of eight tasks.

arXiv cs.AI · 1d ago

Graph-of-Differences for Anatomy-Structured MedReID

Graph-of-Differences (GoD) introduces anatomy-graph representations to enable medical image re-identification with explicit structural grounding. It computes differences across named anatomical regions and aligns them with global backbone differences, providing clinically auditable, structure-level explanations. GoD improves Rank-1 accuracy by 7.1 pp on fundus and 3.1 pp on CXR, with better performance on zero-shot transfers.

arXiv cs.AI · 1d ago

Functional Orthogonality Ensures Identifiability in Unsupervised Disentanglement

The paper proves that latent concepts can be identified in unsupervised learning through functional orthogonality, an orthogonality constraint on the generative mapping's Jacobian. This condition enables identifiability in general nonlinear models without needing statistical independence or causal assumptions, as long as the latent domain supports all factor combinations. Experiments with normalizing flows confirm reliable recovery of true factors, offering a viable foundation for disentangled representation learning.