OpenMythos Benchmarks Released with SWE-bench and Cybersecurity Results

Benchmark results for OpenMythos are now available, covering SWE-bench Pro, CyberGym, and Cybench. The results are strong for a small, cybersecurity-focused model, and further training is planned to improve them. They also highlight a discrepancy between the SWE-bench scores of Qwen 3.5 and Qwen 3.6, which is attributed to differences in evaluation method and problem filtering.