Enhancing LLM Reasoning and Language Adaptability Through Specialized Language Injection and Test-Time RL
Improving Knowledge Utilization Efficiency and RAG Performance Through Off-Policy Learning and Graph Structure Optimization

This week's META-X AI paper review covers advances in learning paradigms, multimodal intelligence, agent AI, and evaluation methodology.

Learning Paradigm and Reasoning Enhancement: Kuwain 1.5B proposes a language injection method for Arabic. Injecting Arabic into an English-trained model yields an 8% average improvement in Arabic performance without degrading existing capabilities, which demonstrates a cost-effective way to add language-specific capabilities. "Does RL Really Incentivize Reasoning Beyond the Base Model?" analyzes in depth how RL actually improves LLM reasoning, and finds that some of the improvement reflects retrieval of knowledge the base model already has rather than genuinely new reasoning ability. TTRL (Test-Time RL) and LUFFY (RL guided by external expert data) explore new RL paradigms. NodeRAG improves RAG efficiency and performance by optimizing the knowledge graph structure.
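As a rough illustration of how RL can run at test time without ground-truth labels, the sketch below assigns rewards by majority vote over sampled answers. The summary above does not describe TTRL's reward design, so treat the majority-vote rule as an assumption for illustration only; the function name `majority_vote_rewards` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote_rewards(answers):
    """Return one reward per sampled answer.

    Assumption (not stated in the review): the most common answer among
    the samples is used as a pseudo-label, and each sample is rewarded
    1.0 if it matches that pseudo-label and 0.0 otherwise.
    """
    if not answers:
        return []
    # most_common(1) returns [(answer, count)] for the most frequent answer;
    # ties go to the answer that was seen first.
    pseudo_label, _ = Counter(answers).most_common(1)[0]
    return [1.0 if a == pseudo_label else 0.0 for a in answers]


# Hypothetical final answers extracted from four sampled responses.
sampled = ["4", "4", "5", "4"]
rewards = majority_vote_rewards(sampled)
```

In a full training loop these rewards would feed a policy-gradient update; that part is omitted here because it depends on details the review does not give.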

Multimodal Intelligence Expansion: Eagle 2.5 improves understanding of long videos and high-resolution images. "Describe Anything" generates detailed descriptions of specific objects and regions within images. Step1X-Edit enables high-quality image editing that follows user specifications, a sign that controlled image editing is maturing rapidly.

Agent AI and World Modeling: UFO2 advances practical agents for desktop automation. WALL-E 2.0 combines neural networks with symbolic knowledge for world-model-based agent planning and execution, a hybrid approach that grounds agent reasoning in structured knowledge.

Evaluation Methodology: VisuLogic proposes a benchmark for pure visual reasoning without linguistic bias. "Bitter Lesson" analyzes the limitations of multilingual evaluation datasets and emphasizes the need for culturally and linguistically diverse evaluation frameworks.