[25W01] Latest AI Paper Tech Trends (HuatuoGPT-o1, Medical Imaging MLLM, VideoRefer)

This article reviews notable AI research papers published in the first week of 2025, covering medical/healthcare AI, computer vision/multimodal AI, model optimization, AI systems/frameworks, and 3D/4D generation.

Medical/Healthcare AI: HuatuoGPT-o1 proposes a two-stage approach that combines validated medical problems with reinforcement learning to improve complex medical reasoning in LLMs, and reports strong performance using only 40,000 validated problems. Medical Imaging MLLM introduces combinatorial generalization (CG) to analyze how modality, anatomical region, and task relate to one another, with the aim of performing well when training data is limited.

Computer Vision/Multimodal: VideoRefer Suite enhances spatiotemporal object understanding in video LLMs. VideoAnydoor performs video object insertion with precise motion control. 2.5 Years in Class proposes a new VLM training approach built on educational videos. Explanatory Instructions describes vision tasks through explanatory natural-language instructions to support zero-shot generalization in computer vision.

Model Optimization: 1.58-bit FLUX quantizes the weights of a text-to-image generation model to 1.58 bits, significantly reducing model size and improving inference speed. VA-VAE addresses an optimization dilemma in latent diffusion models and reports roughly 21x faster training convergence.
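The "1.58-bit" figure comes from ternary weights: a weight restricted to three values {-1, 0, +1} carries log2(3) ≈ 1.58 bits of information. The sketch below is a minimal illustration of that idea, not the method used by 1.58-bit FLUX. It assumes absmean scaling (one shared scale equal to the mean absolute weight), a choice borrowed from common ternary quantization schemes, and the function names `ternary_quantize` and `dequantize` are hypothetical.

```python
import math


def ternary_quantize(weights):
    """Map a list of floats to codes in {-1, 0, +1} plus one shared scale.

    Assumption: absmean scaling, i.e. the scale is the mean absolute weight.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        # All weights are zero, so every code is zero as well.
        return [0] * len(weights), 0.0
    # Round each scaled weight to the nearest integer, then clip to [-1, 1].
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate weights from ternary codes and the shared scale."""
    return [c * scale for c in codes]


# Three possible values per weight carry log2(3) bits of information.
bits_per_weight = math.log2(3)  # about 1.585

weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.45]
codes, scale = ternary_quantize(weights)   # codes -> [1, 0, 1, -1, 0, 1]
approx = dequantize(codes, scale)          # each entry is -scale, 0, or +scale
```

Because every weight is stored as one of three codes plus a single shared scale, storage shrinks sharply compared with 16-bit weights, at the cost of the approximation error visible when comparing `approx` with `weights`.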

AI Systems/Frameworks: OS-Genesis presents an automated data generation pipeline for GUI agents. CodeElo proposes a new benchmark for evaluating the code generation capability of LLMs. Next Token Prediction presents a unified framework for multimodal learning.

3D/4D Generation: Bringing Objects to Life develops a 4D generation method that adds realistic motion to 3D models using NeRF and text-conditioned image-to-video diffusion models.
