This week's review covers notable AI research from 2024 Week 34, spanning image generation, 3D reconstruction, computational imaging, LLM evaluation, multimodal learning, and automated AI system design.

Imagen 3 (Google DeepMind): Text-to-image generation model with improved prompt understanding, finer detail rendering, richer lighting, and fewer artifacts than previous versions; offered in variants optimized for uses ranging from rapid sketching to high-resolution output; incorporates SynthID watermarking to identify AI-generated content; available through ImageFX and Vertex AI.

MeshFormer: Builds realistic 3D models from sparse multi-view images using a transformer architecture, suggesting that attention mechanisms can reason about 3D geometry from limited viewpoints and enabling faster 3D asset creation for games and industrial applications.

DifuzCam: Lensless camera system that restores high-quality images using diffusion models; a computational imaging approach that replaces the conventional lens with a flat mask plus learned reconstruction, enabling ultra-thin cameras with potential applications in medical imaging, robotics, and mobile devices.

Self-Taught Evaluator (Meta): Trains an LLM evaluator without human annotation; the approach generates synthetic pairs of contrasting responses, has the model judge them, and iteratively retrains on the judgments it gets right, addressing the human-annotation bottleneck in LLM quality assessment pipelines.
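
To make the idea of evaluation without human labels concrete, here is a minimal sketch of one data-collection round, assuming the label comes from how a response pair was constructed rather than from an annotator. The names `build_pair`, `collect_training_data`, and `toy_judge` are hypothetical, and the string-matching judge is a stand-in for an LLM; this is an illustration, not Meta's implementation.

```python
import random

def build_pair(instruction):
    """Hypothetical generator returning (better, worse) responses.

    A real pipeline would prompt an LLM for both; the worse response
    would typically answer a subtly altered instruction.
    """
    better = f"Direct answer to '{instruction}'."
    worse = f"Answer to a different question than '{instruction}'."
    return better, worse

def collect_training_data(instructions, judge, rng, n_samples=4):
    """Keep one judgment per instruction that agrees with the synthetic label."""
    kept = []
    for instruction in instructions:
        better, worse = build_pair(instruction)
        for _ in range(n_samples):
            # Randomize presentation order so position carries no signal.
            if rng.random() < 0.5:
                first, second, label = better, worse, "A"
            else:
                first, second, label = worse, better, "B"
            verdict = judge(instruction, first, second)
            if verdict == label:
                kept.append({"instruction": instruction, "A": first,
                             "B": second, "verdict": verdict})
                break  # one accepted judgment per instruction is enough
    return kept

def toy_judge(instruction, a, b):
    # Stand-in for the LLM judge: prefers the response that starts with "Direct".
    return "A" if a.startswith("Direct") else "B"

rng = random.Random(0)
data = collect_training_data(["Sort a list", "Define entropy"], toy_judge, rng)
print(len(data), "judgments kept for fine-tuning")
```

In a full loop, the kept judgments would be used to fine-tune the judge, and the round would repeat with the improved model; instructions with no agreeing judgment are simply dropped.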

BLIP-3 (Salesforce): Framework for training large multimodal models efficiently at scale, balancing model capability against training compute and providing an accessible path to multimodal LLM development.

DEEM: Improves the visual perception of multimodal LLMs to make them more robust, addressing the systematic gap between language and vision processing that limits multimodal model reliability.

ADAS (Automated Design of Agentic Systems): AI systems designing stronger AI systems; a meta-learning approach in which an AI searches over agent designs, showing early potential for recursive AI capability improvement, with important safety implications.
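
The search over agent designs can be pictured as a loop around a growing archive: propose a new design informed by earlier ones, evaluate it, and record the result. The sketch below is a toy under stated assumptions, not the ADAS implementation: a design is a hypothetical two-field dictionary, `propose` uses random mutation as a stand-in for an LLM-based designer, and `evaluate` is an invented scoring function standing in for a benchmark run.

```python
import random

def evaluate(design):
    # Toy stand-in for running the agent on a benchmark; higher is better.
    return -abs(design["steps"] - 4) - abs(design["temperature"] - 0.3)

def propose(archive, rng):
    # Stand-in for the designer: mutate the best design found so far.
    parent = max(archive, key=lambda entry: entry["score"])["design"]
    return {
        "steps": max(1, parent["steps"] + rng.choice([-1, 0, 1])),
        "temperature": min(1.0, max(0.0, parent["temperature"]
                                    + rng.uniform(-0.1, 0.1))),
    }

def search(iterations, seed=0):
    rng = random.Random(seed)
    start = {"steps": 1, "temperature": 0.9}  # hypothetical initial design
    archive = [{"design": start, "score": evaluate(start)}]
    for _ in range(iterations):
        candidate = propose(archive, rng)
        # Every evaluated design is archived, so later proposals can build on it.
        archive.append({"design": candidate, "score": evaluate(candidate)})
    return archive

archive = search(50)
best = max(archive, key=lambda entry: entry["score"])
print("designs evaluated:", len(archive))
print("best design:", best["design"])
```

Keeping every evaluated design, not only the current best, is what lets the proposer draw on the whole history of attempts; here `propose` only looks at the top entry to keep the example short.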