The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published Dec 22, 2025 • 65
Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing Paper • 2512.17909 • Published Dec 19, 2025 • 37
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models Paper • 2505.17015 • Published May 22, 2025 • 9