TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published 22 days ago • 92
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents Paper • 2506.03143 • Published Jun 3, 2025 • 53
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding Paper • 2511.00810 • Published Nov 2, 2025 • 3
UI Agent Collection a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 438 items • Updated 24 days ago • 66
Article A failed experiment: Infini-Attention, and why we should keep trying? • Aug 14, 2024 • 74
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 53
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch • May 7, 2024 • 112