chenhao

chenhaodev

·

dreamclinger

AI & ML interests

Note: for LLM performance on medical tasks, see https://gist.github.com/chenhaodev

Organizations

None yet

upvoted 2 papers about 6 hours ago

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 125

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6, 2025 • 211

upvoted 4 papers about 15 hours ago

Step-GUI Technical Report

Paper • 2512.15431 • Published 17 days ago • 125

Memory in the Age of AI Agents

Paper • 2512.13564 • Published 19 days ago • 123

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Paper • 2512.16969 • Published 16 days ago • 108

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published 16 days ago • 199

upvoted a collection 2 months ago

Domain Reasoners: CoT Series

Specialized reasoning models for different domains—all using the same step-by-step Chain-of-Thought format. Over 20,000 total downloads combined. • 3 items • Updated 17 days ago • 5

upvoted a collection 3 months ago

MiniCPM-o & MiniCPM-V

Multimodal models with leading performance. • 28 items • Updated Sep 1, 2025 • 59

upvoted a collection 5 months ago

Finetunes | SLMs and LLMs

Various variants of LLMs finetuned using proprietary data. • 26 items • Updated Aug 17, 2025 • 4

upvoted 3 articles 6 months ago

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Article • Jun 3, 2025 • 305

Vision Language Models (Better, faster, stronger)

Article • May 12, 2025 • 580

nanoVLM: The simplest repository to train your VLM in pure PyTorch

Article • May 21, 2025 • 247

upvoted a collection 7 months ago

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 12 items • Updated 11 days ago • 140

upvoted an article 8 months ago

How to Build an MCP Server with Gradio

Article • Apr 30, 2025 • 201

upvoted a paper 8 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 252

upvoted a collection 8 months ago

GraphRAG Papers

Research relating graphs and GenAI. For discussion, find dedicated threads on https://discord.gg/graphrag • 52 items • Updated Sep 17, 2025 • 49

upvoted 3 articles 10 months ago

Train 400x faster Static Embedding Models with Sentence Transformers

Article • Jan 15, 2025 • 222

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • Mar 12, 2025 • 480

Finally, a Replacement for BERT: Introducing ModernBERT

Article • Dec 19, 2024 • 715

upvoted a paper 11 months ago

LightRAG: Simple and Fast Retrieval-Augmented Generation

Paper • 2410.05779 • Published Oct 8, 2024 • 27