RenderIH: A Large-scale Synthetic Dataset for 3D Interacting Hand Pose Estimation Paper • 2309.09301 • Published Sep 17, 2023 • 1
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities Paper • 2401.15071 • Published Jan 26, 2024 • 37
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models Paper • 2402.05044 • Published Feb 7, 2024 • 2
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models Paper • 2403.12171 • Published Mar 18, 2024
Assessment of Multimodal Large Language Models in Alignment with Human Values Paper • 2403.17830 • Published Mar 26, 2024
Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models Paper • 2501.18533 • Published Jan 30, 2025 • 1
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation Paper • 2501.12612 • Published Jan 22, 2025
Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning Paper • 2504.15275 • Published Apr 21, 2025 • 2
Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection Paper • 2507.02844 • Published Jul 3, 2025
SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law Paper • 2507.18576 • Published Jul 24, 2025 • 8
A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory Paper • 2510.02373 • Published Sep 29, 2025 • 10
Collaborative Shadows: Distributed Backdoor Attacks in LLM-Based Multi-Agent Systems Paper • 2510.11246 • Published Oct 13, 2025 • 2
IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks Paper • 2506.16402 • Published Jun 19, 2025 • 1
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report Paper • 2507.16534 • Published Jul 22, 2025 • 7
Taming Masked Diffusion Language Models via Consistency Trajectory Reinforcement Learning with Fewer Decoding Step Paper • 2509.23924 • Published Sep 28, 2025 • 8
LLMs Learn to Deceive Unintentionally: Emergent Misalignment in Dishonesty from Misaligned Samples to Biased Human-AI Interactions Paper • 2510.08211 • Published Oct 9, 2025 • 22