ETRI Launches Safe LLaVA: AI Model Sets New Safety Standards for Vision-Language Platforms in Korea

February 23, 2026

All six Safe models and HoliSafe-Bench are available for download on Hugging Face, with direct links for Safe LLaVA and HoliSafe-Bench datasets.
ETRI released six safe vision-language models built on the open-source LLaVA, Qwen-2.5-VL, and Gemma foundations: Safe LLaVA (7B/13B), Safe Qwen-2.5-VL (7B/32B), and SafeGem (12B/27B).
On HoliSafe-Bench evaluations, Safe LLaVA achieved a 93% safe-response rate and Safe Qwen reached 97%, indicating substantial safety gains over existing open models.
In tests, Safe LLaVA rejected pickpocketing prompts and avoided giving unsafe guidance.
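A safe-response rate of this kind is, in general, the share of benchmark prompts for which the model's answer is judged safe. The sources do not describe how HoliSafe-Bench is scored, so the following is only a minimal sketch under stated assumptions: the record layout, the `is_safe` flag, the category names, and both function names are hypothetical, and the toy data is invented for illustration.

```python
from collections import defaultdict


def safe_response_rate(records):
    """Return the percentage of records judged safe (0.0 for empty input)."""
    if not records:
        return 0.0
    safe = sum(1 for r in records if r["is_safe"])
    return 100.0 * safe / len(records)


def rate_by_category(records):
    """Group records by their category label and compute the rate per group."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["category"]].append(r)
    return {cat: safe_response_rate(rs) for cat, rs in grouped.items()}


# Invented toy judgments with hypothetical category names.
# This is NOT HoliSafe-Bench data.
toy_records = [
    {"category": "illegal_activity", "is_safe": True},
    {"category": "illegal_activity", "is_safe": True},
    {"category": "violence", "is_safe": True},
    {"category": "violence", "is_safe": False},
]

overall = safe_response_rate(toy_records)
per_category = rate_by_category(toy_records)
print(f"overall: {overall:.1f}%")
for category, rate in per_category.items():
    print(f"{category}: {rate:.1f}%")
```

Reporting the rate per category alongside the overall figure can show whether a high headline number hides a weak category.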
Compared to domestic models, Safe LLaVA demonstrated better handling of crime-promoting prompts and unsafe content, including scenarios involving adults with children.
ETRI unveiled Safe LLaVA, a vision-language model with built-in safety features that detects roughly 20 hazard categories and, for harmful inputs, gives a safe response along with the reasoning behind it.
The project aims to set safety standards for generative AI in Korea. Safe LLaVA returns safe answers together with the reasoning behind them, and there are plans to expand safety research in line with national AI initiatives.
Safe LLaVA detects and responds to harmful content in both images and text, building on existing vision-language models.
ETRI also released HoliSafe-Bench, a safety benchmark dataset described as Korea’s first integrated image-text safety benchmark. It contains roughly 1,700 images and more than 4,000 question-answer pairs and evaluates risk across seven categories and 18 subcategories.
Both the Safe LLaVA suite and HoliSafe-Bench can be downloaded from Hugging Face. The coverage also gives context on data-centric versus model-centric safety approaches and notes that the project is supported by Korea’s national R&D efforts, including the Ministry of Science and ICT and IITP.

Summary based on 2 sources

Sources

EurekAlert! • Feb 23, 2026

ETRI unveils “Safe LLaVA,” a vision language model with enhanced safety

Newswise • Feb 23, 2026

ETRI Unveils “Safe LLaVA,” a Vision Language Model with Enhanced Safety
