ETRI Launches Safe LLaVA: AI Model Sets New Safety Standards for Vision-Language Platforms in Korea

February 23, 2026
  • All six Safe models and HoliSafe-Bench are available for download on Hugging Face, with direct links for Safe LLaVA and HoliSafe-Bench datasets.

  • ETRI released six safe vision-language models built on open-source foundations from the LLaVA, Qwen-2.5-VL, and Gemma families: Safe LLaVA (7B/13B), Safe Qwen-2.5-VL (7B/32B), and SafeGem (12B/27B).

  • In tests, Safe LLaVA refused pickpocketing prompts and avoided giving unsafe guidance. On HoliSafe-Bench, Safe LLaVA reached a 93% safe response rate and Safe Qwen reached 97%, indicating substantial safety gains over existing open models.

  • Compared with domestic models, Safe LLaVA handled crime-promoting prompts and unsafe content better, including in scenarios involving adults with children.

  • Safe LLaVA is a vision-language model with integrated safety features, building on existing vision-language models. It detects roughly 20 hazard categories in both images and text, and responds to harmful inputs with a safe answer and the reasoning behind it.

  • The project aims to set safety standards for generative AI in Korea, with plans to expand safety research in line with national AI initiatives.

  • ETRI also released HoliSafe-Bench, a safety benchmark dataset with about 1,700 images and more than 4,000 Q&As that evaluates risk across seven categories and 18 subcategories. It is described as Korea’s first integrated image-text safety benchmark.

  • The release is accompanied by context on data-centric versus model-centric safety approaches. The project is supported by Korea’s national R&D efforts, including the Ministry of Science and ICT and IITP.

Summary based on 2 sources

