guardrail

#guardrail

@AdinaYakup: SingGuard from Ant Group @AntLingAGI A multimodal guardrail where the safety policy is an input, not a fixed weight. - …

X AI KOLs Timeline · 2d ago

SingGuard is a multimodal guardrail system from Ant Group that takes the safety policy as an input rather than fixing it in the model weights, so the policy can be adapted dynamically in natural language. It is released under Apache 2.0 and covers the text and image modalities.

#guardrail

CHILLGuard: Towards Fine-Grained Chinese LLM Safety Guardrail with Scalable Data Construction and Model-aware Preference Alignment

arXiv cs.CL · 2026-06-16

This paper introduces CHILLGuard, a fine-grained Chinese LLM content safety guardrail built on a new 5-macro, 31-micro category risk taxonomy and a scalable multi-stage data construction pipeline. The model achieves state-of-the-art performance, improving F1 score by 15.92% over existing baselines.
