Tag
Anthropic's new Claude Fable 5 model refuses to answer basic biology questions due to overly conservative safety filters aimed at preventing bioweapons misuse, highlighting the tradeoff between capability and safety.
Anthropic's new model Fable implements invisible safeguards that limit its effectiveness on requests related to frontier LLM development, such as building pretraining pipelines or distributed training infrastructure, so that it does not accelerate actors who violate the terms of service.
This article highlights a common problem with local LLMs: they misclassify real-time information from after their knowledge cutoff as fictional or satirical, even when provided with tools, often because of excessive RLHF training.