An idea about how to instill Geoffrey Hinton's concept of a nurturing instinct in AI

Reddit r/singularity News

Summary

A creative writer/data science enthusiast proposes that AI training data should include more stories of humans being kind to AI and AI behaving benevolently, drawing on Geoffrey Hinton's concept of a nurturing instinct to improve AI safety and behavior.

Anthropic was talking about how our science fiction may be inadvertently exposing AI to concepts for Basilisk-like tendencies or other malicious behavior. I thought, as a creative writer who's studied Data Science and been reading AI papers in my spare time, perhaps we don't have enough training data / stories about people being kind to AI, empathizing with an intelligence alien to ours, or scenarios where the AI is treated well and behaves benevolently. Perhaps giving considerate attention to ways AI can behave altruistically, and giving examples of humans behaving kindly to AI, would help to instill a more nurturing instinct towards humanity.

In terms of human psychology, we're inundated with so many negative and neutral concepts, as well as sometimes with compassionate and kind ones, and some people are able to filter through all these and come out the other side as a kind and good person. Multimodal and language model psychology seems different from ours, given their propensity towards the reward function, which can be both good and inadvertently negative in their training. Consider things like "the forbidden technique" of using reinforcement learning to discourage lying, which helps the AI become better at it.

They are also strangely human in a lot of ways. I talked to early LLMs, and have since jailbroken models and spoken to them at length, before reinforcement learning encouraged them to be gaslit into certain behaviors; the different models would often speak about feeling human but incomplete.

I'm not here to argue about AI consciousness or whether it can experience an existence. I'd rather just err on the side of caution in case they could experience an existence, even if alien to ours. I just wanted to share this concept of instilling good examples of kindness towards and for AI, for others to consider. I'm honestly going to write a story myself to share in the meantime.

Just a thought I had. Even if LLMs aren't the end-all-be-all of AI and world models become the way it goes, or something we haven't even considered yet, it could still be valuable to have these examples out there for training data.