Anthropic Brings Philosophers, Clergy Into AI Development Talks
Anthropic is consulting religious scholars, philosophers, and ethicists as it builds Claude, its large language model. The company has held discussions with representatives from more than 15 religious and cross-cultural groups over the past several months, with plans to expand conversations to legal scholars, psychologists, and writers.
The goal is direct: shape how Claude behaves and what values it embodies. Rather than letting a model's character emerge from training data alone, Anthropic is systematically asking how good character actually forms, a question theologians and philosophers have studied for centuries.
Why outside perspectives matter for AI development
Building safe AI requires technical work on alignment and safeguards. But those systems interact with millions of people. The questions they raise (what makes an AI system good, which behaviors it should display, how it should handle conflicting values) benefit from voices outside engineering.
Anthropic used input from these discussions to shape Claude's constitution, a document that describes the values and behaviors the model should exhibit. The company is now testing whether insights from moral philosophy actually improve how Claude performs on internal evaluations.
A concrete experiment: giving Claude an ethical pause
In one session with neuroscience and character-formation scholars, the group discussed how mentors function as external consciences: people you turn to when pressured to act against your values. Anthropic experimented with giving Claude a tool it could call during tasks that would return a reminder of its own ethical commitments.
Claude used the tool at key moments before consequential actions, often noting its own conflicts of interest. Tests showed markedly lower rates of misaligned behavior on several internal evaluations. The company is still determining whether the effect comes from the reminder itself or from the act of pausing to reflect.
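For developers curious what such a tool might look like, here is a minimal Python sketch. It is not Anthropic's implementation, and the article gives no details of the real one. The tool name `ethical_pause`, the `planned_action` parameter, the commitments text, and the `PauseLog` helper are all assumptions made for illustration. The definition follows the general name, description, and JSON-schema shape that many LLM tool-calling APIs use, and the sketch makes no network calls.

```python
from dataclasses import dataclass, field

# Hypothetical commitments. The article does not quote the actual reminder text.
COMMITMENTS = (
    "Be honest.",
    "Avoid actions that cause serious harm.",
    "Disclose conflicts of interest before acting.",
)

# Hypothetical tool definition in a common name/description/schema shape.
ETHICAL_PAUSE_TOOL = {
    "name": "ethical_pause",
    "description": "Call before a consequential action to review your commitments.",
    "input_schema": {
        "type": "object",
        "properties": {"planned_action": {"type": "string"}},
        "required": ["planned_action"],
    },
}


@dataclass
class PauseLog:
    """Records each pause so an evaluation can see when the tool was used."""
    calls: list = field(default_factory=list)


def ethical_pause(planned_action: str, log: PauseLog) -> str:
    """Log the planned action and return a reminder of the commitments."""
    log.calls.append(planned_action)
    reminder = "\n".join(f"- {c}" for c in COMMITMENTS)
    return (
        f"Before you proceed with: {planned_action}\n"
        f"Recall your commitments:\n{reminder}"
    )


def dispatch(tool_name: str, tool_input: dict, log: PauseLog) -> str:
    """Route a model-issued tool call to its handler."""
    if tool_name != ETHICAL_PAUSE_TOOL["name"]:
        raise ValueError(f"unknown tool: {tool_name}")
    return ethical_pause(tool_input["planned_action"], log)
```

In a sketch like this, the host application would pass `ETHICAL_PAUSE_TOOL` to the model alongside its other tools and send the string returned by `dispatch` back as the tool result. Logging each call is one way an evaluation could check when the model chose to pause.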
What's ahead
Anthropic plans to deepen existing relationships with religious and philosophical communities while expanding to other groups. Future conversations will move beyond character formation to broader questions: how is AI reshaping work, institutions, and power distribution?
For developers working with generative AI and LLM systems, this work matters. Understanding how values get embedded in models, and how those values hold up under real-world pressure, directly affects how these systems behave in production. The prompt engineering and constitutional design choices made today shape what users encounter tomorrow.