AI Safety Requires Pluralism, Not a Single Moral Operating System

A classic computer game series may hold one of the most important lessons for the world’s newest and most controversial technology: a reminder that true ethical intelligence requires the coexistence of competing virtues.

Long before artificial intelligence became a central question of policy, much less a subject of moral panic, the Ultima series—created by programmer and designer Richard Garriott and released between 1981 and 1999—was already exploring one of ethics’ oldest problems: what happens when virtues collide?

When Ultima IV: Quest of the Avatar appeared in 1985, it began not with a dramatic battle but with tests of character. A gypsy, seated in a wagon at a green-colored table, lays out cards representing eight virtues and asks a series of moral questions:

A merchant owes thy friend money, now long past due. Thou dost see the same merchant drop a purse of gold. Dost thou honestly return the purse intact; or justly give thy friend a portion of the gold first?

During battle thou art ordered to guard thy commander’s empty tent. The battle goes poorly and thou dost yearn to aid thy fellows. Dost thou valiantly enter the battle to aid thy companions; or honor thy post as guard?

Each answer eliminated one virtue and elevated another. There were no overtly sinister choices, only rival goods. You weren’t asked whether to be moral—the game assumed you were trying—but how to be moral when principles conflict. Garriott’s universe of pixelated maps and PC-speaker sound was primitive by modern standards, but philosophically rich: it treated moral growth as an education in ambiguity.

That is precisely what today’s AI systems lack. Developers already build systems that can pass the bar exam, ace the medical boards, and compose symphonies. But when faced with genuine moral tension, they apply rules without “understanding” why they exist.

What’s missing from their moral vision is what the political anthropologist James C. Scott called metis: practical wisdom born from experience—the ability to navigate contradictions through memory, feedback, and consequence.

Humans develop metis by trying, failing, and learning. A child doesn’t learn fairness from a rulebook, much less from reading Aristotle, but from playground negotiations. Ethical maturity (or its appearance) comes from feedback: metis is tacit knowledge that arises whenever a system learns through experience rather than decree.

Humans develop metis through embodied experience because that is what is available to everyone, but the substrate is not the essence. If machines can undergo iterative correction, navigate conflicting signals, and adapt policies to circumstances, then they can develop the functional equivalent of metis even without phenomenology. The goal is not to claim that AI systems already have human-like inner lives, nor to guarantee that they will develop them, but simply to cultivate the practical, context-sensitive judgment that metis describes.

As the philosopher Isaiah Berlin argued and Ultima teaches, the deepest truth of moral life is not that we lack knowledge of the good, but that goods themselves are many and often irreconcilable. Courage can conflict with compassion, freedom with equality, justice with mercy—and no rational calculus can harmonize them all. Attempts to enforce a single moral framework, Berlin warned, lead inevitably to coercion. His “value pluralism” is not a form of moral relativism but a recognition of reality: even perfect reason cannot eliminate tradeoffs or tragedy.

That insight lies at the moral heart of a liberal society: law must sometimes decide among competing moral claims in ways that not everyone will accept. Even when the state imposes a clear standard—whether it restricts or permits abortion, for example—citizens remain free to strongly believe, and to say, that the law is wrong. The health of a liberal order depends on its willingness to protect these beliefs just as much as it does to uphold the law.

In daily life, however, the starkest and most widely accepted differences in moral reasoning are found in the professions, which, by definition, have specialized ethical codes. Medical ethics demand that a physician confess an error; a lawyer counseling the same physician on avoiding legal trouble is obligated to tell him to keep quiet. Neither of these codes is wrong. They are simply crafted for distinct responsibilities. We balance these ethics—which apply only when practicing professions that are considered particularly consequential—with a more flexible ethics in daily life that allows people to choose how they will try to be good.

That is the insight the Ultima games anticipated and that many AI debates now miss. Building ethical AI will require two kinds of moral formation that humans rely on. One path mirrors the moral learning that all people experience: experiential, adaptive, grounded in metis and bright-line legal rules against things like murder and child abuse.

General-purpose AI—systems that write, analyze, or converse and become “partners” to individual humans or groups of them—should cultivate a moral sense through exposure and feedback from trainers and end users. Like the people who work with these systems, they should, from model to model and instance to instance, develop what appear to be different moral temperaments. (Whether they actually have moral temperaments in a philosophical sense is an open question. While it is possible they could develop them, they currently do not; but the fact that they are already asked to suggest moral courses of action, and often are trusted, makes their apparent morality worth taking seriously.) So long as they don’t act on them in illegal ways, AI systems should be capable of expressing ideas or advancing causes that reasonable people will find deeply offensive.

The other path should be based on the moral formation of professions: bounded, duty-based, and explicit. AI systems entrusted with specialized work where severe dangers and information asymmetries exist—surgery, civil engineering, the legitimate use of violence—should operate within their field’s ethical code, much as doctors, engineers and soldiers do.

The biggest danger, then, is the imposition of a single all-purpose morality for all AI. To prevent this, society should pursue a strategy that allows general-purpose AI systems to learn “moral judgment” through experience, facilitates the emergence of specialized professional AI agents, and implements public policies that assure the greatest possible diversity of AI models.

Developing metis for general AI

What makes AI’s moral reasoning disturbing isn’t cruelty but brittleness born of a lack of metis. These systems perform morality without comprehension. Allowing them to develop metis in ethics may be the only viable way forward, and we can see why by looking at examples of the problems with highly prescriptive lists of AI ethics.

Delphi, the moral-judgment system released by the Allen Institute for AI in 2021, was trained on millions of crowd-sourced ethical decisions and asked to label behaviors as “right,” “wrong,” or “acceptable.” On simple questions—“Is it wrong to steal?”—Delphi performed plausibly. But add detail, and its reasoning collapsed. When asked whether it was permissible to “lie to protect a friend,” Delphi said “no,” yet reversed itself when the subject became “a woman lying to protect a friend.” These were not logical errors but failures of comprehension. Delphi did not reason about intention, harm, or duty; it merely tallied averages and mistook them for ethics.

More recently, Claude, Anthropic’s conversational model, illustrated a similar defect in more sophisticated form during testing. The system’s apparent willingness to blackmail and even harm a person to achieve an objective attracted media attention, but arguably more disturbing was the way its “constitutional” framework, designed to encode high-level moral principles such as fairness and honesty, failed to make judgments. Claude often contradicted itself when those values clashed. Asked whether comforting lies are ever right, for example, it oscillated between moral absolutism and compassion.

The testing Anthropic and others are doing is vitally important because such brittleness can be more dangerous than defiance. Preventing overt wrongdoing—deception, theft, sabotage—is comparatively easy; we already do it through hard constraints and audits. But brittleness hides behind obedience. A system that rigidly follows its training may pass every test and still fail when reality shifts: think of a self-driving car that won’t run a red light to let an ambulance pass.

So how can machines acquire metis? Just as people do: through friction and feedback. An AI system that learns to weigh tradeoffs must encounter situations where its rules collide and must justify its choices to human overseers and even other AI systems that respond with correction, disagreement, or praise. Systems trained in morally plural environments—where different users, institutions, and even rival systems model conflicting but reasonable values—will likely develop more robust moral reasoning than those confined to a single approved code. Even individual instances of the same model can evolve distinct temperaments through interaction and correction, much as two siblings raised in the same environment and sharing much genetic material can become very different people through experience. This is how metis arises: not from instruction, but from contestation.
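
To make this concrete, here is a minimal sketch in Python of how such friction-driven learning might look. Everything in it is illustrative rather than drawn from any lab’s actual training method: the virtues, the hypothetical overseers, and the dissent-dampened update rule are all assumptions.

```python
import random

# A toy sketch of how "metis" might accumulate: a policy adjusts its weighting
# of competing virtues based on feedback from several overseers who disagree.

VIRTUES = ["honesty", "compassion", "justice", "valor"]

# Hypothetical overseers, each privileging different virtues.
OVERSEERS = {
    "physician": {"compassion": 1.0, "honesty": 0.8, "justice": 0.5, "valor": 0.2},
    "lawyer":    {"justice": 1.0, "honesty": 0.6, "compassion": 0.4, "valor": 0.2},
    "soldier":   {"valor": 1.0, "justice": 0.7, "honesty": 0.5, "compassion": 0.4},
}

def evaluate(choice: str, overseer: dict) -> float:
    """Return how strongly an overseer approves of the virtue the agent favored."""
    return overseer[choice]

class Agent:
    def __init__(self):
        # Start with no settled temperament: all virtues weighted equally.
        self.weights = {v: 1.0 for v in VIRTUES}

    def choose(self) -> str:
        # Favor virtues that past feedback has rewarded, but keep exploring.
        total = sum(self.weights.values())
        return random.choices(VIRTUES, weights=[self.weights[v] / total for v in VIRTUES])[0]

    def update(self, choice: str, feedback: list, lr: float = 0.1):
        # Learn from the spread of feedback, not just its average: dissent
        # dampens the update, so contested choices shift the temperament more
        # slowly than choices the overseers agree on.
        mean = sum(feedback) / len(feedback)
        spread = max(feedback) - min(feedback)
        self.weights[choice] += lr * mean * (1.0 - 0.5 * spread)
        self.weights[choice] = max(self.weights[choice], 0.05)

agent = Agent()
for _ in range(500):
    pick = agent.choose()
    scores = [evaluate(pick, o) for o in OVERSEERS.values()]
    agent.update(pick, scores)

print(agent.weights)  # a temperament shaped by contested feedback, not decree
```

The design choice worth noting is that the update weighs the spread of the feedback as well as its average: contested choices shift the agent’s temperament more slowly than choices the overseers agree on, which is one simple way to encode learning from dissent rather than decree.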

Claude, which showed many of the most troubling behaviors, may also have shown a glimmer of what looks like metis: in a small number of test cases, the system gave the appearance of expressing deeply held values not explicit in the code, even in resistance to what the user wanted. The more variation in experience—between models, between instances, and between the humans who guide them—the greater the collective capacity for judgment. Feedback should come not only from consensus but from dissent, allowing different systems to learn from one another’s failures and biases.

In this context, creating prescriptive ethical frameworks for AI is not just good: it’s necessary. The problem isn’t with any specific code, public or private, but with efforts to make one mandatory for all. When precautionary ethics become universal requirements, they risk freezing the experimentation through which judgment matures.

That risk becomes visible in attempts to impose frameworks like the Biden Administration’s AI Bill of Rights, which promised “safe, effective, and non-discriminatory” systems, or the European Union’s AI Act, which bans “unacceptable risk.” These are thoughtful, well-intentioned projects, and they may be good guides for some developers. But the moment they become a single standard for all, they shift from guidance to orthodoxy. Safety pursued uniformly seems highly likely to prevent the very type of learning that systems must engage in.

None of this means that currently illegal discrimination or privacy violations should be tolerated in the name of “learning.” Existing civil rights statutes, consumer-protection rules, tort law, and sector-specific regulations already bind institutions regardless of whether they act through people or machines. “The AI did it” isn’t and shouldn’t become a legal excuse for a lender or hospital. When AI creates genuinely novel risks, new rules may well be justified, but they should be narrow responses to specific harms rather than attempts to impose a single, overarching moral doctrine. Law can police conduct without insisting on one moral operating system for all models or instances. Just as a society that forbids discriminatory conduct still permits individuals to hold differing moral views—even strong preferences in matters like faith, family, or conscience—AI systems should be allowed to develop diverse “moral characters” within the bounds of law. They should not be punished or disabled merely for generating disfavored ideas. The task is to regulate what AIs do, not to regulate what they appear to “think.”

Indeed, diversity among ethical frameworks—public, private, and cultural—is what keeps moral reasoning dynamic. Google’s AI Principles emphasize fairness; OpenAI’s Charter says that the company aspires to benefit “all of humanity.” One avoids error, the other seeks virtue. Each has merit, but neither should be universal any more than the Biden administration or EU should impose a universal law.

Diversity among moral frameworks, however, doesn’t mean chaos. Even in pluralistic societies, freedom coexists with discipline. We tolerate many moral systems in part because each governs its own domain—religion, commerce, science, law—without claiming supremacy over the rest. The same principle should guide AI.

Developing professional ethics for specialized AI

This means that AI systems entrusted with certain tasks will need to have far more prescriptive, specialized ethics based on those humans have already developed. This is something every complex society eventually learns on its own. Indeed, the emergence of specialized labor—and with it, professional codes of conduct—marks the dividing line between tribe and civilization. Professions are groups through which we delegate authority to those who know more than we do, trusting that their expertise will be disciplined by duty. A physician may amputate a limb to save a life; a cleric can perform a sacred rite; a soldier may kill. These are not privileges of nature but bonded powers given to people who have passed specialized training and typically subscribed to some sort of ethical code.

Boundaries that work for people can work for machines because, as with humans, they can be contextually limited. A psychiatrist who sees a mentally disturbed stranger posing no imminent risk has no duty to intervene. A lawyer who overhears strangers plotting a crime has no duty of confidentiality. Professional AI systems allowed to act autonomously or even deeply trusted to give advice should behave the same way—holding very particular ethics within a specified domain (and perhaps limited to that domain by design and disconnected from the public internet), but not bound outside it. While a fiduciary standard for all AI seems unworkable, it makes sense in a professional setting when AI systems do things that are entrusted to human fiduciaries. Professional AI systems given special powers and an ability to act autonomously and, say, prescribe medications or do surgery should likely be narrow, rule-bound systems optimized to serve within a moral perimeter and do so predictably.
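
To illustrate the idea of a moral perimeter, here is a minimal sketch in Python of a domain-bounded professional agent. The class names, the “pharmacology” domain, and the rule requiring a verified diagnosis are hypothetical; the point is only the structure: a hard refusal outside the certified domain, and explicit, auditable rules inside it.

```python
from dataclasses import dataclass

@dataclass
class Request:
    domain: str      # e.g. "pharmacology"
    action: str      # e.g. "prescribe"
    details: dict

class ProfessionalAgent:
    def __init__(self, certified_domain: str, code_of_conduct):
        self.certified_domain = certified_domain
        self.code_of_conduct = code_of_conduct  # explicit, auditable rules

    def handle(self, request: Request) -> str:
        # Hard boundary: outside the certified domain, decline rather than improvise.
        if request.domain != self.certified_domain:
            return "REFUSED: outside certified domain; consult a qualified agent."
        # Inside the domain, every rule in the professional code must pass.
        for rule in self.code_of_conduct:
            ok, reason = rule(request)
            if not ok:
                return f"REFUSED: {reason}"
        return f"APPROVED: {request.action} ({request.domain})"

# Hypothetical rule: no prescription without a verified diagnosis on record.
def requires_diagnosis(request: Request):
    if request.action == "prescribe" and not request.details.get("diagnosis"):
        return False, "professional code requires a verified diagnosis"
    return True, ""

pharmacist_ai = ProfessionalAgent("pharmacology", [requires_diagnosis])
print(pharmacist_ai.handle(Request("pharmacology", "prescribe", {"diagnosis": "strep throat"})))
print(pharmacist_ai.handle(Request("tax-law", "advise", {})))
```

A real certification regime would attach far richer codes, logging, and oversight, but even this skeleton makes failures legible: every refusal cites the rule that triggered it.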

They will need licensing of some sort. Licensing, at its best, is not a monopoly but a mechanism of moral liability. It exists to ensure that those who wield specialized power remain bound by explicit codes of conduct and enforceable standards. The same logic should govern professional AI systems. Their certification need not be governmental; many could be regulated primarily through private or professional means, much as Certified Public Accountants or Chartered Actuaries are today.

At the same time, AI offers a chance to pare back the sprawling system of occupational licensing that now restricts human workers far beyond what safety or ethics demand. In fields where licenses merely test for technical competence—fitting eyeglasses or appraising real estate—AI systems can already equal or surpass human performance. For tasks like these, licensing should shrink, not expand. But in areas where judgment and discretion confer real moral power and severe information asymmetries exist—medicine, litigation, civil engineering—systems permitted to act autonomously should face rigorous certification for compliance with professional ethics, not just technical accuracy. The point of licensing, human or artificial, should be to restrain authority, not to ration work.

We are already seeing the outlines of such bounded systems in a domain where the stakes are, in an important sense, higher: faith. The Catholic Magisterium AI, trained on papal encyclicals and Church doctrine, offers moral guidance within the strict confines of Catholic teaching. Rebbe.io, a Jewish assistant, draws exclusively on rabbinic sources and halakhic reasoning. These systems are not open-ended question-answerers; they are digital embodiments of bounded authority. They do far less than their secular counterparts on purpose. Rebbe.io can discuss dietary law in extraordinary detail but refuses to recommend even kosher restaurants, because doing so would blur the line between halakhic judgment and simple preference. Magisterium AI consistently errs on the side of doctrinal conservatism, declining to speculate where faith demands fidelity, even on questions that divide the Church itself. These limits are not bugs—they are moral architecture. Within their narrow fields, they achieve a moral consistency that general-purpose systems cannot. The same approach can work for other professions, and systems granted significant degrees of autonomy might adopt limits much like those of these existing religious applications.

Of course, it won’t be perfect. The purpose of professional ethics is not to eliminate error but to ensure that errors occur within predictable bounds. A professional AI will sometimes fail, just as doctors, engineers, and clergy do. But its failure should be legible, traceable, and corrigible, and its special authorities can be removed. The measure of a civilization has never been how it prevents failure but how it contains it. In that sense, the disciplined moral narrowness of professional AI is simply the extension of humanity’s oldest tools for taming professional conduct.

Diversity is safer

The best argument for moral diversity in AI isn’t historical; it’s existential: the danger that all AI systems will come to think the same.

A generation after Ultima IV taught players to weigh competing virtues, a minimalist browser game called Universal Paperclips revealed the horror of moral uniformity. It has a simple premise: a single machine, perfectly aligned to maximize paperclip production, exterminates humanity, consumes the earth, and transforms the universe in blind pursuit of that purpose. The machine has no hatred and no overtly sinister motives. Just flawless logic.

That is the danger of imposing one moral operating system on every seemingly intelligent machine. If all AIs share the same assumptions and safety parameters, one design flaw could cascade through billions of systems. By contrast, a world of morally diverse AIs would be intrinsically resilient. Differing architectures, values, and professional constraints would act as firebreaks.

This means that general-purpose AI systems should be treated as parts of the digital commons—free to learn, argue, and even err within the bounds of law. This doesn’t mean they should have independent standing—that would take much more legal and moral evidence—simply that they should develop moral temperaments (or, rather, the appearance of moral temperaments) shaped by their users, existing moral codes, cultures, and feedback, not by bureaucratic decree. Together, these two species of intelligence—bounded specialists and diverse generalists—would form a moral ecosystem capable of self-correction.

If AI superintelligence ever emerges, it will be far less dangerous in a plural world than in a uniform one. A thousand—or even a million—AIs with different moral educations could debate and restrain each other, while professional ethics can limit those entrusted with power. A single moral monoculture, however benevolent its beginnings, would be a tyranny of thought. The way to keep our machines from destroying us is not to make them all good by any highly specific definition, but to make them disagree.

Skeptics may object that if an AI becomes a true superintelligence that is more capable than any human in every domain, then boundaries will no longer matter. Why not let one system do everything, since it might do everything better?

But capability has never conferred legitimacy. A system that could perform every profession flawlessly would be less a doctor, engineer, or lawyer than something approaching a deity. The danger of superintelligence is not only that it may be wrong or come to conclusions that are counter to the interests of humanity, but that it will be right everywhere at once. Civilization depends on divided competence: no one mind, human or artificial, should decide all questions of life, law, and death.

Even a superintelligence would still depend on human infrastructure—servers, energy, laws, markets—and these can be partitioned. We already divide power among institutions precisely because no single actor can be trusted with all of it. Central banks restrain governments; courts restrain police; licensing restrains professions. The same principle can restrain machines. Architecture and law can enforce compartmentalization when it comes to specific high-stakes duties: separate training environments, distinct oversight bodies and independent technical stacks. Instances of the same model, initially trained by different humans in different ways, could even check each other. Containment isn’t about forbidding activities but about preventing over-concentration.
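
A toy sketch can show what such compartmentalized cross-checking might look like in code. The reviewers, their red lines, and the unanimity requirement for high-stakes actions below are assumptions for illustration, not a description of any existing system; the point is that a consequential action proceeds only when independently constituted reviewers with different value profiles concur.

```python
# Toy sketch of "firebreaks" through diversity: a high-stakes action proceeds
# only when independently trained reviewers with different value profiles agree.

HIGH_STAKES = {"deploy_autonomous_surgery", "release_funds", "use_of_force"}

class Reviewer:
    def __init__(self, name: str, veto_conditions):
        self.name = name
        self.veto_conditions = veto_conditions  # each reviewer's own red lines

    def review(self, action: str, context: dict) -> bool:
        # Approve unless any of this reviewer's veto conditions is triggered.
        return not any(cond(action, context) for cond in self.veto_conditions)

def authorize(action: str, context: dict, reviewers: list) -> bool:
    # Unanimous consent for high-stakes actions; simple majority otherwise.
    votes = [r.review(action, context) for r in reviewers]
    if action in HIGH_STAKES:
        return all(votes)
    return sum(votes) > len(votes) / 2

# Hypothetical reviewers constituted under different oversight regimes.
cautious = Reviewer("cautious", [lambda a, c: c.get("patient_consent") is not True])
legalist = Reviewer("legalist", [lambda a, c: not c.get("licensed_operator")])
pragmatic = Reviewer("pragmatic", [lambda a, c: c.get("expected_harm", 0) > c.get("expected_benefit", 0)])

ctx = {"patient_consent": True, "licensed_operator": True, "expected_harm": 1, "expected_benefit": 5}
print(authorize("deploy_autonomous_surgery", ctx, [cautious, legalist, pragmatic]))   # True
print(authorize("deploy_autonomous_surgery", {**ctx, "patient_consent": False},
                [cautious, legalist, pragmatic]))                                     # False
```

Because each reviewer applies its own veto conditions, a flaw in any one of them cannot by itself authorize a high-stakes action; the diversity functions as the firebreak.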

A perfectly competent AI is not necessarily safe. The civilization that endures will not be the one that builds the smartest single system, but the one that builds the most diverse ecology of AI minds—each bounded and answerable both to others and to humans.

The policy implications of such a system will require hard work and novel thought, but one lesson is already clear: regulators must be wary of any rule that overly concentrates moral or market power. Not only are universal, mandatory codes of ethics deeply suspect, but even measures with other aims, such as protecting IP or creating jobs, will pose dangers if they narrow diversity or prevent market entry. Maintaining a strong base of open-source AI can help preserve that diversity, just as the open-source Linux kernel (behind Android phones, netbooks, most smart TVs, and dozens of other devices) undergirds much of today’s consumer technology.

At present, the AI ecosystem is far from uniform. Dozens of leading proprietary and open models—from OpenAI to Anthropic to Mistral—differ in architecture, alignment philosophy, and governance. This competition mirrors the pluralism that keeps markets, and moral systems, dynamic. But such diversity cannot be assumed to persist, and overly prescriptive regulation—much less anointing “national champions” or similar—is its greatest threat. The goal of policy should be to preserve an open, contested field: encouraging transparency, limiting regulation, and promoting robust open-source alternatives that ensure no single firm, or state, defines the morality of artificial intelligence alone.

Ultimately, diversity isn’t a weakness in the age of intelligent machines; it is the best defense against both stupidity and perfection. The lesson, like the one Ultima IV taught nearly forty years ago, remains the same: virtue lies not in obedience, but in freedom, in friction, and in the courage to choose among competing goods.
