Shekar Natarajan is the founder and CEO of Orchestro.AI.
In every boardroom conversation about AI safety, there is a familiar belief that if we can just bolt on the right guardrails and keep improving red-teaming and reinforcement learning, we will eventually reach alignment. The assumption is that ethics can be patched onto any model as long as we stay vigilant.
That approach has not worked, and it is not going to start working. We're not talking about bugs we can kill. The failures we are seeing are the natural consequence of building intelligent systems without any moral orientation.
For 20 years, I have led technology and logistics transformation across some of the world’s largest networks. I have seen what happens when optimization is the ultimate goal. Operations run faster at the expense of the humans who keep them running. The driver who stops to help an elderly customer gets penalized for falling behind. A warehouse worker who chooses safety over speed receives a warning for low productivity.
The algorithm isn’t making a mistake. It’s doing exactly what we designed it to do.
The truth is that most AI systems today are amoral. They maximize whatever objective we give them while outsourcing ethics to external filters. If a system prioritizes speed, it will chase speed even when that harms someone. If the objective is to lower cost, the machine will cut every human accommodation it can find.
Why Safety Keeps Falling Behind Capability
Every new leap in capability introduces new risks faster than our guardrails can adapt. Reinforcement learning from human feedback (RLHF) has been treated as the main solution, but RLHF works by averaging human preferences into a single reward signal, and human morality is riddled with ambiguity. What is fair? What does justice require? What does compassion mean in cultures with different expectations of care? There is no universal preference model that will always hold.
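To see why averaging is the weak point, consider a toy illustration (this is not the RLHF pipeline itself, and the scenarios and numbers are hypothetical): when two groups of annotators genuinely disagree, the averaged reward collapses their disagreement into a signal neither group actually holds.

```python
# Hypothetical annotator preferences on two routing decisions (+1 approve, -1 reject).
preferences = {
    "route_around_picket_line": {"group_a": +1.0, "group_b": -1.0},  # deep moral disagreement
    "fastest_legal_route":      {"group_a": +0.6, "group_b": +0.4},  # broad agreement
}

for action, votes in preferences.items():
    averaged_reward = sum(votes.values()) / len(votes)
    print(f"{action}: averaged reward = {averaged_reward:+.2f}")

# The contested choice averages to 0.00, as if everyone were neutral,
# even though neither group is neutral. The agreement case averages to +0.50.
```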
We respond to these gaps by adding new filters. When those filters fail, we add more. It becomes a race between capability and constraint. Each time, capability wins because that’s what AI is trained to do.
The more complex the system becomes, the more unpredictable the failure modes. Safety cannot scale reactively. It has to originate in the same place intelligence originates.
Safety Must Be An Architectural Decision
In humans, the prefrontal cortex evaluates consequences and inhibits impulsive action. We need the same construct in AI. I call it the Moral Cortex Layer, part of the blueprint that tells the system when to seek justification, not acceleration.
Imagine an AI that pauses when a decision results in a human consequence. Instead of automatically routing a delivery truck to avoid overtime penalties, the system pauses. It asks whether this delivery affects medicine access or a vulnerable community. It can elevate decisions to a human when the stakes rise.
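A minimal sketch of that pause-and-escalate behavior might look like the following. The thresholds, the stakes fields and the verdict categories are illustrative assumptions, not Orchestro.AI's implementation.

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    PROCEED = "proceed"
    JUSTIFY = "justify"    # require an explicit justification before acting
    ESCALATE = "escalate"  # hand the decision to a human

@dataclass
class StakesAssessment:
    affects_medicine_access: bool
    affects_vulnerable_community: bool
    projected_human_harm: float  # 0.0 (none) to 1.0 (severe), estimated upstream

def moral_cortex_gate(stakes: StakesAssessment) -> Verdict:
    """Decide whether an optimization step may proceed, must justify itself,
    or must be escalated to a human reviewer."""
    if stakes.affects_medicine_access or stakes.affects_vulnerable_community:
        return Verdict.ESCALATE
    if stakes.projected_human_harm > 0.3:
        return Verdict.JUSTIFY
    return Verdict.PROCEED

# Example: rerouting a delivery truck to avoid overtime penalties.
verdict = moral_cortex_gate(StakesAssessment(
    affects_medicine_access=True,
    affects_vulnerable_community=False,
    projected_human_harm=0.1,
))
print(verdict)  # Verdict.ESCALATE -> a human decides, not the router
```

The design choice that matters here is that the gate sits in the decision path itself, not in a filter bolted on after the route has already been optimized.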
Virtue As Computational Structure
The Moral Cortex Layer alone is not enough. Judgment requires values. In our work at Orchestro.AI, we define those values as virtues: prudence, justice, compassion, integrity and others. Each virtue becomes an independent model capable of perceiving the world through a moral lens. These models debate each other when decisions require tradeoffs. Justice might recommend strict fairness. Compassion might argue for flexibility. Prudence might advise caution.
These are not metaphorical; they are real agents with distinct objectives. Their disagreement is the point.
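Here is a stripped-down sketch of what virtues-as-agents could look like in code. The scoring functions, the decision fields and the disagreement threshold are assumptions made for illustration; the point is that disagreement is surfaced rather than averaged away silently.

```python
from typing import Callable, Dict

Decision = Dict[str, float]  # hypothetical features of a candidate decision

def justice(option: Decision) -> float:
    # Strict fairness: penalize options that shift burden onto one party.
    return -option.get("burden_shifted", 0.0)

def compassion(option: Decision) -> float:
    # Flexibility: penalize options that create hardship for affected people.
    return -option.get("hardship", 0.0)

def prudence(option: Decision) -> float:
    # Caution: penalize irreversible or high-variance outcomes.
    return -option.get("irreversibility", 0.0)

VIRTUES: Dict[str, Callable[[Decision], float]] = {
    "justice": justice,
    "compassion": compassion,
    "prudence": prudence,
}

def deliberate(option: Decision) -> None:
    """Let each virtue score the option and flag real disagreement."""
    scores = {name: fn(option) for name, fn in VIRTUES.items()}
    for name, score in scores.items():
        print(f"{name}: {score:+.2f}")
    if max(scores.values()) - min(scores.values()) > 0.5:
        print("Virtues disagree -> record the tradeoff and escalate.")

deliberate({"burden_shifted": 0.8, "hardship": 0.1, "irreversibility": 0.2})
```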
Learning From The Right Signals
Most AI learning pipelines treat human overrides as errors, as glitches in the system. The exceptions are buried deep in logs, never to be studied or rewarded. Yet many of humanity's finest moments are exceptions. A driver defies a route because a customer seems distressed. A nurse slows down to comfort a patient even when the scheduling software says move on.
These moments are the moral signals a machine must learn from. We call this Angelic Intelligence. It captures interventions where humans improve outcomes and reinforces those decisions as desirable patterns. We also store these stories so future models understand the narrative behind the data.
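A minimal sketch of that idea: log the override, keep the story behind it, and mark it as a pattern to reinforce rather than suppress. The record fields, the reward value and the file format are assumptions for illustration, not a published schema.

```python
import json
import time

def record_override(model_action: str, human_action: str, narrative: str,
                    outcome_improved: bool,
                    log_path: str = "override_log.jsonl") -> None:
    """Store a human override with its narrative so future training can reward it."""
    record = {
        "timestamp": time.time(),
        "model_action": model_action,
        "human_action": human_action,
        "narrative": narrative,                      # why the human intervened
        "reward": 1.0 if outcome_improved else 0.0,  # reinforce good exceptions
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

record_override(
    model_action="continue_route",
    human_action="stop_to_check_on_customer",
    narrative="Customer at the door appeared distressed; driver paused the run to help.",
    outcome_improved=True,
)
```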
The Shift Leaders Must Make Now
The biggest risk in AI is not the unknown—it is continuing with the known when we already see the consequences. There are two fundamental questions leaders must start asking:
• Where are systems making decisions that shape human well-being?
• What moral reasoning guides those decisions today?
If the answer to the second question is silence, we have work to do.
AI is only going to be as moral as we train it to be. Organizations that embed moral reasoning into their systems will operate with trust that cannot be replicated by guardrails alone.
We stand at a decision point. We can keep adding patches as machines grow more capable, or we can give machines the cognitive structures that hold humanity at the center from the start. The choice determines whether the next generation of AI will accelerate human potential or optimize us out of our own story.
