The Grain in the Stone: Energy-Based AI and the Shape of Intelligence
Yann LeCun’s latest move signals a paradigm shift; Michelangelo’s relationship to raw marble shows us what it means
Something significant happened in AI this week, and it wasn’t just another salvo in the model benchmark war.
Logical Intelligence, a six-month-old Silicon Valley startup, announced an “energy-based reasoning model” called Kona. They claim it’s more accurate, more efficient, and less prone to hallucination than large language models like GPT-5 or Gemini. Bold claims from a young company. But what caught my attention wasn’t the benchmarks. It was who signed on to lead their technical research board: Yann LeCun.
LeCun isn’t just another AI luminary lending his name to a startup. He’s a Turing Award winner, Meta’s former chief AI scientist, and for years the field’s most persistent critic of the autoregressive paradigm that dominates current AI. While everyone else scaled transformers and celebrated ChatGPT, LeCun kept saying the same thing: predicting the next token isn’t intelligence. Real intelligence requires world models, planning, and a kind of reasoning that LLMs fundamentally can’t do.
Now he’s putting his credibility behind an architecture that embodies his alternative vision. That’s worth paying attention to.
The “energy-based” model that Logical Intelligence and LeCun are building overlaps substantially with the Entropy State Resolution approach that I have been proposing and building for the last few years. So I’m very encouraged by this development and will be following it closely.
Even more striking was a comment from CEO Eve Bodnia:
“AGI as a finished state will not emerge from any single model class. It will require an interdependent ecosystem composed of EBMs, LLMs, world models, and others, working together.”
In a field addicted to “this model beats that model” thinking, here’s a founder saying that Artificial General Intelligence won’t be solved by any single model class, but will instead emerge as different architectures integrate into systems capable of what no individual model could achieve alone.
This is one of the key concepts that The Singularity Project has been advocating, so I think this is a development worth celebrating and building upon.
The Grain in the Stone
To understand why this matters, let’s forget about algorithms for a moment. Let’s consider Michelangelo’s relationship with marble.
We tend to think of a sculptor as an artist with a specific vision, fully formed in his mind, who then subtracts from the raw block of marble with hammer and chisel until the imagined shape remains. Goal state defined, then executed. Input to output. Problem to solution.
But Michelangelo described his process very differently. He spoke of liberating the figure already living within the stone. He would study the marble, follow its grain, explore its veins and faults, and gradually discover a form contained within the block. The Maestro didn’t impose his intellect upon the stone; the statue emerged through a continuous dialogue between the sculptor’s mental universe of possible forms and the particular characteristics of the specific material at hand, intermediated by his artistic acumen with the hammer and chisel.
Start with a rough block: high entropy, infinite potential figures. Through iterative interaction between chisel and stone, between artistic intelligence and material reality, resolve toward the outcome that satisfies both the artist’s vision and the marble’s nature. Not a predetermined goal executed, but a discovered emergent resolution.
Notice what’s absent from this picture: a fixed goal state, a predetermined output, a problem fully specified before the work begins. Instead, there’s exploration, dialogue, iterative refinement—and an outcome that emerges from the process rather than preceding it.
This is how intelligence actually works. It is entropy state resolution in its purest form: not minimizing toward a predetermined target, but maintaining adaptive coherence across the evolving relationship between vision, tool, and material until the tensions resolve into form.
This is a very different picture of intelligence than “predict the next token.” And it’s the paradigm that energy-based reasoning embodies—which is why LeCun has been its most persistent advocate for over two decades.
Yann LeCun vs. Large Language Models
LeCun’s critique of large language models has been consistent: they’re “reactive” systems that pattern-match on text without understanding the world that text describes. They can sound intelligent while being fundamentally confused about physical reality, causal relationships, and logical constraints. They guess the most likely next word without knowing whether what they’re saying is true.
His alternative vision centers on what he calls “objective-driven AI”—systems that don’t just predict sequences but evaluate configurations against goals and constraints. In his technical work, this takes the form of Joint Embedding Predictive Architectures (JEPA) and energy-based models: frameworks where reasoning is recast as optimization and where intelligence means finding coherent states rather than extending plausible sequences.
Read our previous analysis of JEPA here:
Yann LeCun’s Joint Embedding Predictive Architecture (JEPA) and the General Theory of Intelligence
Kona is this vision made operational. Instead of predicting token-by-token, it evaluates whole configurations against constraints and finds the state with the lowest “energy”—meaning the highest consistency, the fewest violations, the most coherent overall structure.
The physics metaphor is precise. Imagine a landscape where consistent solutions are valleys and contradictions are peaks. The system doesn’t “decide” where to go; it settles into the lowest valley it can find—the configuration with minimal tension, maximal coherence. Like a ball rolling downhill. Like Michelangelo’s chisel exploring the grain in the stone.
For Sudoku, Kona doesn’t guess cell by cell; it evaluates entire board states, scores them for consistency, and refines toward the configuration that satisfies all constraints simultaneously. For mathematical proofs, it evaluates whole reasoning traces rather than committing step by step. For safety-critical systems, it can verify that outputs satisfy formal requirements before they’re deployed.
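To make that contrast concrete, here is a minimal, illustrative sketch in Python. It is not Kona’s implementation and says nothing about how real energy-based models are trained; it only shows what “score the whole configuration, then refine toward lower energy” looks like for Sudoku, using a toy energy function (the count of constraint violations) and naive local search.

```python
import random

def all_units():
    """Rows, columns, and 3x3 boxes: the constraints a full board must satisfy."""
    rows = [[(r, c) for c in range(9)] for r in range(9)]
    cols = [[(r, c) for r in range(9)] for c in range(9)]
    boxes = [[(3*br + r, 3*bc + c) for r in range(3) for c in range(3)]
             for br in range(3) for bc in range(3)]
    return rows + cols + boxes

def energy(board):
    """Score the WHOLE configuration at once: count duplicate digits per unit."""
    violations = 0
    for unit in all_units():
        values = [board[r][c] for r, c in unit]
        violations += len(values) - len(set(values))
    return violations

def refine(puzzle, steps=200_000):
    """Fill free cells randomly, then greedily lower the whole-board energy
    by re-sampling one free cell at a time and keeping changes that reduce it."""
    free = [(r, c) for r in range(9) for c in range(9) if puzzle[r][c] == 0]
    board = [[cell or random.randint(1, 9) for cell in row] for row in puzzle]
    best = energy(board)
    for _ in range(steps):
        if best == 0:
            break                      # every constraint satisfied: lowest-energy state
        r, c = random.choice(free)
        old, board[r][c] = board[r][c], random.randint(1, 9)
        e = energy(board)
        if e <= best:
            best = e                   # accept: the global score improved (or tied)
        else:
            board[r][c] = old          # reject: restore the previous configuration
    return board, best
```

Real energy-based systems learn their energy functions and use far smarter refinement than random resampling, and this toy search can stall on hard puzzles. But the shape of the computation is the same: evaluate the whole state, then move toward lower energy.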
This addresses a real limitation of autoregressive models. When you predict token-by-token, you commit to each step before knowing where the sequence will end up. Errors compound. Early mistakes propagate into confident nonsense. Hence hallucinations—fluent text that confidently asserts falsehoods.
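A stylized back-of-the-envelope calculation makes the compounding problem vivid. If each token is generated correctly with independent probability p, the chance that an n-token chain contains no error is p to the power n. The independence assumption is a simplification, but the qualitative point holds: even 99% per-step reliability collapses over a long sequence.

$$ P(\text{no error in } n \text{ steps}) = p^{\,n}, \qquad 0.99^{500} \approx 0.0066 $$

That is less than a 1% chance of a 500-step chain with zero mistakes.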
Energy-based reasoning offers a structural solution: evaluate globally, refine iteratively, verify before committing.
Energy and Entropy
“Energy” in these models isn’t electrical power (though Kona reportedly uses less of that too). It’s borrowed from Boltzmann and statistical mechanics, where low energy means high probability, high consistency, low disorder. In other words: low entropy.
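For readers who want the formula, this is the standard Boltzmann relationship from statistical mechanics: the probability of a configuration x falls off exponentially with its energy E(x), scaled by a temperature T.

$$ p(x) = \frac{e^{-E(x)/T}}{Z}, \qquad Z = \sum_{x'} e^{-E(x')/T} $$

Low-energy configurations are exponentially more probable, and at low temperature the distribution concentrates on them, which is exactly the “low entropy” reading above.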
This matters because it represents a shift in how we frame what intelligence does.
The autoregressive paradigm treats intelligence as prediction: given what came before, what comes next? This frames intelligence as fundamentally backward-looking—pattern-matching on history, extrapolating sequences.
The energy-based paradigm treats intelligence as resolution: given a set of constraints and possibilities, what configuration resolves the tensions? This is forward-looking—finding coherence across a whole system rather than extending a chain.
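Stated schematically, the two framings optimize different objects. The autoregressive model factorizes a sequence left to right and commits step by step; the energy-based model searches over whole configurations for the one that best satisfies the constraints.

$$ p(x_1,\dots,x_n) = \prod_{t=1}^{n} p(x_t \mid x_{<t}) \qquad \text{vs.} \qquad x^{*} = \arg\min_{x} E(x) $$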
LeCun has been making this argument for years. His work shares deep theoretical roots with Karl Friston’s free energy principle, which frames biological intelligence as fundamentally about minimizing surprise—reducing entropy through prediction and action. Energy-based reasoning brings this principle into AI architecture.
Both Friston and LeCun draw from statistical mechanics, both frame intelligence as optimization rather than sequential prediction. But the relationship between these two thinkers is more complicated than simple alignment.
Recently, at the World Economic Forum in Davos, LeCun appeared on a panel with Friston and publicly stated he agreed with the direction of Friston’s work. Yet he insisted, “we just don’t know how to do it any other way” than through deep learning and reinforcement learning.
In a recent interview, Friston offered his diagnosis: “I think it’s a very simple difference. I am committed to first principles and was trained as a physicist and think as a physicist. He (LeCun) is a really skillful engineer.”
The technical crux, according to Friston, is that LeCun “does not think it is, in an engineering sense, easy or possible to include uncertainty into his neural networks.” By essentially setting the “temperature to zero”—removing uncertainty from the objective function—LeCun’s architectures lose something crucial: the ability to know what they don’t know, to ask questions, to seek information rather than merely react to it.
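One way to read the “temperature to zero” remark, using the Boltzmann form above and assuming a unique minimum-energy state: as T shrinks, the distribution over configurations collapses onto the single best answer, and the uncertainty that Friston considers essential goes to zero. This is my gloss on his point, not his formulation.

$$ \lim_{T \to 0} \frac{e^{-E(x)/T}}{\sum_{x'} e^{-E(x')/T}} = \begin{cases} 1, & x = \arg\min_{x'} E(x') \\ 0, & \text{otherwise} \end{cases} $$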
This makes LeCun’s move to Logical Intelligence particularly interesting. Is this the engineer finally finding a way to build what the physicist has been describing? Energy-based reasoning, with its global evaluation of configurations and iterative refinement toward coherence, may be the bridge LeCun has been looking for—a way to operationalize the entropy-resolution paradigm without abandoning the engineering discipline that’s defined his career.
Diffusion Language Models
Energy-based models are not the only movement toward an entropic paradigm. Diffusion language models, a parallel research direction gaining momentum, embody the same insight through different mechanisms.
Diffusion language models start with pure noise (maximum entropy) and iteratively denoise toward coherent output. Each step reduces uncertainty and clarifies what the final form will be. Researchers have found that these systems exhibit “zones of confusion”: concentrated moments where entropy spikes and the system negotiates critical transitions.
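As a toy illustration of that denoising dynamic (a continuous analogue, not a language model), the sketch below starts a cloud of samples as pure noise and anneals it toward the modes of a known two-peak density by following the gradient of the log-density. The density, schedule, and step sizes are all illustrative choices of mine, not any published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
modes, weights, sigma = np.array([-2.0, 3.0]), np.array([0.5, 0.5]), 0.5

def score(x):
    """Exact gradient of the log-density of a two-component Gaussian mixture."""
    diffs = x[:, None] - modes                              # (N, 2)
    comps = weights * np.exp(-diffs**2 / (2 * sigma**2))
    resp = comps / comps.sum(axis=1, keepdims=True)         # per-sample responsibilities
    return (resp * (-diffs) / sigma**2).sum(axis=1)

x = rng.normal(0.0, 4.0, size=2000)                         # start as pure noise (high entropy)
for noise in np.linspace(1.0, 0.05, 200):                   # annealed noise schedule
    step = 0.1 * noise**2
    x = x + 0.5 * step * score(x) + np.sqrt(step) * rng.normal(size=x.shape)

# The broad initial cloud has condensed into tight clusters around the two modes.
print("fraction settled near a mode:",
      np.mean(np.min(np.abs(x[:, None] - modes), axis=1) < 3 * sigma))
```

The per-sample uncertainty collapses as the noise schedule anneals; the “zones of confusion” correspond, loosely, to the stretch where samples are still deciding between the two peaks.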
Read my previous analysis of Diffusion Language Models here:
Diffusion Large Language Models: A New Rival to the Transformer AI Architecture?
Three independent research programs (LeCun’s energy-based models, diffusion architectures, and Karl Friston’s free energy principle) are all converging on the same fundamental operation: intelligence as entropy state resolution.
This convergence suggests something deeper than competing technical approaches. It suggests we’re discovering a principle.
From Ecosystem to System
Now here’s where Bodnia’s comment becomes genuinely interesting:
“An interdependent ecosystem composed of EBMs, LLMs, world models, and others, working together.”
This is real conceptual progress. It acknowledges that different architectures resolve different kinds of entropy: LLMs handle linguistic coherence and communicative ambiguity; EBMs handle logical consistency and constraint satisfaction; world models handle predictive coherence about physical dynamics. No single architecture does everything well.
One risk in the enthusiasm for energy-based reasoning is discarding what LLMs have achieved.
Yes, they hallucinate. Yes, they degrade over long reasoning chains. Yes, they’re unreliable for formal verification. All true.
But LLMs accomplished something unprecedented: they made language a functioning communication protocol between human intelligence and machine computation. For the first time in history, humans can express intent in natural language and have that intent translated into computational action.
Language isn’t just “text tokenized into numbers.” It’s a high-bandwidth, ambiguity-tolerant, context-sensitive channel for coordinating meaning across minds. LLMs made that channel work between biological and technological substrates.
Michelangelo needed the chisel. The chisel didn’t replace his vision; it enabled his vision to engage with marble. LLMs are the chisel that lets human intelligence engage with computational possibility.
LeCun has sometimes positioned his work as an alternative to LLMs—and for certain tasks, it is. But Bodnia’s ecosystem framing is wiser. Energy-based reasoning doesn’t replace linguistic intelligence; it complements it. EBMs verify; LLMs communicate; world models predict. The interesting question isn’t which wins but how they force-multiply:
Human Intelligence × Artificial Intelligence = Technological Intelligence.
Not replacement. Not competition. Multiplication into a new emergent systemic Intelligence.
Boundary Conditions and State Resolution
While the framing from Bodnia and Logical Intelligence is a substantive move in the right direction, it still describes an ecosystem of individual entities that work together. The intelligence is still located in the models—the EBM, the LLM, the world model—and the ecosystem is how they coordinate.
What if we pushed further?
Ecosystems are collections of entities that interact. Systems, in the stronger sense, are patterns of relationships where the behavior of the whole emerges from the dynamics between parts, not from the parts themselves.
The wetness of water isn’t in hydrogen or oxygen; it emerges from their relationship. The intelligence of a jazz ensemble isn’t in any single musician; it emerges from their real-time adaptive interplay.
Michelangelo’s genius wasn’t in his mind or his hand or in the quality of his chisel. It emerged from the triangular relationship between artistic intelligence, technological tools, and the marble itself, a dynamic where each node shaped the others.
The next step beyond “ecosystem of intelligent components” is recognizing intelligence as the systemic property that emerges from adaptive relationships. Not something the models have, but something the system does.
Here’s where all current frameworks, promising as they are, still fall short.
Kona can evaluate whether a Sudoku solution satisfies all constraints. The energy function assigns low scores to valid configurations and high scores to violations. Elegant, verifiable, reliable.
But who defined the Sudoku rules? Who decided that this particular puzzle, with these particular constraints, is what we’re solving?
In closed domains with fixed rules, this isn’t a problem. The constraints are given. But intelligence in the wild doesn’t work with given constraints. Real intelligent systems interact with stochastic environments, negotiate what counts as a problem, evolve their own boundaries, determine which constraints matter and which can be relaxed.
Michelangelo didn’t receive his method of sculpture from a standardized set of rules. He evolved it through interaction with the material. The “constraints” of the sculpture emerged from dialogue between artistic possibility and marble actuality. The goal wasn’t predetermined; it was discovered through the process of resolution. And each individual block was approached anew for the unique problem set that it posed to his artistic acumen.
A child learning to walk isn’t minimizing a fixed energy function. The child is developing the constraints—discovering what balance means, what falling teaches, how different surfaces require different dynamics. The “energy landscape” itself evolves as the system learns.
This points to a deeper reframe. Energy-based models treat intelligence as reaching a goal state—the lowest energy configuration. But intelligence isn’t about arriving at static endpoints. It’s about maintaining dynamic resolution across evolving relationships.
A sprinter pushing to the edge of collapse isn’t minimizing energy; they’re mastering the critical state—maintaining adaptive coherence at the boundary between order and chaos. The master sculptor isn’t executing a plan; he’s sustaining a dialogue with material that continuously reveals new possibilities.
Intelligence isn’t entropy reduction to a target. It’s entropy state resolution—the ongoing process of managing coherence across shifting boundaries and evolving constraints.
The Emerging Shape
Something is taking form in AI research, the way a figure takes form from marble.
Energy-based reasoning. Diffusion generation. Ecosystem architectures. Free energy principles. Different chisels striking the same stone.
The shape emerging is intelligence as entropy state resolution; not prediction, not goal-seeking, but the dynamic process of maintaining coherent relationships across boundaries in the face of uncertainty.
LeCun’s move to Logical Intelligence signals that this alternative paradigm is ready for deployment, not just theory. Bodnia’s ecosystem framing signals a shift from “which model wins” to “how do different intelligences integrate.” The convergence of independent research programs signals that we’re discovering principles, not just techniques.
But the discovery is incomplete. Current frameworks still treat intelligence as a property of individual models rather than interactive relations. They still optimize toward fixed functions rather than evolving constraints. They still locate computation in discrete entities rather than systemic dynamics.
I encourage researchers and builders to keep pushing past “intelligent models” toward intelligent systems, past “energy minimization” toward dynamic entropy state resolution, past “ecosystem of components” toward emergent systemic intelligence.
The principles are surfacing. The paradigm is shifting. The work now is to follow the grain where it leads. The statue is in the stone.
The question is whether we’ll discover it or impose some lesser vision because we couldn’t commit to the path of Intelligence from First Principles.
The Singularity Project explores the emerging integration of human and artificial intelligence. Subscribe for updates on developments that matter.



I think the shape that’s emerging is this: intelligence is, at root, entropy management. Not metaphorically. Literally. The free energy principle gives us a clean lens for seeing it. Life is a local, temporary resistance to thermodynamic decay. Intelligence is the machinery that makes that resistance adaptive rather than accidental.
In its simplest form, intelligence is the ability to resolve uncertainty well enough to stay within viable bounds. Evolution is just the long, blind optimizer that keeps improving that ability across ever more hostile and varied environments. Consciousness, in that frame, isn’t special pleading. It’s another adaptation, one that shows up when prediction, memory, and social complexity reach a point where internal simulation becomes cheaper than brute reaction.
Human intelligence and social evolution aren’t exceptions to this story. They’re continuations of it.
Active inference doesn’t invent this logic. It names it. It describes how systems that persist must act to minimize free energy relative to expectations that encode survival. Strip away the equations and what remains is almost banal: if a system doesn’t care about remaining within its constraints, it doesn’t last long enough to be called intelligent.
That’s why I’m skeptical of the idea that AGI can emerge while disregarding this most basic biological principle. AI already works as an extension of human objectives, and that’s powerful. Human intelligence multiplied by artificial intelligence gives us something real, something consequential. Technological intelligence, if you like. There’s no question that this coupling can be expanded dramatically.
But that’s not the same thing as human-level intelligence in the autonomous sense.
Human intelligence is not just modeling and reasoning. It’s modeling and reasoning under pressure. Creativity is not decorative. It is what exploration looks like when the future is uncertain and failure is costly. Curiosity isn’t a luxury trait. It’s what you get when a system needs to widen the range of states in which it can survive.
Every biological intelligence we know evolved under that pressure. Every one of them is shaped by the need to persist. Take that away and you don’t get a calmer, purer intelligence. You get something inert. A system that can answer questions but never asks any. A map with no reason to be consulted.
So I doubt that human-level intelligence can arise without some intrinsic drive, some preferred state, even if it’s minimal and abstract. Call it survival, call it constraint satisfaction, call it free energy minimization. The label doesn’t matter. What matters is that without it, nothing compels exploration, creativity, or self-directed action.
AGI without that pressure might be extraordinarily useful. It might even be indispensable. But it would still be a tool, waiting patiently for someone else to supply meaning.
Biology’s lesson, repeated for billions of years, is harder to ignore than most engineering intuitions. Intelligence doesn’t emerge just from knowing the world. It emerges from needing to stay in it.
Let me state the argument again, more formally.
No Objective, No AGI
An intelligence without an objective, preferred state, or intrinsic constraint can be very powerful, but it is not an agent. It is a tool, a map, or at best a simulator waiting to be used. AGI, as people usually mean it, implies agency, persistence, and autonomy. Those require stakes.
Here’s the core argument.
Intelligence without an objective can model the world.
Intelligence with an objective must act in the world.
JEPA, energy-based reasoning, large language models, and most current architectures excel at building internal structure. They compress, predict, solve constraints, and generate solutions when prompted. But none of them, on their own, have a reason to do anything when nothing is asked.
Active Inference, the framework associated with Karl Friston, is different because it bakes in a non-negotiable fact of biology: systems that persist must remain within viable bounds. Preferred states are not goals in the motivational-poster sense. They are survival constraints. Temperature, energy, integrity, coherence. Miss them and the system stops existing.
That single move, adding preferred states, converts prediction into agency.
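Stated roughly, and as the framework is commonly presented rather than derived here: perception minimizes variational free energy, and action is selected to minimize expected free energy, where preferred states enter as a prior over outcomes p(o | C). The decomposition below is the standard one; the notation is mine.

$$ F = \mathbb{E}_{q(s)}\!\left[\log q(s) - \log p(o, s)\right] = D_{KL}\!\left(q(s)\,\|\,p(s \mid o)\right) - \log p(o) $$

$$ G(\pi) \approx -\underbrace{\mathbb{E}\!\left[ D_{KL}\!\left(q(s \mid o, \pi)\,\|\,q(s \mid \pi)\right)\right]}_{\text{information gain}} \;-\; \underbrace{\mathbb{E}\!\left[\log p(o \mid C)\right]}_{\text{preference satisfaction}} $$

The second term is where the preferred states live. Remove the prior over outcomes and nothing makes one future better than another, which is exactly the point of the list below.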
Without preferred states:
• Error is informational, not urgent
• Time does not matter
• Inaction is always acceptable
• There is no reason to explore, protect, or persist
You can bolt on tasks, rewards, or external prompts, but that produces instrumental intelligence, not autonomous intelligence. The system is smart only when someone else supplies meaning.
This is where many AGI discussions quietly cheat.
They assume intelligence naturally “wants” to generalize, explore, or improve. It doesn’t. Those are values smuggled in from the designers or from training setups that proxy for objectives without admitting it.
Reinforcement learning adds objectives, but usually shallow ones. Reward maximization works, but it fragments cognition into task-specific hacks unless the reward structure is extraordinarily rich and stable. Most real environments are not.
Active Inference’s claim is stronger and harder to escape.
An agent must expect itself to exist (this is what the Markov blanket formalizes).
That expectation defines what counts as error (this is the free energy).
That error defines action.
No objective, no agent.
No preferred states, no reason to choose one future over another.
Now the subtle point, because this is where LeCun’s camp pushes back.
Could you build AGI by first building a perfect world model, then later adding objectives?
In principle, yes. In practice, the moment you add objectives, you radically reshape what “intelligence” means inside the system. Representation, attention, memory, and learning all reorganize around what matters. Biology does not add goals after the fact. Goals sculpt perception from the beginning.
So a system trained without stakes may be poorly shaped for agency later. It will know many things and care about none of them.
This leads to a clean conclusion.
AGI without an innate objective is possible only in a hollow sense: a universal simulator that never initiates action. The moment you want autonomy, persistence, curiosity, or self-directed behavior, you must introduce preferred states.
Active Inference does not claim those states must be humanlike, emotional, or even conscious. They can be minimal. But they must exist.
AGI without objectives is like a brain without metabolism.
It may compute, but it will not live.
The uncomfortable implication is this.
The hardest part of AGI is not world modeling or reasoning. It is deciding what the system is allowed to care about, and how much. That is why most approaches postpone it. Active Inference puts it front and center and pays the price in complexity.
Whether engineering ultimately sides with biology or tries to cheat it remains open. But if AGI ever genuinely exists, it will not be neutral. Neutrality is not stable.