Are you bored of your computer doing exactly what it’s told in a deterministic and explainable fashion?
Then
Okay, it’s easy to make fun of things, but let’s take a serious look.
what is a neural computer?
The preprint positions them as a complete replacement for conventional computers.
It’s a machine where memory, I/O, and all computational logic are replaced by a learned model. This means such machines are governed by some sort of loss instead of rigid rules.
If we have just the model and no other parts, then we call it a completely neural computer. But not just any model will suffice. Ideally, the model ensures that our machine is Turing complete, universally programmable, and behaviorally consistent unless explicitly reprogrammed.
These sound like sensible things that we would want from a computer.
But how do we define and apply concepts such as Turing completeness to neural networks?
What exactly does it mean for a neural computer to be behaviorally consistent?
The paper responds with some vague references, hypotheses, and roadmaps.
But let’s assume we could achieve such a machine.
why would we?
Why do we want to make our computers completely neural?
Seems like a pretty important question to answer in a paper on this new “frontier” of research.
Yet, the paper barely glances at it.
Something is only worth working towards if it is better in some way, so let us examine the differences to conventional computers more closely.
According to the authors, conventional computers are exact and interpretable, but brittle under noise and model mismatch. “Neural computers, by contrast, realize holistic, distributed numerical semantics, trading precise local semantics for robustness and generalization” (yes, all of the paper is written this way).
Okay, so basically, old computers are exact, but this exactness comes at the cost of brittleness. Neural computers are inexact but flexible and capable of generalization instead.
Let’s take a look at an example from the paper: CLI / REPL. The authors train a video generation model on screen recordings of a terminal where some basic commands are executed. In a command line, “ls” outputs the contents of the current directory. In a Python REPL, “3+5” yields “8”. The model, once trained on these commands, can simulate such outputs when given the corresponding inputs. This is supposed to be the first step on the path to neural computers.
Ignoring their pretty evaluation scores, what reason is there to execute commands like “ls” or “3+5” in a neural computer instead of a conventional one?
Basic addition is a solved problem. What benefit do we get from using a neural network?
I can think of two possible reasons neural computers could be advantageous over conventional methods:
syntax
As mentioned previously, conventional computers are rigid. You must follow the given syntactical rules. Neural computers could be more flexible, allowing the machine to interpret your commands. Instead of “ls”, you could ask “what is in this directory?”.
But is this what we want?
Perhaps sometimes some flexibility is nice, but rarely in the execution of straightforward tasks. A model that interprets commands and executes them conventionally would be much better here. And even in other, more complicated examples, we must understand that leaving commands up for interpretation is never the preferred way of doing anything. It is a fallback. Exact instructions are always better when possible, it’s just that they are sometimes not feasible.
If the authors focused on such tasks where conventional methods fail, it would make some sense. But this is not the case. The authors explicitly state that neural computers are meant to fulfill all of the functions of conventional computers but with a different “underlying runtime substrate”.
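To make the alternative mentioned above concrete, here is a minimal sketch of “interpret flexibly, execute conventionally”. Everything in it is an assumption for illustration: the `PHRASES` table is a hypothetical stand-in for a learned model, and the names `interpret`, `execute`, and `run` are made up. It is not taken from the paper.

```python
import os
import tempfile

# Hypothetical stand-in for a learned model: maps loose phrasing to an exact
# command name. A real system would put a language model here.
PHRASES = {
    "ls": "ls",
    "list files": "ls",
    "what is in this directory?": "ls",
}


def interpret(request: str) -> str:
    """Flexible step: turn a free-form request into an exact command name."""
    key = request.strip().lower()
    if key not in PHRASES:
        raise ValueError(f"cannot interpret request: {request!r}")
    return PHRASES[key]


def execute(command: str, directory: str) -> list[str]:
    """Exact step: run the command with ordinary, deterministic code."""
    if command == "ls":
        return sorted(os.listdir(directory))
    raise ValueError(f"unknown command: {command!r}")


def run(request: str, directory: str) -> list[str]:
    """Interpret the request once, then hand over to conventional execution."""
    return execute(interpret(request), directory)


if __name__ == "__main__":
    # Small demo in a throwaway directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        open(os.path.join(demo_dir, "notes.txt"), "w").close()
        print(run("What is in this directory?", demo_dir))
```

The only fuzzy part is the mapping from request to command. Once a command is chosen, the result comes from the file system itself, so it is exact and repeatable.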
efficiency
The second reason neural computers could be better is that their execution time might be approximately constant, independent of the task. The neural architecture may take just as long to display a complex physics simulation as to add “3+5”. This makes it very efficient at some conventionally complicated work, which sounds amazing until you remember that this comes at the cost of all guarantees and reliability. What use is a super fast physics simulation that isn’t accurate?
At the same time, if this were true, it would also mean simple things such as adding numbers would likely be much slower than conventional methods, giving us a trade-off instead of general efficiency.
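The trade-off can be put into a toy cost model. This is only a sketch under loud assumptions: the numbers are arbitrary units rather than measurements, and `NEURAL_COST`, `conventional_cost`, and `faster_runtime` are names invented for this illustration.

```python
# Arbitrary illustrative units, not measurements of any real system.
NEURAL_COST = 1_000  # assumed fixed cost of one forward pass, whatever the task


def conventional_cost(steps: int) -> int:
    """Assume conventional cost grows with the number of basic operations."""
    return steps


def faster_runtime(steps: int) -> str:
    """Name the runtime that is cheaper for a task of the given size."""
    if conventional_cost(steps) > NEURAL_COST:
        return "neural"
    return "conventional"
```

Under these assumptions, a one-step task like adding two numbers goes to the conventional machine, and only tasks above the fixed cost favor the neural one. The model says nothing about accuracy, which is the other half of the problem.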
conclusion
In total, these reasons are not very compelling and, sadly, the paper does not offer much help to its own cause.
The authors mention that progress can be evaluated by conventional solvers and combinatorial program search like LLM-based code generation, but offer no good argument as to why we don’t just focus on model-assisted code generation in the first place.
One of my main questions is: If we have advanced in the field of artificial intelligence so far that we can construct a completely neural computer to replace conventional ones, then why don’t we use all of that intelligence and capacity to create instructions for conventional machines whenever possible?
I understand the appeal of a general purpose AI machine that accepts any input and can do anything you ask it to, but replacing conventional architecture and methods completely is unfounded and nonsensical.
When we made video games neural, I thought it was at least fun. And I could even see the potential when combined with some sort of determinism and persistence. On-the-fly adaptation, generalizability and the infinite possibilities just might make the effort worthwhile.
But this?
It seems like the authors got so caught up in the technical implementation of a fancy idea that nobody stopped to think about whether the mountain was worth climbing.
I see this happening way too often now, especially in bigger teams where everyone is working on their part and can easily avoid the hassle of responsibility. It’s hard to be the one that says “hey everyone, perhaps we should rethink our entire approach and throw away everything we have worked on if it doesn’t make sense”.
But in an age where doing is faster than thinking, we must take the time to ask “why do we want this?” and “are we still headed in the right direction?” or risk wasting our time paving a road up a mountain that nobody wants to climb.
attribution:
Preprint on arXiv: https://arxiv.org/abs/2604.06425