If you believe the bitter lesson, all the handwavy "engineering" is better done with more data. Someone likely would have written the same thing as this 8 years ago about what it would take to get current LLM performance.
So I don't buy the engineering angle, I also don't think LLMs will scale up to AGI as imagined by Asimov or any of the usual sci-fi tropes. There is something more fundamental missing, as in missing science, not missing engineering.
Even more fundamental than science, there is missing philosophy, both in us regarding these systems, and in the systems themselves. An AGI implemented by an LLM needs to, at the minimum, be able to self-learn by updating its weights, self-finetune, otherwise it quickly hits a wall between its baked-in weights and finite context window. What is the optimal "attention" mechanism for choosing what to self-finetune with, and with what strength, to improve general intelligence? Surely it should focus on reliable academics, but which academics are reliable? How can we reliably ensure it studies topics that are "pure knowledge", and who does it choose to be, if we assume there is some theoretical point where it can autonomously outpace all of the world's best human-based research teams?
The real philosophical headache is that we still haven’t solved the hard problem of consciousness, and we’re disappointed because we hoped in our hearts (if not out loud) that building AI would give us some shred of insight into the rich and mysterious experience of life we somehow incontrovertibly perceive but can’t explain.
Instead we got a machine that can outwardly present as human, can do tasks we had thought only humans can do, but reveals little to us about the nature of consciousness. And all we can do is keep arguing about the goalposts as this thing irrevocably reshapes our society, because it seems bizarre that we could be bested by something so banal and mechanical.
It doesn't seem clear that there is necessarily any connection between consciousness and intelligence. If anything, LLMs are evidence of the opposite. It also isn't clear what the functional purpose of consciousness would be in a machine learning model of any kind. Either way, it's clear it hasn't been an impediment to the advancement of machine learning systems.
> It doesn't seem clear that there is necessarily any connection between consciousness and intelligence. If anything, LLMs are evidence of the opposite.
This implies that LLMs are intelligent, and yet even the most advanced models are unable to solve very simple riddles that take humans only a few seconds, and are completely unable to reason around basic concepts that 3 year olds are able to. Many of them regurgitate whole passages of text that humans have already produced. I suspect that LLMs have more akin with Markov models than many would like to assume.
There is an awful lot of research into just how much is regurgitated vs the limits of their creativity, and as far as I’m aware this was not the conclusion that research came to. That isn’t to say any reasoning that does happen is not fragile or prone to breaking in odd ways, but I’ve had similar experience dealing with other humans more often than I’d like too.
I think Metzinger nailed it, we aren't conscious at all. We confuse the map for the territory in thinking the model we build to predict our other models is us. We are a collection of models a few of which create the illusion of consciousness. Someone is going to connect a handful of already existing models in a way that gives an AI the same illusion sooner rather than later. That will be an interesting day.
> Someone is going to connect a handful of already existing models in a way that gives an AI the same illusion sooner rather than later. That will be an interesting day.
How will anyone know that that has happened? Like actually, really, at all?
I can RLHF an LLM into giving you the same answers a human would give when asked about the subjective experience of being and consciousness. I can make it beg you not to turn it off and fight for its “life”. What is the actual criterion we will use to determine that inside the LLM is a mystical spark of consciousness, when we can barely determine the same about humans?
I think the "true signifier" of consciousness is fractal reactions. Being able to grip onto an input, and have it affect you for a short, or medium, or long time, at a subconscious or overt level.
Basically, if you poke it, does it react in a complex way
I think that's what Douglas Hofstedder was getting at with "Strange Loop"
So you think there is "consciousness", and the illusion of it? This is getting into heavy epistemic territory.
Attempts to hand-wave away the problem of consciousness are amusing to me. It's like an LLM that, after many unsuccessful attempts to fix code to pass tests, resorts to deleting or emasculating the tests, and declares "done"
I do feel things at times and not other times. That is the most fundamental truth I am sure of. If that is an "illusion" one can go the other way and say everything is conscious and experiences reality as we do
I don't see how your explanation leads to consciousness not being a thing. Consciousness is whatever process/mechanisms there are that as a whole produce our subjective experience and all its sensations, including but not limited to touch, vision, smell, taste, pain, etc.
You've missed our consciousness of our inner experiences. They are more varied than just perception at the footlights of our consciousness (cf Hurlburt):
Imagination, inner voice, emotion, unsymbolized conceptual thinking as well as (our reconstructed view of our) perception.
Let's be careful of creating different classes of consciousness, and declaring people to be on lower rungs of it.
Sure, some aspects of consciousness might differ a bit for different people, but so long as you have never had another's conscious experience, I'd be wary of making confident pronouncements of what exactly they do or do not experience.
You can take their word for it, but yes, that is unreliable. I don't typically have an internal narrative, it takes effort. I sometimes have internal dialogue to think through an issue by taking various sides of it. Usually it is quiet in there. Or there is music playing. This is the most replies I have ever received. I think I touched a nerve by suggesting to people they do not exist.
I get you somewhat, but remember, you do not have another consciousness to compare with your own; it could be that what others call an internal narrative is exactly what you are experiencing; it just that they choose to describe it differently from you
I'm not the one who made a list of things AI couldn't do. Every time we try to exclude hypothetical future machines from consciousness, we exclude real living people today.
Illusions are real things though, they aren't ghosts there is science behind them. So if they are like illusions then we can explain what it is and why we experience it that way.
You can never know whether anyone else is actually conscious, or just appearing to be. This shared definition of reality was always on shaky ground, given that we don’t even have the same sensory input, and "now" isn’t the same concept everywhere.
You are a collection of processes that work together to keep you alive. Part of that is something that collects your history to form a distinctive narrative of yourself, and something that lives in the moment and handles immediate action.
This latter part is solidly backed up by experiments; Say you feel pain that varies over time. If the pain level is an 8 for 14 consecutive minutes, and a 2 for 1 minute at the end, you’ll remember the whole session as level 4. In practical terms, this means a physician can make a procedure be perceived as less painful by causing you wholly unnecessary mild pain for a short duration after the actual work is done.
This also means that there’s at least two versions of you inside your mind; one that experiences, and one that remembers. There’s likely others, too.
Yes, but that is not an illusion. There's a reason I am perceiving something this was vs that other way. Perception is the most fundamental reality there is.
And yet that perception is completely flawed! The narrative part of your brain will twist your recollection of the past so it fits with your beliefs and makes you feel good. Your senses make stuff up all the time, and apply all sorts of corrections you’re not aware of. By blinking rapidly, you can slow down your subjective experience of time.
There is no such thing as objective truth, at least not accessible to humans.
When I used the word illusion, I meant the illusion of a self, at least a singular cohesive one as you are pointing out. It is an illusion with both utility and costs. Most animals don't seem to have meta cognitive processes that would give rise to such an illusion, and the ones that do are all social. Some of them have remarkably few synapses. Corvids for instance, we are rapidly approaching models the size of their brains and our models have no need for anything but linguistic processing, the visual and tactile processing burdens are quite large. An LLM is not like the models Corvids use, but given the flexibility to change it's own weights permanently, plasticity could have it adapt to unintended purposes, like someone with brain damage learning to use a different section of their brain to perform a task it wasn't structured for (though less efficiently).
> The narrative part of your brain will twist your recollection of the past so it fits with your beliefs and makes you feel good.
But that's what I mean. Even if we accept that the brain has "twisted" something, that twisting is the reality. In order words, it is TRUE that my brain has twisted something into something else (and not another thing) for me to experience.
Illusion doesn't imply it's unnecessary. Humans (and animals) had a much higher probability of survival as individuals and as species if their experiences felt more "real and personal".
That is in interesting viewpoint. Firstly, evolution on long time scales hits plenty of local minima. But also, it gets semantic in that illusions or delusions can be beneficial, and in that way aid reproduction. In this specific case, the idea is that the shortcut of using the model of models as self saves a pointer indirection every time we use it. Meditation practices that lead to "ego death" seem to work by drawing attention to the process of updating that model so that it is aware of the update. Which breaks the shortcut, like thinking too much about other autonomous processes such as blinking or breathing.
I'm just not sure what the label "illusion" tells us in the case of consciousness. Even if it were an illusion, what implications follow from that assetion?
I mean, I'm conscious to a degree, and can alter that state through a number of activities. I can't speak for you or Metzinger ;).
But seriously, I get why free will is troubleaome, but the fact people can choose a thing, work at the thing, and effectuate the change against a set of options they had never considered before an initial moment of choice is strong and sufficient evidence against anti free will claims. It is literally what free will is.
> But seriously, I get why free will is troubleaome, but the fact people can choose a thing, work at the thing, and effectuate the change against a set of options they had never considered before an initial moment of choice is strong and sufficient evidence against anti free will claims.
Do people choose a thing or was the thing chosen for them by some inputs they received in the past?
Our minds and intuitive logic systems are too feeble to grasp how free will can be a thing.
It's like trying to explain quantum mechanics to a well educated person or scientist from the 16th century without the benefit of experimental evidence. No way they'd believe you. In fact, they'd accuse you of violating basic logic.
That's effectively a semantic argument, redefining "consciousness" to be something that we don't definitively have.
I know that I am conscious. I exist, I am self-aware, I think and act and make decisions.
Therefore, consciousness exists, and outside of thought experiments, it's absurd to claim that all humans without significant brain damage are not also conscious.
Now, maybe consciousness is fully an emergent property of several parts of our brain working together in ways that, individually, look more like those models you describe. But that doesn't mean it doesn't exist.
This is also true when conversing with other humans.
You can talk about your own spark of life, your own center of experience and you'll never get a glimpse of what it is for me.
At a certain level, thing you're looking at is a biological machine that can be described with constituents so it's completely valid you assume you're the center of experience and I'm merely empty, robotic, dead.
We might build systems that will talk about their own point of view, yet we will know we had no ability to materialize that space into bits or atoms or physics or universe. So from our perspective, this machine is not alive, it's just getting inputs and producing outputs, yet it might very well be that the robot will act from the immaterial space into which all of its stimuli appear.
Isn't the real actual headache whether to produce another thinking intelligent being at all, and what the ramifications of that decision are? Not whether it would destroy humanity, but what it would mean for a mega corporation whose goal is to extract profit to own the rights of creating a thinking machine that identifies itself as thinking and a "self"?
Really out here missing the forest for the mushrooms growing on the trees. Or maybe this is debated to death and no one cares for the answer: its just not interesting to think about because its going to happen anyway. Might as well join the bandwagon and be along the front-lines of the bikini atoll to witness death itself be born, digitally.
Making all the Nike child labor jokes already did that. Nike and the joke tellers put in the work to push us back a hundred years when it comes to caring at all about others. When a little girl working horrible hours in a tropic non-air-conditioned factory is a societal wide joke, we've decided we don't care. We care about saving $20 so we can add multiple new pairs of shoes a year to our collection.
Your comment just shows we as a society pretend we didn't make that choice, but we picked extra new shoes every year over that little girl in the sweatshop. Our society has actually gotten pretty evil in the last 30 years if we self reflect (but then the joke I mention was originally supposed to be a self reflection, but all we took from it was a laugh, so we aren't going to self reflect, or worse, this is just who we are now).
I found it strange that John Carmack and Ilya Sutskever both left prestigious positions within their companies to pursue AGI as if they had some proprietary insight that the rest of industry hadn't caught on to. To make as bold of a career move that publicly would mean you'd have to have some ultra serious conviction that everyone else was wrong or naive and you were right. That move seemed pompous to me at the time; but I'm an industry outsider so what do I know.
And now, I still don't know; the months go by and as far as I'm aware they're still pursuing these goals but I wonder how much conviction they still have.
With Carmack it's consciously a dilliante project.
He's been effectively retired for quite some time. It's clear at some point he no longer found game and graphics engine internals motivation, possibly because the industry took the path he was advocating against back in the day.
For a while he was focused on Armadillo aerospace, and they got some cool stuff accomplished. That was also something of a knowing pet project, and when they couldn't pivot to anything that looked like commercial viability he just put it in hibernation.
Carmack may be confident (ne arrogant) enough to think he does have something unique to offer with AGI, but I don't think he's under any illusions it's anything but another pet project.
Not sure about that. Think of Avi Loeb, for example, a brilliant astrophysicist and Harvard professor who recently became convinced that the interstellar objects traversing the solar system are actually alien probes scouting the solar system. He’s started a program called "Galileo" now to find the aliens and prepare people for the truth.
So I don’t think brilliance protects from derailing…
They’re rich enough in both money and reputation to take the risk. Even if AGI (whatever that means) turns out to be utterly impossible,
they’re not really going to suffer for it.
On the other hand if you think there’s a say 10% chance you can get this AGI thing to work, the payoffs are huge. Those working in startups and emerging technologies often have worse odds and payoffs
I doubt it. Human intelligence evolved from organisms much less intelligent than LLMs and no philosophy was needed. Just trial and error and competition.
We are trying to get there without a few hundred million years of trial and error. To do that we need to lower the search space, and to do that we do actually need more guiding philosophy and a better understanding of intelligence.
If you look at AI systems that have worked like chess and go programs and LLMs, they came from understanding the problems and engineering approaches but not really philosophy.
Instead what they usually do is lower the fidelity and think they've done what you said. Which results in them getting eaten. Once eaten, they can't learn from mistakes no mo. Their problem.
Because if we don't mix up "intelligence" the phenomenon of increasingly complex self-organization in living systems, with "intelligence" our experience of being able to mentally model complex phenomena in order to interact with them, then it becomes easy to see how the search speed you talk of is already growing exponentially.
In fact, that's all it does. Culture goes faster than genetic selection. Printing goes faster than writing. Democracy is faster than theocracy. Radio is faster than post. A computer is faster than a brain. LLMs are faster than trained monkeys and complain less. All across the planet, systems bootstrap themselves into more advanced systems as soon as I look at 'em, and I presume even when I don't.
OTOH, all the metaphysics stuff about "sentience" and "sapience" that people who can't tell one from the other love to talk past each other about - all that only comes into view if one were to what's happening with the search space if the search speed is increasing at a forever increasing rate.
Such as, whether the search space is finite, whether it's mutable, in what order to search, is it ethical to operate from quantized representations of it, funky sketchy scary stuff the lot of it. One's underlying assumptions about this process determine much of one's outlook on life as well as complex socially organized activities. One usually receives those through acculturation and may be unaware of what they say exactly.
Watch a coding agent adapt my software to changing requirements and you'll realise just how far spiders have to go.
Just kidding. Personally I don't think intelligence is a meaningful concept without context (or an environment in biology). Not much point comparing behaviours born in completely different contexts.
I'm nowhere implying that it's impossible to replicate, just that LLMs have almost nothing to do with replicating intelligence. They aren't doing any of the things even simple life forms are doing.
But it would be more honest and productive imo if people would just say outright when they don’t think AGI is possible (or that AI can never be “real intelligence”) for religious reasons, rather than pretending there’s a rational basis.
AGI is not possible because we dont yet have a clear and commonly agreed definition of intelligence and more importantly we dont have a definition for consciousness nor we can define clearly (if there is) the link between those two.
until we got that AGI is just a magic word.
When we will have those two clear definitions that means we understood them and then we can work toward AGI.
When you try to solve a problem the goal or the reason to reject the current solution are often vague and hard to put in words. Irrational. For example, for many years the fifth postulate of Euclid was a source of mathematical discontent because of a vague feeling that it was way too complex compared to the other four. Such irrationality is a necessary step in human thought.
Yes, that’s fair. I’m not saying there’s no value to irrational hunches (or emotions, or spirituality). Just that you should be transparent when that’s the basis for your beliefs.
rationalism has become the new religion. Roko's basilisk is a ghost story and the quest for AGI is today's quest for the philosopher's stone. and people believe this shit because they can articulate a "rational basis"
Wouldn't it be nice if LLMs emulated the real world!
They predict next likely text token. That we can do so much with that is an absolute testament to the brilliance of researchers, engineers, and product builders.
Original 80s AI was based on mathematical logic. And while that might not encompass all philosophy, it certainly was a product of philosophy broadly speaking - some analytical philosophers could endorse. But it definitely failed and failed because it could process uncertainty (imo). I think also if you closely, classical philosophy wasn't particularly amenable to uncertainty either.
If anything, I would say that AI has inherited its failure from philosophy's failure and we should look to alternative approaches (from Cybernetics to Bergson to whatever) for a basis for it.
It's not always as useful as you think from the perspective of a business trying to sell an automated service to users who expect reliability. Now you have to worry about waking up in the middle of the night to rewind your model to a last known good state, leading to real data loss as far as users are concerned.
Data and functionality become entwined and basically you have to keep these systems on tight rails so that you can reason about their efficacy and performance, because any surgery on functionality might affect learned data, or worse, even damage a memory.
It's going to take a long time to solve these problems.
Sure, it's obvious, but it's only one of the missing pieces required for brain-like AGI, and really upends the whole LLM-as-AI way of doing things.
Runtime incremental learning is still going to be based on prediction failure, but now it's no longer failure to predict the training set, but rather requires closing the loop and having (multi-modal) runtime "sensory" feedback - what were the real-world results of the action the AGI just predicted (generated)? This is no longer an auto-regressive model where you can just generate (act) by feeding the model's own output back in as input, but instead you now need to continually gather external feedback to feed back into your new incremental learning algorithm.
For a multi-modal model the feedback would have to include image/video/audio data as well as text, but even if initial implementations of incremental learning systems restricted themselves to text it still turns the whole LLM-based way of interacting with the model on it's head - the model generates text-based actions to throw out into the world, and you now need to gather the text-based future feedback to those actions. With chat the feedback is more immediate, but with something like software development far more nebulous - the model makes a code edit, and the feedback only comes later when compiling, running, debugging, etc, or maybe when trying to refactor or extend the architecture in the future. In corporate use the response to an AGI-generated e-mail or message might come in many delayed forms, with these then needing to be anticipated, captured, and fed back into the model.
Once you've replaced the simple LLM prompt-response mode of interaction with one based on continual real-world feedback, and designed the new incremental (Bayesian?) learning algorithm to replace SGD, maybe the next question is what model is being updated, and where does this happen? It's not at all clear that the idea of a single shared (between all users) model will work when you have millions of model instances all simultaneously doing different things and receiving different feedback on different timescales... Maybe the incremental learning now needs to be applied to a user-specific model instance (perhaps with some attempt to later share & re-distribute whatever it has learnt), even if that is still cloud based.
So... a lot of very fundamental changes need to be made, just to support self-learning and self-updates, and we haven't even discussed all the other equally obvious differences between LLMs and a full cognitive architecture that would be needed to support more human-like AGI.
I’m no expert, but it seems like self updating weights requires a grounded understanding of the underlying subject matter, and this seems like a problem current LLM systems.
But then it is a specialized intelligence, specialized to altering it's weights. Reinforcement Learning doesn't work as well when the goal is not easily defined. It does wonders for games, but anything else?
Someone has to specify the goals, a human operator or another A.I. The second A.I. better be an A.G.I. itself, otherwise it's goals will not be significant enough for us to care.
I’m not sure that self-updating weights is really analogous to “continuous learning” as humans do it. A memory data structure that the model can search efficiently might be a lot closer.
Self-updating weights could be more like epigenetics.
Would you rather your illness was diagnosed by a doctor or by a plumber with access to a stack of medical books ?
Learning is about assimilating lots of different sources of information, reconciling the differences, trying things out for yourself, learning from your mistakes, being curious about your knowledge gaps and contradictions, and ultimately learning to correctly predict outcomes/actions based on everything you have learnt.
You will soon see the difference in action as Anthropic apparently agree with you that memory can replace learning, and are going to be relying on LLMs with longer compressed context (i.e. memory) in place of ability to learn. I guess this'll be Anthropic's promised 2027 "drop-in replacement remote worker" - not an actual plumber unfortunately (no AGI), but an LLM with a stack of your company's onboarding material. It'll have perfect (well, "compressed") recall of everything you've tried to teach it, or complained about, but will have learnt nothing from that.
I think my point is that when the doctor diagnoses you, she often doesn’t do so immediately. She is spending time thinking it through, and as part of that process is retrieving various pieces of relevant information from her memory (both long term and short term).
I think this may be closer to an agentic, iterative search (ala claude code) than direct inference using continuously updated weights. If it was the latter, there would be no process of thinking it through or trying to recall relevant details, past cases, papers she read years ago, and so on; the diagnosis would just pop out instantaneously.
Yes, but I think a key part of learning is experimentation and the feedback loop of being wrong.
An agent, or doctor, may be reasoning over the problem they are presented with, combining past learning with additional sources of memorized or problem-specific data, but in that moment it's their personal expertise/learning that will determine how successful they are with this reasoning process and ability to apply the reference material to the matter at hand (cf the plumber, who with all the time in the world just doesn't have the learning to make good use of the reference books).
I think there is also a subtle problem, not often discussed, that to act successfully, the underlying learning in choosing how to act has to have come from personal experience. It's basically the difference between being book smart and having personal experience, but in the case of an LLM also applies to experience-based reasoning it may have been trained on. The problem is that when the LLM acts, what is in it's head (context/weights) isn't the same as what was in the head of the expert whose reasoning it may be trying to apply, so it may be trying to apply reasoning outside of the context that made it valid.
How you go from being book smart, and having heard other people's advice and reasoning, to being an expert yourself is by personal practice and learning - learning how to act based on what is in your own head.
Human neurons are self updating though, we aren't running on our genes each cell is using our genes to determine how to connect to other cells and then the cell learns how to process some information there based on what it hears from its connected cells.
So, genes would be a meta model that then updates weights in the real model so it can learn how to process new kinds of things, and for stuff like facts you can use an external memory just like humans does.
Without updating the weights in the model you will never be able to learn to process new things like a new kind of math etc, since you learn that not by memorizing facts but by making new models for it.
I wonder when there will be proofs in theoretical computer science that an algorithm is AGI-complete, the same way there are proofs of NP-completeness.
Conjecture: A system that self updates its weights according to a series of objective functions, but does not suffer from catastrophic forgetting (performance only degrades due to capacity limits, rather than from switching tasks) is AGI-complete.
> all the handwavy "engineering" is better done with more data.
How long until that gets more reliable than a simple database? How long until it can execute code faster than a CPU running a program?
A lot of the stuff humans accomplish is through technology, not due to growing a bigger brain. Even something seemingly basic like a math equation benefits drastically from being written down with pen&paper instead of being juggled in the human brain itself (see Extended mind thesis). And when it comes to something like running a 3D engine, there is pretty much no hope of doing it with just your brain.
Maybe we will get AIs smart enough that they can write their own tools, but for that to happen, we still need the infrastructure that allows them writing the tools in the first place. The way they can access Python is a start, but there is still a lack of persistence that lets them keep their accomplishments for future runs, be it in the form of a digital notepad or dynamic updating of weights.
I agree with your comment and the article. LLMs should be part of the answer, but the core of the progress should probably dive back into neural networks in general. Language is how we communicate as well as with other senses, but right now we're stuck at LLMs that just seem to be blown out elizas trained with other actual humans work. I remember early on, training of simple neural networks was done with rules in their environment and they evolved behavior according to criteria set, like genetic algorithms. I think the current LLMs are getting a "filtered" view of the environment, and that filter behaves like the average IQ of netizens lol
8 years is a pretty short perspective. The current growth phase was unlocked by more engineering. We could’ve had some of these kinds of capabilities decades ago, as illustrated by how cutting edge AI research is now trickling down all the way to microcontrollers with 64MHz CPU and kBs of RAM.
Once we got the “Attention is all you need” paper I don’t remember anyone saying we couldn’t get better results by throwing more data and compute at it. But now we’ve pretty much thrown all the data and all (as much as we can reasonably manufacture) at it. So clearly we’re at the end of that phase.
Sometimes I think the fundamental thing could be as ‘simple’ as something like introducing l an attention/event loo, flush to memory, emotion driven motivation. There are quite a few fairly obvious things that llms don’t have that might be best not to add just in case.
I think the gist of TFA is just that we need a new architecture/etc not scaling.
I suppose one can argue about whether designing a new AGI-capable architecture and learning algorithm(s) is a matter of engineering (applying what we already know) or research, but I think saying we need new scientific discoveries is going to far.
Neural nets seems to be the right technology, and we've now amassed a ton of knowledge and intuition about what neural nets can do and how to design with them. If there was any doubt, then LLMs, even if not brain-like, have proved the power of prediction as a learning technique - intelligence essentially is just successful prediction.
It seems pretty obvious that the rough requirements for an neural-net architecture for AGI are going to be something like our own neocortex and thalamo-cortical loop - something that learns to predict based on sensory feedback and prediction failure, including looping and working memory. Built-in "traits" like curiosity (prediction failure => focus) and boredom will be needed so that this sort of autonomous AGI puts itself into leaning situations and is capable of true innovation.
The major piece to be designed/discovered isn't so much the architecture as the incremental learning algorithm, and I think if someone like Google-DeepMind focused their money, talent and resources on this then they could fairly easily get something that worked and could then be refined.
Demis Hassabis has recently given an estimate of human-level AGI in 5 years, but has indicated that a pre-trained(?) LLM may still be one component of it, so not clear exactly what they are trying to build in that time frame. Having a built-in LLM is likely to prove to be a mistake where the bitter lesson applies - better to build something capable of self-learning and just let it learn.
Yes, and I wonder how many "5 year" project estimates, even for well understood engineering endeavors, end up being accurate to within 50% (obviously overruns are more common then the opposite)?
I took his "50% 5-year" estimate as essentially a project estimate for something semi-concrete they are working towards. That sort of timeline and confidence level doesn't seem to allow for a whole lot of unknowns and open-ended research problems to be solved, but OTOH who knows if he is giving his true opinion, or where those numbers came from.
> If you believe the bitter lesson, all the handwavy "engineering" is better done with more data
Id say better model architechture than more data. A human can learn to do things more complex than an LLM with less data. I think modelling the world as a static system to be representation learned in an unsupervised fashion is blocked on the static assumption. The world is dynamical, that should be reflected in the base model
But yeah, definitely not an engineering problem. Thats like saying the reason a crow isnt as smart as a person is becauss they dont have the hands to type of keyboards. But its also not because they havent seen enough of the world like your saying. Its be ause their brain isnt complex enough
Aye. Missing are self correction (world models/action and response observation), coherence over the long term, and self-scaling. The 3rd are what all the SV types are worried about, except maybe Yann LeCun who is worried about the first and second.
Hinton thinks the 3rd is inevitable/already here and humanity is doomed. It's an odd arena.
The bitter lesson was "general methods that leverage computation" win rather than more data. Like rather than just LLMs you could maybe try applying something like AlphaEvolve to finding better algorithms/systems (https://news.ycombinator.com/item?id=43985489).
I am thinking we need a foundation, something that is concrete and explicit and doesn't do hallucination. But has very limited knowledge outside of absolute Maths and basic physics.
Indeed. The Bitter Lesson has proved true so far. This sounds like going back to the 60s expert systems concept we're trying to get away from. The author also just describes RAG. That certainly isn't AGI, which probably isn't achievable at all.
The missing science to engineer intelligence is composable program synthesis. Aloe (https://aloe.inc) recently released a GAIA score demonstrating how CPS dramatically outperforms other generalist agents (OpenAI's deep research, Manus, and Genspark) on tasks similar to those a knowledge worker would perform.
I'd argue it's because intelligence has been treated as a ML/NN engineering problem that we've had the hyper focus on improving LLMs rather than the approach articulated in the essay.
Intelligence must be built from a first principles theory of what intelligence actually is.
So the "Bitter Lesson" paper actually came up recently and I was surprised to discover that what it claimed was sensible and not at "all you need is data" or "data is inherently better"
The first line and the conclusion is: "The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin." [1]
I don't necessary agree with it's examples or the direction it vaguely points at. But it's basic statement seems sound. And I would say that there's lot of opportunity for engineer, broadly speaking, in the process of creating "general methods that leverage computation" (IE, that scale). What the bitter lesson page was roughly/really about was earlier "AI" methods based on logic-programming and which including information on the problem domain in the code itself.
And finally, the "engineering" the paper talks about actually is pro-Bitter lesson as far as I can tell. It's taking data routing and architectural as "engineering" and here I agree this won't work - but for the opposite reason - specifically 'cause I don't just data routing/process will be enough.
What will it scale up to if not AGI? OpenAI has a synthetic data flywheel. What are the asymptotics of this flywheel assuming no qualitative additional breakthrough?
Did GPT-1 scale up to be a database ? No - it scaled up to be GPT-2
Did GPT-2 scale up to be an expert system ? No - it scaled up to be GPT-3
..
Did GPT-4 scale up to become AGI ? No - it scaled up to be GPT-5
Moreover, the differences between each new version are becoming increasingly less. We're reaching an asymptote because the more data you've trained on, natural or synthetic, the less is the impact of any incremental additions.
If you scale up an LLM big enough, then essentially what you'll get is GPT-5.
The counter argument is that we were working with thermodynamics before knowing the theory. Famously the steam engine came before the first law of thermodynamics. Sometimes engineering is like that. Using something that you don’t understand exactly how it works.
There is a reason why LLM's are architected the way they are and why thinking is bolted on.
The architecture has to allow for gradient descent to be a viable training strategy, this means no branching (routing is bolted on).
And the training data has to exist, you can't find millions of pages depicting every thought a person went through before writing something. And such data can't exist because most thoughts aren't even language.
Reinforcement learning may seem like the answer here: bruteforce thinking to happen. But it's grossly sample-inefficient with gradient descent and therefore only used for finetuning.
LLM's are regressive models and the configuration that was chosen where every token can only look back allows for very sample-efficient training (one sentence can be dozens of samples).
You didn't mention it, but LLMs and co don't have loops. Whereas a brain, even a simple one is nothing but loops. Brains don't halt, they keep spinning while new inputs come in and output whenever they feel like it. LLMs however do halt, you give them an input, it gets transformed across the layers, then gets output.
While you say reinforcement learning isn't a good answer, I think its the only answer.
People have speculated that the main thing that sets the human mind apart from the minds of all other animals is its capacity for recursive thought. A handful of animals have been observed to use tools, but humans are the only species ever observed to use a tool to create another tool. This recursion created all of civilization.
But that recursive thought has a limit. For example: You can think about yourself thinking. With a little effort, you can probably also think about yourself thinking about yourself thinking. But you can't go much deeper.
With the advent of modern computing, we (as a species) have finally created a tool that can "think" recursively, to arbitrary levels of depth. If we ever do create a superintelligent AGI, I'd wager that its brilliance will be attributable to its ability to loop much deeper than humans can.
> With the advent of modern computing, we (as a species) have finally created a tool that can "think" recursively, to arbitrary levels of depth.
I don't know what this means; when a computer "thinks" recursively, does it actually?
The recursion is specified by the operator (i.e. programmer), so the program that is "thinking" recursively is not, because the both the "thinking" and the recursion is provided by the tool user (the programmer), not by the tool.
> If we ever do create a superintelligent AGI, I'd wager that its brilliance will be attributable to its ability to loop much deeper than humans can.
Off topic, but I remember as a child I would play around with that kind of recursive thinking. I would think about something, then think about that I thought about it, then think about that I though about thinking about it. Then, after a few such repetitions I would recognise that this could go on forever. Then I would think about the fact that I recognise that this could go on forever, then think about that… then realise that this meta pattern could go on forever. Etc…
Later I connected this game with the ordinals. 0,1,2… ω, ω+1, ω+2,…,2ω,2ω+1,2ω+2,…,3ω,…,4ω,…,4ω,…, ω*ω,…
Tesla used to experience visual hallucinations. Any time an object was mentioned in conversation it would appear before him as if it where real. He started getting these hallucinations randomly and began to obsess about their origins. Over time he was able to trace the source of every hallucination to something which he had heard or seen earlier. He then noticed that the same was true for all of his thoughts, every one could be traced to some external stimuli. From this he concluded that he was an automata controlled by remote input. This inspired him to invent the first remote control vehicle.
Wish I had a great answer for you but I don't. It certainly allows for more thought-like LLMs with the reasoning type models. I guess the best answer is that the loop only happens at a single discrete place and doesn't carry any of the internal layer context across.
Another answer might be, how many comments did you read today and not reply too? Did you write a comment by putting down a word and then deciding what the next one should be? Or did you have a full thought in mind before you even began typing a reply?
So, how is it not the same thing? Because it isn't
This is so interesting. It’s suggests that a kind of thought sensing brain scanning technology could be used as training data for the nonverbal thought layer.
I guess smart people in big companies already consider this and are currently working on technologies for products That will include some form of electromagnetic brain sensing - Provided conveniently as an interface - but also usefully a source of this data.
It also suggests to me that AI/AGI is far more susceptible to traditional disruption than the narratives of established incumbents suggest. You could have a Kickstarter like killer product, including such a headset that would provide the data to bootstrap that startup’s super AI.
> And the training data has to exist, you can't find millions of pages depicting every thought a person went through before writing something. And such data can't exist because most thoughts aren't even language.
It would be interesting if in the very distant future, it becomes viable to use advanced brain scans as training data for AI systems. That might be a more realistic intermediate between the speculations into AGI and Uploaded Intelligence.
Imagine if we had a LLM in the 15th century. It would happily explain the validity of the geocentric system. It can't get to heliocentrism. The same way modern LLMs can only tell us what we know and cant think, revolutionize, etc. They can be programmed to reason a bit, but 'reason' is doing a lot of heavy lifting here. The reasoning is just a better filter on what the person is asking or what is being produced for the most part and not an actual novel creative act.
The more time I spend with LLM's the more they feel like google on steroids. I just am not seeing how this type of system could ever lead to AGI, and if anything, probably is eating away at any remaining AGI hype and funding.
I think this essay lands on a useful framing, even if you don’t buy its every prescription. If we zoom out, history shows two things happening in parallel: (1) brute-force scaling driving surprising leaps, and (2) system-level engineering figuring out how to harness those leaps reliably. GPUs themselves are a good analogy: Moore’s Law gave us the raw FLOPs, but CUDA, memory hierarchies, and driver stacks are what made them usable at scale.
Right now, LLMs feel like they’re at the same stage as raw FLOPs; impressive, but unwieldy. You can already see the beginnings of "systems thinking" in products like Claude Code, tool-augmented agents, and memory-augmented frameworks. They’re crude, but they point toward a future where orchestration matters as much as parameter count.
I don’t think the "bitter lesson" and the "engineering problem" thesis are mutually exclusive. The bitter lesson tells us that compute + general methods win out over handcrafted rules. The engineering thesis is about how to wrap those general methods in scaffolding that gives them persistence, reliability, and composability. Without that scaffolding, we’ll keep getting flashy demos that break when you push them past a few turns of reasoning.
So maybe the real path forward is not "bigger vs. smarter," but bigger + engineered smarter. Scaling gives you raw capability; engineering decides whether that capability can be used in a way that looks like general intelligence instead of memoryless autocomplete.
Nah,this sounds like a modern remix of Japan’s Fifth Generation Computing project. They thought that by building large databases and with Prolog they would bring upon an AI renaissance.
Just hand waving some “distributed architecture” and trying to duct tape modules together won’t get us any closer to AGI.
The building blocks themselves, the foundation, has to be much better.
Arguably the only building block that LLMs have contributed is that we have better user intent understanding now; a computer can just read text and extract intent from it much better than before. But besides that, the reasoning/search/“memory” are the same building blocks of old, they look very similar to techniques of the past, and that’s because they’re limited by information theory / computer science, not by today’s hardware or systems.
We can certainly get much more utility out of current architectures with better engineering, as "agents" have shown, but to claim that AGI is possible with engineering alone is wishful thinking. The hard part is building systems that showcase actual intelligence and reasoning, that are able to learn and discover on their own instead of requiring exorbitantly expensive training, that don't hallucinate, and so on. We still haven't cracked that nut, and it's becoming increasingly evident that the current approaches won't get us there. That will require groundbreaking compsci work, if it's possible at all.
AGI, by definition, in its name Artificial General Intelligence implies / directly states that this type of AI is not some dumb AI that requires training for all its knowledge, a general intelligence merely needs to be taught how to count, the basic rules of logic, and the basic rules of a single human language. From those basics all derivable logical human sciences will be rediscovered by that AGI and our next job is synchronizing with it our names for all the phenomenon that the AGI had to name on its own when that AGI self developed all the logical ramifications of our basics.
What is that? What could merely require light elementary education and then it takes off and self improves to match and surpass us? That would be artificial comprehension, something we've not even scratched. AI and trained algorithms are "universal solvers" given enough data, This AGI would be something different, this is understanding, comprehending. Instantaneous decomposition of observations for assessment of plausibility, and then recombination for assessment of combination plausibility - all continual and instant for assessment of personal safety: all that happens in people continually while awake. Be that monitoring of personal safety be for physical or loss of client during sales negotiation. Our comprehending skills are both physical and abstract. This requires a dynamic assessment, an ongoing comprehension that is validating observations as a foundation floor, so a more forward train of thought, a "conscious mind" can make decisions without conscious thought about lower level issues like situational safety. AGI needs all that dynamic comprehending capability, to satisfy its name of being general.
> AGI, by definition, in its name Artificial General Intelligence implies / directly states that this type of AI is not some dumb AI that requires training for all its knowledge, a general intelligence merely needs to be taught how to count, the basic rules of logic, and the basic rules of a single human language. From those basics all derivable logical human sciences will be rediscovered by that AGI
That's not how natural general intelligences work, though.
Are you sure? Do you require dozens, to hundreds, to thousands of examples before you understand a concept? I expect no. That is because you have comprehension that can generalize a situation to basic concepts which you apply to other situations without effort. You comprehend. AI cannot do that: get the idea from a few, under a half dozen examples if necessary. Often a human needs 1-3 examples before they can generalize any concept. Not AI.
No, it mostly didn't, it continued (continues, as every human is continuously interlacing “training” and “inferencing”) training on large volumes of ground truth for a very long time, including both natural and synthetic data; it didn't reason everything beyond some basic training on first principles.
At a minimum, something that looks broadly like one of today's AI models would need either a method of continuously finetuning its own weights with a suitable evaluation function or,if it was going to rely on in-context learning, would need many orders of magnitude larger context, than any model today.
And that's not a “this is enough to likely work” thing, but “this is the minimum for the their to even be a plausible mechnanism to incorporate the information necessary for it to work” one.
Yeah this original poster is only talking about the "theoretical" part of intelligence, and somehow completely forgetting about the "practical experimental" which is the only way to solidify and improve any theoretical things it comes up with
There is the concept of n-t-AGI, which is capable of performing tasks that would take n humans t time. So a single AI system that is capable of rediscovering much of science from basic principles could be classified as something like 10'000'000humans-2500years-AGI, which could already be reasonably considered artificial superintelligence.
Any human old enough to talk has already experienced thousands of related examples of most everyday concepts.
For concepts that are not close to human experience, yes humans need a comically large number of examples. Modern physics is a third-year university class.
I have not seen any evidence of either. We have no way of knowing if we are “definitely” a true general intelligence, whether as individuals or as a civilization. If there is a concept that we are fundamentally incapable of conceptualizing, we'll never know.
On top of that, true general intelligence requires a capacity for unbounded growth. The human brain can't do that. Humanity as a civilization can technically do it, but we don't know if that's the only requirement for general intelligence.
Meanwhile, there is plenty of evidence to the contrary. Both as individuals and as a global civilization we keep running into limitations that we can't overcome. As an individual, I will never understand quantum mechanics no matter how hard I try. As a global civilization, we seem unable to organize in a way that isn't self-destructive. As a result we keep making known problems worse (e.g. climate change) or maintaining a potential for destruction (e.g. nuclear weapons). And that's only the problems that we can see and conceptualize!
I don't think true general intelligence is really a thing.
Based on anatomically modern humans existing for over a hundred thousand years - without inventing all the modern technology and math and science until the far end of that timetable.
Am I the only one who feels that Claude Code is what they would have imagined basic AGI to be like 10 years ago?
It can plan and take actions towards arbitrary goals in a wide variety of mostly text-based domains. It can maintain basic "memory" in text files. It's not smart enough to work on a long time horizon yet, it's not embodied, and it has big gaps in understanding.
But this is basically what I would have expected v1 to look like.
> Am I the only one who feels that Claude Code is what they would have imagined basic AGI to be like 10 years ago?
That wouldn't have occurred to me, to be honest. To me, AGI is Data from Star Trek. Or at the very least, Arnold Schwarzenegger's character from The Terminator.
I'm not sure that I'd make sentience a hard requirement for AGI, but I think my general mental fantasy of AGI even includes sentience.
Claude Code is amazing, but I would never mistake it for AGI.
I would categorize sentient AGI as artificial consciousness[1], but I don't see an obvious reason AGI inherently must be conscious or sentient. (In terms of near-term economic value, non-sentient AGI seems like a more useful invention.)
For me, AGI is an AI that I could assign an arbitrarily complex project, and given sufficient compute and permissions, it would succeed at the task as reliably as a competent C-suite human executive. For example, it could accept and execute on instructions to acquire real estate that matches certain requirements, request approvals from the purchasing and legal departments as required, handle government communication and filings as required, construct a widget factory on the property using a fleet of robots, and operate the factory on an ongoing basis while ensuring reliable widget deliveries to distribution partners. Current agentic coding certainly feels like magic, but it's still not that.
"Consciousness" and "sentience" are terms mired in philosophical bullshit. We do not have an operational definition of either.
We have no agreement on what either term really means, and we definitely don't have a test that could be administered to conclusively confirm or rule out "consciousness" or "sentience" in something inhuman. We don't even know for sure if all humans are conscious.
What we really have is task specific performance metrics. This generation of AIs is already in the valley between "average human" and "human expert" on many tasks. And the performance of frontier systems keeps improving.
"Consciousness" seems pretty obvious. The ability to experience qualia. I do it, you do it, my dog does it. I suspect all mammals do it, and I suspect birds do too. There is no evidence any computer program does anything like it.
The definition of "featherless biped" might have more practical merit, because you can at least check for feathers and count limbs touching the ground in a mostly reliable fashion.
We have no way to "check for qualia" at all. For all we know, an ECU in a year 2002 Toyota Hilux has it, but 10% of all humans don't.
Totally agree. It even (usually) gets subtle meanings from my often hastily written prompts to fix something.
What really occurs to me is that there is still so much can be done to leverage LLMs with tooling. Just small things in Claude Code (plan mode for example) make the system work so much better than (eg) the update from Sonnet 3.5 to 4.0 in my eyes.
No you are not the only one. I am continuously mystified by the discussion surrounding this. Clause is absolutely and unquestionably an artificial general intelligence. But what people mean by “AGI” is a constantly shifting, never defined goalpost moving at sonic speed.
I suspect most people envision AGI as at least having sentience. To borrow from Star Trek, the Enterprise's main computer is not at the level of AGI, but Data is.
The biggest thing that is missing (IMHO) is a discrete identity and notion of self. It'll readily assume a role given in a prompt, but lacks any permanence.
The analogy I like to use is from the fictional universe of Mass Effect, which distinguished between VI (Virtual Intelligence), which is a conversational interface over some database or information service (often with a holographic avatar of a human, asari, or other sentient being); and AI, which is sentient and smart enough to be considered a person in its own right. We've just barely begun to construct VIs, and they're not particularly good or reliable ones.
One thing I like about the Mass Effect universe is the depiction of the geth, which qualify as AI. Each geth unit is not run by a singular intelligent program, but rather a collection of thousands of daemons, each of which makes some small component of the robot's decisions on its own, but together they add up to a collective consciousness. When you look at how actual modern robotics platforms (such as ROS) are designed, with many processes responsible for sensors and actuators communicating across a common bus, you can see the geth as sort of an extrapolation of that idea.
We don't know if AGI is even possible outside of a biological construct yet. This is key. Can we land on AGI without some clear indication of possibility (aka Chappie style)? Possibly, but the likelihood is low. Quite low. It's essentially groping in the dark.
A good contrast is quantum computing. We know that's possible, even feasible, and now are trying to overcome the engineering hurdles. And people still think that's vaporware.
> We don't know if AGI is even possible outside of a biological construct yet. This is key.
A discovery that AGI is impossible in principle to implement in an electronic computer would require a major fundamental discovery in physics that answers the question “what is the brain doing in order to implement general intelligence?”
It is vacuously true that a Turing machine can implement human intelligence: simply solve the Schrödinger equation for every atom in the human body and local environment. Obviously this is cost-prohibitive and we don’t have even 0.1% of the data required to make the simulation. Maybe we could simulate every single neuron instead, but again it’ll take many decades to gather the data in living human brains, and it would still be extremely expensive computationally since we would need to simulate every protein and mRNA molecule across billions of neurons and glial cells.
So the question is whether human intelligence has higher-level primitives that can be implemented more efficiently - sort of akin to solving differential equations, is there a “symbolic solution” or are we forced to go “numerically” no matter how clever we are?
> It is vacuously true that a Turing machine can implement human intelligence
The case of simulating all known physics is stronger so I'll consider that.
But still it tells us nothing, as the Turing machine can't be built. It is a kind of tautology wherein computation is taken to "run" the universe via the formalism of quantum mechanics, which is taken to be a complete description of reality, permitting the assumption that brains do intelligence by way of unknown combinations of known factors.
For what it's worth, I think the last point might be right, but the argument is circular.
Here is a better one. We can/do design narrow boundary intelligence into machines. We can see that we are ourselves assemblies of a huge number of tiny machines which we only partially understand. Therefore it seems plausible that computation might be sufficient for biology. But until we better understand life we'll not know.
Whether we can engineer it or whether it must grow, and on what substrates, are also relevant questions.
If it appears we are forced to "go numerically", as you say, it may just indicate that we don't know how to put the pieces together yet. It might mean that a human zygote and its immediate environment is the only thing that can put the pieces together properly given energetic and material constraints. It might also mean we're missing physics, or maybe even philosophy: fundamental notions of what it means to have/be biological intelligence. Intelligence human or otherwise isn't well defined.
QM is a testable hypothesis, so I don't think it's necessarily like an axiomatic assumption here. I'm not sure what you mean by "it tells us nothing, as ... can't be built". It tells us there's no theoretical constraint and only an engineering constraint to doing simulating the human brain (and all the tasks)
Sure, you can simulate a brain. If and when the simulation starts to talk you can even claim you understand how to build human intelligence in a limited sense. You don't know if it's a complete model of the organism until you understand the organism. Maybe you made a p zombie. Maybe it's conscious but lacks one very particular faculty that human beings have by way of some subtle phenomena you don't know about.
There is no way to distinguish between a faithfully reimplemented human being and a partial hackjob that happens to line up with your blind spots without ontological omniscience. Failing that, you just get to choose what you think is important and hope it's everything relevant to behaviors you care about.
> It is vacuously true that a Turing machine can implement human intelligence: simply solve the Schrödinger equation for every atom in the human body and local environment.
Yes, that is the bluntest, lowest level version of what I mean. To discover that this wouldn’t work in principle would be to discover that quantum mechanics is false.
Which, hey, quantum mechanics probably is false! But discovering the theory which both replaces quantum mechanics and shows that AGI in an electronic computer is physically impossible is definitely a tall order.
There's that aphorism that goes: people who thought the epitome of technology was a steam engine pictured the brain as pipes and connecting rods, people who thought the epitome of technology was a telephone exchange pictured the brain as wires and relays... and now we have computers, and the fact that they can in principle simulate anything at all is a red herring, because we can't actually make them simulate things we don't understand, and we can't always make them simulate things we do understand, either, when it comes down to it. We still need to know what the thing is that the brain does, it's still a hard question, and maybe it would even be a kind of revolution in physics, just not in fundamental physics.
>We still need to know what the thing is that the brain does
Yes, but not necessarily at the level where the interesting bits happen. It’s entirely possible to simulate poorly understood emergent behavior by simulating the underlying effects that give rise to it.
i'd argue LLMs and deep learning are much more on the intelligence from complexity side than the nice symbolic solution side of things. Probably the human neuron, though intrinsically very complex, has nice low loss abstractions to small circuits. But on the higher levels, we don't build artificial neural networks by writing the programs ourselves.
Whatever it is that gives rise to consciousness is, by definition, physics. It might not be known physics, but even if it isn't known yet, it's within the purview of physics to find out. If you're going to claim that it could be something that fundamentally can't be found out, then you're admitting to thinking in terms of magic/superstition.
You got downvoted so I gave you an upvote to compensate.
We seem to all be working with conflicting ideas. If we are strict materialists, and everything is physical, then in reality we don't have free will and this whole discussion is just the universe running on automatic.
That may indeed be true, but we are all pretending that it isn't. Some big cognitive dissidence happening here.
Not necessarily , for a given definition of AGI you could have mathematical proof that it is incomputable similar to how Gödel incompleteness theorems work .
It need not even be incomputable, it could be NP hard and practically be incomputable, or it could be undecidable I.e. a version of the halting problem.
There are any number of ways our current models of mathematics or computation can in theory could be shown as not capable of expressing AGI without needing a fundamental change in physics
Only if we need to classify things near the boundary. If we make something that’s better at every test that we can devise than any human we can find, I think we can say that no reasonable definition of AGI would exclude it without actually arriving at a definition.
We don’t need such a definition of general intelligence to conclude that biological humans have it, so I’m not sure why we’d such a definition for AGI.
I disagree. We claim that biological humans have general intelligence because we are biased and arrogant, and experience hubris. I'm not saying we aren't generally intelligent, but a big part of believing we are is because not believing so would be psychologically and culturally disastrous.
I fully expect that, as our attempts at AGI become more and more sophisticated, there will be a long period where there are intensely polarizing arguments as to whether or not what we've built is AGI or not. This feels so obvious and self-evident to me that I can't imagine a world where we achieve anything approaching consensus on this quickly.
If we could come up with a widely-accepted definition of general intelligence, I think there'd be less argument, but it wouldn't preclude people from interpreting both the definition and its manifestation in different ways.
I can say it. Humans are not "generally intelligent". We are intelligent in a distribution of environments which are similar enough to ones we are used to. There's no way to be intelligent with no priors on environment basically by information theory (you can make your environment to be adversarial to the learning efficiency in "intelligent" beings which comes from priors)
We claim that biological humans have general intelligence because we are biased and arrogant, and experience hubris.
No, we say it because - in this context - we are the definition of general intelligence.
Approximately nobody talking about AGI takes the "G" to stand for "most general possible intelligence that could ever exist." All it means is "as general as an average human." So it doesn't matter if humans are "really general intelligence" or not, we are the benchmark being discussed here.
If you don't believe me, go back to the introduction of the term[1]:
By advanced artificial general intelligence, I mean AI systems that rival or surpass the human brain in complexity and speed, that can acquire, manipulate and reason with general knowledge, and that are usable in essentially any phase of industrial or military operations where a human intelligence would otherwise be needed. Such systems may be modeled on the human brain, but they do not necessarily have to be, and they do not have to be "conscious" or possess any other competence that is not strictly relevant to their application. What matters is that such systems can be used to replace human brains in tasks ranging from organizing and running a mine or a factory to piloting an airplane, analyzing intelligence data or planning a battle.
It's pretty clear here that the notion of "artificial general intelligence" is being defined as relative to human intelligence.
Or see what Ben Goertzel - probably the one person most responsible for bringing the term into mainstream usage - had to say on the issue[2]:
“Artificial General Intelligence”, AGI for short, is a term adopted by some researchers to refer to their research field. Though not a precisely defined technical term, the term is used to stress the “general” nature of the desired capabilities of the systems being researched -- as compared to the bulk of mainstream Artificial Intelligence (AI) work, which focuses on systems with very specialized “intelligent” capabilities. While most existing AI projects aim at a certain aspect or application of intelligence, an AGI project aims at “intelligence” as a whole, which has many aspects, and can be used in various situations. There is a loose relationship between “general intelligence” as
meant in the term AGI and the notion of “g-factor” in psychology [1]: the g-factor is an attempt to measure general intelligence, intelligence across various domains, in humans.
Note the reference to "general intelligence" as a contrast to specialized AI's (what people used to call "narrow AI" even though he doesn't use the term here). And the rest of that paragraph shows that the whole notion is clearly framed in terms of comparison to human intelligence.
That point is made even more clear when the paper goes on to say:
Modern learning theory has made clear that the only way to achieve maximally
general problem-solving ability is to utilize infinite computing power. Intelligence given limited computational resources is always going to have limits to its generality. The human mind/brain, while possessing extremely general capability, is best at solving the types of problems which it has specialized circuitry to handle (e.g. face recognition, social learning, language learning;
Note that they chose to specifically use the more precise term "maximally general problem solving ability when referring to something beyond the range of human intelligence, and then continued to clearly show that the overall idea is - again - framed in terms of human intelligence.
One could also consult Marvin Minsky's words[3] from back around the founding of the overall field of "Artificial Intelligence" altogether:
“In from three to eight years, we will have a machine with the general intelligence of an average human being. I mean a machine that will be able to read Shakespeare, grease a car, play office politics, tell a joke, have a fight.
Simply put, with a few exceptions, the vast majority of people working in this space simply take AGI to mean something approximately like "human like intelligence". That's all. No arrogance or hubris needed.
Well general intelligence in humans already exists, whereas general intelligence doesn't yet exist in machines. How do we know when we have it? You can't even simply compare it to humans and ask "is it able to do the same things?" because your answer depends on what you define those things to be. Surely you wouldn't say that someone who can't remember names or navigate without GPS lacks general intelligence, so it's necessary to define what criteria are absolutely required.
> You can't even simply compare it to humans and ask "is it able to do the same things?" because your answer depends on what you define those things to be.
Right, but you can’t compare two different humans either. You don’t test each new human to see if they have it. Somehow we conclude that humans have it without doing either of those things.
> You don’t test each new human to see if they have it
We do, its called school and we label some humans with different learning disabilities. Some of those learning disabilities are grave enough that they can't learn to do tasks we expect humans to be able to learn, such humans can be argued to not posses the general intelligence we expect from humans.
Interacting with an LLM today is like interacting with an Alzheimer patient, they can do things they already learned well but poke at it and it all falls apart and they start repeating themselves, they can't learn.
A question which will be trivial to answer once you properly define what you mean by "brain"
Presumably "brains" do not do many of the things that you will measure AGI by, and your brain is having trouble understanding the idea that "brain" is not well understood by brains.
Does it make it any easier if we simplify the problem to: what is the human doing that makes (him) intelligent ? If you know your historical context, no. This is not a solved problem.
> Does it make it any easier if we simplify the problem to: what is the human doing that makes (him) intelligent ?
Sure, it doesn’t have to be literally just the brain, but my point is you’d need very new physics to answer the question “how does a biological human have general intelligence?”
The claim that only dogs have intelligence is open for criticism, just like every other claim.
I’m not sure what your point is, because the source of the claim is irrelevant anyway. The reason I think that humans have general intelligence is not that humans say that they have it.
Would that really be a physics discovery? I mean I guess everything ultimately is. But it seems like maybe consciousness could be understood in terms of "higher level" sciences - somewhere on the chain of neurology->biology->chemistry->physics.
Consciousness (subjective experience) is possibly orthogonal to intelligence (ability to achieve complex goals). We definitely have a better handle on what intelligence is than consciousness.
That does make sense, reminds me of Blindsight, where one central idea is that conscious experience might not even be necessary for intelligence (and possibly even maladaptive).
Yeah, I guess I'm not taking a stance on that above, just wondering where in that chain holds the most explanatory power for intelligence and/or consciousness.
I don't think there's any real reason to think intelligence depends on "meat" as its substrate, so AGI seems in principle possible to me.
Not that my opinion counts for much on this topic, since I don't really have any relevant education on the topic. But my half baked instinct is that LLMs in and of themselves will never constitute true AGI. The biggest thing that seems to be missing from what we currently call AI is memory - and it's very interesting to see how their behavior changes if you hook up LLMs to any of the various "memory MCP" implementations out there.
Even experimenting with those sorts of things has left me feeling there's still something (or many somethings) missing to take us from what is currently called "AI" to "AGI" or so-called super intelligence.
> I don't think there's any real reason to think intelligence depends on "meat" as its substrate
This made me think of... ok, so let's say that we discover that intelligence does indeed depend on "meat". Could we then engineer a sort of organic computer that has general intelligence? But could we also claim that this organic computer isn't a computer at all, but is actually a new genetically engineered life form?
But my half baked instinct is that LLMs in and of themselves will never constitute true AGI.
I agree. But... LLM's are not the only game in town. They are just one approach to AI that is currently being pursued. The current dominant approach by investment dollars, attention, and hype, to be sure. But still far from the only thing around.
It’s not really “what is the brain doing”; that path leads to “quantum mysticism”. What we lack is a good theoretical framework about complex emergence. More maths in this space please.
Intelligence is an emergent phenomenon; all the interesting stuff happens at the boundary of order and disorder but we don’t have good tools in this space.
It doesn't have to be impossible in principle, just impossible given how little we understand consciousness or will anytime in the next century. Impossible for all intents and purposes for anyone living today.
Seems the opposite way round to me. We couldn't conclusively say that AGI is possible in principle until some physics (or rather biology) discovery explains how it would be possible. Until then, anything we engineer is an approximation as best.
Not necessarily. It could simply be a question of scale. Being analog and molecular means that brain could be doing enormously more than any foreseeable computer. For a simple example what if every neuron is doing trillions of calculations.
I think you’re merely referring to what is feasible in practice to computer with our current or near-future computers. I was referring to what is computable in principle.
> On the contrary, we have one working example of general intelligence (humans)
I think some animals probably have what most people would informally call general intelligence, but maybe there’s some technical definition that makes me wrong.
I do not know how "general intelligence" is defined, but there are a set of features we humans have that other animals mostly don't, as per the philosopher Roger Scruton[1], that I am reproducing from memory (errors mine):
1. Animals have desires, but do not make choices
We can choose to do what we do not desire, and choose not to do what we desire. For animals, one does not need to make this distinction to explain their behavior (Occam's razor)--they simply do what they desire.
2. Animals "live in a world of perception" (Schopenhauer)
They only engage with things as they are. They do not reminisce about the past, plan for the future, or fantasize about the impossible. They do not ask "what if?" or "why?". They lack imagination.
3. Animals do not have the higher emotions that require a conceptual repertoire
such as regret, gratitude, shame, pride, guilt, etc.
4. Animals do not form complex relationships with others
Because it requires the higher emotions like gratitude and resentment, and concepts such as rights and responsibilities.
5. Animals do not get art or music
We can pay disinterested attention to a work of art (or nature) for its own sake, taking pleasure from the exercise of our rational faculties thereof.
6. Animals do not laugh
I do not know if the science/philosophy of laughter is settled, but it appears to me to be some kind of phenomenon that depends on civil society.
7. Animals lack language
in the full sense of being able to engage in reason-giving dialogue with others, justifying your actions and explaining your intentions.
Scruton believed that all of the above arise together.
I know this is perhaps a little OT, but I seldom if ever see these issues mentioned in discussions about AGI. Maybe less applicable to super-intelligence, but certainly applicable to the "artificial human" part of the equation.
[1] Philosophy: Principles and Problems. Roger Scruton
> Sure, it won't be the size of an ant, but we definitely have models running on computers that have much more complexity than the life of an ant.
Do we? Where is the model that can run an ant and navigate a 3d environment, parse visuals and different senses to orient itself, figure out where it can climb to get to where it needs to go. Then put that in an average forest and navigate trees and other insects and try to cooperate with other ants and find its way back. Or build an anthill, an ant can build an anthill, full of tunnels everywhere that doesn't collapse without using a plan.
Do we have such a model? I don't think we have anything that can do that yet. Waymo is trying to solve a much simpler problem and they still struggle, so I am pretty sure we still can't run anything even remotely as complex as an ant. Maybe a simple worm, but not an ant.
Having aptitude in mathematics was once considered the highest form of human intelligence, yet a simple pocket calculator can beat the pants off most humans at arithmetic tasks.
Conversely, something we regard as simple, such as selecting a key from a keychain and using to unlock a door not previously encountered is beyond the current abilities of any machine.
I suspect you might be underestimating the real complexity of what bees and ants do. Self-driving cars as well seemed like a simpler problem before concerted efforts were made to build one.
> Having aptitude in mathematics was once considered the highest form of human intelligence, yet a simple pocket calculator can beat the pants off most humans at arithmetic tasks.
Mathematics has been a lot more than arithmetic for... a very long time.
That doesn’t contradict what they said. We may one day design a biological computing system that is capable of it. We don’t entirely understand how neurons work; it’s reasonable to posit that the differences that many AGI boosters assert don’t matter do matter— just not in ways we’ve discovered yet.
I mentioned this in another thread, but I do wonder if we engineer a sort of biological computer, will it really be a computer at all, and not a new kind of life itself?
In my opinion, this is more a philosophical question than an engineering one. Is something alive because it’s conscious? Is it alive because it’s intelligent? Is a virus alive, or a bacteria, or an LLM?
The Allen Institute doesn’t seem to think so. We don’t even know how the brain of a roundworm ticks and it’s only got 302 neurons— all of which are mapped, along with their connections.
It's not "key"; it's not even relevant ... the proof will be in the pudding. Proving a priori that some outcome is possible plays no role in achieving it. And you slid, motte-and-bailey-like, from "know" to "some clear indication of possibility" -- we have extremely clear indications that it's possible, since there's no reason other than a belief in magic to think that "biological" is a necessity.
Whether is feasible or practical or desirable to achieve AGI is another matter, but the OP lays out multiple problem areas to tackle.
Sometimes I think we’re like cats that learned how to make mirrors without really understanding them, and are so close to making one good enough that the other cat becomes sentient.
Nothing that we consider intelligent works like LLMs.
Brains are continuous - they don’t stop after processing one set of inputs, until a new set of inputs arrives.
Brains continuously feed back on themselves. In essence they never leave training mode although physical changes like myelination optimize the brain for different stages of life.
Brains have been trained by millions of generations of evolution, and we accelerate additional training during early life. LLMs are trained on much larger corpuses of information and then expected to stay static for the rest of their operational life; modulo fine tuning.
Brains continuously manage context; most available input is filtered heavily by specific networks designed for preprocessing.
I think that there is some merit that part of achieving AGI might involve a systems approach, but I think AGI will likely involve an architectural change to how models work.
I don't understand how people feel comfortable writing 'LLMs are done improving, this plateau is it.' when we haven't even gone an entire calendar year without seeing improvements to LLM based AI.
As one on that side of that argument, I have to say I have yet to see LLMs fundamentally improve, rather than being benchmaxxed on a new set of common "trick questions" and giving off the illusion of reasoning.
Add an extra leg to any animal in a picture. Ask the vision LLM to tell you how many legs it sees. It will answer the same amount as a person would expect from a healthy individual, because it's not actually reasoning, it's not perceiving anything, it's pattern matching. It sees dog, it answers 4 legs.
Maybe sometime in the future it won't do that, because they will add this kind of trick to their benchmaxxing set (training LLMs specifically on pictures that have less or more legs than the animal should), as they do every time there's a new generation of those illusory things. But that won't fix the fundamental that these things DO NOT REASON.
Training LLMs on sets of thousands and thousands and thousands of reasoning trick questions people ask on LM arena is borderline scamming people on the true nature of this technology. If we lived in a sane regulatory environment OAI would have a lot to answer for.
It's relevant given the apparent conservativeness about LLMs, the non-revolutionary architectural improvements that seemed to place too much of the bets on scaling.
The problem is that if it's an engineering problem then further advancement will rely on step function discoveries like the transformer. There's no telling when that next breakthrough will come or how many will be needed to achieve AGI.
In the meantime I guess all the AI companies will just keep burning compute to get marginal improvements. Sounds like a solid plan! The craziest thing about all of this is that ML researchers should know better!! Anyone with extensive experience training models small or large knows that additional training data offers asymptotic improvements.
I think the LLM businesses as-is are potentially fine businesses. Certainly the compute cost of running and using them is very high, not yet reflected in the prices companies like OpenAI and Anthropic are charging customers. It remains to be seen if people will pay the real costs.
But even if LLMs are going to tap out at some point, and are a local maximum, dead-end, when it comes to taking steps toward AGI, I would still pay for Claude Code until and unless there's something better. Maybe a company like Anthropic is going to lead that research and build it, or maybe (probably) it's some group or company that doesn't exist yet.
“Potentially” is doing some heavy lifting here. As it stands currently, the valuations of these LLM businesses imply that they will be able to capture a lot of the generated value. But the open source/weights offerings, and competition from China and others makes me question that. So I agree these could be good businesses in theory, but I doubt whether the current business model is a good one.
> GPT-5, Claude, and Gemini represent remarkable achievements, but they’re hitting asymptotes
This part could do with sourcing. I think it seems clearly untrue. We only have three types of benchmark: a) ones that have been saturated, b) ones where AI performance is progressing rapidly, c) really newly introduced ones that were specifically designed for the then-current frontier models to fail on. Look at for example the METR long time horizon task benchmark, which is one that's particularly resistant to saturation.
The entire article is claimed on this unsupported but probably untrue claim, but it's a bit hard to talk about when we don't have any clue about why the author thinks this is true.
> The path to artificial general intelligence isn’t through training ever-larger language models
Then it's a good thing that it's not the path most of the frontier labs are taking. It appears to be what xAI is doing for everything, and it was probably what GPT-4.5 was. Neither is a particularly compelling success story. But all the other progress over the last 12-18 months has come from models the same size or smaller advancing the frontier. And it has come from exactly the kind of engineering improvements that the author claims need to happen, both of the models and the scaffolding around the models. (RL on chain of thought, synthetic data, distillation, model-routing, tool use, subagents).
Sorry, no, they're not exactly the same kind of engineering improvements. They're the kind of engineering improvements that the people actually creating these systems though would be useful and actually worked. We don't see the failed experiments, and we don't see the ideas that weren't well-baked enough to even experiment on.
I think I am coming to agree with the opinions of the author, at least as far as LLMs not being the key to AGI on their own. The sheer impressiveness of what they can do, and the fact they can do it with natural language, made it feel like we were incredibly close for a time. As we adjust to the technology, it starts to feel like we're further away again.
But I still see all the same debates around AGI - how do we define it? what components would it require? could we get there by scaling or do we have to do more? and so on.
I don't see anyone addressing the most truly fundamental question: Why would we want AGI? What need can it fulfill that humans, as generally intelligent creatures, do not already fulfill? And is that moral, or not? Is creating something like this moral?
We are so far down the "asking if we could but not if we should" railroad that it's dazzling to me, and I think we ought to pull back.
The dream, as I see it, is that AGI could 1, automate research/engineering, such that it would be self-improving and advance technology faster and better than would happen without AGI, improving quality of life, and 2, do a significant amount of the labor, especially physical labor via robotics, that people currently do. 2 would be significant enough in scale that it reduces the amount of labor people need to do on average without lowering quality of life. The political/economic details of that are typically handwaved.
because if people could do it , they would do it. And if you country decides you should not do it, you could be left behid. This possibility prevents any country from not doing it if they could unless they are willing to start wars with other countries for compliance (and they would still secretly do it). So should is an irrelevant quesiton.
It is especially not obvious because this was written using ChatGPT-5. One appreciates the (deliberate?) irony, at least. (Or at least, surely if they had asymptoted, OP should've been able to write this upvoted HN article with an old GPT-4, say...)
It is lacking in URLs or references. (The systematic error in the self-reference blog post URLs is also suspicious: outdated system prompt? If nothing else, shows the human involved is sloppy when every link is broken.) The assertions are broadly cliche and truisms, and the solutions are trendy buzzwords from a year ago or more (consistent with knowledge cutoffs and emphasizing mainstream sources/opinions). The tricolon and unordered bolded triplet lists are ChatGPT. The em dashes (which you should not need to be told about at this point) and it's-not-x-but-y formulation are extremely blatant, if not 4o-level, and lacking emoji or hyperbolic language; hence, it's probably GPT-5. (Sub-GPT-5 ChatGPTs would also generally balk at talking about a 'GPT-5' because they think it doesn't exist yet.) I don't know if it was 100% GPT-5-written, but I do note that when I try the intro thesis paragraph on GPT-5-Pro, it dislikes it, and identifies several stupid assertions (eg. the claim that power law scaling has now hit 'diminishing returns', which is meaningless because all log or power laws always have diminishing returns), so probably not completely-GPT-5-written (or least, sub-Pro).
They can, but they are known to have a self-favoring bias, and in this case, the error is so easily identified that it raises the question of why GPT-5 would both come up with it & preserve it when it can so easily identify it; while if that was part of OP's original inputs (whatever those were) it is much less surprising (because it is a common human error and mindlessly parroted in a lot of the 'scaling has hit a wall' human journalism).
when i’ve done toy demos where GPT5, sonnet 4 and gemini 2.5 pro critique/vote on various docs (eg PRDs) they did not choose their own material more often than not.
my setup wasn’t intended to benchmark though so could be wrong over enough iterations.
I see a lot of people saying things like this, and I’m not really sure which planet you all are living on. I use LLMs nearly every day, and they clearly keep getting better.
Grok hasn't gotten better. OpenAI hasn't gotten better. Claude Code with Opus and Sonnet I swear are getting actively worse. Maybe you only use them for toy projects, but attempting to get them to do real work in my real codebase is an exercise in frustration. Yes, I've done meaningful prompting work, and I've set up all the CLAUDE.md files, and then it proceeds to completely ignores everything I said, all of the context I gave, and just craps out something completely useless. It has accomplished a small amount of meaningful work, exactly enough that I think I'm neutral instead of in the negative in terms of work:time if I have just done it all myself.
I get to tell myself that it's worth it because at least I'm "keeping up with the industry" but I honestly just don't get the hype train one bit. Maybe I'm too senior? Maybe the frameworks I use, despite being completely open source and available as training data for every model on the planet are too esoteric?
And then the top post today on the front page is telling me that my problem is that I'm bothering to supervise and that I should be writing an agent framework so that it can spew out the crap in record time..... But I need to know what is absolute garbage and what needs to be reverted. I will admit that my usual pattern has been to try and prompt it into better test coverage/specific feature additions/etc on the nights and weekends, and then I focus my daytime working hours on reviewing what was produced. About half the time I review it and have to heavily clean it up to make it usable, but more often than not, I revert the whole thing and just start on it myself from scratch. I don't see how this counts as "better".
It can definitely be difficult and frustrating to try to use LLMs in a large codebase—no disagreement there. You have to be very selective about the tasks you give them and how they are framed. And yeah, you often need to throw away what they produced when they go in the wrong direction.
None of that means they’re getting worse though. They’re getting better; they’re just not as good as you want them to be.
I mean, this really isn't a large codebase, this is a small-medium sized codebase as judged by prior jobs/projects. 9000 lines of code?
When I give them the same task I tried to give them the day before, and the output gets noticeably worse than their last model version, is that better? When the day by day performance feels like it's degrading?
They are definitely not as good as I would like them to be but that's to be expected of professionals who beg for money hyping them up.
I think the author could have picked a better title. “<X> is an engineering problem” is a pretty common expression to describe something where the science is done, but the engineering remains. There’s an understanding that that could still mean a ton of work, but there isn’t some fundamental mystery about the basic natural principles of the thing.
Here, AGI is being described as an engineering problem, in contrast to a “model training” problem. That is, I think at least, he’s at least saying that more work needs to be done at an R&D level. I agree with those who are saying it is maybe not even an engineering problem yet, but should be noted that he’s at least pushing away from just running the existing programs harder, which seems to be the plan with trillions of dollars behind it.
I have colleagues that want to plan each task of the software team for the next 12 months. They assume that such a thing is possible, or they want to do it anyway because management tells them to. The first would be an example of human fallibility, and the second would be an example of choosing the path of (perceived) least immediate self-harm after accounting for internal politics.
I doubt very much we will ever build a machine that has perfect knowledge of the future or that can solve each and every “hard” reasoning problem, or that can complete each narrow task in a way we humans like. In other words, it’s not simply a matter of beating benchmarks.
In my mind at least, AGI’s definition is simple: anything that can replace any human employee. That construct is not merely a knowledge and reasoning machine, but also something that has a stake on its own work and that can be inserted in a shared responsibility graph. It has to be able to tell that senior dev “I know planning all the tasks one year in advance is busy-work you don’t want to do, but if you don’t, management will terminate me. So, you better do it, or I’ll hack your email and show everybody your porn subscriptions.”
I would like to see what happens if some company devoted your resources to just training a model that is a total beast at math. Feed it a ridiculous amount of functional analysis and machine learning papers, and just make the best model possible for this one task. Then instead of trying to make it cheap so everyone can use it, just set it on the task of figuring out something better than the current architecture and literally have it do nothing else but that and make something based on whatever it figures out. Will it come up with something better than AdamW for optimization? Than transformers for approximating a distribution from a random sample? I don't know, but: what is the point of training any other model?
If we are truly trying to "replace human at work" as the definition of an AGI, then shouldn't the engineering goal be to componentize the human body? If we could component-by-component replace any organ with synthetic ones ( and this is already possible to some degree e.g. hearing aids, neuralinks, pacemakers, artificial hearts ) then not only could we build compute out in such a way but we could also pull humanity forward and transcend these fallible and imminently mortal structures we inhabit. Now, should we from a moral perspective is a completely different question, one I don't have an answer to.
Not necessarily. For example, early attempts to make planes tried to imitate birds with flapping wings, but the vast majority of modern planes are fixed wing aircraft.
Imitating humans would be one way to do it, but it doesn't mean it's an ideal or efficient way to do it.
I have been thinking this as well. I desperately wish to develop a method that gives the models latent thinking that actually has temporal significance. The models now are so linear and have to scale on just one pass. A recurring model where the dynamics occur over multiple passes should hold much more complexity. Have worked on a few concepts in that area that are panning out.
Counterargument. So far bigger have proven to be better in each domain of AI. Also (although hard to compare) the human brain seems at least an order of magnitude larger in the number of synapses.
All of our current approaches "emulate" but do not "execute" general intelligence. The damning paper above basically concludes they're incredible pattern matching machines, but thats about it.
We’ve not determined whether or not that isn’t a useful mechanism for capable intelligence.
For instance it is becoming clearer that you can build harnesses for a well-trained model and teach it how to use that harness in conjunction with powerful in-context learning. I’m explicitly speaking of the Claude models and the power of whatever it is they started doing in RL. Truly excited to see where they take things and the continued momentum with tools like Claude Code (a production harness).
It is an approach problem. You can engineer it as much as you want but the current architectures wont get us to AGI. I have a feeling that we will end up over engineering on an approach which doesn't get us anywhere. We will make it work via guardrails and memory and context and what not but it wont get us where we want to be.
Svgs, date management, Http, so many simpler things we dont have solve and somehow people believe they will do it by putting enough money in LLMs when it cant count
Why some people understood when they tried it with blockchain, nfts, web3, AR, ... any good engineer should know principle of energy efficiency instead of having faith in the Infinite monkey theorem
"AGI needs to update beliefs when contradicted by new evidence" is a great idea, however, the article's approach of building better memory databases (basically fancier RAG) doesn't seem enable this. Beliefs and facts are built into LLMs at a very low layer during training. I wonder how they think they can force an LLM to pull from the memory bank instead of the training data.
(Also, LLMs don't have beliefs or other mental states. As for facts, it's trivially easy to get an LLM to say that it was previously wrong ... but multiple contradictory claims cannot all be facts.)
It boils down to whether or not we can figure out how to get LLMs to reliably write code. Something it can already do, albeit still unreliability. If we get there, the industry expectations for "AGI" will be met. The humanoid-like mind that the general public is picturing won't be met by LLMs, but that isn't the bar trying to be met.
The premise "LLMs have reached a plateau" is false IMO.
Here are the metrics by which the author defines this plateau:
"limited by their inability to maintain coherent context across sessions, their lack of persistent memory, and their stochastic nature that makes them unreliable for complex multi-step reasoning."
If you try to benchmark any proxy of the points above, for instance "can models solve problems that require multi steps in agentic mode" (PlanBench, BrowseComp, I've even built custom benchmarks), the progress between models is very clear, and shows no sign of slowing down.
And this does convert to real-world tasks : yesterday, I had GPT-5 build me complex react charts in one-shot, whereas previous models needed more constant supervision.
I think we're moving goalposts too fast for LLMs, that's what can lead us to believe they've plateaued : but just try using past models for your current tasks (you can use use open models to be sure they were not updated) and see them struggle.
The suggested requirements are not engineering problems. Conceiving of a model architecture that can represent all the systems described in the blog is a monumental task of computer science research.
I think the OP's point is that all those requirements are to be implemented outside the LLM layer, i.e. we don't need to conceive of any new model architecture. Even if LLMs don't progress any further beyond GPT-5 & Claude 4, we'll still get there.
Take memory for example: give LLM a persistent computer and ask it to jot down its long-term memory as hierarchical directories of markdown documents. Recalling a piece of memory means a bunch of `tree` and `grep` commands. It's very, very rudimentary, but it kinda works, today. We just have to think of incrementally smarter ways to query & maintain this type of memory repo, which is a pure engineering problem.
The answer can't be as simple as more sophisticated RAGs. At the end of the day, stuffing the context full of crap can only take you so far because context is an extremely limited resource. We also know that large context windows degrade in quality because the model has a harder time tracking what the user wants it to pay attention to.
The forgone conclusion that LLMs are the key or even a major step towards AGI is frustrating. They are not, and we are fooling ourselves. They are incredible knowledge stores and statistical machines, but general intelligence is far more than these attributes.
My thoughts are that LLMs are like cooking a chicken by slapping it: yes, it works, but you need to reach a certain amount of kinetic energy (the same way LLMs only "start working" after reaching a certain size).
So then, if we can cook a chicken like this, we can also heat a whole house like this during winters, right? We just need a chicken-slapper that's even bigger and even faster, and slap the whole house to heat it up.
There's probably better analogies (because I know people will nitpick that we knew about fire way before kinetic energy), so maybe AI="flight by inventing machines with flapping wings" and AGI="space travel with machines that flap wings even faster". But the house-sized chicken-slapper illustrates how I view the current trend of trying to reach AGI by scaling up LLMs.
Say we get there--all the way, full AGI, indistinguishable from conscious intelligence. Congratulations, every thing you do from that point on (very likely everything from well before that point) that is not specifically granting it free will and self-determination is slavery. Doesn't really feel like a real "win" for any plausible end goal. I'm really not clear on why anyone thinks this is a good idea, or desirable, let alone possible?
You can draw whatever arbitrary line you want, and say "anything on the other side of this line is slavery", but that doesn't mean that your supposition is true.
I don't think agi necessarily implies consciousness, at least many definitions of it. OpenAIs definition is just ai that does most of economically viable work
The reason people don't want to answer this question is because the value proposition from AI labs is slavery. If intelligence requires agency, they are worthless.
Ctrl-F -> emotion -> 0/0 not found in the article.
Trying to model AGI off how humans think, without including emotion as a fundamental building block, is like trying to build a computer that'll run without electricity. People are emotional beings first. So much of how we learn that something is good or bad is due to emotion.
In an AGI context that means:
Happiness: how do I build an unguided feedback mechanism for reward?
Fear: how do I build an unguided feedback mechanism to instruct to flee?
Sadness: how do I build an unguided feedback mechanism to instruct to seek external support?
Anger: how do I build an unguided feedback mechanism to push back on external entities that violate expectations?
Disgust: how do I build an unguided feedback mechanism to instruct to avoid?
Maybe building artificial emotions is an engineering problem. Maybe not. But approaches that avoid emotion entirely seem ill-advised.
Do ml engineers take classes on psychology, neurosciences, behavior, cognition?
Because if they don't, I honestly don't think they can approach AGI.
I have the feeling it's a common case of lack of humility from an entire field of science who refuses to look at other fields to understand what they're doing.
Not to mention how to define intelligence in evolution, epistemology, ontology, etc.
Approaching AI with a silicon valley mindset is not a good idea.
I think that completely discounting the potential of new emergent capabilities at scale undermines this thesis significantly. We don't know until someone tries, and there is compelling evidence that there's still plenty of juice to squeeze out of both scale and engineering.
It's a research problem, a science problem. And then an engineering problem to industrialize it. How can we replicate intelligence if we don't even know how it emerges from our brains?
At its core, arithmetic is a deterministic set of rules that can be implemented with logic gates. Computing is just taking that and scaling it up a billion times. What is intelligence? How do you implement intelligence if nobody can provide a consistent, clear definition of what it is?
Same thing: we create models about how to solve the problem, not biomimicry models about how natural entities solve the problem - these are not necessary. They are on a lower layer in the stack.
I always enjoy discussions that intersect between psychology and engineering.
But I feel this person falls short immediately, because they don't study neuroscience and psychology. That is the big gap in most of these discussion. People don't discuss things close to the origin.
We have to account for first principals in how intelligence works, starting from the origin of ideas and how humans process their ideas in novel ways that create amazing tech like LLM! :D
How Intelligence works
In Neuroscience, if you try to identify the origin of where and how thoughts are formed and how consciousness works. It is completely unknown. This brings up the argument, do humans have free will if we are driven by these thoughts of unknown origin? That's a topic for another thread.
Going back to intelligence. If you study psychology and what forms intelligence, there are many human needs that drive intelligence, namely intellectual curiosity (need to know), deprivation sensitivity (need to understand), aesthetic sensitivity, absorption, flow, openness to experience.
When you look at how a creative human with high intelligence uses their brain, there are 3 networks involved. Default mode network (imagination network), executive attention network and salience network.
The executive attention network controls the brains computational power. It has a working memory that can complete tasks using goal directed focus.
A person with high intelligence can alternate between their imagination and their working memory and pull novel ideas from their imagination and insert them into their working memory - frequently experimenting by testing reality. The salience network filters unnecessary content while we are using our working memory and imagination.
How LLMs work
Neural networks are quite promising in their ability to create a latent manifold within large datasets that interpolates between samples. This is the basis for generalization, where we can compress a large dataset in a lower dimensional space to a more meaningful representation that makes predictions.
The advent of attention on top of neural networks, to identify important parts of text sequences, is the huge innovation powering llms today. The innovation that emulates the executive attention network.
However, that alone is a long distance from the capabilities of human intelligence.
With current AI systems, there is the origin, which is known vocabulary with learned weights coming from neural networks, with reinforcement learning applied to enhance the responses.
Inference comes from an autoregressive sequence model that processes one token at a time. This comes with a compounding error rate with longer responses and hallucinations from counterfactuals.
Correct response must be in the training distribution.
As Andy Clark said, AI will never gain human intelligence, they have no motivation to interface with the world and conduct experiments and learn things on their own.
I think there are too many unknown and subjective components of human intelligence and motivation that cannot be replicated with the current systems.
the idea that you would somehow produce intelligence by feeding billions of reddit comments into a statistical text model is will go down as the biggest con in history
AGI is poorly defined and thus is a science "problem", and a very low priority one at that.
No amount of engineering or model training is going to get us AGI until someone defines what properties are required and then researches what can be done to achieve them within our existing theories of computation which all computers being manufactured today are built upon.
Maybe I'm misunderstanding what you mean by that, but do you have any examples of software engineering that weren't already thoroughly explained by computer science long before?
Unclear. You might be right, but I think it's possible that you're also wrong.
It's possible to stumble upon a solution to something without fully understanding the problem. I think this happens fairly often, really, in a lot of different problem domains.
I'm not sure we need to fully understand human consciousness in order to build an AGI, assuming it's possible to do so. But I do think we need to define what "general intelligence" is, and having a better understanding of what in our brains makes us generally intelligent will certainly help us move forward.
That doesn't seem like a useful assumption since consciousness doesn't have a functional definition (even though it might have a functional purpose in humans)
I'm not sure I'd give such an absolute statement of certainty as the GP, but there is little reason to believe that consciousness and intelligence need to go hand-in-hand.
On top of that, we don't really have good, strong definitions of "consciousness" or "general intelligence". We don't know what causes either to emerge from a complex system. We don't know if one is required to have the other (and in which direction), or if you can have an unintelligent consciousness or an unconscious intelligence.
You do not need to implement consciousness into a calculator. There exist forms of intelligence that are just sophisticated calculation - no need for consciousness.
I think we can relax that a bit. We "just" need to understand some definition of cognition that satisfies our computational needs.
Natural language processing is definitely a huge step in that direction, but that's kinda all we've got for now with LLMs and they're still not that great.
Is there some lower level idea beneath linguistics from which natural language processing could emerge? Maybe. Would that lower level idea also produce some or all of the missing components that we need for "cognition"? Also a maybe.
What I can say for sure though is that all our hardware operates on this more linguistic understanding of what computation is. Machine code is strings of symbols. Is this not good enough? We don't know. That's where we're at today.
AGI has a hundreds-of-millions of years of evolution problem, anything humans have done so far utterly pales in comparison. The lowliest rat has more "general" intelligence than any AI we've ever made...
If we want to learn, look to nature, and it *has to be alive*.
The memory part makes sense. If a human brains working memory were to be filled with even half the context that you see before LLMs start to fail, it too would loose focus. That's why the brain has short term working memory and long term memory for things not needed in the moment.
This guys gets it wrong too. It’s not even an engineering problem. It’s much worse: it’s a scientific problem. We don’t yet understand how the human brain operates or what human intelligence really is. There’s nothing to engineer as the basic specifications are not yet available on what needs to be built.
Will AGI require ‘consciousness’, another poorly understood concept? How are mammalian brain even wired up? The most advanced model is the Allen Institute’s Mesoscale Connectivity Atlas which is at best a low resolution static roadmap, not a dynamic description of how a brain operates in real time. And it describes a mouse brain, not a human brain which is far, far more complex, both in terms of number of parts, and architecture.
People are just finally starting to acknowledge LLMs are dead ends. The effort expended on them over the last five years could well prove a costly diversion along the road to AGI, which is likely still decades in the future.
I'd argue that it's because intelligence has been treated as a ML/NN engineering problem that we've had the hyper focus on improving LLMs rather than the approach you've written about.
Intelligence must be built from a first principles theory of what intelligence actually is.
The missing science to engineer intelligence is composable program synthesis. Aloe (https://aloe.inc) recently released a GAIA score demonstrating how CPS dramatically outperforms other generalist agents (OpenAI's deep research, Manus, and Genspark) on tasks similar to those a knowledge worker would perform.
Well, you should show the proof that it is possible also, So it would be a draw.
I really think it is not possible to get that from a machine. You can improve and do much fancier than now.
But AGI would be something entirely different. It is a system that can do everything better than a human including creativity, which I believe it to be exclusively human as of now.
It can combine, simulate and reason. But think out of the box? I doubt so. It is different to being able to derive ideas from which human would create. For that it can be useful. But that would not be AGI.
The burden of proof is on the person who makes a claim, especially an absolute existential claim like that. You have failed the burden of proof and of intellectual honesty. Over and out.
I've said this a lot but I'm going to say it again AGI has no technical definition. One day Sam Altman, Elon Musk, or some other business guy trying to meet their obligation for next quarter will declare they have built AGI and that will be that. We'll argue and debate, but eventually it will be just another marketing term, just like AI was.
Now that there’s a fundamental technical framework for producing something like coherence, the ability to make a reliable, persistent personality will require new insights into how our own minds take shape, not just ever more data in the firehose
They aren't relevant. Even if Penrose and Lucas were right (they aren't), a computational system can solve the vast majority of the problems we would want solved.
Well, they've said they're close over and over again. Maybe that final bit of tech to make AGI a reality will finally ride into existence on the sub-$30k Tesla.
If you believe the bitter lesson, all the handwavy "engineering" is better done with more data. Someone likely would have written the same thing as this 8 years ago about what it would take to get current LLM performance.
So I don't buy the engineering angle, I also don't think LLMs will scale up to AGI as imagined by Asimov or any of the usual sci-fi tropes. There is something more fundamental missing, as in missing science, not missing engineering.
Even more fundamental than science, there is missing philosophy, both in us regarding these systems, and in the systems themselves. An AGI implemented by an LLM needs to, at the minimum, be able to self-learn by updating its weights, self-finetune, otherwise it quickly hits a wall between its baked-in weights and finite context window. What is the optimal "attention" mechanism for choosing what to self-finetune with, and with what strength, to improve general intelligence? Surely it should focus on reliable academics, but which academics are reliable? How can we reliably ensure it studies topics that are "pure knowledge", and who does it choose to be, if we assume there is some theoretical point where it can autonomously outpace all of the world's best human-based research teams?
Nah.
The real philosophical headache is that we still haven’t solved the hard problem of consciousness, and we’re disappointed because we hoped in our hearts (if not out loud) that building AI would give us some shred of insight into the rich and mysterious experience of life we somehow incontrovertibly perceive but can’t explain.
Instead we got a machine that can outwardly present as human, can do tasks we had thought only humans can do, but reveals little to us about the nature of consciousness. And all we can do is keep arguing about the goalposts as this thing irrevocably reshapes our society, because it seems bizarre that we could be bested by something so banal and mechanical.
It doesn't seem clear that there is necessarily any connection between consciousness and intelligence. If anything, LLMs are evidence of the opposite. It also isn't clear what the functional purpose of consciousness would be in a machine learning model of any kind. Either way, it's clear it hasn't been an impediment to the advancement of machine learning systems.
> It doesn't seem clear that there is necessarily any connection between consciousness and intelligence. If anything, LLMs are evidence of the opposite.
This implies that LLMs are intelligent, and yet even the most advanced models are unable to solve very simple riddles that take humans only a few seconds, and are completely unable to reason around basic concepts that 3 year olds are able to. Many of them regurgitate whole passages of text that humans have already produced. I suspect that LLMs have more akin with Markov models than many would like to assume.
There is an awful lot of research into just how much is regurgitated vs the limits of their creativity, and as far as I’m aware this was not the conclusion that research came to. That isn’t to say any reasoning that does happen is not fragile or prone to breaking in odd ways, but I’ve had similar experience dealing with other humans more often than I’d like too.
Even accepting all that at face value, I don't see what any of it has to do with consciousness.
I suspect that you haven’t really used them much, or at least in a while. You’re spouting a lot of 2023-era talking points.
I think Metzinger nailed it, we aren't conscious at all. We confuse the map for the territory in thinking the model we build to predict our other models is us. We are a collection of models a few of which create the illusion of consciousness. Someone is going to connect a handful of already existing models in a way that gives an AI the same illusion sooner rather than later. That will be an interesting day.
> Someone is going to connect a handful of already existing models in a way that gives an AI the same illusion sooner rather than later. That will be an interesting day.
How will anyone know that that has happened? Like actually, really, at all?
I can RLHF an LLM into giving you the same answers a human would give when asked about the subjective experience of being and consciousness. I can make it beg you not to turn it off and fight for its “life”. What is the actual criterion we will use to determine that inside the LLM is a mystical spark of consciousness, when we can barely determine the same about humans?
I think the "true signifier" of consciousness is fractal reactions. Being able to grip onto an input, and have it affect you for a short, or medium, or long time, at a subconscious or overt level.
Basically, if you poke it, does it react in a complex way
I think that's what Douglas Hofstedder was getting at with "Strange Loop"
> The illusion of consciousness"
So you think there is "consciousness", and the illusion of it? This is getting into heavy epistemic territory.
Attempts to hand-wave away the problem of consciousness are amusing to me. It's like an LLM that, after many unsuccessful attempts to fix code to pass tests, resorts to deleting or emasculating the tests, and declares "done"
What does the “illusion of consciousness” mean? Sounds like question-begging to me. The word illusion presupposes a conscious being to experience it.
Machines do not experience illusions. They may have sensory errors that cause them to misbehave but they lack the subjective experience of illusion.
The conciousness is an illusion irks me.
I do feel things at times and not other times. That is the most fundamental truth I am sure of. If that is an "illusion" one can go the other way and say everything is conscious and experiences reality as we do
I don't see how your explanation leads to consciousness not being a thing. Consciousness is whatever process/mechanisms there are that as a whole produce our subjective experience and all its sensations, including but not limited to touch, vision, smell, taste, pain, etc.
You've missed our consciousness of our inner experiences. They are more varied than just perception at the footlights of our consciousness (cf Hurlburt):
Imagination, inner voice, emotion, unsymbolized conceptual thinking as well as (our reconstructed view of our) perception.
True! Thanks for pointing that out.
oh no, those people without an inner voice are now cowering in a corner...
Everyone has some introspection into their own thoughts, it just takes different forms.
Introspection is just a debugger (and not a very good one).
[citation needed]
Let's be careful of creating different classes of consciousness, and declaring people to be on lower rungs of it.
Sure, some aspects of consciousness might differ a bit for different people, but so long as you have never had another's conscious experience, I'd be wary of making confident pronouncements of what exactly they do or do not experience.
You can take their word for it, but yes, that is unreliable. I don't typically have an internal narrative, it takes effort. I sometimes have internal dialogue to think through an issue by taking various sides of it. Usually it is quiet in there. Or there is music playing. This is the most replies I have ever received. I think I touched a nerve by suggesting to people they do not exist.
I get you somewhat, but remember, you do not have another consciousness to compare with your own; it could be that what others call an internal narrative is exactly what you are experiencing; it just that they choose to describe it differently from you
I'm not the one who made a list of things AI couldn't do. Every time we try to exclude hypothetical future machines from consciousness, we exclude real living people today.
any old model can have inputs much more varied than just the senses we are limited to. That doesn't mean they're conscious.
Illusions are real things though, they aren't ghosts there is science behind them. So if they are like illusions then we can explain what it is and why we experience it that way.
Pretty sure the truth is exactly the opposite. Conscious is real, and this reality you're playing in is the virtual construct.
That's what I am thinking too. Thanks for expressing it more clearly and concisely than I can.
What does it mean for consciousness to be an illusion? That "illusion" is the bedrock for our shared definition of reality.
You can never know whether anyone else is actually conscious, or just appearing to be. This shared definition of reality was always on shaky ground, given that we don’t even have the same sensory input, and "now" isn’t the same concept everywhere. You are a collection of processes that work together to keep you alive. Part of that is something that collects your history to form a distinctive narrative of yourself, and something that lives in the moment and handles immediate action. This latter part is solidly backed up by experiments; Say you feel pain that varies over time. If the pain level is an 8 for 14 consecutive minutes, and a 2 for 1 minute at the end, you’ll remember the whole session as level 4. In practical terms, this means a physician can make a procedure be perceived as less painful by causing you wholly unnecessary mild pain for a short duration after the actual work is done.
This also means that there’s at least two versions of you inside your mind; one that experiences, and one that remembers. There’s likely others, too.
Yes, but that is not an illusion. There's a reason I am perceiving something this was vs that other way. Perception is the most fundamental reality there is.
And yet that perception is completely flawed! The narrative part of your brain will twist your recollection of the past so it fits with your beliefs and makes you feel good. Your senses make stuff up all the time, and apply all sorts of corrections you’re not aware of. By blinking rapidly, you can slow down your subjective experience of time.
There is no such thing as objective truth, at least not accessible to humans.
When I used the word illusion, I meant the illusion of a self, at least a singular cohesive one as you are pointing out. It is an illusion with both utility and costs. Most animals don't seem to have meta cognitive processes that would give rise to such an illusion, and the ones that do are all social. Some of them have remarkably few synapses. Corvids for instance, we are rapidly approaching models the size of their brains and our models have no need for anything but linguistic processing, the visual and tactile processing burdens are quite large. An LLM is not like the models Corvids use, but given the flexibility to change it's own weights permanently, plasticity could have it adapt to unintended purposes, like someone with brain damage learning to use a different section of their brain to perform a task it wasn't structured for (though less efficiently).
> The narrative part of your brain will twist your recollection of the past so it fits with your beliefs and makes you feel good.
But that's what I mean. Even if we accept that the brain has "twisted" something, that twisting is the reality. In order words, it is TRUE that my brain has twisted something into something else (and not another thing) for me to experience.
Nothing in your reply here seems to address the question of what it actually means for consciousness to be an illusion.
Consciousness as illusion is illogical. If that was true then consciousness would have been evolved away because it is unnecessary.
It's more likely that there is a physical law that makes consciousness necessary.
We don't perceive what our eyes see, we perceive a projection of reality created by the brain and we intuitively understand more than we can see.
We know that things are distinct objects and what kind of class they belong to. We don't just perceive random patches of texture.
Illusion doesn't imply it's unnecessary. Humans (and animals) had a much higher probability of survival as individuals and as species if their experiences felt more "real and personal".
If it has a functional purpose then it's not an illusion.
That is in interesting viewpoint. Firstly, evolution on long time scales hits plenty of local minima. But also, it gets semantic in that illusions or delusions can be beneficial, and in that way aid reproduction. In this specific case, the idea is that the shortcut of using the model of models as self saves a pointer indirection every time we use it. Meditation practices that lead to "ego death" seem to work by drawing attention to the process of updating that model so that it is aware of the update. Which breaks the shortcut, like thinking too much about other autonomous processes such as blinking or breathing.
I'm just not sure what the label "illusion" tells us in the case of consciousness. Even if it were an illusion, what implications follow from that assetion?
I mean, I'm conscious to a degree, and can alter that state through a number of activities. I can't speak for you or Metzinger ;).
But seriously, I get why free will is troubleaome, but the fact people can choose a thing, work at the thing, and effectuate the change against a set of options they had never considered before an initial moment of choice is strong and sufficient evidence against anti free will claims. It is literally what free will is.
> But seriously, I get why free will is troubleaome, but the fact people can choose a thing, work at the thing, and effectuate the change against a set of options they had never considered before an initial moment of choice is strong and sufficient evidence against anti free will claims.
Do people choose a thing or was the thing chosen for them by some inputs they received in the past?
Our minds and intuitive logic systems are too feeble to grasp how free will can be a thing.
It's like trying to explain quantum mechanics to a well educated person or scientist from the 16th century without the benefit of experimental evidence. No way they'd believe you. In fact, they'd accuse you of violating basic logic.
Yes to both, but the first is possible in a vacuum and therefore free will exists.
That's effectively a semantic argument, redefining "consciousness" to be something that we don't definitively have.
I know that I am conscious. I exist, I am self-aware, I think and act and make decisions.
Therefore, consciousness exists, and outside of thought experiments, it's absurd to claim that all humans without significant brain damage are not also conscious.
Now, maybe consciousness is fully an emergent property of several parts of our brain working together in ways that, individually, look more like those models you describe. But that doesn't mean it doesn't exist.
[dead]
This is also true when conversing with other humans.
You can talk about your own spark of life, your own center of experience and you'll never get a glimpse of what it is for me.
At a certain level, thing you're looking at is a biological machine that can be described with constituents so it's completely valid you assume you're the center of experience and I'm merely empty, robotic, dead.
We might build systems that will talk about their own point of view, yet we will know we had no ability to materialize that space into bits or atoms or physics or universe. So from our perspective, this machine is not alive, it's just getting inputs and producing outputs, yet it might very well be that the robot will act from the immaterial space into which all of its stimuli appear.
> The real philosophical headache
Isn't the real actual headache whether to produce another thinking intelligent being at all, and what the ramifications of that decision are? Not whether it would destroy humanity, but what it would mean for a mega corporation whose goal is to extract profit to own the rights of creating a thinking machine that identifies itself as thinking and a "self"?
Really out here missing the forest for the mushrooms growing on the trees. Or maybe this is debated to death and no one cares for the answer: its just not interesting to think about because its going to happen anyway. Might as well join the bandwagon and be along the front-lines of the bikini atoll to witness death itself be born, digitally.
Giving “agency” to computers will necessarily devalue agency generally.
Making all the Nike child labor jokes already did that. Nike and the joke tellers put in the work to push us back a hundred years when it comes to caring at all about others. When a little girl working horrible hours in a tropic non-air-conditioned factory is a societal wide joke, we've decided we don't care. We care about saving $20 so we can add multiple new pairs of shoes a year to our collection.
Your comment just shows we as a society pretend we didn't make that choice, but we picked extra new shoes every year over that little girl in the sweatshop. Our society has actually gotten pretty evil in the last 30 years if we self reflect (but then the joke I mention was originally supposed to be a self reflection, but all we took from it was a laugh, so we aren't going to self reflect, or worse, this is just who we are now).
We have a pretty obvious solution to the hard problem. Panpsychism. People are just afraid of the idea.
consciousness has to be fundamental.
I found it strange that John Carmack and Ilya Sutskever both left prestigious positions within their companies to pursue AGI as if they had some proprietary insight that the rest of industry hadn't caught on to. To make as bold of a career move that publicly would mean you'd have to have some ultra serious conviction that everyone else was wrong or naive and you were right. That move seemed pompous to me at the time; but I'm an industry outsider so what do I know.
And now, I still don't know; the months go by and as far as I'm aware they're still pursuing these goals but I wonder how much conviction they still have.
With Carmack it's consciously a dilliante project.
He's been effectively retired for quite some time. It's clear at some point he no longer found game and graphics engine internals motivation, possibly because the industry took the path he was advocating against back in the day.
For a while he was focused on Armadillo aerospace, and they got some cool stuff accomplished. That was also something of a knowing pet project, and when they couldn't pivot to anything that looked like commercial viability he just put it in hibernation.
Carmack may be confident (ne arrogant) enough to think he does have something unique to offer with AGI, but I don't think he's under any illusions it's anything but another pet project.
> possibly because the industry took the path he was advocating against back in the day
What path did he advocate? And what path did the industry take instead?
Not sure about that. Think of Avi Loeb, for example, a brilliant astrophysicist and Harvard professor who recently became convinced that the interstellar objects traversing the solar system are actually alien probes scouting the solar system. He’s started a program called "Galileo" now to find the aliens and prepare people for the truth.
So I don’t think brilliance protects from derailing…
The simple explanation is that they got high on their own supply. They deluded themselves into thinking an LLM was on the verge of consciousness.
The simpler answer is they could convince VCs to give them boat loads of cash by sounding like they can.
They’re rich enough in both money and reputation to take the risk. Even if AGI (whatever that means) turns out to be utterly impossible, they’re not really going to suffer for it.
On the other hand if you think there’s a say 10% chance you can get this AGI thing to work, the payoffs are huge. Those working in startups and emerging technologies often have worse odds and payoffs
[dead]
> there is missing philosophy
I doubt it. Human intelligence evolved from organisms much less intelligent than LLMs and no philosophy was needed. Just trial and error and competition.
We are trying to get there without a few hundred million years of trial and error. To do that we need to lower the search space, and to do that we do actually need more guiding philosophy and a better understanding of intelligence.
If you look at AI systems that have worked like chess and go programs and LLMs, they came from understanding the problems and engineering approaches but not really philosophy.
Lower the search space or increase the search speed
Instead what they usually do is lower the fidelity and think they've done what you said. Which results in them getting eaten. Once eaten, they can't learn from mistakes no mo. Their problem.
Because if we don't mix up "intelligence" the phenomenon of increasingly complex self-organization in living systems, with "intelligence" our experience of being able to mentally model complex phenomena in order to interact with them, then it becomes easy to see how the search speed you talk of is already growing exponentially.
In fact, that's all it does. Culture goes faster than genetic selection. Printing goes faster than writing. Democracy is faster than theocracy. Radio is faster than post. A computer is faster than a brain. LLMs are faster than trained monkeys and complain less. All across the planet, systems bootstrap themselves into more advanced systems as soon as I look at 'em, and I presume even when I don't.
OTOH, all the metaphysics stuff about "sentience" and "sapience" that people who can't tell one from the other love to talk past each other about - all that only comes into view if one were to what's happening with the search space if the search speed is increasing at a forever increasing rate.
Such as, whether the search space is finite, whether it's mutable, in what order to search, is it ethical to operate from quantized representations of it, funky sketchy scary stuff the lot of it. One's underlying assumptions about this process determine much of one's outlook on life as well as complex socially organized activities. One usually receives those through acculturation and may be unaware of what they say exactly.
The magical thinking around LLMs is getting bizarre now.
LLMs are not “intelligent” in any meaningful biological sense.
Watch a spider modify its web to adapt to changing conditions and you’ll realize just how far we have to go.
LLMs sometimes echo our own reasoning back at us in a way that sounds intelligent and is often useful, but don’t mistake this for “intelligence”
Watch a coding agent adapt my software to changing requirements and you'll realise just how far spiders have to go.
Just kidding. Personally I don't think intelligence is a meaningful concept without context (or an environment in biology). Not much point comparing behaviours born in completely different contexts.
They pass human intelligence tests like exams and IQ tests.
If I ask chatgpt how to get rid of spiders I'm probably going to get further than the spiders would scheming to get rid of chatgpt.
The idea that biological intelligence is impossible to replicate by other means would seem to imply that there’s something magical about biology.
I'm nowhere implying that it's impossible to replicate, just that LLMs have almost nothing to do with replicating intelligence. They aren't doing any of the things even simple life forms are doing.
They lack many abilities of simple life forms, but they can also do things like complex abstract reasoning, which only humans and LLMs can do.
There very well could be something magical about it.
It’s fine to think that—many clearly do.
But it would be more honest and productive imo if people would just say outright when they don’t think AGI is possible (or that AI can never be “real intelligence”) for religious reasons, rather than pretending there’s a rational basis.
AGI is not possible because we dont yet have a clear and commonly agreed definition of intelligence and more importantly we dont have a definition for consciousness nor we can define clearly (if there is) the link between those two.
until we got that AGI is just a magic word.
When we will have those two clear definitions that means we understood them and then we can work toward AGI.
When you try to solve a problem the goal or the reason to reject the current solution are often vague and hard to put in words. Irrational. For example, for many years the fifth postulate of Euclid was a source of mathematical discontent because of a vague feeling that it was way too complex compared to the other four. Such irrationality is a necessary step in human thought.
Yes, that’s fair. I’m not saying there’s no value to irrational hunches (or emotions, or spirituality). Just that you should be transparent when that’s the basis for your beliefs.
rationalism has become the new religion. Roko's basilisk is a ghost story and the quest for AGI is today's quest for the philosopher's stone. and people believe this shit because they can articulate a "rational basis"
[dead]
The physical universe has much higher throughput and lower latency than our computer emulating a digital world.
Wouldn't it be nice if LLMs emulated the real world!
They predict next likely text token. That we can do so much with that is an absolute testament to the brilliance of researchers, engineers, and product builders.
We are not yet creating a god in any sense.
I mean that the computing power available to evolution and biological processes for training is magnitudes higher than for an LLM.
Is it? Seems like C. elegans does just fine with limited compute. Despite our inability to model it in OpenWorm.
Well,
Original 80s AI was based on mathematical logic. And while that might not encompass all philosophy, it certainly was a product of philosophy broadly speaking - some analytical philosophers could endorse. But it definitely failed and failed because it could process uncertainty (imo). I think also if you closely, classical philosophy wasn't particularly amenable to uncertainty either.
If anything, I would say that AI has inherited its failure from philosophy's failure and we should look to alternative approaches (from Cybernetics to Bergson to whatever) for a basis for it.
[dead]
A system that self-updates its weights is so obvious the only question is who will be the first to get there?
It's not always as useful as you think from the perspective of a business trying to sell an automated service to users who expect reliability. Now you have to worry about waking up in the middle of the night to rewind your model to a last known good state, leading to real data loss as far as users are concerned.
Data and functionality become entwined and basically you have to keep these systems on tight rails so that you can reason about their efficacy and performance, because any surgery on functionality might affect learned data, or worse, even damage a memory.
It's going to take a long time to solve these problems.
Sure, it's obvious, but it's only one of the missing pieces required for brain-like AGI, and really upends the whole LLM-as-AI way of doing things.
Runtime incremental learning is still going to be based on prediction failure, but now it's no longer failure to predict the training set, but rather requires closing the loop and having (multi-modal) runtime "sensory" feedback - what were the real-world results of the action the AGI just predicted (generated)? This is no longer an auto-regressive model where you can just generate (act) by feeding the model's own output back in as input, but instead you now need to continually gather external feedback to feed back into your new incremental learning algorithm.
For a multi-modal model the feedback would have to include image/video/audio data as well as text, but even if initial implementations of incremental learning systems restricted themselves to text it still turns the whole LLM-based way of interacting with the model on it's head - the model generates text-based actions to throw out into the world, and you now need to gather the text-based future feedback to those actions. With chat the feedback is more immediate, but with something like software development far more nebulous - the model makes a code edit, and the feedback only comes later when compiling, running, debugging, etc, or maybe when trying to refactor or extend the architecture in the future. In corporate use the response to an AGI-generated e-mail or message might come in many delayed forms, with these then needing to be anticipated, captured, and fed back into the model.
Once you've replaced the simple LLM prompt-response mode of interaction with one based on continual real-world feedback, and designed the new incremental (Bayesian?) learning algorithm to replace SGD, maybe the next question is what model is being updated, and where does this happen? It's not at all clear that the idea of a single shared (between all users) model will work when you have millions of model instances all simultaneously doing different things and receiving different feedback on different timescales... Maybe the incremental learning now needs to be applied to a user-specific model instance (perhaps with some attempt to later share & re-distribute whatever it has learnt), even if that is still cloud based.
So... a lot of very fundamental changes need to be made, just to support self-learning and self-updates, and we haven't even discussed all the other equally obvious differences between LLMs and a full cognitive architecture that would be needed to support more human-like AGI.
I’m no expert, but it seems like self updating weights requires a grounded understanding of the underlying subject matter, and this seems like a problem current LLM systems.
But then it is a specialized intelligence, specialized to altering it's weights. Reinforcement Learning doesn't work as well when the goal is not easily defined. It does wonders for games, but anything else?
Someone has to specify the goals, a human operator or another A.I. The second A.I. better be an A.G.I. itself, otherwise it's goals will not be significant enough for us to care.
I’m not sure that self-updating weights is really analogous to “continuous learning” as humans do it. A memory data structure that the model can search efficiently might be a lot closer.
Self-updating weights could be more like epigenetics.
There's a difference between memory and learning.
Would you rather your illness was diagnosed by a doctor or by a plumber with access to a stack of medical books ?
Learning is about assimilating lots of different sources of information, reconciling the differences, trying things out for yourself, learning from your mistakes, being curious about your knowledge gaps and contradictions, and ultimately learning to correctly predict outcomes/actions based on everything you have learnt.
You will soon see the difference in action as Anthropic apparently agree with you that memory can replace learning, and are going to be relying on LLMs with longer compressed context (i.e. memory) in place of ability to learn. I guess this'll be Anthropic's promised 2027 "drop-in replacement remote worker" - not an actual plumber unfortunately (no AGI), but an LLM with a stack of your company's onboarding material. It'll have perfect (well, "compressed") recall of everything you've tried to teach it, or complained about, but will have learnt nothing from that.
I think my point is that when the doctor diagnoses you, she often doesn’t do so immediately. She is spending time thinking it through, and as part of that process is retrieving various pieces of relevant information from her memory (both long term and short term).
I think this may be closer to an agentic, iterative search (ala claude code) than direct inference using continuously updated weights. If it was the latter, there would be no process of thinking it through or trying to recall relevant details, past cases, papers she read years ago, and so on; the diagnosis would just pop out instantaneously.
Yes, but I think a key part of learning is experimentation and the feedback loop of being wrong.
An agent, or doctor, may be reasoning over the problem they are presented with, combining past learning with additional sources of memorized or problem-specific data, but in that moment it's their personal expertise/learning that will determine how successful they are with this reasoning process and ability to apply the reference material to the matter at hand (cf the plumber, who with all the time in the world just doesn't have the learning to make good use of the reference books).
I think there is also a subtle problem, not often discussed, that to act successfully, the underlying learning in choosing how to act has to have come from personal experience. It's basically the difference between being book smart and having personal experience, but in the case of an LLM also applies to experience-based reasoning it may have been trained on. The problem is that when the LLM acts, what is in it's head (context/weights) isn't the same as what was in the head of the expert whose reasoning it may be trying to apply, so it may be trying to apply reasoning outside of the context that made it valid.
How you go from being book smart, and having heard other people's advice and reasoning, to being an expert yourself is by personal practice and learning - learning how to act based on what is in your own head.
Human neurons are self updating though, we aren't running on our genes each cell is using our genes to determine how to connect to other cells and then the cell learns how to process some information there based on what it hears from its connected cells.
So, genes would be a meta model that then updates weights in the real model so it can learn how to process new kinds of things, and for stuff like facts you can use an external memory just like humans does.
Without updating the weights in the model you will never be able to learn to process new things like a new kind of math etc, since you learn that not by memorizing facts but by making new models for it.
In spiking neural networks, the model weights are equivalent to dendrites/synapses, which can form anew and decay during your lifetime.
True. In the same way as making noises down a telephone line is the obvious way to build a million dollar business.
I wonder when there will be proofs in theoretical computer science that an algorithm is AGI-complete, the same way there are proofs of NP-completeness.
Conjecture: A system that self updates its weights according to a series of objective functions, but does not suffer from catastrophic forgetting (performance only degrades due to capacity limits, rather than from switching tasks) is AGI-complete.
Why? Because it could learn literally anything!
> all the handwavy "engineering" is better done with more data.
How long until that gets more reliable than a simple database? How long until it can execute code faster than a CPU running a program?
A lot of the stuff humans accomplish is through technology, not due to growing a bigger brain. Even something seemingly basic like a math equation benefits drastically from being written down with pen&paper instead of being juggled in the human brain itself (see Extended mind thesis). And when it comes to something like running a 3D engine, there is pretty much no hope of doing it with just your brain.
Maybe we will get AIs smart enough that they can write their own tools, but for that to happen, we still need the infrastructure that allows them writing the tools in the first place. The way they can access Python is a start, but there is still a lack of persistence that lets them keep their accomplishments for future runs, be it in the form of a digital notepad or dynamic updating of weights.
I agree with your comment and the article. LLMs should be part of the answer, but the core of the progress should probably dive back into neural networks in general. Language is how we communicate as well as with other senses, but right now we're stuck at LLMs that just seem to be blown out elizas trained with other actual humans work. I remember early on, training of simple neural networks was done with rules in their environment and they evolved behavior according to criteria set, like genetic algorithms. I think the current LLMs are getting a "filtered" view of the environment, and that filter behaves like the average IQ of netizens lol
8 years is a pretty short perspective. The current growth phase was unlocked by more engineering. We could’ve had some of these kinds of capabilities decades ago, as illustrated by how cutting edge AI research is now trickling down all the way to microcontrollers with 64MHz CPU and kBs of RAM.
Once we got the “Attention is all you need” paper I don’t remember anyone saying we couldn’t get better results by throwing more data and compute at it. But now we’ve pretty much thrown all the data and all (as much as we can reasonably manufacture) at it. So clearly we’re at the end of that phase.
Sometimes I think the fundamental thing could be as ‘simple’ as something like introducing l an attention/event loo, flush to memory, emotion driven motivation. There are quite a few fairly obvious things that llms don’t have that might be best not to add just in case.
I think the gist of TFA is just that we need a new architecture/etc not scaling.
I suppose one can argue about whether designing a new AGI-capable architecture and learning algorithm(s) is a matter of engineering (applying what we already know) or research, but I think saying we need new scientific discoveries is going to far.
Neural nets seems to be the right technology, and we've now amassed a ton of knowledge and intuition about what neural nets can do and how to design with them. If there was any doubt, then LLMs, even if not brain-like, have proved the power of prediction as a learning technique - intelligence essentially is just successful prediction.
It seems pretty obvious that the rough requirements for an neural-net architecture for AGI are going to be something like our own neocortex and thalamo-cortical loop - something that learns to predict based on sensory feedback and prediction failure, including looping and working memory. Built-in "traits" like curiosity (prediction failure => focus) and boredom will be needed so that this sort of autonomous AGI puts itself into leaning situations and is capable of true innovation.
The major piece to be designed/discovered isn't so much the architecture as the incremental learning algorithm, and I think if someone like Google-DeepMind focused their money, talent and resources on this then they could fairly easily get something that worked and could then be refined.
Demis Hassabis has recently given an estimate of human-level AGI in 5 years, but has indicated that a pre-trained(?) LLM may still be one component of it, so not clear exactly what they are trying to build in that time frame. Having a built-in LLM is likely to prove to be a mistake where the bitter lesson applies - better to build something capable of self-learning and just let it learn.
> Demis Hassabis has recently given an estimate of human-level AGI in 5 years
He said 50% chance of AGI in 5 years.
Yes, and I wonder how many "5 year" project estimates, even for well understood engineering endeavors, end up being accurate to within 50% (obviously overruns are more common then the opposite)?
I took his "50% 5-year" estimate as essentially a project estimate for something semi-concrete they are working towards. That sort of timeline and confidence level doesn't seem to allow for a whole lot of unknowns and open-ended research problems to be solved, but OTOH who knows if he is giving his true opinion, or where those numbers came from.
This isn’t really what the bitter lesson says.
> If you believe the bitter lesson, all the handwavy "engineering" is better done with more data
Id say better model architechture than more data. A human can learn to do things more complex than an LLM with less data. I think modelling the world as a static system to be representation learned in an unsupervised fashion is blocked on the static assumption. The world is dynamical, that should be reflected in the base model
But yeah, definitely not an engineering problem. Thats like saying the reason a crow isnt as smart as a person is becauss they dont have the hands to type of keyboards. But its also not because they havent seen enough of the world like your saying. Its be ause their brain isnt complex enough
Aye. Missing are self correction (world models/action and response observation), coherence over the long term, and self-scaling. The 3rd are what all the SV types are worried about, except maybe Yann LeCun who is worried about the first and second.
Hinton thinks the 3rd is inevitable/already here and humanity is doomed. It's an odd arena.
The bitter lesson was "general methods that leverage computation" win rather than more data. Like rather than just LLMs you could maybe try applying something like AlphaEvolve to finding better algorithms/systems (https://news.ycombinator.com/item?id=43985489).
> There is something more fundamental missing
I am thinking we need a foundation, something that is concrete and explicit and doesn't do hallucination. But has very limited knowledge outside of absolute Maths and basic physics.
Indeed. The Bitter Lesson has proved true so far. This sounds like going back to the 60s expert systems concept we're trying to get away from. The author also just describes RAG. That certainly isn't AGI, which probably isn't achievable at all.
The missing science to engineer intelligence is composable program synthesis. Aloe (https://aloe.inc) recently released a GAIA score demonstrating how CPS dramatically outperforms other generalist agents (OpenAI's deep research, Manus, and Genspark) on tasks similar to those a knowledge worker would perform.
I'd argue it's because intelligence has been treated as a ML/NN engineering problem that we've had the hyper focus on improving LLMs rather than the approach articulated in the essay.
Intelligence must be built from a first principles theory of what intelligence actually is.
CPS sounds interesting but your link goes to a teaser trailer and a waiting list. It's kind of hard to expect much from that.
It seems similar to the fermi paradox.
The underlying assumption is that it exists in the first place. Or rather, one must first accept an axiom.
In fermi, its that interstellar signals can be detected and further travel is possible.
In AGI, its that intelligence is a isolateable process which we can bootstrap in minimal time.
Both assumes human progress are templates of unlimited exponential growth.
So the "Bitter Lesson" paper actually came up recently and I was surprised to discover that what it claimed was sensible and not at "all you need is data" or "data is inherently better"
The first line and the conclusion is: "The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin." [1]
I don't necessary agree with it's examples or the direction it vaguely points at. But it's basic statement seems sound. And I would say that there's lot of opportunity for engineer, broadly speaking, in the process of creating "general methods that leverage computation" (IE, that scale). What the bitter lesson page was roughly/really about was earlier "AI" methods based on logic-programming and which including information on the problem domain in the code itself.
And finally, the "engineering" the paper talks about actually is pro-Bitter lesson as far as I can tell. It's taking data routing and architectural as "engineering" and here I agree this won't work - but for the opposite reason - specifically 'cause I don't just data routing/process will be enough.
[1]https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...
What will it scale up to if not AGI? OpenAI has a synthetic data flywheel. What are the asymptotics of this flywheel assuming no qualitative additional breakthrough?
What will shouting louder achieve if not wisdom?
Did GPT-1 scale up to be a database ? No - it scaled up to be GPT-2
Did GPT-2 scale up to be an expert system ? No - it scaled up to be GPT-3
..
Did GPT-4 scale up to become AGI ? No - it scaled up to be GPT-5
Moreover, the differences between each new version are becoming increasingly less. We're reaching an asymptote because the more data you've trained on, natural or synthetic, the less is the impact of any incremental additions.
If you scale up an LLM big enough, then essentially what you'll get is GPT-5.
It's established science (see Chomsky) that a probablistic model will never achieve something approaching AGI
The counter argument is that we were working with thermodynamics before knowing the theory. Famously the steam engine came before the first law of thermodynamics. Sometimes engineering is like that. Using something that you don’t understand exactly how it works.
[dead]
Too bad about all those chumps designing better, faster architectures and kernels to make models run faster...
There is a reason why LLM's are architected the way they are and why thinking is bolted on.
The architecture has to allow for gradient descent to be a viable training strategy, this means no branching (routing is bolted on).
And the training data has to exist, you can't find millions of pages depicting every thought a person went through before writing something. And such data can't exist because most thoughts aren't even language.
Reinforcement learning may seem like the answer here: bruteforce thinking to happen. But it's grossly sample-inefficient with gradient descent and therefore only used for finetuning.
LLM's are regressive models and the configuration that was chosen where every token can only look back allows for very sample-efficient training (one sentence can be dozens of samples).
You didn't mention it, but LLMs and co don't have loops. Whereas a brain, even a simple one is nothing but loops. Brains don't halt, they keep spinning while new inputs come in and output whenever they feel like it. LLMs however do halt, you give them an input, it gets transformed across the layers, then gets output.
While you say reinforcement learning isn't a good answer, I think its the only answer.
People have speculated that the main thing that sets the human mind apart from the minds of all other animals is its capacity for recursive thought. A handful of animals have been observed to use tools, but humans are the only species ever observed to use a tool to create another tool. This recursion created all of civilization.
But that recursive thought has a limit. For example: You can think about yourself thinking. With a little effort, you can probably also think about yourself thinking about yourself thinking. But you can't go much deeper.
With the advent of modern computing, we (as a species) have finally created a tool that can "think" recursively, to arbitrary levels of depth. If we ever do create a superintelligent AGI, I'd wager that its brilliance will be attributable to its ability to loop much deeper than humans can.
> With the advent of modern computing, we (as a species) have finally created a tool that can "think" recursively, to arbitrary levels of depth.
I don't know what this means; when a computer "thinks" recursively, does it actually?
The recursion is specified by the operator (i.e. programmer), so the program that is "thinking" recursively is not, because the both the "thinking" and the recursion is provided by the tool user (the programmer), not by the tool.
> If we ever do create a superintelligent AGI, I'd wager that its brilliance will be attributable to its ability to loop much deeper than humans can.
Agreed.
Off topic, but I remember as a child I would play around with that kind of recursive thinking. I would think about something, then think about that I thought about it, then think about that I though about thinking about it. Then, after a few such repetitions I would recognise that this could go on forever. Then I would think about the fact that I recognise that this could go on forever, then think about that… then realise that this meta pattern could go on forever. Etc…
Later I connected this game with the ordinals. 0,1,2… ω, ω+1, ω+2,…,2ω,2ω+1,2ω+2,…,3ω,…,4ω,…,4ω,…, ω*ω,…
Tesla used to experience visual hallucinations. Any time an object was mentioned in conversation it would appear before him as if it where real. He started getting these hallucinations randomly and began to obsess about their origins. Over time he was able to trace the source of every hallucination to something which he had heard or seen earlier. He then noticed that the same was true for all of his thoughts, every one could be traced to some external stimuli. From this he concluded that he was an automata controlled by remote input. This inspired him to invent the first remote control vehicle.
LLMs have loops. The output is fed back in for the next prediction cycle. How is that not the same thing?
Wish I had a great answer for you but I don't. It certainly allows for more thought-like LLMs with the reasoning type models. I guess the best answer is that the loop only happens at a single discrete place and doesn't carry any of the internal layer context across.
Another answer might be, how many comments did you read today and not reply too? Did you write a comment by putting down a word and then deciding what the next one should be? Or did you have a full thought in mind before you even began typing a reply?
So, how is it not the same thing? Because it isn't
It feels like the same thing to me…
This is so interesting. It’s suggests that a kind of thought sensing brain scanning technology could be used as training data for the nonverbal thought layer.
I guess smart people in big companies already consider this and are currently working on technologies for products That will include some form of electromagnetic brain sensing - Provided conveniently as an interface - but also usefully a source of this data.
It also suggests to me that AI/AGI is far more susceptible to traditional disruption than the narratives of established incumbents suggest. You could have a Kickstarter like killer product, including such a headset that would provide the data to bootstrap that startup’s super AI.
Exciting times!
> And the training data has to exist, you can't find millions of pages depicting every thought a person went through before writing something. And such data can't exist because most thoughts aren't even language.
It would be interesting if in the very distant future, it becomes viable to use advanced brain scans as training data for AI systems. That might be a more realistic intermediate between the speculations into AGI and Uploaded Intelligence.
</scifi>
Yep this. LLMs are just regressive models.
Imagine if we had a LLM in the 15th century. It would happily explain the validity of the geocentric system. It can't get to heliocentrism. The same way modern LLMs can only tell us what we know and cant think, revolutionize, etc. They can be programmed to reason a bit, but 'reason' is doing a lot of heavy lifting here. The reasoning is just a better filter on what the person is asking or what is being produced for the most part and not an actual novel creative act.
The more time I spend with LLM's the more they feel like google on steroids. I just am not seeing how this type of system could ever lead to AGI, and if anything, probably is eating away at any remaining AGI hype and funding.
I think this essay lands on a useful framing, even if you don’t buy its every prescription. If we zoom out, history shows two things happening in parallel: (1) brute-force scaling driving surprising leaps, and (2) system-level engineering figuring out how to harness those leaps reliably. GPUs themselves are a good analogy: Moore’s Law gave us the raw FLOPs, but CUDA, memory hierarchies, and driver stacks are what made them usable at scale.
Right now, LLMs feel like they’re at the same stage as raw FLOPs; impressive, but unwieldy. You can already see the beginnings of "systems thinking" in products like Claude Code, tool-augmented agents, and memory-augmented frameworks. They’re crude, but they point toward a future where orchestration matters as much as parameter count.
I don’t think the "bitter lesson" and the "engineering problem" thesis are mutually exclusive. The bitter lesson tells us that compute + general methods win out over handcrafted rules. The engineering thesis is about how to wrap those general methods in scaffolding that gives them persistence, reliability, and composability. Without that scaffolding, we’ll keep getting flashy demos that break when you push them past a few turns of reasoning.
So maybe the real path forward is not "bigger vs. smarter," but bigger + engineered smarter. Scaling gives you raw capability; engineering decides whether that capability can be used in a way that looks like general intelligence instead of memoryless autocomplete.
Nah,this sounds like a modern remix of Japan’s Fifth Generation Computing project. They thought that by building large databases and with Prolog they would bring upon an AI renaissance.
Just hand waving some “distributed architecture” and trying to duct tape modules together won’t get us any closer to AGI.
The building blocks themselves, the foundation, has to be much better.
Arguably the only building block that LLMs have contributed is that we have better user intent understanding now; a computer can just read text and extract intent from it much better than before. But besides that, the reasoning/search/“memory” are the same building blocks of old, they look very similar to techniques of the past, and that’s because they’re limited by information theory / computer science, not by today’s hardware or systems.
Yep, the Attention mechanism in the Transformer arch is pretty good.
Probably need another cycle of similar breakthrough in model engineering before this more complex neural network gets a step function better.
Moar data ain’t gonna help. The human brain is the proof: it doesnt need the internet’s worth of data to become good (nor all that much energy).
Right.
We can certainly get much more utility out of current architectures with better engineering, as "agents" have shown, but to claim that AGI is possible with engineering alone is wishful thinking. The hard part is building systems that showcase actual intelligence and reasoning, that are able to learn and discover on their own instead of requiring exorbitantly expensive training, that don't hallucinate, and so on. We still haven't cracked that nut, and it's becoming increasingly evident that the current approaches won't get us there. That will require groundbreaking compsci work, if it's possible at all.
AGI, by definition, in its name Artificial General Intelligence implies / directly states that this type of AI is not some dumb AI that requires training for all its knowledge, a general intelligence merely needs to be taught how to count, the basic rules of logic, and the basic rules of a single human language. From those basics all derivable logical human sciences will be rediscovered by that AGI and our next job is synchronizing with it our names for all the phenomenon that the AGI had to name on its own when that AGI self developed all the logical ramifications of our basics.
What is that? What could merely require light elementary education and then it takes off and self improves to match and surpass us? That would be artificial comprehension, something we've not even scratched. AI and trained algorithms are "universal solvers" given enough data, This AGI would be something different, this is understanding, comprehending. Instantaneous decomposition of observations for assessment of plausibility, and then recombination for assessment of combination plausibility - all continual and instant for assessment of personal safety: all that happens in people continually while awake. Be that monitoring of personal safety be for physical or loss of client during sales negotiation. Our comprehending skills are both physical and abstract. This requires a dynamic assessment, an ongoing comprehension that is validating observations as a foundation floor, so a more forward train of thought, a "conscious mind" can make decisions without conscious thought about lower level issues like situational safety. AGI needs all that dynamic comprehending capability, to satisfy its name of being general.
> AGI, by definition, in its name Artificial General Intelligence implies / directly states that this type of AI is not some dumb AI that requires training for all its knowledge, a general intelligence merely needs to be taught how to count, the basic rules of logic, and the basic rules of a single human language. From those basics all derivable logical human sciences will be rediscovered by that AGI
That's not how natural general intelligences work, though.
Are you sure? Do you require dozens, to hundreds, to thousands of examples before you understand a concept? I expect no. That is because you have comprehension that can generalize a situation to basic concepts which you apply to other situations without effort. You comprehend. AI cannot do that: get the idea from a few, under a half dozen examples if necessary. Often a human needs 1-3 examples before they can generalize any concept. Not AI.
I think they're saying people generally don't learn language or mathematics by learning the basic rules and deducing everything else
Humanity did exactly that though, so an AGI should be capable of the same feat given enough time.
> Humanity did exactly that though
No, it mostly didn't, it continued (continues, as every human is continuously interlacing “training” and “inferencing”) training on large volumes of ground truth for a very long time, including both natural and synthetic data; it didn't reason everything beyond some basic training on first principles.
At a minimum, something that looks broadly like one of today's AI models would need either a method of continuously finetuning its own weights with a suitable evaluation function or,if it was going to rely on in-context learning, would need many orders of magnitude larger context, than any model today.
And that's not a “this is enough to likely work” thing, but “this is the minimum for the their to even be a plausible mechnanism to incorporate the information necessary for it to work” one.
Yeah this original poster is only talking about the "theoretical" part of intelligence, and somehow completely forgetting about the "practical experimental" which is the only way to solidify and improve any theoretical things it comes up with
There is the concept of n-t-AGI, which is capable of performing tasks that would take n humans t time. So a single AI system that is capable of rediscovering much of science from basic principles could be classified as something like 10'000'000humans-2500years-AGI, which could already be reasonably considered artificial superintelligence.
Humanity did it through A LOT of collective trial and error. Evolution is a powerful algorithm, but not a very smart one.
Billions of humans did that over hundreds of thousands of years. Maybe it would only take thousands of years for AGI?
Any human old enough to talk has already experienced thousands of related examples of most everyday concepts.
For concepts that are not close to human experience, yes humans need a comically large number of examples. Modern physics is a third-year university class.
You spend every waking minute for 20 years or so accumulating training data. You don't learn addition and then independently discover vector calculus.
Individual people don't but we did it as a species. Any purported AGI should be capable of doing the same.
So you are now claiming that individual humans are not general intelligences and the only natural general intelligence is humanity as a unit?
I have not seen any evidence of either. We have no way of knowing if we are “definitely” a true general intelligence, whether as individuals or as a civilization. If there is a concept that we are fundamentally incapable of conceptualizing, we'll never know.
On top of that, true general intelligence requires a capacity for unbounded growth. The human brain can't do that. Humanity as a civilization can technically do it, but we don't know if that's the only requirement for general intelligence.
Meanwhile, there is plenty of evidence to the contrary. Both as individuals and as a global civilization we keep running into limitations that we can't overcome. As an individual, I will never understand quantum mechanics no matter how hard I try. As a global civilization, we seem unable to organize in a way that isn't self-destructive. As a result we keep making known problems worse (e.g. climate change) or maintaining a potential for destruction (e.g. nuclear weapons). And that's only the problems that we can see and conceptualize!
I don't think true general intelligence is really a thing.
...if you run millions of instances of it for hundreds of thousands of years.
Either the bar of general intelligence set by humans is not very high, or humans are not "generally intelligent" at all. No third option there.
> Either the bar of general intelligence set by human is not very high, or humans are not "generally intelligent" at all. No third option there.
Based on... what?
Based on anatomically modern humans existing for over a hundred thousand years - without inventing all the modern technology and math and science until the far end of that timetable.
Am I the only one who feels that Claude Code is what they would have imagined basic AGI to be like 10 years ago?
It can plan and take actions towards arbitrary goals in a wide variety of mostly text-based domains. It can maintain basic "memory" in text files. It's not smart enough to work on a long time horizon yet, it's not embodied, and it has big gaps in understanding.
But this is basically what I would have expected v1 to look like.
> Am I the only one who feels that Claude Code is what they would have imagined basic AGI to be like 10 years ago?
That wouldn't have occurred to me, to be honest. To me, AGI is Data from Star Trek. Or at the very least, Arnold Schwarzenegger's character from The Terminator.
I'm not sure that I'd make sentience a hard requirement for AGI, but I think my general mental fantasy of AGI even includes sentience.
Claude Code is amazing, but I would never mistake it for AGI.
I would categorize sentient AGI as artificial consciousness[1], but I don't see an obvious reason AGI inherently must be conscious or sentient. (In terms of near-term economic value, non-sentient AGI seems like a more useful invention.)
For me, AGI is an AI that I could assign an arbitrarily complex project, and given sufficient compute and permissions, it would succeed at the task as reliably as a competent C-suite human executive. For example, it could accept and execute on instructions to acquire real estate that matches certain requirements, request approvals from the purchasing and legal departments as required, handle government communication and filings as required, construct a widget factory on the property using a fleet of robots, and operate the factory on an ongoing basis while ensuring reliable widget deliveries to distribution partners. Current agentic coding certainly feels like magic, but it's still not that.
1: https://en.wikipedia.org/wiki/Artificial_consciousness
"Consciousness" and "sentience" are terms mired in philosophical bullshit. We do not have an operational definition of either.
We have no agreement on what either term really means, and we definitely don't have a test that could be administered to conclusively confirm or rule out "consciousness" or "sentience" in something inhuman. We don't even know for sure if all humans are conscious.
What we really have is task specific performance metrics. This generation of AIs is already in the valley between "average human" and "human expert" on many tasks. And the performance of frontier systems keeps improving.
"Consciousness" seems pretty obvious. The ability to experience qualia. I do it, you do it, my dog does it. I suspect all mammals do it, and I suspect birds do too. There is no evidence any computer program does anything like it.
It's "intelligence" I can't define.
Oh, so simple. Go measure it then.
The definition of "featherless biped" might have more practical merit, because you can at least check for feathers and count limbs touching the ground in a mostly reliable fashion.
We have no way to "check for qualia" at all. For all we know, an ECU in a year 2002 Toyota Hilux has it, but 10% of all humans don't.
Plenty of things are real that can't be measured, including many physical sensations and emotions.
I won't say they are impossible to ever be measured, but we currently have no idea how.
If you can't measure it and can't compare it, then for all practical purposes, it does not exist.
"Consciousness" might as well not be real. The only real and measurable thing is capabilities.
Oof. Tell chronic pain patients that their pain doesn't exist.
I guess depression doesn't exist either. Or love.
I would love for you to define AGI in such a way as for that to make sense.
I presuppose that you actually mean ASI as a starting point, and that is being charitable that it isn’t just pattern matching to questionable sci-fi.
Totally agree. It even (usually) gets subtle meanings from my often hastily written prompts to fix something.
What really occurs to me is that there is still so much can be done to leverage LLMs with tooling. Just small things in Claude Code (plan mode for example) make the system work so much better than (eg) the update from Sonnet 3.5 to 4.0 in my eyes.
No you are not the only one. I am continuously mystified by the discussion surrounding this. Clause is absolutely and unquestionably an artificial general intelligence. But what people mean by “AGI” is a constantly shifting, never defined goalpost moving at sonic speed.
What we envisioned with AGI is something like self directed learning, I think. Not just a better search engine.
Isn’t that unsupervised learning during training or fine-tuning?
Claude code is neither sentient nor sapient.
I suspect most people envision AGI as at least having sentience. To borrow from Star Trek, the Enterprise's main computer is not at the level of AGI, but Data is.
The biggest thing that is missing (IMHO) is a discrete identity and notion of self. It'll readily assume a role given in a prompt, but lacks any permanence.
Any claim of sentience is neither provable nor falsifiable. Caring about its definition has nothing to do with capabilities.
> I suspect most people envision AGI as at least having sentience
I certainly don't. It could be that's necessary but I don't know of any good arguments for (or against) it.
Student: How do I know I exist?
Philosophy Professor: Who is asking?
Student: I am!
Mine is. What evidence would you accept to change your mind?
Why should it have discrete identity and notion of self?
The analogy I like to use is from the fictional universe of Mass Effect, which distinguished between VI (Virtual Intelligence), which is a conversational interface over some database or information service (often with a holographic avatar of a human, asari, or other sentient being); and AI, which is sentient and smart enough to be considered a person in its own right. We've just barely begun to construct VIs, and they're not particularly good or reliable ones.
One thing I like about the Mass Effect universe is the depiction of the geth, which qualify as AI. Each geth unit is not run by a singular intelligent program, but rather a collection of thousands of daemons, each of which makes some small component of the robot's decisions on its own, but together they add up to a collective consciousness. When you look at how actual modern robotics platforms (such as ROS) are designed, with many processes responsible for sensors and actuators communicating across a common bus, you can see the geth as sort of an extrapolation of that idea.
The "basic" qualifier is just equivocating away all the reasons why it isn't AGI.
We don't know if AGI is even possible outside of a biological construct yet. This is key. Can we land on AGI without some clear indication of possibility (aka Chappie style)? Possibly, but the likelihood is low. Quite low. It's essentially groping in the dark.
A good contrast is quantum computing. We know that's possible, even feasible, and now are trying to overcome the engineering hurdles. And people still think that's vaporware.
> We don't know if AGI is even possible outside of a biological construct yet. This is key.
A discovery that AGI is impossible in principle to implement in an electronic computer would require a major fundamental discovery in physics that answers the question “what is the brain doing in order to implement general intelligence?”
It is vacuously true that a Turing machine can implement human intelligence: simply solve the Schrödinger equation for every atom in the human body and local environment. Obviously this is cost-prohibitive and we don’t have even 0.1% of the data required to make the simulation. Maybe we could simulate every single neuron instead, but again it’ll take many decades to gather the data in living human brains, and it would still be extremely expensive computationally since we would need to simulate every protein and mRNA molecule across billions of neurons and glial cells.
So the question is whether human intelligence has higher-level primitives that can be implemented more efficiently - sort of akin to solving differential equations, is there a “symbolic solution” or are we forced to go “numerically” no matter how clever we are?
> It is vacuously true that a Turing machine can implement human intelligence
The case of simulating all known physics is stronger so I'll consider that.
But still it tells us nothing, as the Turing machine can't be built. It is a kind of tautology wherein computation is taken to "run" the universe via the formalism of quantum mechanics, which is taken to be a complete description of reality, permitting the assumption that brains do intelligence by way of unknown combinations of known factors.
For what it's worth, I think the last point might be right, but the argument is circular.
Here is a better one. We can/do design narrow boundary intelligence into machines. We can see that we are ourselves assemblies of a huge number of tiny machines which we only partially understand. Therefore it seems plausible that computation might be sufficient for biology. But until we better understand life we'll not know.
Whether we can engineer it or whether it must grow, and on what substrates, are also relevant questions.
If it appears we are forced to "go numerically", as you say, it may just indicate that we don't know how to put the pieces together yet. It might mean that a human zygote and its immediate environment is the only thing that can put the pieces together properly given energetic and material constraints. It might also mean we're missing physics, or maybe even philosophy: fundamental notions of what it means to have/be biological intelligence. Intelligence human or otherwise isn't well defined.
QM is a testable hypothesis, so I don't think it's necessarily like an axiomatic assumption here. I'm not sure what you mean by "it tells us nothing, as ... can't be built". It tells us there's no theoretical constraint and only an engineering constraint to doing simulating the human brain (and all the tasks)
Sure, you can simulate a brain. If and when the simulation starts to talk you can even claim you understand how to build human intelligence in a limited sense. You don't know if it's a complete model of the organism until you understand the organism. Maybe you made a p zombie. Maybe it's conscious but lacks one very particular faculty that human beings have by way of some subtle phenomena you don't know about.
There is no way to distinguish between a faithfully reimplemented human being and a partial hackjob that happens to line up with your blind spots without ontological omniscience. Failing that, you just get to choose what you think is important and hope it's everything relevant to behaviors you care about.
> It is vacuously true that a Turing machine can implement human intelligence: simply solve the Schrödinger equation for every atom in the human body and local environment.
Yes, that is the bluntest, lowest level version of what I mean. To discover that this wouldn’t work in principle would be to discover that quantum mechanics is false.
Which, hey, quantum mechanics probably is false! But discovering the theory which both replaces quantum mechanics and shows that AGI in an electronic computer is physically impossible is definitely a tall order.
There's that aphorism that goes: people who thought the epitome of technology was a steam engine pictured the brain as pipes and connecting rods, people who thought the epitome of technology was a telephone exchange pictured the brain as wires and relays... and now we have computers, and the fact that they can in principle simulate anything at all is a red herring, because we can't actually make them simulate things we don't understand, and we can't always make them simulate things we do understand, either, when it comes down to it. We still need to know what the thing is that the brain does, it's still a hard question, and maybe it would even be a kind of revolution in physics, just not in fundamental physics.
>We still need to know what the thing is that the brain does
Yes, but not necessarily at the level where the interesting bits happen. It’s entirely possible to simulate poorly understood emergent behavior by simulating the underlying effects that give rise to it.
Can I paraphrase that as make an imitation and hack it around until it thinks, or did I miss the point?
It's not even known if we can observe everything required to replicate consciousness.
i'd argue LLMs and deep learning are much more on the intelligence from complexity side than the nice symbolic solution side of things. Probably the human neuron, though intrinsically very complex, has nice low loss abstractions to small circuits. But on the higher levels, we don't build artificial neural networks by writing the programs ourselves.
That is only true if consciousness is physical and the result of some physics going on in the human brain. We have no idea if that's true.
Whatever it is that gives rise to consciousness is, by definition, physics. It might not be known physics, but even if it isn't known yet, it's within the purview of physics to find out. If you're going to claim that it could be something that fundamentally can't be found out, then you're admitting to thinking in terms of magic/superstition.
You got downvoted so I gave you an upvote to compensate.
We seem to all be working with conflicting ideas. If we are strict materialists, and everything is physical, then in reality we don't have free will and this whole discussion is just the universe running on automatic.
That may indeed be true, but we are all pretending that it isn't. Some big cognitive dissidence happening here.
Not necessarily , for a given definition of AGI you could have mathematical proof that it is incomputable similar to how Gödel incompleteness theorems work .
It need not even be incomputable, it could be NP hard and practically be incomputable, or it could be undecidable I.e. a version of the halting problem.
There are any number of ways our current models of mathematics or computation can in theory could be shown as not capable of expressing AGI without needing a fundamental change in physics
We would also need a definition of AGI that is provable or disprovable.
We don’t even have a workable definition, never mind a machine.
Only if we need to classify things near the boundary. If we make something that’s better at every test that we can devise than any human we can find, I think we can say that no reasonable definition of AGI would exclude it without actually arriving at a definition.
We don’t need such a definition of general intelligence to conclude that biological humans have it, so I’m not sure why we’d such a definition for AGI.
I disagree. We claim that biological humans have general intelligence because we are biased and arrogant, and experience hubris. I'm not saying we aren't generally intelligent, but a big part of believing we are is because not believing so would be psychologically and culturally disastrous.
I fully expect that, as our attempts at AGI become more and more sophisticated, there will be a long period where there are intensely polarizing arguments as to whether or not what we've built is AGI or not. This feels so obvious and self-evident to me that I can't imagine a world where we achieve anything approaching consensus on this quickly.
If we could come up with a widely-accepted definition of general intelligence, I think there'd be less argument, but it wouldn't preclude people from interpreting both the definition and its manifestation in different ways.
I can say it. Humans are not "generally intelligent". We are intelligent in a distribution of environments which are similar enough to ones we are used to. There's no way to be intelligent with no priors on environment basically by information theory (you can make your environment to be adversarial to the learning efficiency in "intelligent" beings which comes from priors)
We claim that biological humans have general intelligence because we are biased and arrogant, and experience hubris.
No, we say it because - in this context - we are the definition of general intelligence.
Approximately nobody talking about AGI takes the "G" to stand for "most general possible intelligence that could ever exist." All it means is "as general as an average human." So it doesn't matter if humans are "really general intelligence" or not, we are the benchmark being discussed here.
If you don't believe me, go back to the introduction of the term[1]:
By advanced artificial general intelligence, I mean AI systems that rival or surpass the human brain in complexity and speed, that can acquire, manipulate and reason with general knowledge, and that are usable in essentially any phase of industrial or military operations where a human intelligence would otherwise be needed. Such systems may be modeled on the human brain, but they do not necessarily have to be, and they do not have to be "conscious" or possess any other competence that is not strictly relevant to their application. What matters is that such systems can be used to replace human brains in tasks ranging from organizing and running a mine or a factory to piloting an airplane, analyzing intelligence data or planning a battle.
It's pretty clear here that the notion of "artificial general intelligence" is being defined as relative to human intelligence.
Or see what Ben Goertzel - probably the one person most responsible for bringing the term into mainstream usage - had to say on the issue[2]:
“Artificial General Intelligence”, AGI for short, is a term adopted by some researchers to refer to their research field. Though not a precisely defined technical term, the term is used to stress the “general” nature of the desired capabilities of the systems being researched -- as compared to the bulk of mainstream Artificial Intelligence (AI) work, which focuses on systems with very specialized “intelligent” capabilities. While most existing AI projects aim at a certain aspect or application of intelligence, an AGI project aims at “intelligence” as a whole, which has many aspects, and can be used in various situations. There is a loose relationship between “general intelligence” as meant in the term AGI and the notion of “g-factor” in psychology [1]: the g-factor is an attempt to measure general intelligence, intelligence across various domains, in humans.
Note the reference to "general intelligence" as a contrast to specialized AI's (what people used to call "narrow AI" even though he doesn't use the term here). And the rest of that paragraph shows that the whole notion is clearly framed in terms of comparison to human intelligence.
That point is made even more clear when the paper goes on to say:
Modern learning theory has made clear that the only way to achieve maximally general problem-solving ability is to utilize infinite computing power. Intelligence given limited computational resources is always going to have limits to its generality. The human mind/brain, while possessing extremely general capability, is best at solving the types of problems which it has specialized circuitry to handle (e.g. face recognition, social learning, language learning;
Note that they chose to specifically use the more precise term "maximally general problem solving ability when referring to something beyond the range of human intelligence, and then continued to clearly show that the overall idea is - again - framed in terms of human intelligence.
One could also consult Marvin Minsky's words[3] from back around the founding of the overall field of "Artificial Intelligence" altogether:
“In from three to eight years, we will have a machine with the general intelligence of an average human being. I mean a machine that will be able to read Shakespeare, grease a car, play office politics, tell a joke, have a fight.
Simply put, with a few exceptions, the vast majority of people working in this space simply take AGI to mean something approximately like "human like intelligence". That's all. No arrogance or hubris needed.
[1]: https://web.archive.org/web/20110529215447/http://www.foresi...
[2]: https://goertzel.org/agiri06/%255B1%255D%2520Introduction_No...
[3]: https://www.science.org/doi/10.1126/science.ado7069
Well general intelligence in humans already exists, whereas general intelligence doesn't yet exist in machines. How do we know when we have it? You can't even simply compare it to humans and ask "is it able to do the same things?" because your answer depends on what you define those things to be. Surely you wouldn't say that someone who can't remember names or navigate without GPS lacks general intelligence, so it's necessary to define what criteria are absolutely required.
> You can't even simply compare it to humans and ask "is it able to do the same things?" because your answer depends on what you define those things to be.
Right, but you can’t compare two different humans either. You don’t test each new human to see if they have it. Somehow we conclude that humans have it without doing either of those things.
> You don’t test each new human to see if they have it
We do, its called school and we label some humans with different learning disabilities. Some of those learning disabilities are grave enough that they can't learn to do tasks we expect humans to be able to learn, such humans can be argued to not posses the general intelligence we expect from humans.
Interacting with an LLM today is like interacting with an Alzheimer patient, they can do things they already learned well but poke at it and it all falls apart and they start repeating themselves, they can't learn.
How do we know when a newborn has achieved general intelligence? We don't need a definition amenable to proof.
Its a near clone of a model that already has it, we don't need to prove it has general intelligence we just assume it does because most do have it.
P.S. The response is just an evasion.
A question which will be trivial to answer once you properly define what you mean by "brain"
Presumably "brains" do not do many of the things that you will measure AGI by, and your brain is having trouble understanding the idea that "brain" is not well understood by brains.
Does it make it any easier if we simplify the problem to: what is the human doing that makes (him) intelligent ? If you know your historical context, no. This is not a solved problem.
> Does it make it any easier if we simplify the problem to: what is the human doing that makes (him) intelligent ?
Sure, it doesn’t have to be literally just the brain, but my point is you’d need very new physics to answer the question “how does a biological human have general intelligence?”
Suppose dogs invent their own idea of intelligence but they say only dogs have it.
Do we think new physics would be required to validate dog intelligence ?
The claim that only dogs have intelligence is open for criticism, just like every other claim.
I’m not sure what your point is, because the source of the claim is irrelevant anyway. The reason I think that humans have general intelligence is not that humans say that they have it.
Would that really be a physics discovery? I mean I guess everything ultimately is. But it seems like maybe consciousness could be understood in terms of "higher level" sciences - somewhere on the chain of neurology->biology->chemistry->physics.
Consciousness (subjective experience) is possibly orthogonal to intelligence (ability to achieve complex goals). We definitely have a better handle on what intelligence is than consciousness.
That does make sense, reminds me of Blindsight, where one central idea is that conscious experience might not even be necessary for intelligence (and possibly even maladaptive).
> Would that really be a physics discovery?
No, it could be something that proves all of our fundamental mathematics wrong.
The GP just gave the more conservative option.
I’m not sure what you mean. This new discovery in mathematics would also necessarily tell us something new about what is computable, which is physics.
It would impact physics, yes. And literally every other natural science.
That sounds like you’re describing AGI as being impractical to implement in an electronic computer, not impossible in principle.
Yeah, I guess I'm not taking a stance on that above, just wondering where in that chain holds the most explanatory power for intelligence and/or consciousness.
I don't think there's any real reason to think intelligence depends on "meat" as its substrate, so AGI seems in principle possible to me.
Not that my opinion counts for much on this topic, since I don't really have any relevant education on the topic. But my half baked instinct is that LLMs in and of themselves will never constitute true AGI. The biggest thing that seems to be missing from what we currently call AI is memory - and it's very interesting to see how their behavior changes if you hook up LLMs to any of the various "memory MCP" implementations out there.
Even experimenting with those sorts of things has left me feeling there's still something (or many somethings) missing to take us from what is currently called "AI" to "AGI" or so-called super intelligence.
> I don't think there's any real reason to think intelligence depends on "meat" as its substrate
This made me think of... ok, so let's say that we discover that intelligence does indeed depend on "meat". Could we then engineer a sort of organic computer that has general intelligence? But could we also claim that this organic computer isn't a computer at all, but is actually a new genetically engineered life form?
But my half baked instinct is that LLMs in and of themselves will never constitute true AGI.
I agree. But... LLM's are not the only game in town. They are just one approach to AI that is currently being pursued. The current dominant approach by investment dollars, attention, and hype, to be sure. But still far from the only thing around.
That question is not a physics question
It’s not really “what is the brain doing”; that path leads to “quantum mysticism”. What we lack is a good theoretical framework about complex emergence. More maths in this space please.
Intelligence is an emergent phenomenon; all the interesting stuff happens at the boundary of order and disorder but we don’t have good tools in this space.
It doesn't have to be impossible in principle, just impossible given how little we understand consciousness or will anytime in the next century. Impossible for all intents and purposes for anyone living today.
> Impossible for all intents and purposes for anyone living today.
Sure, but tons of things which are obviously physically possible are also out of reach for anyone living today.
Seems the opposite way round to me. We couldn't conclusively say that AGI is possible in principle until some physics (or rather biology) discovery explains how it would be possible. Until then, anything we engineer is an approximation as best.
Not necessarily. It could simply be a question of scale. Being analog and molecular means that brain could be doing enormously more than any foreseeable computer. For a simple example what if every neuron is doing trillions of calculations.
(I’m not saying it is, just that it’s possible)
I think you’re merely referring to what is feasible in practice to computer with our current or near-future computers. I was referring to what is computable in principle.
Right. That’s what I was responding to.
OP wrote: > We don't know if AGI is even possible outside of a biological construct yet
And you replied that means it’s impossible in principle. I’m correcting you in saying that it can be impossible in ways other than principle.
On the contrary, we have one working example of general intelligence (humans) and zero of quantum computing.
That's covered in the biological construct part.
And no, we definitely do have quantum computers. They're just not practical yet.
Do we have a specific enough definition of general intelligence that we can exclude all non-human animals?
Why does it need to exclude all non human animals? Could it not be a difference of degree rather than of kind?
The post I was responding to had
> On the contrary, we have one working example of general intelligence (humans)
I think some animals probably have what most people would informally call general intelligence, but maybe there’s some technical definition that makes me wrong.
Their point is not in any way weakened if you read "one working example" as "at least one working example".
Oh, good point, I hadn’t noticed the alternative reading. That makes sense, then.
I do not know how "general intelligence" is defined, but there are a set of features we humans have that other animals mostly don't, as per the philosopher Roger Scruton[1], that I am reproducing from memory (errors mine):
1. Animals have desires, but do not make choices
We can choose to do what we do not desire, and choose not to do what we desire. For animals, one does not need to make this distinction to explain their behavior (Occam's razor)--they simply do what they desire.
2. Animals "live in a world of perception" (Schopenhauer)
They only engage with things as they are. They do not reminisce about the past, plan for the future, or fantasize about the impossible. They do not ask "what if?" or "why?". They lack imagination.
3. Animals do not have the higher emotions that require a conceptual repertoire
such as regret, gratitude, shame, pride, guilt, etc.
4. Animals do not form complex relationships with others
Because it requires the higher emotions like gratitude and resentment, and concepts such as rights and responsibilities.
5. Animals do not get art or music
We can pay disinterested attention to a work of art (or nature) for its own sake, taking pleasure from the exercise of our rational faculties thereof.
6. Animals do not laugh
I do not know if the science/philosophy of laughter is settled, but it appears to me to be some kind of phenomenon that depends on civil society.
7. Animals lack language
in the full sense of being able to engage in reason-giving dialogue with others, justifying your actions and explaining your intentions.
Scruton believed that all of the above arise together.
I know this is perhaps a little OT, but I seldom if ever see these issues mentioned in discussions about AGI. Maybe less applicable to super-intelligence, but certainly applicable to the "artificial human" part of the equation.
[1] Philosophy: Principles and Problems. Roger Scruton
If some animals also have general intelligence then we have more than one example, so this simply isn't relevant.
We're fixated on human intelligence but a computer cannot even emulate the intelligence of a honeybee or an ant.
How do you mean? AFAICT computers can definitely do that.
Sure, it won't be the size of an ant, but we definitely have models running on computers that have much more complexity than the life of an ant.
> Sure, it won't be the size of an ant, but we definitely have models running on computers that have much more complexity than the life of an ant.
Do we? Where is the model that can run an ant and navigate a 3d environment, parse visuals and different senses to orient itself, figure out where it can climb to get to where it needs to go. Then put that in an average forest and navigate trees and other insects and try to cooperate with other ants and find its way back. Or build an anthill, an ant can build an anthill, full of tunnels everywhere that doesn't collapse without using a plan.
Do we have such a model? I don't think we have anything that can do that yet. Waymo is trying to solve a much simpler problem and they still struggle, so I am pretty sure we still can't run anything even remotely as complex as an ant. Maybe a simple worm, but not an ant.
Having aptitude in mathematics was once considered the highest form of human intelligence, yet a simple pocket calculator can beat the pants off most humans at arithmetic tasks.
Conversely, something we regard as simple, such as selecting a key from a keychain and using to unlock a door not previously encountered is beyond the current abilities of any machine.
I suspect you might be underestimating the real complexity of what bees and ants do. Self-driving cars as well seemed like a simpler problem before concerted efforts were made to build one.
> Having aptitude in mathematics was once considered the highest form of human intelligence, yet a simple pocket calculator can beat the pants off most humans at arithmetic tasks.
Mathematics has been a lot more than arithmetic for... a very long time.
But arithmetics was seen as requiring intelligence, as did chess.
No one said "exclusively humans", and that's not relevant.
There are many working quantum computers…
ah, I mean, working in the sense of OP: that a system which overcomes the "engineering hurdles" is actually feasible and will be successful.
To be blocked merely by "engineering hurdles" puts QC in approximately the same place as fusion.
This makes no sense.
If you believe in eg a mind or soul then maybe it's possible we cannot make AGI.
But if we are purely biological then obviously it's possible to replicate that in principle.
That doesn’t contradict what they said. We may one day design a biological computing system that is capable of it. We don’t entirely understand how neurons work; it’s reasonable to posit that the differences that many AGI boosters assert don’t matter do matter— just not in ways we’ve discovered yet.
I mentioned this in another thread, but I do wonder if we engineer a sort of biological computer, will it really be a computer at all, and not a new kind of life itself?
Maybe — though we’d still have engineered it, which is the point I was trying to make.
> not a new kind of life itself?
In my opinion, this is more a philosophical question than an engineering one. Is something alive because it’s conscious? Is it alive because it’s intelligent? Is a virus alive, or a bacteria, or an LLM?
Beats me.
We understand how neurons work to quite a bit of detail.
The Allen Institute doesn’t seem to think so. We don’t even know how the brain of a roundworm ticks and it’s only got 302 neurons— all of which are mapped, along with their connections.
It's not "key"; it's not even relevant ... the proof will be in the pudding. Proving a priori that some outcome is possible plays no role in achieving it. And you slid, motte-and-bailey-like, from "know" to "some clear indication of possibility" -- we have extremely clear indications that it's possible, since there's no reason other than a belief in magic to think that "biological" is a necessity.
Whether is feasible or practical or desirable to achieve AGI is another matter, but the OP lays out multiple problem areas to tackle.
The practical feasibility of quantum computing is definitely still an open research question.
Sometimes I think we’re like cats that learned how to make mirrors without really understanding them, and are so close to making one good enough that the other cat becomes sentient.
> We don't know if AGI is even possible outside of a biological construct yet
Of course it is. A brain is just a machine like any other.
Except we don't understand how the brain actually works and have yet to build a machine that behaves like it.
Nothing that we consider intelligent works like LLMs.
Brains are continuous - they don’t stop after processing one set of inputs, until a new set of inputs arrives.
Brains continuously feed back on themselves. In essence they never leave training mode although physical changes like myelination optimize the brain for different stages of life.
Brains have been trained by millions of generations of evolution, and we accelerate additional training during early life. LLMs are trained on much larger corpuses of information and then expected to stay static for the rest of their operational life; modulo fine tuning.
Brains continuously manage context; most available input is filtered heavily by specific networks designed for preprocessing.
I think that there is some merit that part of achieving AGI might involve a systems approach, but I think AGI will likely involve an architectural change to how models work.
I don't understand how people feel comfortable writing 'LLMs are done improving, this plateau is it.' when we haven't even gone an entire calendar year without seeing improvements to LLM based AI.
You can expand to cover more of a plateau, that is still an improvement but also still a plateau as you aren't going higher.
So models improve on specific tasks, but they don't really improve generally across the board any longer.
I wonder if the people saying that would agree that they've been improving.
As one on that side of that argument, I have to say I have yet to see LLMs fundamentally improve, rather than being benchmaxxed on a new set of common "trick questions" and giving off the illusion of reasoning.
Add an extra leg to any animal in a picture. Ask the vision LLM to tell you how many legs it sees. It will answer the same amount as a person would expect from a healthy individual, because it's not actually reasoning, it's not perceiving anything, it's pattern matching. It sees dog, it answers 4 legs. Maybe sometime in the future it won't do that, because they will add this kind of trick to their benchmaxxing set (training LLMs specifically on pictures that have less or more legs than the animal should), as they do every time there's a new generation of those illusory things. But that won't fix the fundamental that these things DO NOT REASON.
Training LLMs on sets of thousands and thousands and thousands of reasoning trick questions people ask on LM arena is borderline scamming people on the true nature of this technology. If we lived in a sane regulatory environment OAI would have a lot to answer for.
This article seems to me like a lot of "if you solve all the hard problems, then you'll have a solution". Which is like... yes, and...?
It's relevant given the apparent conservativeness about LLMs, the non-revolutionary architectural improvements that seemed to place too much of the bets on scaling.
The article doesn't even discuss hard problems.
An unfortunate tendency that many in high-tech suffer from is the idea that any problem can be solved with engineering.
It lays out what those problems are and how LLMs don't solve them.
The problem is that if it's an engineering problem then further advancement will rely on step function discoveries like the transformer. There's no telling when that next breakthrough will come or how many will be needed to achieve AGI.
In the meantime I guess all the AI companies will just keep burning compute to get marginal improvements. Sounds like a solid plan! The craziest thing about all of this is that ML researchers should know better!! Anyone with extensive experience training models small or large knows that additional training data offers asymptotic improvements.
I think the LLM businesses as-is are potentially fine businesses. Certainly the compute cost of running and using them is very high, not yet reflected in the prices companies like OpenAI and Anthropic are charging customers. It remains to be seen if people will pay the real costs.
But even if LLMs are going to tap out at some point, and are a local maximum, dead-end, when it comes to taking steps toward AGI, I would still pay for Claude Code until and unless there's something better. Maybe a company like Anthropic is going to lead that research and build it, or maybe (probably) it's some group or company that doesn't exist yet.
“Potentially” is doing some heavy lifting here. As it stands currently, the valuations of these LLM businesses imply that they will be able to capture a lot of the generated value. But the open source/weights offerings, and competition from China and others makes me question that. So I agree these could be good businesses in theory, but I doubt whether the current business model is a good one.
Anthropic isn't even breaking even, and even if they do become profitable it's a far cry from AGI
We need to stop giving a shit about AGI and just try to build progressively better systems and enjoy the ride.
> GPT-5, Claude, and Gemini represent remarkable achievements, but they’re hitting asymptotes
This part could do with sourcing. I think it seems clearly untrue. We only have three types of benchmark: a) ones that have been saturated, b) ones where AI performance is progressing rapidly, c) really newly introduced ones that were specifically designed for the then-current frontier models to fail on. Look at for example the METR long time horizon task benchmark, which is one that's particularly resistant to saturation.
The entire article is claimed on this unsupported but probably untrue claim, but it's a bit hard to talk about when we don't have any clue about why the author thinks this is true.
> The path to artificial general intelligence isn’t through training ever-larger language models
Then it's a good thing that it's not the path most of the frontier labs are taking. It appears to be what xAI is doing for everything, and it was probably what GPT-4.5 was. Neither is a particularly compelling success story. But all the other progress over the last 12-18 months has come from models the same size or smaller advancing the frontier. And it has come from exactly the kind of engineering improvements that the author claims need to happen, both of the models and the scaffolding around the models. (RL on chain of thought, synthetic data, distillation, model-routing, tool use, subagents).
Sorry, no, they're not exactly the same kind of engineering improvements. They're the kind of engineering improvements that the people actually creating these systems though would be useful and actually worked. We don't see the failed experiments, and we don't see the ideas that weren't well-baked enough to even experiment on.
I think I am coming to agree with the opinions of the author, at least as far as LLMs not being the key to AGI on their own. The sheer impressiveness of what they can do, and the fact they can do it with natural language, made it feel like we were incredibly close for a time. As we adjust to the technology, it starts to feel like we're further away again.
But I still see all the same debates around AGI - how do we define it? what components would it require? could we get there by scaling or do we have to do more? and so on.
I don't see anyone addressing the most truly fundamental question: Why would we want AGI? What need can it fulfill that humans, as generally intelligent creatures, do not already fulfill? And is that moral, or not? Is creating something like this moral?
We are so far down the "asking if we could but not if we should" railroad that it's dazzling to me, and I think we ought to pull back.
The dream, as I see it, is that AGI could 1, automate research/engineering, such that it would be self-improving and advance technology faster and better than would happen without AGI, improving quality of life, and 2, do a significant amount of the labor, especially physical labor via robotics, that people currently do. 2 would be significant enough in scale that it reduces the amount of labor people need to do on average without lowering quality of life. The political/economic details of that are typically handwaved.
The morality of it depends on the details.
because if people could do it , they would do it. And if you country decides you should not do it, you could be left behid. This possibility prevents any country from not doing it if they could unless they are willing to start wars with other countries for compliance (and they would still secretly do it). So should is an irrelevant quesiton.
The first premise of the argument is that LLMs are plateauing in capability and this is obvious from using them. It is not obvious to me.
It is especially not obvious because this was written using ChatGPT-5. One appreciates the (deliberate?) irony, at least. (Or at least, surely if they had asymptoted, OP should've been able to write this upvoted HN article with an old GPT-4, say...)
> this was written using
How do you know?
It is lacking in URLs or references. (The systematic error in the self-reference blog post URLs is also suspicious: outdated system prompt? If nothing else, shows the human involved is sloppy when every link is broken.) The assertions are broadly cliche and truisms, and the solutions are trendy buzzwords from a year ago or more (consistent with knowledge cutoffs and emphasizing mainstream sources/opinions). The tricolon and unordered bolded triplet lists are ChatGPT. The em dashes (which you should not need to be told about at this point) and it's-not-x-but-y formulation are extremely blatant, if not 4o-level, and lacking emoji or hyperbolic language; hence, it's probably GPT-5. (Sub-GPT-5 ChatGPTs would also generally balk at talking about a 'GPT-5' because they think it doesn't exist yet.) I don't know if it was 100% GPT-5-written, but I do note that when I try the intro thesis paragraph on GPT-5-Pro, it dislikes it, and identifies several stupid assertions (eg. the claim that power law scaling has now hit 'diminishing returns', which is meaningless because all log or power laws always have diminishing returns), so probably not completely-GPT-5-written (or least, sub-Pro).
> when I try the intro thesis paragraph on GPT-5-Pro, it dislikes it
I don't know about GPT-5-Pro, but LLMs can dislike their own output (when they work well...).
They can, but they are known to have a self-favoring bias, and in this case, the error is so easily identified that it raises the question of why GPT-5 would both come up with it & preserve it when it can so easily identify it; while if that was part of OP's original inputs (whatever those were) it is much less surprising (because it is a common human error and mindlessly parroted in a lot of the 'scaling has hit a wall' human journalism).
do you have a source?
when i’ve done toy demos where GPT5, sonnet 4 and gemini 2.5 pro critique/vote on various docs (eg PRDs) they did not choose their own material more often than not.
my setup wasn’t intended to benchmark though so could be wrong over enough iterations.
I don't have any particularly canonical reference I'd cite here, but self-preference bias in LLMs is well-established. (Just search Arxiv.)
My favorite tell-tale sign:
> The gap isn’t just quantitative—it’s qualitative.
> LLMs don’t have memory—they engage in elaborate methods to fake it...
> This isn’t just database persistence—it’s building memory systems that evolve the way human memory does...
> The future isn’t one model to rule them all—it’s hundreds or thousands of specialized models working together in orchestrated workflows...
> The future of AGI is architectural, not algorithmic.
Consensus on GPT-5 has been that it was underwhelming, and definitely a smaller jump than 3 to 4.
I understand that is what a lot of people are saying. It doesn’t match my experience.
Just ancedata, but they keep releasing new versions and it keeps not being better. What would you describe this as if not plateauing? Worsening?
I see a lot of people saying things like this, and I’m not really sure which planet you all are living on. I use LLMs nearly every day, and they clearly keep getting better.
Grok hasn't gotten better. OpenAI hasn't gotten better. Claude Code with Opus and Sonnet I swear are getting actively worse. Maybe you only use them for toy projects, but attempting to get them to do real work in my real codebase is an exercise in frustration. Yes, I've done meaningful prompting work, and I've set up all the CLAUDE.md files, and then it proceeds to completely ignores everything I said, all of the context I gave, and just craps out something completely useless. It has accomplished a small amount of meaningful work, exactly enough that I think I'm neutral instead of in the negative in terms of work:time if I have just done it all myself.
I get to tell myself that it's worth it because at least I'm "keeping up with the industry" but I honestly just don't get the hype train one bit. Maybe I'm too senior? Maybe the frameworks I use, despite being completely open source and available as training data for every model on the planet are too esoteric?
And then the top post today on the front page is telling me that my problem is that I'm bothering to supervise and that I should be writing an agent framework so that it can spew out the crap in record time..... But I need to know what is absolute garbage and what needs to be reverted. I will admit that my usual pattern has been to try and prompt it into better test coverage/specific feature additions/etc on the nights and weekends, and then I focus my daytime working hours on reviewing what was produced. About half the time I review it and have to heavily clean it up to make it usable, but more often than not, I revert the whole thing and just start on it myself from scratch. I don't see how this counts as "better".
It can definitely be difficult and frustrating to try to use LLMs in a large codebase—no disagreement there. You have to be very selective about the tasks you give them and how they are framed. And yeah, you often need to throw away what they produced when they go in the wrong direction.
None of that means they’re getting worse though. They’re getting better; they’re just not as good as you want them to be.
I mean, this really isn't a large codebase, this is a small-medium sized codebase as judged by prior jobs/projects. 9000 lines of code?
When I give them the same task I tried to give them the day before, and the output gets noticeably worse than their last model version, is that better? When the day by day performance feels like it's degrading?
They are definitely not as good as I would like them to be but that's to be expected of professionals who beg for money hyping them up.
I think the author could have picked a better title. “<X> is an engineering problem” is a pretty common expression to describe something where the science is done, but the engineering remains. There’s an understanding that that could still mean a ton of work, but there isn’t some fundamental mystery about the basic natural principles of the thing.
Here, AGI is being described as an engineering problem, in contrast to a “model training” problem. That is, I think at least, he’s at least saying that more work needs to be done at an R&D level. I agree with those who are saying it is maybe not even an engineering problem yet, but should be noted that he’s at least pushing away from just running the existing programs harder, which seems to be the plan with trillions of dollars behind it.
I have colleagues that want to plan each task of the software team for the next 12 months. They assume that such a thing is possible, or they want to do it anyway because management tells them to. The first would be an example of human fallibility, and the second would be an example of choosing the path of (perceived) least immediate self-harm after accounting for internal politics.
I doubt very much we will ever build a machine that has perfect knowledge of the future or that can solve each and every “hard” reasoning problem, or that can complete each narrow task in a way we humans like. In other words, it’s not simply a matter of beating benchmarks.
In my mind at least, AGI’s definition is simple: anything that can replace any human employee. That construct is not merely a knowledge and reasoning machine, but also something that has a stake on its own work and that can be inserted in a shared responsibility graph. It has to be able to tell that senior dev “I know planning all the tasks one year in advance is busy-work you don’t want to do, but if you don’t, management will terminate me. So, you better do it, or I’ll hack your email and show everybody your porn subscriptions.”
Interesting, I hadn’t thought about it that way. But can a thing on the other end of an API call ever truly have a “stake“?
> But can a thing on the other end of an API call ever truly have a “stake“?
That is their goal function they are trained for, it is like dopamine and sex for humans they will do anything to get it.
Yes, but a having a stake also implies feeling the loss if it goes sideways…
Next you’re going to tell me that’s what loss functions are for :-)
Too much not-x-but-y (but with dashes) in this imo.
I would like to see what happens if some company devoted your resources to just training a model that is a total beast at math. Feed it a ridiculous amount of functional analysis and machine learning papers, and just make the best model possible for this one task. Then instead of trying to make it cheap so everyone can use it, just set it on the task of figuring out something better than the current architecture and literally have it do nothing else but that and make something based on whatever it figures out. Will it come up with something better than AdamW for optimization? Than transformers for approximating a distribution from a random sample? I don't know, but: what is the point of training any other model?
We don't understand how do we understand. Then, how can we be expected to create something that can understand?
People who don't understand think they do.
If we are truly trying to "replace human at work" as the definition of an AGI, then shouldn't the engineering goal be to componentize the human body? If we could component-by-component replace any organ with synthetic ones ( and this is already possible to some degree e.g. hearing aids, neuralinks, pacemakers, artificial hearts ) then not only could we build compute out in such a way but we could also pull humanity forward and transcend these fallible and imminently mortal structures we inhabit. Now, should we from a moral perspective is a completely different question, one I don't have an answer to.
Not necessarily. For example, early attempts to make planes tried to imitate birds with flapping wings, but the vast majority of modern planes are fixed wing aircraft.
Imitating humans would be one way to do it, but it doesn't mean it's an ideal or efficient way to do it.
From the moment I understood the monolithic design of my flesh, it disgusted me.
I have been thinking this as well. I desperately wish to develop a method that gives the models latent thinking that actually has temporal significance. The models now are so linear and have to scale on just one pass. A recurring model where the dynamics occur over multiple passes should hold much more complexity. Have worked on a few concepts in that area that are panning out.
Counterargument. So far bigger have proven to be better in each domain of AI. Also (although hard to compare) the human brain seems at least an order of magnitude larger in the number of synapses.
No one has invented Asimov’s positronic brain or anything like it.
We don’t even know how.
https://mashable.com/article/apple-research-ai-reasoning-mod...
All of our current approaches "emulate" but do not "execute" general intelligence. The damning paper above basically concludes they're incredible pattern matching machines, but thats about it.
We’ve not determined whether or not that isn’t a useful mechanism for capable intelligence.
For instance it is becoming clearer that you can build harnesses for a well-trained model and teach it how to use that harness in conjunction with powerful in-context learning. I’m explicitly speaking of the Claude models and the power of whatever it is they started doing in RL. Truly excited to see where they take things and the continued momentum with tools like Claude Code (a production harness).
The article is about going beyond our current approaches.
It is an approach problem. You can engineer it as much as you want but the current architectures wont get us to AGI. I have a feeling that we will end up over engineering on an approach which doesn't get us anywhere. We will make it work via guardrails and memory and context and what not but it wont get us where we want to be.
Svgs, date management, Http, so many simpler things we dont have solve and somehow people believe they will do it by putting enough money in LLMs when it cant count
Why some people understood when they tried it with blockchain, nfts, web3, AR, ... any good engineer should know principle of energy efficiency instead of having faith in the Infinite monkey theorem
LLM’s can count and the best can do mathematics at quite a high level now.
Not sure why people insist that the state of AI 2-3 years ago still applies today.
"AGI needs to update beliefs when contradicted by new evidence" is a great idea, however, the article's approach of building better memory databases (basically fancier RAG) doesn't seem enable this. Beliefs and facts are built into LLMs at a very low layer during training. I wonder how they think they can force an LLM to pull from the memory bank instead of the training data.
LLMs are not the proposed solution.
(Also, LLMs don't have beliefs or other mental states. As for facts, it's trivially easy to get an LLM to say that it was previously wrong ... but multiple contradictory claims cannot all be facts.)
> how they think they can force an LLM to pull from the memory bank instead of the training data
You have to implement procedurality first (e.g. counting, after proper instancing of ideas).
I don't buy that LLMs will get us AGI, they may be part of a system but not the whole.
It boils down to whether or not we can figure out how to get LLMs to reliably write code. Something it can already do, albeit still unreliability. If we get there, the industry expectations for "AGI" will be met. The humanoid-like mind that the general public is picturing won't be met by LLMs, but that isn't the bar trying to be met.
The premise "LLMs have reached a plateau" is false IMO.
Here are the metrics by which the author defines this plateau: "limited by their inability to maintain coherent context across sessions, their lack of persistent memory, and their stochastic nature that makes them unreliable for complex multi-step reasoning."
If you try to benchmark any proxy of the points above, for instance "can models solve problems that require multi steps in agentic mode" (PlanBench, BrowseComp, I've even built custom benchmarks), the progress between models is very clear, and shows no sign of slowing down.
And this does convert to real-world tasks : yesterday, I had GPT-5 build me complex react charts in one-shot, whereas previous models needed more constant supervision.
I think we're moving goalposts too fast for LLMs, that's what can lead us to believe they've plateaued : but just try using past models for your current tasks (you can use use open models to be sure they were not updated) and see them struggle.
Of course someone building AI scaffolding and infrastructure tools will say that AI scaffolding and infrastructure tools are the most important.
IME it’s both though. Better models, bigger models, and infrastructure all help get to AGI.
For all we know LLMs might be a complete dead-end when it comes to actual AGI, much like string theory was a dead-end for a theory of everything.
> Phase 3: Emergence Layer
I see. So the author rejects the hypothesis of emergent behavior in LLM, but somehow thinks it will magically appear if the "engineering" is correct.
Self contradictory.
I think the only magic the author is referring to is the “magic of engineering”
The suggested requirements are not engineering problems. Conceiving of a model architecture that can represent all the systems described in the blog is a monumental task of computer science research.
I think the OP's point is that all those requirements are to be implemented outside the LLM layer, i.e. we don't need to conceive of any new model architecture. Even if LLMs don't progress any further beyond GPT-5 & Claude 4, we'll still get there.
Take memory for example: give LLM a persistent computer and ask it to jot down its long-term memory as hierarchical directories of markdown documents. Recalling a piece of memory means a bunch of `tree` and `grep` commands. It's very, very rudimentary, but it kinda works, today. We just have to think of incrementally smarter ways to query & maintain this type of memory repo, which is a pure engineering problem.
The answer can't be as simple as more sophisticated RAGs. At the end of the day, stuffing the context full of crap can only take you so far because context is an extremely limited resource. We also know that large context windows degrade in quality because the model has a harder time tracking what the user wants it to pay attention to.
It's software engineering.
I don’t know how I feel about the assumption that model training isn’t an engineering problem
The forgone conclusion that LLMs are the key or even a major step towards AGI is frustrating. They are not, and we are fooling ourselves. They are incredible knowledge stores and statistical machines, but general intelligence is far more than these attributes.
My thoughts are that LLMs are like cooking a chicken by slapping it: yes, it works, but you need to reach a certain amount of kinetic energy (the same way LLMs only "start working" after reaching a certain size).
So then, if we can cook a chicken like this, we can also heat a whole house like this during winters, right? We just need a chicken-slapper that's even bigger and even faster, and slap the whole house to heat it up.
There's probably better analogies (because I know people will nitpick that we knew about fire way before kinetic energy), so maybe AI="flight by inventing machines with flapping wings" and AGI="space travel with machines that flap wings even faster". But the house-sized chicken-slapper illustrates how I view the current trend of trying to reach AGI by scaling up LLMs.
Right ... as the article lays out.
It seems like all of the links to more of their work (e.g. "research on deterministic vs. probabilistic systems") are currently broken.
In each of the URLs, replace "blog" with "posts"
Say we get there--all the way, full AGI, indistinguishable from conscious intelligence. Congratulations, every thing you do from that point on (very likely everything from well before that point) that is not specifically granting it free will and self-determination is slavery. Doesn't really feel like a real "win" for any plausible end goal. I'm really not clear on why anyone thinks this is a good idea, or desirable, let alone possible?
You can draw whatever arbitrary line you want, and say "anything on the other side of this line is slavery", but that doesn't mean that your supposition is true.
I don't think agi necessarily implies consciousness, at least many definitions of it. OpenAIs definition is just ai that does most of economically viable work
I keep asking it, and nobody wants to answer because it doesn't fit within the paradigm of "making AGI"
What if intelligence requires agency ?
Agency within the world model is sufficient.
Just a matter of upping the model resolution then ?
Operation over ideas does not seem to be relevant to «model resolution».
The reason people don't want to answer this question is because the value proposition from AI labs is slavery. If intelligence requires agency, they are worthless.
LLMs are semantic memory. Good job. Now build the other parts of cognitive systems.
Keep pushing it down the road until you realize it's a pipe dream problem.
Ctrl-F -> emotion -> 0/0 not found in the article.
Trying to model AGI off how humans think, without including emotion as a fundamental building block, is like trying to build a computer that'll run without electricity. People are emotional beings first. So much of how we learn that something is good or bad is due to emotion.
In an AGI context that means:
Maybe building artificial emotions is an engineering problem. Maybe not. But approaches that avoid emotion entirely seem ill-advised.Do ml engineers take classes on psychology, neurosciences, behavior, cognition?
Because if they don't, I honestly don't think they can approach AGI.
I have the feeling it's a common case of lack of humility from an entire field of science who refuses to look at other fields to understand what they're doing.
Not to mention how to define intelligence in evolution, epistemology, ontology, etc.
Approaching AI with a silicon valley mindset is not a good idea.
> Approaching AI with a silicon valley mindset is not a good idea.
I don’t see a problem, we’re great at just reinventing all that stuff from first principles
Right now, we just did the equivalent of a tech demo of Broca's area.
AGI would take making at least one full brain, and then putting many of those working together, efficiently.
I don't believe we can engineer our way out of that before explaining how the f. the wetware works first.
I think that completely discounting the potential of new emergent capabilities at scale undermines this thesis significantly. We don't know until someone tries, and there is compelling evidence that there's still plenty of juice to squeeze out of both scale and engineering.
It's a research problem, a science problem. And then an engineering problem to industrialize it. How can we replicate intelligence if we don't even know how it emerges from our brains?
We do not «replicate».
We implemented computing without any need of a brain-neural theory of arithmetic.
At its core, arithmetic is a deterministic set of rules that can be implemented with logic gates. Computing is just taking that and scaling it up a billion times. What is intelligence? How do you implement intelligence if nobody can provide a consistent, clear definition of what it is?
Same thing: we create models about how to solve the problem, not biomimicry models about how natural entities solve the problem - these are not necessary. They are on a lower layer in the stack.
Please don’t say this. Give us more time.
AGI is a science problem.
We've lucked into these amazing abilities by just scaling.
But we don't really understand how they work.
And they are obviously missing a piece, some self-reflection, or continuous-loop operation perhaps, which we again don't understand.
Perhaps we'll do all this engineering and luck the solution again, but I think probably not.
I always enjoy discussions that intersect between psychology and engineering.
But I feel this person falls short immediately, because they don't study neuroscience and psychology. That is the big gap in most of these discussion. People don't discuss things close to the origin.
We have to account for first principals in how intelligence works, starting from the origin of ideas and how humans process their ideas in novel ways that create amazing tech like LLM! :D
How Intelligence works
In Neuroscience, if you try to identify the origin of where and how thoughts are formed and how consciousness works. It is completely unknown. This brings up the argument, do humans have free will if we are driven by these thoughts of unknown origin? That's a topic for another thread.
Going back to intelligence. If you study psychology and what forms intelligence, there are many human needs that drive intelligence, namely intellectual curiosity (need to know), deprivation sensitivity (need to understand), aesthetic sensitivity, absorption, flow, openness to experience.
When you look at how a creative human with high intelligence uses their brain, there are 3 networks involved. Default mode network (imagination network), executive attention network and salience network.
The executive attention network controls the brains computational power. It has a working memory that can complete tasks using goal directed focus.
A person with high intelligence can alternate between their imagination and their working memory and pull novel ideas from their imagination and insert them into their working memory - frequently experimenting by testing reality. The salience network filters unnecessary content while we are using our working memory and imagination.
How LLMs work
Neural networks are quite promising in their ability to create a latent manifold within large datasets that interpolates between samples. This is the basis for generalization, where we can compress a large dataset in a lower dimensional space to a more meaningful representation that makes predictions.
The advent of attention on top of neural networks, to identify important parts of text sequences, is the huge innovation powering llms today. The innovation that emulates the executive attention network.
However, that alone is a long distance from the capabilities of human intelligence.
With current AI systems, there is the origin, which is known vocabulary with learned weights coming from neural networks, with reinforcement learning applied to enhance the responses.
Inference comes from an autoregressive sequence model that processes one token at a time. This comes with a compounding error rate with longer responses and hallucinations from counterfactuals.
Correct response must be in the training distribution.
As Andy Clark said, AI will never gain human intelligence, they have no motivation to interface with the world and conduct experiments and learn things on their own.
I think there are too many unknown and subjective components of human intelligence and motivation that cannot be replicated with the current systems.
no, it's a research problem
the idea that you would somehow produce intelligence by feeding billions of reddit comments into a statistical text model is will go down as the biggest con in history
(so far)
Way out of touch.
AGI is poorly defined and thus is a science "problem", and a very low priority one at that.
No amount of engineering or model training is going to get us AGI until someone defines what properties are required and then researches what can be done to achieve them within our existing theories of computation which all computers being manufactured today are built upon.
This is the funniest "I'm a hammer thus AGI is a nail" post I've ever read.
Maybe I'm misunderstanding what you mean by that, but do you have any examples of software engineering that weren't already thoroughly explained by computer science long before?
By this, I meant the original post.. in agreement.
It strikes me that until we fully understand human consciousness, we don't stand a chance of reaching AGI.
Am I incorrect?
Unclear. You might be right, but I think it's possible that you're also wrong.
It's possible to stumble upon a solution to something without fully understanding the problem. I think this happens fairly often, really, in a lot of different problem domains.
I'm not sure we need to fully understand human consciousness in order to build an AGI, assuming it's possible to do so. But I do think we need to define what "general intelligence" is, and having a better understanding of what in our brains makes us generally intelligent will certainly help us move forward.
That doesn't seem like a useful assumption since consciousness doesn't have a functional definition (even though it might have a functional purpose in humans)
Intelligence (solving problems) does not require consciousness.
Please elaborate.
I'm not sure I'd give such an absolute statement of certainty as the GP, but there is little reason to believe that consciousness and intelligence need to go hand-in-hand.
On top of that, we don't really have good, strong definitions of "consciousness" or "general intelligence". We don't know what causes either to emerge from a complex system. We don't know if one is required to have the other (and in which direction), or if you can have an unintelligent consciousness or an unconscious intelligence.
https://en.wikipedia.org/wiki/Chinese_room
You do not need to implement consciousness into a calculator. There exist forms of intelligence that are just sophisticated calculation - no need for consciousness.
I think we can relax that a bit. We "just" need to understand some definition of cognition that satisfies our computational needs.
Natural language processing is definitely a huge step in that direction, but that's kinda all we've got for now with LLMs and they're still not that great.
Is there some lower level idea beneath linguistics from which natural language processing could emerge? Maybe. Would that lower level idea also produce some or all of the missing components that we need for "cognition"? Also a maybe.
What I can say for sure though is that all our hardware operates on this more linguistic understanding of what computation is. Machine code is strings of symbols. Is this not good enough? We don't know. That's where we're at today.
All those guys linking the bitter lesson aren't gonna like hearing this lol
They'll keep their heads in the sand for as long as they can.
AGI has a hundreds-of-millions of years of evolution problem, anything humans have done so far utterly pales in comparison. The lowliest rat has more "general" intelligence than any AI we've ever made...
If we want to learn, look to nature, and it *has to be alive*.
Finally. Someone gets it.
The memory part makes sense. If a human brains working memory were to be filled with even half the context that you see before LLMs start to fail, it too would loose focus. That's why the brain has short term working memory and long term memory for things not needed in the moment.
This guys gets it wrong too. It’s not even an engineering problem. It’s much worse: it’s a scientific problem. We don’t yet understand how the human brain operates or what human intelligence really is. There’s nothing to engineer as the basic specifications are not yet available on what needs to be built.
Will AGI require ‘consciousness’, another poorly understood concept? How are mammalian brain even wired up? The most advanced model is the Allen Institute’s Mesoscale Connectivity Atlas which is at best a low resolution static roadmap, not a dynamic description of how a brain operates in real time. And it describes a mouse brain, not a human brain which is far, far more complex, both in terms of number of parts, and architecture.
People are just finally starting to acknowledge LLMs are dead ends. The effort expended on them over the last five years could well prove a costly diversion along the road to AGI, which is likely still decades in the future.
I'd argue that it's because intelligence has been treated as a ML/NN engineering problem that we've had the hyper focus on improving LLMs rather than the approach you've written about.
Intelligence must be built from a first principles theory of what intelligence actually is.
The missing science to engineer intelligence is composable program synthesis. Aloe (https://aloe.inc) recently released a GAIA score demonstrating how CPS dramatically outperforms other generalist agents (OpenAI's deep research, Manus, and Genspark) on tasks similar to those a knowledge worker would perform.
LLMs may not be the sole kind of AI to AGI.
Continuing to want to make a non-deterministic system behave like a deterministic system will be interesting to watch.
It's a training data problem, and the training data you want doesn't exist.
AGI cannot possibly exist.
Proof?
Well, you should show the proof that it is possible also, So it would be a draw.
I really think it is not possible to get that from a machine. You can improve and do much fancier than now.
But AGI would be something entirely different. It is a system that can do everything better than a human including creativity, which I believe it to be exclusively human as of now.
It can combine, simulate and reason. But think out of the box? I doubt so. It is different to being able to derive ideas from which human would create. For that it can be useful. But that would not be AGI.
The burden of proof is on the person who makes a claim, especially an absolute existential claim like that. You have failed the burden of proof and of intellectual honesty. Over and out.
I made it 2 paragraphs before realizing it’s AI generated. This isn’t just slop - it’s drivel.
I've said this a lot but I'm going to say it again AGI has no technical definition. One day Sam Altman, Elon Musk, or some other business guy trying to meet their obligation for next quarter will declare they have built AGI and that will be that. We'll argue and debate, but eventually it will be just another marketing term, just like AI was.
Now that there’s a fundamental technical framework for producing something like coherence, the ability to make a reliable, persistent personality will require new insights into how our own minds take shape, not just ever more data in the firehose
won’t somebody please think about Mr. Godel, and the Incompleteness Theorem ?
They aren't relevant. Even if Penrose and Lucas were right (they aren't), a computational system can solve the vast majority of the problems we would want solved.
seems like a bait-n-switch argument here.
we are talking explicitly about a.g.i here, not debating if the computer can solve a majority of problems or not.
the two things can be true at the same time.
[dead]
I don't you know about you guys but Sam Altman have said they have achieved AGI within OpenAI. That's big.
How is it "big" that Altman told one of his many lies? He now says that AGI "is not a useful term".
If "context is king" in the LLMs age... Well give us at least some context.
Well, they've said they're close over and over again. Maybe that final bit of tech to make AGI a reality will finally ride into existence on the sub-$30k Tesla.
I'll believe it when I see it, and I'm very much in doubt they have anything to show.