Do you think LLM can THINK? Overcoming current AI limitations
The Road to AGI: why Scaling isn't enough and what comes next
Welcome to part 3 of the series “Do you think LLM can THINK?“. So far we have discussed the hype and the Transformer architecture.
Artificial General Intelligence (AGI) remains the holy grail of AI research, representing the ability of a machine to understand or learn any intellectual task that a human being can.
While recent advancements in Generative AI have brought us closer to this goal, it's becoming increasingly clear that current approaches have inherent limitations that must be addressed to achieve true AGI.
A wall that brute force cannot overcome
As I tried to explain previously, there is a three-body problem that is gradually exposing an uncomfortable truth.
A few months ago, a new ARC-AGI challenge was launched. With a prize pool of over $1,000,000, Kaggle is inviting everyone to join this public competition to beat, and open-source a solution to, the ARC-AGI benchmark, hosted by Mike Knoop (co-founder of Zapier) and François Chollet (creator of ARC-AGI and Keras).
AGI progress has stalled
This is the starting point of this competition, and also a sad reality.
It is true that the recent releases of GPT-4o and Claude Sonnet changed the game again and raised the bar for Generative AI performance. LLMs today are amazing and outperform us on a wide range of tasks. So why is François Chollet claiming that AGI progress has hit a wall?
Do we really know what is the road ahead? Do we really know how to build an AGI?
The suffocating hype around the supposedly untouchable power of scaling laws and the need for ever more compute is a major source of my skepticism.
Stop telling us how great it is, and start giving us concrete and measurable examples. Not examples that fall apart with the tiniest bit of scrutiny.
I was reading some comments on Gary Marcus’s Substack and found this one quite honest:
I’m also seeing more shrill “AI-good-cause-washing”, where genAI is being pushed as the solution to society’s various ills. Eric Schmidt’s “AI will solve climate crisis” is perhaps the most absurd and high profile example, but I attended a healthcare conference last week where nVidia were pushing their foundation model insta-solutions to the various challenges in the drug discovery industry. This was met by much weary eye rolling from executives for whom this isn’t their first such rodeo, but perhaps plays better in the general media. I take all this as signs that things are slowing down worryingly for the world’s largest tech companies.
LLMs, at least as they are designed now, cannot reach AGI.
The Scaling Dilemma
For years, the AI community has been fixated on the idea that scaling up models—increasing parameter counts and training data sizes—would inevitably lead to more intelligent systems. This approach, underpinned by the concept of scaling laws, has indeed yielded impressive results, but it's now clear that this path has its limits.
A report by Apple researchers recently stated that LLMs do not perform genuine reasoning, instead relying on memorization and pattern recognition. This finding underscores a critical flaw: these models lack the ability to understand and reason about information in a way that humans do.
People aren’t ignoring GenAI - they are waiting to see if it will work.
Here is a pair of striking statistics from a recent poll of the CNBC Technology Executive Council, conducted in October:
79% of respondents said they had tried Microsoft Copilot. That’s tremendous, given how new the product is (and a clear refutation of any “people are ignoring AI” narrative).
But only 25% of respondents thought it was worth it. (Another quarter said the modest $30/month cost was not worth it, and the remaining half said it was too soon to tell.)
Large Language Models, although powerful tools, have a few fundamental issues:
They struggle with sequential, time-dependent data. This is a huge limitation, considering that cause-and-effect is a fundamental reasoning paradigm.
Lack of observability: even knowing that the Multi-Layer Perceptron (MLP) blocks are believed to store much of a model’s factual knowledge, we are still unable to observe where an LLM grounds its statements.
They have quadratic computational complexity in the sequence length. This means that as the input sequence grows, the amount of computation required grows with the square of the sequence length.
The scaling-law illusion: scaling-up and shaping-up strategies are approaching saturation. Increasing the pre-training data (much of it synthetic) requires enormous, and not yet available, compute power, only in the hope that something better will emerge. Expanding post-training is giving promising results with Small Language Models, but without changing the fundamental skills of the model itself.
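The quadratic-complexity point above is easy to see in code. Here is a minimal sketch of single-head scaled dot-product attention in NumPy (a toy illustration, not any particular model's implementation): the score matrix alone has one entry per pair of tokens, so both time and memory grow with the square of the sequence length.

```python
import numpy as np

def naive_attention(q, k, v):
    """Single-head scaled dot-product attention (no batching, no masking).

    q, k, v: arrays of shape (seq_len, d). The score matrix is
    (seq_len, seq_len) -- this is the quadratic part.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                        # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax rows
    return weights @ v                                   # (n, d)

n, d = 128, 16
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(q, k, v)
print(out.shape)              # (128, 16)
print((2 * n) ** 2 / n ** 2)  # doubling n quadruples the score matrix: 4.0
```

Doubling the context window from 128 to 256 tokens quadruples the size of the score matrix, which is why long-context Transformers are so expensive.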
The fact is this: these issues are hardcoded into the architecture, so, as already proclaimed, there is only one way to get past them:
start over, from scratch.
There is an extensive analysis done by Ignacio de Gregorio: you can read more here if you like.
An architecture shift is to be expected
Checking out the newest arXiv papers and the AI community forums, you can feel a growing excitement about what comes next. And, at least there, no one is blindly following the hype.
A new awareness is guiding pioneering studies of new language model architectures, going beyond GPT:
Were RNNs All We Needed? — Borealis AI and Mila (Université de Montréal) revisit traditional recurrent neural networks (RNNs) from over a decade ago: LSTMs (1997) and GRUs (2014).
The intent is clearly to overcome the scalability limitations of Transformers with respect to sequence length. This known limitation has renewed interest in recurrent sequence models that are parallelizable during training. As a result, many novel recurrent architectures, such as S4, Mamba, and Aaren, have been proposed that achieve comparable performance.
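To make the idea concrete, here is a rough sketch of the minGRU update as I understand it from the "Were RNNs All We Needed?" paper (weight names and shapes are my own illustrative choices): the gate and candidate depend only on the current input, not on the previous hidden state, so the recurrence takes the affine form h_t = (1 - z_t) * h_{t-1} + z_t * h̃_t, which a parallel prefix scan can compute for all timesteps at once during training.

```python
import numpy as np

def min_gru_step(h_prev, x, Wz, Wh):
    """One minGRU-style step (illustrative sketch, not the paper's code).

    Unlike a classic GRU, the gate z and candidate h_tilde depend only
    on the input x, not on h_prev. The update is therefore affine in
    h_prev, which is what makes a parallel scan possible in training.
    """
    z = 1.0 / (1.0 + np.exp(-(x @ Wz)))  # input-only sigmoid gate
    h_tilde = x @ Wh                     # input-only candidate state
    return (1.0 - z) * h_prev + z * h_tilde

rng = np.random.default_rng(1)
d_in, d_h, T = 8, 4, 16
Wz = rng.standard_normal((d_in, d_h))
Wh = rng.standard_normal((d_in, d_h))
xs = rng.standard_normal((T, d_in))

h = np.zeros(d_h)
for x in xs:           # sequential reference loop; training would
    h = min_gru_step(h, x, Wz, Wh)  # replace this with a parallel scan
print(h.shape)         # (4,)
```

The sequential loop here is just the inference-time reference; the whole point of the input-only gating is that training no longer has to run it step by step.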
RWKV: Reinventing RNNs for the Transformer Era — a huge group of researchers from all over the world came up with a similar solution more than a year ago, giving birth to RWKV.
Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scaling in memory and computational requirements but struggle to match the same performance as Transformers due to limitations in parallelization and scalability. We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.
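To illustrate the "efficient inference of RNNs" half of that claim, here is a deliberately simplified toy (this is NOT RWKV's actual formulation, just an exponential-decay state update of my own invention): each new token folds into a fixed-size state, so generating token n+1 costs the same as generating token 2, instead of re-attending over the whole history.

```python
import numpy as np

def recurrent_readout(keys, values, decay=0.9):
    """O(n) sequential read-out with a fixed-size state.

    Toy update s_t = decay * s_{t-1} + k_t * v_t: memory per step is
    constant in the sequence length, which is the property that makes
    RNN-mode inference cheap compared with full attention.
    """
    state = np.zeros(keys.shape[-1])
    outputs = []
    for k, v in zip(keys, values):
        state = decay * state + k * v   # constant-size state update
        outputs.append(state.copy())
    return np.stack(outputs)

rng = np.random.default_rng(2)
n, d = 64, 8
outs = recurrent_readout(rng.standard_normal((n, d)),
                         rng.standard_normal((n, d)))
print(outs.shape)   # (64, 8)
```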
Don’t you think it is time to check out all the amazing things going on around us?
Conclusions: Learning to use GenAI is only the beginning
The GPT series of LLMs is indeed an amazing breakthrough, and with it, data manipulation and processing have become remarkably easy.
And you know how much enthusiasm I have for Generative AI: every week I write new articles with practical code, ideas, and tutorials on how to use LLMs.
But to me, it looks like they are still only a stepping stone toward a further, bigger revolution. And not knowing the foundations is never a good idea if you want to build something, right?
What is really waiting for us beyond Transformers and the broken scaling law may also depend on us.
That’s all for this mini-series. And again, here is a deep-dive article on Generative AI. The “three-body problem” refers to the immense complexity of predicting the orbital mechanics of three stars gravitationally influencing one another. Such a system becomes chaotic for most initial conditions, making long-term predictions challenging. AI advancement today faces its own three-body problem: instead of stars influencing one another, we have benchmarks, computation, and training data.
This is only the start!
I hope you find all of this useful. I use Substack only for the newsletter; every week I share free links to my paid articles on Medium. Follow me and read my latest articles: https://medium.com/@fabio.matricardi
Check out my Substack page if you missed any posts. And since it is free, feel free to share it!
Many don't realize it, but we have already reached AGI: Artificial Generative Intelligence!