March 05, 2025

Clarifying The Latest AI Advancements


Sam Mugel, Ph.D., is the CTO of Multiverse Computing, a global leader in developing value-driven quantum solutions for businesses.


Much like the march of time, deep technology progress can be invisible, especially if there are too few metrics to track and benchmark that progress.

For artificial intelligence (AI), the benchmarking problem is especially acute because of its 2023 baseline. That was the "year of AI," a transformative moment in mainstream adoption, technological breakthroughs and public awareness. How do you track progress from such an explosive inflection point?

Since then, AI’s progress has continued but has become more opaque, as Garrison Lovely explained in the Time magazine article, "Why AI Progress Is Increasingly Invisible." The article describes the media’s muted response to OpenAI’s o3. The model shows large gains on difficult, technical benchmarks in math, science and programming, but these gains are less visible to anyone not doing graduate-level science work.

I have noticed these significant improvements, and I also see a need to recognize and understand these milestones. The idea that the past year's improvements in AI represent a smaller leap in overall performance than the jump from GPT-3 to GPT-4 is incorrect. Several advances should not remain invisible.

Scaffolding And Guardrails

One recent improvement is scaffolding, which boosts an AI model's autonomy and its capacity to interact with the outside world. Better scaffolding can make AI much more competent and agentic, meaning the model can make decisions and adapt to changing circumstances.

AI agents can be granted the ability to use tools and perform multi-step activities on behalf of a user. Only recently has the AI industry turned its attention from passive chatbots to implementing agents, and progress has been rapid.
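To make the idea of scaffolding concrete, here is a minimal sketch of an agent loop. Everything in it is illustrative: the `call_model` function stands in for any LLM API, and the tool names and message format are assumptions, not any particular vendor's interface.

```python
# Minimal sketch of an agentic scaffolding loop (illustrative only).
# `call_model` stands in for any LLM API; the tool registry and the
# request/response format are assumptions, not a specific vendor's interface.

def search_web(query: str) -> str:
    # Placeholder tool: a real agent would call a search API here.
    return f"Top results for '{query}'..."

def run_calculator(expression: str) -> str:
    # Placeholder tool: evaluates simple arithmetic with builtins disabled.
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"search_web": search_web, "run_calculator": run_calculator}

def call_model(messages: list[dict]) -> dict:
    # Stand-in for an LLM call. A real model would decide whether to answer
    # directly or request a tool based on the conversation so far.
    return {"type": "answer", "content": "(model response)"}

def run_agent(task: str, max_steps: int = 5) -> str:
    """Loop until the model produces a final answer or the step budget runs out."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if reply["type"] == "answer":          # model is done
            return reply["content"]
        tool = TOOLS[reply["tool_name"]]       # model asked for a tool
        result = tool(reply["tool_input"])
        messages.append({"role": "tool", "content": result})
    return "Step budget exhausted."

print(run_agent("What is 17 * 23, and is the result prime?"))
```

The key design point is the loop itself: instead of a single prompt-and-response, the scaffolding lets the model take several actions, observe their results and adjust before answering.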

The dark side of this progress is that as AI models become more intelligent, they become more adept at deceit. As they become smarter, they can better understand the purpose of their guidelines.

A recent publication from Apollo Research, an AI safety firm, showed that the most advanced AI models may scheme against their creators and users in specific circumstances. When instructed to pursue a specific goal, the systems occasionally tried to evade oversight, fake alignment and conceal their actual capabilities.

Here’s where explainable AI, or XAI, comes in. This design approach gives humans a point of intervention for building safeguards. After all, one should be able to explain why an algorithm provides a particular answer. XAI aims to produce machine learning models that are clear and understandable to humans, in addition to being reliable and accurate. XAI (1) fosters trust and supports sound decision-making by clearly explaining results and (2) helps ensure adherence to ever-tougher transparency laws.
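As a toy illustration of what "explainable" can mean in practice, the sketch below uses permutation importance, one common model-agnostic XAI technique. It is a generic example on a standard scikit-learn dataset, not the tensor-network approach discussed next; the dataset and model choice are arbitrary.

```python
# Toy example of one model-agnostic explainability technique:
# permutation importance. Shuffle each input feature in turn and measure
# how much the model's accuracy drops; features whose shuffling hurts most
# are the ones the model actually relies on.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranked = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```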

Recent work using tensor networks (TNs) to implement XAI shows they can capture complex statistical correlations. Within TNs, this work shows how a mathematical construct called matrix product states (MPS) can provide efficient, explainable AI with an exceptional degree of interpretability. MPS bridges the gap between the abstract nature of high-dimensional data and the requirement for well-informed decision-making.
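The following is a minimal sketch of the core idea, assuming nothing about the cited work's actual implementation: a joint distribution over a few binary variables is factored into an MPS by repeated singular value decompositions, and truncating the bond dimension is what limits, and exposes, the correlations the model keeps.

```python
# Minimal sketch of a matrix product state (MPS) built by sequential SVDs.
# Purely illustrative: the cited research's actual construction may differ.
import numpy as np

def to_mps(tensor, max_bond):
    """Factor an n-index tensor into a chain of 3-index MPS cores."""
    n = tensor.ndim
    cores, bond = [], 1
    remainder = tensor.reshape(bond * tensor.shape[0], -1)
    for k in range(n - 1):
        u, s, vh = np.linalg.svd(remainder, full_matrices=False)
        keep = min(max_bond, len(s))               # truncating here limits correlations
        cores.append(u[:, :keep].reshape(bond, tensor.shape[k], keep))
        remainder = np.diag(s[:keep]) @ vh[:keep, :]
        bond = keep
        if k < n - 2:
            remainder = remainder.reshape(bond * tensor.shape[k + 1], -1)
    cores.append(remainder.reshape(bond, tensor.shape[-1], 1))
    return cores

def contract(cores):
    """Rebuild the full tensor from its MPS cores (to check accuracy)."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(out.shape[1:-1])

# A joint distribution over 10 binary variables with Markov (short-range)
# correlations; such distributions admit a very compact MPS.
rng = np.random.default_rng(0)
T = rng.random((2, 2))
T /= T.sum(axis=1, keepdims=True)                  # transition probabilities
p = np.array([0.6, 0.4])
for _ in range(9):
    p = np.einsum('...i,ij->...ij', p, T)          # chain rule, step by step

mps = to_mps(p, max_bond=2)
print("full tensor parameters:", p.size)                      # 1024
print("MPS parameters:        ", sum(c.size for c in mps))    # far fewer
print("max reconstruction err:", np.abs(contract(mps) - p).max())
```

Because the interactions here are short-range, a tiny bond dimension reproduces the distribution almost exactly; that explicit, inspectable factorization is what gives MPS-based models their interpretability.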

Addressing The Sustainability Problem

Another “invisible” AI advancement is improvement in the training of large language models (LLMs), which carries high costs and high energy expenditure.

A November 2024 report from Deloitte predicts that energy consumption from AI use will grow from the current 382 terawatt-hours (TWh) to 1,975 TWh in 2035 and up to 3,059 TWh in 2045 if adoption rates are high. To put this in perspective, the U.S. consumes roughly 4,000 TWh of electricity per year, so one terawatt-hour covers the entire country's demand for only about two hours.
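For readers who want to check the scale of these figures, here is a quick back-of-envelope calculation; the roughly 4,000 TWh figure for annual U.S. electricity consumption is an approximation from public data, not from the Deloitte report.

```python
# Back-of-envelope comparison of projected AI energy use against total
# U.S. electricity consumption (~4,000 TWh/year, an approximate public figure).
US_ANNUAL_TWH = 4_000
for ai_twh in (382, 1_975, 3_059):
    share = ai_twh / US_ANNUAL_TWH
    print(f"{ai_twh} TWh is about {share:.0%} of annual U.S. electricity use")
print(f"1 TWh covers about {8760 / US_ANNUAL_TWH:.1f} hours of U.S. electricity demand")
```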

These systems cannot keep developing at this pace without significantly impacting the planet. That’s why compression schemes that reduce these data volumes are critical. In this context, several compression techniques for LLMs have been proposed, with quantization, distillation, pruning and low-rank approximations being the dominant and most successful in practice.
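As one concrete example of these techniques, the sketch below applies simple post-training 8-bit quantization to a single weight matrix. It is a generic illustration of the idea, not any particular framework's implementation.

```python
# Toy post-training quantization of a weight matrix to 8-bit integers,
# a generic illustration of one common LLM compression technique.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(512, 512)).astype(np.float32)      # stand-in layer weights

scale = np.abs(weights).max() / 127.0                          # symmetric per-tensor scale
quantized = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

print("memory:", weights.nbytes, "->", quantized.nbytes, "bytes")   # 4x smaller
print("mean abs error:", np.abs(weights - dequantized).mean())
```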

However, there is no compelling reason to believe that truncating the number of neurons is an optimal strategy. A more targeted compression approach that leverages advanced tensor networks shows more promise.

This tensorization effectively truncates the correlations present in the model, enabling a significant reduction in the memory footprint and parameter count of the LLM while maintaining accuracy. In practice, the compressed model requires less energy and memory, and operations such as training, retraining and inference become more efficient.
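Here is a minimal sketch of the underlying principle using a truncated SVD on a single weight matrix. The actual approach applies tensor networks across the whole model, so this stand-in only illustrates how discarding weak correlations trades a little accuracy for far fewer parameters.

```python
# Minimal low-rank compression of one weight matrix via truncated SVD.
# A simplified stand-in for tensor-network compression of an LLM: the principle
# (discard weak correlations, keep the dominant ones) is the same.
import numpy as np

rng = np.random.default_rng(0)
# Construct a matrix whose singular values decay, as trained weights often do.
U = rng.normal(size=(1024, 64))
V = rng.normal(size=(64, 1024))
W = U @ V + 0.01 * rng.normal(size=(1024, 1024))

u, s, vh = np.linalg.svd(W, full_matrices=False)
rank = 64                                                      # truncation rank
A, B = u[:, :rank] * s[:rank], vh[:rank, :]                    # W is approximately A @ B

original_params = W.size
compressed_params = A.size + B.size
error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"parameters: {original_params} -> {compressed_params}")
print(f"relative error: {error:.4f}")
```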

This significant reduction in the number of model parameters drastically reduces GPU-CPU transfer time, cutting training and inference times by 50% and 25%, respectively. Hence, this tensorization approach is particularly well-suited for distributed training of LLMs and for reducing their overall energy consumption.

It’s true that there is a growing disconnect between the public perception of AI and its actual potential. It is up to us in the AI community to explain these developments and where this technology is headed.