August 29, 2024

Harnessing the Power of Tensor Networks: Insights from Román Orús, Chief Scientific Officer, Multiverse Computing

Professor Román Orús has been exploring the intricacies of tensor networks since his postdoc days. His latest work as the Chief Scientific Officer at Multiverse Computing combines that expertise with a deep knowledge of quantum mechanics to make machine learning algorithms more compact, efficient and portable.

In 2006, Román was the first in Spain to write a doctoral thesis on the topic of quantum algorithms. After finishing his Ph.D. at the University of Barcelona, he continued his work at the University of Queensland, Brisbane, Australia, with a focus on tensor networks. There, he wrote an introduction to tensor networks in 2013, which has since received over 2,000 citations. A quick YouTube search for “Román Orús” returns a long list of results, including a tensor networks lecture at the Institute for Pure and Applied Mathematics, an interview at QUANTUMatter 2024, and a discussion of quantum application development on The Superposition Guy’s podcast.

Román has an extensive body of peer-reviewed research and even has a patent in the works covering fermionic tensor machine learning for quantum chemistry.

Tensor networks are efficient mathematical representations of complex data structures. These representations show how the variables in a system are related to each other. Román recently demonstrated that tensor networks are a natural tool to describe, and improve the efficiency of, the machine learning structures behind Large Language Models like ChatGPT. Multiverse’s CompactifAI uses tensor networks to provide significant savings on energy and compute costs.

In this Q&A, Román explains why tensor networks caught his attention initially and how he expects this branch of mathematics to continue to influence the future of technology.

When and why did you first start researching tensor networks?

My first interaction with tensor networks happened while I was doing my Ph.D. in Barcelona. By then, around 2004, Guifré Vidal (then at Caltech, now at Google Quantum AI) had published a paper on how to simulate a quantum computer using a particular mathematical structure which, in the end, happened to be a specific type of tensor network. At roughly the same time, Ignacio Cirac’s group at the Max Planck Institute started to investigate similar topics. Our department in Barcelona had a lot of interaction with all of them, and towards the end of my Ph.D. thesis, I started to implement some of these ideas in my own work. I was fascinated from the beginning, and after my Ph.D., I moved to Australia to work with Guifré as a postdoc, fully dedicated to developing these (in those days) new methods.

What makes this technique so powerful?

Tensor networks efficiently describe complicated data structures, such as high-dimensional operators and vectors. This representation can also be manipulated with relative ease and, more interestingly, makes the structure of correlations explicit. In physics, it was discovered that tensor networks are the natural language to describe many important phenomena, including complex quantum states of matter, quantum computers and even the fundamental structure of space-time itself. Beyond physics, it was also recently discovered that they are the natural tool to describe machine learning structures and relieve their computational bottlenecks. This makes it possible, for instance, to improve the efficiency of Large Language Models, such as those behind ChatGPT.
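
As an editorial illustration (not part of Román's answer), the idea of describing a high-dimensional vector efficiently can be sketched with one simple tensor network, a chain of small tensors often called a Matrix Product State. The sketch below assumes Python with NumPy; the helper names `vector_to_mps`, `mps_to_vector` and `random_product_vector` are hypothetical, and the test vector is synthetic, chosen so that its correlations are simple enough for a short chain to capture it.

```python
import numpy as np

def vector_to_mps(vec, n_sites, max_bond):
    """Split a length-2**n_sites vector into a chain of 3-index tensors (hypothetical helper)."""
    tensors = []
    rest = np.asarray(vec, dtype=float).reshape(1, -1)  # (left bond, everything else)
    for _ in range(n_sites - 1):
        left = rest.shape[0]
        rest = rest.reshape(left * 2, -1)  # peel off one 2-valued index
        u, s, vh = np.linalg.svd(rest, full_matrices=False)
        # Keep at most max_bond singular values, dropping numerically zero ones.
        keep = max(1, min(max_bond, int(np.sum(s > 1e-12 * s[0]))))
        tensors.append(u[:, :keep].reshape(left, 2, keep))
        rest = s[:keep, None] * vh[:keep]  # carry the remainder to the next site
    tensors.append(rest.reshape(rest.shape[0], 2, 1))
    return tensors

def mps_to_vector(tensors):
    """Contract the chain back into a flat vector."""
    out = tensors[0]
    for t in tensors[1:]:
        out = np.tensordot(out, t, axes=([-1], [0]))
    return out.reshape(-1)

def random_product_vector(rng, n_sites):
    """Kronecker product of n_sites random 2-component vectors (no correlations)."""
    vec = np.ones(1)
    for _ in range(n_sites):
        vec = np.kron(vec, rng.normal(size=2))
    return vec

rng = np.random.default_rng(0)
n_sites = 10
# Sum of two uncorrelated vectors: weakly correlated, so a small bond suffices.
vec = random_product_vector(rng, n_sites) + random_product_vector(rng, n_sites)

tensors = vector_to_mps(vec, n_sites, max_bond=2)
n_params = sum(t.size for t in tensors)
print(f"full vector: {vec.size} numbers, tensor network: {n_params} numbers")
print("reconstruction matches:", np.allclose(mps_to_vector(tensors), vec))
```

The size of the shared index between neighbouring tensors (`max_bond` here) is what limits how much correlation the chain can carry, which is one way the structure of correlations becomes explicit. Vectors with richer correlations would need a larger bond, or a different network altogether.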

What is the biggest misconception about tensor networks?

People coming from the machine learning community, without a physics background, tend to believe that “tensor networks” are similar to “TensorFlow” or “network algorithms.” They are wrong. Tensor networks are a completely different concept; they are mathematical tools that allow the efficient description and manipulation of complicated structures. Moreover, people tend to believe that some very basic tensor network algorithms (such as those based on something called “Matrix Product States”) are everything that the field has to offer. They are also very wrong, since the number of methods and techniques is huge, and they can be adapted and tailored to almost any problem. The issue, however, is that few people in the world have deep knowledge of the variety of these methods and how powerful they can actually be.

What are future use cases for tensor networks that you’d like to research?

I would love to fully explore the synergies between tensor networks and machine learning. AI is set to change the world as we know it, and we already see it with generative AI, LLMs and all their applications. The problem is that the methods behind them are inefficient: it is not how Nature works! I believe part of the solution to this problem is tensor networks. We need to reformulate AI entirely in this language, which will make AI efficient, powerful and explainable. In my opinion, tensor networks are not just interesting mathematical tricks, but a technological revolution in themselves.

What has been your most impactful contribution to the field?

This is hard to measure, and I am definitely not the best person to say which is my most impactful contribution. But, putting things in perspective, I believe that the application of tensor networks to compress Large Language Models has had, and will continue to have, a very deep impact. After all, we showed that most of the information content in LLMs is redundant, and this is something that will have direct implications for our everyday lives, especially considering the boom of generative AI. Moreover, I also love to see how these methods, originally developed in the context of quantum physics, are now making the jump to everyday tech, which is something that I could not foresee when I started working in the field and inventing some of the algorithms that we are now using.
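
As an editorial aside, the notion of redundancy in a model's weights can be illustrated with a toy example. The sketch below is not the method used by Multiverse or CompactifAI, which the interview does not detail. It is a plain truncated singular value decomposition, a much simpler relative of tensor network compression, applied to a synthetic matrix that is built to be redundant. It assumes Python with NumPy, and the name `low_rank_factors` and all sizes are hypothetical.

```python
import numpy as np

def low_rank_factors(weight, rank):
    """Return two thin matrices whose product approximates `weight` (hypothetical helper)."""
    u, s, vh = np.linalg.svd(weight, full_matrices=False)
    return u[:, :rank] * s[:rank], vh[:rank]

rng = np.random.default_rng(0)
hidden = 8  # assumed amount of "real" structure in the synthetic matrix

# Synthetic 256 x 256 "weight matrix": rank-8 structure plus a little noise.
weight = rng.normal(size=(256, hidden)) @ rng.normal(size=(hidden, 256))
weight += 1e-3 * rng.normal(size=weight.shape)

left, right = low_rank_factors(weight, rank=hidden)
rel_error = np.linalg.norm(weight - left @ right) / np.linalg.norm(weight)

print(f"original parameters:   {weight.size}")
print(f"compressed parameters: {left.size + right.size}")
print(f"relative error:        {rel_error:.2e}")
```

Here the redundancy is put in by construction, so the two thin factors recover the matrix almost exactly. Whether and how much a real model's weights compress is an empirical question that this toy does not answer.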