Madrid, June 11, 2025 – Multiverse Computing, a leader in advanced AI and quantum technologies, has announced the launch on AWS of its CompactifAI Application Programming Interface (API), which provides pre-compressed and optimized versions of top AI models. This milestone follows over a year of intensive development, testing, and strategic planning, including through Multiverse Computing’s membership in the AWS Generative AI Accelerator Program for 2024. It is the first offering from Multiverse Computing to deliver scalable, cost-effective, and user-friendly LLM solutions on AWS.
The CompactifAI API, now available in AWS Marketplace, provides a robust, serverless LLM access layer that leverages Amazon SageMaker HyperPod to scale inference of the ultra-efficient compressed models across a cluster of hundreds of cutting-edge GPUs. As a member of the AWS Generative AI Accelerator Program for 2024, Multiverse Computing worked closely with AWS to:
- Pre-compress and optimize top AI models for performance and cost-efficiency.
- Design a seamless onboarding experience for users via AWS Marketplace.
- Develop a go-to-market strategy that ensures reach, scalability, and customer success.
The CompactifAI API will offer compressed Meta Llama, DeepSeek, and Mistral models, giving users access to a diverse range of optimized large language models tailored to various performance and cost requirements. By integrating these models, Multiverse Computing empowers clients to scale advanced generative AI solutions with ease, supporting diverse use cases while maximizing efficiency and minimizing infrastructure demands.
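As a rough illustration of how a developer might consume one of these hosted models, the sketch below shows a generic HTTP request to a chat-style completion endpoint. The endpoint URL, model identifier, and request fields are placeholders chosen for this example and are not taken from the CompactifAI API documentation.

```python
# Hypothetical sketch of calling a hosted compressed-model endpoint.
# The URL, API key handling, model name, and JSON fields are illustrative
# placeholders, not the documented CompactifAI API.
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"  # credential obtained after AWS Marketplace onboarding

payload = {
    "model": "llama-compressed-example",  # placeholder ID for a compressed Llama model
    "messages": [
        {"role": "user", "content": "Summarize the benefits of model compression in two sentences."}
    ],
    "max_tokens": 128,
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```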
"Integrating CompactifAI’s compressed models into our customer support chatbot has been a game changer," said Rubén Espinosa, CTO of the AI assistant, Luzia. "We have reduced our model footprint by over 50% while maintaining high response quality with lower latency and cost. The performance efficiency of the compressed model lets us deliver faster, more reliable interactions to our users across multiple regions, without compromising on natural language understanding"
“This isn’t just a product launch—it’s the culmination of a strategic journey,” said Enrique Lizaso, CEO of Multiverse Computing. “We’ve built CompactifAI API from the ground up to revolutionize AI deployment with top performance and cost-effective solutions, and we’re thrilled to bring it to market with AWS.”
"Multiverse Computing's CompactifAI API will help more businesses to make use of advanced AI capabilities with accessible options for optimizing models to their requirements," said Jon Jones, Vice President and Global Head of Startups and Venture Capital at AWS. "Innovations like these can play a valuable role in democratizing access to AI and helping a wide range of businesses to apply the technology."
Multiverse Computing leverages cloud technologies to deliver a scalable, secure, and high-performance API infrastructure. With built-in go-to-market support, the company can now effectively reach customers and drive sales through AWS Marketplace. The models are offered in a user-friendly format, with clear documentation, licensing, and onboarding, all accessible via a dedicated landing page and Marketplace listing.
The CompactifAI API landing page offers:
- Model cards with detailed specs, pricing, and performance metrics.
- Side-by-side comparison tools to help users choose the right model.
- Comprehensive API documentation.
- Clear licensing terms for each model.
- Streamlined onboarding via AWS Marketplace.
About Multiverse Computing
Multiverse Computing is the leader in AI model compression. Thanks to its expertise in quantum software and artificial intelligence, the company has developed CompactifAI, a revolutionary model compressor. CompactifAI compresses LLMs by up to 95% with an accuracy loss of just 2-3%, reducing computational requirements and opening up new use cases for AI in all sectors. Headquartered in Donostia, Spain, Multiverse has offices in Europe, the USA, and Canada. It has over 100 customers worldwide, including Iberdrola, Bosch, and the Bank of Canada. For further information: multiversecomputing.com
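To make the headline figure concrete, the back-of-the-envelope sketch below estimates the raw weight-memory saving implied by 95% compression, assuming that figure refers to model size. The 7-billion-parameter count and 16-bit precision are assumptions chosen for illustration, not figures from Multiverse Computing.

```python
# Illustrative arithmetic only: the model size and precision below are assumed
# for the example and are not Multiverse Computing figures.
params = 7e9            # assumed parameter count (7B-class model)
bytes_per_param = 2     # assumed 16-bit weights
baseline_gb = params * bytes_per_param / 1e9

compression = 0.95      # "up to 95%" compression, per the release
compressed_gb = baseline_gb * (1 - compression)

print(f"Baseline weight memory:   ~{baseline_gb:.0f} GB")
print(f"Compressed weight memory: ~{compressed_gb:.1f} GB")
```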