Pruna AI's Open Source Framework Promises Significant Cost Savings Through Model Optimization

Pruna AI releases an open-source framework for AI model optimization combining multiple compression techniques.
The framework supports all model types but currently focuses on image and video generation applications.
The enterprise version offers advanced features including automated optimization with an hourly pricing model.

A European AI startup is making its powerful model optimization framework available as open source, potentially enabling significant cost reductions for businesses deploying AI solutions.

Pruna AI announced Thursday that it is releasing its comprehensive compression framework that applies multiple efficiency methods to AI models.

The framework standardizes a variety of optimization techniques—including caching, pruning, quantization, and distillation—allowing developers to compress models while carefully balancing performance gains against potential quality loss.
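
To make that trade-off concrete, here is a minimal Python sketch of one of the listed techniques, quantization, written for illustration only. It does not use Pruna AI's framework or API; the function names and the sample weights are hypothetical. It maps floating-point weights to 8-bit integers, converts them back, and measures how much precision was lost.

```python
# Illustrative sketch only: not Pruna AI's code or API.
# Function names and sample weights are hypothetical.

def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale factor maps the largest magnitude to 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [v * scale for v in quantized]


weights = [0.82, -1.27, 0.003, 0.45, -0.61]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The quality cost: the largest gap between original and restored weights.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("quantized:", q)
print("max error:", max_error)
```

Storing each weight as an 8-bit integer instead of a 32-bit float cuts its storage to a quarter, at the cost of a rounding error of at most half the scale factor per weight.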

"We standardize saving and loading the compressed models, applying combinations of these compression methods, and also evaluating your compressed model after you compress it," Pruna AI co-founder and CTO John Rachwan says

Large AI labs have been using various compression methods internally to create faster versions of their flagship models, such as OpenAI's GPT-4 Turbo and Black Forest Labs' Flux.1-schnell. Pruna AI aims to democratize these capabilities for the broader development community.

"For big companies, what they usually do is that they build this stuff in-house. And what you can find in the open source world is usually based on single methods," Rachwan explained.
"But you cannot find a tool that aggregates all of them, makes them all easy to use and combine together. And this is the big value that Pruna is bringing right now."

The framework supports all types of AI models, though the company is currently focusing on image and video generation applications. Early adopters include AI companies Scenario and PhotoRoom.

Beyond the open-source edition, Pruna AI offers an enterprise version with advanced features, including an upcoming "compression agent" that automatically determines optimal compression settings based on user requirements.

"You give it your model, you say: 'I want more speed but don't drop my accuracy by more than 2%.' And then, the agent will just do its magic," Rachwan said.

The enterprise offering uses an hourly pricing model similar to cloud GPU services. The company positions its solution as an investment that quickly pays for itself through reduced inference costs.

In one example, Pruna AI made a Llama model eight times smaller without significant quality degradation.

The release comes just months after Pruna AI raised a $6.5 million seed funding round from investors including EQT Ventures, Daphni, Motier Ventures, and Kima Ventures.

Edited By Annette George
