June 25, 2026 (Inside AI) - A new system from MIT and Microsoft dynamically designs and deploys agentic AI workflows, cutting compute requirements to as little as 35 percent of what traditional methods use while reducing energy use to 27 percent and costs to under 25 percent. The platform, called Murakkab, lets developers describe tasks in plain language and automatically selects models, tools, and hardware configurations to match.
Agentic workflows chain multiple AI models and external tools for complex tasks like video analysis. Their fragmented design often leads to wasted computation and energy. Murakkab tackles this by optimizing the entire pipeline on the fly, adjusting to user priorities such as speed or cost.
Gohar Chaudhry, an EECS graduate student at MIT and lead author, stressed the urgency.
"Agentic workflows are getting very complicated and quickly becoming the backbone of what cloud providers are doing. Energy usage is a huge concern, so we need to be very careful about how efficient these workflows are. It is very easy to over-allocate resources, wasting energy and money. Enabling a cloud provider to intelligently make these workflows more resource-optimal is a win for everyone involved."
Co-authors on the paper include Adam Belay, an associate professor at MIT CSAIL, and senior author Ricardo Bianchini, a technical fellow at Microsoft Azure. The paper will appear at the USENIX Symposium on Operating Systems Design and Implementation.
From Hard-Coded Chaos to Intent-Driven Design
Today’s developers must manually specify every agent, model, tool, and execution order. Hardware allocation and speed-cost tradeoffs are locked in upfront. When a better model emerges, reconfiguration starts from scratch. The configuration space is too vast for manual optimization.
Murakkab flips this. A developer describes a goal—say, a video Q&A app that extracts frames, transcribes audio, and answers queries—without detailing the assembly. The system picks the best components and decides what runs sequentially or in parallel.
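To make the sequential-versus-parallel decision concrete, here is a minimal Python sketch of how a scheduler could derive it from declared dependencies alone. It is an illustration and not Murakkab's implementation: the step names and the `parallel_stages` helper are hypothetical, and the grouping uses the standard library's `graphlib` topological sorter.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph for the video Q&A example.
# Each step maps to the set of steps whose output it needs.
workflow = {
    "extract_frames": set(),
    "transcribe_audio": set(),
    "answer_query": {"extract_frames", "transcribe_audio"},
}


def parallel_stages(deps):
    """Group steps into ordered stages.

    Steps inside one stage do not depend on each other, so they
    could run in parallel; stages themselves run in sequence.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        stages.append(ready)
        sorter.done(*ready)  # unblock steps that waited on this stage
    return stages


stages = parallel_stages(workflow)
print(stages)
# [['extract_frames', 'transcribe_audio'], ['answer_query']]
```

In this sketch the developer states only what each step needs. The ordering falls out of the graph: frame extraction and transcription land in the same stage, and answering the query waits for both.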
"Even if you wanted to do all this manually, it is unlikely that you'll be able to configure the workflow optimally because the space of possible configurations is so large," Chaudhry said.
The platform adapts over time. If a new GPU or model launches, Murakkab integrates it without developer intervention. Cloud providers gain visibility into multiple workloads, enabling shared resource use that respects user constraints.
Real-World Gains and Hidden Optimizations
Tests on video Q&A and code generation workflows showed Murakkab meeting requirements with a fraction of typical compute. In one case, it cut energy by over an order of magnitude with only a 2 percent accuracy drop. It also discovered a counterintuitive configuration for a video-frame selection model that a human would likely miss.
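The kind of tradeoff described here can be pictured as a constrained search: keep only the configurations that meet the user's accuracy and latency requirements, then take the one that uses the least energy. The Python sketch below is an assumption-laden illustration, not the paper's optimizer. The `Config` fields, the `pick_config` helper, the configuration names, and every number are invented for the example, and the accuracy tolerance is treated as two percentage points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    accuracy_pct: int   # percent of queries answered correctly
    energy_wh: float    # energy per request, in watt-hours
    latency_s: float    # seconds per request


# Made-up numbers for illustration only; not measurements from the paper.
CANDIDATES = [
    Config("large-model/8-gpu", 90, 50.0, 2.0),
    Config("medium-model/2-gpu", 89, 12.0, 3.0),
    Config("small-model/1-gpu", 88, 4.0, 5.0),
    Config("tiny-model/cpu", 70, 1.0, 9.0),
]


def pick_config(candidates, min_accuracy_pct, max_latency_s):
    """Return the lowest-energy config that meets both constraints, or None."""
    feasible = [
        c for c in candidates
        if c.accuracy_pct >= min_accuracy_pct and c.latency_s <= max_latency_s
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.energy_wh)


# Accept up to a two-point accuracy drop from the best candidate.
best_accuracy = max(c.accuracy_pct for c in CANDIDATES)
choice = pick_config(CANDIDATES, best_accuracy - 2, max_latency_s=6.0)
print(choice.name)  # small-model/1-gpu
```

With these invented numbers, tolerating a two-point drop moves the choice from a 50 Wh configuration to a 4 Wh one. Tightening the latency limit to 2.5 seconds would force the search back to the largest configuration, which is the speed-versus-cost priority the user controls.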
The system’s dynamic nature lets users balance tradeoffs in real time. Cloud providers can shift resources efficiently across workloads, a critical edge as agentic AI becomes central to cloud services.
Researchers plan to scale Murakkab to more complex workflows and larger clusters. The work was supported by the Semiconductor Research Corporation and DARPA. Chaudhry said the remaining opportunity is large: "There is a lot of potential to make these workflows more resource-optimal so they consume far less energy, but we need to be thinking about this at the scale of major cloud platforms."