06/29/2026 | Press release | Distributed by Public on 06/29/2026 12:30
A couple of months ago, I sat across from my nine-year-old daughter's teachers at a parent-teacher conference. They were kind but concerned. She takes her time on assignments, they said; she's often deep in thought. How would she do on timed tests next year? I told them I wasn't worried. What they described as a problem is, to me, one of the most important things she can learn: the ability to take a hard problem and reason through it from beginning to end. In a world optimized for efficiency, qualities like patience, perseverance, and attention to detail are not deficiencies. They are the foundation of sound judgment, and sound judgment is the most valuable skill she can have.
The more time I spend working with AI, the more convinced I become that what matters most for her future isn't how quickly she can answer. It's whether she has the judgment to know when an answer can be trusted.
I've spent decades at Microsoft watching this tension play out: first building tools for other developers, then working across AI as models moved from research curiosities to systems deployed at scale. Now we're building Microsoft IQ, where we're exploring how an organization's collective intelligence can become its greatest advantage. Through every one of those chapters, one thing has remained true: it's never enough for a system to be powerful; it must also be trustworthy.
Trust is what turns assistance into delegation. When we can trust an agent to do what we intend, within the limits we set, we can hand off the work we never wanted to spend our lives on: the repetitive tasks that drain attention, the mundane work that fills a day without moving anything meaningful forward, the dangerous work humans should not have to do, the work too vast for any individual or team. Agents should take on that toil, extend our reach, and give us back our time for the work that calls for something only humans bring.
My daughter doesn't know any of this yet. But by the time she's grown, most of the work that rewards speed and repetition will be work we delegate. What will matter then is exactly what gave her teachers pause: the patience to stay with a hard problem, reason through it, and decide when she's reached a conclusion she can trust. The very thing they feared might hold her back could be exactly what the next era prizes most.
So no, I'm not worried about the timed test. I hope she grows up in a world where software carries the toil and people are freed for the work that is unmistakably ours: to think, to judge, to create, to care for one another. That is the future I want agents to make real. But my hope is not evidence it will happen. The future I just described depends on a single question: can we trust agents to do the work? Trust is earned one task at a time. So I went looking for evidence of where it's been earned, and where it hasn't.
We partnered with MIT Technology Review Insights on new research that draws directly from the technical leaders building this frontier: not the people talking about it, but the people doing it. We surveyed 300 technical experts across AI, data, and cloud domains, spanning 12 industries and 4 regions of the world, asking them to rank their confidence across 101 of the top tasks. What we got back is the 2026 Agent Confidence Index, an honest map of where agents are delivering real value, so our community can see what's working and move forward together with conviction.
Learn from where confidence is highest
Across the 101 tasks measured, average confidence already lands at 64 out of 100, and thirty tasks clear 70. The highest scores cluster on work that is both predictable and draining: the late nights, the interruptions, the low-value repetition. Automated report generation leads at 83.5. Boilerplate code generation for new features sits at 82.5: the hours a developer no longer spends rewriting the same patterns, freed for the work that challenges them. Certificate expiration monitoring and renewal, at 81.5, ends the scramble that pulls engineers off high-stakes problems for something entirely routine. Real-time data stream monitoring follows at 80.5, and release note generation from commit history at 79.5: the manual end-of-sprint commit review, gone. This is where frontier teams are already delegating to agents, regularly.
The pattern holds across every discipline. In developer and AI workflows it extends to API client maintenance and code identification; in cloud operations, to ticket routing and cost optimization; in data, to anomaly detection. Wherever it sits in the stack, this is work technical teams now trust agents to own.
What matters most here isn't what the data says about the tasks; it's what it says about the people delegating them. When technical experts believe in something deeply enough to hand it real work, that belief ripples outward. It becomes the recommendation they make to their leadership, the solution they build for their customers, and the culture they create for their teams.
Even the toughest agent tasks are gaining traction
Here's what strikes me most: the tasks ranked lower on the index are still high in absolute terms. Service mesh configuration and troubleshooting sits at 37.5, database schema migration scripting at 46.5, memory leak detection at 48.5. These sit at the very frontier, the interconnected, high-stakes work where investment and innovation are concentrated right now.
Consider what they demand. Service mesh configuration touches many systems at once. Database migration carries real stakes, requiring precision across data, application, and infrastructure layers at the same time. Memory leak detection means diving deep into a system's behavior under load, accounting for conditions that shift from one deployment to the next. These are the challenges that have separated great engineers from exceptional ones, and even here, experts see agents helping. Not carrying the work alone, but contributing where it used to be unthinkable. That confidence is still climbing, and that's telling.
We're shipping new capabilities constantly to support this momentum. Database migration tooling in GitHub Copilot now covers not just scripts but the full application and infrastructure migration story. The Azure Site Reliability Engineering (SRE) Agent brings decades of experience operating Azure at scale and deep profiling capabilities directly into memory analysis and performance diagnosis.
Why human judgment remains paramount
When we asked technical experts how they're navigating agent adoption, 59% named "keeping humans in the loop" as their top priority: ahead of better observability, ahead of governance documentation, and ahead of everything else. That's a mark of maturity. Teams moving forward with clarity treat agent oversight as non-negotiable, regardless of how capabilities evolve.
The boundary itself is straightforward. Agents excel at well-specified, high-volume, reversible work: they synthesize data, automate known workflows, and surface anomalies at a speed and scale no human team could match. The moment a decision becomes high-stakes, context-dependent, or hard to undo, a human signs off. That isn't a limitation of the technology; it's the architecture of a trustworthy system.
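One way to make that boundary concrete is a small routing rule. The sketch below is illustrative only: the `Task` fields, the `route` function, and the way the two sample tasks are labeled are my assumptions for the example, not part of the research or of any Microsoft product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a unit of work considered for delegation."""
    name: str
    high_stakes: bool
    context_dependent: bool
    reversible: bool


def needs_human_signoff(task: Task) -> bool:
    # Any one of the three conditions named in the text is enough
    # to require a human decision.
    return task.high_stakes or task.context_dependent or not task.reversible


def route(task: Task) -> str:
    """Return who owns the final decision for this task."""
    return "human_signoff" if needs_human_signoff(task) else "agent"


# Illustrative labels; a real team would set these per task and per environment.
release_notes = Task("release note generation", False, False, True)
schema_migration = Task("database schema migration", True, False, False)

print(route(release_notes))
print(route(schema_migration))
```

The rule is deliberately conservative: a task has to clear all three conditions before an agent owns it end to end, so any uncertainty in labeling pushes the decision toward a human.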
What's changing, and what remains underappreciated, is the skill it takes to draw that boundary well: the discipline of full-lifecycle evaluations and guardrails. Success means measuring agent output against intent and keeping behavior inside your business strategy. It's new territory for most engineering teams, and it's becoming table stakes for modern software faster than most organizations realize. The good news: the same tools generating the work can help you build the harness. Ask GitHub Copilot to write the evals and it will. Frontier teams are already doing this, and it's why they're pulling ahead.
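To show what measuring agent output against intent can look like at its simplest, here is a minimal evaluation harness. Everything in it is hypothetical: `EvalCase`, `run_evals`, and `stub_agent` are names invented for this sketch, and the stub stands in for a real agent call. It is one possible shape for an eval loop, not how GitHub Copilot or any specific tool implements it.

```python
from typing import Callable, NamedTuple


class EvalCase(NamedTuple):
    """One prompt plus a predicate that encodes the intended outcome."""
    prompt: str
    check: Callable[[str], bool]


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output satisfies its check."""
    if not cases:
        raise ValueError("at least one eval case is required")
    passed = sum(1 for case in cases if case.check(agent(case.prompt)))
    return passed / len(cases)


def stub_agent(prompt: str) -> str:
    # Stand-in for a real agent call; it only wraps the prompt in a template.
    return f"Release notes: {prompt}"


cases = [
    EvalCase("fix login bug", lambda out: "login" in out),
    EvalCase("add export API", lambda out: out.startswith("Release notes:")),
    # This check is expected to fail for the stub, to show a non-perfect score.
    EvalCase("bump deps", lambda out: "security" in out),
]

score = run_evals(stub_agent, cases)
print(f"pass rate: {score:.2f}")
```

A guardrail can then be as simple as refusing to ship an agent change when the pass rate drops below a threshold the team has agreed on.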
Agents are opening career doors for engineering
Across system reliability and site operations, evaluations and quality assurance, and data pipeline management, 80% or more of respondents see meaningful career opportunity ahead. We believe this is one of the most significant moments in the history of building software, not because agents replace what technical people do, but because what's left when they take on the toil is the work that defines a career: the judgment calls, the architectural vision, the reasoning to navigate complexity under pressure. That fluency will define the next generation of technical leadership.
We're living this shift at Microsoft, right alongside our customers. Junior developers are using agents to explore codebases on their own and arriving at mentoring conversations with sharper, more sophisticated questions. Senior engineers are covering more ground because the repetitive work that used to fill their days is now delegated, and the work that's left is harder, more interesting, and more consequential. Both are growing into more capable versions of themselves. For me, that's the outcome I've always believed technology could deliver.
An integrated approach to intelligence and trust
Designing more sophisticated agent systems has made one thing clear: agents thrive in well-integrated environments, working best when your whole stack draws on a single source of truth. The high-confidence tasks are the ones we've already figured out; the meaningful frontier is the harder, interconnected work, and that's exactly where observability, governance, security, and unified intelligence have to operate as one.
Microsoft IQ brings your enterprise context into a single, continuous intelligence layer. Within it, Work IQ builds semantic understanding of how your business operates across email, calendar, meetings, chats, files, people, and collaboration patterns. Such depth of knowledge is the reason technical teams choose us, and it's what drives my focus and passion in learning how people actually work so their agents get them. My colleague Kim Manis, CVP of Product for Microsoft Fabric, has written specifically about what this means for data professionals, and the integral role of Fabric IQ.
It's all part of the Microsoft Agent Platform, which is becoming the operating system for enterprise AI at scale. From building in GitHub and contextualizing with Microsoft IQ, to running in Microsoft Foundry and governing in Microsoft Agent 365, Microsoft is uniquely positioned to help customers bring together data, models, agents, and human judgment into a continuously improving and secure system.
Frontier transformation is being led by builders like you.
Next steps: