2025: Lessons From The Trenches
I was wrong about production AI. A year of conversations with practitioners, failed projects, and hard lessons changed how I think about everything. Here's what I wish I'd known in January.
I started 2025 thinking I understood production AI.
Eighteen years in data & ML engineering. Spent time working on some of the most strategic, cutting-edge projects with F500 customers. Leading conversations with enterprises about data literacy, governance, “being data-driven”, and now AI adoption. I’d seen enough technology cycles to know the patterns.
But this year humbled me.
I watched the gap widen between teams who shipped and teams who stalled.
And the reasons weren’t what I expected.
These lessons didn’t come from research reports. They came from conversations with leaders and their teams working with AI. From the data leader who finally said out loud what everyone was thinking. From the community discussions in AgentBuild, where practitioners shared what was actually happening - not what their LinkedIn posts claimed.
Here’s what I actually learned.
1. Everyone was solving the wrong problem first
I had a call in March that changed how I think about AI projects.
A team had spent three months building an incredibly sophisticated multi-agent system. Beautiful architecture. Clean abstractions. The kind of thing that gets applause in a demo.
Then I asked: “How do you know if it’s working?”
Silence.
They’d built the entire system without defining what success looked like.
No evaluation criteria.
No baseline metrics.
No way to know if Version 2 was actually better than Version 1.
I started asking this question in every conversation after that. Maybe 20% of teams had a good answer. The rest were flying blind, hoping they’d know “good” when they saw it.
This is what pushed me toward what I now call the Reverse Strategy Framework - starting with evaluation before architecture, defining success before selecting tools. It feels backwards. But every team I’ve seen succeed this year did it this way, whether they had a name for it or not.
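To make that concrete, here’s a minimal sketch of what evaluation-first can look like. Everything in it is illustrative - the test cases, the `BASELINE` number, and the `version_2` stub are placeholders for your own system and your own success criteria:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    must_contain: str  # the simplest possible success criterion

def evaluate(system: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases the system passes."""
    passed = sum(case.must_contain.lower() in system(case.prompt).lower()
                 for case in cases)
    return passed / len(cases)

# Success is defined BEFORE any architecture exists: a fixed test set
# and a baseline to beat.
CASES = [
    EvalCase("What is our refund window?", must_contain="30 days"),
    EvalCase("Which plan includes SSO?", must_contain="enterprise"),
]
BASELINE = 0.60  # e.g., the measured score of v1 or the manual process

def version_2(prompt: str) -> str:
    # Placeholder: swap in the actual system (agent, RAG pipeline, etc.)
    return "Refunds within 30 days; SSO is on the Enterprise plan."

score = evaluate(version_2, CASES)
print(f"v2: {score:.0%} - ship only if it beats {BASELINE:.0%}")
```

The point isn’t the string matching - it’s that Version 2 can now be compared to Version 1 with a number instead of a feeling.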
2. “We have a data problem” became the most honest sentence in enterprise AI
I used to hear teams blame models. Blame frameworks. Blame vendors.
This year, I finally started hearing the truth: “Our data isn’t ready.”
One conversation sticks with me. A data leader at a large financial services company - someone I respect enormously - said: “Sandi, we’ve been talking about data quality for fifteen years. But AI is the first thing that made it impossible to ignore. Every failure traces back to the same place.”
That honesty is spreading. In community discussions, in customer calls, in the quiet admissions after the official meeting ends. The facade is cracking.
Here’s what I’ve come to believe:
Most AI failures aren’t AI failures. They’re data failures that AI made visible.
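One practical implication: profile your data before you build on it. Here’s a crude sketch of the kind of readiness check I mean - the example table is made up, and a real pipeline would go much further (lineage, freshness SLAs, semantic checks):

```python
import pandas as pd

def readiness_report(df: pd.DataFrame) -> dict:
    """Surface the basics AI projects otherwise discover in production."""
    return {
        "rows": len(df),
        "null_rate_by_column": df.isna().mean().round(3).to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }

# Hypothetical example: a customer table with the usual problems.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "email": ["a@x.com", None, None, "d@x.com"],
})
print(readiness_report(df))
# {'rows': 4, 'null_rate_by_column': {'customer_id': 0.0, 'email': 0.5},
#  'duplicate_rows': 1}
```

If a report like this looks ugly, the AI project will make it visible sooner or later. Better sooner.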
3. The teams that shipped weren’t the smartest - they were the most boring
I talked to a lot of teams this year who were building impressive things. Complex agent architectures. Novel approaches. Cutting-edge frameworks.
Most of them are still in pilot.
The teams that actually made it to production? Their architectures were almost disappointingly simple. Single-agent systems. Straightforward retrieval patterns. Minimal orchestration complexity.
One engineering lead told me: “We had to kill our egos. The sophisticated version was more fun to build. But the boring version was what we could actually operate.”
I’ve started using this as a diagnostic. When someone shows me an architecture diagram and I’m impressed, I get worried. When I’m slightly underwhelmed, I get optimistic.
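For reference, the “boring” shape that kept showing up is roughly this - a single retrieve-then-answer loop. `search_index` and `llm_complete` below are stand-ins for whatever store and model you actually use:

```python
def search_index(query: str, k: int = 3) -> list[str]:
    """Stand-in retriever; in practice a vector or keyword search."""
    corpus = [
        "Refunds are honored within 30 days of purchase.",
        "SSO is available on the Enterprise plan.",
    ]
    words = query.lower().split()
    return [d for d in corpus if any(w in d.lower() for w in words)][:k]

def llm_complete(prompt: str) -> str:
    """Stand-in for a model call - OpenAI, Anthropic, local, anything."""
    n_lines = prompt.count("\n") + 1
    return f"(answer grounded in a {n_lines}-line prompt)"

def answer(question: str) -> str:
    context = "\n".join(search_index(question))
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
    return llm_complete(prompt)

print(answer("What is the refund window?"))
```

No planner, no tool router, no inter-agent protocol - and a small team can actually operate it.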
4. Human oversight isn’t a training wheel - it’s a feature
Early in the year, I was in a room where someone said “the goal is to remove humans from the loop entirely.”
I’ve heard variations of that statement dozens of times since. And I’ve watched those projects struggle.
The teams that succeeded took a different view. They designed human touchpoints into their systems - not as temporary scaffolding to remove later, but as permanent architecture.
A compliance lead at a healthcare company put it perfectly: “The question isn’t when we can trust the AI to work alone. The question is where humans add the most value in the process. That’s not a limitation. That’s design.”
I’ve stopped seeing human-in-the-loop as a constraint. It’s a capability.
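Designed-in, that can be as simple as a routing rule that never goes away. A sketch, with `confidence` standing in for however your system scores its own output:

```python
from dataclasses import dataclass

@dataclass
class Draft:
    content: str
    confidence: float  # however the system scores its own output

APPROVAL_THRESHOLD = 0.85  # tuned over time, but the gate itself is permanent

def route(draft: Draft) -> str:
    """Low-confidence outputs always go to a person - by design, not as a stopgap."""
    return "auto_send" if draft.confidence >= APPROVAL_THRESHOLD else "human_review"

print(route(Draft("Summary of policy change...", confidence=0.72)))  # human_review
```

The threshold moves as the system improves. The human touchpoint doesn’t.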
5. Trust compounds. So does distrust.
I had coffee with a CTO who’d just pulled an AI initiative that was technically working fine.
Why? The business didn’t trust it. One bad output early on - something the team had long since fixed - had poisoned the well. Users had stopped engaging. Stakeholders had lost confidence. The perception problem became unsolvable.
“We actually fixed the accuracy issue in week two,” she told me. “But it didn’t matter. We’d already lost them.”
This is the lesson I wish I’d understood earlier in my career: Trust isn’t something you earn once. It’s something you have to build from the first interaction. And once it’s gone, technical improvements don’t bring it back.
Every team I’ve seen scale successfully invested in trust infrastructure - observability, explainability, guardrails - before they invested in features. The ones who added it later were always playing catch-up.
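A concrete starting point: wrap every model call in tracing and a guardrail from day one. This sketch just prints a trace and uses a toy blocklist - in practice you’d ship the trace to your observability stack and use real output checks:

```python
import json
import time
import uuid

def guarded_call(model_fn, prompt: str, blocklist=("ssn", "password")) -> str:
    """Trace + guardrail around every model call, before any feature work."""
    trace_id = str(uuid.uuid4())
    start = time.time()
    output = model_fn(prompt)
    flagged = any(term in output.lower() for term in blocklist)
    print(json.dumps({  # stand-in for a real observability pipeline
        "trace_id": trace_id,
        "latency_s": round(time.time() - start, 3),
        "prompt_chars": len(prompt),
        "flagged": flagged,
    }))
    return "[withheld for review]" if flagged else output

print(guarded_call(lambda p: "Here is the summary you asked for.", "Summarize Q3."))
```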
6. The stack changed faster than anyone could learn it
I stopped counting the number of times someone told me they were “standardizing” on a framework, only to be evaluating alternatives three months later.
This isn’t anyone’s fault. The landscape genuinely moved that fast. New models. New capabilities. New frameworks. Best practices that were obsolete by the time they were documented.
One architect described it as “building on quicksand.” You make a decision, you start building, and then the ground shifts underneath you.
The teams that handled this best didn’t try to pick the “right” stack. They built for replaceability. Modular architectures where components could be swapped without rebuilding everything. They accepted that today’s choice wasn’t permanent and designed accordingly.
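In code terms, “built for replaceability” mostly meant depending on thin interfaces instead of frameworks. A minimal sketch - the provider classes here are hypothetical:

```python
from typing import Protocol

class CompletionProvider(Protocol):
    """The only thing the pipeline is allowed to know about a model."""
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    def complete(self, prompt: str) -> str:
        return "answer from provider A"

class ProviderB:
    def complete(self, prompt: str) -> str:
        return "answer from provider B"

def build_pipeline(provider: CompletionProvider):
    def run(question: str) -> str:
        return provider.complete(f"Q: {question}")
    return run

pipeline = build_pipeline(ProviderA())
# When the ground shifts, swap the provider - the pipeline doesn't change:
pipeline = build_pipeline(ProviderB())
print(pipeline("What changed this quarter?"))
```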
7. ROI timelines were a fantasy
I sat in a lot of planning conversations this year where someone projected AI ROI at 6-12 months.
Every single time, I watched experienced leaders in the room go quiet. They knew. But often, they didn’t push back.
Here’s the reality I observed: Teams that expected quick returns panicked when they didn’t materialize. They cut projects that needed another year to mature. They declared failure on initiatives that were actually on track - just on a longer track than the spreadsheet assumed.
The organizations that succeeded set honest expectations upfront. They treated AI initiatives like infrastructure investments, not SaaS subscriptions. They planned for two to three years, not two to three quarters.
I’ve started asking a new question in early conversations: “What happens if this takes twice as long as you expect?” The answer tells me everything about whether the project will survive.
8. Workflow redesign was the actual unlock
This one took me a while to see clearly.
I’d watch teams add AI to their existing processes. Same workflows, same handoffs, same bottlenecks - just with an AI component inserted somewhere in the middle.
Then I’d watch teams that rethought the entire workflow around AI capabilities.
They didn’t ask “where can we add AI?” They asked “what would this process look like if we designed it from scratch today?”
The results weren’t even close.
The first approach gave incremental improvements. The second approach gave transformation. Same technology. Completely different outcomes.
I’ve become convinced that most AI value isn’t captured by better models or better tools. It’s captured by better process design.
9. The learning loop is everything
Late in the year, I started noticing something about the teams that were genuinely scaling.
Their Day 100 systems were dramatically better than their Day 1 systems. Not because they’d rebuilt them - but because they’d designed them to learn.
Every interaction generated feedback. Every failure got analyzed. Every edge case became training data for the next version. They’d built loops, not just pipelines.
One team lead described it as “compounding interest, but for AI quality.” The teams that started this early were accelerating away from everyone else.
This shifted how I think about initial deployments. Version 1 isn’t about being good. Version 1 is about being learnable.
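Mechanically, “learnable” can start very small: log every interaction, and turn the bad ones into the next version’s test cases. A sketch - the JSONL file and the rating scale are my assumptions, not anyone’s production design:

```python
import json
from pathlib import Path

LOG = Path("interactions.jsonl")  # assumed store; a database in practice

def record(question: str, answer: str, rating: int | None = None) -> None:
    """Every interaction generates feedback, not just a response."""
    with LOG.open("a") as f:
        f.write(json.dumps({"q": question, "a": answer, "rating": rating}) + "\n")

def harvest_eval_cases(max_rating: int = 2) -> list[dict]:
    """Low-rated interactions become regression tests for Version N+1."""
    rows = [json.loads(line) for line in LOG.read_text().splitlines() if line]
    return [r for r in rows if r["rating"] is not None and r["rating"] <= max_rating]

record("Which plan includes SSO?", "All plans.", rating=1)  # a failure, captured
print(harvest_eval_cases())
```

Those harvested cases feed straight back into the kind of eval set from lesson 1. That’s the compounding.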
10. The community taught me more than any vendor
I need to end with this one, because it’s been a game-changer.
The most valuable insights I got this year didn’t come from vendor briefings or analyst reports. They came from practitioners in the AgentBuild community sharing with me what was actually happening in their work.
The engineer who admitted his “successful” deployment was held together with duct tape. The data scientist who explained why her evaluation framework failed and what she built next. The architect who shared his team’s post-mortem after pulling a production agent.
That kind of honesty is rare. And it’s worth more than any polished case study.
If I learned one thing this year, it’s that the gap between AI marketing and AI reality is vast. The people closing that gap aren’t doing it alone. They’re learning from each other.
Where this leaves me for 2026
I’m heading into next year with more conviction and more humility than I started 2025 with.
Conviction that the evaluation-first approach works. That trust infrastructure matters more than model selection. That simple, learnable systems beat complex, static ones.
Humility that I’m still figuring this out. That the landscape will change faster than my assumptions. That the best insights will come from practitioners doing the work, not observers commenting on it.
If you’re navigating this same terrain, I’d genuinely love to hear what you’ve learned. Reply to this email. Share your own lessons. Push back on mine.
This stuff is too important to figure out alone.
What did you learn this year? Hit reply - I read every response and will compile the best lessons into a follow-up piece.
Go deeper in the community: Tell me what you want to read more of. Tell me what you need help with. Reach out and ask.
“If you don’t ask, you don’t get.”

Working through this for your team? I’m opening some conversation slots in Q1 for teams moving from experimentation to production. Reply to this email if you want to talk.
Happy New Year, everyone.
Thank you for being part of this journey. Whether you’ve been reading since day one or just joined us, I’m grateful you’re here.
Wishing you and your family a healthy, peaceful, and joyful 2026. May the year ahead bring you closer to the people and things that matter most.
See you on the other side.
Cheers,
Sandi.
Found this useful? Ask your friends to join.
We have so much planned for the community - can’t wait to share more soon.

