In 2020, Barr Moses and the Monte Carlo team wrote a powerful post on the data team personas involved in what they aptly called the “Bad Data Blame Game”. As Barr wrote then, “All you know is that it’s 3 a.m., your CEO is pissed, your dashboards are wrong, and you need to fix it — stat.” If you work in data – you’ve been there, right?
It’s now 2024. Organizations have more data than ever, AI is the new hotness, and Big Tech leaders like Amazon, Netflix, and Facebook are deploying ML to transformative effect. Every company outside of Big Tech is asking: how do we go from merely making data-driven decisions to deploying data products that move the needle?
But for nearly every organization, there’s a massive gulf between the high expectations around what ML will do for your business and the current reality. All you know is that it’s April and AI has been a top priority for your company since August, yet you have only two ML use cases in production, neither is remotely related to generative AI, and no one even agrees on whether they actually work.
This is the Bad Data Blame Game’s younger sibling — what we like to call the Machine Learning Mirage.
How did we get here?
We’ve spoken with hundreds of teams who, in the age of generative AI, are under extraordinary pressure to deliver game-changing AI products and are really struggling. We’re not surprised. Over the years we’ve worked with some incredibly high-performing data organizations, and we’ve also worked with some teams that are train wrecks.
There are some common traits here; while no two teams are exactly alike, each character in the narrative faces a familiar set of roadblocks.
So in today’s post we dive into the seven personas on an ML team: what they do, where they stumble, and what keeps them up at night. We don’t have all the answers, but we are hoping to start a conversation about the core issues.
1. The CEO
Meet the newest member of the ML team: Andrew the CEO. He never used to care much about ML, but lately he’s the not-so-silent partner across all his ML teams.
Andrew doesn’t want to get in the weeds, and frankly, he doesn’t know how to. But he does need to know what’s going on because he’s under constant pressure from the board to build something with AI that’s actually useful.
This all manifests as Andrew constantly forwarding LinkedIn posts about AI to his teams with helpful questions like, “Are we doing this yet?”
As CEO, he has a million problems to manage every day. But there’s a constant drumbeat of anxiety around ML: Are we doing everything we need to with our data? Will we be outpaced by competitors who are using AI better?
2. The CTO
Teresa the CTO didn’t know much about AI until last year, yet she’s suddenly been thrust into the spotlight as the company’s face of AI. She’s picked up the baton accordingly and is pushing everyone to take big bets on AI and organize their products around it.
This is hard because AI is so new. Teresa doesn’t have much experience leading teams that ship AI, let alone any experience actually building AI products herself. And the development lifecycle of these products is very different from traditional software engineering: the cycle time is much longer, there’s far more experimentation, and frankly, her finely honed intuition for what should and shouldn’t work doesn’t seem to apply here (who ever dreamed generative AI would be so powerful?!).
She spends a surprising amount of time trying to distinguish the tangible opportunities — and her teams’ progress on them — from the snake oil.
3. The data scientist
Veronica, the visionary data scientist, is the ML team’s dreamer. She wields her PhD in physics to imagine innovative ways to turn data and math into practical solutions that address complex business problems.
Exploring the latest and most sophisticated modeling techniques is her favorite part of her job — in fact, sometimes it might be too much fun. She can lose track of time dreaming up exciting ways to leverage the latest science.
But if the data is broken, she can’t move forward. If she gets the data, but then her infrastructure falls over because it doesn’t scale, she gets stuck. And if her stakeholders change their mind on what she should actually optimize for, months of work go to waste.
Bottom line: Veronica needs reliable data, the right tools, and clear guidance from leadership. If she has all three, and she keeps her solution as simple as possible (and no simpler), she can deliver. Easy, right?
4. The software engineer
Peter the software engineer acts as the ML team’s pragmatic architect. He has to visualize how all the pieces (Veronica’s model, the product requirements, and the underlying infrastructure) work together, so he can make the system work and imagine all the possible ways it could break.
That’s because Peter knows that if something goes wrong in production, it won’t be the data scientist getting paged at 3 a.m. It’ll be him.
In the worst-case scenarios, Peter isn’t brought into the process early enough to influence decision-making. So he’s left wondering what the data science team is cooking up without him, what kind of spaghetti code might get thrown over the wall, and what kind of edge cases or scalability challenges he’ll be asked to accommodate.
5. The data engineer
Ian the data engineer used to focus on data needs for analytics alone, but all of a sudden he’s also being asked to manage a much larger array of more complex sources: the log and event data that the ML models need to ingest. His pipelines are the linchpin of the whole ML operation, but too often nobody notices Ian’s work unless something’s gone wrong.
Ian laughs to himself when he hears Peter complain about being left out of the loop. If only Peter knew how often data engineers are the last to find out about changes!
Especially as his company invests in complex AI products, Ian’s role demands more of a spotlight. His needs should be considered before things break, not after.
6. The ops stakeholder
Samson, with decades of experience in operations, is the bedrock of practical business acumen within the company. Sitting in meetings with data scientists and their freshly minted PhDs, Samson wonders, How on earth do these tech folks really know what my customers need?
Samson’s days are filled with direct customer interactions, resolving issues, and leading sprawling multinational teams — positioning him perfectly to see the tangible benefits of ML products. But Samson remains a skeptic; he doubts any fancy computer model could replicate his experience and intuition. And if it could…what would that mean for his empire?
So he’s a wary collaborator, and he (appropriately) questions whether the ML teams are overlooking complex effects, like the erosion of long-term consumer trust or loyalty, when they report the winning results from their latest experiments.
7. The product manager
Paige the product manager wears her rose-colored glasses to work every day. She’s there to bridge the gap between the operations and tech teams, and confidently sell everyone on a shared vision: a better future for their customers, thanks to data science and AI.
But Paige knows her job requires a politician’s deft touch. She’s both the most empathetic and the most charismatic of the bunch, yet she’s not an expert in the underlying math or technology.
So she’s the people-wrangler-in-chief: she works to bring teams together, help them build trust, set shared goals, and align on timelines. Critically, she also works to convince skeptical Samson and concerned Teresa that the team actually knows what it is doing.
A big part of this, of course, is making sure they’re solving the right problem in the first place.
Untangle the knots to make ML succeed
There’s a common thread here: ML is hard for everyone. Leaders are battling imposter syndrome. Stakeholders are trying to stay in the loop. Practitioners are facing constant change. And everyone’s under a lot of pressure to show that ML isn’t just a mirage, but is actually an oasis.
Hopefully this post makes ML teams feel seen, and helps leaders identify ways to create the kind of environment everyone needs to do their best work.
None of this is easy. But when it works, it can be transformative.
That’s why we’re chronicling the challenges of ML as we build our own products — like our recent posts on who should own ML in the org and the five biggest breaking points for ML in the business. Check them out to dig a little deeper into how ML actually gets done.