Note from Jeremy & Duncan: Hey folks, we’re Jeremy Hermann and Duncan Gilchrist, and we’re all about ML and data products. Today’s post dives into a question we get all the time from senior leaders: where does ML go wrong?
Picture this: a tech leader in a successful organization is personally fascinated by AI. The possibility that competition might leapfrog them keeps them up at night. So they decide to push for an ambitious machine learning project to stay ahead.
The leader manages to attract and hire a few talented data scientists, buys an off-the-shelf ML infra solution, and throws the new team at a big business problem — but weeks turn into months with no results. The transformative impact promised to the CEO never comes to fruition.
The now-humbled tech leader has little to show for their efforts as the project becomes a derided example of “tech gone wrong”. The data scientists gradually leave, and most stakeholders within the organization now consider ML an overhyped waste of time and money.
Sound familiar? Stories like these are all too common, even at the biggest, most well-resourced companies, because executing machine learning initiatives is incredibly difficult.
We saw this firsthand leading data science and engineering teams at Uber, Tecton and Gopuff, where ML initiatives had tremendous payoffs, but where it could take months to update just one or two existing ML models. Luckily, our leadership team understood the value of ML and how painstakingly slow that work could be.
But in many organizations, CEOs and tech leaders can understandably lose faith in ML as a discipline as they watch project after project stall out, fail, or take months to materialize value.
We feel for those leaders. They have to balance their responsibility to make wise, responsible investments in data and tech with the very real fear of being left behind in the current AI arms race.
We’ve spent years working on ML, from the day-to-day trenches of data science all the way to the boardroom. Along the way, we’ve identified five common breaking points that cause ML projects to fail, time and again:
Fear of the unknown
Lack of buy-in from non-technical teams
Poor collaboration within tech
Not getting the data science right
Failing to invest in the right infrastructure
Let’s unpack what each of these breaking points looks like, and its underlying causes, so you can tackle them head-on in your organization.
Breaking point #1: Fear of the unknown
ML is transformational — for better or worse. There are big payoffs and big risks involved, and that can create a fear of the unknown that undermines projects.
When ML goes wrong, it can leave organizations in worse shape than when they relied on manual processes or traditional analytics.
Inaccurate sales forecasting can drive companies to maintain far too much — or too little — inventory.
Poor fraud detection can alienate valuable customers.
Or take Zillow’s big bet on automated home buying, which led to a $304 million inventory loss, plummeting stock prices, and layoffs of 25% of its staff.
Fear of the unknown means letting these horror stories stop you before you even get started, causing you to miss out on the real value of ML.
That’s a mistake because when ML projects succeed, they can change a company’s entire trajectory. Organizations can leverage data and automation to predict the future, not just analyze in hindsight.
Consider how ML-enabled realtime personalization has powered entirely new social networking experiences across TikTok, Instagram, and YouTube. Or how Amazon leverages ML-based forecasting to know what to stock and when by predicting inventory needs for 400 million products. Or how Netflix uses ML to figure out what shows to make, and then to optimize encoding and bitrate selection while you stream.
ML at scale is shockingly powerful — at Uber, when Duncan’s team made a meaningful improvement to an ML model, they would routinely deliver over $100m in annualized profit.
The solution here is obvious but easy to miss: leaders need to push forward while actively managing the risks. Start by making ML investments incrementally. Identify a handful of beachhead use cases, give them the right resourcing and visibility, make them work, and build from there.
Breaking point #2: Lack of buy-in from non-technical teams
ML projects make a sizable impact across the organization — and that means non-technical teams need to be ready to collaborate with data scientists to ensure positive outcomes.
Take pricing. In traditional businesses, pricing is owned by operations or merchandising teams — and since it’s high visibility and high leverage, everyone else is also going to have an opinion. Pricing is multifaceted — decisions need to be incorporated into a larger strategy, including brand positioning, discounting, and membership programs.
When machine learning algorithms are put in charge of pricing, they’re going to optimize for things that are easiest for them to measure. A key implication is that ML algorithms will focus on short-term effects — sometimes as short as an individual user’s session. Long-term effects, like how changes in prices impact customer perceptions and brand reputation, aren’t going to be encoded into dynamic pricing models.
That’s why data scientists need to collaborate with non-technical teams to take these factors into account, and operations and brand teams need to be involved in conversations about pricing floors and discounting strategies.
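To make this concrete, here is a minimal sketch of what encoding those business constraints can look like in practice. All names and numbers are hypothetical; the point is simply that the guardrails agreed with ops and brand teams become explicit bounds around the model’s raw, session-level price suggestion.

```python
# Hypothetical sketch: clamping a pricing model's raw output to limits agreed
# with ops and brand teams. Names and numbers are illustrative, not a real API.

def apply_pricing_guardrails(model_price: float,
                             list_price: float,
                             floor_price: float,
                             max_discount_pct: float) -> float:
    """Bound a model-suggested price by business-owned guardrails."""
    # Brand team's rule: never discount more than the agreed percentage.
    min_allowed = list_price * (1 - max_discount_pct)
    # Ops team's rule: never sell below the operational floor (e.g. cost + margin).
    lower_bound = max(floor_price, min_allowed)
    # Never price above list -- dynamic markups erode customer trust.
    return min(max(model_price, lower_bound), list_price)

# The model's session-level optimization suggests an aggressive discount;
# the guardrails pull it back to the agreed limit.
price = apply_pricing_guardrails(model_price=4.10, list_price=10.0,
                                 floor_price=6.50, max_discount_pct=0.25)
```

The guardrail values themselves come out of exactly the cross-team conversations described above — the model owns the optimization, but the business owns the bounds.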
The problem? These non-technical leaders have to leave their egos at the door. (Don’t worry, we’ll get to the technical leaders’ egos momentarily!)
Introducing ML often means asking a non-tech team to surrender organizational territory or even headcount to automation. They aren’t simply asking IT to build them tools to make their jobs easier. Rather, the perception is, “I’m being asked to help the tech team build tools to take my job away”. And frankly, since data scientists often struggle to speak the language of business leaders, they probably aren’t doing the best job at communicating the positive value of their work.
It’s easy to see why these non-technical leaders may be reluctant to collaborate.
That can translate to operators — the employees who will eventually use the models to automate their once-manual work — throwing up roadblocks like:
Refusing to be specific enough about their strategy to encode it into variables machine learning models need to optimize for
Setting unrealistic expectations for what controls and backstops they need over the algorithm, which prevents the models from actually automating the work
Changing key things on their own — without telling the ML team — which contaminates the data used by the ML team
And even when ML algorithms eventually perform well, operators can deliberately miss the forest for the trees by focusing on the edge cases of occasional bad decisions and ignoring the larger value being delivered. (Though that isn’t to say data scientists never make mistakes!)
It’s imperative that CEOs and tech leaders understand why their colleagues are reluctant to collaborate — and find ways to incentivize every team to support and share in the success of ML automations. And it’s worth doing since ML is a tide that can lift all boats.
Breaking point #3: Poor collaboration within tech
It’s not always operations teams that fail to collaborate on ML projects — sometimes, the call is coming from inside the house.
Data scientists need many other tech teams to execute ML successfully: data engineers and upstream producers who supply high-quality data, and downstream machine learning engineers and software engineers who provide solid deployment and infrastructure.
For example, consider a data science team that’s working with marketing to build a model that will personalize email communications. Seems like a no brainer, right?
To do it, the team will need the right instrumentation on the website to collect data that shows what a user is interested in. Once they build the model fitted to that data, they’ll need infrastructure to use the model and surface the right content within the email. That will involve multiple systems and multiple frontend and backend engineering teams. They’ll need data engineers to build the right pipelines. And they’ll need to hope everyone can work together to prevent unintended changes that break the entire system – because the broader website is probably owned by a team that has no idea this particular data science team is using their data!
Surprised that this collaboration might not happen seamlessly? We didn’t think so.
But when the cracks surface, everything slows down and conflict occurs. We’ve seen upstream changes as seemingly innocuous as refining the definition of a user session bring down major ML systems.
It gets worse. Since data scientists usually come from math and stats backgrounds — not software engineering — they tend to deliver poor-quality code to engineers, who then have to clean everything up before they can put it into production. Needless to say, no engineer is ever excited to clean up someone else’s mess.
Overall, our experience is that within-tech breakdowns happen when:
Data quality and data science is treated as an afterthought by upstream producers, not baked into product decisions or considered when making schema changes
Data scientists themselves aren’t held to high standards on code quality, and put the burden on software engineers when they produce spaghetti code and “throw it over the fence”
These orgs aren’t aligned on priorities — especially when it comes to shared infrastructure like data quality and ML platform tooling
The solve here is twofold:
First, make sure data and machine learning are prioritized in the same way up and down the chain and across functional groups. Often this means staffing a given problem with a cross-functional group of data scientists, engineers, and a product manager, so that the right set of skills is collectively on the hook to deliver.
Second, make sure ML is viewed as a long-term investment that should be held to (or close to) software engineering standards. Hacky one-offs create big problems.
It’s critical to remember that data scientists are an important piece of a much larger puzzle, and without support from their engineering and product partners they won’t be able to deliver successful ML projects on time or on budget.
Breaking point #4: Not getting the data science right
Building great ML models is hard.
It’s sophisticated work — and also slow, painstaking, and labor-intensive. It requires data scientists to manually sift through tables to find relevant data, analyze it, clean it, build model features, and train and tune the model. Then, as we’ve just discussed, data scientists have to collaborate with partner teams to productionize their work. The whole process takes months, and even then it doesn’t always work.
To make matters worse, the data scientists and ML engineers who are skilled enough to do this work are expensive and in high demand. Organizations are usually competing with the Metas and Ubers of the tech world to hire them; many companies can’t attract or afford this talent in the first place.
And if they do manage to hire a superstar modeler, they still need to support them well – which is hard because data science is pretty different from more typical software engineering.
As Andrew Ng writes,
Most of the work of building a machine learning system is debugging rather than development … When you’re building a traditional software system, it’s common practice to write a product spec, then write code to that spec, and finally spend time debugging the code and ironing out the kinks. But when you’re building a machine learning system, it’s frequently better to build an initial prototype quickly and use it to identify and fix issues.
Managing data scientists requires adapting your management style accordingly: getting comfortable with prototyping and experimentation so that scientists have the time and space they need to experiment and succeed, while also holding them accountable for delivering results.
The right mix of R&D and pragmatism is hard to accomplish. One rule of thumb we’ve found helpful is to ask often, “How do we know this will work?” It’s surprisingly easy for your staff data scientist to get so excited about the math that they forget to ask that question. This is especially important at a project’s outset: does the objective function the model is optimizing for actually map to what the business cares about?
Zooming out, these breakdowns are simple: companies can’t hire talented data scientists, don’t give them the support they need to do their best work, or simply can’t manage them correctly. Without getting the science right, ML projects will undoubtedly fail.
The good news is that the ranks of data scientists are growing quickly, best practices are getting worked out, and many companies are vying to create tooling that increases their leverage.
Breaking point #5: Failing to invest in the right infrastructure
While many of these challenges are organizational, technology also plays a vital role. ML infrastructure is a double-edged sword: it’s hard to determine the right solution, and getting it right really matters.
At first glance, there are plenty of “end-to-end” ML platforms and point solutions on the market to choose from. But the sheer number of options makes it hard to differentiate between them and evaluate the tradeoffs. And once a solution is selected, it often requires a considerable amount of bespoke work to get everything wired up and functioning correctly — even with so-called end-to-end platforms. That means more data engineering resources and more collaboration. And since this infrastructure tends to sit inside production flows, it can be painful and risky to remove if you find the solution isn’t performing as promised.
When an organization chooses the wrong infrastructure, or doesn’t invest in a robust solution altogether, data science teams will never perform to their full potential.
This can be binary: for example, if a data scientist wants to do real-time machine learning predictions — like personalization, fraud detection, or adtech — they need tech that lets them do real-time serving.
Or it can be death by a thousand cuts. Even at seemingly mature tech companies, it’s shockingly common for data scientists to build models on their own laptops, then hand off the results as loose CSV or JSON files. This kind of manual process is impossible to scale, error-prone, and puts valuable models at risk when a key data scientist departs the company.
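Even a lightweight alternative to the laptop-and-CSV workflow helps here. As a minimal, hypothetical sketch (a real team would use a proper model registry such as MLflow; the record fields below are illustrative), the idea is to attach provenance metadata to every model artifact so it remains usable after its author leaves:

```python
# Hypothetical sketch: a minimal versioned-artifact record, as an alternative
# to emailing loose CSV/JSON files. Field names are illustrative, not a real API.
import hashlib
import json
from datetime import datetime, timezone

def make_artifact_record(model_bytes: bytes, trained_by: str, git_sha: str) -> dict:
    """Attach provenance metadata to a serialized model artifact."""
    return {
        # Content hash lets anyone verify they are serving the exact bytes trained.
        "content_hash": hashlib.sha256(model_bytes).hexdigest(),
        # Who trained it and which code version produced it.
        "trained_by": trained_by,
        "code_version": git_sha,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }

record = make_artifact_record(b"...serialized model...", "alice", "3f9c2ab")
print(json.dumps(record, indent=2))
```

The specific tooling matters less than the discipline: every production model should be reproducible from recorded data, code, and artifact versions, not from files on someone’s laptop.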
Finally, all those handoffs to other tech teams — which are already challenging to navigate — become that much harder without quality, shared infrastructure in place.
Ultimately, this can lead to talented data scientists becoming frustrated by bottlenecks, feeling unsupported, and leaving for another organization with better resources.
Fortunately, the DS tech landscape is changing rapidly. The best leaders stay on top of new solutions, pay close attention to the infrastructure that winning teams are using, and are proactive about carefully vetting and investing in quality tools for their teams.
How leaders can respond
If these breaking points sound familiar, you’re not alone. We’ve encountered them firsthand at multiple organizations, and talked with dozens of data and tech leaders who work to overcome them on a daily basis.
Data science and ML are hard work, but understanding where the challenges arise is a huge step forward in building collaboration and investing in the right talent and technology. Here are a few proactive steps you can take:
Evaluate your organization’s risk factors across each of these areas to understand your potential pitfalls
Prioritize building buy-in, alignment, and collaboration across technical and non-technical teams before taking on any new ML initiative
Build a concentrated portfolio of data science bets and focus on making them work
When you put these principles into practice, your organization will become more attractive to your future data science hires — and your lofty ML goals will be much more likely to succeed.