To be effective in a business setting, data science projects must focus less on the science involved and more on the business. Doing data science in an enterprise requires careful alignment with corporate goals and values, said Keegan Hines, vice president of machine learning at Arthur AI, a platform that monitors the performance of machine learning models.
In his Tech Talk session for AI Exchange, “Translating Data Science to Business Value,” Hines said there are both technical and corporate challenges to translating data science into business value. Technical challenges include model reproducibility, model deployment, and model monitoring. On the corporate side, organizational challenges and an us-versus-them attitude can make it difficult to bring the full potential of data science to bear on the needs of the business.
During his talk, Hines shared some lessons he learned while trying to implement data science projects that create business value. Here are the key takeaways from his session.
This may sound obvious, but very often data scientists will start thinking about algorithms, optimizations, and metrics—and lose sight of the fundamental question of whether they’re working on something that matters to the business.
That’s why, when designing a project, data scientists need to corral folks from both the technical and business sides of the company, Hines said. This will help create a shared context to ensure that the data scientists are cooking up something that addresses an important problem facing the company.
Acceptance of a data science project in a business environment depends on stakeholder participation. Data scientists may be excited about the techniques in a project they believe will transform a business. But they shouldn’t present it as a fait accompli to the people who will have to work with and implement the project—essentially telling them that the way they’ve been doing things for years is all wrong. That approach is more likely to invite failure for the project.
So even if a team of data scientists gets the “what” of a project right, they’re still courting failure if they don’t get the “how” right, too. The tech can be there, and the business impact can be there, but the inclusivity must be there, as well. “We need to get everyone engaged together in a collaborative way to prevent organizational friction,” Hines said.
To set up a data science project for success requires not only everyone working together from the get-go, but also structuring the project so everyone is set up to share the victory at the end of it. “There should not be any us-versus-them mentality,” Hines said. “We need shared wins and shared commitment from both sides.”
Ask data scientists about the performance of their model, and they’ll likely start talking about things like ROC curves and AUC, performance measures used in machine learning. Those measures aren’t very helpful when trying to sell a project to business leaders who are unfamiliar with them.
To do that, data scientists need to focus on the key performance indicators (KPIs) that are meaningful to the business. The KPIs should be easy to measure. Moreover, project designers need to make sure they have the data to show that the model is having an impact on the KPIs.
For example, if an organization has a fraud-detection model, a data scientist can measure the efficacy of that model with metrics such as precision and recall. But those metrics mean little to the business. What the business cares about, and the KPIs that matter most to it, are whether fraud losses are falling and staying down, and whether the machine learning system is responsible for that.
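The gap between the two vocabularies can be made concrete in a few lines of code. This is a minimal sketch with entirely hypothetical counts and dollar amounts: the same confusion-matrix numbers drive both the model-speak metrics (precision, recall) and a business-facing KPI (estimated fraud losses prevented).

```python
# Contrast ML metrics with a business KPI for a fraud-detection model.
# All numbers below are hypothetical, for illustration only.

def precision_recall(tp, fp, fn):
    """Standard model metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)  # of flagged transactions, how many were fraud
    recall = tp / (tp + fn)     # of actual fraud, how much was caught
    return precision, recall

def fraud_losses_prevented(tp, avg_fraud_amount):
    """A business-facing KPI: dollars of fraud the model caught."""
    return tp * avg_fraud_amount

# Hypothetical monthly counts: true positives, false positives, false negatives.
tp, fp, fn = 180, 40, 20

precision, recall = precision_recall(tp, fp, fn)
prevented = fraud_losses_prevented(tp, avg_fraud_amount=750.0)

print(f"precision={precision:.2f}, recall={recall:.2f}")       # model-speak
print(f"estimated fraud losses prevented: ${prevented:,.0f}")  # business-speak
```

A stakeholder presentation would lead with the last line, not the first.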
After a system is deployed, the people who developed it may move on to other projects, leaving it to run by itself with very little supervision. In that situation, performance will eventually degrade. And when that happens, investigators may find that the conditions in place when the system was implemented have changed.
For example, a data input may have changed and the model may need updating. Perhaps the way data is encoded was altered and no one downstream was told about it.
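A change like that re-encoding is exactly what an automated drift check can catch. The sketch below is a simplified illustration, with a hypothetical feature, made-up data, and an arbitrary threshold: it flags a model input whose live mean has drifted far from its deployment-time baseline.

```python
# A minimal drift check: flag an input feature whose live distribution no
# longer resembles what the model saw at deployment. Data and threshold
# are hypothetical, for illustration only.

import statistics

def drift_alert(baseline, live, z_threshold=3.0):
    """Return True if the live mean has shifted more than z_threshold
    baseline standard deviations away from the baseline mean."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    shift = abs(statistics.mean(live) - mu) / sigma
    return shift > z_threshold

# Baseline: transaction amounts (dollars) seen when the model went live.
baseline = [20, 25, 22, 30, 28, 24, 26, 23]

# Live traffic: an upstream change silently re-encoded amounts in cents.
live = [2100, 2600, 2400, 2900]

print(drift_alert(baseline, live))  # the check fires; the model needs attention
```

Production monitoring tools apply more robust statistics over many features, but the principle is the same: compare live inputs against a recorded baseline and alert when they diverge.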
The point is, you need to have best practices in place around monitoring and governing the system to ensure that, once it’s live, it will continue to run smoothly. Moreover, that’s something that needs to be considered during the design process, so that when the project is done, the tools to perform effective monitoring will already be in place.
Continuous monitoring serves another purpose as well: it allows the developers of a machine learning project to justify it at any time, because they can show that it is performing as designed and that no conditions have changed that could affect its performance.
“In deploying machine learning systems, like any kind of technology system, we have to think about ongoing governance, ongoing maintenance, ongoing monitoring, and bringing best practices to that kind of a project as you would for any other kind of software,” Hines said.
Data science has the potential to transform business, but projects incorporating the technology need to be designed with business goals in mind. In addition, these projects need input from the stakeholders affected by them if they’re to gain any traction in a business. Finally, once implemented, projects need continuous care and feeding to maintain optimum performance.