Building an AI team and hiring the right roles can be challenging, but getting started is not nearly as costly or complex as you might assume. The key is to have a good understanding of the problem you are trying to solve with AI and the outcomes you want to achieve. Often, the kinds of problems that organizations are trying to address with AI are very simple and don’t require massive investments or a large team.
Depending on your requirements, a single data engineer, two machine learning engineers, and a data lead—a total of four people—might be all you need to get started. How an organization builds on that base depends largely on its objectives for AI and its understanding of the key roles needed in the AI team to help achieve those objectives.
Not all teams will require all positions. For example, large, established AI teams can have roles that may not be necessary for organizations that are just starting out or that lack the resources.
Some roles are also harder to fill than others. ML engineers, for example, are a hot commodity. They are hard to come by because those who are good and have any kind of seniority quickly get pulled into more senior, strategic roles, such as principal ML engineer or head of AI.
In contrast, ML researchers and scientists are somewhat easier to recruit because they tend to come out of academic institutions and are available through traditional talent-sourcing mechanisms.
Here are five key AI team roles organizations need to consider, whether they are just starting their AI journey or building on an existing effort.
These are the essential roles for building an effective AI team.
Data scientists help to close the measurement gap between running ML models and understanding their impact on business outcomes. They work with ML engineers and product engineers to quantify impact and to provide information and reports that help improve ML models in production or build new ones.
When a model is not performing as well as expected, it usually is the data scientist who helps to identify why and find correlations in data that explain any performance issues.
In addition to helping organizations measure the impact of ML models that are in production, data scientists identify opportunities where organizations can leverage AI to drive new business benefits or address specific business problems and issues.
Data scientists need some programming ability, but generally, they require less of a technical background than ML engineers. Many data scientists have finance and business backgrounds, which help them connect the technical details to the business outcomes.
Data engineers take raw data from one or more sources and convert and structure it for use by ML engineers and data scientists. The end products they typically produce include extract-transform-load (ETL) pipelines and data stores and data lakes that offer a centralized source of usable data for analytics. Often, data engineers work with data scientists, ML engineers, and product managers to identify new sources of raw data or to help track and mitigate problems with existing data sources.
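As a rough illustration of what an ETL pipeline does, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for the example: the sample data, the column names, the `customers` table, and the rule that rows with a bad date or a missing spend value are dropped. A real pipeline would read from actual source systems and follow the organization's own data-quality rules.

```python
import csv
import io
import sqlite3
from datetime import date

# Hypothetical raw export; in practice this would come from a file, API, or database.
RAW_CSV = """customer_id,signup_date,monthly_spend
101,2023-01-15,49.90
102,2023-02-01,
103,not-a-date,120.00
104,2023-03-20,75.50
"""


def extract(raw_text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: keep rows with a valid ISO date and numeric spend, and cast types."""
    clean = []
    for row in rows:
        try:
            date.fromisoformat(row["signup_date"])  # raises ValueError on bad dates
            spend = float(row["monthly_spend"])     # raises ValueError on empty values
        except ValueError:
            continue  # assumed rule for this sketch: drop unusable rows
        clean.append((int(row["customer_id"]), row["signup_date"], spend))
    return clean


def load(records, conn):
    """Load: write cleaned records into a table that analysts and ML engineers can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, signup_date TEXT, monthly_spend REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(raw_text, conn):
    """Run extract, transform, and load in order; return the number of rows stored."""
    load(transform(extract(raw_text)), conn)
    return conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection), "clean rows loaded")
```

The point of the sketch is the division of labor: once the cleaned table exists, a data scientist or ML engineer can query it directly instead of repeating the parsing and validation work.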
Data engineers are crucial for any organization that has ML models in production or a large volume of raw data from which it wants to mine usable information. A data engineer can amplify the productivity of a data scientist or ML engineer by eliminating the need for them to track down and clean up data before they can start building models.
Depending on the complexity of an organization’s internal data, data engineers can start providing value to the organization in the first month, even just by cataloging and making available data sources for data scientists and ML engineers. But they can also get bogged down by internal politics over issues such as data access rights and requirements for data retention and compliance. They may need to spend a lot of time slogging through those issues.
Many organizations use the titles of “ML researcher” and “ML engineer” somewhat interchangeably, and this can lead to mistaken expectations.
ML engineers get ML models working in production. They are responsible for implementing the ML models that data scientists build or prototype and for ensuring that those models work at scale in a production environment. The ML engineer role may involve writing production-level code and building APIs around ML models to integrate them with the rest of the production environment. Often, they work with data engineers to create an end-to-end workflow for the process.
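To make "building APIs around ML models" a little more concrete, the following is a minimal sketch in Python. The `ChurnModel` class, its scoring rule, the field names, and the `handle_predict` function are all hypothetical stand-ins invented for this example; a production service would load a real trained model and sit behind a web framework of the team's choosing.

```python
import json


class ChurnModel:
    """Hypothetical stand-in for a trained model handed over by a data scientist."""

    def predict(self, monthly_spend, support_tickets):
        # Toy scoring rule for illustration only; not a real trained model.
        score = 0.1 + 0.15 * support_tickets - 0.002 * monthly_spend
        return min(1.0, max(0.0, score))


REQUIRED_FIELDS = ("monthly_spend", "support_tickets")


def handle_predict(request_body, model):
    """Validate a JSON request, call the model, and return (status_code, json_body)."""
    try:
        payload = json.loads(request_body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "body must be valid JSON"})
    if not isinstance(payload, dict):
        return 400, json.dumps({"error": "body must be a JSON object"})

    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        return 400, json.dumps({"error": "missing fields: " + ", ".join(missing)})

    try:
        spend = float(payload["monthly_spend"])
        tickets = int(payload["support_tickets"])
    except (TypeError, ValueError):
        return 400, json.dumps({"error": "fields must be numeric"})

    score = model.predict(spend, tickets)
    return 200, json.dumps({"churn_score": round(score, 3)})


if __name__ == "__main__":
    status, body = handle_predict(
        '{"monthly_spend": 50, "support_tickets": 2}', ChurnModel()
    )
    print(status, body)
```

Most of the code is input validation and error handling rather than modeling, which is typical of the production-side work that distinguishes this role from prototyping.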
ML researchers, on the other hand, are more focused on advancing a specific domain or domains within the ML field as part of their organization’s broader—and often longer-term—goals. Some companies have ML researchers whose job it is to produce research reports, academic papers, and interactive proofs-of-concept. Their mission often is to work on more fundamental, long-term, theoretical problems. The business impact of an ML researcher’s role often takes years to be fully realized.
Many people make the mistake of hiring a researcher when they really need an ML engineer to get things into production. The best way to differentiate between the two is to look at the desired outcome. If you want to get something working in production that has a business impact, you need an ML engineer. If the outcome you want looks more like a research paper or a demo, you need an ML researcher.
It is a red flag anytime someone you interview for a position as an engineer insists on having the “ML researcher” title. ML engineers who think of themselves as ML researchers may be harder to motivate to address the applied side of the problem.
Software engineers in AI teams typically work with ML engineers to integrate ML models into production systems. The work might involve hooking up a model to front-end systems, connecting to a data source, building APIs and user interfaces, and ensuring that the final output is what the product manager or other internal stakeholder wants.
In most AI teams, the data scientist produces the ML model, and the AI software engineer then works with the ML engineer to ensure that it can be consumed by the applications and users for which it was designed.
Depending on the size of your data science team and its maturity, an ML engineer could conceivably handle the software engineer’s role as well, because at the end of the day, the primary mission of both roles is to ensure that ML models are implemented and integrated with production systems.
The chief AI officer, who may go by the title of “chief research officer,” has both an external- and an internal-facing role. The external responsibility is tracking what is going on in the AI field within the research and academic community and keeping the company informed about it. Internally, the chief AI officer is an advocate for those developments and helps the business understand how advances in AI and ML can be applied to different problems within the company. Individuals in this role help teams that have never worked with AI identify opportunities for applying data science methods to improve business outcomes, and they can serve as a resource for a business group looking for ways to use AI to solve a problem.
The chief AI officer is also responsible for setting enterprise policy that spells out what the organization should and should not be doing around AI, as well as for providing guidance on issues such as data bias.
A good chief AI officer has a technical background that matches whatever version of ML engineer or ML researcher exists within the AI team. For example, if a lot of those roles are focused on producing papers and R&D, the chief AI officer should excel at those skills. If the internal team is more focused on getting things into production, the chief AI officer should have a solid background in those areas.
Most importantly, chief AI officers must have the ability to translate deeply technical subjects into nontechnical language. They need to be able to explain complex AI and ML concepts and why AI might be good or terrible at solving a particular problem.
As your AI effort grows, you might find these other roles to be helpful—but you won’t need to hire them at the very beginning.
The data analytics lead manages data scientists and data engineers at organizations where the AI team has grown large enough—or has become mature enough—to effectively be a sub-team within the larger enterprise AI group.
This title is a bit of a misnomer; it typically refers to people in roles such as principal ML engineer or principal research scientist. The role falls roughly halfway between that of the ML engineer and the chief AI officer or chief research officer. Like the latter, principal engineers and principal research scientists represent the company externally.
They also function as sounding boards for ideas and issues internally. If there is a sticking point in a companywide initiative, it is the principal ML engineer or research scientist who must step in to discover the problem’s root cause and find a way to resolve it. The role typically reports to the CTO or CEO.
There is some value to having one individual in the role for organizations with a large enough AI team. However, it probably is not a good idea to have two or more of them.
Having a sponsor to articulate the business value of the AI effort and highlight its successes is extremely valuable. Sponsors can help define the business outcomes that the AI team is trying to achieve. They can be advocates for using AI to address specific problems or explore new opportunities for using data, and cheerleaders when ML models achieve desired outcomes.
Another value in having an AI sponsor is that, because many people usually contribute to the effort to build an effective ML model, it is helpful to have someone in place who can procure the executive buy-in necessary to get enterprisewide collaboration and support for the effort.
Building an effective AI team is all about knowing what roles you need to meet the specific goals of your organization. A good starting point is to consider whether your organization’s primary goal is to use ML models in production to impact business outcomes or to conduct longer-term research and understand opportunities to tap AI at a future date.
The kinds of skills that you need to meet the first objective differ from those you need for the second, and the range of skills broadens as you scale up. A pure ML researcher, for instance, is unlikely to be of much value if your focus is on integrating models into the production environment.
Keep in mind too that if your organization is just getting started on the AI journey, you most likely won’t need to stand up a full-fledged team to get your effort off the ground. Often, it is possible to start with a small squad whose team members can handle each other’s tasks.