How to Structure Data Science Teams

In the rapidly evolving field of data science, companies are continuously adapting their organizational structures to effectively incorporate data teams. This article explores three common approaches—centralized teams, embedded teams, and hybrid teams—and provides guidance on which type of structure suits different types of companies.

Centralized Teams: Building a strong foundation

The primary approach for most companies is to start with a centralized team that operates independently from other teams, but can be easily reached by other departments to complete data-related requests. This allows companies to build and develop their infrastructure centrally, which can be expanded later on as the team matures.

The advantage of centralized teams is that everyone works on the same data stack, which gives you a more unified data infrastructure. Training and onboarding are easier, since data experts work closely together. Centralized teams are also more effective at capturing and transferring knowledge.

The disadvantage is that centralized teams might not be able to act on critical insights they develop. If a central team is not tightly integrated with other departments, these insights might be saved somewhere without anyone following up on them. Response time will also be slower, since one central team must handle all queries and they will need to prioritize things.

Embedded Teams: Tailored expertise for departments

As companies expand, they tend to move from a centralized team to an embedded team. The central team gets divided into sub-groups and assigned to work with separate divisions.

“Companies need to centrally build and mature their data team first before distributing it into the organization: ‘centralize and then decentralize’.”

- Shane Murray, CTO @ Monte Carlo Data, from our event THRIVE: NYC Data Leaders

In other words, develop your team and infrastructure first before subdividing into smaller teams that specialize in assisting specific departments.

The main advantage is that departments can get answers faster, since they have a sub-team purely dedicated to handling their queries. Embedded teams can also include domain experts in the division they are assigned to, which will lead to more tailored solutions and insights instead of generalized ones. The finance branch will benefit from data scientists with a background in econometrics, whereas the marketing department can leverage the skills of a social researcher.

Keep in mind that embedding will not work for all companies. Imagine a non-profit that publishes an industry report on diversity in companies twice a year. In this case, a centralized team is a better fit, since the organization doesn’t need a rapid response cycle. The data science team collects and analyzes the data, after which the report is submitted, formatted and published once.

Hybrid Teams: Striking a balance

As companies mature, they often settle on a hybrid approach between these two. Data science teams are centrally managed under the same leadership, and use the same platforms and methodologies, but are divided into smaller teams working closely with specific departments. This strategy provides the best of both worlds, since the team can grow and learn as a whole but is still flexible enough to serve tailored requests from other departments.

The most important challenge for hybrid teams is communicating and collaborating in parallel, while staying focused on their own departments’ needs. Central documentation and information flow are essential, so avoid leaving knowledge scattered across sub-teams. When teams build great solutions, make sure to capture them so they can be studied and reused by the people who join later!

Hybrid or embedded teams are also encouraged to hold all-hands meetings to share successful outcomes, discuss bottlenecks and preview new software tools. These can be informal gatherings where team members share feedback, which can then be formally centralized across the entire data department. Such meetings can reveal issues that might not have been documented otherwise.

How can Vectice help?

Vectice offers a centralized platform that captures all important assets of data science and machine learning projects (notebooks, datasets, model versions). Regardless of whether you’re using central, embedded or hybrid teams, your organization will benefit from a central place where insights and knowledge can be captured. Contact us and try Vectice today.
