In the Machine Learning/Data Engineering (MLDE) department, we like to think of ourselves as the Hollywood rockstars of data with devil-may-care attitudes who follow nobody’s rules but our own. After we’re done fantasizing, we must confess that, as in many growth companies, we actually aren’t very useful in isolation.

For example, an analyst working without a data engineer or data scientist can still run queries and put together an occasional report or model. A data engineer in isolation, by contrast, can move raw data around but cannot necessarily tell a full data story.

So, why dare call ourselves rockstars? Well, data engineers and data scientists rock because of the way we collaborate, like performing in a rock band. In fact, data engineers and data scientists at ActiveCampaign collaborate with people throughout the entire organization, ranging from analysts to DevOps to product, and, of course, with each other. Through collaboration, we build the binding fabric that lets useful data flow through the organization.

We partner with the Analyst team and support multiple stakeholders

Business dashboards and reports are essential for the operations at ActiveCampaign. Through a tight partnership with our analyst team, we serve the analytics for a wide range of internal stakeholders, from sales and marketing to education and customer success.

The analyst team translates their deep data knowledge into good-looking and meaningful visualizations, while we work hard to build robust data pipelines, handle large data volumes, and ensure good data quality. Beyond the engineering challenge of building data infrastructure as the foundation for great analysis, we face a second challenge: swiftly fulfilling the increasing demand for data and analysis.

We have settled on a workflow that works well for us. Internal stakeholders first submit their requests through a Google form, and we hold office hours twice a week to discuss the technical details. We also use Jira tickets to share notes and track progress, and stakeholders can always ask quick questions in our Slack channel.

We customize data pipeline solutions

Although we have created many ETL pipelines that feed the client-facing reports, we don’t provide a cookie-cutter solution for every report. We customize each pipeline based on the data patterns of the product feature it serves.

In a recent product feature launch, we met with the Product team and talked about what reports they need, what data points are involved, how the data is produced, how the data is used, and what data is important in their reporting. Then our engineer, Di Zhuang, did a proof of concept (PoC) on connecting Kafka to Snowflake and came up with two solutions for that Product team to choose from.

Unlike other products that save event data into MySQL tables, this product publishes its clickstream data directly into Kafka, and our system then exports the data from Kafka to Snowflake to power the reporting data. As a result, end-users can view the data in near real-time, and the entire system is much more scalable given the distributed data model in Kafka and Snowflake.
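To make the Kafka-to-Snowflake flow a little more concrete, here is a minimal sketch of the micro-batching step that sits between a stream and a warehouse. It is an illustration only, not our production pipeline: the function names `batch_events` and `export`, the JSON message format, and the `load_batch` callback are all hypothetical, and the Kafka consumer and Snowflake loader are replaced by a plain iterable and a plain callable so the example runs on its own.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List


def batch_events(messages: Iterable[bytes], batch_size: int) -> Iterator[List[Dict]]:
    """Group decoded JSON messages into lists of at most batch_size events."""
    batch: List[Dict] = []
    for raw in messages:
        # Assumption: each message value is a UTF-8 encoded JSON object.
        batch.append(json.loads(raw))
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


def export(
    messages: Iterable[bytes],
    load_batch: Callable[[List[Dict]], None],
    batch_size: int = 500,
) -> int:
    """Send every batch to load_batch and return the number of events exported."""
    total = 0
    for batch in batch_events(messages, batch_size):
        # Hypothetical sink: in a real system this would write to the warehouse.
        load_batch(batch)
        total += len(batch)
    return total


if __name__ == "__main__":
    # Stand-in for a stream of clickstream messages.
    stream = [json.dumps({"event": "click", "id": i}).encode() for i in range(5)]
    loaded: List[List[Dict]] = []
    count = export(stream, loaded.append, batch_size=2)
    print(count, [len(b) for b in loaded])
```

Loading in small batches rather than one row at a time is one common way to keep reports close to real-time without overwhelming the warehouse with tiny writes; the right batch size depends on the volume and latency needs of each feature.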

We build microservices to provide insights through data science

We build and maintain several machine learning microservices. Our services receive raw data from all over ActiveCampaign and use ML to turn that data into useful insights.

For example, we have been working with another Product team to add sentiment-awareness to customer relationships and automated workflows. Engineers on that team can use our service to decide whether a communication from a user expresses positive or negative feelings, and that insight in turn empowers our customers to react appropriately.
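As a rough sketch of what consuming a sentiment signal can look like from the calling side, the snippet below turns a numeric score into a positive or negative label. Everything in it is an assumption made for illustration: `SentimentResult`, `label_sentiment`, and the threshold are hypothetical names, and `keyword_score` is a toy keyword counter standing in for the real model, which this example does not attempt to reproduce.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SentimentResult:
    label: str    # "positive" or "negative"
    score: float  # raw score returned by the scoring function


def label_sentiment(
    text: str,
    score_fn: Callable[[str], float],
    threshold: float = 0.0,
) -> SentimentResult:
    """Label text as positive when its score is at or above the threshold."""
    score = score_fn(text)
    label = "positive" if score >= threshold else "negative"
    return SentimentResult(label=label, score=score)


def keyword_score(text: str) -> float:
    """Toy stand-in for an ML model: counts a few hand-picked words."""
    positive = {"great", "love", "thanks"}
    negative = {"broken", "cancel", "angry"}
    words = text.lower().split()
    hits = sum(w in positive for w in words) - sum(w in negative for w in words)
    # Normalize by message length; guard against empty input.
    return hits / max(len(words), 1)


if __name__ == "__main__":
    print(label_sentiment("I love this thanks", keyword_score))
    print(label_sentiment("please cancel my broken account", keyword_score))
```

Passing the scoring function in as a parameter keeps the labeling logic separate from the model, so a caller could swap the toy scorer for a request to a real service without changing how the label is used downstream.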

MLDE has also been working closely with the Product team to make sure that our microservice provides exactly what our customers need to make informed decisions and can scale to meet the needs of our entire business.

We empower cross-functional, organization-wide projects

Right now, our data scientist is working with engineers from multiple product teams, along with stakeholders from at least nine departments, to build a machine learning service that generates insight into what makes our customers churn.

People from the entire organization work with the MLDE department to ensure that we gain an accurate and useful understanding of this key business metric and, more importantly, what we can do to improve it.

The most important skill in coordinating this project has been effective communication. We meet with everyone who has a stake in the project, as many times as it takes, to align on the mission.

Effective communication also means providing well-written documentation that is available to every stakeholder and giving regular updates, even when the update is that work is still in progress. This ongoing communication gives everyone certainty about the current status of the project and when deliverables will become available.

We collaborate with each other

Members of the MLDE team also spend a lot of time collaborating with each other. We believe in pair programming, and we meet many times every week to ensure that we keep working together even when we are physically separated.

We use Slack and Google Meet to their fullest potential to make sure that we stay connected. We come from different backgrounds and have different skill sets, and together we continue to make sure that ActiveCampaign has access to timely, high-quality data.

We innovate together

Data engineering does not work in isolation. Through our collaboration with other teams in the organization, MLDE is a crucial component of ActiveCampaign. Just like the drummer in a rock band, we might not be in the most glamorous role, but we help bind everything else together.