How It’s Made: The Story and Science Behind Predictive Content


ActiveCampaign just rolled out Predictive Content in beta for Professional and Enterprise tier users. Built using natural language processing, Predictive Content turns your batch-and-blast marketing emails into an engine that can further engage customers and inspire them to take action. You create up to five variants of email copy, and we send each customer the one they’re most likely to click on.

Let’s take a look at how we built a model that delivers the right content to the right contact.

Natural Language Processing

Natural Language Processing (NLP for short) is a branch of artificial intelligence that focuses on the interaction between computers and human language. You already use many examples of it on the internet. For instance, Google knows that ActiveCampaign’s headquarters is in Chicago even when all I’ve typed into the search bar is “ActiveCampaign”.
Before computers became really powerful, most NLP was done through a series of complex rules that humans wrote. Think of “i before e, except after c”. But now, computers can easily hold all of the words in Wikipedia and represent those words with numbers. How do they do this?

Let’s take a simple sentence, “The cat sat on the mat.” First, we break the sentence into tokens. An easy way to think of tokens is as the words and punctuation that make up a sentence. Then we create the vectors that represent these tokens. A vector is just a numerical representation of a token. Notice that “the” is listed only once, but it’s counted twice in the vector: once as the first word and once as the fifth word.

Each of these vectors is compressed until each word is represented by a long string of numbers. The computer understands that this list represents how many times a word is mentioned and the position of each word in a sentence.
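
To make the example concrete, here is a minimal sketch of tokenizing and counting in Python. It is only an illustration of the idea, not the production code behind Predictive Content; the function names `tokenize` and `count_and_positions` are made up for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation tokens."""
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", text.lower())


def count_and_positions(tokens):
    """Return the unique tokens, a count per unique token, and 1-based positions."""
    vocab = list(dict.fromkeys(tokens))  # unique tokens in first-seen order
    counts = Counter(tokens)
    positions = {
        tok: [i + 1 for i, t in enumerate(tokens) if t == tok] for tok in vocab
    }
    return vocab, [counts[tok] for tok in vocab], positions


tokens = tokenize("The cat sat on the mat.")
vocab, count_vector, positions = count_and_positions(tokens)

print(vocab)             # ['the', 'cat', 'sat', 'on', 'mat', '.']
print(count_vector)      # [2, 1, 1, 1, 1, 1]
print(positions["the"])  # [1, 5]
```

Here “the” appears once in the list of unique tokens, its count is 2, and its positions are the first and fifth words of the sentence.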

What else are we looking for?

So now that we’ve got these vectors that contain all of the language information about words, what else are we looking for? Since we know that some people like all of the details and others want you to get to the point, we also include measures of how long your sentences are, how long the email is overall, and the kind of punctuation you use. We also include measures of email complexity, including the Gunning fog index.
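
As a rough sketch of what these measures could look like, the Python below computes a few of them for a piece of email copy. The Gunning fog index is the standard formula, 0.4 × (average words per sentence + percentage of words with three or more syllables). Everything else is an assumption for illustration: the function names, the feature names, and the vowel-group syllable counter, which is only a crude approximation.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels (heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def split_text(text):
    """Return (sentences, words) using simple punctuation-based splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return sentences, words


def gunning_fog(text):
    """Gunning fog index: 0.4 * (words per sentence + percent complex words)."""
    sentences, words = split_text(text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))


def style_features(text):
    """Collect length, punctuation, and complexity measures for one email."""
    sentences, words = split_text(text)
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "exclamation_marks": text.count("!"),
        "question_marks": text.count("?"),
        "gunning_fog": gunning_fog(text),
    }


features = style_features("Big sale today! Save on everything. Why wait?")
print(features)
```

A short, punchy email like this one scores low on the fog index, while long sentences full of multi-syllable words push the score up.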

All of this information is collapsed into one vector for each email variant that you provide. When you write these variants, you’ll want to use different words in each one, vary the length of the variants and their sentences, or use different kinds of punctuation. That way, the model creates a distinct vector for each variant.

But how do you know what contacts want to see?

We run the same process on each email that your subscribers have clicked on. The resulting vector is compared to the vectors for the variants you provide, and we determine which variant is most like the emails that a contact has already clicked on. What about contacts that haven’t clicked on anything yet? Luckily, ActiveCampaign’s Predictive Content system is prepared for that. We’ll send these contacts a randomly chosen variant until they’ve clicked on at least one of your emails. This way, we’re maximizing the variety of content that your contacts will see and hopefully activating those contacts who haven’t responded before by showing them something less like what they’ve seen in the past.
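
The matching step can be sketched in a few lines of Python. This post doesn’t spell out which similarity measure the model uses, so the example assumes cosine similarity, and it assumes that a contact’s clicked emails are averaged into a single profile vector. The function names and the toy three-number vectors are hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def choose_variant(variant_vectors, clicked_vectors, rng=random):
    """Return the index of the variant to send to one contact."""
    if not clicked_vectors:
        # No click history yet: fall back to a random variant.
        return rng.randrange(len(variant_vectors))
    # Assumption: average the clicked-email vectors into one profile vector.
    dims = len(clicked_vectors[0])
    profile = [
        sum(v[i] for v in clicked_vectors) / len(clicked_vectors) for i in range(dims)
    ]
    scores = [cosine_similarity(profile, v) for v in variant_vectors]
    return max(range(len(scores)), key=scores.__getitem__)


variants = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
clicked = [[0.9, 0.1, 0.0]]

best = choose_variant(variants, clicked)    # contact with click history
fallback = choose_variant(variants, [])     # contact with no clicks yet
print(best, fallback)
```

In this toy example, the contact’s click history looks most like the first variant, so that is the one chosen. The contact with no clicks gets any of the three variants at random.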

All of this means that Predictive Content isn’t magic; it’s backed by decades of research, data, and cool math. We hope this post has answered some of your questions about how we built this feature and has made you excited to try it! We’re excited to have you use it, and to build the next iteration of this feature.
