Mark Kosoglow: Welcome back to Mythbuster Mondays. Mondays are now my favorite day of the week. We get to talk about some awesome sales myths that everybody talks about. And we're either gonna prove 'em, or we're gonna debunk 'em. My name is Mark Kosoglow, I'm the VP of Sales at Outreach.
Pavel Dmitriev: I'm Pavel Dmitriev, VP of Data Science.
Yifei Huang: I'm Yifei Huang, Machine Learning Engineer.
Mark Kosoglow: Today we're gonna talk about a common myth around including video in email, and whether it increases reply rates or not. So, how do these things happen? How do people start using video in emails all of a sudden? Well, I can imagine it, you got your sales team, one dude on Monday sends an email that has a video in it, he gets all these replies. He tells all of his friends about it. By Thursday, the whole entire sales team is putting videos in the emails, spending hours a day creating these videos, right? It may or may not work. We have no clue, but everybody's doing it. Should they be doing it? So that's what we're going to find out today.
Pavel Dmitriev: We designed an experiment, where we compare two email templates. Both were short follow-ups on the email which we previously sent to the prospect and did not get any reply. The first template had a video. And the second did not. So the hypothesis that we want to test is whether the template with the video results in a higher reply rate or not.
Mark Kosoglow: Interesting question.
Yifei Huang: Yeah, so the method we used to scientifically test the hypothesis is called A/B Testing. We used the Outreach sequence feature. And at a given step in the sequence, we randomly assign one of these two templates. And the keyword here is randomly. So, essentially you can think that the machine flips a coin in the background. And then according to the flip of the coin, it randomly picks one of the templates. And then, after we run the experiment for three weeks, we calculate the reply rate for each of the templates.
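The coin-flip assignment described here can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Outreach sequence feature is implemented; the function name `assign_templates`, the template labels, and the prospect IDs are all hypothetical.

```python
import random


def assign_templates(prospect_ids, templates=("video", "no_video"), seed=None):
    """Assign each prospect to one template by an independent fair coin flip.

    Hypothetical helper for illustration only. The seed is optional and
    exists so that an assignment can be reproduced later.
    """
    rng = random.Random(seed)
    # rng.choice picks each template with equal probability, so neither
    # the rep nor the prospect's profile influences the assignment.
    return {pid: rng.choice(templates) for pid in prospect_ids}


# Usage: made-up prospect IDs, fixed seed for reproducibility.
assignment = assign_templates(["p1", "p2", "p3", "p4"], seed=7)
for pid, template in assignment.items():
    print(pid, template)
```

Because the machine makes the choice, the two groups should end up comparable on average, which is the whole point of randomizing.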
Mark Kosoglow: So I'm a sales leader, and most of us are not scientists, right? We're not PhDs like you guys. And so, I think what a lot of sales leaders do is they say, "All right, I'm going to divide my team in half. And Team A, you use Template A. Team B, you use Template B." Is that a good strategy or not?
Yifei Huang: In many cases, this has some problems. In a good situation, maybe the two teams have relatively the same expertise. They target the same kind of prospects. And they all use only their designated templates. I think that in this case, the problem is, like, less severe. The problem can become really severe when these two teams are quite, they target quite different prospects. Or, like, in the worst case, you're saying, "Reps, please randomly choose which one of the templates to use." And in this case, we rarely see that the reps are really choosing randomly.
Mark Kosoglow: So sending out an email that says, "Hey guys, I want you to try to use one of these two templates. Pick A or B, and use them randomly." That's probably not gonna be good.
Yifei Huang: No, we see that, and it doesn't work.
Pavel Dmitriev: Yeah, the reason is that the reps also have heard the myths. And they will probably assign the prospects they believe are better to the template that they think is gonna work better, because they wanna meet their quota. They have different goals from us running the experiment. So we do not want to let them pick what to do.
Mark Kosoglow: So, reps are biased towards hearing the myth, thinking it's gonna generate better results. And maybe even subconsciously they start choosing the template that they think is gonna win and using it most, which ruins the experiment.
Pavel Dmitriev: Exactly, and that's why we want the machine to choose it and not the reps to do it manually.
Yifei Huang: So in this test, we observed that the template without a video link had a 20% reply rate, which is almost double the reply rate of the template with the video link. So, we performed a statistical test and found that the result is highly statistically significant. So, basically we have 99% confidence that this is not due to chance. It's helpful to remember that this is one particular video compared to another particular email template. But the result is very solid in this particular case of comparing that video link versus another email template.
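One common way to check whether a gap between two reply rates is larger than chance would explain is a two-proportion z-test. The sketch below is an assumption about the kind of test that could be used here; the episode does not say which test was run, and the send and reply counts are made up (only the 20% reply rate for the no-video template comes from the discussion). The function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(replies_a, sends_a, replies_b, sends_b):
    """Two-sided z-test for the difference between two reply rates.

    Returns (z, p_value). Uses the pooled-proportion standard error,
    which is the usual choice when the null hypothesis is "no difference".
    """
    rate_a = replies_a / sends_a
    rate_b = replies_b / sends_b
    pooled = (replies_a + replies_b) / (sends_a + sends_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sends_a + 1 / sends_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 500 sends per template, 20% vs 11% reply rate.
z, p = two_proportion_z_test(replies_a=100, sends_a=500,
                             replies_b=55, sends_b=500)
print(f"z = {z:.2f}, p = {p:.5f}")
print("significant at the 1% level" if p < 0.01 else "not significant at the 1% level")
```

With these invented counts the p-value falls below 0.01, which corresponds to the "99% confidence" phrasing; with only a handful of sends per template, the same reply rates would not reach that threshold.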
Mark Kosoglow: So I'm trying to figure out, like a larger learning, a metapoint of, should my team be using video or should they not? So, this experiment that we ran wouldn't necessarily answer that question. It only answers the question, is it better with this specific video against this other specific email template, right?
Yifei Huang: Yes, exactly, and I would say there's always a challenge of generalizing a piece of hard evidence. Because in theory, we can never prove that no one can make a really good video that beats another email template. However, if the team tried, made several attempts, and all the tests are showing that the video link cannot beat another, like, email template, then this is really showing that at least it's very hard in that scenario to make the video link really work.
Mark Kosoglow: So, one thing that strikes me when I talk to you guys sometimes is this experimentation and trying to figure out what's right is really hard. Like, as a sales guy, every time I talk to you, I'm like, "Man, there's so many things I'm doing wrong." Like, if you were to give, say, two or three tips to a sales person to make sure that, like, in an easy way this is how you can make sure that the test you're running is gonna give you results you can trust, what would be your top three tips, Pavel?
Pavel Dmitriev: The first tip is what Yifei has been saying. That we need the two groups to be equivalent, in all possible ways. We wanna eliminate any sort of human bias from selecting the prospects into those groups. The second tip is that we need a good amount of data. We can't just make a decision based on 10 emails sent in Template One, and 10 emails sent in Template Two. There is just too much of a possibility of chance. The third tip is to think in advance about the setup of the experiment. What is it that we are going to measure? And what is the size of the difference that we are planning to detect? And that will help to define how long we need to run this experiment and how much data we need to collect. And it will make the results more trustworthy.
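The second and third tips connect through a standard sample-size calculation: once you decide the baseline reply rate and the smallest difference you care about detecting, you can estimate how many emails each template needs. The sketch below uses the textbook normal-approximation formula for comparing two proportions; the function name `emails_needed_per_template`, the 5% significance level, the 80% power, and the example reply rates are all assumptions for illustration, not figures from the episode.

```python
import math
from statistics import NormalDist


def emails_needed_per_template(rate_a, rate_b, alpha=0.05, power=0.80):
    """Approximate sends per template to detect rate_a vs rate_b.

    Normal-approximation formula for a two-sided test of two proportions:
        n = (z_{1-alpha/2} + z_{power})^2 * (pa(1-pa) + pb(1-pb)) / (pa - pb)^2
    """
    if rate_a == rate_b:
        raise ValueError("the two rates must differ to compute a sample size")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = rate_a * (1 - rate_a) + rate_b * (1 - rate_b)
    n = (z_alpha + z_power) ** 2 * variance / (rate_a - rate_b) ** 2
    return math.ceil(n)


# Hypothetical planning scenarios: a big gap versus a small gap.
print(emails_needed_per_template(0.10, 0.20))  # large difference to detect
print(emails_needed_per_template(0.18, 0.20))  # small difference to detect
```

The smaller the difference you want to detect, the more emails you need, which is why 10 emails per template cannot settle the question and why the target difference should be chosen before the experiment starts.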
Yifei Huang: As an organization, it's good to have a formal procedure of, like, encouraging people to propose ideas. And when they propose an idea, there's a very short and brief documentation. What's Option A versus Option B? And what is the proposer's hypothesis on which one is better? And what is the reasoning behind that hypothesis? So that way, we have a very methodical way of documenting those creative ideas. And also, I think, what's important is to have the psychological safety of being wrong, because that's the reality.
Mark Kosoglow: So if you were to be tasked by a sales leader to say, "Listen, I need to know if video works, or it doesn't work." Like, how would you kind of set that up so that you can answer that question with a high level of confidence?
Pavel Dmitriev: Maybe a better way to phrase that question is to identify when video works and when it does not work. Then we could go through a set of scenarios, run an experiment on each one of them, and most likely, we'll find that sometimes video works, perhaps most of the time it doesn't, but sometimes it does.
Mark Kosoglow: So you're saying let's take a look at our five most common scenarios and test those, rather than taking one test and kind of running out a huge generalization based on that.
Pavel Dmitriev: Exactly.
Mark Kosoglow: It's kind of common sense, but I would tell you that in my experience the fact that that's happening, it doesn't exist. Like most sales leaders aren't taking that much thought to figure out, "All right, here's my prospecting scenario. Here's my customer renewal scenario. Here's my, you know, pipeline management scenario," and kind of running it across those different areas to see what's working. But otherwise, like, generalization just is very difficult.
Pavel Dmitriev: Yeah, well I think sales leaders should start paying attention to these techniques. In many industries, like software engineering, this is pretty much a gold standard for evaluating any new feature shipped to the product. Companies like Microsoft, Google, Facebook, Netflix, test every single idea. I think we should start doing it in sales.
Mark Kosoglow: Yeah! It's a daunting thing, but you know what? Outreach is here, we're building the feature set. We've got guys like Yifei and Pavel here, whose job it is to bring this difficult concept of science into the realm of sales. So what did we learn in this episode? Not much about video and emails. But I think what we did learn was a powerful lesson on how sales leaders need to be much more methodical in how they're trying to figure out what's working and what's not working. So, in the next episode, we're gonna discuss whether it's better to be forceful or more socially minded in email. So, guys, we're going to see these people...
All: Next Monday!