Why recent attempts to model Facebook’s News Feed algorithms are bogus
Some opportunists have seized upon market frustration to peddle their own snake oil, whether by claiming they “cracked the code” to the News Feed algorithm or by producing a formula that can uncannily predict how much reach you’ll get. First, we’ll dismantle these arguments…
A simple linear regression follows the structure of y = mx + b, where:
y is your dependent variable: the output of the model, or the thing you’re trying to predict.
x is your independent variable: the input you control or observe.
m is the slope of the line, which is a constant in this case.
b is the y-intercept, which is the output you get if you put in 0 for x.
You could have x be the number of ad dollars you spend on Facebook and y be paid impressions.
You’d expect a nearly linear relationship, assuming the placement mix stays the same.
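To make the ad dollars vs. paid impressions example concrete, here is a minimal sketch of fitting y = mx + b by ordinary least squares. The spend and impression figures are invented for illustration, and `fit_line` is a hypothetical helper, not anything Facebook provides.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Made-up numbers for illustration only: daily ad spend in dollars
# and the paid impressions delivered that day.
spend = [100, 200, 300, 400, 500]
impressions = [10500, 19800, 30900, 39600, 50200]

m, b = fit_line(spend, impressions)
predicted = m * 600 + b  # extrapolated impressions for a $600 day
print(f"slope={m:.1f} intercept={b:.1f} predicted={predicted:.0f}")
```

The slope reads as impressions per extra dollar. The prediction for $600 is only as good as the assumption that the line keeps holding outside the range you measured.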
Or perhaps you choose ad dollars vs. paid reach.
This is not quite as linear: as spend rises, more impressions go to people you have already reached (higher frequency), so each extra dollar adds fewer new people.
And you could try to fit paid reach against rolling organic impressions in the following 7 days.
Less correlation, but still partially predictive if paid reach is a significant proportion of total reach.
Then you can add in more variables besides x:
- posting frequency
- number of images in a post
- the number of emails sent (doesn’t have to be a Facebook variable)
- whether you’re having a sale
- if a competitor went out of business
- the temperature outside
- or anything you can quantify
We get a bit more realistic with predicting Facebook performance by allowing more variables and using multiple linear regression (a fancy way of saying that several inputs, each with its own coefficient, jointly predict one output; the fit is still straight in each input, and it only curves if you add non-linear terms such as squares).
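A regression with several inputs can be sketched as follows. This is a rough illustration in plain Python, assuming three inputs (ad spend, posts per week, emails sent in thousands). The data is invented and was generated from a known formula with no noise, so the recovered coefficients can be checked. `fit_multiple` is a hypothetical helper.

```python
def fit_multiple(rows, ys):
    """Least squares fit of y = b0 + b1*x1 + ... + bk*xk.

    rows: list of [x1, ..., xk] observations; ys: list of outcomes.
    Returns [b0, b1, ..., bk], found by solving the normal equations.
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]  # leading 1 = intercept
    k = len(X[0])
    # Build the augmented system (X^T X | X^T y).
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(X, ys))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("inputs are collinear; coefficients are not unique")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented data: [ad spend ($), posts per week, emails sent (thousands)].
rows = [[100, 3, 5], [200, 5, 2], [300, 4, 8],
        [400, 7, 1], [500, 6, 9], [600, 9, 4]]
# Outcome generated as 1000 + 20*spend + 300*posts + 50*emails (no noise).
sales = [4150, 6600, 8600, 11150, 13250, 15900]

coeffs = fit_multiple(rows, sales)
print([round(c, 3) for c in coeffs])
```

Real marketing data is noisy and the inputs move together, so the coefficients will rarely come out this cleanly. That is the point of the caveats that follow.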
A model that uses only Facebook data to predict Facebook data is weak, and often self-referential. What happens if a company runs a big TV ad or pumps up their Google AdWords? Certainly, there are spill-over effects into Facebook.
You still have the issue of factors whose effect turns negative, coefficients that are not constant, and variables that are not independent of one another.
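One quick way to see the independence problem is to check how strongly two of your inputs move together. The sketch below computes a Pearson correlation between two inputs. The numbers are invented, and `pearson` is a hypothetical helper written out by hand.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented weekly figures: the marketing team raises ad spend and
# email volume at the same time, so the two inputs rise together.
ad_spend = [100, 200, 300, 400, 500]
emails_sent = [1000, 2100, 2900, 4200, 5000]

r = pearson(ad_spend, emails_sent)
print(f"correlation between inputs: {r:.3f}")
```

When two inputs are this tightly linked, a regression cannot reliably say which one deserves the credit, and the individual coefficients become unstable.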
IN PLAIN ENGLISH
- Each variable behaves differently at different volume levels. Let’s say you know on average you get 50 cents of revenue for every email you send. That may hold true up until a certain point of diminishing returns. And then it goes negative, meaning that the more email you send, the LESS you get in conversions. At that point, customers who would have bought are turned off by too much marketing.
- There is no metric for how interesting or on-brand your content is. You can use engagement rate, but it’s easily gamed with memes, cat photos, and contests.
- The impact of ads, when done properly, is long-term. Growing a high-quality fan base with a paid assist builds a larger audience that monetizes over time. So the time frames of when the advertising occurs and when you receive benefit do not align.
- The most important variables can’t be accurately measured. All else equal, more fans is good. But we have to always balance variables in pairs, so that we aren’t accidentally cheating ourselves. There is always a quality vs quantity trade-off.
- Correlation is not causation. A growing social base might correlate with an increase in sales, but it’s hard to say which one causes the other. In fact, if we believe the true power of social is existing customers spreading the word for us, then the reverse funnel is most definitely in effect: sales grow the social base, not the other way around.
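The diminishing-returns point about email can be sketched with a toy curve. The 50 cents per email comes from the example above. The quadratic shape and the `SATURATION` constant are assumptions chosen only to show a coefficient that shrinks and then turns negative. They are not measured values.

```python
PER_EMAIL = 0.50     # dollars per email at low volume (figure from the text)
SATURATION = 2.5e-6  # hypothetical fatigue penalty; not a measured value


def revenue(emails):
    """Hypothetical concave revenue curve: linear gain minus a fatigue penalty."""
    return PER_EMAIL * emails - SATURATION * emails ** 2


def marginal_revenue(emails):
    """Revenue added by one more email at this volume (derivative of revenue)."""
    return PER_EMAIL - 2 * SATURATION * emails


# Volume at which one more email stops adding revenue.
peak = PER_EMAIL / (2 * SATURATION)

for n in (10_000, 50_000, 100_000, 150_000):
    linear_guess = PER_EMAIL * n  # what a constant 50 cents per email predicts
    print(n, round(linear_guess), round(revenue(n)), round(marginal_revenue(n), 2))
```

A straight-line model keeps promising 50 cents per email forever. Under this assumed curve, the gap between that promise and the actual revenue widens with volume, and past the peak every additional send loses money.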