This post looks at the impact of exponential growth on projected viewership.  I take Liziqi, a Chinese food and lifestyle blogger with over 1 billion views and over 10 million subscribers, as my example of an internet influencer with exponential growth.  The point of this analysis is to show that projecting growth using a linear model when the growth is exponential will mis-predict growth.

Chart 1, below, shows that the growth of Liziqi’s viewership between November 2017 and May 2020 is more appropriately modeled as exponential than as linear.  The dotted line on the graph shows the predicted views from a linear regression of views on calendar month; it fits the exponential growth poorly.

To transform the exponential curve into a form that linear regression can handle, I take the natural logarithm of views, which turns the exponential trajectory into a log-linear one.  Chart 2, below, shows the log-linear form, and the dotted line shows the predicted values from a regression of log views on calendar month.  Even visually, the dotted line fits the log-linear curve better.
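To make the contrast concrete, here is a minimal sketch that fits both models to made-up data.  The series is synthetic (1,000,000 starting views growing 15% per month, both numbers chosen only for illustration), and the helper ols_line is my own name, not part of any library.  These are not Liziqi’s figures.

```python
import math

def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic data (an assumption, not real viewership):
# 1,000,000 views at month 0, growing 15% per month for 31 months.
months = list(range(31))
views = [1_000_000 * 1.15 ** m for m in months]

# Linear model fitted directly to views.
a_lin, b_lin = ols_line(months, views)

# Log-linear model: regress ln(views) on month, then exponentiate.
a_log, b_log = ols_line(months, [math.log(v) for v in views])

linear_at_start = a_lin + b_lin * months[0]
loglinear_at_start = math.exp(a_log + b_log * months[0])
monthly_growth = math.exp(b_log) - 1  # slope of log views -> growth rate

print(f"linear prediction at month 0:     {linear_at_start:,.0f}")
print(f"log-linear prediction at month 0: {loglinear_at_start:,.0f}")
print(f"implied monthly growth rate:      {monthly_growth:.1%}")
```

On this synthetic series the straight line predicts negative views at the start of the period, while the log-linear fit recovers both the starting level and the monthly growth rate.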

Still, we can do better.  The remaining curvature in the log-views line suggests that log views follow a quadratic trend in calendar month.  Chart 3, below, shows that the predicted values from a regression of log views on month and month-squared almost completely overlap the log views trajectory.
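A regression of log views on month and month-squared can be sketched in plain Python by solving the normal equations.  The data here are synthetic and the coefficients (14.0, 0.30, -0.004) are assumptions picked for illustration; fit_quadratic and solve_linear_system are hypothetical helper names.  In practice a statistics package would do this step.

```python
import math

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_quadratic(xs, ys):
    """Least-squares (b0, b1, b2) for y = b0 + b1*x + b2*x**2."""
    powers = [sum(x ** k for x in xs) for k in range(5)]
    normal = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(3)]
    return solve_linear_system(normal, rhs)

# Synthetic log views (an assumption): growth that slows over time.
months = list(range(31))
log_views = [14.0 + 0.30 * m - 0.004 * m ** 2 for m in months]

b0, b1, b2 = fit_quadratic(months, log_views)

# Exponentiate the fitted log values to get predictions in views.
predicted_views = [math.exp(b0 + b1 * m + b2 * m ** 2) for m in months]
print(f"b0={b0:.3f}, b1={b1:.3f}, b2={b2:.4f}")
```

A negative coefficient on month-squared, as in this made-up series, means the growth rate itself declines over time, which is the kind of curvature a straight line through log views cannot capture.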

Finally, I compare the predicted views from the ordinary least squares regression on untransformed views with those from the log-linear quadratic regression, and find that the log-linear quadratic model predicts somewhat smaller growth.  The ordinary least squares regression predicts that, between May 2018 and May 2020, Liziqi’s views grew by 6,194%.  The log-linear quadratic regression predicts that, over the same period, Liziqi’s views grew by 5,373%.
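The two percentages can be reproduced from the predicted values given in the notes at the end of this post.  The sketch below only redoes that arithmetic; the function names are my own.  The one step worth noting is that a difference in predicted log views has to be exponentiated before it can be read as a growth ratio.

```python
import math

def pct_growth_from_levels(start, end):
    """Percentage growth between two predicted view counts."""
    return (end - start) / start * 100

def pct_growth_from_logs(log_start, log_end):
    """Percentage growth between two predicted log view counts."""
    return (math.exp(log_end - log_start) - 1) * 100

# Predicted values for May 2018 and May 2020, as given in the notes.
ols_growth = pct_growth_from_levels(18_966_350, 1_193_682_000)
quad_growth = pct_growth_from_logs(16.93662, 20.93901)

print(f"ordinary least squares: {ols_growth:,.0f}%")   # 6,194%
print(f"log-linear quadratic:   {quad_growth:,.0f}%")  # 5,373%
```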

Exponential growth in views is not a one-off phenomenon; many “viral” videos grow exponentially.  For example, the video Gangnam Style reached hundreds of thousands of views on the first day, millions on the twentieth day, and hundreds of millions in two months.  Projecting views or earnings when viewership grows exponentially calls for extra care.

Appendix A.  Linearity Assumption

To show that the log-linear quadratic model satisfies the linearity assumption better than ordinary least squares on untransformed views, I look at the actual versus fitted plot, in Chart 4 below, and find that the relationship between the transformed model’s predictions and the actual values is very strong.  Furthermore, in Chart 5, I look at the residual versus fitted plot and find that the residuals are approximately randomly distributed around zero.

For the basic ordinary least squares scenario without transformations, the actual versus fitted plot looks like Chart 6, which suggests a straight line is a poor fit for the data.  The residual versus fitted plot looks like Chart 7.  There is a clear pattern to the residuals, which suggests we have the wrong functional form.
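The residual pattern is easy to reproduce on made-up data.  The sketch below uses a noise-free synthetic exponential series (an assumption chosen so the pattern is unambiguous, not real viewership) and a hypothetical helper, residuals_of_line_fit.  Real data would add noise on top of the pattern.

```python
import math

def residuals_of_line_fit(xs, ys):
    """Fit y = a + b*x by least squares and return actual minus fitted."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Synthetic, noise-free exponential series (an assumption).
months = list(range(31))
views = [1_000_000 * 1.15 ** m for m in months]

res_levels = residuals_of_line_fit(months, views)
res_logs = residuals_of_line_fit(months, [math.log(v) for v in views])

# Untransformed fit: residuals are positive at both ends and negative
# in the middle, the U shape that signals the wrong functional form.
print([round(res_levels[i]) for i in (0, 15, 30)])

# Log fit: residuals are zero up to floating-point error.
print(max(abs(r) for r in res_logs))
```

A U-shaped run of residual signs like this is what a residual versus fitted plot shows when a straight line is forced through convex data.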

 ((1,193,682,000-18,966,350)/(18,966,350))*100 = 6,194%

 (e^(20.93901-16.93662)-1)*100 = 5,373%

 I start with May 2018 and not November 2017 because ordinary least squares predicts negative views before May 2018, and percentage growth from a negative baseline is not meaningful.