Excel Data Mining in Action: Forecasting Twitter Followers for Next Week
OK, so you know I recently installed the Data Mining Excel add-in (see: How to enable Data Mining in EXCEL powered by SQL Server Analysis Services?), and I couldn’t wait to go beyond the samples provided with it. So I decided to start with forecasting. For this blog post, I downloaded my Twitter stats into Excel. Of course, I had to clean the data and add a few computations, which was equally exciting, and I ended up with a data set that had my follower count and the number of tweets I had posted.
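The cleanup itself isn't covered in this post, but as a minimal sketch of the kind of computation involved: assume (hypothetically) that the raw export lists a net follower change and the number of tweets posted per day, and that the starting totals are known. The column layout, the `build_dataset` helper, and all the numbers below are made up for illustration.

```python
from datetime import date

# Hypothetical raw export: (day, net follower change, tweets posted that day).
# These values are invented for illustration only.
raw_rows = [
    (date(2012, 7, 23), 2, 5),
    (date(2012, 7, 24), -1, 3),
    (date(2012, 7, 25), 0, 0),
    (date(2012, 7, 26), 3, 7),
]


def build_dataset(rows, start_followers, start_tweets):
    """Turn per-day changes into running totals, one row per day."""
    followers, tweets = start_followers, start_tweets
    dataset = []
    for day, follower_change, tweets_posted in sorted(rows):  # chronological order
        followers += follower_change
        tweets += tweets_posted
        dataset.append({
            "Date": day,
            "Total Follower Count": followers,
            "Total Tweet Count": tweets,
        })
    return dataset


# Starting totals are assumptions, not real stats.
dataset = build_dataset(raw_rows, start_followers=400, start_tweets=1000)
for row in dataset:
    print(row["Date"].isoformat(), row["Total Follower Count"], row["Total Tweet Count"])
```

The result has one row per day with a date column and two running totals, which is the shape the Forecast wizard expects: one time stamp column plus one or more numeric columns.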
The date range in the data set is 23 July 2012 – 5 Sep 2012. Of course, to get a “better” forecast you need to feed in more historical data. In my case, the Twitter API didn’t allow me to pull ALL historical data in one go; let’s not get into the details because that’s not the focus of this blog post. But the rule of thumb is that more historical data generally gives a better forecast. Here are the steps I followed:
1. Loaded the data into Excel 2010. (I am using Twitter as an example here; another real-world scenario would be a sales forecast.) Note that I have kept it simple for the purpose of the demo.
2. Now, let’s create a forecast model.
Go to Data Mining Tab > Data Modeling > Forecast:

3) Forecast Wizard:
a. Getting Started with Forecast Wizard: NEXT
b. Select Source Data, then press NEXT
c. Select input columns. In this case, I selected Date as the Time Stamp, and Total Follower Count and Total Tweet Count as input columns.
– Notice the Parameters button? It is used to configure how the (Time Series) algorithm runs. For the purpose of this demo, I am not going to explore that.
d. Finish.
4) It forecasted (using the Time Series data mining algorithm) the follower count for the next week. The result says that on 12th Sep 2012 I would have 438 followers, which is +3 compared to today’s (5th Sep) follower count.

5) A few notes
a. I selected Total Tweet Count just to show that it can forecast more than one variable at the same time. The model used the Date column as the time stamp while forecasting.
b. Of course, this may not happen for REAL because your follower count can go up or down based on:
– Tweet frequency (quality tweets!)
– Number of bots that decide to follow you (kidding!)
– Retweeting interesting content and replying to your followers. Basically, being social!
– Whether a tweet gets picked up by someone famous, in which case your count increases
– Other real-life “surprises”
Here’s the point though: this was just a toy example to show “forecasting” with Excel Data Mining. If I explore it further, I will document my experiences!
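To get a feel for what "forecasting the next week" means outside Excel, here is a rough sketch in Python. It is NOT the Microsoft Time Series algorithm that the add-in runs; it is a much simpler stand-in that fits a straight-line trend by least squares and extends it seven days. The function names and the sample follower totals are assumptions for illustration (only the final value, 435, comes from the numbers above).

```python
from datetime import date, timedelta


def fit_linear_trend(values):
    """Least-squares fit of values[i] ~ intercept + slope * i (needs >= 2 points)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(values, last_day, horizon=7):
    """Extend the fitted trend `horizon` days past `last_day`."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [
        (last_day + timedelta(days=step), round(intercept + slope * (n - 1 + step)))
        for step in range(1, horizon + 1)
    ]


# Made-up daily follower totals for the 8 days ending 5 Sep 2012.
followers = [428, 429, 429, 431, 432, 432, 434, 435]
for day, predicted in forecast(followers, last_day=date(2012, 9, 5)):
    print(day.isoformat(), predicted)
```

To forecast a second column such as the total tweet count, the same function can simply be called again on that column. A straight line ignores seasonality and sudden jumps, so treat this as a toy in the same spirit as the Excel demo.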
And oh, BTW, here’s a nice video by @MarkTabNet and @SolidQ (SolidQ: I work at this amazing company!) on “Microsoft Data Mining Demo – Forecasting (SQL Server 2008 and Excel 2007)”. MarkTabNet is a great resource for data miners. Check it out!
