In January, Twopcharts estimated that Twitter would break through the 500 million user mark some time on the 25th of February. But now it looks like it will happen 3 days sooner, this Wednesday the 22nd.
We’ve created a graph that plots Twitter’s exact number of registered users over time, dynamically. In other words, you can follow it as it happens.
How did we arrive at this graph?
Each tweet on Twitter has a rather big collection of user data embedded in it, from your bio to your profile background and much more. That is why Tweets are so much bigger than SMS messages in data terms. But for our purposes two of these data fields are of particular interest: the time a user joined Twitter, and a numerical user identifier.
We noticed that user ids always increase over time, and wanted to check whether they are assigned one after the other, in a linear way. If so, the total number of users at the moment someone joins Twitter would be a function of the user id that person is assigned. In its simplest form, the first user got id 1, the second user id 2, and so forth.
We captured hundreds of thousands of Tweets as they flew by in real time. Each Tweet carried its author's user information, so we plotted the user ids on one axis and the time each user joined on the other. We then compared these with a few “known points” – points where Twitter actually released their stats.
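To make the capture step concrete, here is a minimal Python sketch of how the two fields could be pulled out of a batch of Tweets. It assumes each Tweet is already parsed into a dictionary shaped like the classic Twitter API JSON, with the joining time under user.created_at and the numerical identifier under user.id; the function names are made up for illustration and this is not necessarily how the graph itself does it.

```python
from datetime import datetime

# Assumption: timestamps look like "Mon Nov 29 21:18:15 +0000 2010",
# the format used by the classic Twitter API.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def extract_point(tweet):
    """Return (joined_at, user_id) for the author of one Tweet."""
    user = tweet["user"]
    joined_at = datetime.strptime(user["created_at"], TWITTER_TIME_FORMAT)
    return joined_at, int(user["id"])


def collect_points(tweets):
    """Build a time-sorted list of (joined_at, user_id), one per user."""
    seen = {}
    for tweet in tweets:
        joined_at, user_id = extract_point(tweet)
        seen[user_id] = joined_at  # the same user may tweet many times
    return sorted((joined_at, user_id) for user_id, joined_at in seen.items())


def ids_increase_with_time(points):
    """Check that a later joining time never comes with a smaller id."""
    ids = [user_id for _, user_id in points]
    return all(a < b for a, b in zip(ids, ids[1:]))
```

Feeding the result of collect_points to ids_increase_with_time is a quick way to test the first observation, that ids only go up as time goes on.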
Not only did this check out against all the known points, it also yielded the coefficient by which we had to multiply the user id to get the total number of registered users at that point. This happened to be one, by the way: in other words, the user id numbers are dished out in straightforward sequence.
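The coefficient check can be sketched as a least-squares fit of a line through the origin. This is an illustration under assumptions, not the exact calculation behind the graph: it supposes each known point has already been paired with the user id that was being handed out at that moment, and the function names are hypothetical.

```python
def fit_coefficient(known_points):
    """Least-squares slope k for the model: total_users = k * user_id.

    known_points is a list of (user_id, reported_total_users) pairs.
    """
    numerator = sum(user_id * total for user_id, total in known_points)
    denominator = sum(user_id * user_id for user_id, _ in known_points)
    if denominator == 0:
        raise ValueError("need at least one known point with a non-zero id")
    return numerator / denominator


def estimate_total_users(latest_user_id, coefficient=1.0):
    """Estimate registered users from the newest user id seen so far."""
    return round(coefficient * latest_user_id)
```

With a coefficient of one, the estimate is simply the highest user id observed so far.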
All that remained was to make the graph dynamic (it collects more data as we speak, so it will always stay updated), and to write a function that continuously weeds out the extra data we don’t need (to stop the graph from becoming slower and slower over time).
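One simple way to weed out extra data is to thin the list of points down to a fixed maximum, keeping the first and last point and an even spread in between. The sketch below is only one possible approach, with a made-up function name and an arbitrary default limit; the post does not say which method the real graph uses.

```python
def thin_points(points, max_points=1000):
    """Return at most max_points items, evenly spread over the input.

    The first and last points are always kept, so the graph still spans
    the full time range after thinning.
    """
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    if len(points) <= max_points:
        return list(points)
    # Fractional step between kept indices; always greater than 1 here,
    # so no index is picked twice.
    step = (len(points) - 1) / (max_points - 1)
    return [points[round(i * step)] for i in range(max_points)]
```

Because the line is so close to straight, dropping intermediate points changes the look of the graph very little while keeping it quick to draw.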
Posted by Adriaan Pelzer