In January, Twopcharts estimated that Twitter would break through the 500 million user mark some time on the 25th of February. It now looks like that will happen three days sooner, this Wednesday the 22nd.
We’ve created a graph that plots Twitter’s exact number of registered users over time, dynamically. In other words, you can follow it as it happens.
Two moments in the growth of Twitter stand out. Around March 2007 it had its first major spurt in registrations when, to great excitement, it won the SXSW Web Award. Then, two years later, in March 2009, it had what is often called its hockey stick moment and accelerated dramatically for a second time.
Since then the registration rate has slowly but surely accelerated even further. There have been no more major surges in registration; even the Twitter integration in Apple’s millions of iOS devices has not created a clear new inflection point. But the steadily steepening upward trend traces a glorious bending arch.
Note also that this graph includes accounts that have since been deleted. But it is impressive nonetheless.
How did we arrive at this graph?
Each tweet on Twitter has a rather big collection of user data embedded in it, from your bio to your profile background and much more, which is why tweets are so much bigger, in data terms, than SMS messages. But for our purposes two of these data fields are of particular interest: the time a user joined Twitter, and a numerical user identifier.
We noticed that user ids always increased as join time increased, and wanted to check whether they are assigned one after the other, in a linear way. If so, the total number of users at the moment a user joins Twitter is a function of the user id that user is assigned. In its simplest form, the first user got id 1, the second user id 2, and so forth.
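To make the two fields concrete, here is a minimal Python sketch of pulling them out of a single tweet. It assumes the tweet arrives as JSON with a nested `user` object holding `id` and `created_at`, and that the timestamp follows the layout used in classic Twitter API payloads; the function name and the sample payload are hypothetical, not our production code.

```python
import json
from datetime import datetime

# Assumed timestamp layout, e.g. "Wed Mar 14 09:30:00 +0000 2007".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def extract_join_point(raw_tweet):
    """Return (user_id, joined_at) for the author of one tweet.

    raw_tweet is the tweet as a JSON string; the field names are assumptions.
    """
    tweet = json.loads(raw_tweet)
    user = tweet["user"]
    joined_at = datetime.strptime(user["created_at"], TWITTER_TIME_FORMAT)
    return user["id"], joined_at


# Hypothetical sample payload, for illustration only.
sample = json.dumps({
    "text": "hello world",
    "user": {"id": 12345, "created_at": "Wed Mar 14 09:30:00 +0000 2007"},
})

user_id, joined_at = extract_join_point(sample)
print(user_id, joined_at.isoformat())
```

Each tweet therefore contributes one (user id, join time) pair, which is all the graph needs.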
We captured hundreds of thousands of tweets as they flew by in real time. With the tweets came the users’ information, and we plotted the user ids on one axis and the time each user joined on the other. We then compared these with a few “known points”: points where Twitter actually released its stats.
Not only did this cross-check hold across all the known points, it also yielded the coefficient by which we had to multiply the user id to get the total number of registered users at that point (this happened to be one, by the way; in other words, the user id numbers are dished out in straightforward sequence).
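One way to picture the cross-check is as a least-squares fit of a straight line through the origin, total = k × user id. The Python sketch below illustrates that idea and is not necessarily how we computed it; the function names are hypothetical and the "known points" are placeholder numbers, not Twitter's published figures.

```python
def fit_coefficient(known_points):
    """Least-squares slope k through the origin for total = k * user_id.

    known_points is a list of (user_id, published_total) pairs.
    """
    numerator = sum(uid * total for uid, total in known_points)
    denominator = sum(uid * uid for uid, _ in known_points)
    return numerator / denominator


def estimate_total_users(user_id, k):
    """Estimate how many accounts existed when user_id was handed out."""
    return k * user_id


# Placeholder values for illustration only, NOT real Twitter statistics.
known_points = [
    (1_000_000, 1_000_000),
    (50_000_000, 50_000_000),
    (200_000_000, 200_000_000),
]

k = fit_coefficient(known_points)
print(k)
print(estimate_total_users(300_000_000, k))
```

A coefficient of one means the newest user id seen in the stream can be read directly as the running total of registered accounts.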
All that remained was to make the graph dynamic (it collects more data as we speak, so it will always stay up to date), and to write a function that continuously weeds out the extra data we don’t need (to stop the graph from becoming slower and slower over time).
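The weeding-out step could look something like the Python sketch below, which thins a time-ordered list of points down to a fixed budget while keeping the first and last point. This is one assumed approach (even downsampling), with a hypothetical function name and budget; our actual function may differ.

```python
def prune(points, max_points):
    """Thin a time-ordered list to at most max_points evenly spaced entries.

    The first and last points are always kept so the curve's ends stay put.
    """
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    if len(points) <= max_points:
        return list(points)
    last = len(points) - 1
    # Integer arithmetic keeps indices exact and strictly increasing.
    return [points[i * last // (max_points - 1)] for i in range(max_points)]


# Illustrative data: 100 consecutive sample points, pruned to a budget of 10.
samples = list(range(100))
pruned = prune(samples, 10)
print(pruned)
```

Running this after each batch of new tweets keeps the number of plotted points bounded, so the graph does not slow down as data accumulates.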