Here’s how this works: YouTube URLs look like this:
https://www.youtube.com/watch?v=vXPJVwwEmiM
That bit after “watch?v=” is an 11-character string. The first ten characters can be a-z, A-Z, 0-9, _ and -, which is 64 options each. The last character is special, and can only be one of 16 values. That works out to 64^10 × 16 = 2^64 possible YouTube addresses, an enormous number: 18.4 quintillion. There are lots of YouTube videos, but not that many. Let’s guess for a moment that there are 1 billion YouTube videos: if you picked URLs at random, you’d only get a valid address roughly once every 18.4 billion tries.
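The arithmetic above can be checked in a few lines of Python. This is only a sketch: the names `ALPHABET`, `LAST_CHAR_CHOICES` and `random_prefix` are made up for illustration, and the 1 billion figure is the same guess used in the text, not a measurement.

```python
import secrets

# The 64 symbols allowed in the first ten characters of a video ID.
ALPHABET = (
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789_-"
)
# The last character has only 16 possible values (per the text);
# which 16 characters those are is not stated here, so we only count them.
LAST_CHAR_CHOICES = 16

id_space = len(ALPHABET) ** 10 * LAST_CHAR_CHOICES  # 64^10 * 16 = 2^64
assumed_videos = 1_000_000_000  # the "let's guess" figure, an assumption
tries_per_hit = id_space / assumed_videos


def random_prefix() -> str:
    """Return ten uniformly random characters from the 64-symbol alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(10))


print(id_space)                 # 18446744073709551616
print(f"{tries_per_hit:.3e}")   # 1.845e+10
```

`random_prefix()` deliberately stops at ten characters, since the set of valid final characters is not spelled out above.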
We refer to this method as “drunk dialing”, as it’s basically as sophisticated as taking swigs from a bottle of bourbon and mashing digits on a telephone, hoping to find a human being to speak to. Jason found a couple of cheats that make the method roughly 32,000 times as efficient, meaning our “phone call” connects far more often. Kevin Zheng wrote a whole bunch of scripts to do the dialing, and over the course of several months, we collected more than 10,000 truly random YouTube videos.
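To make the idea concrete, here is a toy simulation of drunk dialing on a tiny, made-up ID space. It is not the scripts described above and does not include the efficiency cheats. It only illustrates the general principle that the hit rate of uniformly random guesses, scaled up by the size of the ID space, gives an estimate of how many valid IDs exist. The function names and all the numbers are assumptions chosen for illustration.

```python
import random


def drunk_dial(space_size: int, valid_ids: set[int], attempts: int,
               rng: random.Random) -> int:
    """Count how many uniformly random guesses land on a valid ID."""
    hits = 0
    for _ in range(attempts):
        if rng.randrange(space_size) in valid_ids:
            hits += 1
    return hits


def estimate_population(hits: int, attempts: int, space_size: int) -> float:
    """Scale the observed hit rate up to the whole ID space."""
    return hits / attempts * space_size


# Toy numbers, invented for this example: 1,000 valid IDs hidden in a
# space of 1,000,000 (the real space is 2^64, far too big to simulate).
rng = random.Random(0)
SPACE = 1_000_000
valid = set(rng.sample(range(SPACE), 1_000))

ATTEMPTS = 200_000
hits = drunk_dial(SPACE, valid, ATTEMPTS, rng)
estimate = estimate_population(hits, ATTEMPTS, SPACE)
print(hits, round(estimate))  # estimate should land near the true 1,000
```

The estimate gets noisier as hits get rarer, which is why a speedup that makes the call connect more often matters so much in practice.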
Once you’re collecting these random videos, other statistics are easy to calculate. We can look at how old our random videos are and calculate how fast YouTube is growing: we estimate that over 4 billion videos were posted to YouTube just in 2023. We can calculate the mean and median views per video, and show just how long the “long tail” is: videos with 10,000 or more views are roughly 4% of our data set, though they represent the lion’s share of views on the YouTube platform.
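As a rough sketch of the kind of calculation described above, the snippet below computes the mean, the median, and the share of videos and views at or above a view threshold. The `view_summary` helper is hypothetical, and the view counts are invented for illustration. They are not data from our sample.

```python
from statistics import mean, median


def view_summary(views: list[int], threshold: int = 10_000) -> dict[str, float]:
    """Summarize view counts and the weight of videos at or above threshold."""
    popular = [v for v in views if v >= threshold]
    total = sum(views)
    return {
        "mean": mean(views),
        "median": median(views),
        "share_of_videos": len(popular) / len(views),
        "share_of_views": sum(popular) / total if total else 0.0,
    }


# Made-up view counts, NOT data from the study.
sample = [0, 3, 12, 40, 65, 150, 900, 2_500, 50_000, 1_200_000]
summary = view_summary(sample)
print(summary["median"])           # 107.5
print(summary["share_of_videos"])  # 0.2
```

Even in this toy list the mean (125,367) sits far above the median (107.5), which is the signature of a long tail: a couple of big videos account for nearly all of the views.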
Ethan Zuckerman
Fascinating method – and fascinating results! If these are anywhere near accurate, in 2023 there was a new YouTube video for every other person living on the planet! The stats also highlight the extreme inequality in traffic between a tiny minority of popular uploads and the typical YouTube video, which can’t even top 10,000 views.
The growth curve looks exponential, which raises the question: how long can this proliferation of video content continue? We are accustomed to thinking of the digital realm as limitless, a place where you can create anything and multiply it however many times you want, because each new copy is essentially free. But at the scale of YouTube, the system may start hitting its constraints in terms of server space to store video files and bandwidth to deliver them to viewers. Earlier this year, Google updated its inactive account policy to inform users that it may start deleting accounts and their contents after 2 years of inactivity. The announcement was wrapped in a narrative of security, but I wouldn’t be surprised if the ongoing costs of keeping old content around were a factor as well.