The state of our video ID tools (by Steve Chen, YouTube co-founder)
Recent speculation and stories like this Wall Street Journal article or this Reuters report on YouTube's use of video identification tools made us think it would be useful to clarify what we’re doing. We’ve been developing improved content identification for months, and we’re confident that in the not-too-distant future, we’ll unveil an innovative solution that will work for users and content creators alike. This is one of the most technologically complicated tasks that we have ever undertaken. But YouTube has always been committed to developing sustainable and scalable tools that work for all content owners.
Even though we haven’t given too many details, we’ve been hard at work. Earlier this year we implemented audio fingerprinting technology from Audible Magic to help identify the audio content of music partners like Warner Music, Sony BMG, and Universal. Today we're experimenting with video identification tools, and we want to share with you a few core principles driving our technology development, past and present.
We are beginning tests on an automated system to identify and match specific videos. The technology extracts key visual aspects of uploaded videos and compares that information against reference material provided by copyright holders. Achieving the accuracy to drive automated policy decisions is difficult, and requires a highly tuned system. Once accuracy is achieved, the challenge becomes speed and scale to support the millions of people who use YouTube every day. We are working with some of the major media companies to test what we have developed. We’re excited about the progress so far, and we’re dedicated to making these tests successful, but as always with cutting-edge technologies, there’s no guarantee of success.
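To make the idea of extracting visual features and comparing them against reference material concrete, here is a toy sketch. It is not the system described in this post: the average-brightness fingerprint, the 0.9 threshold, and every name in it (`frame_fingerprint`, `find_match`, `ref-1`) are assumptions chosen for illustration, and frames are modeled as small non-empty grids of grayscale values.

```python
from typing import Dict, List, Optional, Sequence

# Assumption: a frame is a non-empty grid of grayscale pixel values (0-255).
Frame = Sequence[Sequence[int]]
Fingerprint = List[List[int]]


def frame_fingerprint(frame: Frame) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the frame's mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def video_fingerprint(frames: Sequence[Frame]) -> Fingerprint:
    """Fingerprint a video as the list of its per-frame fingerprints."""
    return [frame_fingerprint(f) for f in frames]


def similarity(a: Fingerprint, b: Fingerprint) -> float:
    """Fraction of matching bits, comparing frames pairwise up to the shorter video."""
    total = 0
    same = 0
    for fa, fb in zip(a, b):
        for x, y in zip(fa, fb):
            total += 1
            same += int(x == y)
    return same / total if total else 0.0


def find_match(
    upload: Sequence[Frame],
    references: Dict[str, Fingerprint],
    threshold: float = 0.9,  # hypothetical cutoff; a real system would tune this
) -> Optional[str]:
    """Return the id of the best-matching reference, or None if below threshold."""
    upload_fp = video_fingerprint(upload)
    best_id, best_score = None, 0.0
    for ref_id, ref_fp in references.items():
        score = similarity(upload_fp, ref_fp)
        if score > best_score:
            best_id, best_score = ref_id, score
    return best_id if best_score >= threshold else None


# Reference material supplied by a (hypothetical) copyright holder.
reference_video = [[[10, 200], [220, 30]]]
references = {"ref-1": video_fingerprint(reference_video)}

# A uniformly brightened copy still matches; an unrelated clip does not.
brightened_copy = [[[30, 220], [240, 50]]]
unrelated_clip = [[[200, 10], [30, 220]]]
```

Calling `find_match(brightened_copy, references)` returns the reference id, while `find_match(unrelated_clip, references)` returns `None`. The threshold is where the accuracy problem lives: set it too low and unrelated videos are flagged, set it too high and lightly altered copies slip through.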
Now, when it comes to spotting pornography and graphic violence, and other content prohibited by our terms of use, nothing beats our community flagging. Once a user flags a video, we immediately review it and remove it if we find a violation. But our community can’t identify infringing content. We all know pornography and violence when we see them. But copyright status can only be determined by the copyright holder. That is because almost anyone who creates an original video holds the copyright for that work, and preferences vary widely across such a broad range of copyright holders.
Some copyright holders want control over every use of their creation. Many professional artists and media companies post their latest videos without telling us, while some home video-makers don't want their stuff online. Sometimes a legal department takes down a video one day and the marketing department puts it back up the next. That is their right, but our community can’t predict those things, and neither can we. The same is true for technology. No matter how good our video identification technology gets, it will never be able to read copyright holders’ minds.
If a content owner identifies material that she doesn’t want on YouTube, she can request its removal with the click of a mouse. If particular users repeatedly infringe copyrights, we terminate their accounts. We have long made a practice of creating a unique "hash" of every video removed for alleged copyright infringement and blocking re-uploads of any file that matches that hash. We educate users on what is and isn’t permissible under the law. Our upcoming video identification system will be our latest way of empowering copyright holders, going above and beyond legal requirements.
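The hash-and-block practice can be pictured with a short sketch. This is only an illustration under stated assumptions: the post does not say which hash function is used, so SHA-256 over the raw file bytes stands in for it here, and the `RemovedVideoBlocklist` class and its method names are hypothetical.

```python
import hashlib


class RemovedVideoBlocklist:
    """Remembers hashes of removed videos and rejects identical re-uploads."""

    def __init__(self) -> None:
        self._blocked: set[str] = set()

    @staticmethod
    def digest(video_bytes: bytes) -> str:
        # Assumption: SHA-256 of the file contents stands in for the "hash".
        return hashlib.sha256(video_bytes).hexdigest()

    def record_removal(self, video_bytes: bytes) -> None:
        """Store the hash of a video removed for alleged infringement."""
        self._blocked.add(self.digest(video_bytes))

    def is_blocked(self, video_bytes: bytes) -> bool:
        """True if an upload is byte-for-byte identical to a removed video."""
        return self.digest(video_bytes) in self._blocked


blocklist = RemovedVideoBlocklist()
blocklist.record_removal(b"bytes of a removed video")
```

With this in place, `blocklist.is_blocked(b"bytes of a removed video")` is true, while a file that differs by even one byte is not caught. An exact hash of this kind only stops identical copies, which is why content-based identification is a separate and harder problem.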
We’ll continue our focus on delivering a great user experience. YouTube's no-fuss upload lets video artists collapse the gap between the creative moment and its worldwide publication. It helps our hundreds of media partners - as well as marketers and advertisers - spread their hottest work while it's still hot. And it enables presidential candidates participating in our YouChoose 2008 program to engage in a direct, open dialogue with voters, bringing transparency, access and authenticity to the political process. We’re carefully designing our new identification technologies to not impede those free and fast forms of expression.
In conclusion, a content management system has to have technology that provides high quality matching and detection, but it also has to apply business rules in ways that support the business objectives of partners while providing high quality user experiences. With the introduction of our video identification tools, YouTube will continue to be the leader in online video, and the premier destination for watching and sharing original videos worldwide. Now, back to work…