UGA Music Business
Undesirable Lyric Website and Advertiser List
March 20th 2014
University Of Georgia
*Preliminary versions of this report were released Aug 28th and Oct 22nd 2013. We have since adjusted our methodology and expanded our pool of data.
Current list: http://ugalyricwebsitelist.org/2014/03/21/current-uga-undesirable-lyric-website-list/
Over the last decade, much attention has been paid to websites that infringe recorded music and movies. Songwriters are only tangentially included in these studies, and unlicensed lyric sites have been largely overlooked even though lyrics are half of the copyright in a song. There are several reasons that illegal lyric sites have been overlooked in studies:
1) The public doesn’t understand that lyric sites are a type of copyright infringement due to the unlicensed use of song lyrics.
2) The many legal actions directed at these lyric sites have generated little press coverage.
3) It appears that commentators have always assumed that there is not much money at stake with infringing lyric sites. At the dawn of the Internet age, the analog equivalent of lyric and tablature websites was “sheet music”. By 2000, sheet music sales had become a smaller business than they had been even in the 1990s.
However, there is anecdotal evidence that these lyric websites generate huge web traffic and may involve more money than one might think. For example, we have found that the fully licensed www.azlyrics.com frequently ranks in the Top 500 websites in the U.S. Based on the popularity of lyric searches, it is possible that unlike the sound recording business, the lyric business may be more valuable in the Internet age than before it.
Indeed, the vast majority of these lyric websites seem to have well-established monetization schemes based on advertising. In this iteration of the Undesirable List, we include preliminary analysis of the brands whose advertising appears on the sites we observed and that are included in the Undesirable List.
Many of the sites appear to have accounts with major online advertising exchanges and prominently feature advertising from major brands. There are even companies that appear to specialize in matching specific lyrics to key demographics for advertisers and may actually sell lyrics as keywords. The prevalence of illegal lyric sites is even less understandable because songwriters, music publishers and technology companies have worked together to offer easy and inexpensive licenses for lyric sites through www.lyricfind.com and www.musixmatch.com.
However, the illegal site business appears to be robust and flourishing, and licensed sites struggle to keep up. Just like their music streaming and music download counterparts, licensed lyric sites compete with many unlicensed sites that are not burdened with royalties to songwriters and publishers.
The Lyric Website Undesirability Index And List
Undesirable vs. Illegal
Our use of the term “undesirability index” is inspired by Google’s UK Policy Manager Theo Bertrand, who used the term during a recent debate in London. We use the term “undesirable” because these sites do not appear to be licensed and are therefore undesirable advertising publishers for legitimate brands. Mr. Bertrand distinguished between a site that is in fact operating outside the law (a legal determination that Google typically interprets as requiring, on a link-by-link basis, a final nonappealable judgment in the highest court of each country where the Internet obtains) and a site that merely appears illegal to a brand manager, who does not need a legal determination to disassociate the brand from the site; hence, an “undesirable site”.
We cannot absolutely conclude, from the outside looking in, that these sites do not have licenses. However, we could not locate the sites in the www.lyricsseal.com database and they do not appear to have otherwise been flagged as licensed by the National Music Publishers’ Association (NMPA). We make the assumption that, for the purposes of determining undesirability in line with Mr. Bertrand’s suggestion, the absence of a determination by the NMPA that a site is operating with licenses is sufficient for us to identify that site as “undesirable”.
This leads us to conclude that these sites are most likely unlicensed, with the caveat that some of the sites on the Undesirable List may be licensed or partially licensed without our having been able to locate those responsible for their licensing. If you feel your site has been mistakenly included in this list, please contact us at uga_undesirable_list<AT>outlook.com. We will confirm your licenses and will be glad to remove your site from the list if you are in fact licensed.
Partially Licensed Category
With this month’s iteration of the Undesirable List, we have introduced the concept of “partially licensed”. This designation is based on information from the NMPA that at least one of its members has licensed the site in question. Some of the partially licensed sites appear to have been influenced by previous iterations of the Undesirable List and the NMPA’s efforts at encouraging sites to obtain readily available licenses.
Our methodology is evolving and under review. We do not propose the list as final or conclusive (if for no other reason than there are many more apparently unlicensed sites than there are slots on the list.) We are open to suggestions and are particularly interested in developing methodology that can be used across a variety of copyright categories, not just lyrics. Currently, this is the methodology used to compile this list of websites:
A. Three songs are selected at random.
   a. Two of the songs are in the current top 10 of Billboard’s Hot 100.
   b. One track is a randomly selected “oldie”. This track must have been a top 5 Hot 100 hit at least 20 but no more than 40 years ago.
B. Two search phrases are created for each track.
   a. “<Song Title> <Artist>”.
   b. “<popular snippet of lyric> words”. The search engine auto-fill suggestions determine the most popular snippet of song lyric.
C. Browser (history, cookies, etc.) is cleaned prior to searches.
D. The above searches are performed for each song on Google, Bing, and Yahoo.
E. The top 200 search results for each search are imported into an Excel spreadsheet using a specialized plugin for Firefox.
F. Each search result is assigned a value. A number one search result is assigned a value of 1. A number two search result is assigned a value of 0.995. A number n search result is assigned a value of 1 - (n - 1) × 0.005.
G. Scores are tallied for each root domain.
H. These scores are averaged together with scores from the two preceding studies. (Studies are run approximately every 60 days, creating a 180-day rolling window.)
I. Domains are sorted from highest score to lowest score.
J. “False positives” are eliminated, such as:
   a. Artist/Label/Publicist websites.
   b. Sites not substantially dedicated to lyrics. Exceptions:
      i. subdomain (e.g. lyrics.website.com);
      ii. featured homepage tab to lyrics;
      iii. has majority of current Hot 100 lyrics.
   c. Music reviews, magazines or blogs that reproduced portions of lyrics.
   d. “Bridge” pages using hidden, popular search phrases to draw traffic (although these may later be included on a separate list if they target lyric searches).
   e. Personal blogs and websites containing the lyrics of one of the targeted songs.
K. Websites that appear to be licensed are eliminated.
L. The top 50 remaining sites are selected and posted to the list.
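The scoring, tallying and averaging steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet macros actually used in the study. The function names, the naive root-domain rule and the choice to count a domain missing from a study as zero are all assumptions made for the example.

```python
from collections import defaultdict
from urllib.parse import urlparse

MAX_RESULTS = 200   # top 200 results per search, per the methodology
STEP = 0.005        # the value drops by 0.005 for each rank position


def rank_score(n):
    """Value of the result at 1-based rank n: 1 - (n - 1) * 0.005."""
    if not 1 <= n <= MAX_RESULTS:
        raise ValueError("rank must be between 1 and 200")
    return 1 - (n - 1) * STEP


def root_domain(url):
    """Naive root-domain guess: the last two labels of the host name.

    Assumption: URLs carry a scheme (http://...), and suffixes such as
    .co.uk are not handled; a real tally would need a public-suffix list.
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def tally(searches):
    """Sum rank values per root domain over many ranked result lists."""
    totals = defaultdict(float)
    for results in searches:
        for n, url in enumerate(results[:MAX_RESULTS], start=1):
            totals[root_domain(url)] += rank_score(n)
    return dict(totals)


def rolling_average(studies):
    """Average each domain's score over the studies supplied.

    Assumption: a domain absent from a study counts as 0 for that study.
    """
    domains = set().union(*studies)
    return {d: sum(s.get(d, 0.0) for s in studies) / len(studies)
            for d in domains}


def ranked(scores):
    """Domains sorted from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Made-up result lists, for illustration only.
    searches = [
        ["http://www.example-lyrics.com/a", "http://other.example.org/b"],
        ["http://lyrics.example-lyrics.com/c"],
    ]
    print(ranked(tally(searches)))
```

False-positive and licensed-site elimination are manual judgment calls in the study, so they are deliberately left out of the sketch.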
Note on Methodology (Feb 2014)
There is one small subtlety in the methodology that we should point out. Because lyric sites often engage in “spammy” SEO practices, a lyric site may occasionally rank high in the search results for a particular song without actually having the lyrics to that song. We will still include the site in the list if it appears to have other unlicensed lyrics. After all, the study is intended to find and prioritize seemingly unlicensed lyric sites, not specific examples of infringement. A clear example of spammy SEO is the case of www.rapgenius.com, which was trying to drive traffic to its rap-oriented site using lyrics to Justin Bieber songs.
On a related note, some clever folks at Hacker News noticed that we didn’t rank the sites by Alexa rank and instead ordered them by their overall ranking in our search returns. This is because Alexa ranks overall web traffic. Many sites have been long established and thus have organic or “inertial” traffic that bypasses search engines. While this is important, we are more interested in a site’s “searchiness”: essentially, we rank sites by how aggressively they target lyric searches. This is often, but not always, related to their SEO techniques, both “black hat” and “white hat.”
Errors and Margin Of Error
As there are over 3,600 links, in order to complete the results within our available academic resources, we use some simple spreadsheet macros to tally the scores and then spot check for errors.
A particular difficulty is that sometimes URLs “resolve” to different URLs. In most cases this leads us to underestimate the scores of websites in the Top 50. For instance, a URL like www.examplelyricswebsite.ru (note the “s” and “ru”) will resolve to http://www.examplelyricwebsite.com. We may miss this and thus miscalculate the score of the latter website. But in this example, the latter website is almost always the higher-ranking website, where these small errors matter less to overall rankings.
So what we can say is that any re-tallying of the score based on our search results is unlikely to change the ranking of the websites significantly. We believe that the higher the rank of the website, the less likely it is that any tabulation error would change its rank. The top 5 results should have a margin of error of less than 5%.
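One way to reduce this kind of undercounting is to fold the score of each redirecting domain into the domain it resolves to before ranking. The sketch below assumes a hand-compiled table of known redirects. The table, the function name and the scores are hypothetical and are not part of the study's actual tooling.

```python
def merge_aliases(scores, aliases):
    """Fold scores of redirecting domains into the domain they resolve to.

    scores:  {domain: score}
    aliases: {redirecting_domain: final_domain} (hypothetical, hand-compiled)
    """
    merged = {}
    for domain, score in scores.items():
        target = domain
        seen = set()
        # Follow chains of redirects, guarding against loops.
        while target in aliases and target not in seen:
            seen.add(target)
            target = aliases[target]
        merged[target] = merged.get(target, 0.0) + score
    return merged


if __name__ == "__main__":
    # Illustrative numbers only; the domains are the example from the text.
    scores = {
        "examplelyricswebsite.ru": 1.5,
        "examplelyricwebsite.com": 40.0,
    }
    aliases = {"examplelyricswebsite.ru": "examplelyricwebsite.com"}
    print(merge_aliases(scores, aliases))
```

Because the redirecting domain usually carries the smaller score, the merge mostly nudges the higher-ranking site upward, which is consistent with the observation that these errors matter less near the top of the list.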
Video Sites and the “Substantially Dedicated to Lyrics” Clause and Exemptions
There are three major video sites: YouTube (dominant globally), Dailymotion and Vimeo. While the majority of the videos on these sites are not lyric videos (i.e., user-generated content in which the video is only a text rendering of the lyrics and the audio track is a recording of the song), the fact that these sites generally host lyric videos for the songs in the Billboard Hot 100 suggests we should somehow include them in our study (see Methodology item J. b. iii above).
So are these video sites fully licensed for lyric videos? We’ve been examining this question for some time and have not yet been able to answer it definitively. However, after speaking with various publishers and the NMPA, this is what we have concluded:
YouTube has many confidential licenses with major publishers that may include permissions for lyric videos. Further, many publishers try to manage and monetize their songs through YouTube’s Content Management System. There are many complaints about “CMS”, particularly from smaller music publishers and particularly about user generated videos (it appears that YouTube may well be the only one able to track user generated content to the degree that songwriters and publishers expect from a licensee). It is also important to note that many record labels and artists upload lyric videos to these sites, further muddying the waters in terms of licensing.
It was difficult to find many definitively unlicensed lyric videos on YouTube without having access to YouTube’s confidential agreements with music publishers. Songwriters may be able to obtain copies of these agreements.
On the other hand, based on responses from the NMPA, Vimeo and Dailymotion appear to be largely or entirely unlicensed.
We have decided to leave YouTube, Vimeo and Dailymotion off the Undesirable Lyric Website List for the moment. However, we may include some or all of these sites in the next study.
Comments On Rapgenius
It has been widely reported that Rapgenius has sought and obtained licenses from some of the major publishers. We assume that these reports are true, but a review of the site suggests that Rapgenius still has not licensed all the song lyrics it displays.
In particular, they have not licensed songs from Warner Music Group, including this author’s songs. As an amusing aside: the account for the “Editor-In-Chief” of Rapgenius (for unknown reasons) transcribed the lyrics to one of the author’s songs and uploaded them to the Rapgenius site immediately after publication of the October list. Rapgenius also has a “news genius” section. If you go to that section of the website, you will see that they have also re-published (without permission) the last UGA Undesirable Lyric Website List Report. We have not determined any appropriate action.
The Nitty Gritty Search Details
This study used the following songs for searches:
The Monster, Eminem
The Wanderer, Donna Summer.
The searches were conducted Jan 28 to Feb 2nd 2014.
A variety of Central Virginia IP addresses were used.
Sometimes our researchers changed the “location” of the browser (in preferences) to other cities including:
New York, NY
In order to automate the collection of search returns in Bing, one must log into a Microsoft account. We created a generic and semi-anonymous live.com account to do this. As a result, it is possible that search results were influenced by Bing’s knowledge of our past searches.
Advertising and Brands.
In addition to analyzing the search results, this time we noted which of the top 10 lyric websites appeared to regularly display advertising from major brands. A major brand was defined as a company on the current Fortune 1000 list or an institution (like a university or government agency) of similar size. We had five student volunteers navigate to each of these websites each day during the indicated study periods and catalogue the brands whose display advertisements appeared on each site.
The methodology for determining advertisers on the Undesirable List sites is:
- Launch Firefox browser and clear history at the beginning of each day.
- Visit each website once a day for 7 days.
- Find an incidence of reproduced lyrics.
- Catalogue the advertising that appeared on the page containing lyrics.
- Spread out the visits to each site over 12 hours.
- Randomize the order in which the sites are visited each day.
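The visiting routine above can be sketched as a small scheduler. It assumes that "spread out over 12 hours" means the day's visits are evenly spaced across a 12-hour window, which is one reading of the step rather than a confirmed detail. The function names and the placeholder site names are hypothetical.

```python
import random


def daily_schedule(sites, window_hours=12, rng=None):
    """Return (minutes_from_start, site) pairs for one day's visits.

    The visiting order is shuffled, and visits are spaced evenly across
    the window (an assumption about what "spread out" means).
    """
    if rng is None:
        rng = random.Random()
    order = list(sites)
    if not order:
        return []
    rng.shuffle(order)
    gap = window_hours * 60 / len(order)
    return [(round(i * gap), site) for i, site in enumerate(order)]


def weekly_plan(sites, days=7, seed=None):
    """One independently shuffled schedule per day of the study period."""
    rng = random.Random(seed)
    return [daily_schedule(sites, rng=rng) for _ in range(days)]


if __name__ == "__main__":
    # Placeholder site names, not the sites from the list.
    sites = [f"site{i}.example" for i in range(1, 11)]
    for day, plan in enumerate(weekly_plan(sites, seed=2014), start=1):
        print(day, plan[:3])
```

Clearing the browser history and cataloguing the advertising on each lyrics page remain manual steps carried out by the volunteers and are not modeled here.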
Brands and Advertising.
These are the “major” brands observed on the top 10 lyric websites (listed in alphabetical order.)
- Ai The Art Institutes
- American Heart Association/American Stroke Association
- Atlanta Institute of Music
- Carnival Cruise Lines
- Georgia State University
- Norton by Symantec
- Savannah Law School
- U.S. Department of Energy