Lately I have been thinking a lot about how pop music has changed. In particular, I wanted to know whether popular music in the United States has become less meaningful (a common criticism from the older generation). To prove or disprove this claim, I defined "less intelligent" to mean lyrics that are less expressive and use a smaller variety of words.
Pop music is worth studying because it has shaped the culture of a great many Americans. To call pop music senseless is to call into question the intellectual level of the US population.
While searching the Internet, I found an article on the Huffington Post site that analyzes the most common words used in pop songs written in different periods. I think the data in that article say a lot about how the approach to writing lyrics in America has changed over time.
I also wanted to help establish the truth with more general statistics, focused on the semantic content of lyrics as a whole rather than on the use of particular themes in songs. I found an article by William Briggs in which he argues that music has simply become much dumber than it used to be. He analyzes the ratio of the number of unique words in popular songs to their total number of words, and uses the results as evidence that lyrics are becoming less meaningful. [Here "unique" words probably means words that appear in a song no more than once. Translator's note.]
Wanting to put off work on my thesis, to check Briggs's results, and to write a program in Python, I decided to conduct a similar study using the top 40 lists for each year from 1950 to the present. Briggs does not say exactly which music he used for his analysis, but it turned out that his source material was roughly the same as mine.
The Top40 Charts website has lists of the 40 biggest hits for each year since 1950, laid out as simple tables. I could easily walk through each entry in a list with a simple loop and Beautiful Soup. That is how I loaded the artist names and song titles into MongoDB, using PyMongo as the driver.
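A minimal sketch of the table-walking step described above. The post used Beautiful Soup; to stay dependency-free, this sketch swaps in the standard library's html.parser. The three-column layout (position, artist, title), the sample markup, and the names ChartTableParser and parse_chart are assumptions made for illustration, not the real page structure or the author's code.

```python
from html.parser import HTMLParser


class ChartTableParser(HTMLParser):
    """Collects the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_chart(html):
    """Return one dict per chart entry; header and malformed rows are skipped."""
    parser = ChartTableParser()
    parser.feed(html)
    parser.close()
    return [
        {"position": int(row[0]), "artist": row[1], "title": row[2]}
        for row in parser.rows
        if len(row) == 3 and row[0].isdigit()
    ]


# Placeholder markup standing in for one yearly chart page (hypothetical layout).
SAMPLE = """<table>
<tr><th>#</th><th>Artist</th><th>Title</th></tr>
<tr><td>1</td><td>Artist A</td><td>Song A</td></tr>
<tr><td>2</td><td>Artist B</td><td>Song B</td></tr>
</table>"""

entries = parse_chart(SAMPLE)
```

Each resulting dict is already shaped like a document, so a list of them could be handed to a MongoDB collection in one call, for example with PyMongo's insert_many.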
So I had at my disposal almost all the hits from every top 40 list, apart from some low-quality entries: some of the song titles I found contained typographical and other errors. Next I needed to find the lyrics, and this is where the difficulties began. I could not find an affordable or free API for lyrics, and I did not want to write a scraper that would collect Google search results. It turned out that lyrics.wikia.com has lyrics for many songs, and its page structure makes it possible to locate the pages for a large number of songs efficiently.
The only problem was that the address of each song is very specific, and the slightest inaccuracy resulted in a 404 error. In the 1950s it was very popular for big bands to call themselves "So-and-so and Orchestra", "So-and-so and His Orchestra", and so on, which made the pages hard to find. I sped up the process by applying various regular expressions to the song titles and artist names, for example to strip such a name suffix or a phrase in parentheses. In the end, this was the decisive factor in finding the lyrics. I decided that the results of this search would be enough for the analysis.
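The kind of cleanup described above could look roughly like this. The post does not list its actual regular expressions, so both patterns and the function names clean_artist and clean_title are hypothetical; they only illustrate stripping an orchestra suffix and a parenthesized phrase.

```python
import re

# Hypothetical patterns; the original expressions are not given in the post.
PARENTHESIZED = re.compile(r"\s*\([^)]*\)")
ORCHESTRA_SUFFIX = re.compile(
    r"\s+(?:and|&)\s+(?:his|her|their|the)?\s*orchestra\b.*$",
    re.IGNORECASE,
)


def _squeeze(text):
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def clean_artist(name):
    """Drop parenthesized notes and a trailing '... and His Orchestra'."""
    name = PARENTHESIZED.sub("", name)
    name = ORCHESTRA_SUFFIX.sub("", name)
    return _squeeze(name)


def clean_title(title):
    """Drop parenthesized phrases from a song title."""
    return _squeeze(PARENTHESIZED.sub("", title))
```

For example, clean_artist("So-and-so and His Orchestra") gives "So-and-so", and clean_title("Some Song (Live Version)") gives "Some Song". The orchestra pattern is applied to artist names only, so a song title that happens to contain "and the orchestra" is left alone.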
After that, it was not difficult to analyze the lyrics from different years and compare them. First I found the number of unique words and the total number of words in each song, and then the ratio between the two. My results confirmed what Briggs described: both the number of unique words and the total number of words increased over time, while their ratio became much smaller. This could mean that the semantic content of popular song lyrics really has declined.
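A minimal sketch of the per-song counts, under stated assumptions: words are lowercase runs of letters (with inner apostrophes), and "unique" is read as "distinct". Because the text leaves room for the other reading (words that occur exactly once), that count is returned as well. The tokenizer and the name word_stats are illustrative, not the author's code.

```python
import re
from collections import Counter

# Assumed tokenizer: lowercase letter runs, allowing inner apostrophes ("don't").
WORD = re.compile(r"[a-z]+(?:'[a-z]+)*")


def word_stats(lyrics):
    """Return total, distinct, and once-only word counts plus distinct/total."""
    words = WORD.findall(lyrics.lower())
    counts = Counter(words)
    total = len(words)
    distinct = len(counts)
    once = sum(1 for n in counts.values() if n == 1)
    ratio = distinct / total if total else 0.0
    return {"total": total, "distinct": distinct, "once": once, "ratio": ratio}


stats = word_stats("la la la love me do")
```

Here the six words contain four distinct ones, three of which occur only once, so the ratio is 4/6. A song with a heavily repeated chorus pushes the total up much faster than the distinct count, which is what drives the ratio down.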
The growth in the total number of words can probably be explained by a shift to other genres: big band jazz and disco songs certainly have fewer words than rap or rock 'n' roll. In addition, the big jazz orchestras of earlier years did not use a chorus, which significantly reduces the number of repeated words in a song.
Interestingly, both the total number of words and the number of unique words per song peaked in 2003, possibly because of the large number of rap and R&B hits that appeared at that time. The top of that year's top 40 list is full of songs by 50 Cent, Eminem, Jay Z, and others. The subsequent decline may be due to the popularity of electronic music and dance hits.
I wanted to dig a little deeper into the subject, so I analyzed the average word length in the lyrics. In every year it was very close to four characters. I also calculated how many words of each length the songs contained. Likewise, the share of four-, five-, six-, seven-, and eight-letter words among all words stayed almost unchanged from year to year.
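The word-length measurements above can be sketched as follows. The tokenizer (lowercase letter runs, apostrophes not counted as letters) and the function names are assumptions for illustration; the post does not show how words were split.

```python
import re
from collections import Counter

# Assumed tokenizer: lowercase letter runs, allowing inner apostrophes.
WORD = re.compile(r"[a-z]+(?:'[a-z]+)*")


def word_lengths(lyrics):
    """Length in letters of every word, ignoring apostrophes."""
    return [len(w.replace("'", "")) for w in WORD.findall(lyrics.lower())]


def mean_length(lengths):
    """Average word length, or 0.0 for an empty text."""
    return sum(lengths) / len(lengths) if lengths else 0.0


def length_shares(lengths, sizes=(4, 5, 6, 7, 8)):
    """Fraction of all words that have each of the given lengths."""
    counts = Counter(lengths)
    total = len(lengths)
    return {n: (counts[n] / total if total else 0.0) for n in sizes}


lengths = word_lengths("love me tender love me true")
average = mean_length(lengths)
shares = length_shares(lengths)
```

Running the same two functions over all lyrics of a given year, rather than a single line, would give the per-year averages and shares that the paragraph compares.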
So although the ratio of unique words to total words may be lower in today's songs, the words in older songs were not necessarily "more meaningful". That said, word length by itself says little about the quality of the words.
I ran a sanity check to make sure that the data I obtained were sound. I would be embarrassed if my conclusions could easily be refuted by a few unusual lyrics with lots of long or unique words. I accounted for this possibility by calculating the coefficient of variation of word length and of the number of unique words in each song. The rule of thumb is that a coefficient of variation below one indicates that the data have relatively low variance. You can read more about the significance of this coefficient in a StackExchange post, which links to other useful resources on the topic. It turned out that the content of the lyrics remained fairly consistent over the period, so no such "counter-examples" could have skewed the findings.
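For reference, the coefficient of variation is the standard deviation divided by the mean. A minimal sketch follows; the post does not say whether the population or the sample standard deviation was used, so the population form here is an assumption, and the input numbers are arbitrary illustration values rather than data from the study.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / m


# Arbitrary illustration values (mean 5, population standard deviation 2).
cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

Here cv is 0.4, well below one, so by the rule of thumb this sample would count as low-variance. Applied to per-song unique-word counts, a value above one would instead warn that a few extreme songs dominate the averages.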
A more detailed analysis of the semantic content of popular songs would require an in-depth understanding of the lyrics, but the general statistics above suggest that pop lyrics have become less creative over time. Specifically, they have become more repetitive. The overall increase in the number of words and unique words per song can probably be explained by the wordier nature of genres like rap or indie rock compared with the big band jazz of the '50s or the disco of the '70s.
Since the average word length has not changed much over time, I do not think it can be said with certainty that today's music has become "dumber", as Briggs claims. It would be more accurate and more objective to say that songs have become wordier and lean more often on repetitive choruses.
All of the code written for this experiment is posted on GitHub. The database is publicly available there as well, in case you want access to the song lyrics.
Source:
geektimes.ru/company/audiomania/blog/252532/