The first 2020 U.S. Presidential debate was widely covered as one of the worst in U.S. history, with analysts like CNN’s Jake Tapper left grasping for the words to describe it. Was it truly “a hot mess, inside a dumpster fire, inside a trainwreck”? Using textual analyses, I attempt to quantitatively compare this debate with every other in U.S. history.
My results are built on a dataset of every general election Presidential and Vice Presidential debate. I scraped these data from the Commission on Presidential Debates website and rev.com. The final dataset includes over 13,500 questions and responses organized by debate and speaker, from the first Kennedy-Nixon debate in 1960 to the Harris-Pence VP debate in 2020. You can find the data here.
One way to analyze these data is to look at how positive or negative the candidates are in their responses, that is, the sentiment of their arguments. In its simplest form, a sentiment analysis counts the number of positive words (“excellent”, “brilliant”, “win”) and negative words (“worst”, “horrible”, “fraud”) used in a text. A text with more positive than negative words will have a more positive average sentiment (and vice versa).
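The counting approach can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the results: the two word lists hold only the example words quoted above, and the `count_sentiment` helper and its regex tokenizer are assumptions made for the sketch.

```python
import re

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE = {"excellent", "brilliant", "win"}
NEGATIVE = {"worst", "horrible", "fraud"}


def count_sentiment(text):
    """Return (positive count, negative count, net score) for a text."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    return pos, neg, pos - neg


print(count_sentiment("We will win. It was the worst, most horrible deal."))
# → (1, 2, -1)
```

A negative net score means the text used more negative than positive words from the lists.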
Using the AFINN dictionary, which assigns each of nearly 2,500 words an integer score from -5 (most negative) to 5 (most positive), I calculate the average sentiment of every general election debater. The plot below shows the average sentiment of presidential candidates from the two major parties.
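A scored dictionary changes the calculation from counting words to averaging their scores. The sketch below is a rough illustration under stated assumptions: the `LEXICON` values are placeholders rather than entries copied from AFINN, the speaker names and responses are invented, and averaging over matched words (rather than over responses or all words) is one plausible choice, not necessarily the one used for the plot.

```python
import re
from collections import defaultdict

# Hypothetical AFINN-style lexicon with integer scores from -5 to 5.
# Placeholder values for illustration only, not copied from AFINN.
LEXICON = {"excellent": 3, "win": 4, "worst": -3, "fraud": -4}


def average_sentiment_by_speaker(responses):
    """Map each speaker to the mean score of their lexicon-matched words.

    `responses` is an iterable of (speaker, text) pairs. Speakers with no
    matched words are left out of the result.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for speaker, text in responses:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in LEXICON:
                totals[speaker] += LEXICON[word]
                counts[speaker] += 1
    return {speaker: totals[speaker] / counts[speaker] for speaker in counts}


# Invented sample responses, organized by speaker.
sample = [
    ("Candidate A", "An excellent plan, and we will win."),
    ("Candidate B", "This is fraud. The worst record."),
    ("Candidate A", "That was the worst answer."),
]
print(average_sentiment_by_speaker(sample))
```

Here Candidate A averages (3 + 4 - 3) / 3 and Candidate B averages (-4 - 3) / 2, so B comes out more negative.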