Text Analysis Research Tools

Dictionary of Extreme Language in Voluntary Disclosures

This dictionary consists of 23,355 words and phrases, each rated on a scale from “-5” (extremely negative) to “+5” (extremely positive), where “0” indicates a neutral word or phrase. We briefly summarize the dictionary creation procedure below and provide more details in our published paper.

  1. We extracted all adjectives, nouns, and verbs that occurred in more than 1% of 60,940 earnings conference call transcripts.
  2. We then deleted finance and accounting terms as well as stop words, names, and generic terms.
  3. We then added words from the positive and negative word lists of Loughran and McDonald (2011) (see http://ssrn.com/abstract=1331573) that were not already included in the dictionary.
  4. For each word in the merged dictionary, we also looked for synonymous words and phrases using Microsoft Word’s thesaurus feature.
    Steps 1 through 4 generated 23,355 words and phrases.
  5. We then employed workers registered with Amazon’s Mechanical Turk service (MTurk) to rate all words and phrases in the dictionary.
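
The frequency filter and merging in steps 1 through 3 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used in the paper: the function name `candidate_terms` and its parameters are hypothetical, tokenization is a simple lowercase letter match, and the part-of-speech filtering (adjectives, nouns, verbs) and thesaurus lookup are omitted.

```python
import re
from collections import Counter


def candidate_terms(transcripts, min_share=0.01,
                    excluded=frozenset(), extra=frozenset()):
    """Hypothetical sketch of steps 1-3 (no part-of-speech filtering).

    transcripts: list of transcript strings
    min_share:   keep terms found in MORE than this share of transcripts
    excluded:    terms to drop (stop words, names, finance terms, ...)
    extra:       terms to add from another word list
    """
    doc_freq = Counter()
    for text in transcripts:
        # Count each term once per transcript (document frequency).
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))

    cutoff = min_share * len(transcripts)
    frequent = {term for term, n in doc_freq.items() if n > cutoff}

    # Drop excluded terms, then merge in the additional word list.
    return (frequent - excluded) | set(extra)
```

With `min_share=0.01` and 60,940 transcripts, a term would need to appear in at least 610 transcripts to be kept, since the cutoff is 609.4.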

Follow this LINK to download the rated dictionary. Please cite our paper when using the dictionary: Khrystyna Bochkay, Jeffrey Hales, and Sudheer Chava (2019). Hyperbole or Reality? Investor Response to Extreme Language in Earnings Conference Calls. The Accounting Review, in press.
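
Once the ratings are loaded, applying them to a passage of text might look like the sketch below. It assumes the dictionary has already been read into a Python mapping from lowercase word or phrase to its rating; the layout of the downloadable file is not described here, so loading is left out. The names `score_text` and `mean_rating`, the three-word phrase limit, and the longest-match-first rule are illustrative choices, not part of the published method.

```python
import re


def score_text(text, ratings, max_len=3):
    """Return (phrase, rating) pairs found in text, longest phrase first.

    ratings: dict mapping a lowercase word or phrase to a rating in [-5, 5]
    max_len: longest phrase, in words, to look for (an assumed limit)
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    matched = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in ratings:
                matched.append((phrase, ratings[phrase]))
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no dictionary entry starts here
    return matched


def mean_rating(matched):
    """Average rating of the matched entries, or 0.0 if none matched."""
    return sum(r for _, r in matched) / len(matched) if matched else 0.0
```

For example, with the made-up ratings `{"off the charts": 5, "terrible": -4}`, the sentence "Results were off the charts, not terrible." yields two matches with a mean rating of 0.5. Matching the longest phrase first keeps a multi-word entry from being scored again through its individual words.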