The goal of our research was to assess whether the observation that deceptive texts
have a less positive tone than truthful ones, in terms of sentiment, could be operationalized and
used to build a classifier for the particular case of fraudsters' letters written in Spanish. The
data were the letters that CEOs address to company shareholders in their annual financial reports,
and the task was to identify the letters of companies that committed financial misconduct or fraud.
This case was challenging for two reasons: first, most previous research worked with spontaneous
written or spoken texts, whereas these letters are not spontaneous; second, most research in this
area worked on English texts, whereas we validated the linguistic cues reported as evidence of
deception on Spanish texts. The results of our research confirm that an SVM trained with a
bag-of-words model of frequent adjectives can achieve 81% accuracy, because these adjectives carry
information about the positive or negative tone and about the word combinations that are
characteristic of fraudsters' texts.
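
To make the idea of a linear SVM over a bag-of-words model of adjectives concrete, the following is a minimal sketch rather than the implementation used in this work. The adjective list, the four toy letters and their labels, and the hyperparameters are all invented for illustration, and the training loop is a plain subgradient descent on the hinge loss written with the standard library only.

```python
import re
from collections import Counter

# Hypothetical adjective vocabulary, chosen only for illustration.
ADJECTIVES = ["bueno", "excelente", "positivo", "adverso", "complejo", "negativo"]


def bag_of_adjectives(text, vocabulary=ADJECTIVES):
    """Return how often each vocabulary adjective occurs in the text."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    return [counts[adj] for adj in vocabulary]


def score(w, b, x):
    """Linear decision function w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=100, lr=0.1, reg=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss.

    Labels must be +1 or -1. The hyperparameters are arbitrary assumptions.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * score(w, b, xi)
            # The regulariser shrinks the weights at every step.
            w = [wj * (1.0 - lr * reg) for wj in w]
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 (fraud) or -1 (no fraud) for one feature vector."""
    return 1 if score(w, b, x) >= 0 else -1


# Invented toy letters: +1 marks "fraud", -1 marks "no fraud".
LETTERS = [
    ("Un resultado excelente y positivo", -1),
    ("Un equipo bueno y un balance positivo", -1),
    ("Un entorno adverso y complejo", 1),
    ("Un contexto negativo y adverso", 1),
]

X = [bag_of_adjectives(text) for text, _ in LETTERS]
y = [label for _, label in LETTERS]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

In this sketch each letter is reduced to a vector of adjective counts, so the learned weights indicate which adjectives push a letter toward one class or the other. A realistic experiment would need a part-of-speech tagger to select the adjectives, a frequency threshold to keep the frequent ones, and held-out data to estimate accuracy.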