Biases on social media data: (keynote extended abstract)
- dc.contributor.author Baeza Yates, Ricardo
- dc.date.accessioned 2021-04-15T07:54:03Z
- dc.date.available 2021-04-15T07:54:03Z
- dc.date.issued 2020
- dc.description Paper presented at WWW'20: International World Wide Web Conference, held April 20-24, 2020 in Taipei, Taiwan.
- dc.description.abstract Social media data is often used to take the pulse of online communities, for example by predicting sentiment or stances (e.g., political), to mention just two typical use cases. However, such analyses assume that the data samples really represent the underlying demographics of the overall community, both in number and in characteristics, which in most cases is not true. As a result, extrapolating these results to larger populations usually does not work. This happens because social media data is inherently biased, mainly due to two facts: (1) not all people are equally active on social media platforms, and most of them are really passive; and (2) there are demographic biases in gender and age, among other attributes. Hence, the questions of how representative the data is and whether it is possible to make it representative are of crucial importance. We also discuss related issues such as using public samples of mostly private platforms as well as typical errors in the analysis of such data.
- dc.format.mimetype application/pdf
- dc.identifier.citation Baeza-Yates R. Biases on social media data: (keynote extended abstract). In: Seghrouchni AEF, Sukthankar G, Liu TY, van Steen M. WWW '20: Companion Proceedings of the Web Conference; 2020 Apr 20-24; Taipei, Taiwan. New York: Association for Computing Machinery; 2020. p. 782-83. DOI: 10.1145/3366424.3383564
- dc.identifier.doi http://dx.doi.org/10.1145/3366424.3383564
- dc.identifier.uri http://hdl.handle.net/10230/47125
- dc.language.iso eng
- dc.publisher ACM (Association for Computing Machinery)
- dc.relation.ispartof Seghrouchni AEF, Sukthankar G, Liu TY, van Steen M. WWW '20: Companion Proceedings of the Web Conference; 2020 Apr 20-24; Taipei, Taiwan. New York: Association for Computing Machinery; 2020. p. 782-83
- dc.rights This paper is published under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution.
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by/4.0/
- dc.subject.keyword Bias
- dc.subject.keyword Representativeness
- dc.subject.keyword Data samples
- dc.subject.keyword Gender
- dc.subject.keyword Age prediction
- dc.title Biases on social media data: (keynote extended abstract)
- dc.type info:eu-repo/semantics/conferenceObject
- dc.type.version info:eu-repo/semantics/publishedVersion