BRaS

Dataset of 282 Million Tweets Marks Study on Brazilian Elections

by PublicaABCP
November 26, 2025
in ABCP

Translated and reviewed by Matheus Lucas Hebling

The study “The Interfaces Twitter Elections Dataset: Construction Process and Characteristics of Big Social Data During the 2022 Presidential Elections in Brazil”, published in PLOS One by the INTERFACES – Center for Sociopolitical Studies on Algorithms and Artificial Intelligence (UFSCar), documents the creation of a large-scale database of interactions on Twitter (now X) during Brazil’s 2022 presidential elections.

The article is authored by Sylvia Iasulaitis, leader of the INTERFACES group, with co-authors Alan Demétrius Baria Valejo, Bruno Cardoso Greco, Vinicius Gonçalves Perillo, Guilherme Henrique Messias, and Isabella Vicari. The multidisciplinary team, composed of researchers in Political Science, Computer Science, and Information Science, received support from FAPESP under the project “Analysis of Large Volumes of Political Data and Complex Networks”.

The research spans the pre-election, election, and post-election periods, including the events of January 8, 2023, when protesters stormed the headquarters of the Executive, Legislative, and Judiciary branches in Brasília. The resulting dataset, named ITED-Br, contains over 282 million tweets, making it one of the largest political social data collections in the world. According to PLOS One editors, the dataset is of high scientific value for research in Social Sciences and Computational Politics.

To ensure the preservation and scientific usability of the data, the ITED-Br dataset has been made publicly available on the INTERFACES group’s GitHub, in accordance with the platform’s terms of use. The dataset was released in a “dehydrated” format, containing only anonymized tweet and user identifiers. The “rehydration” process, which allows researchers to retrieve the original tweet content through the Twitter (X) API, is detailed in the accompanying documentation.
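As an illustration of what rehydration involves, the sketch below groups tweet identifiers into batches and builds the query parameters for one lookup request per batch. It is a minimal sketch and not the procedure from the dataset's documentation: the batch size of 100 IDs per request, the selected fields, and the function names batch_ids and lookup_params are all assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, List

# Assumption: the lookup endpoint accepts at most 100 IDs per request.
BATCH_SIZE = 100


def batch_ids(tweet_ids: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` tweet IDs."""
    batch: List[str] = []
    for tweet_id in tweet_ids:
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def lookup_params(batch: List[str]) -> Dict[str, str]:
    """Build query parameters for one lookup request (hypothetical field selection)."""
    return {
        "ids": ",".join(batch),
        "tweet.fields": "created_at,author_id,text",
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real tweet IDs from the dataset.
    ids = [str(n) for n in range(250)]
    for batch in batch_ids(ids):
        print(len(batch), lookup_params(batch)["ids"][:20])
```

Each set of parameters would then be sent to the tweet lookup endpoint with a valid access token. Tweets that have since been deleted or made private cannot be recovered this way.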

The primary objective of the study was to describe the process of collecting and organizing the ITED-Br dataset, built from public Twitter interactions related to Brazil’s main presidential candidates in 2022. For this purpose, the team developed specific data collection strategies, combining keyword searches, profile tracking, and post-based queries.

To overcome technical restrictions imposed by the platform, such as API rate limits, the researchers developed an in-house algorithm called “token farm”, which automatically managed multiple academic access keys and kept data collection running despite these limits.
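The general idea of rotating several access keys can be sketched as follows. This is not the authors' token farm, whose internals the article does not describe. It is a minimal illustration that assumes each key has its own rate-limit window and that the API reports when a window resets. The class name TokenPool and its methods are hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TokenPool:
    """Rotate over several API tokens, skipping any that are rate-limited."""

    def __init__(self, tokens: List[str], clock: Callable[[], float] = time.time) -> None:
        if not tokens:
            raise ValueError("at least one token is required")
        self._clock = clock
        # Map each token to the time (epoch seconds) at which it is usable again.
        self._available_at: Dict[str, float] = {token: 0.0 for token in tokens}

    def acquire(self) -> Tuple[Optional[str], float]:
        """Return (token, 0.0) if one is usable, else (None, seconds_to_wait)."""
        now = self._clock()
        for token, ready in self._available_at.items():
            if ready <= now:
                return token, 0.0
        # Every token is exhausted: report the wait until the earliest reset.
        return None, min(self._available_at.values()) - now

    def mark_limited(self, token: str, reset_at: float) -> None:
        """Record that `token` hit its rate limit until `reset_at` (epoch seconds)."""
        self._available_at[token] = reset_at


if __name__ == "__main__":
    pool = TokenPool(["token-a", "token-b"])  # placeholder tokens
    token, wait = pool.acquire()
    print(token, wait)
```

A collector built on this sketch would call acquire() before each request, call mark_limited() when the API signals that a key is exhausted, and sleep for the returned wait time only when no key is available.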

Data collection lasted one year and involved storing and processing an extensive volume of information, which required custom technical solutions for organizing, structuring, and analyzing the data. Limited infrastructure posed additional challenges, which the team addressed with open-source libraries and optimized Python programming environments.

According to the authors, working with big social data requires interdisciplinary expertise and a balance between technical and sociopolitical knowledge, both of which are essential for extracting informational and analytical value from large-scale datasets.

Among the study’s key findings, the authors highlight that the discontinuation of Twitter’s academic API, announced after the platform’s acquisition by Elon Musk, makes it unlikely that future data collections of similar scale will be feasible for research institutions. Based on current API pricing, reproducing the ITED-Br dataset would cost over 1.5 million Brazilian reais, underscoring its scientific and historical significance.

The study also draws attention to the limits of public access to digital data: while Twitter interactions are technically public, transforming them into meaningful information requires specialized expertise and substantial infrastructure.


Author Profiles

Sylvia Iasulaitis holds a PhD in Political Science from the Federal University of São Carlos (UFSCar), where she is a Professor. She is a permanent faculty member in the Graduate Programs in Science, Technology, and Society and in Information Science, and currently coordinates the Social Sciences program. She leads the INTERFACES – Center for Sociopolitical Studies on Algorithms and Artificial Intelligence, certified by CNPq. Her work focuses on Computational Social Science and Social Data Science.

Alan Demétrius Baria Valejo is an Associate Professor and researcher in the Department of Computing at UFSCar. He earned his Bachelor’s in Computer Science (ICMC-USP, 2012), Master’s (2014), and PhD (2019) in Computer Science and Computational Mathematics from ICMC-USP. In 2020, he completed a Postdoctoral Fellowship at USP (FFCLRP-USP), funded by FAPESP.

Bruno Cardoso Greco works in the fields of Information and Computer Science, with an emphasis on Software Engineering and Information Theory. He is a member of INTERFACES, certified by CNPq.

Vinicius Gonçalves Perillo is an undergraduate student in Computer Science at UFSCar, specializing in Machine Learning and Data Science. He is a research assistant at the INTERFACES group.

Guilherme Henrique Messias is an undergraduate student in Computer Science at UFSCar.

Isabella Vicari is a PhD candidate in Political Science at UFSCar, with a Master’s in Science, Technology, and Society (2024) and a Bachelor’s in Social Sciences (2021), both from UFSCar, with a dual concentration in Political Science and Sociology.


Technical Information

Title: The Interfaces Twitter Elections Dataset: Construction Process and Characteristics of Big Social Data During the 2022 Presidential Elections in Brazil
Authors: Sylvia Iasulaitis, Alan Demétrius Baria Valejo, Bruno Cardoso Greco, Vinicius Gonçalves Perillo, Guilherme Henrique Messias, and Isabella Vicari – INTERFACES Group
Year: 2025
Published in: PLOS One, vol. 20, no. 2
Dataset: ITED-Br, available on GitHub

Tags: Brazil, Data, Data analysis, Data science, elections, Information, Political Science, Research, Research notes, Social science, Twitter

© Copyright 2019 - 2025 | Brazilian Research and Studies Center BraS | All Rights Reserved | Webmaster Matheus Zago
