Using pre-trained language models for abstractive DBPEDIA summarization: a comparative study
- dc.contributor.author Zahera, Hamada M.
- dc.contributor.author Vitiugin, Fedor
- dc.contributor.author Sherif, Mohamed Ahmed
- dc.contributor.author Castillo, Carlos
- dc.contributor.author Ngonga Ngomo, Axel-Cyrille
- dc.date.accessioned 2025-05-26T07:14:23Z
- dc.date.available 2025-05-26T07:14:23Z
- dc.date.issued 2023
- dc.description Paper presented at the 19th International Conference on Semantic Systems, held from 20 to 22 September 2023 in Leipzig, Germany.
- dc.description.abstract Purpose: This study addresses the limitations of the current short abstracts of DBPEDIA entities, which often lack a comprehensive overview because of the way they are created (i.e., by selecting the first two to three sentences of the full DBPEDIA abstracts). Methodology: We leverage pre-trained language models to generate abstractive summaries of DBPEDIA abstracts in six languages (English, French, German, Italian, Spanish, and Dutch). We performed several experiments to assess the quality of the summaries generated by the language models. In particular, we evaluated the generated summaries using human judgments and automated metrics (Self-ROUGE and BERTScore). Additionally, we studied the correlation between human judgments and automated metrics in evaluating the generated summaries along four aspects: informativeness, coherence, conciseness, and fluency. Findings: Pre-trained language models generate summaries that are more concise and informative than the existing short abstracts. Specifically, BART-based models effectively overcome the limitations of DBPEDIA short abstracts, especially for longer ones. Moreover, we show that BERTScore and ROUGE-1 are reliable metrics for assessing the informativeness and coherence of the generated summaries with respect to the full DBPEDIA abstracts. We also find a negative correlation between conciseness and human ratings. Furthermore, fluency evaluation remains challenging without human judgment. Value: This study has significant implications for various applications in machine learning and natural language processing that rely on DBPEDIA resources. By providing succinct and comprehensive summaries, our approach enhances the quality of DBPEDIA abstracts and contributes to the semantic web community.
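As an illustration of the pipeline the abstract describes (this is a minimal sketch, not the authors' released code), the following Python snippet generates an abstractive summary of a sample DBPEDIA abstract with a generic BART checkpoint and scores it against the full abstract using BERTScore and ROUGE-1, the two automated metrics the study reports as reliable; the model name, example text, and length limits are assumptions for illustration.

```python
# Illustrative sketch only: abstractive summarization of a DBPEDIA abstract
# with a pre-trained BART model, scored with BERTScore and ROUGE-1.
# Assumes the `transformers`, `bert-score`, and `rouge-score` packages are
# installed; the checkpoint and example text are hypothetical choices.
from transformers import pipeline
from bert_score import score as bert_score
from rouge_score import rouge_scorer

full_abstract = (
    "Leipzig is the most populous city in the German state of Saxony. "
    "With a population of about 600,000 inhabitants, it is Germany's "
    "eighth most populous city."
)

# BART-based abstractive summarizer (a generic pretrained checkpoint here;
# the paper compares several pre-trained language models).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(full_abstract, max_length=40, min_length=10)[0]["summary_text"]

# BERTScore: semantic similarity between the summary and the full abstract.
_, _, f1 = bert_score([summary], [full_abstract], lang="en")

# ROUGE-1: unigram overlap, with the full abstract as the reference.
scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)
rouge1 = scorer.score(full_abstract, summary)["rouge1"].fmeasure

print(f"Summary: {summary}")
print(f"BERTScore F1: {f1.item():.3f}  ROUGE-1 F1: {rouge1:.3f}")
```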
- dc.description.sponsorship This work has been supported by the German Federal Ministry of Education and Research (BMBF) through the EuroStars project E!114154 PORQUE (grant no. 01QE2056C) and the KIAM project (grant no. 02L19C115). Additionally, this work has been partially supported by: the Department of Research and Universities of the Government of Catalonia (SGR00930), the Ministry of Science and Innovation of Spain with the project COMCRISIS (reference code PID2019-109064GB-I00), the EU-funded SoBigData++ project under Grant Agreement 871042, and MCIN/AEI/10.13039/501100011033 under the Maria de Maeztu Units of Excellence Programme (CEX2021-001195-M).
- dc.format.mimetype application/pdf
- dc.identifier.citation Zahera HM, Vitiugin F, Sherif MA, Castillo C, Ngonga Ngomo AC. Using pre-trained language models for abstractive DBPEDIA summarization: a comparative study. In: Acosta M, Peroni S, Vahdati S, Gentile AL, Pellegrini T, Kalo JC, editors. Knowledge graphs: semantics, machine learning, and languages. Proceedings of the 19th International Conference on Semantic Systems; 2023 Sep 20-22; Leipzig, Germany. Amsterdam: IOS Press; 2023. p. 19-37. DOI: 10.3233/SSW230003
- dc.identifier.doi http://dx.doi.org/10.3233/SSW230003
- dc.identifier.isbn 9781643684246
- dc.identifier.uri http://hdl.handle.net/10230/70493
- dc.language.iso eng
- dc.publisher IOS Press
- dc.relation.ispartof Acosta M, Peroni S, Vahdati S, Gentile AL, Pellegrini T, Kalo JC, editors. Knowledge graphs: semantics, machine learning, and languages. Proceedings of the 19th International Conference on Semantic Systems; 2023 Sep 20-22; Leipzig, Germany. Amsterdam: IOS Press; 2023. p. 19-37
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/871042
- dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PID2019-109064GB-I00
- dc.rights © 2023 The Authors. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by-nc/4.0/
- dc.subject.keyword Abstractive summarization
- dc.subject.keyword Large language models
- dc.subject.keyword Knowledge graphs
- dc.title Using pre-trained language models for abstractive DBPEDIA summarization: a comparative study
- dc.type info:eu-repo/semantics/conferenceObject
- dc.type.version info:eu-repo/semantics/publishedVersion