Using pre-trained language models for abstractive DBPEDIA summarization: a comparative study
- dc.contributor.author Zahera, Hamada M.
- dc.contributor.author Vitiugin, Fedor
- dc.contributor.author Sherif, Mohamed Ahmed
- dc.contributor.author Castillo, Carlos
- dc.contributor.author Ngonga Ngomo, Axel-Cyrille
- dc.date.accessioned 2025-05-26T07:14:23Z
- dc.date.available 2025-05-26T07:14:23Z
- dc.date.issued 2023
- dc.description Paper presented at the 19th International Conference on Semantic Systems, held from 20 to 22 September 2023 in Leipzig, Germany.
- dc.description.abstract Purpose: This study addresses the limitations of the current short abstracts of DBPEDIA entities, which often lack a comprehensive overview because of the way they are created (i.e., by selecting the first two to three sentences of the full DBPEDIA abstracts). Methodology: We leverage pre-trained language models to generate abstractive summaries of DBPEDIA abstracts in six languages (English, French, German, Italian, Spanish, and Dutch). We performed several experiments to assess the quality of the summaries generated by the language models. In particular, we evaluated the generated summaries using human judgments and automated metrics (Self-ROUGE and BERTScore). Additionally, we studied the correlation between human judgments and automated metrics in evaluating the generated summaries along four aspects: informativeness, coherence, conciseness, and fluency. Findings: Pre-trained language models generate summaries that are more concise and informative than the existing short abstracts. Specifically, BART-based models effectively overcome the limitations of DBPEDIA short abstracts, especially for longer ones. Moreover, we show that BERTScore and ROUGE-1 are reliable metrics for assessing the informativeness and coherence of the generated summaries with respect to the full DBPEDIA abstracts. We also find a negative correlation between conciseness and human ratings. Furthermore, fluency evaluation remains challenging without human judgment. Value: This study has significant implications for various applications in machine learning and natural language processing that rely on DBPEDIA resources. By providing succinct and comprehensive summaries, our approach enhances the quality of DBPEDIA abstracts and contributes to the semantic web community.
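As an illustration of the pipeline the abstract describes (this is a minimal sketch, not the authors' released code), the following Python snippet generates an abstractive summary of a sample DBPEDIA abstract with a generic BART checkpoint and scores it against the full abstract using BERTScore and ROUGE-1, the two automated metrics the study reports as reliable; the model name, example text, and length limits are assumptions for illustration.

```python
# Illustrative sketch only: abstractive summarization of a DBPEDIA abstract
# with a pre-trained BART model, scored with BERTScore and ROUGE-1.
# Assumes the `transformers`, `bert-score`, and `rouge-score` packages are
# installed; the checkpoint and example text are hypothetical choices.
from transformers import pipeline
from bert_score import score as bert_score
from rouge_score import rouge_scorer

full_abstract = (
    "Leipzig is the most populous city in the German state of Saxony. "
    "With a population of about 600,000 inhabitants, it is Germany's "
    "eighth most populous city."
)

# BART-based abstractive summarizer (a generic pretrained checkpoint here;
# the paper compares several pre-trained language models).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(full_abstract, max_length=40, min_length=10)[0]["summary_text"]

# BERTScore: semantic similarity between the summary and the full abstract.
_, _, f1 = bert_score([summary], [full_abstract], lang="en")

# ROUGE-1: unigram overlap, with the full abstract as the reference.
scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)
rouge1 = scorer.score(full_abstract, summary)["rouge1"].fmeasure

print(f"Summary: {summary}")
print(f"BERTScore F1: {f1.item():.3f}  ROUGE-1 F1: {rouge1:.3f}")
```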
- dc.description.sponsorship This work has been supported by the German Federal Ministry of Education and Research (BMBF) through the EuroStars project E!114154 PORQUE (grant no. 01QE2056C) and the KIAM project (grant no. 02L19C115). Additionally, this work has been partially supported by: the Department of Research and Universities of the Government of Catalonia (SGR00930), the Ministry of Science and Innovation of Spain with the project COMCRISIS (reference code PID2019-109064GB-I00), the EU-funded SoBigData++ project under Grant Agreement 871042, and MCIN/AEI/10.13039/501100011033 under the Maria de Maeztu Units of Excellence Programme (CEX2021-001195-M).
- dc.format.mimetype application/pdf
- dc.identifier.citation Zahera HM, Vitiugin F, Sherif MA, Castillo C, Ngonga Ngomo AC. Using pre-trained language models for abstractive DBPEDIA summarization: a comparative study. In: Acosta M, Peroni S, Vahdati S, Gentile AL, Pellegrini T, Kalo JC, editors. Knowledge graphs: semantics, machine learning, and languages. Proceedings of the 19th International Conference on Semantic Systems; 2023 Sep 20-22; Leipzig, Germany. Amsterdam: IOS Press; 2023. p. 19-37. DOI: 10.3233/SSW230003
- dc.identifier.doi http://dx.doi.org/10.3233/SSW230003
- dc.identifier.isbn 9781643684246
- dc.identifier.uri http://hdl.handle.net/10230/70493
- dc.language.iso eng
- dc.publisher IOS Press
- dc.relation.ispartof Acosta M, Peroni S, Vahdati S, Gentile AL, Pellegrini T, Kalo JC, editors. Knowledge graphs: semantics, machine learning, and languages. Proceedings of the 19th International Conference on Semantic Systems; 2023 Sep 20-22; Leipzig, Germany. Amsterdam: IOS Press; 2023. p. 19-37
- dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/871042
- dc.relation.projectID info:eu-repo/grantAgreement/ES/2PE/PID2019-109064GB-I00
- dc.rights © 2023 The Authors. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0).
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by-nc/4.0/
- dc.subject.keyword Abstractive summarization
- dc.subject.keyword Large language models
- dc.subject.keyword Knowledge graphs
- dc.title Using pre-trained language models for abstractive DBPEDIA summarization: a comparative study
- dc.type info:eu-repo/semantics/conferenceObject
- dc.type.version info:eu-repo/semantics/publishedVersion