A framework for semantic text clustering

  • dc.contributor.author Upload user
  • dc.contributor.author EL Saili, Chama
  • dc.contributor.author Alaoui, Larbi
  • dc.date.accessioned 2024-02-19T07:24:38Z
  • dc.date.available 2024-02-19T07:24:38Z
  • dc.date.issued 2020
  • dc.description.abstract Existing approaches to text clustering are agglomerative, divisive, or based on frequent itemsets. However, most of the suggested solutions do not take the semantic associations between words into account, and documents are regarded only as bags of unrelated words. Traditional text clustering methods usually focus on the frequency of terms in documents to create connected, homogeneous clusters without considering the associated semantics, which leads to inaccurate clustering results. Accordingly, this research aims to capture the meanings of text phrases during clustering in order to make fuller use of documents. The semantic web framework offers useful techniques for making substantial use of data. The goal is to exploit these techniques, making full use of the Resource Description Framework (RDF) to represent textual data as triples. To arrive at a more effective clustering method, we provide a semantic representation of the data in texts, on which the clustering process is based. In addition, this study implements other techniques within the clustering process, such as ontology representation, to manipulate and extract meaningful information using RDF, RDF Schema (RDFS), and the Web Ontology Language (OWL). Since text clustering is an indispensable task for better exploitation of documents, documents can be used more intelligently when semantics are considered during clustering, so that the most closely related groups in a document collection are identified efficiently. To this end, the proposed framework combines machine learning tools with semantic web principles. The framework supports RDF representation of documents, clustering, topic modeling, cluster summarization, and information retrieval based on RDF querying and reasoning tools. It also highlights the advantages of using semantic web techniques in clustering, topic modeling, and knowledge extraction based on querying, reasoning, and inferencing.
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Fatimi S, EL Saili C, Alaoui L. A framework for semantic text clustering. Int J Adv Comput Sci Appl. 2020;11(6):451-9. DOI: 10.14569/IJACSA.2020.0110657
  • dc.identifier.doi http://dx.doi.org/10.14569/IJACSA.2020.0110657
  • dc.identifier.issn 2158-107X
  • dc.identifier.uri http://hdl.handle.net/10230/59127
  • dc.language.iso eng
  • dc.publisher SAI Organization
  • dc.relation.ispartof International Journal of Advanced Computer Science and Applications (IJACSA). 2020;11(6):451-9.
  • dc.rights This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri http://creativecommons.org/licenses/by/4.0/
  • dc.subject.keyword Text clustering
  • dc.subject.keyword similarity measure
  • dc.subject.keyword ontology
  • dc.subject.keyword semantic web
  • dc.subject.keyword RDF
  • dc.subject.keyword RDFS
  • dc.subject.keyword OWL
  • dc.subject.keyword reasoning
  • dc.subject.keyword inferencing rules
  • dc.subject.keyword SPARQL
  • dc.subject.keyword topic modeling
  • dc.subject.keyword summarization
  • dc.title A framework for semantic text clustering
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion