A framework for semantic text clustering

dc.contributor.authorUsuari de càrrega
dc.contributor.authorEL Saili, Chama
dc.contributor.authorAlaoui, Larbi
dc.date.accessioned2024-02-19T07:24:38Z
dc.date.available2024-02-19T07:24:38Z
dc.date.issued2020
dc.description.abstractExisting approaches to text clustering are agglomerative, divisive, or based on frequent itemsets. However, most of the suggested solutions do not take the semantic associations between words into account, and documents are regarded only as bags of unrelated words. Indeed, traditional text clustering methods usually rely on the frequency of terms in documents to create connected, homogeneous clusters without considering the associated semantics, which leads to inaccurate clustering results. Accordingly, this research aims to capture the meanings of text phrases during clustering in order to make the fullest use of documents. The semantic web framework offers many useful techniques that make substantial use of databases possible. The goal is to exploit these techniques, making full use of the Resource Description Framework (RDF) to represent textual data as triples. To arrive at a more effective clustering method, we provide a semantic representation of the data in texts, on which the clustering process is based. In addition, this study implements other techniques within the clustering process, such as ontology representation, to manipulate and extract meaningful information using RDF, RDF Schema (RDFS), and the Web Ontology Language (OWL). Since text clustering is an indispensable task for better exploitation of documents, taking semantics into account during clustering allows documents to be used more intelligently and the most closely related groups in a document collection to be identified efficiently. To this end, the proposed framework combines machine learning tools with semantic web principles. The framework supports RDF representation of documents, clustering, topic modeling, cluster summarization, and information retrieval based on RDF querying and reasoning tools.
It also highlights the advantages of using semantic web techniques in clustering, topic modeling, and knowledge extraction based on querying, reasoning, and inference.
dc.format.mimetypeapplication/pdf
dc.identifier.citationFatimi S, EL Saili C, Alaoui L. A framework for semantic text clustering. Int J Adv Comput Sci Appl. 2020;11(6):451-9. DOI: 10.14569/IJACSA.2020.0110657
dc.identifier.doihttp://dx.doi.org/10.14569/IJACSA.2020.0110657
dc.identifier.issn2158-107X
dc.identifier.urihttp://hdl.handle.net/10230/59127
dc.language.isoeng
dc.publisherSAI Organization
dc.relation.ispartofInternational Journal of Advanced Computer Science and Applications (IJACSA). 2020;11(6):451-9.
dc.rightsThis is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.
dc.rights.accessRightsinfo:eu-repo/semantics/openAccess
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/
dc.subject.keywordText clustering
dc.subject.keywordsimilarity measure
dc.subject.keywordontology
dc.subject.keywordsemantic web
dc.subject.keywordRDF
dc.subject.keywordRDFS
dc.subject.keywordOWL
dc.subject.keywordreasoning
dc.subject.keywordinferencing rules
dc.subject.keywordSPARQL
dc.subject.keywordtopic modeling
dc.subject.keywordsummarization
dc.titleA framework for semantic text clustering
dc.typeinfo:eu-repo/semantics/article
dc.type.versioninfo:eu-repo/semantics/publishedVersion

Files

Original bundle

Name:
Alaoui_ija_fram.pdf
Size:
322.49 KB
Format:
Adobe Portable Document Format
