Welcome to the UPF Digital Repository

Topic detection using the DBSCAN-Martingale and the time operator


dc.contributor.author Gialampoukidis, Ilias
dc.contributor.author Vrochidis, Stefanos
dc.contributor.author Kompatsiaris, Ioannis
dc.contributor.author Antoniou, Ioannis
dc.date.accessioned 2017-09-14T12:19:56Z
dc.date.available 2017-09-14T12:19:56Z
dc.date.issued 2017
dc.identifier.citation Gialampoukidis I, Vrochidis S, Kompatsiaris I, Antoniou I. Topic detection using the DBSCAN-Martingale and the time operator. Presented at: The 17th Conference of the Applied Stochastic Models and Data Analysis (ASMDA); 2017 June 6-9; London, UK. [8 p.].
dc.identifier.uri http://hdl.handle.net/10230/32762
dc.description Paper presented at: The 17th Conference of the Applied Stochastic Models and Data Analysis (ASMDA), held 6-9 June 2017 in London, United Kingdom.
dc.description.abstract Topic detection is usually considered as a decision process implemented in some relevant context, for example clustering. In this case, clusters correspond to topics that should be identified. Density-based clustering, for example, uses only a density level ε and a lower bound for the number of points in a cluster. As the density level is hard to estimate, a stochastic process, called the DBSCAN-Martingale, is constructed to combine several outputs of DBSCAN for various values of ε selected at random from the uniform distribution on a predefined closed interval [0, ε_max]. We have observed that most of the clusters are extracted in the interval [0, ε_max/2], and moreover that in the interval [ε_max/2, ε_max] the DBSCAN-Martingale stochastic process is less innovative, i.e. it extracts only a few or no clusters. Therefore, non-symmetric skewed distributions are needed to generate density levels for the fast extraction of all clusters. In this work we show that skewed distributions may be used instead of the uniform, so as to extract all clusters as quickly as possible. Experiments on real datasets show that the average innovation time of the DBSCAN-Martingale stochastic process is reduced when skewed distributions are employed, so less time is needed to extract all clusters.
dc.description.sponsorship The first author would like to thank the Research Committee of the Aristotle University of Thessaloniki for awarding him the "Aristeia" postdoctoral scholarship 2016. Moreover, this work has been partially supported by the EC-funded project KRISTINA (H2020-645012).
dc.format.mimetype application/pdf
dc.language.iso eng
dc.relation.ispartof The 17th Conference of the Applied Stochastic Models and Data Analysis (ASMDA); 2017 June 6-9; London, UK. [8 p.].
dc.rights © 2017 ISAST
dc.title Topic detection using the DBSCAN-Martingale and the time operator
dc.type info:eu-repo/semantics/conferenceObject
dc.subject.keyword DBSCAN-Martingale
dc.subject.keyword Time operator
dc.subject.keyword Skewed distributions
dc.subject.keyword Internal age
dc.subject.keyword Density-based clustering
dc.subject.keyword Innovation process
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/645012
dc.rights.accessRights info:eu-repo/semantics/openAccess
dc.type.version info:eu-repo/semantics/publishedVersion
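
The procedure summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (dbscan, dbscan_martingale, sample_eps), the toy points, and the rule of keeping a cluster only when it contains at least min_pts not-yet-clustered points are simplifying assumptions made for illustration. The record does not say which skewed distribution the paper uses; a Beta(1, 3) distribution rescaled to [0, eps_max] stands in here purely as an example of a distribution that favours small density levels.

```python
import random

def dbscan(points, eps, min_pts):
    """Plain DBSCAN on 2-D points; returns one label per point (0 = noise)."""
    n = len(points)

    def neighbors(i):
        xi, yi = points[i]
        return [j for j in range(n)
                if (points[j][0] - xi) ** 2 + (points[j][1] - yi) ** 2 <= eps ** 2]

    labels = [None] * n              # None = not visited yet
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:        # not a core point: noise for now
            labels[i] = 0
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == 0:       # former noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbj = neighbors(j)
            if len(nbj) >= min_pts:  # expand only through core points
                queue.extend(nbj)
    return labels

def dbscan_martingale(points, eps_values, min_pts):
    """Combine DBSCAN runs over increasing eps, keeping only new clusters.

    Assumed rule: a cluster counts as new if it has at least min_pts
    points that no earlier stage has assigned to a cluster.
    """
    final = [0] * len(points)
    n_clusters = 0
    for eps in sorted(eps_values):
        labels = dbscan(points, eps, min_pts)
        for c in sorted(set(labels) - {0}):
            fresh = [i for i, l in enumerate(labels) if l == c and final[i] == 0]
            if len(fresh) >= min_pts:
                n_clusters += 1
                for i in fresh:
                    final[i] = n_clusters
    return final, n_clusters

def sample_eps(eps_max, count, skewed, rng):
    """Draw density levels in [0, eps_max], uniform or skewed (assumed Beta(1, 3))."""
    if skewed:
        return [eps_max * rng.betavariate(1, 3) for _ in range(count)]
    return [rng.uniform(0, eps_max) for _ in range(count)]

# Toy data: one dense group near the origin and one sparse group near (10, 10).
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (10, 10), (11, 10), (10, 11), (11, 11)]
labels, k = dbscan_martingale(points, [1.2, 0.15], min_pts=3)
print(labels, k)

rng = random.Random(0)
print(sample_eps(2.0, 5, skewed=True, rng=rng))
```

With the two fixed density levels, the dense group is extracted at the smaller level and the sparse group at the larger one. Replacing the fixed list with sample_eps(..., skewed=True, ...) concentrates the draws near zero, which is the intuition behind the abstract's claim that skewed distributions shorten the time needed to extract all clusters.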

