Analysis of the user queries of an e-commerce bookstore in terms of the Library of Congress classification and key publishers

  • dc.contributor.author Nettleton, David F.
  • dc.contributor.author Baeza Yates, Ricardo
  • dc.contributor.author Marcos, Mari Carmen
  • dc.date.accessioned 2020-03-05T15:20:07Z
  • dc.date.available 2020-03-05T15:20:07Z
  • dc.date.issued 2013
  • dc.description.abstract Introduction. A key aspect of data mining and its success in extracting useful knowledge is the way in which the data is represented. In this paper we propose representing the relations inherent in an e-commerce bookstore search log as a graph, which allows us to apply and customize graph metrics and algorithms to identify structures and key elements. This approach complements traditional transactional mining by facilitating the identification of underlying structural relationships. Method. The data are pre-processed and represented as a graph which permits the calculation of the descriptive metrics: hubs, bridges and community modularity. These metrics are then interpreted in terms of the book topics (Library of Congress Classification) and publishers. Analysis. The relations between users, books and publishers are studied. We calculate statistics based on the graph metrics and visualize the communities and structure of the graphs. Then we identify the top publishers and categories in terms of the community, hub and bridge structures of the graph. Results. We have successfully represented the Web activity data log as a graph, defining the relations between books and users based on activity; analysed the graphs based on the specific graph metrics of communities, hubs and bridges; and evaluated the utility of the analysis by using the graph structure to identify the key information of interest in terms of top publishers and book categories. Conclusion. We have defined a graph-based method for analysing transactional data which complements traditional transactional mining techniques in order to obtain business knowledge that can be used immediately for cross-selling and recommendation, or, in the medium term, for book catalogue organization.
  • dc.description.sponsorship We would like to thank Sandra Alvarez García of the University of La Coruña, Spain, for the data pre-processing of the book information using their Library of Congress catalogue API. This research is partially supported by the Spanish MEC (project HIPERGRAPH TIN2009-14560-C03-01).
  • dc.format.mimetype application/pdf
  • dc.identifier.citation Nettleton D, Baeza-Yates R, Marcos MC. Analysis of the user queries of an e-Commerce bookstore in terms of the LCC catalogue and publishers. Information Research. 2013 Dec;18(4).
  • dc.identifier.issn 1368-1613
  • dc.identifier.uri http://hdl.handle.net/10230/43811
  • dc.language.iso eng
  • dc.publisher University of Borås
  • dc.relation.ispartof Information Research. 2013 Dec;18(4)
  • dc.relation.projectID info:eu-repo/grantAgreement/ES/3PN/TIN2009-14560-C03-01
  • dc.rights This document is published under a Creative Commons License https://creativecommons.org/licenses/by-nc-nd/3.0/
  • dc.rights.accessRights info:eu-repo/semantics/openAccess
  • dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/3.0/
  • dc.title Analysis of the user queries of an e-commerce bookstore in terms of the Library of Congress classification and key publishers
  • dc.type info:eu-repo/semantics/article
  • dc.type.version info:eu-repo/semantics/publishedVersion
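
The abstract describes representing a search log as a graph and then reading off hubs and bridges. A minimal sketch of that idea follows, under stated assumptions: the toy log, the rule that two books are linked when one user queried both, the use of node degree as the hub score, and the textbook definition of a bridge (an edge whose removal disconnects the graph) are all illustrative choices, not the authors' definitions or data. Community modularity is left out to keep the example short.

```python
from collections import defaultdict

# Hypothetical toy search log of (user_id, book_id) pairs; not data from the paper.
LOG = [
    ("u1", "b1"), ("u1", "b2"), ("u2", "b2"), ("u2", "b3"),
    ("u3", "b3"), ("u3", "b1"), ("u4", "b3"), ("u4", "b4"),
]


def build_book_graph(log):
    """Assumed linking rule: two books are adjacent if one user queried both."""
    books_by_user = defaultdict(set)
    for user, book in log:
        books_by_user[user].add(book)
    adj = defaultdict(set)
    for books in books_by_user.values():
        for a in books:
            adj[a]  # make sure isolated books still appear as nodes
            for b in books:
                if a != b:
                    adj[a].add(b)
    return adj


def top_hubs(adj, k=1):
    """Rank nodes by degree (ties broken by name) and return the top k."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:k]


def find_bridges(adj):
    """Return edges whose removal disconnects the graph (DFS low-link method)."""
    disc, low, bridges = {}, {}, []
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in disc:
                # Back edge: node can reach an earlier-discovered vertex.
                low[node] = min(low[node], disc[nxt])
            else:
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                if low[nxt] > disc[node]:
                    # nxt's subtree cannot reach node or above without this edge.
                    bridges.append(tuple(sorted((node, nxt))))

    for node in list(adj):
        if node not in disc:
            dfs(node, None)
    return sorted(bridges)


if __name__ == "__main__":
    graph = build_book_graph(LOG)
    print("hubs:", top_hubs(graph, 1))
    print("bridges:", find_bridges(graph))
```

In this toy log, books b1, b2 and b3 form a triangle and b4 hangs off b3, so b3 has the highest degree and the b3-b4 link is the only bridge. The recursive search is fine for small graphs; a real log would likely need an iterative version or a graph library.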