Pre-indexing pruning strategies
| dc.contributor.author | Altin, Soner | |
| dc.contributor.author | Baeza-Yates, Ricardo | |
| dc.contributor.author | Cambazoglu, B. Barla | |
| dc.date.accessioned | 2021-05-25T10:35:26Z | |
| dc.date.available | 2021-05-25T10:35:26Z | |
| dc.date.issued | 2020 | |
| dc.description | Paper presented at SPIRE 2020: International Symposium on String Processing and Information Retrieval, held 13-15 October 2020 in Orlando, United States. | |
| dc.description.abstract | We explore different techniques for pruning an inverted index in advance, that is, without building the full index. These techniques provide interesting trade-offs between index size, answer quality and query coverage. We experimentally analyze them on a large public web collection with two different query logs. The trade-offs that we find range from an index of 4% of the full size with 35% of the precision@10 to an index of 46% of the full size with 90% of the precision@10, both relative to the full index. In both cases we cover almost 97% of the query volume. We also perform a relative relevance analysis with a smaller private web collection and query log, finding that some of our techniques reduce the index size by almost 40% while losing less than 2% in NDCG@10. | en |
| dc.format.mimetype | application/pdf | |
| dc.identifier.citation | Altin S, Baeza-Yates R, Cambazoglu B. Pre-indexing pruning strategies. In: Boucher C, Thankachan SV, editors. SPIRE 2020: International Symposium on String Processing and Information Retrieval; 2020 Oct 13-15; Orlando, USA. Cham: Springer; 2020. p.177-93. DOI: 10.1007/978-3-030-59212-7_13 | |
| dc.identifier.doi | http://dx.doi.org/10.1007/978-3-030-59212-7_13 | |
| dc.identifier.uri | http://hdl.handle.net/10230/47646 | |
| dc.language.iso | eng | |
| dc.publisher | Springer | |
| dc.relation.ispartof | Boucher C, Thankachan SV, editors. SPIRE 2020: International Symposium on String Processing and Information Retrieval; 2020 Oct 13-15; Orlando, USA. Cham: Springer; 2020. p.177-93 | |
| dc.rights | © Springer. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-030-59212-7_13 | |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | |
| dc.subject.keyword | Web search | en |
| dc.subject.keyword | Inverted index | en |
| dc.subject.keyword | Index pruning | en |
| dc.subject.keyword | Search efficiency | en |
| dc.title | Pre-indexing pruning strategies | en |
| dc.type | info:eu-repo/semantics/conferenceObject | |
| dc.type.version | info:eu-repo/semantics/acceptedVersion | |
