Tagging Scientific Publications using Wikipedia and Natural Language Processing Tools. Comparison on the ArXiv Dataset
dc.contributor.author | Łopuszyński, Michał | |
dc.contributor.author | Bolikowski, Łukasz | |
dc.date.accessioned | 2015-01-27T10:20:07Z | |
dc.date.available | 2015-01-27T10:20:07Z | |
dc.date.issued | 2014 | |
dc.identifier.uri | http://dx.doi.org/10.1007/978-3-319-08425-1_3 | |
dc.identifier.uri | https://depot.ceon.pl/handle/123456789/6095 | |
dc.description.abstract | In this work, we compare two simple methods of tagging scientific publications with labels reflecting their content. The first method draws its labels from Wikipedia; the second builds its label set from the noun phrases occurring in the analyzed corpus. We examine the statistical properties and the effectiveness of both approaches on a dataset consisting of abstracts from 0.7 million scientific documents deposited in the ArXiv preprint collection. We believe that the obtained tags can later be applied as useful document features in various machine learning tasks (document similarity, clustering, topic modelling, etc.). | en |
dc.language.iso | en | pl_PL |
dc.publisher | Springer | pl_PL |
dc.relation.ispartofseries | Communications in Computer and Information Science;416 | |
dc.rights | Fair use | |
dc.subject | Wikipedia | en |
dc.subject | natural language processing | en |
dc.subject | tagging document collections | en |
dc.title | Tagging Scientific Publications using Wikipedia and Natural Language Processing Tools. Comparison on the ArXiv Dataset | en |
dc.type | info:eu-repo/semantics/conferenceObject | pl_PL |
dc.contributor.organization | Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw | en |
dc.description.eperson | Michał Łopuszyński |
Use of this material is permitted in accordance with the applicable provisions on fair use or other exceptions provided by law; use to a broader extent requires the consent of the rights holder.