Search

Showing items 1-6 of 6

A modular metadata extraction system for born-digital articles

Tkaczyk, Dominika; Bolikowski, Łukasz; Czeczko, Artur; Rusek, Krzysztof (2012-03-27)

We present a comprehensive system for extracting metadata from scholarly articles. In our approach the entire document is inspected, including headers and footers of all the pages as well as bibliographic references. ...

Data model for analysis of scholarly documents in the MapReduce paradigm

Kawa, Adam; Bolikowski, Łukasz; Czeczko, Artur; Dendek, Piotr Jan; Tkaczyk, Dominika (Springer, 2013)

At CEON ICM UW we are in possession of a large collection of scholarly documents that we store and process using the MapReduce paradigm. One of the main challenges is to design a simple, but effective data model that fits ...

Workflow of metadata extraction from retro-born-digital documents

Tkaczyk, Dominika; Bolikowski, Łukasz (2011-06-27)

In this work-in-progress report we propose a workflow for metadata extraction from articles in a digital form. We decompose the problem into clearly defined sub-tasks and outline possible implementations of the sub-tasks. ...

Workflow of metadata extraction from retro-born-digital documents

Tkaczyk, Dominika; Bolikowski, Łukasz (2011-07-13)

Methodology for evaluating citation parsing and matching

Fedoryszak, Mateusz; Bolikowski, Łukasz; Tkaczyk, Dominika; Wojciechowski, Krzysztof (Springer, 2013)

Bibliographic references between scholarly publications contain valuable information for researchers and developers involved with digital repositories. They are indicators of topical similarity between linked texts, impact ...

GROTOAP: GROund Truth for Open Access Publications

Tkaczyk, Dominika; Czeczko, Artur; Rusek, Krzysztof; Bolikowski, Łukasz; Bogacewicz, Roman (ACM, 2012-06)

The field of digital document content analysis includes many important tasks, for example page segmentation or zone classification. It is impossible to build effective solutions for such problems and evaluate their performance ...
