Visual Detection with Context for Document Layout Analysis

Carlos Soto, Shinjae Yoo

2019 · DOI: 10.18653/v1/D19-1348
Conference on Empirical Methods in Natural Language Processing · 60 Citations

TLDR

Presents a work-in-progress method for visually segmenting key regions of scientific articles, using an object detection technique augmented with contextual features, along with a novel dataset of region-labeled articles. Ongoing work on further improvements is also discussed.

Abstract

We present 1) a work-in-progress method to visually segment key regions of scientific articles using an object detection technique augmented with contextual features, and 2) a novel dataset of region-labeled articles. A continuing challenge in scientific literature mining is the difficulty of consistently extracting high-quality text from formatted PDFs. To address this, we adapt the object detection technique Faster R-CNN for document layout detection, incorporating contextual information that leverages the inherently localized nature of article contents to improve region detection performance. Due to the limited availability of high-quality region labels for scientific articles, we also contribute a novel dataset of region annotations, the first version of which covers 9 region classes and 822 article pages. Initial experimental results demonstrate a 23.9% absolute improvement in mean average precision over the baseline model by incorporating contextual features, and a processing speed 14x faster than a text-based technique. Ongoing work on further improvements is also discussed.
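
The abstract does not say which contextual features are used, so the following is only a minimal sketch of one plausible reading: since article regions tend to sit in predictable places on a page, a candidate region's normalized position and size can be appended to its visual feature vector before classification. The names `Box`, `context_features`, and `augment` are hypothetical and not taken from the paper.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned candidate region in page pixel coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def context_features(box: Box, page_width: float, page_height: float) -> List[float]:
    """Return [center_x, center_y, width, height], each scaled by the page size.

    Assumption: position and size relative to the page stand in for the
    "contextual information" mentioned in the abstract.
    """
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    center_x = (box.x0 + box.x1) / 2 / page_width
    center_y = (box.y0 + box.y1) / 2 / page_height
    width = (box.x1 - box.x0) / page_width
    height = (box.y1 - box.y0) / page_height
    return [center_x, center_y, width, height]


def augment(visual_features: List[float], box: Box,
            page_width: float, page_height: float) -> List[float]:
    """Concatenate a region's visual features with its page-context features."""
    return visual_features + context_features(box, page_width, page_height)
```

For example, a region covering the top-left quarter of a 200 by 100 page yields context features of 0.25, 0.25, 0.5, and 0.5, which a downstream classifier could use alongside the visual features to favor classes that usually appear in that part of a page.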
