Volume 16, No. 10

Autonomously Computable Information Extraction

Authors:
Besat Kassaie, Frank Wm. Tompa

Abstract

Most optimization techniques deployed in information extraction systems assume that source documents are static. Instead, extracted relations can be considered to be materialized views defined by a language built on regular expressions. Using this perspective, we can provide an efficient verifier (using static analysis) that can be used to avoid the high cost of re-extracting information after an update. In particular, we propose an efficient mechanism to identify updates for which we can autonomously compute an extracted relation. We present experimental results that support the feasibility and practicality of this mechanism in real-world extraction systems.

PVLDB is part of the VLDB Endowment Inc.
