Volume 14, No. 12
CBench: Demonstrating Comprehensive Evaluation of Question Answering Systems over Knowledge Graphs Through Deep Analysis of Benchmarks
Abstract
A plethora of question answering (QA) systems that retrieve answers to natural language questions from knowledge graphs have been developed in recent years. However, choosing a benchmark that accurately assesses the quality of a QA system is challenging because the available benchmarks vary widely in their fine-grained properties. In this demonstration, we introduce CBench, an extensible and more informative benchmarking suite for analyzing benchmarks and evaluating QA systems. CBench can analyze existing benchmarks with respect to several fine-grained linguistic, syntactic, and structural properties of their questions and queries. Moreover, CBench facilitates the evaluation of QA systems using a set of popular benchmarks, which can be augmented with user-provided benchmarks. CBench not only evaluates a QA system using popular single-number metrics but also gives a detailed analysis of the linguistic, syntactic, and structural properties of answered and unanswered questions, helping developers of QA systems better understand where their system excels and where it struggles.
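To illustrate the difference between a single-number metric and a property-level breakdown, the following is a minimal sketch in Python. It is not CBench's implementation or API: the record layout, the function names, and the use of the leading question word as a linguistic property are all assumptions made for this example.

```python
from collections import defaultdict


def f1_score(gold, predicted):
    """Set-based F1 between the gold and predicted answer sets."""
    gold, predicted = set(gold), set(predicted)
    if not gold and not predicted:
        return 1.0  # both empty: treated as a perfect match
    overlap = len(gold & predicted)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def question_type(question):
    """Hypothetical linguistic property: the leading question word."""
    words = question.strip().split()
    first = words[0].lower() if words else ""
    known = {"what", "who", "when", "where", "which", "how"}
    return first if first in known else "other"


def breakdown_by_property(records, prop=question_type):
    """Return the overall mean F1 and the mean F1 per property value."""
    buckets = defaultdict(list)
    for rec in records:
        score = f1_score(rec["gold"], rec["predicted"])
        buckets[prop(rec["question"])].append(score)
    per_group = {key: sum(vals) / len(vals) for key, vals in buckets.items()}
    all_scores = [s for vals in buckets.values() for s in vals]
    overall = sum(all_scores) / len(all_scores) if all_scores else 0.0
    return overall, per_group


# Illustrative records only; not drawn from any real benchmark.
records = [
    {"question": "Who wrote Dune?",
     "gold": ["Frank Herbert"], "predicted": ["Frank Herbert"]},
    {"question": "Who directed Alien?",
     "gold": ["Ridley Scott"], "predicted": []},
    {"question": "How many moons does Mars have?",
     "gold": ["2"], "predicted": ["2"]},
]

overall, per_group = breakdown_by_property(records)
print(f"overall F1: {overall:.2f}")
for group, score in sorted(per_group.items()):
    print(f"{group}: {score:.2f}")
```

In this toy data, the overall score hides the fact that the unanswered question falls in the "who" group, which is the kind of detail a property-level analysis is meant to surface.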
PVLDB is part of the VLDB Endowment Inc.