CBench: Demonstrating Comprehensive Evaluation of Question Answering Systems over Knowledge Graphs Through Deep Analysis of Benchmarks

Authors:

Abdelghny Orogat (Carleton University), Ahmed El-Roby (Carleton University)

Download PDF

Abstract

A plethora of question answering (QA) systems that retrieve answers to natural language questions from knowledge graphs have been developed in recent years. However, choosing a benchmark to accurately assess the quality of a question answering system is a challenging task due to the high degree of variations among the available benchmarks with respect to their fine-grained properties. In this demonstration, we introduce CBench, an extensible, and more informative benchmarking suite for analyzing benchmarks and evaluating QA systems. CBench can be used to analyze existing benchmarks with respect to several fine-grained linguistic, syntactic, and structural properties of the questions and queries in the benchmarks. Moreover, CBench can be used to facilitate the evaluation of QA systems using a set of popular benchmarks that can be augmented with other user-provided benchmarks. CBench not only evaluates a QA system based on popular single-number metrics but also gives a detailed analysis of the linguistic, syntactic, and structural properties of answered and unanswered questions to help the developers of QA systems to better understand where their system excels and where it struggles.

PVLDB is part of the VLDB Endowment Inc.

Start

Current Submission

All Volumes

Reproducibility

General Information

Volume 14, No. 12

CBench: Demonstrating Comprehensive Evaluation of Question Answering Systems over Knowledge Graphs Through Deep Analysis of Benchmarks

Abstract