Efficient Query Processing Over Inconsistent Databases

Author : Ariel Damián Fuxman
Publisher :
Total Pages : 420
Release : 2007
ISBN-10 : 0494279095
ISBN-13 : 9780494279090

Synopsis of Efficient Query Processing Over Inconsistent Databases by Ariel Damián Fuxman

This thesis by Ariel Damián Fuxman was released in 2007 and runs to 420 pages. Its abstract follows.

Although integrity constraints have long been used to maintain data consistency, there are situations in which they may not be enforced or satisfied. In this thesis, we present ConQuer, a system for efficient and scalable answering of SQL queries on databases that may violate a set of constraints. ConQuer permits users to postulate a set of key constraints together with their queries. The system rewrites the queries to retrieve all (and only) data that is consistent with respect to the constraints. The rewriting is into SQL, so the rewritten queries can be efficiently optimized and executed by commercial database systems.

The problem of obtaining consistent answers for primary key constraints and Select-Project-Join (SPJ) queries is known to be intractable in general. However, we identify a large and practical class of SPJ queries for which the problem is tractable. For this class of queries, we provide a query rewriting algorithm that runs in time linear in the size of the query. We consider SPJ queries that may have either set or bag semantics. For the latter case, the queries may also have grouping and aggregation. We show the maximality of the class of queries, in the sense that minimal relaxations of its conditions may lead to intractability.

Finally, we study the efficiency and scalability of the query rewritings on a commercial database system. The study shows that the overhead of the rewritings is reasonable when the original (non-rewritten) queries are taken as a baseline. The experiments use representative queries from TPC-H (the standard benchmark for decision support systems) and databases of up to 20 GB.
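The idea of rewriting a query into SQL so that it returns only consistent data can be illustrated on a toy case. The sketch below is not ConQuer's algorithm or its actual output; it is a hand-written rewriting for one simple selection query, under the standard assumption that a repair keeps exactly one tuple per key value and that a consistent answer is one returned in every repair. The table `customer`, its columns, the sample rows, and all function names are hypothetical and chosen only for illustration. The brute-force check enumerates every repair and is exponential in general, which is exactly what a SQL rewriting avoids.

```python
import itertools
import sqlite3

# Hypothetical toy relation: custkey is the intended key, but it is not enforced.
ROWS = [
    ("c1", 2000),
    ("c1", 100),   # conflicts with the row above: same key, different balance
    ("c2", 2500),
    ("c3", 2600),
    ("c3", 3000),  # also a conflict, but both versions satisfy acctbal > 1000
]

# The query as the user wrote it.
ORIGINAL = "SELECT DISTINCT custkey FROM customer WHERE acctbal > 1000"

# Hand-written rewriting (an assumption of this sketch): keep a key only if
# no tuple sharing that key fails the selection condition.
REWRITTEN = """
SELECT DISTINCT c.custkey
FROM customer c
WHERE c.acctbal > 1000
  AND NOT EXISTS (
        SELECT 1 FROM customer d
        WHERE d.custkey = c.custkey AND NOT (d.acctbal > 1000)
  )
"""


def run(query, rows):
    """Load rows into a fresh in-memory database and return the answer set."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (custkey TEXT, acctbal INTEGER)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    result = {r[0] for r in conn.execute(query)}
    conn.close()
    return result


def repairs(rows):
    """Yield every repair: one tuple chosen per key value."""
    groups = {}
    for row in rows:
        groups.setdefault(row[0], []).append(row)
    for choice in itertools.product(*groups.values()):
        yield list(choice)


def consistent_answers_by_enumeration(query, rows):
    """Answers that appear in the query result on every repair."""
    return set.intersection(*(run(query, rep) for rep in repairs(rows)))


naive = run(ORIGINAL, ROWS)
rewritten = run(REWRITTEN, ROWS)
brute = consistent_answers_by_enumeration(ORIGINAL, ROWS)

print(sorted(naive))      # includes c1, whose balance is in dispute
print(sorted(rewritten))  # only keys whose every version qualifies
print(rewritten == brute)
```

On this data the original query returns `c1` even though one of its two conflicting tuples fails the condition, while the rewritten query agrees with the repair-by-repair enumeration using a single SQL statement that the database engine can optimize.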

