Parallel Set Operations in Complex Object-Oriented Queries

Haddleton, Russell F., Department of Computer Science, University of Virginia
Pfaltz, John, Engineering/Computer Science, University of Virginia
Batson, Alan P., Department of Computer Science, University of Virginia
Son, Sang, En-Comp Science Dept, University of Virginia
French, James C., Department of Computer Science, University of Virginia
Mayer, Margaret, Department of Systems Engineering, University of Virginia

This dissertation presents a new parallel object-oriented database system architecture and its implementation. We have implemented the system, parallel ADAMS, to suit large-scale scientific database applications, in which the retrieval of complex data from very large collections is a primary operation.

Aside from being a parallel implementation, parallel ADAMS differs from typical OODBMSs in three significant ways: (1) it employs the decomposed storage model rather than contiguous object storage, (2) it is based on a query server architecture, and (3) it employs a shared-nothing distributed architecture.

Parallel ADAMS sets are partitioned by oid. In the dissertation, we demonstrate that set operators, and therefore logical query connectives, can be performed in a completely data-parallel fashion. More complex queries involving implicit joins require inter-processor communication, which is minimized in our implementation.
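The data-parallel property can be illustrated with a minimal sketch; it is not the ADAMS implementation. It assumes integer oids and a simple modulo partitioning rule, and the function names (partition_by_oid, parallel_set_op, gather) are hypothetical. Because both operand sets are partitioned by the same rule, a given oid can only appear in the same partition of each set, so every partition pair can be combined independently and no data needs to move between partitions.

```python
from typing import Callable, List, Set

OidSet = Set[int]


def partition_by_oid(oids: OidSet, n_partitions: int) -> List[OidSet]:
    """Split a set of oids into n_partitions disjoint subsets.

    Assumption: oids are integers and the partition is oid % n_partitions.
    """
    parts: List[OidSet] = [set() for _ in range(n_partitions)]
    for oid in oids:
        parts[oid % n_partitions].add(oid)
    return parts


def parallel_set_op(
    a_parts: List[OidSet],
    b_parts: List[OidSet],
    op: Callable[[OidSet, OidSet], OidSet],
) -> List[OidSet]:
    """Apply a set operator partition by partition.

    Each (a_parts[i], b_parts[i]) pair is independent of every other pair,
    so each iteration stands in for work done locally on one processor.
    """
    return [op(a, b) for a, b in zip(a_parts, b_parts)]


def gather(parts: List[OidSet]) -> OidSet:
    """Collect the per-partition results into one set (for checking only)."""
    result: OidSet = set()
    for part in parts:
        result |= part
    return result


if __name__ == "__main__":
    a = set(range(0, 20))
    b = set(range(10, 30))
    a_parts = partition_by_oid(a, 4)
    b_parts = partition_by_oid(b, 4)
    both = gather(parallel_set_op(a_parts, b_parts, lambda x, y: x & y))
    print(sorted(both))
```

In this sketch, intersection, union, and difference correspond to the query connectives AND, OR, and AND NOT, and the gathered result equals the result of applying the same operator to the unpartitioned sets.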

The implementation runs on general-purpose hardware. Results are provided for groups of 1 to 8 SUN processors. We observe “superlinear” speedup and nearly linear scaleup for queries over a one-million-object (500-megabyte) database. In addition to measuring parallel performance in terms of time, we develop a formal model of behavior that could also be applied to other database implementations. We model data movement in terms of primitive operations. Given accurate times for these primitive operations, performance times can be predicted.
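The general idea of predicting performance from primitive operation times can be sketched as follows. This is an illustrative assumption, not the model developed in the dissertation: it takes the predicted time to be a simple sum of operation counts multiplied by per-operation times, and the primitive names used in the example are hypothetical. The speedup helper uses the standard definition (one-processor time divided by p-processor time), under which speedup is superlinear when it exceeds p.

```python
from typing import Dict


def predict_time(op_counts: Dict[str, float], op_times: Dict[str, float]) -> float:
    """Predict elapsed time as the sum of count * unit time per primitive.

    Assumption: primitives do not overlap, so their costs simply add.
    """
    return sum(count * op_times[name] for name, count in op_counts.items())


def speedup(time_one_processor: float, time_p_processors: float) -> float:
    """Standard speedup: one-processor time over p-processor time."""
    return time_one_processor / time_p_processors


def is_superlinear(time_one_processor: float, time_p_processors: float, p: int) -> bool:
    """Speedup is superlinear when it exceeds the processor count p."""
    return speedup(time_one_processor, time_p_processors) > p


if __name__ == "__main__":
    # Hypothetical primitives and unit times in seconds, for illustration only.
    unit_times = {"disk_read": 0.5, "message_send": 1.0}
    counts = {"disk_read": 2, "message_send": 3}
    print(predict_time(counts, unit_times))
```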

We use the model to explain the behavior of the parallel ADAMS system, and use extensive tests of the system to validate the model.

ADAMS is a working system that supports the popular “Oracle of Bacon” web site.

PHD (Doctor of Philosophy)
All rights reserved (no additional license for public reuse)