Mini SQL: A Lightweight Database Engine

or

Designing a Lean, Mean Data Machine



David J. Hughes
Hughes Technologies Pty Ltd
Gold Coast Qld 4229

< Bambi@Hughes.com.au >

Presented at the 1995 QAUUG Summer Technical Conference


20 April 1995


Abstract

It is a known fact that a database management system is more often than not a silent, behind the scenes tool used by many applications. The requirement to store and retrieve data in a centralised and managed manner is common to most uses of computers. The only remaining question is whether the average application requires the complexity and expense of the current commercial offerings in the RDBMS market.

The main-stream commercial database systems are very sophisticated pieces of software providing a vast array of features and functionality. The complexity of these applications is reflected by both the computing resources required by these tools and the purchase price of the products. For an application that requires extremely sophisticated data management these tools are obviously well suited, but for the vast majority of applications they could be viewed as excessive.

Mini SQL (or mSQL as it is known) is a lightweight Relation Database Management System offering the basic subset of features required by all applications. It was originally designed for use by the Minerva Network Management Environment but is now commonly used at thousands of sites for various data management tasks. This paper discusses the origins of mSQL and provides a detailed examination of its internal design.


Introduction

Mini SQL, or mSQL, is a lightweight database engine designed to provide fast access to stored data with low memory requirements. As its name implies, mSQL offers a subset of SQL as its query interface. Although it only supports a subset of SQL (no views, sub-queries etc.), everything it supports is in accordance with the ANSI SQL specification. The mSQL package includes the database engine, a terminal "monitor" program, a database administration program, a schema viewer, and a C language API. The API and the database engine have been designed to work in a client/server environment over a TCP/IP network.

The decision to write yet another database package was made due to the hole in the range of "free" or "freely available" databases. At the time of writing, there are no other database packages available that support SQL as the query language. The most notable database package for research work, Postgres from the University of California at Berkeley, offers a superset of the original Ingres QUEL known as PostQUEL as its query language.

mSQL has been developed as the database backend for the Minerva(1) Network Management Environment. Originally, Minerva utilised Postgres as its database and generated PostQUEL queries to access it. During initial alpha testing of Minerva, a comment was made that if Minerva generated SQL queries, sites with an existing database installation, such as Ingres or Oracle, could use their commercial databases rather than have to support Postgres as well. To accommodate that wish, mSQL was written initially as an SQL to PostQUEL translator so that sites without commercial database could still use Postgres (seeing as there were no "free" SQL engines available).

As time passed, and Minerva developed further, it became apparent that Postgres was too resource hungry to support the evolving mechanisms provided by Minerva. To gain speed, Minerva was extended to perform monitoring and data acquisition in parallel. Unfortunately, each process that communicated with the database forced another copy of the Postgres backend to be spawned. The fact that each Postgres backend consumes close to 1.5 megabytes of memory soon put a stop to the parallel data acquisition operations.

Although Postgres is a very large and capable package, it is supported on only a handful of platforms. This proved to be a problem as a couple of the original Minerva alpha testers wished to run Minerva on Silicon Graphics machines. Unfortunately, Postgres did not support the SGI machines so they could not participate in the testing. The fact that Minerva itself utilised only a fraction of the features of Postgres and needed to be portable to most platforms proved that tying Minerva to Postgres was not the best option. From that decision Mini SQL was developed.

It should be noted that Postgres is an excellent database package offering a vast array of powerful features and that the above comments in no way try to detract from its success. The fact that Minerva utilises very few database features showed that a database as capable and advanced as Postgres was overkill.

The first public release of mSQL generated a high level of interest. It became apparent that many people had a need for a small, fast SQL engine. Most of these people were attracted to mSQL because it was "freely-available" while others were impressed with its size and speed compared to the large commercial products. Since that time thousands of people have evaluated mSQL both as a free database package for research use and a commercially viable alternative to the large RDBMSs of today.


Design Goals

The primary design goals of mSQL directly reflect the fact that its initial use was as a database for network management. A Network Management System (NMS) in general (and Minerva in particular) generates a large number of very simple queries. A query such as:

	Select status from nodes
  		where nodename=`Hughes.com.au'
is very simple and is the style of query generated frequently by an NMS. If an NMS generates a large number of simple queries in contrast to very few complex queries, the RDBMS must be capable of handling such queries in as little time as possible.

Another important feature for an NMS is the ability to perform data acquisition work in parallel. This implies that the RDBMS must also be able to handle multiple operations, or at least multiple sessions, at the same time. Supporting multiple sessions should not consume vast quantities of system resources and it also should not reduce the performance of the RDBMS.

The final design goal was that the database should support client/server operations. Not only should the data be available via a single API from remote machines over a network, but the data should be transmitted in a manner that allows hosts of different design to share information. Considerations such as host to host byte-ordering and also the length of certain data types must be taken into account.


Client / Server support

mSQL has been implemented to meet the design goals outlined above. It`s implementation is based upon a very simple C API library communicating with the database server process. The communications between the API library and the server are performed via sockets. The mSQL server (msqld) offers two mechanisms via which it can be contacted:

When an application initiates a connection to the mSQL server, the client API library connects to either the local UNIX socket or the TCP socket and negotiates a session.

The reason that both TCP and UNIX domain connections are supported is entirely for performance. Access to the server via the UNIX socket is an order of magnitude faster than accessing it via the TCP socket because the kernel networking code is not invoked. Use of the TCP interface to mSQL requires the data to be packetised and encapsulated as TCP/IP packets even though the server is located on the same machine. The TCP interface should only be used by applications running on remote network hosts.

To facilitate client/server operations, all data sent between the client application and the mSQL server is sent as ASCII text. This fits into the SQL paradigm in which all data is sent to the server in a text based query string. This format allows hosts of different byte ordering to share integer and real values without problems.

Another consideration for data portability has been the size of integer fields. The mSQL server stores integer values as 32 bit signed integers and will truncate any values it receives that are out of that range. This facility allows a machine which internally offers 64 bit integers to share data with machines that offer 32 bit integers. As a result of this facility, mSQL is currently in use at Cray Research with the server process running on a Cray and the client applications running on Sun SPARC based equipment.


Multi-User Support

In accordance with the stated design goals, mSQL can support multiple simultaneous user sessions. It handles multiple sessions without the need for multiple processes by queuing client requests and processing them in order. This design offers two advantages and one disadvantage as listed below.

Pros:

Cons:

As connections are received, details including the connection socket, the client username and the desired database are stored in a connection table. The connection socket is also added to a list of active file descriptors for use with the select() system call. A tight select() loop is entered and the client connections are handled as data becomes available on them.

The user details are used at connect time to provide per database access control features. The server supports access control based on the username, the type of connection (local or remote) and the hostname of the remote machine.

If a read() error is encountered on an initial read from a connection after select() returns, the connection is viewed as dead and is shutdown. This caters for a situation where a client application crashes prior to closing its connection to the mSQL server.


Data Handling and Performance

mSQL embodies the "Keep it simple and fast" design principle with regard to its data management routines. It currently does not utilise b-trees, external hashing techniques or any other data management methodology that is usually associated with rapid data access. In contrast, mSQL utilises a single, flat sequential data file for each relation and an external "ISAM" styled index file. On machines that support shared read-write mmap() pages, the server process will map the entire database table into its virtual address space. If mmap() is not available, a buffered read mechanism is used to reduce the number of system calls made when retrieving data.

Despite the simple design, rapid access to the data is provided due to efficient code design and utilisation of cache techniques. The entire mSQL server has been profiled to an extent where raw data accessing performance is bound to the performance of a few system calls. An insert into a non-keyed table will normally be completed with three system calls (two calls to lseek() and one write() ). The use of system calls such as open(), close() and brk() is greatly reduced by the cache techniques outlined below.

The mSQL server maintains a "table cache" in which commonly used information about active database tables is maintained. This information includes the definition of the table rows, open file descriptors for the data and key files and mmap() regions for the data and keys if available. The cache maintains information about the 8 most recently used tables to ensure that the information about a table is likely to be in the cache when a query is received.

Although the primary performance goal of mSQL has been the handling of simple queries, work is underway to boost its performance on complex queries. The first such improvement has been in the handling of relational joins. Support for "partial match optimisation" has recently been added to the mSQL server so that a literal value in a join condition greatly reduces the time required to process a table join. If a literal condition is provided, mSQL will use the table to which it relates as the outer loop of the join and performs a partial match of the row based on the literal condition on each iteration of the join loop. If the partial match fails (i.e. the data from the outer table doesn't match the literal values provided) that row of the outer table is skipped without any consultation of the inner table. In its most basic form, this can reduce the computation of a join of two 10 row tables from 110 reads, 100 joins and 100 comparisons to 20 reads, 20 joins and 20 comparisons.

Further optimisation is underway with support for multiple keys within the join code, a faster sorting algorithm for ORDER clauses and the use of a memory mapped b-tree for data storage. These features will become available during the 1.1 series of releases.


Performance Comparisons

A direct comparison of mSQL against other more sophisticated RDBMSs may not be fair as mSQL has been designed to be small and fast when handling simple queries while other systems have been designed to handle complex queries quickly. That said, such a comparison does offer a guide to the relative performance of mSQL in its primary goal.

A simple comparison was made between mSQL, Postgres and University Ingres by the author of the Onyx 4gl system. His results showed that on simple queries, mSQL was more than 10 times faster than Ingres and 100 times faster than Postgres.

A similar comparison of mSQL and Oracle by a US company called Analogy using a test database of 90,000 rows gave quite interesting results as provided below:

100 Single Record Queries

mSQL (no key) 142 sec
Oracle (no key) 246 sec
mSQL (key) 2.4 sec
Oracle (key) ~2 sec
100 Multiple Record Queries

mSQL (no key) 110 sec
Oracle (no key) 205 sec
mSQL (key) 87 sec
Oracle (key) ~2 sec
Although these are not controlled or comprehensive tests, it does show that a simple and efficient design can provide comparable performance for simple tasks.

A simple test suite that is provided with mSQL using a 500 row test table produced the following results on a SPARCstation IPC running SunOS 4.1.3


Supported Languages

The standard mSQL distribution includes an API library for the C programming language. This library allows any C program to communicate with an mSQL server on the local machine or on a remote machine over a TCP/IP network. Several mSQL users have contributed API bindings for the popular UNIX scripting languages. The current list of supported programming environments includes:


Availability

mSQL is freely available via anonymous FTP from

Bond.edu.au (131.244.1.1)
in

/pub/Minerva/msql.
The current version is 1.0.5.

A mailing list is available for regular users of the software. You can join the list by sending a message containing the word "subscribe" to

msql-list-request@Bond.edu.au


Author Information

David Hughes currently holds the following positions:

Senior Network Programmer
Bond University

Managing Director
Web Publishing Australia
(Internet Marketing Services)

He holds a Bachelor of Applied Science in Computer Science and is currently completing a Ph.D. in Computer Science (network management) at Bond University.

He can be contacted via e-mail at:

Bambi@Bond.edu.au
or

Bambi@Hughes.com.au

Footnotes

(1)
Minerva was the Roman Goddess of knowledge and information.