SWI-Prolog ODBC Interface


Jan Wielemaker
SWI,
University of Amsterdam
The Netherlands
E-mail: jan@swi-prolog.org

Abstract

This document describes the SWI-Prolog interface to ODBC, the Microsoft standard for Open DataBase Connectivity. These days there are ODBC managers from multiple vendors for many platforms as well as drivers for most databases, making it an attractive target for a Prolog Database connection.

The database interface is envisioned to consist of two layers. The first layer is an encapsulation of the core functionality of ODBC. This layer makes it possible to run SQL queries. The second layer exploits the relation between Prolog predicates and database tables, providing a natural, if somewhat limited, Prolog view on the data. The current interface only covers the first layer.

1 Introduction

The value of RDMS for Prolog is often over-estimated, as Prolog itself can manage substantial amounts of data. Nevertheless a Prolog/RDMS interface provides advantages if data is already provided in an RDMS, data must be shared with other applications, there are strong persistency requirements or there is too much data to fit in memory.

The popularity of ODBC makes it possible to design a single foreign-language module that provides RDMS access for a wide variety of databases on a wide variety of platforms. The SWI-Prolog RDMS interface is closely modelled after the ODBC API. This API is rather low-level, but defaults and dynamic typing provided by Prolog give the user quite simple access to RDMS, while the interface provides the best possible performance given the RDMS independency constraint.

The Prolog community knows about various high-level connections between RDMS and Prolog. We envision these layered on top of the ODBC connection described here.

2 The ODBC layer

2.1 Connection management

The ODBC interface deals with a single ODBC environment but with multiple simultaneous connections. The predicates in this section deal with connection management.

odbc_connect(+DSN, -Connection, +Options)
Create a new ODBC connection to data-source DSN and return a handle to this connection in Connection. The connection handle is either an opaque structure or an atom if the alias option is used. The options below are defined. In addition, options of odbc_set_connection/2 may be provided.

user(User)
Define the user-name for the connection. This option must be present if the database uses authorization.

password(Password)
Provide a password for the connection. Normally used in combination with user(User).

alias(AliasName)
Use AliasName as Connection identifier, making the connection available as a global resource. A good choice is to use the DSN as alias.

open(OpenMode)
If OpenMode is once (default if an alias is provided), a second call to open the same DSN simply returns the existing connection. If multiple (default if there is no alias name), a second connection to the same data-source is opened.

The following example connects to the WordNet database, using the connection alias wordnet and opening the connection only once:


open_wordnet :-
        odbc_connect('WordNet', _,
                     [ user(jan),
                       password(xxx),
                       alias(wordnet),
                       open(once)
                     ]).

odbc_disconnect(+Connection)
Close the given Connection. This destroys the connection alias or, if there is no alias, makes further use of the Connection handle illegal.

odbc_current_connection(?Connection, ?DSN)
Enumerate the existing ODBC connections.

odbc_set_connection(+Connection, +Option)
Set options on an existing connection. Defined options are:

auto_commit(bool)
If true (default), each update statement is committed immediately. If false, an update statement starts a transaction that can be committed or rolled-back. See section 2.3 for details on transaction management.
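
The predicates of this section can be combined as in the following sketch, which opens a connection, runs a goal on it and closes the connection again regardless of how the goal terminates. The DSN 'WordNet' is the one from the example above; with_wordnet/1 and list_connections/0 are names invented for this illustration. setup_call_cleanup/3 is a general SWI-Prolog predicate and not part of this interface, and the sketch assumes the data source needs no user and password options.


:- use_module(library(odbc)).

% Run Goal with a fresh connection as extra argument and
% always disconnect afterwards.
with_wordnet(Goal) :-
        setup_call_cleanup(
            odbc_connect('WordNet', Connection, [open(multiple)]),
            call(Goal, Connection),
            odbc_disconnect(Connection)).

% Print all currently open connections and their DSN.
list_connections :-
        forall(odbc_current_connection(Connection, DSN),
               format('~p is connected to ~w~n', [Connection, DSN])).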

2.2 Running SQL queries

ODBC distinguishes between direct execution of literal SQL strings and parameterized execution of SQL strings. The first is a simple practical solution for infrequent calls (such as creating a table), while parameterized execution allows the driver and database to precompile the query and store the optimized code, making it suitable for time-critical operations. In addition, it allows for passing parameters without going through SQL-syntax and thus avoiding the need for quoting.

2.2.1 One-time invocation

odbc_query(+Connection, +SQL, -Row)
Same as odbc_query/4 using [] for Options.

odbc_query(+Connection, +SQL, -Row, +Options)
Fire a query on the database represented by Connection. SQL is any valid SQL statement. SQL statements can be specified as a plain atom, string or a term of the format Format-Arguments, which is converted using format/2. After executing the query, result-rows are returned one-by-one as terms of the functor row/Arity, where Arity denotes the number of columns in the result-set. If the query has no results (such as INSERT), use odbc_query/2. Here is a small example using the connection created with odbc_connect/3 above.


lemma(Lemma) :-
        odbc_query(wordnet,
                   'SELECT (lemma) FROM word',
                   row(Lemma)).

Please note that the SQL-statement does not end in the ; character. Options defines the following options:

types(ListOfTypes)
Determine the Prolog type used to report the column-values. When omitted, default conversion as described in section 2.6 is implied. A column may specify default to use default conversion for that column. The length of the type-list must match the number of columns in the result-set.

For example, in the table word the first column is defined with the SQL type DECIMAL(6). Using this SQL-type, "001" is distinct from "1", but using Prolog integers is a valid representation for WordNet wordno identifiers. The following query extracts rows using Prolog integers:


?- odbc_query(wordnet,
              'select * from word', X,
              [ types([integer,default])
              ]).

X = row(1, entity)

See also section 2.6 for notes on type-conversion.

odbc_query(+Connection, +SQL)
As odbc_query/3, but used for SQL-statements that should not return result-rows (i.e. all statements except for SELECT). The predicate prints a diagnostic message if the query returns a result.
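
As a sketch of one-time invocation without result-rows, the code below creates and fills the table age (name char(25), age integer) that is also used as an example in section 2.2.2. The connection alias test and the predicate names are assumptions made for this illustration. The second predicate uses the Format-Arguments form; note that the values pass through SQL syntax, so a name containing a quote would break the statement. Parameterised queries avoid this problem.


:- use_module(library(odbc)).

% A literal SQL statement that returns no rows.
create_age_table :-
        odbc_query(test,
                   'CREATE TABLE age (name char(25), age integer)').

% Format-Arguments: the SQL text is built using format/2.
% Name is inserted between SQL quotes, Age must be an integer.
add_age(Name, Age) :-
        odbc_query(test,
                   'INSERT INTO age (name, age) VALUES (\'~w\', ~d)'-[Name, Age]).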

2.2.2 Parameterised queries

ODBC provides for `parameterized queries'. These are SQL queries with a ?-sign at places where parameters appear. The ODBC interface and database driver may use this to precompile the SQL-statement, giving better performance on repeated queries. This is exactly what we want if we associate Prolog predicates to database tables. This interface is defined by the following predicates:

odbc_prepare(+Connection, +SQL, +Parameters, -Statement)
As odbc_prepare/5 using [] for Options.

odbc_prepare(+Connection, +SQL, +Parameters, -Statement, +Options)
Create a statement from the given SQL statement, which may be a format specification as described with odbc_query/4 and normally has one or more parameter-indicators (?), and unify Statement with a handle to the created statement. Parameters is a list of descriptions, one for each parameter. Each parameter description is one of the following:

default
Uses the ODBC function SQLDescribeParam() to obtain information about the parameter and applies default rules. See section 2.6 for details.

SqlType(Specifier, ...)
Declare the parameter to be of type SqlType with the given specifiers. Specifiers are required for char, varchar, etc. to specify the field-width. In odbc_execute/2, the user must supply the value in the default Prolog type for this SQL type. See section 2.6 for details.

PrologType > SqlType
As above, but supply values of the given PrologType, using the type-transformation defined by the database driver. For example, if the parameter is specified as


atom > date

the user must supply an atom of the format YYYY-MM-DD rather than a term date(Year,Month,Day). This construct enhances flexibility and allows for passing values that have no proper representation in Prolog.

Options defines a list of options for executing the statement. See odbc_query/4 for details.

odbc_execute(+Statement, +ParameterValues, -Row)
Execute a statement prepared with odbc_prepare/4 with the given ParameterValues and return the rows one-by-one on backtracking, as with odbc_query/4. This predicate may raise type_error exceptions if the provided parameter values cannot be converted to the declared types.

ODBC doesn't appear to allow for multiple cursors on the same result-set. (1) This would imply there can be only one active odbc_execute/3 (i.e. one that has left a choice-point) on a prepared statement. Suppose we have a table age (name char(25), age integer) bound to the predicate age/2. Without special precautions we could not write the code below, as both calls to age/2 execute the same statement. The ODBC interface therefore creates a clone of a statement if it discovers the statement is already being executed; the clone is discarded after the statement is finished. (2)


same_age(X, Y) :-
        age(X, AgeX),
        age(Y, AgeY),
        AgeX = AgeY.

odbc_free_statement(+Statement)
Destroy a statement prepared with odbc_prepare/4. If the statement is currently executing (i.e. odbc_execute/3 left a choice-point), the destruction is delayed until the execution terminates.
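
The life-cycle of a prepared statement can be sketched as follows: prepare once, execute for several parameter values and free the statement afterwards. The example uses the table age described above; the connection alias test and the predicate names are assumptions made for this illustration, and setup_call_cleanup/3 is a general SWI-Prolog predicate rather than part of this interface.


:- use_module(library(odbc)).

% Prepare a query with a single parameter of type char(25).
prepare_age(Statement) :-
        odbc_prepare(test,
                     'SELECT (age) FROM age WHERE name = ?',
                     [ char(25) ],
                     Statement).

% Execute the prepared statement for one name.
age_of(Statement, Name, Age) :-
        odbc_execute(Statement, [Name], row(Age)).

% Print the age of each name in Names, reusing one statement.
print_ages(Names) :-
        setup_call_cleanup(
            prepare_age(Statement),
            forall(( member(Name, Names),
                     age_of(Statement, Name, Age)
                   ),
                   format('~w: ~w~n', [Name, Age])),
            odbc_free_statement(Statement)).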

2.3 Transaction management

ODBC can run in two modes. By default, all update actions are immediately committed on the server. Using odbc_set_connection/2 this behaviour can be switched off, after which each SQL statement that can be inside a transaction implicitly starts a new transaction. This transaction can be ended using odbc_end_transaction/2.

odbc_end_transaction(+Connection, +Action)
End the currently open transaction if there is one. If Action is commit, pending updates are made permanent; if Action is rollback, they are discarded.

The ODBC documentation has many comments on transaction management and its interaction with database cursors.
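
A minimal sketch of explicit transaction handling is given below. It switches auto_commit off, runs an update and commits, or rolls back and re-throws if the update raises an exception. The connection alias test, the table age and the predicate name are assumptions made for this illustration.


:- use_module(library(odbc)).

% Increment all ages in one transaction.
increment_ages :-
        odbc_set_connection(test, auto_commit(false)),
        catch(( odbc_query(test, 'UPDATE age SET age = age + 1'),
                odbc_end_transaction(test, commit)
              ),
              Error,
              ( odbc_end_transaction(test, rollback),
                throw(Error)
              )).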

2.4 Accessing the database dictionary

With this interface we do not envision the use of Prolog as a database manager. Nevertheless, elementary access to the structure of a database is required, for example to validate that a database satisfies the assumptions made by the application.

odbc_current_table(+Connection, -Table)
Return on backtracking the names of all tables in the database identified by the connection.

odbc_current_table(+Connection, ?Table, ?Facet)
Enumerate properties of the tables. Defined facets are:

qualifier(Qualifier)

owner(Owner)

comment(Comment)
These facets are defined by SQLTables()

arity(Arity)
This facet returns the number of columns in a table.

odbc_table_column(+Connection, ?Table, ?Column)
On backtracking, enumerate all columns in all tables.

odbc_table_column(+Connection, ?Table, ?Column, ?Facet)
Provides access to the properties of the columns of a table as defined by the ODBC call SQLColumns(). Defined facets are:

table_qualifier(Qualifier)

table_owner(Owner)

table_name(Table)
See odbc_current_table/3.

data_type(DataType)

type_name(TypeName)

precision(Precision)

length(Length)

scale(Scale)

radix(Radix)

nullable(Nullable)

remarks(Remarks)
These facets are defined by SQLColumns()

type(Type)
More Prolog-friendly representation of the type properties. See section 2.6.
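
As an illustration of the table and column predicates, the sketch below prints every table of a connection together with its columns and their Prolog-friendly types. The predicate name describe_tables/1 is invented for this example, and the exact shape of the Type term depends on the data source.


:- use_module(library(odbc)).

% Print all tables of Connection with their columns.
describe_tables(Connection) :-
        forall(odbc_current_table(Connection, Table),
               ( format('~w~n', [Table]),
                 forall(odbc_table_column(Connection, Table, Column,
                                          type(Type)),
                        format('    ~w: ~p~n', [Column, Type]))
               )).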

odbc_type(+Connection, ?TypeSpec, ?Facet)
Query the types supported by the data source. TypeSpec is either an integer type-id, the name of an ODBC SQL type or the constant all_types to enumerate all known types. This predicate calls SQLGetTypeInfo() and its facet names are derived from the specification of this ODBC function:

name(Name)
Name used by the data-source. Use this in CREATE statements.

data_type(DataType)
Numeric identifier of the type.

precision(Precision)
When available, maximum precision of the type.

literal_prefix(Prefix)
When available, prefix for literal representation.

literal_suffix(Suffix)
When available, suffix for literal representation.

create_params(CreateParams)
When available, arguments needed to create the type.

nullable(Bool)
Whether the type can be NULL. May be unknown.

case_sensitive(Bool)
Whether values for this type are case-sensitive.

searchable(Searchable)
Whether the type can be searched. Values are false, true, like_only or all_except_like.

unsigned(Bool)
When available, whether the type is unsigned. Please note that SWI-Prolog does not provide unsigned integral values.

money(Bool)
Whether the type represents money.

auto_increment(Bool)
When available, whether the type can be auto-incremented.

local_name(LocalName)
Name of the type in local language.

minimum_scale(MinScale)
Minimum scale of the type.

maximum_scale(MaxScale)
Maximum scale of the type.

odbc_data_source(?DSN, ?Description)
Query the defined data sources. It is not required to have any open connections before calling this predicate. DSN is the name of the data source as required by odbc_connect/3. Description is the name of the driver. The driver name may be used to tailor the SQL statements used on the database. Unfortunately this name depends on local installation details and is therefore not universally useful.
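
The two enumeration predicates above may be used as sketched below: the first predicate lists the configured data sources without opening a connection, the second collects the type names a data source accepts in CREATE statements. Both predicate names are invented for this illustration.


:- use_module(library(odbc)).

% Print each data source with the name of its driver.
list_data_sources :-
        forall(odbc_data_source(DSN, Description),
               format('~w (~w)~n', [DSN, Description])).

% Collect the names of all types known to the data source.
type_names(Connection, Names) :-
        findall(Name,
                odbc_type(Connection, all_types, name(Name)),
                Names).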

2.5 Getting more information

odbc_statistics(?Key)
Get statistical data on the ODBC interface. Currently defined keys are:

statements(Created, Freed)
Number of SQL statements that have been Created and Freed over all connections. Statements executed with odbc_query/[2,3] increment Created as the query is created and Freed if the query is terminated due to deterministic success, failure, cut or exception. Statements created with odbc_prepare/[4,5] are freed by odbc_free_statement/1 or due to a fatal error with the statement.
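
For example, the number of statements that are still alive can be computed from this key, which may help to find prepared statements that are never freed. The predicate name open_statements/1 is invented for this illustration.


:- use_module(library(odbc)).

% Count is the number of statements created but not yet freed.
open_statements(Count) :-
        odbc_statistics(statements(Created, Freed)),
        Count is Created - Freed.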

2.6 Representing SQL data in Prolog

Databases have a poorly standardized but rich set of datatypes. Some have natural Prolog counterparts, some not. A complete mapping requires us to define Prolog data-types for SQL types that have no standardized Prolog counterpart (such as timestamp), the definition of a default mapping and the possibility to define an alternative mapping for a specific column. For example, many variations of the SQL DECIMAL type cannot be mapped to a Prolog integer. Nevertheless, mapping to an integer may be the proper choice for a specific application.

The Prolog/ODBC interface defines the following Prolog result types with the indicated default transformation. Different result-types can be requested using the types(TypeList) option for the odbc_query/4 and odbc_prepare/5 interfaces.

atom
Used as default for the SQL types char, varchar, longvarchar, decimal and numeric. Can be used for all types.

string
SWI-Prolog extended type string. Use the type for special cases where garbage atoms must be avoided. Can be used for all types.

codes
List of character codes. Use this type if the argument must be analysed or compatibility with Prolog systems that cannot handle infinite-length atoms is desired. Can be used for all types.

integer
Used as default for the SQL types bit, tinyint, smallint and integer. Please note that SWI-Prolog integers are signed 32-bit values, where SQL allows for unsigned values as well. Can be used for the integral and decimal types as well as the types date and time_stamp, which are represented as POSIX time-stamps (seconds after Jan 1, 1970).

double
Used as default for the SQL types real, float and double. Can be used for the integral, float and decimal types as well as the types date and time_stamp, which are represented as POSIX time-stamps (seconds after Jan 1, 1970). Representing time this way is compatible with SWI-Prolog's time-stamp handling.

date
A Prolog term of the form date(Year,Month,Day) used as default for the SQL type date.

time
A Prolog term of the form time(Hour,Minute,Second) used as default for the SQL type time.

time_stamp
A Prolog term of the form time_stamp(Year,Month,Day,Hour,Minute,Second,Fraction) used as default for the SQL type time_stamp.
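
As a sketch of requesting a non-default result type, the query below returns the lemma column of the WordNet table word from section 2.2.1 as a list of character codes instead of an atom. The predicate name lemma_codes/1 is invented for this illustration.


:- use_module(library(odbc)).

% Enumerate lemmas as code-lists rather than atoms.
lemma_codes(Codes) :-
        odbc_query(wordnet,
                   'SELECT (lemma) FROM word',
                   row(Codes),
                   [ types([codes])
                   ]).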

2.7 Errors and warnings

Disregarding some details, ODBC operations return success, error or `success with information'. This section explains how results from the ODBC layer are reported to Prolog.

2.7.1 ODBC messages: `Result with info'

If an ODBC operation returns `with info', the info is extracted from the interface and handed to the Prolog message dispatcher print_message/2. The level of the message is informational and the term is of the form:

odbc(State, Native, Message)
Here, State is the SQL-state as defined in the ODBC API, Native is the (integer) error code of the underlying data source and Message is a human readable explanation of the message.

2.7.2 ODBC errors

If an ODBC operation signals an error, it throws the exception error(odbc(State, Native, Message), _). The arguments of the odbc/3 term are explained in section 2.7.1.

In addition, the Prolog layer performs the normal tests for proper arguments and state, signalling the conventional instantiation, type, domain and resource exceptions.
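
An application may catch ODBC errors as in the following sketch, which prints the error and fails instead of propagating the exception. Other exceptions, such as type errors raised by the Prolog layer, are not caught. The predicate name safe_query/3 is invented for this illustration.


:- use_module(library(odbc)).

% As odbc_query/3, but report ODBC errors and fail.
safe_query(Connection, SQL, Row) :-
        catch(odbc_query(Connection, SQL, Row),
              error(odbc(State, Native, Message), _),
              ( format(user_error, 'ODBC error ~w (~w): ~w~n',
                       [State, Native, Message]),
                fail
              )).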

2.8 ODBC implementations

There is a wealth of ODBC implementations that are completely or almost compatible with this interface. In addition, a number of databases are delivered with an ODBC compatible interface. This implies you get the portability benefits of ODBC without paying the configuration and performance price. Currently this interface is, according to the PHP documentation on this subject, provided by Adabas D, IBM DB2, Solid, and Sybase SQL Anywhere.

2.8.1 Using unixODBC

The SWI-Prolog ODBC interface was developed using unixODBC and MySQL on SuSE Linux.

2.8.2 Using Microsoft ODBC

On MS-Windows, the ODBC interface is a standard package, linked against odbc32.lib.

2.9 Remaining issues

The following issues are identified and waiting for concrete problems and suggestions.

NULL data
How to represent NULL in Prolog. We have seen the atom $Null$ and the compound term 'NULL'(_).

3 Installation

3.1 Unix systems

Installation on Unix systems uses the commonly found configure, make and make install sequence. SWI-Prolog should be installed before building this package. If SWI-Prolog is not installed as pl, the environment variable PL must be set to the name of the SWI-Prolog executable. Installation is now accomplished using:


% ./configure
% make
% make install

This installs the foreign libraries in $PLBASE/lib/$PLARCH and the Prolog library files in $PLBASE/library, where $PLBASE refers to the SWI-Prolog `home-directory'.

Footnotes

(1) Is this right?
(2) The code is prepared to maintain a cache of statements. Practice should tell us whether it is worthwhile activating this.