DCOracle.txt
Digital Creations Oracle Package Documentation
Introduction
The Digital Creations Oracle (DCOracle) package provides a Python interface to the Oracle relational database system. The DCOracle package supports the Python Database API, version 1.0, with extensions.
The DCOracle package is designed to work with Python versions 1.4 and later and Oracle versions 7.3 and later. It is based on the Oracle Call Interface (OCI) version 7 interface.
Python Interface
Python Database API, Version 1.0, as implemented by the DCOracle Package
This API has been defined to encourage similarity between the Python modules that are used to access databases. It is hoped that this will achieve a consistency leading to more easily understood modules, code that is generally more portable across databases, and a broader reach of database connectivity from Python.
This interface specification consists of several items:
- Package Interface
- Connection Objects
- Cursor Objects
- DBI Helper Objects
Comments and questions about this specification may be directed to the SIG on Tabular Databases in Python. Comments and questions about the DCOracle implementation of this interface or it's extensions should send to [email protected].
Package Interface
The DCOracle package exports two names:
- Connect(connection_string)
- Constructor for creating a connection to the
database. Returns a Connection Object.
The connection string is of the form:
user/password@system
where:
- user
- An Oracle user ID
- password
- An Oracle user password
- system
- The Oracle system ID
- error
- Exception raised for errors from the package.
- dbi
- A module providing access to "DBI Helper Objects", as defined below.
- Buffer
- A constructor for creating buffer objects. Buffer objects are needed when executing SQL statements or PL/SQL blocks that modify bound parameters and for calling stored procedures with input/output arguments. See the documentation of Buffer objects below.
Connection Objects
Connections Objects respond to the following methods:
- close()
- Close the connection now (rather than whenever __del__ is called). The connection will be unusable from this point forward; an exception will be raised if any operation is attempted with the connection.
- commit()
- Commit any pending transaction to the database.
- rollback()
- Roll the database back to the start of any pending transaction.
- cursor()
- Return a new Cursor Object. An exception may be thrown if the database does not support a cursor concept.
- callproc(name, [params])
- Call the named stored procedure or function with the given parameters.
- getSource(proc)
- Return the source code for the given procedure (or function) name. Note that the case of the name must match the case of the object name as stored in the database (all upper case).
- objects(all=0)
- Return basic information on user (default) or all objects.
If this method is called with no arguments (or a false argument), a sequence of object name and type tuples is returned.
If this method is called with a true argument, a sequence of object name, type, and owner tuples is returned.
- procedures
- This is an attribute that provides access to a namespace containing PL/SQL procedures, functions, and packages. This is a DCOracle extension.
All of the attributes defined for cursor objects are available for connection objects. The connection object simply provides access to an implicitly defined cursor managed by the connection for this purpose.
Cursor Objects
These objects represent a database cursor, which is used to manage the context of a fetch operation.
Cursor Objects should respond to the following methods and attributes:
- arraysize
- This read/write attribute specifies the number of
rows to fetch at a time with fetchmany().
The DCOracle package also uses this parameter to control the number of rows retrieved in a single internal database fetch, if no
LONG
orLONG RAW
columns are being retrieved. IfLONG
orLONG RAW
columns are retrieved, then it is necessary to fetch one row at a time, internally.The use of this attribute by the DCOracle package differs from the Python DBI Specification in that the value has no impact on array inserts or updates. The number of rows inserted or updated is based on the lengths of the parameter lists or tuples provided.
- description
- This read-only attribute is a tuple of 7-tuples.
Each 7-tuple contains information describing
each result column: (name, type_code,
display_size, internal_size,
precision, scale, null_ok). This attribute
will be None for operations that do not return
rows or if the cursor has not had an operation
invoked via the execute() method yet.
The type_code is one of the dbi values specified in the section below.
- procedures
- This is an attribute that provides access to a namespace containing PL/SQL procedures, functions, and packages. This is a DCOracle extension.
- close()
- Close the cursor now (rather than whenever __del__ is called). The cursor will be unusable from this point forward; an exception will be raised if any operation is attempted with the cursor.
- execute(operation [,params])
- Execute a database operation (query
or command). Parameters may be provided (as
a sequence (e.g. tuple/list)) and will be bound to
variables in the operation.
Variables are specified in a database-specific notation that is based on the index in the parameter tuple (position-based rather than name-based). For the DCOracle package, variables are specified positionally as
:p1
,:p2
, and so on. For example:select * from spam where foo=:p1 and bar=:p2
The DCOracle package allows parameters to be specified by name as well as by position. This is more convenient and necessary of Oracle PL/SQL blocks are used. See the section on DCOracle extensions to the database specification later in this document.
The parameters may also be specified as a sequence of sequences (e.g. a list of tuples) to insert multiple rows in a single operation.
A reference to the operation will be retained by the cursor. If the same operation object is passed in again, then the cursor can optimize its behavior. This is most effective for algorithms where the same operation is used, but different parameters are bound to it (many times).
Using SQL terminology, these are the possible result values from the execute() method:
- If the statement is DDL (e.g. CREATE TABLE), then 1 is returned.
- If the statement is DML (e.g. UPDATE or INSERT), then the number of rows affected is returned (0 or a positive integer).
- If the statement is DQL (e.g. SELECT), None is returned, indicating that the statement is not really complete until you use one of the fetch methods.
- fetchone()
- Fetch the next row of a query result, returning a single tuple.
- fetchmany([size])
- Fetch the next set of rows of a query result, returning as a list of tuples. An empty list is returned when no more rows are available. The number of rows to fetch is specified by the parameter. If it is not provided, or is None, then the cursor's arraysize determines the number of rows to be fetched.
- fetchall()
- Fetch all remaining rows of a query result, returning as a list of tuples. Note that the cursor's arraysize attribute can affect the performance of this operation.
- setinputsizes(sizes)
- This method is not well defined by the Python database API and is permitted to have an empty implementation, as it does in the DCOracle package. Calling this method has no effect.
- setoutputsize(size [,col])
- This method is not well defined by the Python database API and is permitted to have an empty implementation, as it does in the DCOracle package. Calling this method has no effect.
DBI Helper Objects
Many databases need to have the input in a particular format for binding to an operation's input parameters. For example, if an input is destined for a DATE column, then it must be bound to the database in a particular string format. Similar problems exist for "Row ID" columns or large binary items (e.g. blobs or RAW columns). This presents problems for Python since the parameters to the execute() method are untyped. When the database module sees a Python string object, it doesn't know if it should be bound as a simple CHAR column, as a raw binary item, or as a DATE.
To overcome this problem, the dbi module was created. This module specifies some basic database interface types for working with databases. There are two classes: dbiDate and dbiRaw. These are simple container classes that wrap up a value. When passed to the database modules, the module can then detect that the input parameter is intended as a DATE or a RAW. For symmetry, the database modules will return DATE and RAW columns as instances of these classes.
A Cursor Object's description attribute returns information about each of the result columns of a query. The type_code is defined to be one of five types exported by this module: STRING, RAW, NUMBER, DATE, or ROWID.
The dbi module in the DCOraclePackage exports the following names:
- dbiDate(value)
- This function constructs a dbiDate instance that
holds a date value. The value should be specified
as an integer number of seconds since the
"epoch" (e.g. time.time()).
The DCOracle package also allows the value to be a string in International Standard Organization, ISO, date or date and time format. For example, to provide a date, use something like:
DCOracle.dbi.dbiDate('1998-07-09')
Or to specify a date and time:
DCOracle.dbi.dbiDate('1998-07-09 11:49:00')
- dbiRaw(value)
- This function constructs a dbiRaw instance that holds a raw (binary) value. The value should be specified as a Python string.
- STRING
- This object is used to describe columns in a database that are string-based (e.g. CHAR).
- RAW
- This object is used to describe (large) binary columns in a database (e.g. LONG RAW, blobs).
- NUMBER
- This object is used to describe numeric columns in a database.
- DATE
- This object is used to describe date columns in a database.
- ROWID
- This object is used to describe the "Row ID" column in a database.
DCOracle Extensions to the Python Database API
Stored Procedures.
Oracle's stored procedures are simply PL/SQL procedures with a pretty traditional interface. Result sets are passed to and returned from procedures and functions as input or as input/output cursor arguments.
The DCOracle package, to the extent possible, makes stored procedures look like Python functions. Input, output, and input/output arguments are supported with a few restrictions:
- Input/output arguments do not support long and long raw, date, and rowid variables.
- Numeric and string input/output arguments will require the use of a special Buffer type provided with the oracle module. The Buffer type provides support for mutable numbers and strings.
- Complex arguments, like records and arrays are not supported.
- Overloaded procedures/functions (procedures that have the same name but different signatures) may not be handled correctly in some subtle cases.
Some advanced features of the stored procedure support:
- Support for features such as output and input/output arguments, overloaded procedures, packages, default arguments, and cursor arguments.
- Natural python interface, including positional and keyword arguments. Arguments are checked for name and number.
Procedure Namespaces
Connection and cursor objects provide an attribute,
procedures
that provides access to a namespace of stored
procedures, functions, and packages.
Examples
Given procedures:
procedure update_account(account string, debit float, balance in out float) function update_account(account string, credit float) return float procedure get_query(curs in out SomeCursorType) procedure bar(x int, y in out int, z out int, w out int) where bar is in package foo and database connction, dbc
These procedures might be used as follows:
balance = dbc.procedures.update_account('0123', debit=22.33) balance = dbc.procedures.update_account('0123', credit=10.00)
Output arguments are not provided in the call, but are returned. Even though the first version of update_account is a procedure, it is translated to a function in Python.
The overloading of update_account is allowed because the two versions have different signatures. There is a subtlety here though. The two versions have different signatures in Oracle because one takes three arguments and the other two. The Python versions have the same number of arguments, but different argument names. The Python versions require the use of keyword arguments to disambiguate the call.
Result tables are returned from stored procedures through input/output arguments, as in:
# Create a new cursor: c=dbc.cursor() # Call the procedure, which associates the cursor with a # query. dbc.procedures.get_query(c) # Get the data: data=c.fetchall()
Non-cursor input/output arguments have to be passed as Buffers:
yb=Buffer(1,'i') yb[0]=y z, w = dbc.procedures.foo.bar(1, y) y=yb[0]
Note the use of the package, foo.
SQL Variable substitution
Only named parameters are supported, however, positional
parameters are emulated with "names" of the form
p#
, where # is a 1-based position. For example:
select * from spam where x=:p1 and y=:yparm
The parameter name p1
is used to indicate a
positional parameter.
The cursor execute method has been extended to support keyword arguments. So the above sql would be executed with something like:
someCursor.execute('select * from spam ' 'where x=:1 and y=:yparm', (12,), y='splat')
Note that:
someCursor.execute('select * from spam ' 'where x=:p1 and y=:yparm', p1=12, y='splat')
Can also be used.
Oracle has a scary feature for bining output variables from SQL:
select salary into :sal from employees where emp_ssn = :ssn
This SQL statement has an input parameter, ssn, and an output parameter, sal. It is important that output parameters be passed as Buffers, as in:
salbuf=Buffer(1,'f') sql=('select salary into :sal ' 'from employees ' 'where emp_ssn = :ssn') someCursor.execute(sql, ssn='123-45-6789', sal=salbuf) print 'The salaray is ', salbuf[0]
Oracle 8 Large Objects (LOBs)
Limited support is provided for Oracle 8 large objects (LOBs). To get LOB support, you must link against Oracle 8 client libraries and make sure that the -DDCORACLE8 switch is supplied when the oci_ extension module is built.
LOB Columns can be selected in SQL Select statements. The way LOB data is returned depends on which fetch routine is used.
If fetchone is used, then LOB objects are returned. Lob objects support the following methods:
- length()
- Return the LOB length
- read([n,offset,csid,csfrm])
- Read data from the lob (OCILobRead)
Up to
n
characters are read, starting at the givenoffset
, where the offset of the first character is 1, not 0. If the argumentn
is less than 1 or is ommitted, then all data from the offset to the end of the LOB is read.The arguments csid and csfrm are as described in the documentation for the OCILobRead function in Oracle OCI 8 documentation.
- write(data, [offset,csid,csfrm])
- Write data to the lob (OCILobWrite)
The data given by the string argument
data
are written to the LOB starting at the givenoffset
, where the offset of the first character is 1, not 0.The arguments csid and csfrm are as described in the documentation for the OCILobRead function in Oracle OCI 8 documentation.
Note that to add or update LOB data, you can provide string input. To update LOB data, you can select the lob and use the write method.
If fetchmany or fetchall are used, then lobs are converted to strings and returned.
Array input
The arguments to execute may be lists or tuples, in which case, "array" input is used. In theory, this provides much faster execution, especially in a client-server environment.
For example, to insert three rows:
someCursor.execute('insert into spam values (:x, :y)', x=(1,2,3), y=('foo', 'bar', 'spam'))
Memory may be wasted if array input is used for large strings with highly variable lengths. This is due to the fact that a Buffer must be created with fixed-length elements.
Array input is not supported for input of LONG
or LONG RAW
data if elements are greater than 32,000 bytes in size.
Array output
If a result set includes no long or long raw columns, then array output is used automatically. Array output provides greater performance, especially in client-server environments. Array output is used regardless of which fetch methods are used to get data from the cursor object. Data are prefetched and fetched as needed to satisfy fetch, fetchmany, and fetchall calls. The arraysize cursor attribute is used to control the size of the arrays used. The default arraysize is 10.
For-loop support
After executing a query, you can iterate over the query with a for loop, as in:
aCursor.execute('select * from spam') for row in aCursor: ...do stuff with the row of data...
Buffer Objects
Buffer objects are used to manage basic mutable, such as integers and strings. They are necessary for passing input/output arguments to stored procedures and for passing arguments to SQL statements that modify their parameters.
Buffers are created by calling DCOracle.Buffer with one or two arguments. The first argument is a buffer size and the second optional athument is a type indicator. The typr indicator may be a single character:
-
c
- character
-
b
- signed 1-byte integer
-
B
- unsigned 1-byte integer
-
h
- signed 2-byte integer
-
H
- unsigned 2-byte integer
-
i
- signed 4-byte integer
-
I
- unsigned 4-byte integer
-
f
- floating point
-
d
- double precision
The type indicator may also be an integer, in which case the buffer is a buffer of fixed-length strings of the given length.
The default type indicator is c
.
Buffers have fixed length, which may be queries with len()
.
They support item retrieval and assignment, but not item
deletion. Buffers support slicing only when a typecode of c
is
used.
Buffers are initialized to zeros.
Thread Safety
If threads are enabled in Python, then the DCOracle package uses Oracle in a thread-safe manner. The Python Database API is, however, somewhat inherently not thread safe. In particular, connection objects and cursors are rather state-full. Within a single thread, calling the database or cursor methods in an incorrect order (e.g. calling a fetch method before calling an execute method) can cause lead to errors. Sharing objects between threads increases the danger.
Cursor and connection objects should not be shared between threads unless they are protected by application-provided mutexes.
Extension Modules
The DCOracle packages contains two extension modules, Buffer, and oci_. Binaries for these extensions for some platforms are included in the distribution. To use a precompiled binary, simply copy the binaries for the desired platform into the package directory. Binaries are located in subdirectories by Python version and system platform.
Known Problems
Oracle integers values that do not fit into 32-bit integers are not currently supported. The logic for handling numeric data needs to be changed to use either the NUMBER or VARNUM external data types.