Basic ORM with megrok.rdb and SQLAlchemy

Author:Jeffrey D Peterson
Version:Grok-1.0 >= 1.0, <= 1.1

This guide will take you through the process of
mapping relational data using megrok.rdb and by
extension SQLAlchemy (SQLA).

Prerequisities

  • Grok 1.0 or 1.1 (It may work with earlier versions but I haven’t tested it.)
  • megrok.rdb 0.11.0
  • SQLA 0.5.7, 0.5.8; I have tested it against 0.6 as well but you have to make a change in z3c.saconfig or use the repository latest version to make it work.
  • A properly configured relational database. I have tested this against Oracle 10g (10.2) and Postgresql 8.3
  • At a minimum, a basic grasp of Grok, SQLAlchemy and obviously Python

Installation

How you install Grok is your preference, there are many good documents on how to do this. This how-to assumes you have installed Grok via virtualenv and grokproject.

Adding megrok.rdb, SQLA and the support libraries they need is very simple, just add megrok.rdb to setup.py in your project and let buildout handle the rest:

setup.py
setup(
      * snip *
      install_requires=['setuptools',
                        'grok',
                        'grokui.admin',
                        'z3c.testsetup',
                        'grokcore.startup',
                        # Add extra requirements here
                    'megrok.rdb',
                       ],
      * snip *
      )

Then re-run buildout:

(grokenv)zope@zope3:$ ./bin/buildout

This should put megrok.rdb, SQLA, z3c.saconfig and any other support packages needed in the directory you specified with –eggs-dir when you ran grokproject.

Imports

Note: You can find another ( and eerily similar :) ) example of this setup code in megrok/rdb/tests/initialization.py in your eggs-dir.

To get started we need to import some packages and modules:

import grok
from megrok import rdb
from z3c.saconfig import (EngineFactory, GloballyScopedSession)
from z3c.saconfig.interfaces import (IEngineFactory, IScopedSession, IEngineCreatedEvent)

Connection string

We will be using Oracle as our database for this how-to, however it should cover pretty much any DB SQLA supports:

DSN = ‘oracle://username:password@database_name’ My DSN is a bit different than others as it relies on Oracle TNS for some of its information, specifically the host.

Create your DSN by following the guidelines from SQLA:

"The URL is a string in the form dialect://user:password@host/dbname[?key=value..], where dialect
is a name such as mysql, oracle, postgresql, etc."

For postgres it should look like:

DSN = 'dialect+driver://username:password@host/database_name'
So if you use psycopg2 for your adapter, you'd have 'postgresql+psycopg2://...../database_name'

For more info on SQLA connections please consult the SQLA docs.

SQLAlchemy engines and scope

Now that we have a connection string we must setup an SQLA database engine, for this z3c.saconfig has provided us the tools:

engine_factory = EngineFactory(DSN, echo=True)
grok.global_utility(engine_factory, provides=IEngineFactory, direct=True)

Consult the z3c.saconfig code to examine the inner workings of EngineFactory, but it’s main purpose is to provide an SQLA engine. Notice too that we have set echo=True, this turns on SQLA output so you can see what it is doing in the background. When you are satisfied all is well, just set it to False or remove it all together. We must also register our engine factory as a global utility, generally we are making it available globally, and more specifically so megrok.rdb, can get at it through the utility machinery.

Similarly we must also setup an SQLA session, again z3c.saconfig to the rescue:

scoped_session = GloballyScopedSession()
grok.global_utility(scoped_session, provides=IScopedSession, direct=True)

In our case the scope is global, making the session ubiquitous in our app. You are not required to to this, you are also able to scope it by site using SiteScopedSession(). Again consult the z3c.saconfig code for more specific details, particularly about scope. Moving on, we register the session as a utility to make the session available to the utility machinery scoped as we like.

Metadata and EngineCreatedEvents

Generally metadata is how SQLA keeps track of your classes and mappings. Here is where things can get trickier.

In most cases, you are going to only need one set of metadata. However there are special cases to consider, particularly for our example. So we will create two(2) sets of metadata.:

skip_create_metadata = rdb.MetaData()
create_metadata = rdb.MetaData()

Notice the names, the reasons for this are forthcoming...just note them for now.

Continue by setting up our events:

@grok.subscribe(IEngineCreatedEvent)
def create_engine_created(event):
    rdb.setupDatabase(create_metadata)

@grok.subscribe(IEngineCreatedEvent)
def skip_create_engine_created(event):
    rdb.setupDatabaseSkipCreate(skip_create_metadata)

Feel free to delve into how events work at your leisure, we simply need to know the basics of what’s happening. Grok subscribes to (watches for) the EngineCreatedEvent, and when it happens, executes some code, in our case rdb.setupDatabase() and rdb.setupDatabaseSkipCreate(). We have two(2) events, one for each set of metadata.

So, why both sets of metadata? Why both events? For Postgres, existing views, existing tables and tables dynamically created from your classes are all handled by rdb.setupDatabase(). SQLA recognizes all reflected objects as already existing and doesn’t try and create them. But with Oracle(and our example) if you reflect an existing view into some metadata and pass it through rdb.setupDatabase(), not only will SQLA reflect your views, it will also try very hard to create them as tables in your DB. This is bad. The views already exist, right? So, we need a solution, which is, hold them in a separate metadata, pass the proper metadata to rdb.setupDatabaseSkipCreate(), which skips the creation attempt, leaving you with your views reflected as you’d expect.

It’s up to you how you handle reflected tables in this case. SQLA will recognize those as existing and skip the create for you. Go ahead and put them in whichever metadata you like, as either will work. Dynamically created tables, however, must be held in the metadata passed to rdb.setupDatabase() so that they are created for you.

The Complete Configuration

Combining all the previous code our configuration looks like this: (oracle_cfg.py):

import grok
from megrok import rdb
from z3c.saconfig import (EngineFactory, GloballyScopedSession)
from z3c.saconfig.interfaces import (IEngineFactory, IScopedSession, IEngineCreatedEvent)

DSN = 'oracle://username:password@database_name'

engine_factory = EngineFactory(DSN, echo=True)
grok.global_utility(engine_factory, provides=IEngineFactory, direct=True)

scoped_session = GloballyScopedSession()
grok.global_utility(scoped_session, provides=IScopedSession, direct=True)

skip_create_metadata = rdb.MetaData()
create_matadata = rdb.MetaData()

@grok.subscribe(IEngineCreatedEvent)
def create_engine_created(event):
    rdb.setupDatabase(create_matadata)

@grok.subscribe(IEngineCreatedEvent)
def skip_create_engine_created(event):
    rdb.setupDatabaseSkipCreate(skip_create_metadata)

Mapping our Classes

The heavy lifting is finished and megrok.rdb, et al, did most of it for you trust me. Now we can start developing our mapped classes.

First, we import any modules we need and our metadata:

from megrok import rdb

from sqlalchemy import Column, ForeignKey, Sequence
from sqlalchemy.types import Integer, String, Float
from sqlalchemy.orm import relation

from oracle_cfg import (create_matadata, skip_create_metadata)

Next, we create a container, you’ll notice a striking similarity to grok.Container:

class TestSA(rdb.Container):
    pass

The container does what you might expect, it contains objects. Certainly, it does more than that and there are more container options than what will be presented here but we’ll save that is for a future how-to. If you don’t want to wait and you have to know it’s full functionality now, use the source. You can find this in the eggs-dir under megrok/rdb/ in interfaces.py and components.py

Next we create our Models, these will be our mapped classes. Again these are similar to their Grok counterpart grok.Model and you can find information about them in the same place as rdb.Container. In our first Model we will describe a reflected view. The actual Oracle view name is “sales_part” it is made up of numerous columns and holds data that describes, you guessed it, parts that are available for sale.:

class SalesPart(rdb.Model):
    """
    Reflected view, notice the metadata used is the same as the one passed
    to rdb.setupDatabaseSkipCreate(), Don't run create_all() on this
    metadata SQLA will see this as a table that needs creation and try and
    create it in the DB, which we do not want.
    """
    rdb.metadata(skip_create_metadata)
    rdb.reflected()
    rdb.tablename('sales_part')
    rdb.tableargs(schema='app', useexisting=True)

    contract = Column('contract', String, nullable=False, primary_key=True)
    catalog_no = Column('catalog_no', String, nullable=False, primary_key=True)

Ok, lets examine some of the details. The first thing we see is a class definition, nothing unusual, just a normal python class that inherits from rdb.Model. Next, we see several directives or class annotations, if you are new to Grok, you will find that it is big on these. Lets look at them individually:

rdb.metadata(skip_create_metadata)

Simply put, we are telling megrok rdb, that we want to act upon the specified metadata. It’s important to note the placement. If using a single metadata it is not required to define this at the model level, it could even be defined as part of your configuration (oracle_cfg.py) but since we will be exploring multiple metadata objects, this needs to be defined here and in each additional model so megrok.rdb operates on the proper metadata for each one.:

rdb.reflected()

This is telling megrok.rdb and SQLA to reflect an existing view or table.:

rdb.tablename('sales_part')

This is an optional directive, without it megrok.rdb and SQLA will look for a table/view matching the class name (salespart) but since we have an underscore in the name we use the directive to specify.:

rdb.tableargs(schema='app', useexisting=True)

Also optional depending on what you need. For Oracle if you don’t connect as the schema owner, you must pass the schema value as an argument to the table, this tells SQLA where to look for the table/view and also what permissions the user you did connect with, has. In our case we are also reflecting a view here and as you will see in a moment overriding some columns. The act of overriding requires us to pass the useexisting flag to tell SQLA to use the existing columns and override any columns presented in the class itself.:

contract = Column('contract', String, nullable=False, primary_key=True)
catalog_no = Column('catalog_no', String, nullable=False, primary_key=True)

Views have no primary keys of their own. SQLA wants primary keys, so we override any columns we want as part of the primary key. As mentioned above this step requires you to pass useexisting=True to the tableargs directive.

Our second example will be a reflected table.:

class OutputTab(rdb.Model):
    """
    Reflected table, notice the metadata that uses the rdb.setupDatabase,
    SQLA handles the create properly because it can see this as a table
    and therefore will not try and create it. You could use either metadata
    for this case.
    """
    rdb.metadata(create_metadata)
    rdb.reflected()
    rdb.tablename('output_tab')
    rdb.tableargs(schema='app')

That’s it, pretty simple, though there is a small difference to discuss. Notice that the useexisting flag is not there. This is a table, it has all the primary keys and other features SQLA wants so no overriding necessary, and as such, no useexisting flag required. On the flip side it isn’t required to not be there either, works either way. Remember though, if for some reason you did need to override a column, you must pass it, at least for Oracle.

Our final models will be a dynamically created:

class MyTable(rdb.Model):
    rdb.metadata(create_metadata)
    rdb.tablename('my_table')
    rdb.tableargs(schema='app')

    id = Column('id', Integer, nullable=False, primary_key=True)
    name = Column('name', String)
    email = Column('email', String)

This will create the table “my_table” in the schema “app” with 3 columns, id, name and email.

What about postgres serial type? SQLA handles that for you, for Integer based primary_key columns, but if you want to use a specific sequence (and for Oracle you’d have to) you can do it this way:

id = Column('id', Integer, Sequence('some_id_seq'), primary_key=True)

Continuing on, we can add another model that references the first model:

We modify our first model like this:

class MyTable(rdb.Model):
    rdb.metadata(create_metadata)
    rdb.tablename('my_table')
    rdb.tableargs(schema='app')

    id = Column('id', Integer, nullable=False, primary_key=True)
    name = Column('name', String)
    email = Column('email', String)
    my_other_tables = relation('MyOtherTable', backref='my_table', collection_class='TestSA')

Then create the new model:

class MyOtherTable(rdb.Model):
    rdb.metadata(create_metadata)
    rdb.tablename('my_other_table')
    rdb.tableargs(schema='app')

    id = Column('id', Integer, nullable=False, primary_key=True)
    my_table_id = Column('my_table_id', Integer, ForeignKey('my_table.id'))
    name = Column('name', String)
    size = Column('size', Float)

Using your mapped classes

To use your mapped classes, the easiest way is through a session:

class TestSQLA(grok.View):
    grok.context(context_of_your_app)

    def render(self):
        session = rdb.Session()
        sp = session.query(SalesPart).all()
        msg = ''.join(['<p>%s</p>' % (o.catalog_no) for o in sp])
        return """<html><head></head><body>%s</body></html>""" % msg

The full mapping

Combining all our examples it will look like this:

from megrok import rdb

from sqlalchemy import Column, ForeignKey, Sequence
from sqlalchemy.types import Integer, String, Float
from sqlalchemy.orm import relation

from oracle_cfg import (create_matadata, skip_create_metadata)

class TestSA(rdb.Container):
   pass

class SalesPart(rdb.Model):
    """
    Reflected view, notice the metadata used is the same as the one passed
    to rdb.setupDatabaseSkipCreate(), Don't run create_all() on this
    metadata SQLA will see this as a table that needs creation and try and
    create it in the DB, which we do not want.
    """
    rdb.metadata(skip_create_metadata)
    rdb.reflected()
    rdb.tablename('sales_part')
    rdb.tableargs(schema='app', useexisting=True)

    contract = Column('contract', String, nullable=False, primary_key=True)
    catalog_no = Column('catalog_no', String, nullable=False, primary_key=True)

class OutputTab(rdb.Model):
    """
    Reflected table, notice the metadata that uses the rdb.setupDatabase,
    SQLA handles the create properly because it can see this as a table and
    therefore will not try and create it. You could use either metadata for
    this case.
    """
    rdb.metadata(create_metadata)
    rdb.reflected()
    rdb.tablename('output_tab')
    rdb.tableargs(schema='app')

class MyTable(rdb.Model):
    rdb.metadata(create_metadata)
    rdb.tablename('my_table')
    rdb.tableargs(schema='app')

    id = Column('id', Integer, nullable=False, primary_key=True)
    name = Column('name', String)
    email = Column('email', String)
    my_other_tables = relation('MyOtherTable', backref='my_table', collection_class='TestSA')

class MyOtherTable(rdb.Model):
    rdb.metadata(create_metadata)
    rdb.tablename('my_other_table')
    rdb.tableargs(schema='app')

    id = Column('id', Integer, nullable=False, primary_key=True)
    my_table_id = Column('my_table_id', Integer, ForeignKey('my_table.id'))
    name = Column('name', String)
    size = Column('size', Float)

class TestSQLA(grok.View):
    grok.context(context_of_your_app)

    def render(self):
        session = rdb.Session()
        sp = session.query(SalesPart).all()
        msg = ''.join(['<p>%s</p>' % (o.catalog_no) for o in sp])
        return """<html><head></head><body>%s</body></html>""" % msg

Further information

SQLAlchemy Documentation Page: http://www.sqlalchemy.org/docs/

Python Package Index: http://pypi.python.org/pypi/

Python Documentation: http://python.org/doc/

PostgreSQL: http://www.postgresql.org/

Oracle: http://www.oracle.com/