Eggs, Known Good Sets and developing with unreleased Grok source code

Author: Kevin Teague

Grok releases are distributed as Python eggs. This gives you fine-grained control over which version of each individual Python package is used in a Grok application. Learn why this is desirable, and how you can use it to develop your Grok application against unreleased versions of Grok checked out from Subversion.

Eggs and Versions

Eggs are packaged Python projects: a collection of Python files and other project files with a set of associated package data. They are the way that the parts of Grok, and the other Python projects that Grok depends upon, are distributed. Before Phillip J. Eby started the setuptools project, which allows for the creation and distribution of Python eggs, Python developers distributed their code as a single, monolithic archive (e.g. zip files or tarballs). This system had the drawback that to upgrade to a new version of a framework, you had to download the entire framework, even if you were only updating to a maintenance release that contained just a few small bug fixes. Let’s look at how Zope 3 was originally distributed:

Release Size Package Versions
Zope-3.1.0.tgz 3.86 MB
  • zope.tal = 3.1.0
  • zope.schema = 3.1.0
  • zope.event = 3.1.0
Zope-3.2.1.tgz 6.23 MB
  • zope.tal = 3.2.1
  • zope.schema = 3.2.1
  • zope.event = 3.2.1
Zope-3.2.2.tgz 6.24 MB
  • zope.tal = 3.2.2
  • zope.schema = 3.2.2
  • zope.event = 3.2.2

Wow: to upgrade from 3.2.1 to the 3.2.2 maintenance release, you had to download and install over 6 MB of data. Worse, can you tell how much the zope.tal or zope.schema packages changed between releases 3.1 and 3.2? The minor version bump indicates that there have been some interesting feature improvements and bug fixes. But this version number is global: perhaps people were perfectly happy with the 3.1 version of zope.event and it did not change at all between the 3.1 and 3.2 releases, yet it still got a minor increase in its version number. Wouldn’t it be good if only the packages that changed had their version numbers updated?

Known Good Sets

Starting in 2006, work was done to break Zope 3 up so that it could be managed and distributed as individual Python eggs. This work was completed in 2007, and Zope 3.4a was the first release to use a completely egg-based system. Now each part of the framework can be assigned its own version number and managed independently. With this system it is very easy to upgrade to a newer version of just one package in your application, without being forced to upgrade everything else in the framework.

A list of the Python eggs that make up a framework or an application, together with the version numbers that have been tested to work together, is called a Known Good Set (KGS).

Grok 0.10.1 was the first Grok release to be composed of a Known Good Set. Let’s look at a portion of the set of eggs in this release, and a set of eggs in the Grok 0.11 release:

Release, Package Versions (Size)
Grok-0.10.1
  • grok = 0.10.1 (1.9 MB)
  • martian = 0.9 (0.4 MB)
  • zope.tal = 3.4.0b1 (1.3 MB)
Grok-0.11
  • grok = 0.11 (1.9 MB)
  • martian = 0.9.1 (0.4 MB)
  • zope.tal = 3.4.0b1 (1.3 MB)

This is much better than the monolithic tarball of the earlier Zope 3 releases, as you now have additional information easily available. You can see that the martian package has had a single maintenance release between Grok 0.10.1 and Grok 0.11, and that the zope.tal package has not changed at all between the two releases.
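The same comparison can be done mechanically. The following is a minimal sketch in Python, assuming the pins of the two releases have been copied into plain dictionaries. The names grok_0_10_1, grok_0_11 and changed_pins are made up for this illustration, and only the three packages listed above are included:

```python
# Illustrative only: the three pins shown in the table above, per release.
grok_0_10_1 = {"grok": "0.10.1", "martian": "0.9", "zope.tal": "3.4.0b1"}
grok_0_11 = {"grok": "0.11", "martian": "0.9.1", "zope.tal": "3.4.0b1"}


def changed_pins(old, new):
    """Return {package: (old_version, new_version)} for pins that differ."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


for name, (before, after) in changed_pins(grok_0_10_1, grok_0_11).items():
    print(f"{name}: {before} -> {after}")
```

Only grok and martian are reported; zope.tal is pinned to the same version in both sets, so it does not appear.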

The full set of eggs and version numbers that made up the official Grok releases can be found at http://grok.zope.org/releaseinfo/

Managing eggs with zc.buildout

The zc.buildout tool was created primarily as a way of making it easy to collect together a set of eggs usable by a Python application. While a full discussion of zc.buildout is beyond the scope of this tutorial, we can look at how you can use this tool to work with eggs that are newer than the ones supplied in a known good set.

If you have created a Grok application using grokproject, then you will have a buildout.cfg file with these lines in it:

[buildout]
extends = http://grok.zope.org/releaseinfo/grok-0.11.cfg
versions = versions

This tells your buildout configuration that it is extending the Grok 0.11 configuration. The Grok release configuration files only ever contain a single section labelled [versions]. The versions = versions line tells buildout that we want to use all the eggs with the versions specified by a Grok release.

What if you wanted to update just a single egg to use a newer version than the one specified in an official Grok release? You would add the following below your [buildout] section:

[versions]
martian = 0.9.2

Run ./bin/buildout to update your application and you will be using version 0.9.2 of martian instead of 0.9.1. Versions of buildout that support version constraints in the [versions] section also allow you to specify minimum and maximum versions. Each line is still a package name, an equals sign, and then the constraint. You could write:

[versions]
martian = >=0.9.2, <1.0

This would ensure that you used at least the 0.9.2 version, and when you ran ./bin/buildout to update your application, it would upgrade to the newest maintenance release made in the 0.9 series. It would not upgrade to a 1.0 release though.
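To make the meaning of such a range concrete, here is a rough sketch in Python. It is not how buildout or setuptools compare versions (real version comparison also handles suffixes such as b1 or dev). The helper names as_tuple and in_range are hypothetical, the candidate release numbers are invented, and only purely numeric versions are handled:

```python
def as_tuple(version):
    """Turn a purely numeric version such as '0.9.2' into (0, 9, 2)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, minimum, below):
    """True if minimum <= version < below, compared numerically."""
    return as_tuple(minimum) <= as_tuple(version) < as_tuple(below)


# Hypothetical release numbers, used only to exercise the range.
for candidate in ["0.9.1", "0.9.2", "0.9.5", "1.0"]:
    verdict = "allowed" if in_range(candidate, "0.9.2", "1.0") else "rejected"
    print(candidate, verdict)
```

In this model, 0.9.2 and 0.9.5 fall inside the range, while 0.9.1 is too old and 1.0 is excluded by the upper bound.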

Trail blazing (working with unreleased eggs)

Cave men beat well-worn paths to the watering hole, and developing with tested releases of Grok is equivalent to staying on this beaten path. But what if you want to go on a hunting expedition for mammoths and these animals don’t frequent the watering hole near the cave?

Let’s look at an example: the Viewlets work done at the Snow Sprint 2008. How can you try out this experimental branch in your project? First you will need to check out that feature branch somewhere inside your project. The src directory is the standard location for the parts that you are actively working on:

$ cd ~/buildouts/my-grok-project/src
$ svn co svn://svn.zope.org/repos/main/grok/branches/snowsprint-viewlets2/ grok

You will notice that checking out Grok also checks out Martian using the svn:externals feature. Tell buildout that you want to use these Subversion checkouts for development by adding their paths to the develop setting. This feature branch also requires some newer eggs, which are specified in the grok/versions.cfg file. Change your extends setting so that it points to these development versions instead of an official Grok release:

[buildout]
develop = . src/grok src/grok/martian
extends = src/grok/versions.cfg

Run ./bin/buildout to update your application. There is just one last tricky bit left. If you look at parts/app/runzope, the Python program which launches your Grok application, you will see:

sys.path[0:0] = [
  '/home/username/buildouts/my-grok-project/src',
  '/home/username/buildouts/my-grok-project/src/grok/src',
  '/home/username/buildouts/shared/eggs/martian-0.9.3-py2.4.egg',
  ...
]

Notice that martian is still loaded from the released 0.9.3 egg rather than from your checkout. This is because the src/grok/versions.cfg file pins martian to 0.9.3, while the checkout in development may declare that it is a newer version. You can tell what the development version of martian is by looking at its setup.py file:

setup(
   name='martian',
   version='0.9.4dev',
   ...
)

You could declare that you want to use this version by overriding the value in your buildout configuration file. However, when you later update your martian checkout, this version number may change. If you specify nothing as a version, then the newest version will be used, which is what you want:

[versions]
martian =

Now run ./bin/buildout one more time and you are all set to develop with a feature branch of Grok.
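Why an empty pin helps can be shown with a small model in Python. This is a simplification of the behaviour described above, not buildout’s actual resolution code. The function name select and the candidate list are hypothetical, and the preference for development checkouts is an assumption of this sketch:

```python
def select(candidates, pin):
    """Pick a distribution from (version, source) pairs.

    Simplified model: an exact pin only matches that version; an empty
    pin places no restriction, so the development checkout is eligible.
    Development checkouts are preferred whenever they are eligible
    (an assumption of this sketch).
    """
    eligible = [c for c in candidates if not pin or c[0] == pin]
    develop = [c for c in eligible if c[1] == "develop"]
    return (develop or eligible or [None])[0]


# The two martian distributions from the example above.
candidates = [("0.9.3", "egg"), ("0.9.4dev", "develop")]

print(select(candidates, "0.9.3"))  # pinned: the released egg is chosen
print(select(candidates, ""))       # unpinned: the checkout is chosen
```

With the 0.9.3 pin in place, the 0.9.4dev checkout can never match, which is why the released egg showed up in sys.path.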

Developing with a development.cfg file

What if you’d like to keep your buildout.cfg file clean, as you are using that file for doing production installations of your project and you don’t want to forget to undo the changes to your buildout.cfg file when you are done experimenting?

You can have a separate file, typically named development.cfg (or dev.cfg if you are a lazy typist) that overrides just the development changes that you have made. Putting all of the changes above together it would look like this:

[buildout]
develop = . src/grok src/grok/martian
extends = buildout.cfg src/grok/versions.cfg

[versions]
grok =
martian =

Now a production buildout uses only the default buildout.cfg file, and you explicitly request a development buildout with ./bin/buildout -c development.cfg. Remember, you can also have a ~/.buildout/default.cfg file that contains defaults specific to your own personal set-up. This way you can separate the configuration neatly between:

  • Your own personal setup in ~/.buildout/default.cfg
  • Active development configuration in project/development.cfg
  • Default configuration in project/buildout.cfg
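The way development.cfg layers its pins on top of the files it extends can be pictured as later files overriding earlier ones. The sketch below models only the [versions] section, using Python’s configparser. It illustrates the override order described above and is not buildout’s own merging code. The pins are excerpts taken from the examples in this tutorial, and the function name merged_versions is hypothetical:

```python
from configparser import ConfigParser

# Excerpts only; the real files contain many more pins.
RELEASE_CFG = """
[versions]
grok = 0.11
martian = 0.9.1
zope.tal = 3.4.0b1
"""

BRANCH_VERSIONS_CFG = """
[versions]
martian = 0.9.3
"""

DEVELOPMENT_CFG = """
[versions]
grok =
martian =
"""


def merged_versions(*layers):
    """Merge [versions] sections; later layers override earlier ones."""
    merged = {}
    for text in layers:
        parser = ConfigParser()
        parser.optionxform = str  # keep package names case-sensitive
        parser.read_string(text)
        merged.update(parser["versions"])
    return merged


pins = merged_versions(RELEASE_CFG, BRANCH_VERSIONS_CFG, DEVELOPMENT_CFG)
for name, pin in pins.items():
    print(f"{name} = {pin or '(unpinned)'}")
```

In this model grok and martian end up unpinned, so the development checkouts can be used, while zope.tal keeps the version from the release.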

Happy mammoth hunting!