Bridging the Developer/DBA Divide: Advances in Database Source Control

For over 10 years now, I’ve been a huge advocate of integrating database development, design and architecture practices into application development processes. Time and time again I’ve encountered poorly performing database systems whose foundations were laid in the impedance mismatch between application developers and database developers. While ORMs are the most prominent area of mismatch, I’ve chosen to focus on

For over 10 years now, I’ve been a huge advocate of integrating database development, design and architecture practices into application development processes. Time and time again I’ve encountered poorly performing database systems whose foundations were laid in the impedance mismatch between application developers and database developers.

While ORMs are the most prominent area of mismatch, I’ve chosen to focus on another area which has been ripe for improvement: how software is maintained and packaged for delivery from the development environment into the eventual production environment.

Application developers have a rich toolset for managing versions of their code, reviewing change history and packaging code for shipment. Today, application developers can use continuous integration techniques and distributed development teams in order to rapidly produce code that is constantly tested and deployed over and over again.

Sadly, the database developer has had to struggle with limited tools for promulgating changes from environment to environment. In the early 2000’s we had basic tools for comparing databases and making them similar, but we had few ways to track changes, and limited techniques for generating portable delivery packages.

In fact, before the advent of the DBMS_METADATA package in Oracle 9i, we had extremely poor facilities for re-creating databases outside of export / import, making database changes a second or third-class action within many development shops.

I became focused on this topic around 2002, when I became a director of database development for a online Internet company. When I arrived at my new job, I was amazed at the amount of manual effort required to create database deployment packages. Each database developer wrote out scripts by hand, using inconsistent file names, one-off code and untested procedures. It was no surprise that database deployments were error-prone, slow and basically a crap-shoot every time.

One thing that the company had in place which the application developers were able to take advantage of was a reasonably robust software change management function. Everything necessary for production deployments had to be checked in to our source code control system and packaged by our CM group for delivery. Unfortunately, before I arrived, database script files lacked any kind of transparency as to their purpose within the source code control system.

I set about to change that.

I instituted a standard that we would have one script file per object, so that it would be easy to compare script files against database objects. Script file names also were to include mnemonics indicating their purpose (mk = create, rm = delete / remove / drop, mv = rename, ch = change / alter), their object type (tbl = table, idx = index, sps = package specification, spb = package body), their schema and their object name. If multiple changes to an object were required as a result of multiple bugs or features, the bug / feature number would be integrated into the name. Script files would include source code control tags or property keywords so that the source code control system could inject necessary documentation like author and date / time. Everything was designed to increase transparency and support automatic generation of script files.

Over time tools like Oracle’s SQL Developer and Red Gate’s Schema Compare for Oracle matured to the point where they could generate script files at the schema level and the object level in a similar manner — but they remained unaware of source code control conventions.

And while we maintained a library of scripts that could generate our individual files, and used DDL auditing to track changes from the database, we still had a difficult time ensuring that our source code control system accurately reflected what we had developed in the database.

We did our development in a single, shared development instance directly in the database — we didn’t write scripts first. We worked directly in the database, trying out ideas and implementing them immediately — in essence we did immediate integration. We’d take time at the end of our release development cycle (every 3 months), to assess what was changed, generate our scripts and then check those into source code control. Sometimes we’d generate an entire schema and then attempt to diff the script files against what was in source code control to make sure we captured everything.

There were days when I wondered if we should write our scripts first before making changes in the database, debugging them by running them over and over again (with appropriate rollback / backout scripting to handle failures). This approach is similar to the approach used by the popular Liquibase database migration tool, which unfortunately uses a generic XML-based database change DSL, but otherwise supports an iterative development process using script files as change sources.

I struggled with the idea that we needed to keep 3 data sources all synchronized: the scripts on the filesystem, the source code control system, and the data dictionaries of the databases. And while keeping a filesystem synchronized with a source code control system has gotten easier over the years, it’s been almost impossible to synchronize the database schema structure from within the database against a filesystem representation of it.

Until now.

I’ve been following with keen interest Red Gate’s efforts to create a source code control for Oracle database development tool and have recently had a chance to take a look at the production release of version 1.0

With support for source code control repositories based on Subversion and Team Foundation Server, Source Control for Oracle enables synchronization of database schemas directly with source code control.

Based on the 3.0 Release of Schema Compare for Oracle, the developers at Red Gate have created a filesystem representation of a schema’s data dictionary that can be compared to by their tools as if the filesystem representation was just another database. Not only that, but differences can be resolved in either direction: from source control to database or from database to source code control. More importantly, promoting changes from a database into source code control takes advantage of source control properties like bundling changes into a commit, commenting on the nature of the change, and maintaining version history.

Quite frankly I’m excited about the possibilities these capabilities open up for database software development. When using a central source code control repository for database changes, I can easily enable distributed development for developers with private, local databases. Or, I can still have a central development database and use the tool to keep the source code control system up-to-date.

Source Control for Oracle mirrors the software development process of “refreshing” from source code control before starting work — enabling database developers to review incoming changes and apply them before starting new work items. I also like the fact that it adds minimal changes to database developers who are used to working within a database instead of with script files. Source Code Control for Oracle manages the heavy lifting of generating scripts and managing the check in process for developers without even exposing them to Subversion or TFS commands.

Combined with Schema Compare for Oracle and its command-line capabilities, you’ve now got the beginnings of an ability to do continuous integration and deployment of database changes all the way from development to production.

For future versions I hope to see Red Gate including support for other source code control systems like Git & Mercurial, the ability to provide file naming standards, and as a possible major feature, the ability to include the delta changes (think alter scripts) as part of the repository. Perhaps the ability to create release branches and upgrade branches as well. You can see some of these ideas, comment on them and create your own at the Red Gate UserVoice page for Source Control for Oracle.

I’m looking forward to what’s next in this product but for now it marks an excellent step forward for those of us in the database development space working to keep pace with our application development brethren.