Migraine
Migraine helps with painful data migrations.
It provides a framework for running cross-model and SQL-to-model data migrations for Django. Migraine's Migrator classes provide a declarative approach to importing data from external databases and Django models into other Django models, with a syntax somewhat similar to Django's ModelForms. Migraine will also run your migrations in an order derived from inter-migration dependencies.
Building a migration project
To use Migraine, you will need to create a project containing two basic elements: a migrators package containing one module per Django app, and a bootstrap script you will use to set up Django settings and start a migration.
We recommend placing Migraine projects outside of your main
application's source code, so make sure the target app is available on
the PYTHONPATH. You can append its path in an environment variable or in
the migrate.py script.
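A minimal sketch of appending the target app's path from inside migrate.py. The helper name add_app_to_path and the ../myproject location are hypothetical placeholders, not something Migraine prescribes.

```python
import os
import sys


def add_app_to_path(app_root):
    """Append app_root to sys.path once so the target app can be imported."""
    app_root = os.path.abspath(app_root)
    if app_root not in sys.path:
        sys.path.append(app_root)
    return app_root


# "../myproject" is a hypothetical location for the project holding the app.
add_app_to_path(os.path.join(os.getcwd(), "..", "myproject"))
```

Calling the helper before importing the migrators package should be enough for imports such as polls.models to resolve.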
Assuming we want to migrate to a single app called polls, here is
what our project structure will look like:
    polls_migration/
        __init__.py
        migrate.py
        migrators/
            __init__.py
            polls.py
We created a migrate.py module that will contain our configuration
code, and a polls.py module where we will define our migrator
classes.
Writing a bootstrap script
Your migrate.py script needs to do two things:
- Import your migrators package.
- Call run_from_command_line.
You can also use it for additional configuration, like loading Django settings.
Here is a basic example:
    #!/usr/bin/env python
    import sys

    from migraine import run_from_command_line

    import migrators

    if __name__ == "__main__":
        run_from_command_line(migrators, sys.argv)
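For the additional configuration mentioned above, one common way to load Django settings is to set the DJANGO_SETTINGS_MODULE environment variable before anything touches your models. The sketch below is an assumption about how you might do it in migrate.py; configure_settings and myproject.settings are hypothetical names.

```python
import os


def configure_settings(module_name="settings"):
    """Point Django at a settings module unless one is already configured."""
    # setdefault keeps any value already exported in the shell.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", module_name)
    return os.environ["DJANGO_SETTINGS_MODULE"]


# "myproject.settings" is a placeholder; call this before run_from_command_line.
configure_settings("myproject.settings")
```

On Django 1.7 and later you will likely also need to call django.setup() after this and before importing models.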
Defining Migrators
A Migrator class defines how we want to process the data we're going to
migrate. It can be any class providing a run_migration method.
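Since any class with a run_migration method qualifies, a hand-written migrator can be very small. The sketch below assumes that the class is instantiated without arguments and that run_migration takes no arguments beyond self, neither of which is specified here; the class name and its body are purely illustrative.

```python
class CleanupMigrator:
    """Hypothetical migrator that does not use a Migraine base class."""

    def __init__(self):
        self.log = []

    def run_migration(self):
        # Assumption: called with no extra arguments.
        # Real work (for example, deleting stale rows) would go here.
        self.log.append("cleanup done")


# Listed by name so the module exposes it for detection.
migrators = ['CleanupMigrator']
```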
Inside each submodule of the migrators package, define a list called
migrators containing the names of the classes from that submodule that
you want Migraine's migration-running mechanism to detect.
Model to Model migrations
Migraine provides a base ModelToModelMigrator that will create a
single record in the target model for each record in the source model.
We will use it to migrate data from an old model called OldPoll to a
fresh model called NewPoll.
    # our app's models:
    from django.db import models

    class OldPoll(models.Model):
        old_poll_name = models.CharField(max_length=30)

    class NewPoll(models.Model):
        new_poll_name = models.CharField(max_length=36)

    # migrators/polls.py:
    from migraine.migrators import ModelToModelMigrator
    from polls.models import OldPoll, NewPoll

    migrators = ['PollsMigrator']

    class PollsMigrator(ModelToModelMigrator):
        source_model = OldPoll
        target_model = NewPoll
        fields = [
            ('old_poll_name', 'new_poll_name')
        ]
We've just created a Migrator that will copy OldPolls over to NewPolls.
You can also define more complex rules for processing fields. Let's
assume we want the old polls' names to end with "(old)". For each such
field we can define a method that will return a processed value.
Migraine uses a convention of prefixing the names of such methods with
import_:
    class AppendingPollsMigrator(ModelToModelMigrator):
        source_model = OldPoll
        target_model = NewPoll

        def import_new_poll_name(self, source):
            return source.old_poll_name + ' (old)'
The effect of running such a migration is identical to running
new_poll.new_poll_name = source.old_poll_name + ' (old)'
for each newly created NewPoll object.
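To make the convention concrete, here is a rough sketch of how a name-based lookup of this kind can work in plain Python. It is not Migraine's actual implementation; SketchMigrator, target_fields and build_values are invented for illustration.

```python
from types import SimpleNamespace


class SketchMigrator:
    """Stand-in for a migrator; it only shows the import_<field> lookup."""

    target_fields = ["new_poll_name"]

    def import_new_poll_name(self, source):
        return source.old_poll_name + " (old)"

    def build_values(self, source):
        values = {}
        for field in self.target_fields:
            # Look for a method named import_<field> and call it if present.
            handler = getattr(self, "import_" + field, None)
            if handler is not None:
                values[field] = handler(source)
        return values


old_poll = SimpleNamespace(old_poll_name="Favourite colour")
print(SketchMigrator().build_values(old_poll))
```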
Instead of a source_model, you can also define a query_set field
if you need more control over the source data.
SQL table to model migrations
Migraine can handle importing data from a raw SQL database. For this,
there is an SQLToModelMigrator.
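    # Assumption: SQLToModelMigrator lives next to ModelToModelMigrator.
    from migraine.migrators import SQLToModelMigrator

    from blog.models import Author, BlogPost

    migrators = ['BlogPostMigrator']

    class BlogPostMigrator(SQLToModelMigrator):
        source_db = 'oldblog'
        source_table = 'blog_posts'
        target_model = BlogPost
        skip_on_match = ['name']
        fields = [
            ('title', 'title'),
            ('content', 'content'),
        ]

        def import_author(self, source):
            # get_or_create returns an (object, created) tuple; keep the object.
            author, _ = Author.objects.get_or_create(name=source['author_name'])
            return author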
This simple example will populate the BlogPost model with data from
the rows of the blog_posts table. The source argument of the import_
methods contains a dict mapping column names to values for the source
table row being processed.
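As an illustration of what such a row dict looks like, the sketch below builds one from an in-memory SQLite table using only the standard library. It does not show how Migraine reads rows internally; the table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog_posts (title TEXT, content TEXT, author_name TEXT)")
conn.execute("INSERT INTO blog_posts VALUES ('Hello', 'First post', 'Ann')")

cursor = conn.execute("SELECT * FROM blog_posts")
columns = [description[0] for description in cursor.description]

# Each row becomes a dict keyed by column name, like the source argument.
rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
print(rows[0]["author_name"])
```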
The source_db field declares the database to be used. The database
needs to be declared in the DATABASES dict in Django settings. The
field is optional and defaults to default.
Instead of source_table, you can define an sql field. This will
cause the Migrator to use the query's result rows as the source feed.
Running migrations
To launch all migrations, run your bootstrap script:
python migrate.py
You can also specify individual migrations to run. To see a list of
available migrations, run migrate.py --list.
Migrator dependencies
Migraine can sort your migrations using topological sorting based on
inter-migration dependencies. To use this feature, declare a
depends_on field on your Migrators containing a list of migrator names:
    # migrators/foo.py
    migrators = ['MigratorA', 'MigratorB']

    class MigratorA:
        depends_on = ['foo.MigratorB']
        # ...

    class MigratorB:
        # ...
        pass
In this example, MigratorB will always be run before MigratorA.
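The ordering can be reproduced with the standard library's graphlib module (Python 3.9+). This is only a sketch of topological sorting in general, not the code Migraine uses; the dependencies dict is a hand-written stand-in for the depends_on declarations.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Maps each migrator to the set of migrators it depends on.
dependencies = {
    "foo.MigratorA": {"foo.MigratorB"},
    "foo.MigratorB": set(),
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

A circular dependency would make static_order() raise graphlib.CycleError, so cycles cannot go unnoticed with this approach.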
Running tests
    cd testapp
    pip install -r requirements.txt  # you probably want to make a virtualenv for this
    DJANGO_SETTINGS_MODULE=settings PYTHONPATH=`pwd` py.test