One consistent problem we've had is of syncing. We use revision control tools to share our code, and that works great (so long as you have decent tools and don't rely on the really basic free stuff). However, databases are not synced. It is possible to send over SQL dumps but this becomes a big pain for a few reasons:
- It's unfeasibly difficult to merge two different database dumps that were worked on simultaneously, for many reasons, but especially due to key-conflicts.
- SQL dumps have large file sizes, meaning lots of bandwidth and lots of waiting.
- It takes a few minutes to generate an SQL dump and then another few minutes to import it, and that's if things go right (charset issues, version issues, and platform compatibility issues can make problems take up to an hour to sort out sometimes). This doesn't seem much, but when people are collaborating closely it really adds up.
- You lose all the benefits of revision control (versioned backups in particular).
- For someone new to start up on the project they need to source a copy of the database – and God help the team if it has been lost (we've never done it, but obviously it could easily happen because the DB and project files are kept totally apart)
- If developers are in different time zones you often can't just ask for a database to quickly get sent over because they may well be asleep! So you have to worry about scheduling it
The net result of this is that after a certain point in development I've simply started maintaining my own database, and making sure it stays structurally-valid (contains all the latest tables, etc) via fiddling about. This isn't just time consuming and messy, it has some nasty scalability ramifications for working practices:
- It ends up being me doing things like setting permissions, or populating categories, because I need to see it on my copy, or because the changes are ones I can do real quick but would take 10 times longer if it went through a briefing process.
- It ends up being me deploying the site, because I have the real database.
- And then it ends up being me maintaining it too!
So we decided to solve it!
We're ditching MySQL for our own development, and we've written a totally new database for internal usage (extreme measures!). I should note this does not affect regular ocPortal users at all, it's simply implemented via a new database driver which happens to actually be a full implementation of a database.
The new database has the following special features that allow it to solve our problem:
- Each table is a directory and each record is a file (it can be an XML file, or PHP-serialization which is faster). This allows us to simply run data through our revision control system in the same way we already do for everything else.
- Auto-increment is replaced with key randomisation, so we are able to easily merge databases without key conflicts.
- There are no indexes, other than the primary key index (which is done via filename), meaning indexes can't become corrupt
- It is totally platform agnostic, so it runs wherever ocPortal can run.
- It doesn't require any special configuration at all (it binds itself to ocPortal's configuration automatically, it's completely build in).
- No special software needs installing, it's all PHP.
My biggest concern about this was performance. And yes initially it was taking about 10 minutes just to do the ocPortal installers database steps . But fortunately via a series of tricks it is now only about 50% slower than MySQL which is just fine for development. Obviously on a live site you wouldn't want that kind of performance degradation, but for developers working on a site which nobody else uses, it's irrelevant.
I'm really pleased we've been able to do this. We've basically implemented SQL and all the main features other database software has, and the database software just slots in as a part of ocPortal. Very few ocPortal changes were required, and those that were mostly were just bug fixes for places in our code where we made assumptions about how MySQL behaves (which needed fixing anyway).
It's great to see technology and business working in such harmony…
- Total coding time: about 30 hours.
- Total lines of code: about 2500 (that's darned compact for an SQL and database implementation!)