What happens when your data model changes?
People who have been following this tutorial for a while may have noticed that every time I come out with a new version, the existing users and jobs disappear and we start out with a blank slate again.
That's because changing the data schema of a happs-with-macid web app (aka migrating) is a chore.
It's a chore with traditional web apps too. But it's probably more difficult with macid, especially given the sparse documentation.
This isn't a problem for happstutorial.com, because typically there are only a few dozen users and jobs, plus whatever dummy data I've entered myself. Who cares? So far, rather than migrating, I've just wiped the slate clean.
However, that isn't going to work for your latest facebook-killer.
The good news is that there is a way to migrate HAppS state through successive schema versions. It's sufficiently documented if you know where to look, and it's not too painful once you've gotten used to it.
The main challenge is finding documentation.
The best documentation I have found is two threads in the happs googlegroup, along with the migration example that Eelco Lempsink produced during the discussion. I have taken this migration example and included it in the happstutorial distribution, with some improvements of my own.
The googlegroup migration threads are here and follow-up here.
The migration example I've included with the tutorial is at migrationexample.
My advice is to try the demo in migrationexample first, following the directions in the README file, read through the source files to understand how the demo works, and then refer to the googlegroups thread if necessary.
Possibly this is sufficient documentation for you to start doing migrations yourself. If so, great, ignore what follows.
If you feel like you could use more guidance, read on for some notes I put together on my own migration experience.
Though I haven't started migration for the toy job board in happs-tutorial, I am using it for my commercial project that is under development. This will be the basis of the notes that follow.
I almost didn't include these notes, because I wanted to provide an easy step by step example that referenced the toy job board, as I have done for other tutorial topics. However, doing this will be quite a bit of work, and after many weeks I still haven't gotten around to it. I want to get the information out, so I decided to just share what I have.
I apologize in advance if some of this seems confusing or fragmentary. I did my best, and I will try to clean things up and integrate the example into the tutorial rather than snipping from an external app.
Ok... General migration notes:
Old state module names should not change, nor should the modules be moved around in the directory structure (which is really just another kind of name change). Therefore it's a good idea to settle on a sane directory hierarchy for schema versions before you start doing migrations. I recommend keeping app state in one monolithic file, in a directory devoted to state versions. This makes schema migrations much easier, because all references to the old state can then be handled via import qualified StateLast as Old and referenced as Old.whatever, including the bits of logic that stay the same between the old monolithic state file and the new one. Resist the temptation to split state into multiple files. Bear in mind that, due to template haskell, the order of data structure declarations becomes significant, which is usually not the case with haskell. (I seem to recall this annoyance was part of the reason why I started splitting things into multiple files to begin with, which I later regretted because it made migration that much harder.)
Don't call HAppS State "State", as this conflicts with the State datatype in Control.Monad.State. I usually call my state datatype AppState.
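Here is a small sketch of the naming point, nothing HAppS-specific. It assumes the mtl package is available, and the visitCount field and recordVisit function are made up for illustration. Because the application type is called AppState, it can sit next to the State monad from Control.Monad.State without any qualified imports.

```haskell
module Main where

import Control.Monad.State (State, execState, modify)

-- Application state, deliberately NOT called "State" so it cannot be
-- confused with the State monad imported above.
-- The visitCount field is hypothetical.
newtype AppState = AppState { visitCount :: Int }
  deriving (Show, Eq)

-- A state-modifying action written against the State monad.
recordVisit :: State AppState ()
recordVisit = modify (\s -> s { visitCount = visitCount s + 1 })

main :: IO ()
main = print (execState (recordVisit >> recordVisit) (AppState 0))
```

Had the datatype been named State, every mention of it in this module would need disambiguating against the imported one.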
There will be code duplication, for the functions that get transformed to state modifiers in template haskell. This is a bad code smell, but I think it's unavoidable for the mkMethods directive with all the methods template haskell needs.
In the old state file (being migrated from), make sure it exports everything via module OldState ( ... everything gets exported here ...) where.... My way to do this is load the state module, :browse in ghci, copy the output, and clean it up using emacs regexen. In emacs, dired-mark-files-regexp and dired-do-query-replace-regexp are your friends.
Then, (my way), cd StateVersions, cp AppState1.hs AppState2.hs (or whatever version number we're on.)
Seems almost too obvious to say, but if you have live customers you're not going to want to run this migration without having tested it in a sandbox first. Create your sandbox, which should include a snapshot of live customer data. A good way to create a snapshot is tar -czvf _local.tar.gz _local on your live data. Test thoroughly on this snapshot before doing the live migration. And even if you think you've tested enough, tar snapshot your live data again right before the migration, just in case.
Make a live data snapshot and copy it to your migration sandbox: _local.tar.gz. (If there is an unwieldy large amount of data, create a smaller data set by setting up a server identical with the live server and doing some actions manually.)
cd StateVersions; cp AppState1.hs AppState2.hs (or whatever version we're on). For now, we just want a placeholder that has AppState2 behaving exactly like AppState1. Change references from AppState1 to AppState2 in the app code. Try running the server: it should compile and run, but all the data is lost (because we haven't written the migration yet).
Roll back from the backup taken earlier: rm -rf _local and tar -xzvf _local.tar.gz. (An explicit reminder to roll back from the live data tar may be omitted from future steps, but basically you keep rolling back until you get a successful migration.)
modify AppState (or whatever your main State datastructure is), say, adding a field. Don't write a Migrate instance yet. Try running. You'll probably get an error like "Exception: Non-exhaustive patterns in case." Kind of a crappy error message if you ask me, but ok. What's happening is the pre-existing data in the _local directory isn't compatible with your modified AppState. If you rm -rf _local and try running again, it should work now. But of course you have lost all your data, and need to rollback the live data again for the next step.
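To picture why stale data in _local trips up a modified type, here is a toy sketch. It is not how HAppS actually serializes anything: the list-of-strings encoding and the names ProfileV1, ProfileV2, encodeV1 and decodeV2 are all invented for the example. The point is only that bytes written for a three-field shape do not fit a decoder that expects four fields.

```haskell
module Main where

-- Old shape: three fields (hypothetical).
data ProfileV1 = ProfileV1 String String String
  deriving (Show, Eq)

-- New shape: one extra field in front (hypothetical).
data ProfileV2 = ProfileV2 String String String String
  deriving (Show, Eq)

-- Toy "serializer" for the old shape: one list element per field.
encodeV1 :: ProfileV1 -> [String]
encodeV1 (ProfileV1 c bl av) = [c, bl, av]

-- Toy "deserializer" for the new shape: only accepts four fields.
decodeV2 :: [String] -> Maybe ProfileV2
decodeV2 [e, c, bl, av] = Just (ProfileV2 e c bl av)
decodeV2 _              = Nothing

main :: IO ()
main = do
  let onDisk = encodeV1 (ProfileV1 "contact" "blurb" "avatar")
  -- Data written by the old code does not decode as the new type.
  print (decodeV2 onDisk)
```

Running this prints Nothing. The toy decoder fails politely; the real thing evidently fails with the pattern match error quoted above.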
Now make the necessary changes in code for migration. See eelco's and my uploads to the happs google group (tk happs tutorial). In summary: modify the Version instance for AppState and add a Migrate instance, along the following lines. First,
import qualified StateVersions.AppState1 as Old
...
-- we'll say 2 because this is StateVersions/AppState2.hs
-- I don't think it matters what number you use as long as it's higher than the last version,
-- but I'd like to have a core dev confirm that intuition.
-- I wonder too what happens if you screw up this version number somehow. EG, what if you specify a version number
-- identical to the version you're migrating from?
instance Version AppState where
    mode = extension 2 (Proxy :: Proxy Old.AppState)
This won't compile yet: you'll get an error about a missing (Migrate Old.AppState AppState) instance arising from the use of extension. So we supply the instance:
instance Migrate Old.AppState AppState where
    migrate (Old.AppState s d) = AppState (migrates s) (migrated d)
migrates s = undefined
migrated (us, aus, rs, rus) = undefined
We use undefined just to get it to compile and have something to darcs commit, and then write sensible code later.
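For readers who want a mental model of what the version number and the Migrate instance buy you, here is a toy sketch. It is an assumption-laden imitation, not the HAppS API: the Migrate class is redefined locally, and Stored, load, ProfileV1 and ProfileV2 are made-up names. The idea it illustrates is that stored data carries a version tag, and on loading, anything tagged with the old version is routed through migrate.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

-- Toy stand-in for the Migrate class; NOT imported from HAppS.
class Migrate old new where
  migrate :: old -> new

-- Version 1 and version 2 of a hypothetical record.
data ProfileV1 = ProfileV1 String String String
  deriving (Show, Eq)
data ProfileV2 = ProfileV2 [String] String String String
  deriving (Show, Eq)

-- The new field starts out empty for migrated records.
instance Migrate ProfileV1 ProfileV2 where
  migrate (ProfileV1 c bl av) = ProfileV2 [] c bl av

-- Toy model of data on disk: each value is tagged with its version.
data Stored
  = StoredV1 ProfileV1
  | StoredV2 ProfileV2

-- Loading always yields the current type, migrating when needed.
load :: Stored -> ProfileV2
load (StoredV1 old) = migrate old
load (StoredV2 new) = new

main :: IO ()
main = do
  print (load (StoredV1 (ProfileV1 "contact" "blurb" "avatar")))
  print (load (StoredV2 (ProfileV2 ["a@example.com"] "contact" "blurb" "avatar")))
```

In this toy, a value saved under the wrong tag would simply be read through the wrong branch, which is one reason to be careful with the version number.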
*Main> :! grep -irn AppState1 *.hs
Controller.hs:27:import StateVersions.AppState1
ControllerAppMigration.hs:18:import StateVersions.AppState1
......
View.hs:29:import StateVersions.AppState1
These are the places in the code that need to be switched to use AppState2 instead.
Let's test this by adding an emails field to UserInfos
Actually, first let's try adding an email field to Macid1 and see if we get an error.
We do get an error, and it's a weird error:
*** Exception: src/Macid1/Repos.hs:45:2-24: Non-exhaustive patterns in case
at $(deriveSerialize ''Repos)
Is the pattern non-exhaustive because, somewhere behind the scenes, there has been a macid version bump when it detected that the schema changed?
Dunno, but let's try now by switching state to AppState2.hs
Let's also note the latest checkpoint in the _local directory. It is: ...
ls -lth _local/patch-shack_state/ | head -n2
... checkpoints-0000000014
and back it up:
tar -czvf _local.beforemigration.tar.gz _local
Step1, cd StateVersions; cp AppState1.hs AppState2.hs. Ok, that works. (Haven't actually used migration machinery yet.)
Now, let's try using the migration machinery, but the migrate is actually just id (so no data structure actually changes).
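A note on what "just id" means in practice: the old and new types live in different modules, so even when nothing changes the migration cannot literally be id. It has to unwrap each old constructor and rewrap the new one. A toy sketch, with the Migrate class redefined locally (not the HAppS one) and with Old-prefixed names standing in for what would really be a qualified import of the old state module:

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

-- Toy stand-in for the Migrate class; NOT imported from HAppS.
class Migrate old new where
  migrate :: old -> new

-- "Old" schema; in a real project this would be the old state module.
newtype OldUserName = OldUserName String
  deriving (Show, Eq)
data OldSession = OldUserSession OldUserName
  deriving (Show, Eq)

-- "New" schema, structurally identical to the old one.
newtype UserName = UserName String
  deriving (Show, Eq)
data Session = UserSession UserName
  deriving (Show, Eq)

-- Identity in spirit: same data, new constructors.
instance Migrate OldSession Session where
  migrate (OldUserSession (OldUserName u)) = UserSession (UserName u)

main :: IO ()
main = print (migrate (OldUserSession (OldUserName "alice")) :: Session)
```

This is the same pattern the migrates function uses in the full example further down.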
The following is a snip from a working migration instance, where one field in an interior data structure has been added. (Specifically, UserProfile has gone from a 3 argument constructor to a 4 argument constructor).
You might think this looks like a lot of boilerplate for adding a single field, and I would agree. The good news is that your migration code will look similar if you are making more than just that one change, and the problem is still tractable.
And of course, migrations with a database back-end are no picnic either.
Migrate instance example (add a field to UserProfile):
instance Migrate Old.AppState AppState where
    migrate (Old.AppState s d) = AppState (migrates s) (migrated d)

-- Nothing changed in sessions -- it's the second arg to appstate (AppDatastore users) that had a field added
-- We could have avoided writing migrates by using type synonyms to exactly copy the types from AppState1,
-- as is done in eelco's example.
-- I prefer to write out the migration explicitly rather than use type synonyms, because then after a successful
-- migration the Migrate instance and the old state code can be removed, and you wind up with just a monolithic
-- state file evolving over time rather than a sequence of states each with a module dependency on the previous state.
-- (I think this is the case -- still have to prove this works.)
migrates :: Old.Sessions Old.SessionData -> Sessions SessionData
migrates (Old.Sessions s) = Sessions . M.map f $ s
  where f :: Old.SessionData -> SessionData
        f (Old.UserSession (Old.UserName u) ) = UserSession (UserName u)
        f (Old.AdminSession (Old.AdminUserName u) ) = AdminSession (AdminUserName u)

migrated (Old.Users us, Old.AdminUsers aus, Old.Repos rs, Old.RepoUsers rus) =
    ( (Users . M.map ui . M.mapKeys uk $ us),
      (AdminUsers . M.map auv . M.mapKeys auk $ aus),
      (Repos . M.map rv . M.mapKeys rk $ rs),
      (RepoUsers . IXS.fromSet . S.map ru . IXS.toSet $ rus )
    )
  where ui (Old.UserInfos p (Old.UserProfile c bl av ) ) = UserInfos p (UserProfile S.empty c bl av)
        uk (Old.UserName u) = UserName u
        auk (Old.AdminUserName u) = AdminUserName u
        auv (Old.AdminUserInfos p) = AdminUserInfos p
        rv (Old.Repo (Old.UserName u) bud blu isp) = Repo (UserName u) bud blu isp
        rk (Old.RepoName n) = RepoName n
        ru (Old.RepoUser (Old.RepoName n) (Old.UserName u) ) = RepoUser (RepoName n) (UserName u)
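If you want to play with the core of this without setting up HAppS, here is a cut-down, standalone sketch of just the Users part. Everything in it is a simplification I am assuming for illustration: the Old-prefixed types stand in for the qualified Old.* imports, all fields are plain Strings, and the new first field of UserProfile is taken to be a set of strings. Only containers is needed.

```haskell
module Main where

import qualified Data.Map as M
import qualified Data.Set as S

-- Old schema (stand-ins for the Old.* types; field types are assumed).
newtype OldUserName = OldUserName String deriving (Show, Eq, Ord)
data OldUserProfile = OldUserProfile String String String deriving (Show, Eq)
data OldUserInfos   = OldUserInfos String OldUserProfile deriving (Show, Eq)
newtype OldUsers    = OldUsers (M.Map OldUserName OldUserInfos) deriving (Show, Eq)

-- New schema: UserProfile gains a set-valued first field.
newtype UserName = UserName String deriving (Show, Eq, Ord)
data UserProfile = UserProfile (S.Set String) String String String deriving (Show, Eq)
data UserInfos   = UserInfos String UserProfile deriving (Show, Eq)
newtype Users    = Users (M.Map UserName UserInfos) deriving (Show, Eq)

-- Convert keys with mapKeys and values with map; the added field
-- starts out empty for every existing user.
migrateUsers :: OldUsers -> Users
migrateUsers (OldUsers us) = Users . M.map ui . M.mapKeys uk $ us
  where
    uk (OldUserName u) = UserName u
    ui (OldUserInfos p (OldUserProfile c bl av)) =
      UserInfos p (UserProfile S.empty c bl av)

main :: IO ()
main = print (migrateUsers (OldUsers (M.fromList
  [ (OldUserName "alice", OldUserInfos "pw" (OldUserProfile "c" "bl" "av")) ])))
```

The other three components of the tuple follow the same shape, with the IxSet one going through a plain Set on the way.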