All web startups have something in common. If you lose your user data, you are hosed.
So when I was first learning about HAppS, and contemplating doing a startup with it, one of my first questions was, how do I keep this from happening?
If you are using PHP, Ruby on Rails, or one of the other popular web frameworks, your user data is likely in a MySQL database, or if you are well funded maybe in Oracle. If you have outsourced your server hosting, maybe you are even lucky enough to have a database administrator who takes backups for you on a regular basis. That probably helps you sleep at night, assuming that you can really trust that your DBA is doing their job.
As we learned in the previous lesson, if you are using HAppS with macid, your data is right there on your filesystem, by default in the directory called _local.
~/happs-tutorial>ls _local/happs-tutorial_state/
current-0000000000 events-0000000000 events-0000000001 events-0000000002
If there is money on the line, you are going to want to be careful with this directory.
When migrating macid data to a new schema, you are also going to want to be extra cautious.
But for now, since you don't have any valuable data, the following procedure is probably enough to remind yourself to be careful while learning about HAppS in the tutorial sandbox.
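One cautious habit, sketched here rather than prescribed by HAppS: copy the whole _local directory into a timestamped backup folder before touching the schema. The name snapshot_state and the backups folder are made up for illustration, and the sketch assumes the server is stopped while the copy is taken.

```python
import shutil
import tempfile
import time
from pathlib import Path


def snapshot_state(state_dir: Path, backup_root: Path) -> Path:
    """Copy the whole state directory into a new timestamped folder."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{state_dir.name}-{stamp}"
    # copytree creates missing parent folders and fails if target exists,
    # so an earlier snapshot is never silently overwritten.
    shutil.copytree(state_dir, target)
    return target


# Demo on a throwaway directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "_local"
    (state / "happs-tutorial_state").mkdir(parents=True)
    (state / "happs-tutorial_state" / "events-0000000000").write_bytes(b"demo")
    copy = snapshot_state(state, Path(tmp) / "backups")
    print(sorted(p.name for p in (copy / "happs-tutorial_state").iterdir()))
```

In the tutorial sandbox you would call something like snapshot_state(Path("_local"), Path("backups")) from the happs-tutorial directory. If a migration goes wrong, the snapshot can be copied back in place of _local.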
Q: Do you have to shut down the HAppS server every time you migrate data to a new schema?
A: As far as I can tell, yes, you do, at least for the HAppS 0.9.2.1 release.
There is code under development in HAppS head that supposedly allows schema migration without
shutting down HAppS, by using multiple HAppS instances and storing your data in a
distributed way. But to keep things straightforward, in particular the zero-hassle
HAppS install, I'm not going to cover anything in this tutorial that hasn't been
released on Hackage.
If your web startup requires zero downtime during data migrations,
HAppS probably isn't for you, at least not yet.
Then again, schema migrations using a traditional RDBMS are
no picnic either.
Q: Is macid safe? Could I wake up one day with corrupted data under _local and no way to recover from it?
A: Let's be realistic. Compared to, say, MySQL, HAppS hasn't been stress-tested much on critical high-volume web sites. Or if it has, no one has shared their experiences. (This is something I hope to change by making it easier to get started with HAppS.) So whatever the HAppS developers say about reliability, personally I wouldn't be surprised to run into some kind of data corruption problem as an early adopter.
That said, the Unix filesystem is pretty good at not losing your data. This point was famously made by startup guru Paul Graham, who created Viaweb (now Yahoo Stores) with all the application state in flat files. That approach might not have worked so well if the application had required transactional integrity, say, moving money between accounts. But with macid, if you set up your state appropriately, that kind of transaction should work.
If you use Windows or Mac, you probably believe those filesystems are pretty reliable too.
Taking a closer look at what is under _local...
thartman@thartman-laptop:~/happs-tutorial/_local/happs-tutorial_state>ls -lth
total 12K
-rw-r--r-- 1 thartman thartman 0 Oct 1 13:55 events-0000000003
-rw-r--r-- 1 thartman thartman 0 Oct 1 11:55 events-0000000002
-rw-r--r-- 1 thartman thartman 792 Oct 1 11:04 events-0000000001
-rw-r--r-- 1 thartman thartman 491 Oct 1 11:00 events-0000000000
-rw-r--r-- 1 thartman thartman 25 Oct 1 10:59 current-0000000000
thartman@thartman-laptop:~/happs-tutorial/_local/happs-tutorial_state>
Macid serialization works by writing state change events to disk, one file at a time. At server startup, HAppS "replays" all the information here in the order specified by the file names. This is similar to the transaction log used by many relational databases.
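To make "the order specified by the file names" concrete, here is a small sketch that sorts the events files from the listing above by their numeric suffix. This is not HAppS code. The helper replay_order is hypothetical, and it skips the current-* file because this tutorial does not say how HAppS treats it during replay.

```python
import re

# Matches names such as "events-0000000001". This mirrors the directory
# listing in the text, not anything taken from HAppS internals.
NAME_PATTERN = re.compile(r"^(current|events)-(\d+)$")


def replay_order(names):
    """Return the events-* names sorted by their numeric suffix."""
    events = []
    for name in names:
        match = NAME_PATTERN.match(name)
        if match and match.group(1) == "events":
            events.append((int(match.group(2)), name))
    return [name for _, name in sorted(events)]


listing = [
    "events-0000000003",
    "events-0000000002",
    "events-0000000001",
    "events-0000000000",
    "current-0000000000",
]
for name in replay_order(listing):
    print(name)
```

The loop prints events-0000000000 first and events-0000000003 last, which is the oldest-to-newest order the names suggest.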
So, if I woke up one morning with my HAppS application in a corrupt, non-startable state and my inbox full of angry customer email, what I would probably do is move files out of the serialization directory one at a time, most recently created first, and try restarting HAppS after each move.
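That recovery idea could be scripted roughly as follows. This is a sketch under assumptions, not a tested HAppS recovery tool. The try_start callback is hypothetical (you would supply something that launches your server and reports whether it came up). The code also uses the zero-padded file name as a stand-in for creation order, and it moves files into a quarantine directory instead of deleting them.

```python
import shutil
from pathlib import Path
from typing import Callable, List, Optional


def newest_event_file(state_dir: Path) -> Optional[Path]:
    """Return the events-* file with the highest number, or None."""
    # Names are zero-padded to a fixed width, so sorting by name
    # is the same as sorting by number.
    events = sorted(state_dir.glob("events-*"), key=lambda p: p.name)
    return events[-1] if events else None


def recover(state_dir: Path, quarantine: Path,
            try_start: Callable[[], bool]) -> List[str]:
    """Move event files aside, newest first, until try_start() succeeds.

    Returns the names of the files that were moved, in the order moved.
    """
    quarantine.mkdir(parents=True, exist_ok=True)
    moved = []
    while not try_start():
        victim = newest_event_file(state_dir)
        if victim is None:
            raise RuntimeError("no event files left to move aside")
        shutil.move(str(victim), str(quarantine / victim.name))
        moved.append(victim.name)
    return moved
```

Every file that ends up in the quarantine directory represents state changes the restarted server no longer knows about, so the returned list tells you how much was given up to get running again.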
Q: What if my hard drive dies and I can't get my data back?
A: As with any other data storage system, if there's valuable data, you need to be making backups. In the case of HAppS data stored under _local, I would probably rsync the _local directory to a remote server, or maybe to multiple remote servers for extra safety. For now I am not worried about securing data, but when that day comes I'm pretty confident I'll be ok.
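A minimal sketch of that rsync idea, written as a small wrapper you could run from cron. The host names and remote paths are placeholders, and the helper names rsync_command and backup are invented for this example. It assumes rsync is installed and that you have ssh access to the backup hosts.

```python
import subprocess
from typing import List


def rsync_command(source: str, destination: str) -> List[str]:
    """Build an rsync invocation that copies source into destination."""
    # -a keeps permissions and timestamps, -z compresses data in transit.
    return ["rsync", "-az", source, destination]


def backup(source: str, destinations: List[str]) -> None:
    """Run one rsync per destination, raising if any of them fails."""
    for destination in destinations:
        subprocess.run(rsync_command(source, destination), check=True)


# Placeholder destinations; replace them with your own servers.
DESTINATIONS = [
    "backup@backup1.example.com:/srv/backups/happs-tutorial/_local/",
    "backup@backup2.example.com:/srv/backups/happs-tutorial/_local/",
]

# Print the commands instead of running them, so the sketch is safe to try.
for destination in DESTINATIONS:
    print(" ".join(rsync_command("_local/", destination)))
```

Calling backup("_local/", DESTINATIONS) would actually perform the copies. The trailing slash on _local/ tells rsync to copy the directory's contents rather than the directory itself.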
Let's now populate our web application with dummy data.