
Storages

Currently supported storages:

  • Amazon Simple Storage Service (S3)
  • Rackspace Cloud Files (Mosso)
  • Dropbox Web Service
  • Remote Server (Protocols: FTP, SFTP, SCP, RSync)

Examples

Below are examples you can copy/paste into your "Backup" configuration file. These blocks should be placed between Backup::Model.new(:my_backup, 'My Backup') do and end, as in the sketch below.
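For reference, here is a minimal sketch of a complete configuration file (the trigger, label, and credentials are placeholders; any of the store_with blocks below can be used in place of the SFTP one):

# Backup configuration file
Backup::Model.new(:my_backup, 'My Backup') do
  # For example, storing to a remote server over SFTP:
  store_with SFTP do |server|
    server.username = 'my_username'
    server.password = 'my_password'
    server.ip       = '123.45.67.89'
    server.port     = 22
    server.path     = '~/backups/'
    server.keep     = 5
  end
end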

Amazon Simple Storage Service (S3)

store_with S3 do |s3|
  s3.access_key_id      = 'my_access_key_id'
  s3.secret_access_key  = 'my_secret_access_key'
  s3.region             = 'us-east-1'
  s3.bucket             = 'bucket-name'
  s3.path               = '/path/to/my/backups'
  s3.keep               = 10
end

You will need an Amazon AWS (S3) account. You can get one at aws.amazon.com/s3.

The s3.keep option allows you to "cycle" backups: once the s3.keep limit is exceeded, the oldest backup will be removed from the S3 bucket.

TIP Amazon S3 is often a good choice. Note, however, that it has a file size limit of 5GB: if your backup file exceeds 5GB, Amazon S3 will not accept it. Backups tend to grow quickly when they include non-textual files such as images and video. If you plan to back up such assets anyway, it's often better to store them on Amazon S3 directly when users upload them, avoiding high backup costs and server load. If you're only backing up your database(s), then by the time your compressed dumps hit the 5GB mark you're probably earning enough to justify a little dedicated backup server of your own, and you could set up Backup to roll out backups to that server instead of Amazon S3 using, for example, the RSync or SCP protocols Backup provides.

Rackspace Cloud Files (Mosso)

store_with CloudFiles do |cf|
  cf.api_key   = 'my_api_key'
  cf.username  = 'my_username'
  cf.container = 'my_container'
  cf.path      = '/path/to/my/backups'
  cf.keep      = 5
end

You will need a Rackspace Cloud Files account. You can get one at rackspace.com.

The cf.keep option allows you to "cycle" backups: once the cf.keep limit is exceeded, the oldest backup will be removed from the Cloud Files container.

TIP Storing to Rackspace Cloud Files is another option, but I would only recommend it if you really need it and your backups are relatively small. At this point there seems to be no way to stream data from your server to Rackspace Cloud Files, which means that a 300MB backup requires 300MB of your server's memory to transfer. If you're looking for cloud-based storage and your backup file is less than 2GB, I'd recommend taking a look at Amazon S3. If it's more than 2GB, I'd recommend the RSync protocol: spawn your own little backup server (a cheap server with a lot of storage capacity) and do incremental backups to reduce server load and bandwidth usage, and to allow a higher backup frequency.

Dropbox Web Service

store_with Dropbox do |db|
  db.email      = 'my@email.com'
  db.password   = 'my_password'
  db.api_key    = 'my_api_key'
  db.api_secret = 'my_api_secret'
  db.path       = '/path/to/my/backups'
  db.keep       = 25
  db.timeout    = 300
end

To use the Dropbox service as a backup storage, you need two things: a Dropbox account, and a set of Dropbox API credentials (the api_key and api_secret used above).


The db.keep option allows you to "cycle" backups: once the db.keep limit is exceeded, the oldest backup will be removed from the Dropbox account.

The db.timeout option (default 300) sets the Dropbox connection timeout in seconds. If you run into frequent timeout errors, increase this value to give Backup more time to transfer the backups.

TIP This is a nice pick for small databases. If you have a bunch of small web apps with small databases, this is a good (and free!) option: create a Dropbox account, get your API key, and let Backup transfer your backups to Dropbox. You get a 2GB Dropbox account for free, and there are no bandwidth charges either. The reason I say "small" databases is that, unlike Amazon S3 and Rackspace Cloud Files with their 5GB file size limit, Dropbox has a 300MB file size limit. For assets/files this isn't ideal, but database dumps are textual and compress extremely well, so this works for most website backups as long as your compressed dumps stay under 300MB. (And 300MB compressed is actually a lot of data for almost any regular website; MySQL dumps usually compress to about a tenth of their size, so a 3GB dump could become 300MB.) When you near the 2GB storage limit, you can upgrade your Dropbox account, create additional free accounts, or move to Amazon S3, Rackspace Cloud Files, or your own little backup server.

Remote Server (FTP)

store_with FTP do |server|
  server.username = 'my_username'
  server.password = 'my_password'
  server.ip       = '123.45.67.89'
  server.port     = 21
  server.path     = '~/backups/'
  server.keep     = 5
end

The server.keep option allows you to "cycle" backups: once the server.keep limit is exceeded, the oldest backup will be removed from the remote server.

TIP Use SFTP if possible; it's a more secure protocol.

Remote Server (SFTP)

store_with SFTP do |server|
  server.username = 'my_username'
  server.password = 'my_password'
  server.ip       = '123.45.67.89'
  server.port     = 22
  server.path     = '~/backups/'
  server.keep     = 5
end

The server.keep option allows you to "cycle" backups: once the server.keep limit is exceeded, the oldest backup will be removed from the remote server.

Remote Server (SCP)

store_with SCP do |server|
  server.username = 'my_username'
  server.password = 'my_password'
  server.ip       = '123.45.67.89'
  server.port     = 22
  server.path     = '~/backups/'
  server.keep     = 5
end

The server.keep option allows you to "cycle" backups: once the server.keep limit is exceeded, the oldest backup will be removed from the remote server.

Remote Server (RSync)

store_with RSync do |server|
  server.username = 'my_username'
  server.password = 'my_password'
  server.ip       = '123.45.67.89'
  server.port     = 22
  server.path     = '~/backups/'
end

Additional Notes

You cannot specify a server.keep amount here. The reason you use RSync is probably that you want its incremental backup feature, which reduces bandwidth costs, heavy transfer load, and so on: the RSync protocol transfers only the changes in each backup, rather than the whole backup.

NOTE If you only want to sync particular folders on your filesystem to a backup server, be sure to take a look at Syncers. They are, in most cases, more suitable for this purpose, especially if you're planning to transfer large amounts (gigabytes) of data in folders such as images, music, videos, and other heavy formats.

Example: Say you just transferred a backup of about 2000MB in size. 12 hours later the Backup gem packages a new backup file for you, and it turns out to be 2050MB. Rather than transferring the whole 2050MB to the remote server, RSync will look up the difference between the source and destination backups and transfer only the bytes that changed, in this case around 50MB instead of the full 2050MB.

Note that you should NOT use a compressor (like Gzip) or encryptor (like OpenSSL or GPG) when using RSync. RSync has a hard time determining the difference between the source and destination files when they are compressed or encrypted, which will result in high bandwidth usage again.
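For example, here is a minimal sketch of an RSync model with no compressor or encryptor (the trigger, label, and credentials are placeholders):

# Backup configuration file
Backup::Model.new(:my_rsync_backup, 'My RSync Backup') do
  store_with RSync do |server|
    server.username = 'my_username'
    server.password = 'my_password'
    server.ip       = '123.45.67.89'
    server.port     = 22
    server.path     = '~/backups/'
  end
  # Deliberately no compress_with Gzip or encrypt_with OpenSSL here:
  # compressed/encrypted archives would defeat RSync's delta transfers.
end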

TIP Use this if you find that your bandwidth usage or server load is too high, if you want to back up more frequently, if you prefer incremental backups over cycling, or if you've outgrown Amazon S3 or Rackspace Cloud Files.

Note that Backup::Storage::RSync does not support cycling. This means only ONE copy of your backup will be stored at the remote location. If you want to store multiple, here's an idea.

Create multiple Backup::Model instances:

# Backup configuration file
1.upto(4) do |n|
  Backup::Model.new("my_backup_#{n}", "My Backup") do
    store_with RSync do |rsync|
      rsync.path = "/backups/rsync_#{n}/"
      # ... (remaining RSync settings, as shown above)
    end
  end
end

# Whenever Gem for managing the crontab
every 1.day, :at => ('06'..'11').to_a.map {|x| "#{x}:00" } do
  command "backup --trigger my_backup_1"
end

every 1.day, :at => ('12'..'17').to_a.map {|x| "#{x}:00" } do
  command "backup --trigger my_backup_2"
end

every 1.day, :at => ('18'..'23').to_a.map {|x| "#{x}:00" } do
  command "backup --trigger my_backup_3"
end

every 1.day, :at => ('00'..'05').to_a.map {|x| "#{x}:00" } do
  command "backup --trigger my_backup_4"
end

This will ensure you have 4 copies of your backup on the remote server, each syncing once an hour during its own 6-hour window. The nice thing is that if your production server crashes or has a power outage, you'll lose at most an hour's worth of data. And if your database becomes corrupt for some reason and the corrupt data still gets dumped and RSynced, the other 3 copies remain unharmed, provided you act within 6-24 hours of your application breaking.

Or of course, think of your own use cases (and let me know if you figure out any good ones!).

Default Configuration for each storage

If you are backing up to multiple storage locations, you may want to specify default configuration so that you don't have to repeat the same lines of code for each storage of the same type. For example, say the Amazon S3 storage always uses the same access_key_id and secret_access_key. You could write this above the Model configurations:

Backup::Configuration::Storage::S3.defaults do |s3|
  s3.access_key_id     = "my_access_key_id"
  s3.secret_access_key = "my_secret_access_key"
end

Now, every model that stores with S3 will have its access_key_id and secret_access_key filled in with the defaults we just specified, so you may omit them from the store_with block, like so:

store_with S3 do |s3|
  s3.bucket = "some-bucket"
  # no need to specify access_key_id
  # no need to specify secret_access_key
end

You can set defaults for CloudFiles by changing the Backup::Configuration::Storage::S3.defaults to Backup::Configuration::Storage::CloudFiles.defaults, and so forth for every supported storage location.
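For example, following the same pattern for Cloud Files (a sketch with placeholder values):

Backup::Configuration::Storage::CloudFiles.defaults do |cf|
  cf.api_key  = "my_api_key"
  cf.username = "my_username"
end

store_with CloudFiles do |cf|
  cf.container = "my_container"
  # api_key and username come from the defaults above
end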
