Someone on Lemmy posted a phrase recently: “If you’re not prepared to manage backups then you’re not prepared to self host.”
This seems like not only sound advice but a crucial attitude. My backup plans have been fairly sporadic as I’ve been entering into the world of self hosting. I’m now at a point where I have enough useful software and content that losing my hard drive would be a serious bummer. All of my most valuable content is backed up in one way or another, but it’s time for me to get serious.
I’m currently running an Ubuntu Server with a number of Docker containers, and lots of audio, video, and documents. I’d like to be able to back up everything to a reliable cloud service. I currently have a subscription to Proton Drive, which is a nice padding to have, but which I knew from the start would not really be adequate, especially since there is no native Proton Drive client for Linux.
I’ve read good things about iDrive, S3, and Backblaze. Which one do you use? Would you recommend it? What makes your short list? What is the best value?
After some research on here and Reddit about 6 months ago, I settled on BorgBase and it’s been pretty good. I also manually save occasionally to Proton Drive, but you’re right to give up on that as a solution!
The hardest part was choosing the backup method and properly setting up Borg or restic on my machine, especially with Docker and databases. I ended up adding db backup images to each container with an important db, saving to a specific folder. Then that and all the files are backed up by restic to an attached external drive as well as BorgBase. This happens at a specific time in the morning, and I found a restic action to stop all Docker containers first, back them up, then spin them back up. I can find the guides that I used if it’s helpful to you.
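For anyone wondering what the stop, back up, restart step looks like, here is a minimal Python sketch. It is not the restic action mentioned above, just one way to get the same ordering. The container names, paths, and repository location are hypothetical, and it assumes `docker` and `restic` are on the PATH with the repository password supplied through `RESTIC_PASSWORD` or `RESTIC_PASSWORD_FILE`.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def backup_with_containers_stopped(
    containers: Sequence[str],
    paths: Sequence[str],
    repo: str,
    run: Runner = run_checked,
) -> None:
    """Stop the containers, back up the paths with restic, then start them again."""
    try:
        if containers:
            run(["docker", "stop", *containers])
        # restic reads the repository password from the environment here.
        run(["restic", "-r", repo, "backup", *paths])
    finally:
        # Always attempt a restart, even if stopping or backing up failed,
        # so a broken backup does not leave services down.
        if containers:
            run(["docker", "start", *containers])


# Hypothetical usage (names and paths are placeholders):
# backup_with_containers_stopped(
#     ["nextcloud", "nextcloud-db"], ["/srv/docker"], "/mnt/external/restic-repo"
# )
```

The `run` parameter is only there so the ordering can be checked without touching Docker. Scheduling it "at a specific time in the morning" would be left to cron or a systemd timer.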
I also checked my backups a few times and found a few small problems I had to fix. I got the message from other users several times that your backups are useless unless you regularly test them.
I can recommend Restic with Wasabi S3 as cloud storage backend.
I use Storj, it’s been my favorite for years.
Do you mine? Always sounded like the best option if you don’t have a friend in another georegion to replicate to.
I did for a few years when the network started, but it became increasingly difficult to do so from a residential IP with slow upload speeds (cable internet).
My backup plan includes Backrest (restic) up to B2. So far so good!
I use Backblaze B2 through my Synology NAS to offsite my important data. Most things though I just backup locally and accept the risk of needing to rebuild certain things (like most of my movie/TV media files since I can just re-rip my physical media, and the storage costs are not worth the couple of days of time in that unlikely case).
I really think this is key when thinking about a backup strategy that is specific to self hosting compared to enterprise operations. The costs come out of our pockets with no revenue to back it up. Managing backups for self hosting IMO is just as much about understanding your risk appetite and then choosing a strategy to match that. For example, I keep just a single copy in B2, since the failure mode I’m looking to protect against is catastrophic failure of my NAS, which holds my main backups and media. I then use Proton Drive and OneDrive to back up secrets for my 2FA setups and the encryption keys for my B2 bucket. This isn’t how I would do it at work (we have a far more robust, but much more expensive setup). But my costs for B2 are around $15/mo, which I am fine with. When I tried keeping multiple copies it had grown to over $50/mo before I cared enough to really rethink things (the cost of the hobby, I told myself).
I use iDrive, 20TB for a couple hundred bucks a year. I’ve not found anything that compares to that in pricing. Backblaze I think is about $1600 a year for the same storage, and the major cloud providers are much higher than that. I view cloud backups as the last line of defense in the backup strategy, so all the nice features that most providers offer at a significant price increase just don’t make sense to me, as I won’t use them. I have the iDrive Linux app running; it detects what’s new in the monitored directories and shoves it up to the cloud, hopefully never to be needed.
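To compare quotes like these on equal footing, it can help to normalize them to dollars per TB per month. A quick sketch using only the figures from this comment; reading "a couple hundred bucks" as $200 is an assumption, and neither number is a verified list price.

```python
def usd_per_tb_month(annual_usd: float, capacity_tb: float) -> float:
    """Convert a yearly price for a fixed capacity into USD per TB per month."""
    return annual_usd / capacity_tb / 12


# Figures as quoted in the comment above; $200 is an assumed reading
# of "a couple hundred bucks", not a published price.
idrive = usd_per_tb_month(200, 20)
backblaze = usd_per_tb_month(1600, 20)

print(f"iDrive:    ${idrive:.2f} per TB-month")
print(f"Backblaze: ${backblaze:.2f} per TB-month")
print(f"Ratio:     {backblaze / idrive:.1f}x")
```

On those numbers it works out to roughly $0.83 versus $6.67 per TB-month, about an 8x gap.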
I’ve been using Restic to Backblaze B2.
I don’t really trust B2 that much (I think it is mostly a single-DC kind of storage) but it is reasonably priced and easy to use. Plus as long as their failures aren’t correlated with mine it should be fine.
I quadruple vote for this combination.
You could trust B2 more; maybe dig into their structure. They’re solid, and not only that they provide an awesome service with their yearly HD failure rate evaluations, in which they describe the structure of their data centers.
In terms of NPS, I’m on their side. Unless something comes out about shady business practices, I’m brand loyal to B2. Been with them for years, and love the service, pricing, and company.
I think it depends on your needs. IIUC their storage is “single location”. Like a very significant natural disaster could take it offline or maybe even lose it. Something like S3 or Google Cloud Storage (depending on which durability you select) is multi-location (as in significantly distinct geographical regions). So still very likely that you will never lose any data, but in the extreme cases potentially you could.
If I was storing my only copy of something it would matter a lot more (although even then you are best to store with multiple providers for social reasons, not just technical) but for a backup it is fine.
Enabling multi DC redundancy is really easy though. The other providers you mentioned may have it by default, but they’re also a lot more expensive.
I love that they let me pick my own redundancy strategy, without forcing me to pay for theirs
I think I see what you’re saying.
B2 has multiple data centers around the world - at least 3 in the US and 1 in EU, that I know of. If you want your data replicated, you have to create buckets in multiple locations and connect them for replication, which they’ll do for you (the replication).
If you’re saying that they don’t automatically store multiple copies of your data in multiple locations for you, for free, you’re right. But they do have multiple data centers located around the world, and you can create multiple buckets and configure them for automatic replication so you have redundancy. You have to pay for the storage at each replicated location, though. If you want a bucket in Sacramento, it’ll cost you those pennies. If you want it replicated to Reston, you’ll pay double the pennies. If you want it also replicated to Amsterdam, triple the pennies.
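The "double the pennies, triple the pennies" point is just a linear multiplier on storage. A rough sketch, where the per-TB rate is a placeholder rather than a quoted Backblaze price, and download or transaction fees are deliberately ignored:

```python
def replicated_monthly_cost(stored_tb: float, usd_per_tb_month: float, locations: int) -> float:
    """Monthly storage cost when the same data is kept in `locations` buckets."""
    if locations < 1:
        raise ValueError("need at least one storage location")
    return stored_tb * usd_per_tb_month * locations


# Placeholder inputs: 2 TB of data at a hypothetical $5 per TB per month.
for n in (1, 2, 3):
    cost = replicated_monthly_cost(2, 5.0, n)
    print(f"{n} location(s): ${cost:.2f}/month")
```

Plugging in your own provider's rate makes it easy to see where a replicated setup stops being cheaper than a provider that is multi-region by default.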
I don’t think it’s fair to say that they’re single location that could have a natural disaster and you therefore lose your storage. It’s only like that if you set it up that way, and it’s pretty trivial to set up global replication - it just costs more.
That’s true. And I’m not saying B2 is bad, it is just something that you should be aware of.
Their automatic replication isn’t quite as seamless as GCS or S3 though. For example deletes aren’t replicated so you will need a cleanup strategy. Plus once you 2x or 3x the price B2 isn’t as competitive on price. My point is that it is very easy to compare apples to oranges looking at cloud storage providers and it is important to be aware.
For me B2 is a great fit and I am happy with it, but I don’t want to mislead people.
I use restic with a wrapper script to automate it on all of my machines. The backend storage can be anything that speaks S3, so B2, or iDrive would both work. I currently use Storj for my backend. It’s globally distributed storage, so no single point of failure geographically and it’s cheap. Backblaze is also a great company, but I’ve grown a little skeptical since they went public.
3,2,1.
My nas is a Synology with raid.
- Backup with versions to a single large HD via USB. This is ransomware protection or accidental deletion protection. (Rsync)
- Offsite copy to Backblaze B2. One version. (Rsync) (~$6/month) This would be natural disaster protection: flood, fire.
- Second, non-RAID, cheaper Synology at a friend’s on the other coast. This has ~3 versions. Sorta the backup to the first two.
How do you do versioning with rsync? I use rdiff.
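One common way to get versions out of plain rsync (not necessarily what the poster above does) is `--link-dest`: each run writes a new dated directory, and files unchanged since the previous snapshot become hard links to it, so every snapshot looks complete but only changed files use new space. A small sketch that builds such a command; the directory layout and timestamp naming are assumptions.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import List, Optional


def snapshot_command(
    source: str,
    backup_root: str,
    now: datetime,
    previous: Optional[str] = None,
) -> List[str]:
    """Build an rsync command that writes a new dated snapshot directory."""
    root = PurePosixPath(backup_root)
    dest = root / now.strftime("%Y-%m-%d_%H%M%S")
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Unchanged files are hard-linked against the previous snapshot.
        # An absolute path avoids rsync resolving it relative to the destination.
        cmd.append(f"--link-dest={root / previous}")
    # Trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", f"{dest}/"]
    return cmd


print(" ".join(snapshot_command(
    "/volume1/data", "/mnt/usb/snapshots",
    datetime(2024, 1, 2, 3, 0, 0), previous="2024-01-01_030000",
)))
```

Old versions are then pruned by deleting whole snapshot directories; files still referenced by newer snapshots survive because of the hard links.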
3, 2, 1. ❤
Without implementing this, it’s a delusion that some company, regardless of the size and reputation, can be trusted to keep our data safe.
Also don’t forget to restore test, otherwise you may as well not do backups. I have a reminder for once a year to test them, not just if it works but also what the performance is just in case.
This is the part that gets me. I don’t know how to automate this. I periodically retrieve something from the backups, which, so far, has worked. That’s not really good insurance, though. Any suggestions or resources, ideally for borg and/or restic?
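One way to automate a restore test, sketched here under some assumptions: restore a snapshot into a scratch directory with your tool of choice (for example `restic restore latest --target /tmp/restore-test`, which recreates the original absolute path underneath the target), then compare checksums against the live data. The function names below are made up for illustration. Both tools also ship their own integrity checks (`restic check --read-data`, `borg check --verify-data`), which verify the repository but do not prove that a restore lands where you expect.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def tree_hashes(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    hashes: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large media files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            hashes[path.relative_to(root).as_posix()] = digest.hexdigest()
    return hashes


def restore_differences(source: Path, restored: Path) -> List[str]:
    """List files that are missing, unexpected, or changed in the restored copy."""
    src, dst = tree_hashes(source), tree_hashes(restored)
    problems = [f"missing: {p}" for p in sorted(src.keys() - dst.keys())]
    problems += [f"unexpected: {p}" for p in sorted(dst.keys() - src.keys())]
    problems += [f"changed: {p}" for p in sorted(src.keys() & dst.keys()) if src[p] != dst[p]]
    return problems
```

An empty list means the restored tree matches. Files edited since the snapshot was taken will show up as differences, so this works best on data that rarely changes, or run right after a backup. Timing the restore in the same script covers the "what is the performance" question too.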
You can get append only backups on backblaze with their lifecycle rules. So that can have ransomware protection too
“Append only backup” what’s that?
It’s a system where you can only append, not delete.
https://en.wikipedia.org/wiki/Append-only
It’s what’s required for a ransomware-safe backup system, since the attacker can’t delete your backups because they can only append.
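As a toy illustration of the idea (an in-memory model only, not how Backblaze or any backup tool implements it): the store accepts new entries but refuses deletes, so a compromised client can add junk but cannot destroy earlier snapshots. The class and method names are made up for this sketch.

```python
from typing import List


class AppendOnlyStore:
    """Toy in-memory store that accepts new entries but refuses to remove them."""

    def __init__(self) -> None:
        self._entries: List[bytes] = []

    def append(self, data: bytes) -> int:
        """Store a new entry and return its index."""
        self._entries.append(bytes(data))
        return len(self._entries) - 1

    def read(self, index: int) -> bytes:
        """Return a previously stored entry."""
        return self._entries[index]

    def delete(self, index: int) -> None:
        """Deletion is always rejected; that is the whole point."""
        raise PermissionError("append-only store: entries cannot be deleted")

    def total_bytes(self) -> int:
        """Total size of everything stored so far; it never shrinks."""
        return sum(len(entry) for entry in self._entries)
```

In a real setup the restriction has to be enforced on the storage side, with pruning done from a separate, trusted credential.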
Oh, I see, I didn’t know that “nomenclature”. Thanks! Good for some things, dangerous for others because the stored data keeps growing.
If you’re talking multiple terabytes and are located in the EU, you might want to consider AWS Glacier. I have like 6TB on there and pay sub 20€ p.m. If you’re in the EU you can request one free migration download by contacting support. Otherwise you’ll pay thousands.
A server in a friend/family member’s home. All of the cloud based backups I’ve encountered seem either unaffordable or have annoying limitations.
This. Install a NAS in a friend’s house, give them 10% of the capacity as a thank you.
I have one in theirs, they have one in mine.
This is the way
100% this. OP, whatever solution you come up with, strongly consider disentangling your backup ‘storage’ from the platform or software, so you’re not ‘locked in’.
IMO, you want to have something universal, that works with both local and ‘cloud’ (ideally off-site on your own or a family member’s/friend’s NAS; far less expensive in the long run). Trust me, as someone who came from CrashPlan and moved to Duplicacy 8 years ago, I no longer worry about how robust my backups are, as I can practice 3-2-1 on my own terms.
Or simply sneakernet drives to a friend’s home. Good excuse to visit a friend more often.
Yup, I’ve got a box in my mum’s house that all my off-site backups go to, and it’s a damn sight cheaper just to give her some money for the electricity cost of it each month than pay for any cloud service.
I’ve been using rsync.net for a while now. It’s been stable, fast, and relatively inexpensive. There’s also the benefit that it’s easy to script automated backups directly to it. For more Dropbox-like functionality, I have a Nextcloud instance that uses rsync.net as external storage. It’s been great so far!
They require you to buy a minimum of 800GB, which for most people is overkill.
@gedaliyah If you’re not married to managed cloud services, services like rsync.net or a Hetzner storage box work very well. They require more effort, but you have complete control and can do some fun things (like using rclone’s crypt module with them). Plus rsync.net is super useful if your sources use ZFS.
Of the cloud providers, Backblaze is the one that anecdotally seems most popular.
I’m a long time user of jottacloud. It’s not really meant for 10TB+, but works great for what I need it to do.
I use the unlimited consumer Backblaze with a private key on a Windows VM. I provision a 40TB iSCSI connection to the VM from a NAS, and all kinds of various homelab systems and devices store their backups there. Works great and is the cheapest possible option at $9 a month.
Is that not against their TOS? Could make the service more expensive for the rest of us
Backblaze doesn’t care, they know they’ll get their money when @Unforeseen@sh.itjust.works tries to get data back from backup.
Restoring data is free from backblaze.
Well, yes and no. The rate at which you get your data back is where the gotcha lies. Anything up to 8TB is free if you send them $280, and they’ll refund the money once they get the drive back. Anything over 8TB is where it gets pricey.
And do that multiple times?
There aren’t any “gotchas”. They absolutely lose money on those of us who store more than a few TB, but it’s worth it considering that we are in the minority.
Someone from BB posted a graph showing the distribution of data usage over all users and the VAST majority are under 1-2 TB
I’m not sure about the iSCSI protocol. They allow VMs, including hard drives via USB, so the point about this making it more expensive does not apply, considering someone could just hook up 100TB+ of USB drives and still be in the clear under the TOS.
If they did have a problem with this I would just do that instead.