A Creator’s Guide to Data Storage and Backup

If you’re someone who makes a living by creating things with your computer, or even an enthusiastic hobbyist, chances are that eventually, you’re going to need more room to store all of your stuff than can be provided by the drive(s) built into your computer. Arguably more importantly, if you’re especially attached to some of the work you’ve done and would like to protect it from loss, you’re going to want some method of backing up your data.

In the current computing world, a revolving door of marketing buzzwords and jargon, it can be difficult to figure out exactly what the purchase proposition is for any given solution, or which one is right for you. But fear not! Allow me to take you through what I hope is a simple, practical explanation of the landscape that will be useful to you in the real world.

The 3-2-1 Rule

The gold standard of everyday data security is what is known as the “3-2-1 Rule”. It’s very simple, and is broken down thusly:

Any mission-critical or otherwise-sensitive data should have: no fewer than 3 copies, stored on at least 2 different storage media, with at least 1 copy stored offsite.

This management scheme is designed to ensure that the destruction or loss of any one method of data storage can be mitigated by at least one other method.

A simple example of this rule in practice is to imagine that you keep a full backup of your laptop on a USB external drive stored in your desk, and another one in secure cloud storage with a service like OneDrive or iCloud. The theft or destruction of your laptop still leaves 2 additional copies of the data to be accessed, and even in the worst-case scenario of a fire, flood, or other disaster where you live that results in the loss of both your computer and your local backup, you’ll still be able to recover your data from cloud storage at a later date. For a creator, you might modify this to include not formatting or overwriting memory cards or other storage media until you’re certain you won’t need that data anymore.
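If you like to see rules written down as logic, here is a minimal Python sketch of the check described above. The `Copy` class, its field names, and the media labels are all my own illustrative assumptions, not part of any real backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a data set. All field names here are illustrative."""
    label: str     # human-readable name, e.g. "laptop"
    medium: str    # kind of storage, e.g. "internal-ssd", "usb-hdd", "cloud"
    offsite: bool  # True if the copy lives away from your home or office


def check_3_2_1(copies):
    """Return a list of rule violations; an empty list means the rule is met."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    media = {c.medium for c in copies}
    if len(media) < 2:
        problems.append(f"only {len(media)} kind(s) of storage media, need at least 2")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return problems


# The laptop example: the machine itself, a desk drive, and a cloud backup.
laptop_setup = [
    Copy("laptop", "internal-ssd", offsite=False),
    Copy("desk drive", "usb-hdd", offsite=False),
    Copy("cloud backup", "cloud", offsite=True),
]

# Worst case: a fire takes out everything stored at home.
after_fire = [c for c in laptop_setup if c.offsite]

print(check_3_2_1(laptop_setup))  # no violations
print(check_3_2_1(after_fire))    # the data survives, but the rule is no longer met
```

The second call is the useful one: after a disaster you still have your data, but you are down to a single copy and should rebuild the other two as soon as you can.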

For people like myself with terabytes and terabytes of photos, video footage, project files, and related data, this scheme may or may not be feasible to apply to everything in your collection. I’ve run the numbers, and to have a triplicate backup of my entire catalog of work would cost thousands and thousands of dollars at a minimum, and given that I’ve already extracted all of the expected value out of most of it, that’s not really worth it to me.

However, I still apply the rule in certain circumstances. For work that I consider worthy of inclusion in my portfolio, special projects I really loved working on, or things that I’m actively working on, I always apply the 3-2-1 rule in full effect, and it has saved me both headaches and actual money on more than one occasion.

Internal vs. External Storage

When you first run out of space on the factory storage built into your machine, you have some choices to make about how to proceed, chief among which is deciding whether to simply slot another drive into your machine if it supports doing so, or to go fully external.

If you choose the former, you have less to worry about: purchase a compatible drive of an appropriate size, and follow a guide from YouTube or iFixit to install it in your machine if you’re unsure. Nearly all desktops will allow for additional storage to be installed, and some Windows laptops will as well. Mac machines are a different story, as most of them (especially the more recent ones with Apple’s first-party silicon onboard) are so tightly integrated that you’re usually stuck with however much you went for when you first bought it.

Simply installing more internal storage does have a few distinct advantages over using external solutions. Since the drives probably aren’t going to be moved around as much, there’s little chance of a drive failure related to physical damage, especially if you choose to install a solid-state drive with no moving components. It’s also generally a higher-performance option with faster read/write speeds, since you’re plumbing directly into the motherboard or an expansion card instead of potentially bottlenecking yourself over an older USB standard or through a “value-oriented” control chip in an external drive. However, they’re more complicated to access, so installation and servicing can be a pain.

If you elect to use an external storage device, you have a variety of options to choose from. Obviously, you can pick up a simple USB external drive at just about any store with an electronics department, or you could go much more advanced should your needs or goals warrant it.

DAS/NAS

Two of the most common “more advanced” external storage methods are Direct-Attached Storage, or DAS, and Network-Attached Storage, or NAS. Each provides unique benefits and unique considerations.

DAS systems are, essentially, an extra-large version of the USB-powered external drive you picked up from Best Buy. Many of these will simply take the form of a small enclosure containing one or more bays in which you can install standard 3.5” hard drives, powered via an AC adapter and plugged directly into your computer, usually via USB. Since these systems don’t contain their own processor, memory, or networking, they’re usually more affordable than their networked brethren, so if you have no need to access your data remotely but don’t want to deal with a horde of 1TB USB drives, a DAS is an excellent option. If you happen to own a fancy router with supporting functionality, you might even be able to connect your DAS to your router via USB and access it over your local network, freeing up real estate on your desk.

NAS systems, on the other hand, are small, low-power computers whose only job is to store and retrieve data. Visually, many look very similar to their direct-attached counterparts, but inside will be an entire computer system in addition to the drives used for storage. This allows the unit to be plugged into a router or network switch and accessed over your local network or, with the right setup, over the internet from anywhere in the world. Naturally, this adds complexity and therefore cost. In addition, there’s potential concern that the accompanying management suites and mobile apps will lose developer support and might become obsolete, but tools like TrueNAS (formerly known as FreeNAS) do exist to continue using your storage box with an open-source operating system, so I wouldn’t worry too much about that.

RAID Boss

RAID, or Redundant Array of Independent Disks/Drives, is a method of combining the storage capacity of multiple hard drives or solid-state drives to increase their performance or fault tolerance. Many DAS and NAS products on the market, from brands like Western Digital, OWC, and Glyph Technologies, incorporate the ability to be configured as a RAID array rather than what is often called a “JBOD”, or “Just a Bunch Of Drives”, where 100% of the available drive space is used for storage, with none given over to improve performance or redundancy.

Many configurations of RAID exist, but let’s go through a few of the most common.

Diagram of RAID 0 with two disks. Disk 0 holds data blocks A1, A3, A5, and A7; Disk 1 holds A2, A4, A6, and A8, showing striped data distribution across both disks.

RAID 0: In RAID 0, the data to be stored is distributed (“striped”) across all of the drives in equal proportion. In basic terms, in an array with 2 drives, each drive gets half of any given file, which means data can be read and written at (ideally) twice the speed, increasing as more drives are added. This configuration is very susceptible to data loss: if one drive fails, none of the data on any of the other drives can be recovered, since it’s missing the piece stored on the failed drive. Absolutely not recommended for storage unless all you care about is raw speed.
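To make the striping idea concrete, here is a toy Python sketch that deals blocks out to drives in the same round-robin pattern as the RAID 0 diagram. The `stripe` and `unstripe` helpers are hypothetical names of mine, and real controllers work on fixed-size chunks of raw disk rather than Python lists.

```python
def stripe(blocks, n_drives):
    """Deal blocks out to drives round-robin, as in the RAID 0 diagram."""
    drives = [[] for _ in range(n_drives)]
    for i, block in enumerate(blocks):
        drives[i % n_drives].append(block)
    return drives


def unstripe(drives):
    """Reassemble the original block order; every drive must be present."""
    if any(d is None for d in drives):
        raise ValueError("a drive is missing, so the data cannot be reassembled")
    n = len(drives)
    total = sum(len(d) for d in drives)
    return [drives[i % n][i // n] for i in range(total)]


blocks = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]
disk0, disk1 = stripe(blocks, 2)

print(disk0)  # ['A1', 'A3', 'A5', 'A7']
print(disk1)  # ['A2', 'A4', 'A6', 'A8']
print(unstripe([disk0, disk1]) == blocks)  # True
```

Passing `None` in place of either disk makes `unstripe` raise an error. That is the whole problem with RAID 0 in miniature: each drive holds only every other piece, so there is nothing to rebuild from.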

Diagram of RAID 1: Two disks labeled Disk 0 and Disk 1, each storing identical data blocks A1, A2, A3, and A4, illustrating data mirroring for redundancy.

RAID 1: In RAID 1, data is duplicated to each drive separately, so an exact copy is stored on each drive in the array. Write performance is no better than a single drive, and is often slightly slower thanks to the duplication overhead, though reads can be faster since any drive can serve them. It’s very resilient to loss, since only one drive needs to survive for all of the data to be recovered. The biggest downside to a RAID 1 setup is that effective capacity will only be 1/n of the total combined capacity, where n is the number of drives in the array.

Diagram illustrating RAID 5 data distribution across four disks, showing striped data blocks (A, B, C, D) and parity blocks (p) spread evenly among Disk 0, Disk 1, Disk 2, and Disk 3.

RAID 5: In RAID 5, data is distributed to the drives much as it is in RAID 0, with the addition of “parity” blocks created using some fancy math. The parity is spread across all of the drives rather than kept on a dedicated one, but it takes up one drive’s worth of capacity in total, and the scheme requires a minimum of 3 drives. RAID 5 enjoys most of the read performance benefits of RAID 0 (writes are slower, since parity has to be recalculated) with the added benefit that it can survive the failure of any one drive in the array, since the remaining drives can be used to rebuild the lost data. However, if another drive were to fail during that operation, all data would be lost.
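The “fancy math” is less scary than it sounds. RAID 5 parity is typically a bytewise XOR of the data blocks in a stripe, and XOR has the handy property that any one missing block can be recovered by XOR-ing everything that’s left. Here is a minimal Python sketch of a single stripe. The block contents and the `xor_blocks` helper are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across a hypothetical 4-drive array: 3 data blocks plus 1 parity block.
data = [b"PHOTO001", b"VIDEO002", b"AUDIO003"]
parity = xor_blocks(data)

# Simulate losing the drive that held the second data block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # b'VIDEO002'
```

Lose two blocks from the same stripe, though, and there is no longer enough information to solve for either one, which is exactly why a second failure during a rebuild is fatal.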

Diagram showing RAID 6 data storage across five disks, with data blocks (A1–E1, etc.) and two sets of parity blocks (P and Q) distributed among disks labeled Disk 0 to Disk 4.

RAID 6: Functionally similar to RAID 5, but with 2 independent sets of parity rather than 1, costing two drives’ worth of capacity and requiring a minimum of 4 drives. The array can survive any two drives failing at once, which greatly decreases the chance of an unrecoverable failure during data recovery. It is therefore highly recommended for large arrays or those using high-capacity drives that take longer to rebuild and therefore spend more time vulnerable to loss during a recovery.

Nested or Hybrid RAID: This description spans a wide range of different implementations that are usually tailored to a very specific use case. In simple terms, we’re talking RAID arrays within RAID arrays. For example, RAID 10 is a RAID 0 stripe in which each member is itself a mirrored set of drives, as in RAID 1. There are far too many variants to get into here, so check out the Wikipedia article if you’d like to learn more.
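Since every level trades capacity for something, it can help to run the numbers before you buy drives. The sketch below is a rough Python calculator for identical drives. The function name is my own, and the RAID 10 case assumes the common layout of two-drive mirrors striped together. Real products also lose a little space to formatting overhead, which is ignored here.

```python
def usable_capacity_tb(level, n_drives, drive_tb):
    """Usable capacity in TB for n identical drives at a given RAID level."""
    minimum = {"JBOD": 1, "0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported level: {level}")
    if n_drives < minimum[level]:
        raise ValueError(f"{level} needs at least {minimum[level]} drives")
    if level in ("JBOD", "0"):
        return n_drives * drive_tb        # every byte is used for storage
    if level == "1":
        return drive_tb                   # 1/n of the combined capacity
    if level == "5":
        return (n_drives - 1) * drive_tb  # one drive's worth of parity
    if level == "6":
        return (n_drives - 2) * drive_tb  # two drives' worth of parity
    # RAID 10, assumed here to be a stripe of two-drive mirrors
    if n_drives % 2:
        raise ValueError("10 needs an even number of drives")
    return n_drives // 2 * drive_tb


for level in ("JBOD", "0", "1", "5", "6", "10"):
    print(f"{level:>4}: {usable_capacity_tb(level, 4, 8)} TB usable")
```

For four 8 TB drives this works out to 32 TB for JBOD and RAID 0, 8 TB for RAID 1, 24 TB for RAID 5, and 16 TB for both RAID 6 and RAID 10.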

I feel it is important to stress at this juncture that RAID is not a backup unless it contains data also stored elsewhere. Yes, an appropriate implementation of RAID does provide a level of resiliency to the one effective copy of the data stored on the array, but if some freak occurrence results in the loss or destruction of the entire array, you’re no better off than if the data had been stored on a single drive in its entirety. Therefore, among those with large amounts of data, it’s often the case that their RAID storage array is treated as the “master” copy of their data, with additional storage methods being utilized for data that warrants it.

Cloud Storage

This is a bit of a touchy subject among the data hoarder community. Cloud storage undoubtedly offers a minimally complicated, easy-to-use solution to generate backups of specific files on your PC without even really needing to think about it. Of course, that comes at the cost of having to upload your potentially sensitive data to somebody else’s computer, somewhere else in the world, which for some people is a dealbreaker in and of itself. For many with nothing more than a collection of family photos or their tax documents they need to keep safe though, it’s kind of an ideal solution.

On the other hand, for data packages larger than a terabyte or two, cloud storage services get to be expensive very quickly, and many providers don’t even advertise a subscription tier of more than about 10 TB (for reference, I have almost 4 times that on my various drives at the moment). What’s worse, even many of the “unlimited storage” services are not as “unlimited” as they advertise themselves to be, as many of them will throttle your upload/download bandwidth if you exceed a certain amount of data use, or will even cut you off entirely until you get rid of some of your files. Even more egregiously, some have file size limits of perhaps 5-10 GB, presumably to prevent people from uploading massive .ZIP archives, but for a creator who shoots a lot of video, where files can easily be several times that large each, that rules those services out as a solution.
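If you want to know ahead of time whether a per-file limit would bite you, a few lines of Python can scan a folder for anything over a given size. This is only a sketch: the `files_over_limit` name is my own, and the demo uses tiny throwaway files with a 1,000-byte limit so it runs anywhere. For a real check you would point it at your own footage folder and pass whatever limit your provider documents.

```python
import tempfile
from pathlib import Path


def files_over_limit(root, limit_bytes):
    """Yield (path, size) for every file under root larger than limit_bytes."""
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                yield path, size


# Demo with throwaway files standing in for a photo and a video clip.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "small.jpg").write_bytes(b"x" * 10)
    (Path(tmp) / "large.mov").write_bytes(b"x" * 2000)
    for path, size in files_over_limit(tmp, 1000):
        print(f"{path.name}: {size} bytes exceeds the limit")
```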

The way I use cloud storage is twofold: as a quick and easy means of transferring files between my phone and my computer, and as a last-resort backup of my most important files and favorite work. Mass cloud storage of my full archive is likely never going to be an economical solution for me, but I think I’d be foolish to pass up use of an extra backup method I’m already paying for through my Microsoft 365 subscription.

In closing, I think it’s important to remind you all that as digital creators, we all have projects and work that we’d be devastated to lose. It might have involved time we had to take off from work, money that we set aside from other things, effort that we hadn’t originally planned to invest, or you could’ve just really enjoyed the process of doing it. I would argue that the fruits of that labor are even more important to protect than our cameras or other equipment, since most of those things are commodities that can be replaced. I hope I’ve given you some of the tools and knowledge to ensure your data is safe and sound. And remember, 3-2-1.


Image credits: Stock photo from Depositphotos

