Back in 1860, a revolutionary new system increased the flow of information throughout the country, reducing to days what used to take weeks or months. The Pony Express used relays of horse-mounted riders to deliver the mail between Missouri and California. Yet, as inventive as it was, the Pony Express lasted for only 18 months.
Why? The telegraph came along and wiped out the Pony Express almost overnight.
The telegraph meant that people were able to transfer information not in days, but in minutes. Now that was revolutionary! You could get the information to the correct people, regardless of their location, as long as they had a telegraph office nearby. If you were in business, you needed to use the telegraph just to remain competitive and conduct business at the highest levels.
The modern state of records and information management (RIM) programs is a bit like the Pony Express problem: so many of the traditional practices come from the classic paper world.
The original goal for RIM was to figure out how to do things like color-code files to manage them and keep the business moving forward. Even today, most of the focus within records management is on speeding up those processes instead of fundamentally changing them to adapt to a digital environment.
We hardly need tell you that the volume of information has increased exponentially. As we’ve moved from paper to digital documents, the scale has fundamentally changed. We’ve gone from megabytes to zettabytes, and the numbers continue to increase. Organizations unable to easily access their data in digital form are using the Pony Express, while everyone else has moved on to the telegraph.
As with any topic as complex as document digitization, there’s good and bad news.
Let’s cover the good news first. These days many records are, happily, born in a digital format, though the volume can be just as impossible to manage as paper records.
Now comes the bad news: Every office probably still has a printer sitting in the corner.
It won’t take too long before some of those digital records become paper and, before you know it, you’ve got two or three—or more—sets of the same information. Even if these files stay purely digital, research has shown that 70 to 80% of information is actually ROT (redundant, obsolete, or trivial). So only 20 to 30% of your information is an active record – something that you’ll need to keep as reference or as a working file.
Counterintuitive as it seems, digital systems have only exacerbated the problem. There’s a tendency to just stuff information into a digital system like it’s a bottomless filing cabinet. Sooner or later, though, even your digital filing system has a bursting point, and the data must be migrated to a bigger, more feature-rich system.
A data warehouse is a store of data designated for a specific purpose, while a data lake is a large pool of raw data with an undefined purpose.
First of all, storage (whether paper or digital) isn’t free. Even though storage costs on a per-gigabyte basis have declined even faster than microprocessors have increased in processing power, organizational storage costs still go up, on average, more than 10% per year.
Another version of that issue is the data warehouse or the data lake problem. Most digital storage systems don’t have a meaningful purge capacity, so the overall size of the information pie gets bigger… and bigger… and still bigger. And that means ROT continues to grow as well, making it harder to find the right information.
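As a rough illustration of what a purge review could look for, here is a minimal Python sketch that flags two kinds of ROT candidates in a folder: redundant files (identical content stored more than once) and obsolete files (not modified in a long time). The function names and the seven-year age cutoff are assumptions made up for this example, not a retention rule or a feature of any particular RIM product, and the output is a list to review, not a list to delete.

```python
import hashlib
import os
import time
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_rot_candidates(root, max_age_days=7 * 365, now=None):
    """Return (redundant, obsolete) lists of file paths found under root.

    redundant: every copy after the first of files with identical contents.
    obsolete:  files not modified within max_age_days. The default cutoff
               is an illustrative assumption, not a real retention schedule.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    by_hash = defaultdict(list)
    obsolete = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            by_hash[sha256_of(path)].append(path)
            if os.path.getmtime(path) < cutoff:
                obsolete.append(path)
    # Keep the first path (in sorted order) of each group; flag the rest.
    redundant = [p for paths in by_hash.values() for p in sorted(paths)[1:]]
    return sorted(redundant), sorted(obsolete)
```

Calling `find_rot_candidates("/path/to/share")` returns two lists of paths. In practice, what counts as obsolete would come from your organization's retention schedule rather than a single age threshold.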
All of that useless data, possibly existing in multiple places, is the first step to loss of control—and ultimately towards all sorts of severely penalized violations.
We’re currently in a landscape where privacy and data security are really critical issues for any organization, whether you’re a business, a nonprofit, or a government agency; but all of those issues begin with a loss of control and dark data sets.
What’s in a dark data set? That’s the problem: you don’t know, and it’s more than likely to be made up entirely of ROT.
That should be a concern for every organization, because there are regulatory, discovery, and privacy issues waiting to happen – if they haven’t already. Plus, having all of this information available means that, if you’re in the process of legal discovery, you could be forced to produce that data. Producing it is half the problem: the other half is actually finding it, and e-discovery costs are continuing to grow. ROT obfuscates useful information, and organizations can spend about $2,000 per gigabyte just looking for it.
An accurate, up-to-date RIM program is an indispensable tool for maintaining control of your data, eliminating dark data and ROT data, and ensuring regulatory compliance. It’s time to leave your Pony Express data storage problems behind and join the age of digital transformation.
For more on powering a successful digital transformation, watch Building a Complete IG Program: What Are the Pieces of the Puzzle?