Bekker's Blog


IDC: Data Creation Hit 64ZB in 2020

I'll admit it, I'm a sucker for data milestones.

They're simultaneously amazing and entirely expected.

After all, we're all doing more digitally, seemingly every day, with more and more of our output taking the form of images and video. Over this last year, a lot of remote work meetings have happened on video. We're recording a lot of those meetings rather than picking up a pen, since the fidelity of a recorded conversation is always better than what you happen to scribble down while the conversation continues in your ears.

Then there's the duplicate file in a separate service to create a transcript -- you get the idea. We're all living it.

On to the jaw-dropping new figures. Researchers at IDC this week posted a huge new number for global data creation: 64.2ZB in 2020. That's zettabytes. If you're keeping track, it goes kilobyte (KB), megabyte (MB), gigabyte (GB), terabyte (TB), petabyte (PB), exabyte (EB) and zettabyte (ZB).
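For a sense of scale, here is a minimal sketch of that unit ladder in Python. It assumes decimal (SI) units, where each step up is a factor of 1,000; the `UNITS` list and `convert` helper are names made up for illustration, not anything from IDC.

```python
# Unit ladder from the paragraph above, smallest to largest.
# Assumption: decimal (SI) prefixes, so each step is a factor of 1,000.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def convert(value, from_unit, to_unit):
    """Convert a quantity between two units on the ladder."""
    # Positive steps mean moving down the ladder (bigger unit to smaller unit).
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000 ** steps


created_tb = convert(64.2, "ZB", "TB")
print(f"64.2 ZB is about {created_tb:,.0f} TB")
```

Put another way, 64.2ZB works out to roughly 64 billion terabytes.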

"In 2020, 64.2ZB of data was created or replicated, defying the systemic downward pressure asserted by the COVID-19 pandemic on many industries and its impact will be felt for several years," said Dave Reinsel, senior vice president, IDC's Global DataSphere, in a statement Wednesday. "The amount of digital data created over the next five years will be greater than twice the amount of data created since the advent of digital storage."

It's not just virtual meetings, social media, video streaming, wireless and mobile traffic and ever-fatter broadband pipes driving the increase in data creation, which IDC anticipates will continue at a compound annual growth rate of 23 percent through 2025. It's also IoT data, which is the fastest-growing data segment, along with data at the edge and enterprise data generally.
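To see what a 23 percent compound annual growth rate implies, here is a rough back-of-the-envelope sketch. It assumes the rate compounds yearly from the 64.2ZB figure for 2020, which is my reading of the numbers rather than IDC's published year-by-year forecast; the `project` function is a hypothetical helper.

```python
def project(base_zb, cagr, years):
    """Return yearly volumes, starting with the base year, under compound growth."""
    return [base_zb * (1 + cagr) ** n for n in range(years + 1)]


# Assumption: 64.2 ZB in 2020 growing at 23 percent a year through 2025.
projection = project(64.2, 0.23, 5)

for year, zb in zip(range(2020, 2026), projection):
    print(f"{year}: {zb:.1f} ZB")
```

Under those assumptions, annual data creation nearly triples over five years.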

Beyond the headline data creation number, what's also interesting in IDC's recent analysis is how ephemeral most of that data is. The overall installed base of storage capacity grew steadily to 6.7ZB in 2020. You'll notice that's only about 10 percent of the volume of data that was created.

In all, less than 2 percent of the new data was saved and retained into 2021. Most of it was temporarily created or replicated for consumption and then deleted or overwritten with newer data. Think downloading a Netflix movie to your phone, watching it and then having it removed when you're finished.
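The arithmetic behind those two percentages is easy to check. This sketch uses only the figures quoted above; the variable names are mine, and the 2 percent value is treated as an upper bound because IDC says "less than."

```python
created_zb = 64.2            # data created or replicated in 2020
installed_capacity_zb = 6.7  # installed base of storage capacity in 2020

# Installed storage capacity as a share of data created.
capacity_share = installed_capacity_zb / created_zb

# Ceiling on retained data, assuming "less than 2 percent" of what was created.
retained_ceiling_zb = 0.02 * created_zb

print(f"Capacity vs. data created: {capacity_share:.1%}")
print(f"Retained into 2021: under {retained_ceiling_zb:.2f} ZB")
```

So the installed base comes to a bit over 10 percent of what was created, and the data that survived into 2021 amounts to less than about 1.3ZB.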

The combination of those numbers prompts Reinsel to ask, "How much of it should be stored?"

In IDC's view, we should think about retaining a lot more data, at least on the business side.

"Organizations should consider preparing now to store more data as they seek to achieve digital transformation milestones and improve business metrics by accelerating innovative data analytics initiatives," said John Rydning, research vice president of IDC's Global DataSphere.

Whether organizations need to store more data or just need to use what they already retain more effectively, one thing is sure: We're awash in data, and it's still flooding in.

Posted by Scott Bekker on March 25, 2021