De-Risk Your Data to Accelerate Your Cloud Journey: Part 1 — How Did We Get Here

Reducing the risk of your data while moving it to the cloud can help you get early wins in your cloud journey without adding unnecessary risk to your business.

Eric McCarty
7 min read · Jul 8, 2021

How Did We Get Here?

“It’s 2021, why aren’t all your data and analytics workloads in the cloud yet?”

This is a question generally asked by folks at digital native companies, naïve idealistic engineers (my favorite kind), and executives who don’t quite grasp why their analytics journey is moving at a snail’s pace. But for highly regulated industries or very risk-averse companies (or both, which is the case for most financial services companies), the answer is usually not easy.

To understand why these highly-regulated, risk-averse (from here, shortened to HRRA) companies are often slow to move data to the cloud, you have to look at some history. To start, many of these companies are decades old, some more than a century, and most have achieved that longevity by building trust with their customers and not taking unnecessary risk.

There are many ballyhooed articles talking about companies that fail to innovate, but you’ll see a surprising lack of financial services companies on these lists. You’ll notice none of the 50 in that linked article are traditional finserv companies; it’s always some combination of Blockbuster, Polaroid, Kodak, AOL and similar stories. But the truth is, in the financial services/HRRA world, moving slowly and steadily, avoiding risk, and building trust are paramount to success.

Take the biggest failures in the financial services industry, for example (the top two are actually the two largest failures by assets of ANY company, regardless of sector). We’ll look at those two: Lehman Brothers and Washington Mutual. Without getting into all the details of Lehman Brothers’ demise, most analysts agree the biggest cause of its downfall was engaging in risky subprime practices. Practices like these contributed to the 2008 financial crisis, which resulted in the market being highly regulated and scrutinized by the federal government, with each major decision analyzed under a microscope. That landscape is still very much in effect today. In Washington Mutual’s case, the company was at one time lauded for being risk-taking and innovative. In 2003, CEO Kerry Killinger said:

“We hope to do to this industry what Wal-Mart did to theirs, Starbucks did to theirs, Costco did to theirs and Lowe’s, Home Depot did to their industry. And I think if we’ve done our job, five years from now you’re not going to call us a bank.”

In a classic “aged like milk” story, people weren’t calling them a bank anymore, but for very different reasons than he expected. WaMu was filing for bankruptcy and being acquired five years later. While there are many contributing factors to WaMu’s demise, their risk-taking strategy became a cautionary tale for other financial services companies.

In the end, both of these companies lost the trust of the federal government and of their customers, which ultimately led to their failure. In response, many HRRA companies have focused on staying compliant and continuing with decisions that build trust with their customers.

Disclaimer: This is not making the case that HRRA companies cannot be innovative, that they shouldn’t take measurable risks, or that they couldn’t be disrupted by startups similar to other industries. What I am saying is that fundamental shifts in doing business, like moving their most precious assets to the public cloud (something that may be obvious and 8-years-too-late for digital native companies) will come with more careful planning and risk-reducing actions, which generally also means lagging behind other industries.

All of the above is to say that history has made HRRA companies gun-shy of taking risks, like moving large amounts of data to the public cloud. Don’t get me wrong, almost all HRRA companies in 2021 have some data in the cloud, many of them for a very long time. But if you peel this back, much of this data falls into three domains:

  • Partners that Leverage the Public Cloud: where the partner owns the liability of protecting that data. To function and compete, most HRRA companies have to send large amounts of data to partners who bear the responsibility of processing and protecting it. For example, Experian is a company that collects and aggregates sensitive data from many banking companies, and much of their storage and processing is done in the public cloud. Many of these banks entrusted Experian to protect their data in a public cloud that they themselves had not yet leveraged or trusted. Because of Experian’s expertise in this space, along with their neck being on the line in the event of a data incident, many banks send their data with confidence (or ignorance, perhaps). There are many examples like this across many HRRA companies, and it’s a requirement to compete in these industries.
  • SaaS Applications: where, like above, the SaaS company owns the liability of protecting that data. There is no “shared responsibility” here: the SaaS provider owns the processing and protection of the data used in its application. The benefits of many of these applications make them a worthwhile (and sometimes necessary) component of a company’s infrastructure, and they have been leveraged for many years.
  • “Non-Critical Zones” in Public Cloud: where many HRRA companies are dabbling, iterating, and experimenting in the public cloud. These zones are getting HRRA companies’ feet wet as they work out the right security and implementation model, but if we’re being very honest, they are not transforming the business in the way the public cloud promised. Because these companies now share responsibility and burden for the protection of their data, expansion in the public cloud is painful, frustrating, and slow. If you have read this far, it’s likely you are here.

A Common “Analytics in the Cloud” Story

“Analytics is a great use case to move forward with public cloud, isn’t it?”

When it comes to data and analytics, historically there have been two competing issues at play:

  • Most security functions (IAM, firewall, encryption, etc.) are binary: keep the bad guys out, and allow the good guys in to only the data they need.
  • To be successful, most data analysts need access to a ton of data: access to large sets of data is one of the key components to an analyst’s success (sometimes referred to as “data democratization”). When you are talking about thousands of elements or more, it’s hard to nail down exactly what elements are needed at a specific time, so analysts prefer access to it all.

This dichotomy has led to some of my favorite challenges in my career thus far. As a data nerd, I tend to empathize and have architected solutions aimed at the second group. If you are a data person at all, you have probably done some analysis in the last year on something COVID related, only to hit some pitfall of some missing piece of information that doesn’t enable you to verify a theory you had. Now imagine doing that at work every day, except that missing information is available, but locked behind another database, or only accessible if you get approval from 3 layers of bureaucratic, unclear data ownership. By the time you go through the process of getting access to this data, it’s a week later and you are on to way different problems, and the spark of your original theory that could have saved the company millions has long been lost.

That being said, I empathize with the security group too as they have a thankless job. If they do their job right, each owner of data gets to wall off their specific data and analysts complain that they are impeding progress. If they don’t do their job right, there is a data breach and they are in the news, and those same analysts complaining would not magically show up in their defense (and guess who gets “re-orged” in that scenario?).

Due to the aforementioned highly-regulated and risk-averse environment (plus being a very highly-prized target of criminals), many companies have spent a decade or more securing their on-premises environment, painstakingly making sure their data environment is a fortress against outside attack. However, during that same time, the rise of analytics and big data programs has opened up access to very large amounts of data to an ever-increasing number of analysts and data scientists internally. And it gets very difficult to know whether the data these folks can access contains sensitive information. While it’s easy to eliminate obvious high-risk/low-reward fields for analysts (like SSN), it gets much more difficult with fields that are more ambiguous (like freetext input fields that may contain an SSN if a customer mistakenly entered it into that field). But that risk has always seemed somewhat offset because those same analysts with access to all that data have very little chance of exposing that data externally due to the fortress that security has placed on the perimeter.
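To make the freetext problem concrete, here is a minimal sketch of how one might flag fields that contain SSN-like values. Everything in it is an assumption for illustration: the pattern (nine digits with optional hyphen or space separators), the function names, and the sample record are hypothetical, and a real scan would likely need a purpose-built data classification tool rather than a single regular expression.

```python
from __future__ import annotations

import re

# Assumption: an SSN-like value is 9 digits grouped 3-2-4,
# optionally separated by hyphens or spaces.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def find_ssn_like(text: str) -> list[str]:
    """Return every SSN-like substring found in a block of free text."""
    return SSN_PATTERN.findall(text)


def flag_risky_fields(record: dict[str, str]) -> list[str]:
    """Return the names of fields whose value contains an SSN-like string."""
    return [name for name, value in record.items() if SSN_PATTERN.search(value)]


# Hypothetical record: the note field was never meant to hold an SSN.
record = {
    "customer_note": "Please update my account, my SSN is 123-45-6789.",
    "product": "checking",
}
print(flag_risky_fields(record))
```

A pattern like this will produce both false positives (any nine-digit number) and false negatives (an SSN typed with unusual spacing), which is exactly why ambiguous fields are so much harder to reason about than a column named SSN.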

Enter public cloud. Unlike with partners that leverage public cloud and SaaS applications, here the company is 100% liable for any data incident (both financially and in public opinion). And now the perimeter that security has worked so hard on, the one that has helped reduce risk and build customer trust, has to be extended. So when discussing the public cloud, this leaves HRRA companies with three traditional options:

  • Keep the “walled off” data analytics environment on-prem and resist the move to the cloud
  • Ensure the information security and cloud governance processes are perfect/highly mature before movement of any data to the cloud
  • Increase the risk appetite and move forward

All Traditional Options Have Risk

“Wall Off” Data Environment On-Prem:
  • Reduced agility
  • Reduced scalability
  • Talent gap issues
  • More time/cycles spent on operations

Wait for Perfect Infosec/Governance in the Cloud:
  • Long timelines before value seen
  • Do not get value out of commitments
  • Frustrated workforce
  • Unhappy users
  • No reduction of on-prem infrastructure

Increase Risk Appetite:
  • Risk of data exposure if something is misconfigured
  • Brand reputation risk
  • Fines

Fig 1. The Three Traditional Options

In my experience, most HRRA companies have chosen the middle option. And in that same experience, “frustrated workforce” and “unhappy users” is putting it mildly. I’ve seen analysts actively avoid anything to do with the cloud because it was easier to get their job done without it, engineers work around constraints in ways that add risk, and project and cloud migration timelines missed by a year or more. It’s a mess.

There is another option though. In part 2 of this series, we will explore how to de-risk your data so you can move past these hurdles and accelerate your journey to the cloud.

Continue this series in part 2 here.

Eric McCarty

Data Specialist Engineer at Google and former Technical Architect at USAA and Walgreens. Opinions are my own and not those of Google.