Convenience is so important to me. Move quickly and get things done. That’s probably why I am so addicted to Amazon Prime, despite the fact that I could save money if I would just do a little research and find better prices for the things I buy. My impatience and one-click buyer behavior are the reason that Amazon is on track to deliver more than $200B in revenue this year. Happily, I am not a material portion of that revenue!
The same expectation for convenience and one-click service is having a major influence on the database and big data analytics industry as well. No one can deny that Data Warehousing as a Service (DWaaS) is shaping our customer conversations, and I fully understand why. After all, what could be better than simply handing over your data and letting someone else manage everything? All you need is a nice visualization tool that delivers the dashboards, reports and ad hoc queries right to the screen in front of you.
But let’s dig a little deeper into the “hand over your data” part of this discussion. Data Warehousing as a Service is an offering that is delivered by people. Handing over your data means you choose to give full access to all of your data and you trust the company and its employees fully and completely with that data. You are confident that they have the resources, expertise, maturity, processes and policies to protect your data. When we’re talking about DWaaS in the cloud, then you’re actually agreeing to give full and unlimited access to all your data to at least two external companies – the DWaaS provider and the Cloud provider. This is a serious decision and one that could have significant business implications and legal liabilities.
DWaaS providers make strong and compelling claims about their security practices. They ensure that all your data is encrypted, but who has the keys to that encryption? They continuously work to optimize and improve your query performance, but doesn’t that mean that they can see what your queries look like and what the results are? They manage all the infrastructure, including compute and storage, but how confident can you be about how that infrastructure is shared with other data sets, especially when database sharing is in play? And do you really know who the people delivering this DWaaS are, and where they are located? Many tech companies use third-party resources in a variety of locations to manage costs, extending the visibility of your data even more broadly than you might know.
The most prominent DWaaS providers are young companies with employee populations in the hundreds. The hacking industry, by contrast, is populated by thousands of experienced and highly motivated people who breach security every hour of every day. The hackers’ maturity, experience and proven track record are undeniable. But malicious intent is not the only concern; in fact, for me, it’s less worrisome than human mistakes.
Convenience but at what cost?
The two most prominent “oops” stories in my recent memory come from Amazon and Twitter. Remember back in February 2017, when significant portions of the Internet were crippled because part of Amazon’s cloud, S3, experienced an hours-long outage?
In an official Amazon statement, the company said “An authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.”
The two most important takeaways for me were that this was an authorized employee and that he or she was following an approved company process, yet simply made a mistake. No one was malicious here, but the impact was enormous. A similar example came recently from Twitter, when an employee “accidentally” deleted a very well known (no politics allowed!) Twitter account. Again, this employee was following company policy and procedures but succeeded in executing a “delete” command that had quite an impact. Now think about the likelihood (high) that the same kinds of mistakes could also happen within a DWaaS environment. It’s something that must be seriously considered when the attractiveness of Data Warehousing as a Service is flirting with the CIO and an overworked IT team.
So why is Vertica a better choice? Why should companies really worry about “outsourcing” their data warehouse? There are two fundamental reasons why Vertica can deliver the analytics needed without the risk that everyone dreads. First, Vertica provides a granular level of role and privilege control, ensuring that even within the company that trusts its data to Vertica, each user can see only the data and the queries for which he or she has been specifically granted access. Column-level encryption, alerts in the Management Console and a wide variety of other security functions are built into the core of the Vertica code. But the second reason is even more compelling to me. When there is a problem with a table, a problem with a node, a performance issue with a query or, frankly, any support-related issue, the customer is fully and completely in charge of what information is shared with the Vertica Support and Best Practices team. Only the customer can choose to provide Vertica logs, and only the customer can choose what data is included in those logs. Only the customer can run a Scrutinize report and provide that information to Vertica. This gives customers full and complete control over what they share, something that is not true with Data Warehousing as a Service.
I was recently chatting with Shrirang Kamat, the leader of Vertica’s Best Practices team and the point of escalation for the most critical customer issues. He told me that he has helped customers whose data was so sensitive that they could not allow him to see even the column names in their tables. He shared a funny story about information that he did learn from data provided by one of our customers … information that he wished he didn’t know! But visibility into any and all of the data that Vertica manages and analyzes for our customers is firmly and completely within their control. Since they’re the ones who will pay the price if mistakes are made, that level of control will definitely provide a better night’s sleep, especially to the CFO and General Counsel.
The message here is clear: there are a lot of questions but not a lot of answers. Human error is unavoidable, so how can our customers ensure that they don’t add yet another layer of risk (DWaaS) to their cloud stacks? Take control and keep control!
Security is not the only unknown. As we approach the end of the quarter, my Product Marketing team is working hard to ensure that every forecasted expense, invoice, accrual and Pcard reconciliation is completed before quarter close. We get in a LOT of trouble if we overspend or miss our forecast, and I can’t help but assume that many of our customers have the same constraints. With DWaaS providers that “auto scale” without any control or approval from their customers, the only accurate financial forecast is the monthly bill. My experience is that Finance doesn’t like surprises!
None of this means that Data Warehousing as a Service isn’t the right choice for some companies. But for companies whose data is truly foundational to their business success and legal responsibility, it’s a good idea to consider the risks as well as the convenience.