By Peter Bailis, CEO and Founder, Sisu
We’ve spent weeks examining how the COVID-19 pandemic is changing how we operate. Everyone is adapting to remote teams, learning how to collaborate with customers, and in some cases taking extreme measures to manage IT costs.
I believe it’s time to start looking ahead again. Whether your forecast for the next few quarters is up or down, making informed, confident decisions about where to invest now is how successful organizations will separate themselves from the competition.
To find that path forward, we have to change how we leverage the data we have and turn to proactive analytics, rather than predictive models, to inform strategic decisions in real time.
Proactive analytics prioritize attention and accelerate decision-making
Trusting in prediction when the world is unpredictable is risky, especially when teams need to make rapid decisions on everything from resource allocation to retention campaigns. More than that, completely rebuilding models around a new set of assumptions presents its own problems: the cost is prohibitive, the time it takes to retrain a model is untenable, and with the rate of change we can expect to see over the next few months, “predictions” will be challenging, no matter the method.
But that doesn’t mean your analytics team is without the data they need to inform decisions across the business. While the value of predictive models is rapidly decaying, you’re already capturing the rich transaction, customer, and daily operations data you need from data sources that are accurate to the minute.
The implication is that you need to realign how you’re using this data to fit the needs of the business. Instead of more “peacetime” investments like predictive tooling and model building, you need to find the fastest way to make the data available to analysts across the organization, and then augment their ability to explore it rapidly. By doing so, you inform more daily decisions with data, and avoid the inevitable missteps of “gut-feel” decision making.
New analytics engines, purpose-built for cloud-native data
The necessity of faster, more comprehensive analytics is highlighting a systems discrepancy that’s been building for years: we’ve dramatically improved how we collect, manage, and store complex data, but the platforms we use to work with it have fallen behind.
Cloud-native data architectures make it easier to both handle rich, structured datasets and make the information readily available to engineering, data science, analytics, and the business. These datasets are more detailed than ever before, with rich transaction- and session-level information spread across hundreds of factors. These flat, disaggregated tables are far more flexible and accessible for analysts, compared to the star or snowflake schemas that optimize for the performance of legacy warehouses.
But companies that are investing in these more flexible cloud data warehouses, modern data pipelines, and data engineering tools are discovering that their legacy BI and dashboarding tools struggle with the breadth and depth of the data. Companies that rely on these manual, desktop-based tools will miss out on the gains a cloud-native architecture can provide.
To truly get ROI from their data infrastructure investments, companies will adopt a new breed of powerful analytics platforms purpose-built to automate the exploration of these wide tables, rapidly test millions of hypotheses, and proactively recommend where analysts should prioritize their attention.
Where ML actually helps: A relevance-based approach to data exploration
It’s no longer feasible to expect analysts to manually test every feature or carefully construct complex queries for static dashboards and ad-hoc requests. In these flat datasets, the number of potential hypotheses runs into the hundreds of millions, or even billions. There’s no way manual exploration and feature selection can cover a material percentage of that space.
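To see how quickly that hypothesis space explodes, consider a hypothetical wide table with 100 categorical columns of 10 distinct values each (these numbers are illustrative, not drawn from any specific dataset). Counting every combination of up to three (column, value) predicates already lands in the hundreds of millions:

```python
from math import comb

n_cols, n_vals = 100, 10  # hypothetical wide table: 100 columns, 10 values each

# A k-factor hypothesis picks k distinct columns and one value per column.
one_factor = comb(n_cols, 1) * n_vals**1       # 1,000
two_factor = comb(n_cols, 2) * n_vals**2       # 495,000
three_factor = comb(n_cols, 3) * n_vals**3     # 161,700,000

total = one_factor + two_factor + three_factor
print(f"{total:,}")  # 162,196,000 candidate hypotheses
```

Even at a generous one query per minute, an analyst testing these by hand would need centuries, which is why this exploration has to be automated.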
Rather than starting from the raw data and making biased guesses based on gut assumptions and historical knowledge, we should use platforms that take a declarative approach. These platforms start from the metrics that matter, and then use machine learning to proactively explore the hypotheses that explain any change.
There’s a parallel here to a modern search engine. Search allows users to rapidly locate highly relevant content, drawn from a collection of billions of documents, with only a few simple search terms. And over time, these engines improve, learning from behavior signals like clickthrough rates and related pages.
It’s the same with declarative analytics platforms. Results from this analysis are rapidly stack-ranked, interesting populations rise to the surface, and competing hypotheses are tested and presented in parallel. With this approach, your analysts can now spend more time making the answers actionable and less time digging through the data.
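The core idea behind this kind of stack-ranking can be sketched in a few lines. The example below is a deliberately simplified illustration, not Sisu’s actual algorithm: it scores every (factor, value) subpopulation in a toy flat table by how much of the period-over-period change in a metric it accounts for, then sorts by absolute impact. The column names and figures are invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows from a flat, wide fact table: each row is one
# transaction with a numeric metric ("revenue") and categorical factors.
before = [
    {"region": "EU", "device": "mobile", "revenue": 100},
    {"region": "EU", "device": "desktop", "revenue": 120},
    {"region": "US", "device": "mobile", "revenue": 200},
]
after = [
    {"region": "EU", "device": "mobile", "revenue": 40},
    {"region": "EU", "device": "desktop", "revenue": 115},
    {"region": "US", "device": "mobile", "revenue": 210},
]

def rank_drivers(before, after, metric, factors):
    """Score every (factor, value) subpopulation by how much of the
    period-over-period change in `metric` it accounts for."""
    def totals(rows):
        t = defaultdict(float)
        for row in rows:
            for f in factors:
                t[(f, row[f])] += row[metric]
        return t

    b, a = totals(before), totals(after)
    impact = {k: a.get(k, 0.0) - b.get(k, 0.0) for k in set(b) | set(a)}
    # Stack-rank hypotheses by absolute impact, largest first.
    return sorted(impact.items(), key=lambda kv: -abs(kv[1]))

ranking = rank_drivers(before, after, "revenue", ["region", "device"])
print(ranking[0])  # (('region', 'EU'), -65.0): EU accounts for most of the drop
```

A production system would, of course, also test multi-factor combinations, correct for overlapping populations, and weigh statistical significance; the sketch only shows why surfacing a ranked shortlist beats handing analysts the raw table.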
Changing how you make data-driven decisions is critical to seeing your organization through this uncertain time. Successful leaders will seize every opportunity to proactively use detailed facts about their business’s current performance to secure its future.
About the Author
Peter Bailis is the founder and CEO of Sisu Data, a data analytics platform that helps users understand the key drivers behind critical business metrics in real time. Peter is also an assistant professor of Computer Science at Stanford University, where he co-leads Stanford DAWN, a research project focused on making it dramatically easier to build machine-learning-enabled applications. He received his Ph.D. from UC Berkeley in 2015, for which he was awarded the ACM SIGMOD Jim Gray Doctoral Dissertation Award, and an A.B. from Harvard College in 2011, both in Computer Science.