When my kids were younger, the sheer volume of Legos scattered hazardously around the house demanded near-constant cleanup, which always included storing and organizing the bricks. For kids eager to speed through cleanup, organizing the bricks by color, rather than by type or by the kit they came in, would appear to be the optimal Lego storage strategy. But when playtime came back around, they learned the hard way that storing thousands of individual Legos by color made it nearly impossible to locate the bricks they needed in order to build functional structures.
In essence, my kids were making their Lego storage decisions based on what would be easiest up front instead of considering how it would affect their ability to retrieve the specific pieces they’d need later on. We often see similar tendencies when it comes to organizations and their data architecture preferences.
Early in an organization’s shift to more comprehensive data operations, we find that decision making revolves around architecture selection, with factors like user interface and the speed, risk, and cost of implementation at the forefront of conversation.
Architecture is indeed important to successful data operations, but it’s hardly the only thing an organization must consider after deciding to introduce, replace, or upgrade their systems. Too often, fixation on architecture selection comes at the expense of understanding why the architecture is important in the first place.
For example, we field lots of concerns about data's mobility within a given architecture. With what speed and ease will the data get from Point A to Point B? Sure, it's a good question to ask. But as my kids learned by storing their Legos by color rather than by type, perhaps the more important question is: what is going to be done with the data once it's at Point B?
Understanding a particular architecture's limitations is crucial to forecasting analytic capabilities down the road. An even better way to identify the right architecture, though, is to work backwards so that an organization's goals dictate the architecture, rather than the other way around.
One example we frequently observe involves an organization's decision to adopt a data lake as its primary storage solution. Depending upon longer-term objectives, a data lake may well be the right choice. But what needs will be met by introducing a data lake? How will business users benefit from this configuration? Failing to answer these questions early on leaves organizations vulnerable to inefficient decisions that are more likely to be reversed.
When mapping out an analytic strategy, we encourage IT departments to shift the conversation from architectural configuration to broader analytic goals. This allows your goals to dictate the architecture you choose on the front end, rather than letting the architecture dictate what can ultimately be done with the data.
About the Author
Andre Wolosewicz is the Director of Sales at HEXstream. He received his MBA from Northwestern University’s Kellogg School of Management and has spent over 20 years in analytics with industry experience focusing on Operations and Supply Chain Management. Andre has six patents issued and nine pending related to machine learning and data processing. Outside of work, Andre is the treasurer for a local region of the American Youth Soccer Organization and the Lyons Township High School Boosters.