It is a defining dilemma for modern companies: Big data technologies enable them to acquire, organize, manage and analyze vast amounts of new types of information, much of it unstructured, but the new infrastructure developed to cope with this unprecedented workload must operate alongside existing IT architectures. The challenge is to marry the old and the new.
One possibility, of course, might simply be to do without the new. After all, while conventional relational database management systems can’t process unstructured data – which flows in from widely diverse sources, from social media and GPS sensors to weather data and news feeds – it is possible to translate this information into a structured form. In which case, traditional IT infrastructures might be able to cope.
The translation process, however, is cumbersome and time-consuming. That drives up costs and slows down the crucial, time-sensitive insights that the new data inputs were supposed to deliver. Sticking with the old is a solution that is neither economically feasible nor strategically sensible.
So what might the new infrastructure look like? In recent times big data has helped to drive a trend toward virtualized and consolidated data centers, using a handful of large servers connected to giant, shared storage platforms. But that may not be how structures evolve in the future.
For one thing, this model relies on the funneling of huge amounts of data through a limited number of shared storage disks, potentially producing frustrating bottlenecks that hamper performance. For another, other models look more cost-effective.
In particular, a platform consisting of large numbers of smaller, commodity servers handling storage locally is highly scalable. To expand capacity, the business just adds more of these smaller servers, which are relatively inexpensive, rather than having to upgrade enterprise servers and storage equipment at great cost.
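The economics of this scale-out model can be sketched with simple arithmetic. The per-server capacity and cost figures below are illustrative assumptions, not vendor benchmarks:

```python
# Illustrative scale-out planning: capacity grows by adding identical,
# inexpensive commodity servers. All figures are hypothetical.

def scale_out_plan(required_tb, server_capacity_tb=10, server_cost=5_000):
    """Return (servers_needed, total_cost) to reach the required
    capacity using commodity servers with local storage."""
    servers = -(-required_tb // server_capacity_tb)  # ceiling division
    return servers, servers * server_cost

# Doubling the capacity target simply doubles the server count;
# cost grows linearly rather than in large, step-change upgrades.
print(scale_out_plan(100))  # 100 TB target
print(scale_out_plan(200))  # 200 TB target
```

The point of the sketch is the cost curve: expansion is incremental and roughly linear, rather than requiring a single expensive upgrade of enterprise-class storage.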
However, that doesn’t mean businesses will automatically replace their existing infrastructure with these new big data platforms. In practice, in order to deliver the most value to the business, chief technology officers and their teams will need to operate with both models, while ensuring their data flows to the right places at the right time.
This hybrid model will enable businesses to capture the benefits of new big data platforms without giving up their existing architectures. Indeed, some of the technologies that power the new will also be used to invigorate what is already in place. These dual infrastructures will operate separately but also work together.
That brings a wide variety of challenges. One big issue companies will need to address is the capability of their data centers, which in the future will need to manage large-scale big data platforms in which hundreds or even thousands of clustered servers are added to the infrastructure. They will also need to manage services across the network and to integrate big data management tools with the traditional management suite.
The capacity of the network infrastructure will also need to be assessed, given the requirement to move terabyte-sized data sets. That assessment will include the physical issues of installing so many new servers, such as power requirements, cooling systems and sheer floor space.
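A back-of-envelope calculation shows why power, cooling and floor space become planning issues at this scale. The per-server power draw, rack density and cooling overhead below are illustrative assumptions:

```python
# Rough physical-facility estimate for a large server cluster.
# Per-server wattage, rack density and PUE are hypothetical figures.

def facility_estimate(num_servers, watts_per_server=400,
                      servers_per_rack=40, sq_m_per_rack=2.5):
    """Estimate total power (kW) and floor space (square metres),
    doubling the IT load to allow for cooling (PUE ~ 2.0)."""
    it_power_kw = num_servers * watts_per_server / 1000
    total_power_kw = it_power_kw * 2.0           # IT load + cooling
    racks = -(-num_servers // servers_per_rack)  # ceiling division
    floor_space = racks * sq_m_per_rack
    return total_power_kw, floor_space

# A thousand-server cluster, under these assumptions:
print(facility_estimate(1000))
```

Even with modest per-server figures, a thousand-server cluster implies hundreds of kilowatts and dozens of racks, which is why facility capacity has to be assessed alongside the IT design.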
Storage is another consideration, given the need for multi-petabyte capacities in a big data environment. Much of the data will be a valuable business asset that must be both accessible and protected, raising the stakes for security procedures. Even backing up the data, given its scale, may not be possible using conventional processes.
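The backup problem is easy to quantify. Streaming a full copy of a multi-petabyte estate through a conventional backup pipeline takes weeks, not hours; the throughput figure below is an illustrative assumption:

```python
# Why conventional full backups break down at petabyte scale:
# a simple backup-window calculation with an assumed throughput.

def backup_hours(data_tb, throughput_mb_per_s=500):
    """Hours needed to stream a full backup at a sustained rate."""
    seconds = data_tb * 1_000_000 / throughput_mb_per_s  # TB -> MB
    return seconds / 3600

# A 2 PB (2,000 TB) estate at 500 MB/s:
print(round(backup_hours(2000)))  # -> 1111 hours, roughly 46 days
```

A backup window measured in weeks is not a backup window at all, which is why big data platforms tend to rely on techniques such as replication rather than conventional full backups.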
As the new systems evolve, governance will become ever more important, requiring IT to ensure robust processes are in place to accommodate needs ranging from performance management to crisis resolution.
There will also be the question of how to integrate the big data platform with the rest of the IT infrastructure. Businesses will want to be able to use both structures simultaneously as they draw both structured and unstructured data into analytics tools in order to produce the most valuable insights possible.
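The payoff of that integration can be sketched in a few lines: structured records (such as sales figures) joined with signals extracted from unstructured text (such as a social media feed). The data, product names and naive keyword scoring below are entirely hypothetical:

```python
# Sketch: combining structured and unstructured data in one analysis.
# Products, posts and the keyword-based scoring are all hypothetical.

structured_sales = {"P100": 1200, "P200": 300}   # product -> units sold

social_posts = [                                  # unstructured feed
    "Loving my new P100, great battery!",
    "P200 keeps crashing, very disappointed.",
    "The P100 camera is great.",
]

POSITIVE = {"great", "loving"}
NEGATIVE = {"disappointed", "crashing"}

def sentiment(text):
    """Crude sentiment score: positive keywords minus negative ones."""
    words = set(text.lower().replace(",", "").replace("!", "")
                    .replace(".", "").split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

# Join the two worlds: per-product sales alongside aggregate sentiment.
insight = {
    product: {
        "units": units,
        "sentiment": sum(sentiment(p) for p in social_posts if product in p),
    }
    for product, units in structured_sales.items()
}
print(insight)
```

In practice the structured side would come from a relational warehouse and the unstructured side from a big data platform, with far more sophisticated text analytics; the sketch only shows the shape of the join the hybrid model has to support.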
A Multidisciplinary Response
With so many questions to answer, IT infrastructure teams will need to work with a wide range of other IT professionals in order to build big data platforms. This multidiscipline approach must also include a careful focus on the economics of the projects. Costs to consider include hardware, software, implementation, the risk of delays and problems, and the opportunity cost of not being able to run with other IT projects.
Clearly, there will be no one-size-fits-all solution, and businesses will use variations on the theme. For example, where businesses are less concerned about data bottlenecks – when they’re relatively new to big data, for instance – they may choose to use a commodity server arrangement alongside a centralized, shared storage solution. Some businesses may prefer to buy packaged, engineered systems, rather than designing and building their own. Though the upfront costs may be higher, these packages will be quicker to implement and offer more streamlined management.
For other businesses, however, the build-your-own approach may be more appealing, depending on their individual needs. The key for all decision-makers is to keep a single fundamental principle in mind: IT should be an enabler of business results.
In practice, that requires an ongoing balancing act between the cost of developing big data infrastructures and the opportunities that new capabilities will offer the business as it seeks agility and growth.
Enabling a Better Business
Accenture’s research suggests that high-performing businesses tend to be continuously innovative, adopting strategies that are both adaptable and executable in changing circumstances. In that context, big data technologies allow businesses to use their information much more quickly, but infrastructures will need to be scalable – and they’ll need to be able to demonstrate their value.