Across industries and companies, a sizable number of big data applications now in the proof-of-concept (POC) stage are poised to enter full-blown production status during the second half of 2014. A number of factors are aligning to support this broad-based move from big data POCs to production apps. These include: 

  • The availability of Hadoop 2, which features enhanced real-time processing.
  • Mounting competitive pressure to ensure that critical business decisions are data driven.
  • Emergence of a set of best practices to help IT departments implement big data applications in the context of their overall corporate IT architecture.

According to a Forrester survey, 70 percent of IT decision-makers say big data analytics will be a key priority in 2014. The organizations these IT leaders represent hope to expand the breadth and power of their analytical decision making for competitive advantage.

However, the coming migration of big data apps to production status can pose significant challenges to the people, processes, and technology of the organizations that are deploying them. The following best practices will help businesses successfully navigate these promising but uncharted waters. 

Challenge
Ensure a problem-free transition from proof-of-concept to enterprise-wide production for big data applications.
At Stake
Businesses risk delaying wide-scale deployment of big data projects, disappointing stakeholders, and curtailing the scope of the competitive advantage they seek.
Solution
Adhere to proven planning and deployment strategies and a newly defined set of best practices to sidestep potential obstacles and execute a successful transition.

1. Plan for 99.99% Uptime (At a Minimum) 

Nothing will torpedo a big data rollout faster than failing to meet the expectations of your business users. Reliability is key: once a business adapts its people and processes to a new technology, nothing frustrates it more than for that technology to be unavailable.

To formalize that understanding, business users expect a service-level agreement (SLA) that guarantees the level of performance they will receive, and 99.99 or even 99.999 percent uptime has become the norm. But the challenge of providing continuous availability is substantial. Hadoop maintains three copies of data by default, and service interruptions can occur if servers go down or copies get out of sync.
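To make an uptime target concrete, it helps to convert the percentage into a yearly downtime budget. The following is a minimal sketch of that arithmetic, not part of any SLA tooling; the function name and the list of targets are illustrative assumptions, and a 365-day year is assumed.

```python
# Illustrative sketch: convert an SLA uptime percentage into a downtime budget.
# Assumes a 365-day year; the function name is hypothetical.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Return the minutes of downtime per year permitted by an uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


# Example targets chosen for illustration only.
for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, the 99.99 percent target named in this section's heading leaves roughly 52.6 minutes of total downtime per year, which is why redundancy has to be planned rather than improvised.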

There is also a risk to the data itself if multiple copies are stored in a single location.

To safeguard the data and provide the level of availability that users expect, you will need to bulletproof your infrastructure. This requires a high degree of redundancy to accommodate Hadoop’s requirements.

An alternative approach is to co-locate or host big data applications with a third-party service provider. Such an arrangement allows IT to support the new infrastructure requirements quickly and cost-effectively, while storing Hadoop data at more than one location.

2. Comply With Your Existing Security Architecture 

To launch POC apps quickly, security is often downgraded to a bare minimum. This may happen outside of IT’s purview, and the app in question doesn’t always adhere to corporate security practices.

That needs to change when the application is put into full-scale production. 

Big data apps should not require unique security tools or expertise; rather they should conform to the security requirements of the company’s other mission-critical applications. Typically, this means securing them from endpoint “in” to endpoint “out” using multiple layers of security for the network, servers, endpoints, and data, both in transit and at rest. Access controls, such as those provided by a virtual private network (VPN), should also be in place. 

3. Integrate Big Data Apps with Other Mission-Critical Apps

The integration shouldn’t end with security. No mission-critical IT application can operate in isolation. A big data software stack (including the infrastructure, data, and analytics or “insight” layers, as well as security) needs to be integrated with all the diverse software components that comprise the rest of your organization’s enterprise applications and tool sets.

The ability of a big data application to integrate and share data with other enterprise apps is a prerequisite for achieving the sort of insights and analysis that are driving the move from POC to production in the first place. The best practice is to integrate data from across the enterprise, regardless of organizational silos, into a data “lake.”

If you’re not analyzing all the organization’s data, you run the risk of missing important trends.

There’s inherent complexity in such integrations; they require interfacing with technology from multiple software developers (many of which are competitors) and writing custom scripts to allow for data sharing. That has the potential to overtax the resources of even the most sophisticated corporate IT department. To avoid over-extending, consider partnering with an integrator or service provider with experience deploying enterprise-level big data applications.

4. Go Mobile From Day One 

When a big data app is launched in test mode, it’s operating in a controlled environment, supporting a limited number of use cases and access methods. Control is precisely the point; companies don’t want to provide open and highly flexible access to POCs. POCs, by definition, sometimes fail, and minimizing access can be a key part of risk-management strategy.

Moving a big data application into production is another matter entirely. “Users expect to access any function that IT supports from their iPad or iPhone,” says Bill Peterson, Director at CenturyLink Business. “So for business users to adopt a big data app, they have to have access to it from those same devices.” Smartphone and tablet support is a baseline requirement.

To support the company’s mobile users, IT must provide the necessary infrastructure for secure remote access and centralized storage. It must also account for the growth in usage that is likely once the application becomes accessible through a variety of mobile platforms and users become more familiar with its features.

5. CapEx Versus OpEx: Decide How to Fund Your Infrastructure Requirements 

To support big data infrastructure requirements, companies essentially have two options: they can invest in and build out their current infrastructure — continuing to manage it themselves — or they can partner with a service provider. From a financial standpoint, this often comes down to choosing between treating the infrastructure as a capital expense or an operating expense.

If the business wants to retain absolute control over its infrastructure and acquire in-house expertise in big data tools such as Hadoop and NoSQL, then it should stick with a traditional on-premises strategy and fund it out of its CapEx budget.

But many companies believe it’s the big data application, not the infrastructure that supports it, that will add the most value to the business. They prefer to pay others to manage their infrastructure, freeing their internal IT team to focus on the application and become experts in the analytical and decision support functions it provides.

These businesses may also feel they are better off budgeting their big data initiatives as an operating cost that can be quickly trimmed or enlarged as business conditions require.

“Many companies see the need to outsource to take advantage of their service providers’ economies of scale,” Peterson explains. “This allows them to expand their infrastructure as their business needs dictate without having to over-commit to additional capacity that they might not need on an on-going basis.”

Peterson adds that many businesses adopt a hybrid model, retaining control over critical elements of their infrastructure while outsourcing the rest.
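One way to frame the CapEx-versus-OpEx decision is a simple break-even estimate: how many months it takes for an up-front build-out to cost less in total than a hosted service. The sketch below is illustrative only. The function name is hypothetical and every figure is an invented placeholder, not data from this article or from any provider's pricing.

```python
# Illustrative sketch: break-even point between building in-house (CapEx)
# and paying a service provider (OpEx). All names and figures are hypothetical.
from typing import Optional


def break_even_months(
    upfront_capex: float,
    in_house_monthly: float,
    provider_monthly: float,
) -> Optional[float]:
    """Months until cumulative in-house cost drops below the hosted cost.

    Returns None when the hosted option is never more expensive per month,
    in which case the up-front investment is never recovered.
    """
    monthly_saving = provider_monthly - in_house_monthly
    if monthly_saving <= 0:
        return None
    return upfront_capex / monthly_saving


# Placeholder numbers chosen only to show the calculation.
months = break_even_months(
    upfront_capex=600_000,
    in_house_monthly=20_000,
    provider_monthly=45_000,
)
print(f"Break-even after {months} months" if months is not None else "No break-even")
```

A long break-even period, or none at all, would tend to favor the operating-expense route. A short one would tend to favor building in-house, provided the business also wants the control and in-house expertise described above.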

6. It’s Not Sexy, But Don’t Overlook Documentation and Support 

The major objective of a POC is to gain the knowledge and experience that will ultimately support a production rollout. But within a relatively short period of time that experience will dissipate unless it’s been systematically documented.

Good documentation usually takes two forms: a set of best support practices for IT and a user manual for business users.

Don’t underestimate how important this is for a successful big data deployment. Clear, comprehensive documentation boosts the confidence of support personnel and business users by answering many of their routine questions. And this, in turn, increases their productivity as they come up to speed with the new application.

Conclusion: Plan, Execute, Adapt 

While big data applications have a number of unique characteristics, they should not be viewed in a silo. IT professionals should seek to leverage existing practices, architectures, and expertise wherever possible, treating big data as one more application in the corporate portfolio.

At the same time, big data applications do place additional stresses on a company’s IT resources. One cost-effective way to manage these is by partnering with a reputable managed service provider that brings infrastructure management and big data integration experience to the table.

As companies migrate their big data applications from POC to enterprise-wide production, they will quickly gain experience with Hadoop 2 and other big data tools for the enterprise, refining their data integration practices in the process. Early success will foster grander ambitions as businesses seek to utilize all the data at their disposal for competitive advantage.

6 Key Factors for Successful Big Data Application Rollouts

  • Plan for near-continuous uptime, and put that in an SLA with your stakeholders.
  • Secure your big data app; make it conform to your organization’s established security architecture.
  • Share data and integrate your big data applications with the rest of your company’s mission-critical apps.
  • To meet the expectations of your user base, support mobile computing platforms from the outset.
  • Decide if you will go it alone and fund your big data applications as a capital expense, or if you should partner with a service provider and fund them as an operating cost.
  • To capture what you learned with your POC, document best practices for IT support and for business users.

About Big Data Services from CenturyLink

CenturyLink Big Data Foundation Services combine CenturyLink’s enterprise-grade global infrastructure and network connectivity with proven big data software in a fully hosted and managed service. Our big data solutions allow your organization to realize game-changing insights, operate with unprecedented speed and agility, and gain a true competitive edge. A longtime leader in managed hosting, CenturyLink stores and manages critical data for a wide range of enterprise clients, including five of the top 14 securities firms. 

About CenturyLink Business 

CenturyLink, Inc. is the third largest telecommunications company in the United States. Headquartered in Monroe, LA, CenturyLink is an S&P 500 company and is included among the Fortune 500 list of America’s largest corporations. CenturyLink Business delivers innovative private and public networking and managed services for global businesses on virtual, dedicated and colocation platforms. It is a global leader in data and voice networks, cloud infrastructure and hosted IT solutions for enterprise business customers. 
