Friday, March 4, 2011

5 Laws of Data Integration in the Cloud

In a time when data is so scattered, APIs are so very numerous, and data volumes are exploding, data integration has become essential. More than that, it has become expected.

The cloud is everywhere. It’s one of those momentous paradigmatic shifts in IT. We have to deal with it. We have to understand what it means to each and every one of us.

Sweeping generalization as it may be, we’ve observed that people in every industry, from software to healthcare, from banking to manufacturing, are either on the cloud because they built their business around hosted on-demand software, or they’re using more traditional on-premises software – but they want to take advantage of the cloud.

There are huge advantages to moving to the cloud: reduced expenses, powerful elastic computing platforms that handle peak loads, fast implementations and reduced IT needs, to name a few. Lots of businesses have been quick to grab onto that brass ring, but Gartner did a study on companies transitioning to SaaS to see how it was working. They found that some businesses were actually pulling their data back out of cloud-based applications, and asked the obvious question: why?

In the survey, Gartner asked 270 people, “Why is your organization currently transitioning from a SaaS solution to an on-premises solution?” The number one reason, given by 56 percent of respondents, was unexpectedly significant integration requirements. That bears repeating. More than half of the people who tried moving their business to a cloud-based application and pulled back did it because integrating those applications with the rest of their business proved too challenging to be worthwhile. This has been, since the advent of SaaS, the Achilles’ heel, the kryptonite, of cloud computing and on-demand software delivery.

The next obvious question is: Why is integration so difficult in a cloud environment? There are five good reasons why the data integration challenge that has always been tricky but doable has now jumped several notches on the difficulty scale to something that only Supermen can handle.

Data is Widely Distributed

The first law of data integration on the cloud is that data is widely distributed in a way never seen before. We had enough problems integrating all our data systems, application systems and back-end databases when we owned them, in the sense that we owned them inside our own firewall. That has changed forever. We don’t own our data assets – they are living someplace else, in some (hopefully secure) data center. That’s the reality. Data is far-flung and distributed in a way that it’s never been before.

One of the famous myths of computing is that we gradually shed our legacy infrastructure as we move to more modern technologies, such as cloud computing. The truth is, we never shed anything. We have as many mainframes running today as we ever did. We still have corporate data assets that are in VSAM, and there are corporate mission-critical COBOL data files by the petabyte. We have, of course, all the client/server data. We have all the traditional on-premises application systems. And now we have the cloud to deal with on top of that. Data is everywhere, and this makes data integration pretty tough.

An integrator has to connect those old COBOL files on the mainframe with on-premises application and database data, then connect that to a cloud server three states away. He’s likely to also need to connect an endpoint that lives in a private cloud, requiring extra security access. Integration has always been a tough problem. But this extra problem is a back-breaker for some people.
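To make the mainframe-to-cloud hop concrete, here is a minimal sketch of the kind of translation an integrator ends up writing: parsing a fixed-width, COBOL-style record and reshaping it into the JSON-style payload a cloud API would expect. The field layout and function names are hypothetical, for illustration only – they are not from any real product or the authors’ own tooling.

```python
# Hypothetical sketch: one record's journey from a mainframe flat file
# toward a cloud endpoint. FIELD_LAYOUT and all names are illustrative.

FIELD_LAYOUT = [          # (field name, width) -- fixed-width, COBOL-style
    ("cust_id", 8),
    ("name", 20),
    ("balance", 10),
]

def parse_fixed_width(line):
    """Split one fixed-width mainframe record into named fields."""
    record, pos = {}, 0
    for name, width in FIELD_LAYOUT:
        record[name] = line[pos:pos + width].strip()
        pos += width
    return record

def to_cloud_payload(record):
    """Reshape the legacy record into the dict a cloud API might expect."""
    return {
        "customerId": record["cust_id"],
        "displayName": record["name"],
        "balanceCents": int(float(record["balance"]) * 100),
    }

# 8-char id, 20-char name, 10-char balance, exactly as the layout dictates
line = "00001234" + "Jane Smith".ljust(20) + "150.25".rjust(10)
payload = to_cloud_payload(parse_fixed_width(line))
```

Every pairing of a legacy source and a cloud target needs its own version of this mapping, which is why hand-coding each interface scales so badly.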

Everything Happens Faster

The second law is that everything happens faster now. Those of us who have done integration know that what kills integration projects is the speed of change. Just when an implementation is complete, everything changes. A few years back, the venture capital wisdom was that you needed about $10 million to start a company, because you had to build your data center and understand that infrastructure. Now, you can do it with $1 million. All you need is a credit card, and a platform like Amazon or Azure can fire up 300 servers for you in a minute. In this environment of rapid change, old-school methods of integration, such as hand-coding each interface, become economically lethal. The speed at which innovation is going to occur, and hence the speed at which people will be able to develop new applications in the next 10 years, will be eye-popping.

So, not only do we have data everywhere, but thanks to the advent of cloud infrastructure as a service, the creation of new endpoints is happening faster and faster.

Control Becomes Increasingly Distributed

On top of that, businesses have less and less control. I’ve always disputed how much control people really had when they thought they owned their own assets and their own IT infrastructure, but I think we IT experts can all agree that control has slipped through our fingers. The whole cloud computing and on-demand SaaS revolution has swung the power pendulum away from IT, the vendors and the people who knew hardware and software, toward the check writers, the customers. In my opinion, this is a positive development. It brings important pressure to bear on hardware and software vendors to create better, faster and easier software. SaaS has raised the bar for all of us. But the net effect from an integration perspective is that you have less control than you used to have.

Connectivity Becomes More Challenging Than Ever

Connectivity is becoming a bigger and bigger challenge. I describe this as the new Cambrian era of evolution in application creation. But it’s not just the extreme number of new applications, it’s the shift of nearly everything in our culture into a cloud-based service. Everything is a Web service, and everything has an API. In a few years, my watch is going to be an API. Your refrigerator is going to be an API. Billions of devices are going to be connected to the Internet in some way, all of them with interfaces – published, public, defended and, hopefully, metadata-rich interfaces.

I talked to a fellow who is in the business of smart grids and sensor networks. He’s dealing with utilities and how utilities are measuring meters. We used to have meter readers, but that’s changing. Someday, every meter is going to be a live device, an API, an interface to integrate. He can read that kind of meter now, not just once a month when the reader walks by. He can read it once a day, once an hour, once a minute or once a second if he wants. An API means it’s something that can be touched over the communications network, something that will eventually need to be integrated with other systems.
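The shift the meter example describes is just a polling interval collapsing from a month to a second. A toy sketch of what that looks like in code, with `read_meter` standing in for a real network call to a meter endpoint (all names here are hypothetical, not any utility’s actual API):

```python
# Illustrative only: a smart meter exposed as an API that can be polled
# at whatever interval the business needs, instead of read once a month.

import itertools

def read_meter(_counter=itertools.count(100)):
    """Stand-in for a network call to a meter endpoint; returns a kWh reading."""
    return next(_counter)

def poll(times):
    """Collect `times` readings -- per second, per minute, per month, as needed."""
    return [read_meter() for _ in range(times)]

readings = poll(5)   # five consecutive readings from the simulated meter
```

Multiply that polling loop by every meter on a grid and the integration surface grows from one monthly batch file to millions of live endpoints.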

This is a scary world for someone tackling integration. Data is everywhere, new APIs are created every day, the pace of change has increased radically and our ability to control it has diminished.

Data Volumes Increasing At An Explosive Rate

The fifth law of data integration in the cloud is the one that as of yet hasn’t completely hit a lot of people, but is an exponentially growing problem: the increase in data volumes. I had dinner with a customer not long ago. He’s got millions of customers who generate tens of millions of transactions. That’s an enterprise-class, but still fairly straightforward, volume to handle with the right infrastructure. On the other hand, China Mobile has something like 500 million customers and adds millions of customers each month. If every customer does 10 or 20 transactions a day – texts, calls and such – that’s billions and billions of transactions in just one day, with just one vendor, in just one country. This adds up to an unbelievable number of records of information.
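The arithmetic behind that claim is worth spelling out, using the figures already cited in the text (the customer count is the article’s rough estimate, not an exact number):

```python
# Back-of-the-envelope check of the volumes cited above.
customers = 500_000_000          # China Mobile, roughly, per the article
txns_per_customer_per_day = 10   # low end of the 10-to-20 range cited

daily_txns = customers * txns_per_customer_per_day
# 5,000,000,000 transactions a day -- billions, one vendor, one country
```

Even at the conservative end of the range, one carrier alone generates five billion records a day that something, somewhere, has to move and integrate.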

The explosion of data volumes is a big deal, and it’s just going to get bigger. Integrators have to be able to do the heavy lifting to deal with it.

There’s a new sheriff in town. It’s the cloud and, whether you like it or not, it’s going to change the world of data integration forever. But just because the problem has gotten more difficult doesn’t mean it can be ignored or avoided.

Integration is going to be all the more critical. In a time when data is so scattered, APIs are so very numerous (and growing all the time), and data volumes are exploding, data integration has become essential, and more than that, it has become expected. Every single software buyer, every single business trading partner, expects everything to be integrated all the time, flawlessly, as easily and quickly as using a SaaS application. Vendors of integration solutions and systems integrators have to be true innovators in order to find robust solutions that not only solve the problems that we’re faced with now, but that also give the power to handle the complexities headed toward us like a speeding locomotive.

Michael Hoskins, Pervasive Software CTO and general manager, Integration Products, directs Pervasive’s technology and Innovation Labs, and evangelizes Pervasive’s innovations in big data challenges, including cloud-based and on-premises data management and integration. Mike can be reached at mhoskins@pervasive.com.

Paige Roberts is Marketing Manager in the Integration Division at Pervasive Software, and an integration jack of all trades, having worked in the industry for the last 14 years as an engineer, technician, consultant, trainer, and writer. She blogs at dataintegrationblog.com and can be reached at proberts@pervasive.com.


Source: information-management.com

