When our information systems started to grow and a few tables could no longer hold the (mostly numerical) information, we started to build databases.
When databases became too small to contain all the information we were dealing with, or the data were spread across different (not so compatible) data stores, we invented the data warehouse.
But to fit into these warehouses, data had to be structured. Life, though, is different, and today we are dealing with tons of poorly structured and unstructured data. So here is the latest trend: Big Data.
The industry has successfully turned this into a new buzzword and, as always when a new buzzword hits the market, definitions of the term differ by company, speaker, and region (here in Switzerland, by canton).
Raj Sabhlok wrote in Forbes: “For example, most organizations have their data in structured relational databases like Oracle, but much of the data generated today is unstructured, high-volume web data or machine data. Technologies like Hadoop and “NoSQL” databases, such as Cassandra and MongoDB, are better designed to support massive data processing and storage. Emerging technologies such as Storm and Kafka are designed to provide real-time streaming analytics, which is critical for volume data feeds such as social networks. Even ad-hoc query tools such as Dremel have been introduced to support Big Data environments with low latency.
“Big Data also brings new skill-set challenges. As companies look to answer the most relevant questions related to their businesses, they will need data analysts or “data scientists” to mine the data. And they should get started soon; according to a recent McKinsey study, the United States alone faces a shortage of up to 190,000 workers with analytical expertise, as well as another 1.5 million managers and analysts that have the skills to understand and make decisions based on Big Data analysis.
“The Big Data movement is the recognition that there’s “gold in them there data stores!” There are tons of real-world examples of Big Data done right — just ask President Obama. However, it’s not something to dive into without first doing some serious soul-searching about your company’s goals. And it’s definitely crucial to have the right tools to support your unique corporate needs. But as professor Clemen always used to ask, “What would you pay for perfect information?”
One of the newer methods to introduce new terms and to explain novel concepts has been the use of infographics. You can find several such examples in this blog when you enter “infographic” as a search term in the rightmost column.
Infosys just published one of those infographics on Big Data in the enterprise. I like this graphic because it clearly explains the concepts behind the buzzword.
Another useful infographic on Big Data was recently published by Muhammed Saleem: Big Data and the future of our health. He maintains that medical diagnoses, general patient care, and medical practices are often more expensive and of lower quality than they should be. According to the infographic, Big Data could revolutionize healthcare by replacing up to 80% of what doctors do while still maintaining over 91% accuracy. The graphic is displayed at http://www.insurancequotes.org/2013/01/15/big-data-and-the-future-of-healthcare/.
The importance of Big Data analysis has recently been reported in the context of President Obama’s re-election. Crovitz wrote on Nov. 19, 2012 in the Wall Street Journal:
When the Obama campaign emailed supporters to join a $40,000-a-ticket dinner in June at the New York home of actress Sarah Jessica Parker, journalists at ProPublica noticed something odd. They uncovered seven versions of the email solicitation for the fundraiser, some mentioning a second fundraiser that night, a concert by Mariah Carey, others that Ms. Parker is a mother, and still others that Vogue editor Anna Wintour would be at the dinner.
Who got which email depended on “big data”—information about each fundraising prospect and how different people react to different messages. In this year’s election, it looks as if the Obama team’s use of such data was one of its biggest edges over the Romney effort.
[...]
The Obama campaign focused on data showing the “persuadability” of voters. Multivariate tests identified issues and positions that could move undecided voters, ProPublica said: “The persuasion scores allowed the campaign to focus its outreach efforts—and their volunteer calls—on voters who might actually change their minds as the result. It also guided them in what policy messages individual voters should hear.” (Read the full article here)
Big Data holds potential that is so far largely untapped. Pharma companies will have to deal with a massive data deluge when comparing the genome information of thousands of people to find patterns that correlate with certain diseases and give clues to possible medications.
Are we becoming more transparent? You bet. But we have to learn to mask data in a way that it ceases to be personally identifiable information (PII). See my blog on “You Have Zero Privacy Anyway — Get Over It” (Really?)
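To make the idea of masking a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the field names, the sample record, and the secret key are hypothetical and not taken from any real system. It replaces direct identifiers with keyed hashes and coarsens the birth date to a year.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be kept separate from the data.
SECRET_KEY = b"replace-with-a-long-random-secret"

# Assumed field names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed SHA-256 digest that stands in for the original value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_record(record: dict) -> dict:
    """Replace direct identifiers with tokens and coarsen the birth date to a year."""
    masked = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            masked[field] = pseudonymize(str(value))
        elif field == "birth_date":
            # Keep only the year (assumes an ISO date string such as "1975-04-23").
            masked["birth_year"] = str(value)[:4]
        else:
            masked[field] = value
    return masked


# Made-up sample record.
patient = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "birth_date": "1975-04-23",
    "diagnosis": "asthma",
}
masked_patient = mask_record(patient)
print(masked_patient)
```

Note that this is pseudonymization rather than full anonymization: the same input always yields the same token, so records can still be linked, and the remaining fields may be enough to re-identify a person when combined with other data sources.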
Do you want to share some insight or another infographic on the subject? What is your take on Big Data?