The Big Data Trend

For many years, India has dominated the Information Technology (IT) outsourcing market, but that could change. The next big wave of IT outsourcing is no longer “Code and Test” or “Applications Development” but Big Data analysis, where global companies are willing to pay anywhere from several million to billions of dollars each year to companies or countries that can provide skilled workers in this field. An Indian executive lamented: “The rules have changed. It is no longer about lower cost but about higher skills, and it caught us by surprise. We will need to develop more people with this skill quickly.”

Big Data is about collecting the vast amount of information a company holds internally, combining it with external sources such as the Internet, and then analyzing it to extract valuable information such as trends and opportunities. Global companies need Big Data to identify new business markets and to help managers make decisions effectively. For example, Big Data analysts use trend analytics to understand market direction and customers’ needs so that the company can develop new products before competitors even know about them. They also analyze industry trends and future markets to manage the entire supply chain, from raw materials to manufacturing and from distribution channels to retail, to cut operational costs and maximize profits.
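As a rough illustration of combining internal and external data and then looking for a trend, here is a minimal Python sketch. All of the figures, the month labels, and the names `internal_sales`, `external_mentions`, and `monthly_trend` are hypothetical and invented for this example; real Big Data work involves far larger and messier inputs, and a least-squares slope is only one simple way to measure a trend.

```python
# Hypothetical internal data: monthly units sold, from the company's own records.
internal_sales = {"2012-01": 120, "2012-02": 135, "2012-03": 150, "2012-04": 170}

# Hypothetical external data: monthly product mentions gathered from the Internet.
external_mentions = {"2012-01": 900, "2012-02": 1100, "2012-03": 1500, "2012-04": 2100}

# Combine both sources, keeping only the months present in each.
months = sorted(internal_sales.keys() & external_mentions.keys())
combined = [(m, internal_sales[m], external_mentions[m]) for m in months]


def monthly_trend(values):
    """Return the least-squares slope, i.e. the average change per month."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


sales_trend = monthly_trend([sales for _, sales, _ in combined])
mentions_trend = monthly_trend([mentions for _, _, mentions in combined])

print(f"Sales are changing by about {sales_trend:.1f} units per month")
print(f"Online mentions are changing by about {mentions_trend:.1f} per month")
```

In this made-up data set both series are rising, which is the kind of signal an analyst might take to managers as evidence of growing demand.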

Big Data is also used in the financial market to predict trading trends. Currently the stock market is volatile because of the financial crisis, and most traders are very cautious: the U.S. economy is still recovering, the European Union is in crisis, China’s market is growing more slowly than expected, and the Middle East is in turmoil. In this situation, few people would dare to make a bold move. But with Big Data, some trading companies have been able to identify certain trends quickly and seize the opportunity. While most people are losing money on the stock market, these trading companies are winning big. If you look at the job listings in the Wall Street Journal, you may find that all trading companies are hiring Data Scientists and Data Analysts, and the competition is fierce.

Since this is a relatively new field, only a few top universities provide training, so there is a significant shortage of Big Data workers while demand is growing fast. According to an industry report, the U.S. alone will need 250,000 Data Scientists by 2015, and worldwide demand could push this number into the millions. Big Data skills are a combination of mathematics, statistics, machine learning, and computer science. Data Scientists work on real-time data collected from multiple sources to do predictive analysis of market trends, which helps management set direction and make decisions about the future market. Last month, the “Harvard Business Review” called Data Scientist the “sexiest job of the 21st century,” and Wall Street stock trading companies consider Big Data analysis to be the future of all financial business transactions.

Currently, Big Data is emerging as the most lucrative market for IT outsourcing companies, with a market value estimated at $1 billion in 2015. Since it is a new area with few competitors, the race to seize this opportunity and capture this market has begun among companies all over the world. A Wall Street analyst said: “The proliferation of social and other online networks has put so much data on the Internet. What we need is to capture this valuable data and analyze it for trends so we can make market decisions quickly. We need Data Scientists to inject more excitement into the stock market.”

Many students are confused about the difference between Big Data and other well-established fields such as database administration, data management, data mining, and business intelligence. The key difference is that the other fields collect and manage data from the company’s relational databases to analyze and generate reports. Those reports are limited to the data collected and stored inside the company’s databases, and that data is well defined and structured.

Big Data scientists collect data from both internal AND external sources such as the Internet. This is more difficult because data from external sources is mostly not well defined or structured. For example, the web is full of “data-driven apps.” Almost any e-commerce application is a data-driven application. Social networks such as Facebook and LinkedIn are full of social and personal data. There are all types of data behind each web front end, and middleware that connects to other databases and data services such as credit card companies and banks. That is why the amount of data is so big. On average, Data Scientists must analyze about 3.5 zettabytes a year (a zettabyte is a trillion gigabytes or a billion terabytes), AND this data is changing and growing every second. That is why the analysis requires different skills and algorithms.
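To get a feel for these numbers, the unit arithmetic can be checked with a short Python sketch. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes, and it assumes the 3.5 zettabytes are spread evenly over a 365-day year, which is a simplification made only for illustration. The variable names are chosen for this example.

```python
# Decimal (SI) storage units, expressed in bytes.
GIGABYTE = 10**9
TERABYTE = 10**12
ZETTABYTE = 10**21

# How many of the smaller units fit in one zettabyte.
gb_per_zb = ZETTABYTE // GIGABYTE   # 10**12, a trillion
tb_per_zb = ZETTABYTE // TERABYTE   # 10**9, a billion

# Assumption: 3.5 zettabytes arrive at a constant rate over a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
tb_per_second = 3.5 * tb_per_zb / SECONDS_PER_YEAR

print(f"1 ZB = {gb_per_zb:,} GB = {tb_per_zb:,} TB")
print(f"3.5 ZB per year is roughly {tb_per_second:,.0f} TB every second")
```

Under these assumptions the rate works out to a little over one hundred terabytes every second, which gives a sense of why tools built for a single company database are not enough.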

Even though it is a relatively new field, Big Data is already being used in the U.S., where such analyses generate public opinion polls, forecast election results, predict stock market trends, analyze global financial transactions, and develop strategies for governments and private companies. Today every company is looking for workers with these skills for its Big Data projects, and companies will grab any graduate, with or without experience, and train them in Big Data. An industry analyst said: “There is no question that companies will need this skill to gain a competitive advantage in this highly competitive market, and whoever has the most qualified workers will win.”

Basically there are four job categories in the Big Data area:

1) Data Scientist: This job usually requires an advanced degree (MS or PhD) in Computer Science, Software Engineering, Statistics, Artificial Intelligence, or Machine Learning. The Data Scientist designs special programs and algorithms to collect and analyze data, and is responsible for setting the data strategy and implementing all data products for the company. The Data Scientist works with the vast amount of data collected from both inside and outside the company to determine what the data means and how it affects the company.

2) Data Architect: This job requires an advanced degree (MS or PhD) in Computer Science or Software Engineering with a specialization in Data Management or Artificial Intelligence. The Data Architect plans, architects, and organizes all data searching, collecting, and analyzing tools for a company.

3) Data Analyst: This job requires a Bachelor’s degree in Computer Science, Software Engineering, or Information Systems Management. The Data Analyst translates analytics into information that managers can use to make decisions, puts it into reports for management, and helps managers understand current trends.

4) Data Engineer: This job requires a Bachelor’s degree in Computer Science, Software Engineering, or Information Systems Management. The Data Engineer develops and implements the analytic software programs that collect and analyze data for the company.


  • Blogs of Prof. John Vu, Carnegie Mellon University