Data management

A student wrote to me: “What is big data? What is data management? Where can we learn about this new field? Please advise.”

Answer: Big data is a collection of data sets so large and complex that they become difficult to process using current database management tools or traditional data processing applications. Because such data sets are too big to be collected, stored, searched, analyzed, and updated in the usual way, they need a new approach to data management. The reason is that many companies today are fully automated, so the amount of data they collect is increasing fast. Management needs the additional information derived from analyzing these large data sets, as compared to the smaller sets of data currently available, in order to identify business trends and support other applications in much more detail.

Successful data management requires more than just buying more hardware, as some technology consultants often recommend. Companies must invest in a well-defined data management process as well as in skilled people to manage all aspects of the data lifecycle. All data must be collected, stored, used, updated, modified, and then retired. With the amount of data increasing fast, it is critical to ensure that the data needed for management decision making and reporting is available, accurate, complete, and secure.

Without skilled data management in place, senior executives may not receive the right information in time to make decisions. If they receive information that is late or untrustworthy, they will need more time to analyze and validate it; and in this fast-changing world, a late decision is often a bad decision. Without a data management system in place, management may receive different information from different sources, with different terminology and data formats, and is often confused as a result. Effective data management gives management enough information to make better decisions.
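To make the problem of different terminology and data formats concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two sources (a sales system and a finance system), their field names, and their date formats are assumptions made up for illustration only. It shows one way the same order, reported differently by two sources, could be brought into a single shared form.

```python
from datetime import date, datetime

# Hypothetical mapping from each source's field names to one shared vocabulary.
FIELD_ALIASES = {
    "cust_id": "customer_id",
    "CustomerID": "customer_id",
    "amt": "amount",
    "Total": "amount",
    "dt": "order_date",
    "OrderDate": "order_date",
}

# Date formats the hypothetical sources are assumed to use.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text: str) -> date:
    """Try each known date format in turn and return the first that matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def normalize(record: dict) -> dict:
    """Rename fields to the shared vocabulary and convert values to common types."""
    out = {FIELD_ALIASES.get(key, key): value for key, value in record.items()}
    out["customer_id"] = str(out["customer_id"])
    out["amount"] = float(out["amount"])
    out["order_date"] = parse_date(out["order_date"])
    return out


# The same order as two hypothetical systems might report it.
sales_record = {"cust_id": 17, "amt": "250.00", "dt": "2013-04-02"}
finance_record = {"CustomerID": "17", "Total": 250, "OrderDate": "04/02/2013"}

# After normalization the two reports agree field by field.
assert normalize(sales_record) == normalize(finance_record)
```

In a real company, the shared vocabulary and formats would come from the data management process itself rather than from a table written by one programmer, which is why the process matters more than the code.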

With big data, data management is becoming more complex than before, and it needs a structured approach to support the decision-making process. Data management is a new course often taught in the Information System Management program. It consists of Data Governance, or the management and oversight of company data; Data Structure, or the definition of data; Data Architecture, or the storage and retrieval of data; Data Management, or the maintenance of data throughout a company and with business partners and suppliers; Data Quality, or the accuracy, completeness, and legal compliance of data; and Data Security, or the protection of data and the authorization to use it.

Sources

  • Blogs of Prof. John Vu, Carnegie Mellon University