Data Cleaning

Author : Venkatesh Ganti
Publisher : Morgan & Claypool Publishers
Total Pages : 87
Release : 2013-09-01
ISBN-10 : 1608456781
ISBN-13 : 9781608456789

Book Synopsis: Data Cleaning by Venkatesh Ganti

Data Cleaning was written by Venkatesh Ganti and published by Morgan & Claypool Publishers. It was released on 2013-09-01 and has 87 pages.

Book excerpt: Data warehouses consolidate various activities of a business and often form the backbone for generating reports that support important business decisions. Errors in data tend to creep in for a variety of reasons. Some of these reasons include errors during input data collection and errors while merging data collected independently across different databases. These errors in data warehouses often result in erroneous upstream reports, and could impact business decisions negatively. Therefore, one of the critical challenges while maintaining large data warehouses is that of ensuring the quality of data in the data warehouse remains high. The process of maintaining high data quality is commonly referred to as data cleaning.

In this book, we first discuss the goals of data cleaning. Often, the goals of data cleaning are not well defined and could mean different solutions in different scenarios. Toward clarifying these goals, we abstract out a common set of data cleaning tasks that often need to be addressed. This abstraction allows us to develop solutions for these common data cleaning tasks.

We then discuss a few popular approaches for developing such solutions. In particular, we focus on an operator-centric approach for developing a data cleaning platform. The operator-centric approach involves the development of customizable operators that could be used as building blocks for developing common solutions. This is similar to the approach of relational algebra for query processing. The basic set of operators can be put together to build complex queries. Finally, we discuss the development of custom scripts which leverage the basic data cleaning operators along with relational operators to implement effective solutions for data cleaning tasks.
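
The operator-centric idea described in the excerpt (small customizable operators combined into larger cleaning solutions) can be illustrated with a minimal Python sketch. This is not taken from the book: the operator names `normalize`, `deduplicate`, and `compose`, the record layout, and the sample data are all hypothetical, chosen only to show how building blocks might be chained in the way relational operators are combined into queries.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical record and operator types, for illustration only.
Record = Dict[str, str]
Operator = Callable[[List[Record]], List[Record]]


def normalize(field: str) -> Operator:
    """Build an operator that trims, collapses whitespace in, and lowercases one field."""
    def op(records: List[Record]) -> List[Record]:
        return [{**r, field: " ".join(r[field].split()).lower()} for r in records]
    return op


def deduplicate(key_fields: Iterable[str]) -> Operator:
    """Build an operator that keeps the first record seen for each exact key."""
    keys = tuple(key_fields)

    def op(records: List[Record]) -> List[Record]:
        seen = set()
        out: List[Record] = []
        for r in records:
            k = tuple(r[f] for f in keys)
            if k not in seen:
                seen.add(k)
                out.append(r)
        return out
    return op


def compose(*ops: Operator) -> Operator:
    """Chain operators left to right into a single operator."""
    def pipeline(records: List[Record]) -> List[Record]:
        for op in ops:
            records = op(records)
        return records
    return pipeline


# A custom "script" assembled from the basic operators (assumed sample data).
clean_customers = compose(
    normalize("name"),
    normalize("city"),
    deduplicate(["name", "city"]),
)

customers = [
    {"name": "  Ada  Lovelace ", "city": "London"},
    {"name": "ada lovelace", "city": "LONDON "},
    {"name": "Alan Turing", "city": "Wilmslow"},
]
cleaned = clean_customers(customers)
```

Here the two spellings of the same customer collapse into one record only because normalization runs before the exact-match deduplication. A production platform would more likely rely on approximate matching, which this sketch deliberately leaves out.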


Data Cleaning Related Books

Data Cleaning
Language: en
Pages: 87
Authors: Venkatesh Ganti
Categories: Computers
Type: BOOK - Published: 2013-09-01 - Publisher: Morgan & Claypool Publishers

Data warehouses consolidate various activities of a business and often form the backbone for generating reports that support important business decisions. Error
Best Practices in Data Cleaning
Language: en
Pages: 297
Authors: Jason W. Osborne
Categories: Social Science
Type: BOOK - Published: 2013 - Publisher: SAGE

Many researchers jump straight from data collection to data analysis without realizing how analyses and hypothesis tests can go profoundly wrong without clean d
Cleaning Data for Effective Data Science
Language: en
Pages: 499
Authors: David Mertz
Categories: Mathematics
Type: BOOK - Published: 2021-03-31 - Publisher: Packt Publishing Ltd

Think about your data intelligently and ask the right questions. Key Features: Master data cleaning techniques necessary to perform real-world data science and mac
Practical Data Cleaning
Language: en
Pages: 41
Authors: Lee Baker
Categories: Education
Type: BOOK - Published: 2019-01-30 - Publisher: Lee Baker

Data cleaning is a waste of time. If the data had been collected properly in the first place there wouldn’t be any cleaning to do, and you wouldn’t now be f
Python Data Cleaning Cookbook
Language: en
Pages: 437
Authors: Michael Walker
Categories: Computers
Type: BOOK - Published: 2020-12-11 - Publisher: Packt Publishing Ltd

Discover how to describe your data in detail, identify data issues, and find out how to solve them using commonly used techniques and tips and tricks. Key Featur