Cleaning Data for Effective Data Science: Doing the Other 80% of the Work with Python, R, and Command-line Tools by David Mertz
Think about your data intelligently and ask the right questions. Key Features: Master data cleaning techniques necessary to perform real-world data science and machine learning tasks. Spot common problems with dirty data and develop flexible solutions from first principles. Test and refine your newly acquired skills through detailed exercises at the end of each chapter. Book Description: Data cleaning is the all-important first step to successful data science, data analysis, and machine learning. If you work with any kind of data, this book is your go-to resource, arming you with the insights and heuristics experienced data scientists had to learn the hard way.
Call Number: Ebook
ISBN: 9781801071291
Publication Date: 2021-03-31
Data Science Fundamentals Pocket Primer by Oswald Campesato
As part of the best-selling Pocket Primer series, this book is designed to introduce the reader to the basic concepts of data science using Python 3 and other computer applications. It is intended to be a fast-paced introduction to some basic features of data analytics and also covers statistics, data visualization, linear algebra, and regular expressions. The book includes numerous code samples using Python, NumPy, R, SQL, NoSQL, and Pandas. Companion files with source code and color figures are available. FEATURES: Includes a concise introduction to Python 3 and linear algebra. Provides a thorough introduction to data visualization and regular expressions. Covers NumPy, Pandas, R, and SQL. Introduces probability and statistical concepts. Features numerous code samples throughout. Companion files with source code and figures.
Unmasking AI: My Mission to Protect what is Human in a World of Machines by Joy Buolamwini
Unmasking AI goes beyond the headlines about existential risks produced by Big Tech. It is the remarkable story of how Buolamwini uncovered what she calls “the coded gaze” (the evidence of encoded discrimination and exclusion in tech products) and how she galvanized the movement to prevent AI harms by founding the Algorithmic Justice League. Applying an intersectional lens to both the tech industry and the research sector, she shows how racism, sexism, colorism, and ableism can overlap and render broad swaths of humanity “excoded” and therefore vulnerable in a world rapidly adopting AI tools. Computers, she reminds us, are reflections of both the aspirations and the limitations of the people who create them.
The CODATA Data Science Journal is a peer-reviewed, open access, electronic journal, publishing papers on the management, dissemination, use and reuse of research data and databases across all research domains, including science, technology, the humanities and the arts.
As an open access platform of the Harvard Data Science Initiative, Harvard Data Science Review (HDSR) features foundational thinking, research milestones, educational innovations, and major applications, with a primary emphasis on reproducibility, replicability, and readability.