Course Content
Introduction: How to Become a Data Analyst
How to Become a Data Analyst
0/1
Installing MySQL and create a database for Windows, MacOS, and Linux
How to Installing MySQL and create a database.
0/1
SELECT Statement and Where Clause in MySQL
Starting your Data Analysis Properly
0/2
LIMIT` + ALIASING` Group by+ Order By, Having Vs Where in MySQL
LIMIT` + ALIASING`
0/3
JOINS
Joins in MySQL
0/1
Unions in MySQL
Unions in MySQL
0/1
Window functions in MySQL
Window functions:- in MySQL
0/1
Common Table Expressions (CTEs) in MySQL and Temp Tables
Common Table Expressions (CTEs) in MySQL
0/2
stored procedures
stored procedures.
0/1
Triggers and Events in MySQL
Triggers and Events
0/1
Data Cleaning in MySQL
Data Cleaning in MySQL
0/1
MSQL EXPLORATORY DATA ANALYSIS
MSQL EXPLORATORY DATA ANALYSIS
0/1
Data Analyst Resume
Data Analyst Resume
0/1
How To Download Your Data Analyst Bootcamp Certification (Congrats!!)
How To Download Your Data Analyst Bootcamp Certification (Congrats!!)
0/1
Guide to Data Analysis for Beginners
About Lesson

Data cleaning in MySQL involves identifying and correcting inaccuracies, inconsistencies, and incomplete data in a database. Effective data cleaning ensures accuracy and reliability in database operations and analytics.

Key activities include:

 

  1. Removing Duplicates: Identifying and deleting duplicate records using `GROUP BY` or unique constraints.
  2. Handling Nulls: Updating or filling null values with appropriate default values or calculated data.
  3. Standardizing Data: Ensuring data formats are consistent (e.g., date formats, phone numbers) using string functions and regular expressions.
  4. Validating Data: Using constraints and checks to enforce data integrity and validate input values.
  5. Data Transformation: Normalizing or aggregating data to meet analytical or reporting requirements.

 

Sample SQL queries for various data cleaning tasks in MySQL These queries help clean and standardize data, ensuring consistency and improving the quality of the database:

 

  1. Removing Duplicate Records

Assume you have a table `employees` with potential duplicate rows:

“`sql

— Remove duplicates while keeping the row with the lowest ID

DELETE e1

FROM employees e1

JOIN employees e2

ON e1.email = e2.email

WHERE e1.id > e2.id;

  1. Handling Nulls

Update null values in the `phone_number` column with a default value:

“`sql

— Replace null phone numbers with ‘N/A’

UPDATE employees

SET phone_number = ‘N/A’

WHERE phone_number IS NULL;

 

  1. Standardizing Data

Convert all email addresses to lowercase:

“`sql

— Convert email addresses to lowercase

UPDATE employees

SET email = LOWER(email);

 

  1. Validating Data

 

Add a constraint to ensure that the `salary` column is always positive:

 

“`sql

— Add a check constraint for positive salary values

ALTER TABLE employees

ADD CONSTRAINT chk_salary_positive CHECK (salary > 0);

  1. Data Transformation

Normalize job titles to a consistent format:

“`sql

— Capitalize job titles

UPDATE employees

SET job_title = CONCAT(UCASE(SUBSTRING(job_title, 1, 1)), LCASE(SUBSTRING(job_title, 2)));