Data Cleaning in Python with Steps

February 09, 2023

Data Cleaning in Python with Steps - Part 1

What is Data?

Data is at the heart of every company. Once data has been converted into information, it can be processed efficiently. So, it is very important that the data is clean and tidy.

However, Data Cleaning is a very tedious task.

Data cleaning is an essential step in the data-science process, because real-world data is messy and requires preparation before analysis.

This process involves many tasks, such as finding missing values, filling those missing values, removing duplicates, finding outliers, and so on.

In this blog, we will learn how to perform the data cleaning process in Python.

Python provides several libraries that make data cleaning easier and faster.

Step 1: Load the data.

Here, we are using the Netflix dataset. I will provide the link in the description below👇

We are using the Pandas library to load the dataset. Because the file is an Excel workbook (.xlsx), this is done with the read_excel function in Pandas, which reads an Excel file into a DataFrame. For a CSV file, read_csv does the same job.

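import pandas as pd

# Read the Excel workbook into a DataFrame (.xlsx files need the openpyxl package installed)
data = pd.read_excel(r'Downloads\netflix.xlsx')
# In a notebook, the last expression in a cell is displayed automatically
data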
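
If your copy of the dataset is a CSV file rather than an Excel workbook, loading it looks almost the same. Here is a minimal sketch, assuming a tiny made-up table (the column names and rows below are hypothetical and are not the real Netflix data), read from an in-memory string so it runs without any file on disk:

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file on disk; not the real Netflix data.
csv_text = """title,type,release_year
Show A,TV Show,2019
Movie B,Movie,2021
Movie C,Movie,
"""

# read_csv accepts a file path or any file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head())   # first rows of the DataFrame
print(sample.shape)    # (number of rows, number of columns)
print(sample.dtypes)   # column types inferred by Pandas
```

The empty field in the last row is read as a missing value (NaN), which is exactly the kind of gap the next step deals with.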

Step 2: Missing Values

One of the most common problems in real-world data is missing values.

Pandas provides methods to handle missing data. You can drop it, fill it from the previous row (forward fill) or the next row (backward fill), or fill it with a default value.

Finding the null values in this dataset:

print(data.isnull().sum())
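
To see what this count looks like, here is a small sketch on a made-up DataFrame (the columns and values are assumptions for illustration only). It also shows the share of missing values per column, which is often easier to judge than the raw count:

```python
import numpy as np
import pandas as pd

# Made-up sample with gaps; not the real Netflix data.
sample = pd.DataFrame({
    "title": ["Show A", "Movie B", "Movie C", "Show D"],
    "director": ["Dir 1", None, None, "Dir 2"],
    "release_year": [2019, 2021, np.nan, 2020],
})

# Number of missing values in each column
missing_count = sample.isnull().sum()

# Percentage of missing values in each column
# (the mean of a True/False column is the fraction of True values)
missing_percent = sample.isnull().mean() * 100

summary = pd.DataFrame({"missing": missing_count, "percent": missing_percent})
print(summary)
```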

Dropping the missing values:

# Drop every row that contains at least one missing value
data.dropna(inplace=True)
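
Dropping every row that has any gap can throw away a lot of data. As a rough sketch, assuming a small made-up DataFrame (hypothetical columns and values), these are some of the options dropna offers for controlling which rows are removed:

```python
import numpy as np
import pandas as pd

# Made-up sample; not the real Netflix data.
sample = pd.DataFrame({
    "title": ["Show A", "Movie B", None, "Show D"],
    "director": ["Dir 1", None, "Dir 3", "Dir 2"],
    "release_year": [2019, np.nan, 2018, 2020],
})

# Default: drop rows that have at least one missing value
any_missing = sample.dropna()

# Drop only the rows where every column is missing
all_missing = sample.dropna(how="all")

# Drop only the rows where the title is missing
needs_title = sample.dropna(subset=["title"])

# Keep the rows that have at least 2 non-missing values
at_least_two = sample.dropna(thresh=2)

print(len(any_missing), len(all_missing), len(needs_title), len(at_least_two))
```

Without inplace=True, dropna returns a new DataFrame and leaves the original untouched, which makes it easy to compare the options before you commit to one.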

Instead of dropping rows, you can fill the missing values:

# Fill missing values with 0
data.fillna(0, inplace=True)

# Fill missing values in numeric columns with the mean of each column
data.fillna(data.mean(numeric_only=True), inplace=True)

You can fill the missing values from the row above (forward fill) or the row below (backward fill), or you can fill them with the column's mean, median, or mode.

Stay tuned for the next part, where we handle duplicates, learn more methods to fill the data, and learn how to treat outliers.
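
The options mentioned here can be sketched as follows, assuming a small made-up DataFrame (the rating and duration columns are hypothetical). Which fill makes sense depends on the column, so treat this as an illustration rather than a recipe:

```python
import numpy as np
import pandas as pd

# Made-up sample; not the real Netflix data.
sample = pd.DataFrame({
    "rating": ["PG", None, "PG", "R", None],
    "duration": [90.0, np.nan, 100.0, np.nan, 130.0],
})

# Fill each gap with the value from the row above
forward = sample.ffill()

# Fill each gap with the value from the row below
# (a gap in the last row stays missing, because nothing comes after it)
backward = sample.bfill()

filled = sample.copy()
# Numeric column: fill with the median of the column
filled["duration"] = filled["duration"].fillna(filled["duration"].median())
# Text column: fill with the most frequent value (the mode)
filled["rating"] = filled["rating"].fillna(filled["rating"].mode()[0])

print(forward)
print(backward)
print(filled)
```

The median is less affected by extreme values than the mean, and the mode is the usual choice for text or category columns, where a mean does not exist.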

Thank you for reading!!!!!
