Data Cleaning in Python with steps - Part 2
In the last part, we saw how to find the null values and how to fill them.
In this blog, we will look at more ways to fill the null values, then focus on duplicates and, more importantly, on how to deal with outliers.
The next step: how to fill the null values.
In the last part, we saw that we can fill them with the mean, median, or mode.
1. We can fill with the row above or the row below
If you want to fill the null values with the value from the row above, you can use the ffill method.
Here, ffill stands for forward fill: it carries the last valid value forward into the gaps that follow it.
We can also fill with the value from the row below.
For that, we can use the bfill (backward fill) method.
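To make the two fills concrete, here is a minimal sketch on a small made-up column. The DataFrame and the column name "score" are only assumptions for illustration, not data from this series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a column with gaps in the middle and at the end.
df = pd.DataFrame({"score": [10.0, np.nan, np.nan, 40.0, np.nan]})

# Forward fill: each null takes the last valid value above it.
forward = df["score"].ffill()

# Backward fill: each null takes the next valid value below it.
backward = df["score"].bfill()

print(forward.tolist())
print(backward.tolist())
```

Note that ffill cannot fill a null in the very first row, and bfill cannot fill a null in the very last row, because there is no neighbour to copy from. In this sample the last value stays null after bfill.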
2. We can also use the interpolate() method:
Interpolate method
The interpolate() method fills null values in a pandas DataFrame or Series by estimating each missing value from the data points around it. By default, it uses linear interpolation.
With linear interpolation, the missing value is placed on a straight line between the nearest valid values before and after it, treating the rows as equally spaced. The method parameter can be set to "linear", "quadratic", "cubic", and others, depending on how complex the interpolation needs to be. The "quadratic" and "cubic" options need SciPy to be installed.
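Here is a small sketch of linear interpolation on a made-up Series, just to show how the gaps are estimated. The values are an assumption chosen so the result is easy to check by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with two separate gaps.
s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 8.0])

# method="linear" is the default; it is written out here for clarity.
linear = s.interpolate(method="linear")

print(linear.tolist())
```

The two nulls between 1 and 4 become 2 and 3, and the null between 4 and 8 becomes 6, because each one sits on the straight line between its neighbours.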
The next step: treating the outliers.
There are many ways to detect outliers.
There are also many ways to handle them, but here I am using the z-score method. A z-score tells us how many standard deviations a value lies from the mean, so values with a very large z-score are likely outliers.
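As a rough sketch of the z-score idea, the example below flags and drops outliers from a made-up column. The data, the column name "value", the cutoff of 3 (a common convention), and the choice to remove the rows rather than cap or replace them are all assumptions for illustration, not necessarily the exact steps used in this series.

```python
import pandas as pd

# Hypothetical sample data: 20 ordinary values and one obvious outlier.
values = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11] * 2 + [100]
df = pd.DataFrame({"value": values})

# z-score = (value - mean) / standard deviation.
# ddof=0 uses the population standard deviation (an assumption here).
mean = df["value"].mean()
std = df["value"].std(ddof=0)
df["zscore"] = (df["value"] - mean) / std

# Assumed cutoff: treat anything more than 3 standard deviations away as an outlier.
threshold = 3
outliers = df[df["zscore"].abs() > threshold]
cleaned = df[df["zscore"].abs() <= threshold]

print(outliers)
print(len(cleaned))
```

On very small datasets a single extreme value inflates the standard deviation so much that its own z-score may never pass the cutoff, so this method works better when there are enough rows.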
Thanks for watching 💜
If you have any doubts, feel free to contact me.
I will be back soon with new content, so stay tuned.
Read the blogs, comment 💭, like 💪, and subscribe 💓