Data Science II - Introduction to Pandas
Pandas is an open source library built on top of NumPy that allows us to analyse and clean data before further steps are performed on it.
Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
2. How to install Pandas?
3. What are we going to learn?
4. Series in Pandas
5. DataFrames in Pandas
6. DataFrame Operations
7. Working on missing data in DataFrames
8. Applying GroupBy (Aggregate) operations in Pandas
9. Merging, concatenating and joining DataFrames in Pandas
- Pandas has built-in visualization, which we are going to discuss in the next few parts.
- It can work with a wide variety of data sources and help us clean them up.
How to install Pandas?
Installing Pandas is quite similar to installing NumPy, as we did in the last part. Inside your virtual environment, run the following command:
pip install pandas
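A quick way to confirm the install worked is to import the library and print its version. This is only a minimal check; the alias pd is the common convention, not something the install requires.

```python
import pandas as pd

# If the import succeeds, Pandas is installed in the active environment.
# The version string printed depends on what pip installed.
print(pd.__version__)
```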
What are we going to learn?
We are going to learn various methods of the Pandas library which will help us to clean and analyse data. Some important terms we will be seeing in this post are:
- Missing Data
- Merging, Joining and Concatenating …
Series in Pandas
Recap
In this notebook, we discussed:
- How to create Pandas series?
- How to create Pandas series with custom indexes?
- Creating series using Python dictionaries.
- How to select elements from pandas series?
- How to apply arithmetic operations in Pandas series?
int type, it will convert it to
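The recap points above can be sketched in a few lines. This is an illustrative example, not the original notebook: the values and the names s, labelled, from_dict, first and total are made up for this sketch. The last step also shows one case where integer values come back as floats: when two Series do not share all index labels, the unmatched labels become NaN and the result is stored as float64.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.

# A Series with the default integer index (0, 1, 2).
s = pd.Series([10, 20, 30])

# A Series with custom index labels.
labelled = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A Series built from a dictionary: the keys become the index.
from_dict = pd.Series({"a": 1, "b": 2, "d": 4})

# Select an element by its label.
first = labelled["a"]

# Arithmetic aligns on index labels. Labels present in only one
# Series ("c" and "d" here) produce NaN, so the result is float64
# even though both inputs hold integers.
total = labelled + from_dict

print(first)
print(total)
```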
DataFrames in Pandas
- How to create Pandas DataFrames?
- How to select column series from DataFrame?
- How to add new data into the Pandas DataFrame?
- How to remove series from Pandas DataFrames?
- How to select rows from Pandas DataFrames?
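As a rough sketch of the DataFrame points listed above, the example below creates a small DataFrame, selects a column, adds a column, drops it again and selects rows. The column names, row labels and values are assumptions made for illustration and do not come from the original notebook.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cid"], "age": [28, 35, 41]},
    index=["r1", "r2", "r3"],
)

# Selecting a single column returns a Series.
ages = df["age"]

# Assigning to a new column name adds that column to the DataFrame.
df["age_in_months"] = df["age"] * 12

# drop returns a new DataFrame; axis=1 means "drop a column".
# The original df is left unchanged.
trimmed = df.drop("age_in_months", axis=1)

# Rows can be selected by label (loc) or by position (iloc).
row_by_label = df.loc["r2"]
row_by_position = df.iloc[0]

print(trimmed)
print(row_by_label)
```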
DataFrame Operations
In this notebook we learned about the different methods used in Pandas to select, manipulate and operate on Pandas DataFrames.
Python's normal "and" and "or" operators don't work on a Series because they cannot compare boolean values element by element; use & and | instead.
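To make the point about "and" and "or" concrete, here is a small sketch using the element-wise operators & and |, which do work on boolean Series. The DataFrame and the names both, either and and_failed are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame({"a": [1, 5, 9], "b": [2, 6, 1]})

# Each comparison yields a boolean Series. & and | combine them
# element by element. The parentheses are required because & and |
# bind more tightly than the comparison operators.
both = df[(df["a"] > 2) & (df["b"] > 2)]
either = df[(df["a"] > 8) | (df["b"] > 5)]

# Python's "and" asks each Series for a single True/False value,
# which is ambiguous for more than one element, so Pandas raises
# a ValueError.
try:
    (df["a"] > 2) and (df["b"] > 2)
    and_failed = False
except ValueError:
    and_failed = True

print(both)
print(either)
print(and_failed)
```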
Working on missing data in DataFrames
Pandas provides a lot of methods that can help us clean up or remove missing data from DataFrames. Let’s head over to the Jupyter notebook and learn more about how to handle missing data in a DataFrame.
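The notebook itself is not reproduced here, so the following is only a sketch of a few commonly used methods for missing data: isna, dropna and fillna. The data and variable names are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with some missing values (NaN).
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

# Count the missing values in each column.
missing_per_column = df.isna().sum()

# Keep only the rows that have no missing values at all.
complete_rows = df.dropna()

# Keep rows that have at least one non-missing value.
at_least_one = df.dropna(thresh=1)

# Replace missing values with the mean of their column.
filled = df.fillna(df.mean())

print(missing_per_column)
print(filled)
```

Whether to drop or fill depends on the data; filling with the column mean is just one possible choice.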
Applying GroupBy (Aggregate) operations in Pandas
Group-by operations allow us to apply aggregate functions to groups of rows. Let’s jump into the Jupyter notebook and learn how we can apply group-by techniques to a Pandas DataFrame.
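As a minimal sketch of the idea, the example below groups rows by one column and applies aggregate functions to another. The sales data and the column names store and amount are hypothetical, not taken from the notebook.

```python
import pandas as pd

# Hypothetical sales data, chosen only for illustration.
sales = pd.DataFrame(
    {
        "store": ["north", "south", "north", "south"],
        "amount": [100, 150, 200, 50],
    }
)

# Group the rows by store, then pick the column to aggregate.
by_store = sales.groupby("store")["amount"]

# Apply single aggregate functions to each group.
totals = by_store.sum()
averages = by_store.mean()

# Apply several aggregate functions at once; each becomes a column.
summary = by_store.agg(["min", "max", "count"])

print(totals)
print(summary)
```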
Merging, concatenating and joining DataFrames in Pandas
Let’s jump to the Jupyter notebook to learn more about this.
That’s it for this part of the post. I will keep adding more operations and methods to this post if I find something interesting.
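For readers without the notebook at hand, here is a rough sketch of the three operations named in the merging section: concat stacks DataFrames, merge combines them on a shared column, and join combines them on the index. The frames left and right and their columns are made up for this example.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column.
left = pd.DataFrame({"key": ["k1", "k2", "k3"], "a": [1, 2, 3]})
right = pd.DataFrame({"key": ["k2", "k3", "k4"], "b": [20, 30, 40]})

# Concatenating stacks rows on top of each other.
# ignore_index=True renumbers the rows from 0.
stacked = pd.concat([left, left], ignore_index=True)

# Merging matches rows on a column; an inner merge keeps only
# the keys present in both frames.
merged = pd.merge(left, right, on="key", how="inner")

# Joining matches rows on the index; a left join keeps every row
# of the left frame and fills unmatched values with NaN.
joined = left.set_index("key").join(right.set_index("key"), how="left")

print(stacked)
print(merged)
print(joined)
```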