Now, let’s talk about a very important aspect of any dataset: data types. A dataset usually contains several variables, each with its own data type, such as integer, string, or float.
For data analysis, you will use the following libraries throughout the module. You should already have covered them in the prep content:
- Pandas: A library for working with dataframes in Python. The name is derived from "panel data". It is used primarily for data analysis in Python.
- NumPy: A library for performing numerical operations on a dataset.
Now, let’s go through the bank marketing dataset along with Rahim and find out which data types are present in it.
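As a minimal sketch of how such an inspection might look in Pandas, the example below builds a tiny dataframe and lists the data type of each column. The rows and the column names (`age`, `salary`, `job`, `education`) are hypothetical stand-ins chosen for illustration. They are not taken from the actual bank marketing file.

```python
import pandas as pd

# Hypothetical rows: the column names and values are assumptions,
# not the real schema of the bank marketing dataset.
df = pd.DataFrame(
    {
        "age": [58, 44, 33],
        "salary": [100000.0, 60000.0, 120000.0],
        "job": ["management", "technician", "entrepreneur"],
        "education": ["tertiary", "secondary", "secondary"],
    }
)

# dtypes reports one data type per column.
print(df.dtypes)

# select_dtypes splits the columns by kind of data type.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
text_cols = df.select_dtypes(exclude="number").columns.tolist()

print("Numeric columns:", numeric_cols)
print("Non-numeric columns:", text_cols)
```

The exact name printed for the text columns depends on the Pandas version, so it is worth checking `df.dtypes` on your own installation rather than relying on a fixed label.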
In general, any given dataset is expected to contain different types of data. The following are some examples with their data types.
| Example | Variable Type | Data Type |
| --- | --- | --- |
| Height, weight, age, temperature | Numerical variable | int, float |
| Size of clothes, months, type of jobs, blood group | Categorical variable | object |
| Grades in exam, education level, months, integer ratings | Ordinal categorical variable | object, int, float |
| Date, time, timestamp | Date and time variable | Date and time |
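The variable types in the table can be made explicit in Pandas. The sketch below is one possible approach, not a required step: it converts a text column to a categorical type, marks another as ordered (ordinal), and parses date strings into a date and time type. All column names and values here are made up for illustration.

```python
import pandas as pd

# Hypothetical values, invented for illustration only.
df = pd.DataFrame(
    {
        "height_cm": [172.5, 160.0, 181.2],
        "blood_group": ["A+", "O-", "B+"],
        "education": ["secondary", "primary", "tertiary"],
        "joined_on": ["2021-03-15", "2020-11-02", "2022-07-30"],
    }
)

# Categorical variable with no natural order.
df["blood_group"] = df["blood_group"].astype("category")

# Ordinal categorical variable: the order of the levels is stated explicitly.
education_levels = ["primary", "secondary", "tertiary"]
df["education"] = pd.Categorical(
    df["education"], categories=education_levels, ordered=True
)

# Dates stored as text are parsed into a date and time type.
df["joined_on"] = pd.to_datetime(df["joined_on"], format="%Y-%m-%d")

print(df.dtypes)

# Ordered categories support comparisons between levels.
at_least_secondary = (df["education"] >= "secondary").tolist()
print(at_least_secondary)
```

Declaring the order is what separates an ordinal variable from a plain categorical one: once the levels are ordered, comparisons and sorting follow the stated order rather than alphabetical order.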
In the next segment, you will get an understanding of the steps in the data cleaning process, particularly fixing the rows and columns.