
From Pandas Dataframe

In the last segment, we saw a few different ways to create dataframes.

In the upcoming video, Vishwa will discuss how to create Spark dataframes from Pandas dataframes.
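As a rough preview of the idea, here is a minimal sketch of the conversion. It assumes a SparkSession is already available, as it usually is under the name `spark` in a PySpark notebook kernel. The sample data, the `pandas_df` variable and the `to_spark_df` helper are hypothetical and are not taken from the video or the course notebook.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
pandas_df = pd.DataFrame(
    {
        "name": ["Asha", "Ravi", "Meera"],
        "age": [31, 45, 27],
    }
)


def to_spark_df(spark, pdf):
    """Convert a pandas DataFrame into a Spark DataFrame.

    `spark` is assumed to be an existing SparkSession;
    `createDataFrame` accepts a pandas DataFrame directly and
    infers the column names and types from it.
    """
    return spark.createDataFrame(pdf)


# In a PySpark notebook, where `spark` is already defined:
# spark_df = to_spark_df(spark, pandas_df)
# spark_df.printSchema()
# spark_df.show()
```

Going the other way, calling `toPandas()` on a Spark dataframe collects all of its rows onto the driver as a Pandas dataframe, so it is best kept for small results.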

Note

Please note that in this module, you may sometimes see the kernel shown as Python 2 instead of PySpark. This is because some of these videos are older and were recorded when the Python 2 kernel already had the PySpark libraries installed. With the current EMR configuration, you need to use the PySpark kernel only. The SME might also say EC2 instance where, in our case, an EMR instance is meant. (At the most basic level, EMR instances are EC2 instances with additional configuration.)

So far, we have seen various ways to create dataframes. We will discuss different operations on dataframes in the upcoming segments.

Note

The notebook used in this session is attached in the previous segment.
