Model Building

In the previous segment, you obtained the final data set. In this segment, you will build the model using that DataFrame. Let’s hear from Sajan on how the model can be built:

The first step in the model building process is splitting the data set into train and test data. This can be done using the following code:

This way, 70% of the data is used to train the model and the remaining 30% is used to test it. Next, to build the model, first import the logistic regression class, LogisticRegression, from the classification module of the pyspark.ml package.

After importing, the logistic regression object can be created using the following code:

Now, the model can be trained on the training data using the following code:

You have now trained the model on the training data. In the next segment, Sajan will explain how to evaluate the model.