
Practice Question

In the previous segment, we walked through a Python lab on K-Modes clustering with the bank marketing dataset. There, we filtered out the numerical columns and used only the categorical columns to test K-Modes clustering.

In this Practice Python Lab, you will use both the numerical and the categorical columns from the bank marketing data to test K-Prototype clustering.

Download the bank marketing data from below:

You are required to answer the questions below by running K-Prototype clustering on your local machine.

Some pointers before you proceed:

  • Use only the following columns: 'job', 'marital', 'education', 'default', 'housing', 'loan', 'contact', 'month', 'day_of_week', 'poutcome', 'age', 'duration', 'euribor3m', where age, duration and euribor3m are the numerical columns.
  • Convert all categorical columns to numeric by using LabelEncoder().
  • Standardize all the columns before using K-Prototype clustering.
  • Remember that you also need to convert the final dataframe to a matrix for applying K-Prototype.
  • First run K-Prototype with the number of clusters set to 5.
  • Keep in mind that the code may take some time to execute because there are many categorical variables, so be patient.
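
The pointers above can be sketched roughly as follows. This is a minimal illustration, not the reference solution: the helper names `prepare_matrix` and `cluster_with_kprototypes` are hypothetical, the demo rows are made-up placeholders standing in for the bank marketing file, and the clustering step assumes the third-party `kmodes` package is installed. Scaling the label-encoded categorical columns follows the pointer to standardize all columns; equal codes stay equal after scaling, so rows that share a category still match.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

CAT_COLS = ["job", "marital", "education", "default", "housing", "loan",
            "contact", "month", "day_of_week", "poutcome"]
NUM_COLS = ["age", "duration", "euribor3m"]


def prepare_matrix(df):
    """Return (matrix, categorical_indices) ready for K-Prototype."""
    out = df[CAT_COLS + NUM_COLS].copy()
    # Convert every categorical column to integer codes.
    for col in CAT_COLS:
        out[col] = LabelEncoder().fit_transform(out[col])
    # Standardize all columns; the result is already a NumPy matrix.
    matrix = StandardScaler().fit_transform(out)
    # Categorical columns come first, so they sit at positions 0..9.
    return matrix, list(range(len(CAT_COLS)))


def cluster_with_kprototypes(matrix, cat_idx, n_clusters=5):
    """Fit K-Prototype and return one cluster label per row."""
    # Imported here so the preprocessing above runs without kmodes.
    from kmodes.kprototypes import KPrototypes
    model = KPrototypes(n_clusters=n_clusters, init="Cao", random_state=0)
    return model.fit_predict(matrix, categorical=cat_idx)


# Placeholder rows (assumption): replace with the downloaded dataset.
rng = np.random.default_rng(0)
n_rows = 60
demo = pd.DataFrame(
    {col: rng.choice(["a", "b", "c"], size=n_rows) for col in CAT_COLS}
)
demo["age"] = rng.integers(18, 90, size=n_rows)
demo["duration"] = rng.integers(0, 1000, size=n_rows)
demo["euribor3m"] = rng.uniform(0.5, 5.0, size=n_rows)

matrix, cat_idx = prepare_matrix(demo)
print(matrix.shape, cat_idx)
```

With `kmodes` installed, calling `cluster_with_kprototypes(matrix, cat_idx)` should return one cluster label per row. On the full dataset, expect that call to be the slow step.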

If you were unable to solve any of the questions, you can check your answers using the Python notebook below.
