
Kaggle Practice Exercise

This is a practice session in which we will solve the Kaggle competition TalkingData AdTracking Fraud Detection Challenge. The training set and the solution are both provided here. Implement the code and try tuning the different parameters to achieve a better result. But first, let’s understand why it is beneficial to participate in Kaggle competitions.

Note that Professor Raghavan has given this to you as an exercise: you are requested to read the notebook, modify and improve the code, and optionally participate in the Kaggle competition. A great way to learn from other people on Kaggle is to read the kernels they share and incorporate useful ideas into your own code.

Please download the following files:

Note

We have used only a fraction of the training set (train_sample, 100k rows); the full training data on Kaggle (train.csv) has about 180 million rows. For a better understanding, you should run the same code on a significant portion of the full training dataset.
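One way to build a larger working sample without loading all of train.csv into memory is to stream the file and keep a random fraction of its rows. The sketch below is only an illustration of that idea, not the method used to create train_sample: the helper name `sample_csv`, the `fraction` and `seed` parameters, and any output file name are hypothetical and chosen for this example.

```python
import csv
import random


def sample_csv(src_path, dst_path, fraction, seed=0):
    """Copy the header and a random ~fraction of the data rows to dst_path.

    Hypothetical helper: streams the source file row by row, so memory use
    stays small even for a very large CSV. Returns the number of data rows kept.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    rng = random.Random(seed)  # fixed seed so the same sample can be rebuilt
    kept = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        header = next(reader, None)
        if header is None:  # empty source file: nothing to copy
            return 0
        writer.writerow(header)  # the header row is always kept

        for row in reader:
            # Keep each data row independently with probability `fraction`.
            if rng.random() < fraction:
                writer.writerow(row)
                kept += 1
    return kept
```

For example, `sample_csv("train.csv", "train_sample_10pct.csv", fraction=0.10)` would write roughly a tenth of the rows to a new file (the output name is made up here), which the notebook's existing loading code could then read in place of train_sample. Because rows are kept independently, the exact sample size varies slightly from run to run unless the seed is held fixed.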

We hope you have gone through the notebook. Let’s answer the following questions:

