Welcome to this module on “Analytics Using PySpark”. Let’s hear from Sajan about the broad topics you will cover in this module.
As you know, the volume of data is growing rapidly every day. The Spark ML library is used to process data at this scale. In this module, you will cover basic machine learning algorithms using the Spark ML library. The machine learning algorithms covered in this module are:
- Linear Regression
- Logistic Regression
- K-Means Clustering
As explained in the video, this module focuses on hands-on coding of these algorithms rather than on their theoretical aspects.
Guidelines for this module
This module focuses on implementing basic machine learning algorithms using PySpark. Each session starts with a quick recap of the topics, followed by the implementation.
Please also try writing and running the code along with the experts. The presentation used in the sessions will be provided in the corresponding ‘Session Summary’ segment.
Also, please run the following code in every Jupyter Notebook to initialize Spark.
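The course's own initialization snippet is not reproduced on this page. As a stand-in, here is a minimal sketch of one common way to start Spark from a notebook, assuming PySpark and a compatible Java runtime are already installed. The helper name `init_spark`, the application name and the `local[*]` master are illustrative assumptions, not the course's settings.

```python
def init_spark(app_name="analytics-using-pyspark", master="local[*]"):
    """Create a SparkSession for this notebook, or reuse one that is already running.

    Both default values are assumptions made for illustration; replace them
    with the settings of your own environment.
    """
    # Imported inside the function so that a missing PySpark installation
    # shows up as an ImportError when the helper is called.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)   # name shown in the Spark UI
        .master(master)      # "local[*]" runs Spark locally on all available cores
        .getOrCreate()       # returns the existing session if there is one
    )
```

Calling `spark = init_spark()` in the first cell of a notebook would then give you a session object to use in the later cells. If your environment needs extra setup (for example, pointing Python to a Spark installation), use the initialization code shared in the course instead.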
Guidelines for In-segment and Graded Questions
There will be a separate session for graded questions. The other sessions will contain non-graded questions. Each graded question in this module will have 10 marks for a correct response and 0 for an incorrect response. Each graded question will have only one attempt, while each non-graded question will have one or two attempts depending upon the type of question and the number of options.
People you will hear from in this module
Subject Matter Expert
Data Science Lead – Myntra
Sajan completed his undergraduate and postgraduate degrees in Computer Science Engineering at IIT (BHU). He heads the pricing team at Myntra, where he actively works with technologies such as data science, big data, Spark and machine learning. Presently, his work mainly involves developing discounting strategies for all the products offered by Myntra.
Subject Matter Expert
Senior Data Scientist at Gramener
With over 10 years of experience in data science and predictive analysis, Jaidev has worked in multiple firms such as Springboard, iDataLabs and cube26. He has completed his bachelor’s degree in Electrical and Electronics Engineering from Vishwakarma Institute of Technology, Pune. He is currently working as a Senior Data Scientist at Gramener, a leading data science consulting company that advises clients on data-driven leadership.
Subject Matter Expert
AI-COE, Reliance Jio
Ankit has over 12 years of experience in machine learning and AI across various domains such as banking and financial services, e-commerce and telecom. He has worked with Amazon, Snapdeal and Citigroup.
He is an expert in the application of ML in marketing and risk. He has worked with organisations across multiple geographies and developed and implemented data science solutions targeting different stages in the customer life cycle. He has worked extensively on building ML models and has experience in advanced techniques such as neural networks, GBMs and SVMs.