Welcome to the second session on ‘Analytics using PySpark’. In the last session, you learnt the basic Spark concepts and how to perform basic EDA using the Spark ML library.
Let’s start this session by watching the upcoming video in which Sajan outlines the concepts that will be covered in this session.
In this session
As Sajan discussed in the video, in this session you will learn how linear regression algorithms are implemented. You will then learn the basic model-building techniques in PySpark and implement a linear regression model using the Spark ML library. After building the model, you will look at the concepts of cross-validation and the bias-variance tradeoff.
Note:
The PPT used in this session is available in the ‘Session Summary’ segment.
People you will hear from in this session
Subject Matter Expert
Data Science Lead – Myntra
Sajan completed his undergraduate and postgraduate degrees in Computer Science Engineering at IIT, BHU. He heads the pricing team at Myntra, where he actively works with data science, big data, Spark and machine learning. Presently, his work mainly involves developing discounting strategies for all the products offered by Myntra.
Subject Matter Expert
Senior Data Scientist at Gramener
With over 10 years of experience in data science and predictive analysis, Jaidev has worked at multiple firms such as Springboard, iDataLabs and cube26. He completed his bachelor’s degree in Electrical and Electronics Engineering at Vishwakarma Institute of Technology, Pune. He is currently working as a Senior Data Scientist at Gramener, a leading data science consulting company that advises clients on data-driven leadership.