Implementing Joins In Structured Streaming

Now that you have learnt how joins work on streams, this segment walks you through a coding lab that applies what you have learnt so far.

In this lab, we created two streams and converted them into DataFrames using select. Both DataFrames contained player names, and we performed an inner join on them to inspect the output. As expected, only the records present in both DataFrames were returned.
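The inner join described above can be sketched roughly as follows. This is a minimal illustration, not the lab's actual code: it assumes both streams come from socket sources, and the host, the port numbers and the function names read_names and build_inner_join are placeholders chosen for this example. The SparkSession is passed in as spark.

```python
def read_names(spark, host, port):
    """Read lines from a socket source and expose them as a `name` column."""
    raw = (
        spark.readStream.format("socket")
        .option("host", host)
        .option("port", port)
        .load()
    )
    # The socket source yields a single string column called `value`;
    # select renames it so both streams share the join key `name`.
    return raw.select(raw["value"].alias("name"))


def build_inner_join(spark, host="localhost", left_port=9998, right_port=9999):
    """Inner-join two name streams (host and ports are assumed values)."""
    left = read_names(spark, host, left_port)
    right = read_names(spark, host, right_port)
    # Only names that arrive on both streams appear in the result.
    return left.join(right, on="name", how="inner")
```

With an active session, the result could be printed with something like build_inner_join(spark).writeStream.format("console").outputMode("append").start(). Note that without watermarks Spark keeps all past input as join state, which is acceptable for a short lab but not for a long-running job.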

Now, in the next coding lab, we will perform a stream-static join.

So, we started with a static outer join. First, we moved a CSV file from our local system to HDFS and read it from there to create a static DataFrame. Once we started sending data into the stream, the name came from the streaming DataFrame, while the age came from the static DataFrame.

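When we performed a left outer join as a stream-static join, the join still worked and picked up Finch because the stream was on the left. Spark supports a left outer join between a stream and a static DataFrame only when the stream is the left side.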
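A stream-static left outer join along these lines might look like the sketch below. It is an illustration under stated assumptions rather than the lab's code: the CSV is assumed to have a header row with name and age columns, the stream is assumed to be a socket source, and the function name, host, port and the STREAM_LEFT_JOINS guard are all introduced for this example. The HDFS path is passed in by the caller.

```python
# Join types that Structured Streaming supports when the stream is on the
# left and the static DataFrame is on the right. Restricting to these exact
# spellings is a choice made for this sketch; Spark accepts other aliases.
STREAM_LEFT_JOINS = {"inner", "left_outer", "left_semi"}


def build_stream_static_join(spark, csv_path, how="left_outer",
                             host="localhost", port=9999):
    """Join a stream of names (left) with a static players file (right)."""
    if how not in STREAM_LEFT_JOINS:
        raise ValueError(f"{how!r} is not supported with the stream on the left")

    # Static side: read once from the given path (for example, on HDFS).
    players = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Streaming side: each line received on the socket is one player name.
    raw = (
        spark.readStream.format("socket")
        .option("host", host)
        .option("port", port)
        .load()
    )
    names = raw.select(raw["value"].alias("name"))

    # With left_outer, a streamed name that has no match in the file is
    # still emitted, with a null age.
    return names.join(players, on="name", how=how)
```

Passing how="inner" instead would drop any streamed name that is missing from the static file, which makes the difference between the two join types easy to observe on the console sink.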

Make sure you have prepared the players.csv file before you run the code given below.

Feel free to explore the other joins as well to have a clear understanding of the concept.
