Deep dive: memory management in Apache Spark

Memory allocation in Spark has three key contention points. This post breaks down the three and describes the progress made on each one. The contention points are: (1) contention between memory allocated for execution and for storage (cache); (2) contention between tasks running in the same process; (3) contention between operators executing in the same... Continue Reading →
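
To make the first contention point concrete, here is a rough sketch of how an executor heap might be split between execution and storage. It assumes the unified memory manager with the documented defaults `spark.memory.fraction=0.6` and `spark.memory.storageFraction=0.5`, plus a fixed reservation of about 300 MiB; the function name and the returned keys are hypothetical, and the real sizes depend on the Spark version and configuration.

```python
# Assumption: fixed amount Spark sets aside before dividing the heap (about 300 MiB).
RESERVED_MIB = 300


def unified_memory_regions(heap_mib, memory_fraction=0.6, storage_fraction=0.5):
    """Estimate memory regions (in MiB) for one executor heap.

    memory_fraction mirrors spark.memory.fraction and storage_fraction
    mirrors spark.memory.storageFraction; both defaults are assumptions
    taken from the Spark configuration docs, not from the post itself.
    """
    usable = heap_mib - RESERVED_MIB
    if usable <= 0:
        raise ValueError("heap is smaller than the reserved memory")

    # Pool shared by execution (shuffles, joins, sorts) and storage (cache).
    unified = usable * memory_fraction
    # Part of the pool where cached blocks are protected from eviction by execution.
    storage_protected = unified * storage_fraction

    return {
        "unified": unified,
        "storage_protected": storage_protected,
        "execution_initial": unified - storage_protected,
        # Remainder left for user data structures and internal metadata.
        "user": usable - unified,
    }


regions = unified_memory_regions(4096)  # a 4 GiB executor heap
for name, size in regions.items():
    print(f"{name}: {size:.1f} MiB")
```

Under these assumptions the boundary between execution and storage is soft: either side can borrow unused space from the other, and only the protected storage share is safe from eviction when execution needs more room.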

Spark application logging

When coding a Spark application, we often want to write application logs to trace our application's progress. We would also like to benefit from Spark's log4j configuration, i.e. log collection and so on. So, naturally, we would declare a logger instance at the class level and use it in our closure. Unfortunately we can't do... Continue Reading →
