Memory allocation in spark has three key contention points, this post is a break down of the three, and a description of the progress that was made in each one
The contention points are:
- Contention between memory allocated for execution and for storage (cache)
- Contention between tasks running in the same process
- Contention between operators executing in the same task
Contention between memory allocated for execution and for storage (cache)
Execution, meaning the actual program and storage meaning the space used to cache RDDs for later processing.
Static memory allocation – spark v1.4
The memory was divided statically between execution and storage, memory couldn’t be shared, even if it was unused. for example – if the memory allocated for storage was not used, because little or no data was cached- it couldn’t be used for execution
Unified memory space – spark v1.6
The same memory space was used for both execution and storage, memory used for execution takes priority over memory used for cache, this means that when ever a memory contention between the two occurs, the memory used for cache will be spilled to disk.
This reason behind this decision is the understanding that execution data will always be re-read, while for cached data – not necessarily.
If an your app relies on caching, you can specify the amount of memory that if used for storage, will never be evicted. the memory is not pre-allocated.
Contention between tasks running in the same process
Each task is pre allocated with its share – relative to the number of tasks, i.e. given 4 tasks, each task will get 1/4 of the available memory, even if no other tasks are running.
Dynamic allocation – since spark v1.0 (2014)
The task’s share of the memory is computed relative to the number or running tasks, if a new task starts running, tasks that are already running will spill to disk to free memory, i.e. a task gets 1/N of the available memory, where N is the number of currently running tasks
This approach handles stragglers better,the last running task will be allocated with all of the available memory and with that, a chance to complete faster. a common reason for stragglers is a skewed partition size.
Contention between operators executing in the same task
Some operators like aggregate, use lots of memory, if an operator uses up all of the memory available for the task, then the next operator wont have any memory to work with and the program would fail.
A simple solution could be statically allocating memory for operators, but obviously it wouldn’t be optimal.
Cooperative spilling – the solution since spark v1.6
Operators use all memory they need, whenever an operator needs extra memory and no memory is available, it instructs the previous operator to spill to disk and free some memory.
This results in just enough data being spilled and fairness between operators.
Static, pre allocation of resources is the simpler solution. but it creates a situation that, some memory might not be utilized just because is was pre-allocated to a task\operator, which is not use all of the memory allocated for it, or might not be running at all.
Instead, the guys at Databricks chose to allocate memory dynamically, and when there is a contention – force other consumers (task\operator) to spill to disk and free up memory.
This results in better memory utilization without the developer having to tune each job to get optimal performance.