Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

spark:running-spark-on-single-machine-or-on-cluster [2022/12/14 12:49]
straka [Starting Spark Cluster]
spark:running-spark-on-single-machine-or-on-cluster [2022/12/14 12:53]
straka [Memory Specification]
Line 27: Line 27:
 ==== Memory Specification ====
  
-Memory specification used for master and worker heap size (and for ''mem_free'' SGE constraint) must be specified. The memory can be specified either in bytes, or using ''kK/mM/gG'' suffix. A reasonable default value is 512M or 1G.
+TL;DR: A good default is ''2G''.
  
 +The memory for each worker is specified using the following format: <file>spark_memory_per_workerG[:memory_per_Python_processG]</file>
 +The Spark memory value limits the Java heap size, and half of it is reserved for in-memory storage of cached RDDs. The second, optional value sets the memory limit for every Python process and defaults to ''2G''.
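
The format described above can be illustrated with a small parser. This is only a sketch and not the code used by the cluster scripts: the function name ''parse_memory_spec'' and the returned keys are hypothetical, and it assumes that both values are whole numbers of gigabytes written with a ''G'' suffix, exactly as in the format string.

```python
import re

# Default Python process limit, taken from the text above (2G).
DEFAULT_PYTHON_MEMORY_GB = 2

# Assumed grammar: <int>G optionally followed by :<int>G.
# The real scripts may accept other suffixes; that is not covered here.
_SPEC = re.compile(r"(\d+)G(?::(\d+)G)?")


def parse_memory_spec(spec):
    """Split a spec such as '4G:1G' into its parts (hypothetical helper)."""
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"unsupported memory specification: {spec!r}")
    spark_gb = int(match.group(1))
    python_part = match.group(2)
    python_gb = int(python_part) if python_part is not None else DEFAULT_PYTHON_MEMORY_GB
    return {
        "spark_heap_gb": spark_gb,
        # Half of the Java heap is reserved for storage of cached RDDs.
        "cached_rdd_storage_gb": spark_gb / 2,
        "python_process_gb": python_gb,
    }


if __name__ == "__main__":
    print(parse_memory_spec("2G"))
    print(parse_memory_spec("4G:1G"))
```

With ''2G'' the sketch reports a 2 GB Java heap, 1 GB of it reserved for cached RDDs, and the default 2 GB limit per Python process; with ''4G:1G'' the Python limit is lowered to 1 GB.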
 ==== Examples ====
  
