It is also possible to store an RDD in a binary format, which has several advantages:
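Note, however, that the binary format differs between Python and Scala: the Python API writes pickled objects, whereas the Scala API relies on Java serialization, so a file saved from one language generally cannot be loaded from the other.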
In Python, rdd.saveAsPickleFile saves the RDD and sc.pickleFile loads it back:
data.saveAsPickleFile(path)
...
loaded_data = sc.pickleFile(path)
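As a rough illustration of what the pickled format preserves, the sketch below uses only Python's standard pickle module, which is the serializer PySpark applies to the RDD's elements. It is not Spark code: the records list is a made-up stand-in for the contents of an RDD, and the comparison with a text round trip is only there to show the difference in what comes back.

```python
import pickle

# Hypothetical records, standing in for the elements of an RDD.
records = [("alice", 3, {"tags": ["a", "b"]}), ("bob", 7, {"tags": []})]

# Binary round trip: serialize to bytes, then restore the objects.
blob = pickle.dumps(records)
restored = pickle.loads(blob)

# Text round trip for comparison: every record becomes a plain string.
as_text = [str(r) for r in records]

print(type(blob).__name__)         # bytes
print(restored == records)         # True
print(type(restored[0]).__name__)  # tuple
print(type(as_text[0]).__name__)   # str
```

The restored elements are the original tuples, with their nested dictionaries and lists intact, whereas the text version would have to be parsed again. The same property explains why the resulting file is tied to Python: only a pickle-aware reader can decode it.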
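In Scala, rdd.saveAsObjectFile saves the RDD and sc.objectFile loads it back. Because sc.objectFile cannot infer the element type from the file, pass it explicitly as a type parameter (T below is a placeholder for the RDD's element type):
data.saveAsObjectFile(path)
...
val loaded_data = sc.objectFile[T](path)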