
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.


spark:recipes:reading-text-files [2014/11/04 14:13] straka
spark:recipes:reading-text-files [2016/03/31 22:02] (current) straka
Line 74:
 ===== Reading Whole Text Files =====
 
-To read whole text file or whole text files in a given directory, ''sc.wholeTextFiles'' can be used.
-
-Unfortunately, ''sc.wholeTextFiles'' **does not** support compressed files.
+To read whole text file or whole text files in a given directory, ''sc.wholeTextFiles'' can be used. Compressed files are supported.
 
 <file python>
 whole_wiki = sc.wholeTextFiles("/net/projects/spark-example-data/wiki-cs")
 </file>
+
+By default, every file is read in separate partitions. To control the number of partitions, ''repartition'' or ''coalesce'' can be used.
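
The partition control mentioned in the added paragraph could look roughly like the sketch below. The helper name ''read_whole_files'' and its ''num_partitions'' parameter are hypothetical and not part of the wiki page or of Spark. The sketch only assumes that ''sc'' is an existing SparkContext and uses the standard RDD methods ''getNumPartitions'', ''coalesce'' and ''repartition''.

```python
def read_whole_files(sc, path, num_partitions=None):
    """Read every file under `path` as one (file_path, content) record.

    Hypothetical helper for illustration: `sc` is an existing SparkContext,
    `num_partitions` is the desired number of partitions (None = keep as is).
    """
    rdd = sc.wholeTextFiles(path)
    if num_partitions is None:
        # Keep whatever partitioning Spark chose when reading.
        return rdd

    current = rdd.getNumPartitions()
    if num_partitions == current:
        # Nothing to change.
        return rdd
    if num_partitions < current:
        # coalesce merges existing partitions without a full shuffle.
        return rdd.coalesce(num_partitions)
    # repartition shuffles the data and can increase the partition count.
    return rdd.repartition(num_partitions)
```

In a PySpark shell where ''sc'' is already defined, calling ''read_whole_files(sc, "/net/projects/spark-example-data/wiki-cs", 50)'' would return the same kind of RDD as the recipe above, reduced or expanded to 50 partitions. The value 50 is an arbitrary example. Choosing ''coalesce'' when shrinking is a common design choice because it avoids a shuffle, at the cost of possibly uneven partition sizes.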
