Differences
This shows you the differences between two versions of the page.
|
spark:recipes:reading-text-files [2025/10/15 20:12] straka [Number of Partitions: Multiple Files in a Directory] |
spark:recipes:reading-text-files [2025/10/15 20:13] (current) straka [Number of Partitions: Multiple Files in a Directory] |
||
|---|---|---|---|
| Line 33: | Line 33: | ||
| When the input path is a directory, each file is read into separate partitions. The minimum number of partitions given as the second argument to '' | When the input path is a directory, each file is read into separate partitions. The minimum number of partitions given as the second argument to '' | ||
| - | Note that when there are many files (thousands or more), the number of partitions can be quite large, which slows down the computation. In that case, '' | + | Note that when there are many files (thousands or more, as for example in ''/ |
| <file python> | <file python> | ||
| conll_lines = sc.textFile("/ | conll_lines = sc.textFile("/ | ||
