Differences
This shows you the differences between two versions of the page.
spark:using-python [2014/11/10 15:36] straka |
spark:using-python [2014/11/10 15:42] straka |
||
---|---|---|---|
Line 16: | Line 16: | ||
Consider the following simple script computing the 10 most frequent words of the Czech Wikipedia: | Consider the following simple script computing the 10 most frequent words of the Czech Wikipedia: | ||
<file python> | <file python> | ||
- | (sc.textFile("/ | + | (sc.textFile("/ |
| | ||
| | ||
Line 36: | Line 36: | ||
===== Running Python Spark Scripts ===== | ===== Running Python Spark Scripts ===== | ||
- | Python Spark scripts can be started using | + | Python Spark scripts can be started using: |
- | spark-submit | + | < |
As described in [[running-spark-on-single-machine-or-on-cluster|Running Spark on Single Machine or on Cluster]], environment variable '' | As described in [[running-spark-on-single-machine-or-on-cluster|Running Spark on Single Machine or on Cluster]], environment variable '' | ||
Line 56: | Line 56: | ||
sc = SparkContext() | sc = SparkContext() | ||
- | (sc.textFile(input) | + | (sc.textFile(input, 3*sc.defaultParallelism) |
| | ||
| |
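The last hunk passes ''3*sc.defaultParallelism'' as the second argument of ''sc.textFile''. In PySpark that argument is the minimum number of partitions the input is split into. The sketch below is a plain-Python analogue, not Spark code. It only illustrates the idea of splitting the input into three times as many partitions as there are workers, counting words per partition, and merging the partial counts. All function names are hypothetical, and the word-count steps are an assumption, since the script itself is truncated in this diff.

```python
from collections import Counter


def split_into_partitions(lines, num_partitions):
    """Distribute lines round-robin into num_partitions lists.

    A stand-in for Spark partitions; real Spark splits input by file blocks.
    """
    num_partitions = max(1, num_partitions)
    return [lines[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Count whitespace-separated words within a single partition."""
    counts = Counter()
    for line in partition:
        counts.update(line.split())
    return counts


def most_frequent_words(lines, default_parallelism=2, top=10):
    """Return the `top` most frequent words as (word, count) pairs.

    Mirrors the 3 * defaultParallelism choice from the diff: more partitions
    than workers, so that uneven partitions balance out.
    """
    partitions = split_into_partitions(lines, 3 * default_parallelism)
    total = Counter()
    for partition in partitions:
        # Merging partial counts plays the role of a reduce-by-key step.
        total.update(count_words(partition))
    # Sort by descending count, then alphabetically for a deterministic order.
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


if __name__ == "__main__":
    sample = ["a b a", "b a", "c"]
    print(most_frequent_words(sample, default_parallelism=2, top=2))
```

The result does not depend on the number of partitions; only the distribution of work does. That is why such a hint can be changed, as in the diff above, without touching the rest of the pipeline.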