
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

spark:using-scala [2017/10/16 21:24]
ufal [Using Scala]
spark:using-scala [2022/12/14 13:02]
straka [Usage Examples]
Line 13:
 Consider the following simple script computing 10 most frequent words of Czech Wikipedia:
 <file scala>
-(sc.textFile("/net/projects/spark-example-data/wiki-cs", 3*sc.defaultParallelism)
+(sc.textFile("/lnet/troja/data/npfl118/wiki/cs/wiki.txt", 3*sc.defaultParallelism)
    .flatMap(_.split("\\s"))
    .map((_,1)).reduceByKey(_+_)
Line 20:
 </file>
 
-  * run interactive shell using existing Spark cluster (i.e., inside ''spark-qrsh''), or start local Spark cluster using as many threads as there are cores if there is none:
+  * run interactive shell using existing Spark cluster (i.e., inside ''spark-srun''), or start local Spark cluster using as many threads as there are cores if there is none:
   <file>spark-shell</file>
   * run interactive shell with local Spark cluster using one thread:
   <file>MASTER=local spark-shell</file>
-  * start Spark cluster (10 machines, 1GB RAM each) on SGE and run interactive shell:
-  <file>spark-qrsh 10 1G spark-shell</file>
+  * start Spark cluster (10 machines, 2GB RAM each) via Slurm and run interactive shell:
+  <file>spark-srun 10 2G spark-shell</file>
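
The word-count steps visible in the first hunk (split on whitespace, pair each token with 1, sum per key) can be tried without a cluster. The following is only a minimal sketch on plain Scala collections, not the script from the page: `TopWords`, `topWords` and the sample input are hypothetical names, `groupBy` plus a sum stands in for Spark's `reduceByKey`, and the final sort-and-take step is an assumption, since the rest of the script is not shown in this diff.

```scala
// Plain-Scala sketch (no Spark): the same steps on an in-memory Seq.
// All names here are hypothetical and chosen for illustration only.
object TopWords {
  /** Returns the n most frequent whitespace-separated tokens with their counts. */
  def topWords(lines: Seq[String], n: Int): Seq[(String, Int)] =
    lines
      .flatMap(_.split("\\s"))   // same tokenization as in the diffed script
      .map((_, 1))               // pair every token with the count 1
      .groupBy(_._1)             // local stand-in for reduceByKey(_+_)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
      .toSeq
      // Assumed final step: order by descending count, ties broken by word.
      .sortBy { case (word, count) => (-count, word) }
      .take(n)

  def main(args: Array[String]): Unit = {
    val lines = Seq("a b a", "b a c")
    topWords(lines, 2).foreach(println)
  }
}
```

On a real cluster the grouping is distributed across partitions, so this local version only illustrates the logic, not the performance characteristics.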
  
  
Line 72:
 version := "1.0"
 
-scalaVersion := "2.11.8"
+scalaVersion := "2.11.12"
 
-libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.1"
+libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.2"
 </file>
  
