
Institute of Formal and Applied Linguistics Wiki



Differences

This shows you the differences between two versions of the page.

spark [2017/10/16 20:56] ufal [Basic Information]
spark [2023/11/13 17:52] (current) straka
Line 1: Line 1:
-====== Spark: Framework for Distributed Computations (Under Construction) ======
+====== Spark: Framework for Distributed Computations ======
  
 [[http://spark.apache.org|{{:spark:spark-logo.png?150 }}]] [[http://spark.apache.org|Spark]] is a framework for distributed computations. It works natively in Python, Scala and Java, and can be used to a limited extent in Perl via pipes.
Line 11: Line 11:
 The Python, Scala and Java bindings all work well in the UFAL environment. The examples shown here are in Python and Scala. We do not discuss the Java binding, because it has the same API as Scala (and if you are a Java fan or know Java substantially better than Scala, you will be able to use it by yourself).
  
-Currently, Spark 2.is available.
+Currently (Nov 2023), Spark 3.5.0 is available.
  
 ===== Getting Started =====
Line 19: Line 19:
   * Official [[http://spark.apache.org/docs/latest/programming-guide.html|Spark Programming Guide]]
   * Official [[http://spark.apache.org/docs/latest/mllib-guide.html|MLlib Programming Guide]] (Spark’s scalable machine learning library consisting of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, as well as underlying optimization primitives)
-  * Official [[http://spark.apache.org/docs/latest/api/python/index.html|Python API Reference]]/[[http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.package|Scala API Reference]]
+  * Official [[http://spark.apache.org/docs/latest/api/python/index.html|Python API Reference]]/[[https://spark.apache.org/docs/latest/api/scala/org/apache/spark/index.html|Scala API Reference]]
  
 ===== Using Spark in UFAL Environment =====
 
 The latest supported version of Spark is available in ''/net/projects/spark''. To use it, add
-  export PATH="/net/projects/spark/bin:/net/projects/spark/sge:$PATH"
+  export PATH="/net/projects/spark/bin:/net/projects/spark/slurm:$PATH"
 to your ''.bashrc'' (or to the configuration file of your preferred shell). If you want to use Scala and do not have ''sbt'' installed (or do not know what ''sbt'' is), also add
   export PATH="/net/projects/spark/sbt/bin:$PATH"
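
The ''export'' lines above prepend to ''PATH'' unconditionally, so sourcing ''.bashrc'' more than once adds the same directories again. A minimal sketch of an idempotent variant follows, assuming a POSIX-compatible shell. The function ''path_prepend'' is a hypothetical helper written for this illustration and is not something provided in ''/net/projects/spark''; the directories are the ones from the current revision of the page.

```shell
# Hypothetical helper (not part of the Spark installation): prepend a
# directory to PATH only if it is not already listed there.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already present, nothing to do
    *) PATH="$1:$PATH" ;;      # otherwise put it in front
  esac
}

# Called in reverse order so the final PATH starts with
# sbt/bin, then bin, then slurm, as the two export lines would produce.
path_prepend /net/projects/spark/slurm
path_prepend /net/projects/spark/bin
path_prepend /net/projects/spark/sbt/bin   # only needed for Scala with sbt
export PATH
```

Calling the helper a second time with the same directory leaves ''PATH'' unchanged, which is the only difference in behaviour from the plain ''export'' lines.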
