Differences

This shows you the differences between two versions of the page.

--- spark:recipes:using-perl-via-pipes [2014/11/11 09:36] straka
+++ spark:recipes:using-perl-via-pipes [2015/07/27 19:08] straka
@@ Line 84: / Line 84: @@
 sc = SparkContext()
 (sc.textFile(input)
-   .map(json.dumps).pipe("env perl tokenize.pl", os.environ).map(json.loads)
+   .map(json.dumps).pipe("perl tokenize.pl", os.environ).map(json.loads)
    .flatMap(lambda tokens: map(lambda x: (x, 1), tokens))
    .reduceByKey(lambda x,y: x + y)
@@ Line 155: / Line 155: @@
 After compiling ''perl_integration.scala'' with ''sbt'', we can execute it using
-  spark-submit --class Main --files tokenize.pl target/scala-2.10/perl_integration_2.10-1.0.jar input output
+  spark-submit --files tokenize.pl target/scala-2.10/perl_integration_2.10-1.0.jar input output
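
The pattern in the first hunk (JSON-encode each record, send it through an external command one record per line, then decode each output line) can be sketched locally without Spark. This is a minimal illustration using only the Python standard library, not the recipe's actual code: the helpers `pipe_through` and `word_count` are hypothetical names, and a small inline Python child process stands in for `tokenize.pl`, splitting on whitespace only.

```python
import json
import subprocess
import sys
from collections import Counter

# Hypothetical stand-in for tokenize.pl (an assumption, not the real script):
# reads one JSON string per line, writes one JSON array of tokens per line.
CHILD_SOURCE = (
    "import json, sys\n"
    "for line in sys.stdin:\n"
    "    print(json.dumps(json.loads(line).split()))\n"
)


def pipe_through(command, records):
    """Send JSON-encoded records to `command`, one per line, and decode its output."""
    # json.dumps escapes embedded newlines, so each record occupies exactly one line.
    payload = "".join(json.dumps(record) + "\n" for record in records)
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True, check=True
    )
    return [json.loads(line) for line in result.stdout.splitlines() if line]


def word_count(lines):
    """Local analogue of the flatMap / reduceByKey steps shown in the diff."""
    command = [sys.executable, "-c", CHILD_SOURCE]
    counts = Counter()
    for tokens in pipe_through(command, lines):
        counts.update(tokens)
    return dict(counts)


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

Encoding with JSON keeps a record that contains a newline on a single line, which is what lets a line-oriented pipe treat it as one item.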

Institute of Formal and Applied Linguistics Wiki