Institute of Formal and Applied Linguistics Wiki


Differences

This shows you the differences between two versions of the page.

courses:mapreduce-tutorial:step-5 [2012/01/29 21:30]
straka
courses:mapreduce-tutorial:step-5 [2012/01/29 23:42]
majlis
Line 50: Line 50:
 Run a Hadoop job on ''/home/straka/wiki/cs-text-small'', which counts occurrences of every word in the article texts. You can download the template {{:courses:mapreduce-tutorial:step-5-exercise1.txt|step-5-exercise1.pl}} and execute it.
   wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-5-exercise1.txt' -O 'step-5-exercise1.pl'
 +  # NOW EDIT THE FILE
 +  # $EDITOR step-5-exercise1.pl
   rm -rf step-5-out-ex1; perl step-5-exercise1.pl run /home/straka/wiki/cs-text-medium/ step-5-out-ex1
   less step-5-out-ex1/part-*
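
To sanity-check the word counts on a small sample without Hadoop, a plain shell pipeline can serve as a reference. This is only a sketch and not part of the tutorial's template: the function name ''count_words'' is hypothetical, and it assumes each input line has the form ''article_title<TAB>article_text'' with words separated by whitespace. The template may tokenize differently, so small differences in counts are possible.

```shell
# Local reference word count (NOT a Hadoop job).
# Assumption: every input line is "article_title<TAB>article_text".
count_words() {
  cut -f2- |                      # drop the title column, keep the text
    tr -s '[:space:]' '\n' |      # put each whitespace-separated token on its own line
    sed '/^$/d' |                 # discard empty tokens
    LC_ALL=C sort | uniq -c |     # group identical words and count them
    awk '{ print $2 "\t" $1 }'    # emit "word<TAB>count"
}
```

For example, ''head -n 100 some-input-file | count_words | less'' gives counts for the first 100 articles of a local copy of the data; the file name here is a placeholder.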
Line 64: Line 66:
 Run a Hadoop job on ''/home/straka/wiki/cs-text-small'', which generates an inverted index. The inverted index contains, for each word, all its //occurrences//, where each occurrence is a pair (article of occurrence, position of occurrence). You can download the template {{:courses:mapreduce-tutorial:step-5-exercise2.txt|step-5-exercise2.pl}} and execute it.
   wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-5-exercise2.txt' -O 'step-5-exercise2.pl'
 -  rm -rf step-5-out-ex2; perl step-5-exercise2.pl run /home/straka/wiki/cs-text-tiny/ step-5-out-ex2
 +  # NOW EDIT THE FILE
 +  # $EDITOR step-5-exercise2.pl
 +  rm -rf step-5-out-ex2; perl step-5-exercise2.pl run /home/straka/wiki/cs-text-small/ step-5-out-ex2
   less step-5-out-ex2/part-*
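
The idea of the inverted index can be illustrated locally with a three-stage pipeline that mirrors map, shuffle and reduce. This is a sketch under stated assumptions, not the tutorial's solution: the function name ''inverted_index'' and the output layout ''word<TAB>(article,position)...'' are hypothetical, each input line is assumed to be ''article_title<TAB>article_text'' with no further tabs in the text, and positions are taken to be 1-based token indices.

```shell
# Local illustration of an inverted index (NOT a Hadoop job).
# Assumption: every input line is "article_title<TAB>article_text".
inverted_index() {
  tab="$(printf '\t')"
  # "Map": emit one record "word<TAB>article<TAB>position" per token.
  awk -F '\t' '{
    n = split($2, words, " ")          # split text on runs of whitespace
    for (i = 1; i <= n; i++)
      print words[i] "\t" $1 "\t" i    # position = 1-based token index
  }' |
    # "Shuffle": bring all records of the same word together.
    LC_ALL=C sort -t "$tab" -k1,1 -k2,2 -k3,3n |
    # "Reduce": collapse the records of each word into a single line.
    awk -F '\t' '
      $1 != prev { if (NR > 1) printf "\n"; printf "%s", $1; prev = $1 }
      { printf "\t(%s,%d)", $2, $3 }
      END { if (NR > 0) printf "\n" }'
}
```

In the Hadoop job, the sort step is performed by the framework between the mapper and the reducer, so only the first and last stages correspond to code you would write in the template.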
  
Line 70: Line 74:
 You can also download the solution {{:courses:mapreduce-tutorial:step-5-solution2.txt|step-5-solution2.pl}} and check the correct output.
   wget --no-check-certificate 'https://wiki.ufal.ms.mff.cuni.cz/_media/courses:mapreduce-tutorial:step-5-solution2.txt' -O 'step-5-solution2.pl'
 -  rm -rf step-5-out-sol2; perl step-5-solution2.pl run /home/straka/wiki/cs-text-tiny/ step-5-out-sol2
 +  rm -rf step-5-out-sol2; perl step-5-solution2.pl run /home/straka/wiki/cs-text-small/ step-5-out-sol2
   less step-5-out-sol2/part-*
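
Instead of comparing the two outputs by eye in ''less'', they can be compared mechanically. The helper below is a hypothetical sketch (the name ''same_output'' is not part of the tutorial): it concatenates the ''part-*'' files of each output directory, sorts the records so that the order of the part files does not matter, and runs ''diff''. It assumes both jobs write one record per line. If the solution lists the occurrences within a line in a different order than your job does, the lines will be reported as different even though the index is equivalent.

```shell
# Compare two job output directories record by record.
# Returns 0 if they contain the same set of lines, non-zero otherwise.
same_output() {
  tmp_a="$(mktemp)"
  tmp_b="$(mktemp)"
  cat "$1"/part-* | LC_ALL=C sort > "$tmp_a"   # records of the first directory
  cat "$2"/part-* | LC_ALL=C sort > "$tmp_b"   # records of the second directory
  status=0
  diff "$tmp_a" "$tmp_b" || status=$?          # print differences, remember result
  rm -f "$tmp_a" "$tmp_b"
  return "$status"
}
```

With the directory names used above, the call would be ''same_output step-5-out-ex2 step-5-out-sol2 && echo identical''.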
  