Institute of Formal and Applied Linguistics Wiki


user:zeman:mdmake [2010/11/05 14:24]
zeman created
user:zeman:mdmake [2010/11/05 14:52]
zeman
Imagine you need to apply the same sequence of tools to a set of data files and want to be able to repeat the experiment later, i.e. at some point in the future you will want to recall exactly how the processing was invoked. One example is a shared task that involves processing similarly formatted data in many languages. You may want to use [[http://www.gnu.org/software/make/manual/make.html|make]] and Makefiles, which are well suited to describing the order in which the various scripts are applied. However, some aspects of this kind of processing are rather tricky to handle in classical Makefiles.
  
The most prominent phenomenon that is difficult to capture is what I call the //multidimensionality// of the data. Every data file undergoes a sequence of processing steps, i.e. it appears in many different states (and intermediate data formats). Some processing tools may have alternative implementations, so you may have the same piece of data at the same stage of processing (e.g. syntactically parsed) but with different processing results (e.g. parsed by either the Malt parser or the MST parser). Besides that, you may be applying the same processing to data in ten different languages, to several domains per language, separately to development and evaluation test data, etc. All these //dimensions// will probably be reflected somehow in the paths to your data files. You would probably want to use pattern (template) rules in your Makefile to describe the same action applied to many files. However, GNU make allows only one ''%'' (variable) per pattern rule, which makes it rather difficult to define templates in a multidimensional space. This is where **mdmake**, or “multidimensional make”, may be useful.
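
To make the single-''%'' limitation concrete, here is a minimal sketch of how two extra dimensions are typically handled in plain GNU make. It is not mdmake and does not show mdmake syntax. The directory layout (''tagged/'', ''parsed/''), the language and parser lists, and the ''parse-*.sh'' scripts are all hypothetical names chosen for illustration. Because a target such as ''parsed/%/%/%.conll'' does not give three independent wildcards, the sketch generates one ordinary pattern rule per (language, parser) pair with ''define'', ''call'' and ''eval'', leaving ''%'' to match only the file name.

```make
# Hypothetical dimensions; a real experiment would also add domain, dev/eval split, etc.
LANGS   := cs en
PARSERS := malt mst

# Rule template: $(1) = language, $(2) = parser. The single % matches the file stem.
# $$ defers expansion of automatic variables until the generated rule actually runs.
# Recipe lines inside the template must start with a tab character.
define PARSE_RULE
parsed/$(1)/$(2)/%.conll: tagged/$(1)/%.conll
	mkdir -p $$(@D)
	./parse-$(2).sh < $$< > $$@
endef

# Instantiate the template once for every point in the LANGS x PARSERS grid.
$(foreach l,$(LANGS),$(foreach p,$(PARSERS),$(eval $(call PARSE_RULE,$(l),$(p)))))
```

With these assumptions, ''make parsed/cs/malt/train.conll'' would run ''./parse-malt.sh'' on ''tagged/cs/train.conll''. Every additional dimension adds another nested ''foreach'' and another template argument, which is the bookkeeping that becomes hard to maintain as the number of dimensions grows.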
