Differences
This shows you the differences between two versions of the page.
user:zeman:mdmake [2010/11/05 16:22] zeman Files for download. |
user:zeman:mdmake [2023/04/21 18:17] (current) zeman Now versioned at Github. |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== mdmake ====== | ====== mdmake ====== | ||
+ | |||
+ | [[https:// | ||
Imagine you need to apply the same sequence of tools to a set of data files, and you may want to repeat the experiment later, i.e. at some point in the future you will want to recall exactly how the processing was invoked. One example is a shared task that processes similarly formatted data in many languages. One may want to use [[http:// | Imagine you need to apply the same sequence of tools to a set of data files, and you may want to repeat the experiment later, i.e. at some point in the future you will want to recall exactly how the processing was invoked. One example is a shared task that processes similarly formatted data in many languages. One may want to use [[http:// | ||
Line 11: | Line 13: | ||
* A MD-makefile ('' | * A MD-makefile ('' | ||
* Enumerate the variables that contain the values of the respective dimensions, and at the same time specify how they combine into file names (paths). (The spaces will be deleted; their purpose here is to show which delimiter is dropped when a dimension is omitted. Permitted delimiters are slash, hyphen and period.) | * Enumerate the variables that contain the values of the respective dimensions, and at the same time specify how they combine into file names (paths). (The spaces will be deleted; their purpose here is to show which delimiter is dropped when a dimension is omitted. Permitted delimiters are slash, hyphen and period.) | ||
+ | |||
< | < | ||
+ | |||
* The delimiters are not mandatory, but MD-make checks that omitting them does not cause ambiguities (e.g. if LANGUAGES = hi him and DOMAINS = mix ix, then .MDIMS: LANGUAGES DOMAINS would be ambiguous, because hi+mix and him+ix both yield himix). | * The delimiters are not mandatory, but MD-make checks that omitting them does not cause ambiguities (e.g. if LANGUAGES = hi him and DOMAINS = mix ix, then .MDIMS: LANGUAGES DOMAINS would be ambiguous, because hi+mix and him+ix both yield himix). | ||
* The last dimension in the list of dimensions is special. It need not be named STATES and it need not be delimited by a period (although a period is recommended, because in some operating systems it is desirable that the file name extension identifies the type of the contents); in any case, the value of this dimension is considered the type of the file. Among other things, the file type determines in which dimensions files of this type exist. MD-make gets that information from the rule that generates files of this type as its goal. For every type there must be at least one such rule. There can be more than one, e.g. if we want to perform different actions for different languages. In that case all such rules must lead to the same list of dimensions of the goal. However, together they need not cover all values of all these dimensions. | * The last dimension in the list of dimensions is special. It need not be named STATES and it need not be delimited by a period (although a period is recommended, because in some operating systems it is desirable that the file name extension identifies the type of the contents); in any case, the value of this dimension is considered the type of the file. Among other things, the file type determines in which dimensions files of this type exist. MD-make gets that information from the rule that generates files of this type as its goal. For every type there must be at least one such rule. There can be more than one, e.g. if we want to perform different actions for different languages. In that case all such rules must lead to the same list of dimensions of the goal. However, together they need not cover all values of all these dimensions. | ||
* The variables holding the values of the dimensions must be ordinary variables containing only a space-separated list of words. MD-make does not search them for references to other variables or macros; if it encounters a dollar sign in one of them, it throws an exception and terminates. These variables are visible in the generated makefile as well. | * The variables holding the values of the dimensions must be ordinary variables containing only a space-separated list of words. MD-make does not search them for references to other variables or macros; if it encounters a dollar sign in one of them, it throws an exception and terminates. These variables are visible in the generated makefile as well. | ||
* No value in any dimension may be identical to any other value in any dimension. In other words, a value uniquely identifies its dimension. (This helps prevent ambiguities in file names that do not contain all dimensions.) | * No value in any dimension may be identical to any other value in any dimension. In other words, a value uniquely identifies its dimension. (This helps prevent ambiguities in file names that do not contain all dimensions.) | ||
- | | + | |
- | * In what dimensions the target file exists. (The other dimensions will not appear in the file name.) | + | < |
+ | DE = d e | ||
+ | TRAINTEST = train test | ||
+ | PREPROCESSINGS = pre1 pre2 | ||
+ | STATES = mst blind.conll mst.conll</ | ||
+ | |||
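The constraints on dimension values described above (no dollar signs, no value shared between dimensions, no ambiguity when a delimiter is omitted) can be illustrated with a small Python sketch. This is not MD-make's own code: the function names are hypothetical, the `LANGUAGES` value is assumed because it is not shown in full here, and the ambiguity check covers only the simple case of two adjacent dimensions joined without a delimiter.

```python
from itertools import product

# Values copied from the variable definitions above; LANGUAGES is an
# assumed value, since its definition is not reproduced in full here.
DIMS = {
    "LANGUAGES": ["cs", "en"],
    "DE": ["d", "e"],
    "TRAINTEST": ["train", "test"],
    "PREPROCESSINGS": ["pre1", "pre2"],
    "STATES": ["mst", "blind.conll", "mst.conll"],
}


def check_values(dims):
    """Reject '$' references and repeated values; return a value -> dimension map."""
    owner = {}
    for dim, values in dims.items():
        for value in values:
            if "$" in value:
                raise ValueError(f"{dim}: variable reference not allowed in {value!r}")
            if value in owner:
                raise ValueError(f"{value!r} is used twice ({owner[value]}, {dim})")
            owner[value] = dim
    return owner


def ambiguous_joins(left, right):
    """List strings that two different (left, right) pairs produce when joined without a delimiter."""
    seen = {}
    clashes = []
    for a, b in product(left, right):
        joined = a + b
        if joined in seen:
            clashes.append((joined, seen[joined], (a, b)))
        else:
            seen[joined] = (a, b)
    return clashes


if __name__ == "__main__":
    owner = check_values(DIMS)
    print(owner["test"])
    print(ambiguous_joins(["hi", "him"], ["mix", "ix"]))
```

With the values from the ambiguity example, `ambiguous_joins` reports that `himix` can be read either as `hi` + `mix` or as `him` + `ix`, which is the situation MD-make is described as rejecting.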
+ | | ||
+ | * '' | ||
+ | |||
+ | < | ||
+ | .md.rul: mst.conll < blind.conll mst | ||
+ | @echo Run the parser here. | ||
+ | </ | ||
+ | |||
+ | * An MD-rule must end with an empty line (even at the end of the file). | ||
+ | * MD-make will generate many normal rules from the multidimensional rule. In these generated rules, all combinations of all values in all affected dimensions will appear. Because these rules are no longer templates, we need not fear that GNU make will run into cyclic dependencies or other problems. For instance, the above multidimensional rule yields the following normal rules, among others: | ||
+ | |||
+ | < | ||
+ | @echo Run the parser here. | ||
+ | cs/ | ||
+ | @echo Run the parser here. | ||
+ | cs/ | ||
+ | @echo Run the parser here. | ||
+ | cs/ | ||
+ | @echo Run the parser here. | ||
+ | ... | ||
+ | en/ | ||
+ | @echo Run the parser here. | ||
+ | </ | ||
+ | |||
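The expansion of one multidimensional rule into many ordinary rules is essentially a Cartesian product over the dimension values. The Python sketch below illustrates the idea only. The path layout in `LAYOUT` is hypothetical (the actual `.MDIMS` line is not reproduced here), `LANGUAGES = cs en` is an assumed value, and every file type is assumed to exist in all four listed dimensions, which MD-make does not require.

```python
from itertools import product

# LANGUAGES is an assumed value; the other lists follow the definitions above.
DIMS = {
    "LANGUAGES": ["cs", "en"],
    "DE": ["d", "e"],
    "TRAINTEST": ["train", "test"],
    "PREPROCESSINGS": ["pre1", "pre2"],
}

# Hypothetical layout: (delimiter placed before the value, dimension name).
LAYOUT = [("", "LANGUAGES"), ("/", "DE"), ("", "TRAINTEST"), (".", "PREPROCESSINGS")]


def expand(target_type, source_types, dims, recipe):
    """Return one ordinary make rule per combination of dimension values."""
    rules = []
    names = [dim for _, dim in LAYOUT]
    for combo in product(*(dims[name] for name in names)):
        # Build the part of the path shared by the target and its sources.
        stem = "".join(delim + value for (delim, _), value in zip(LAYOUT, combo))
        target = f"{stem}.{target_type}"
        sources = " ".join(f"{stem}.{src}" for src in source_types)
        rules.append(f"{target}: {sources}\n\t{recipe}")
    return rules


if __name__ == "__main__":
    for rule in expand("mst.conll", ["blind.conll", "mst"], DIMS,
                       "@echo Run the parser here."):
        print(rule)
```

Because each generated rule names concrete files, GNU make sees no pattern to match recursively, which is the property the text above relies on.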
+ | * The following parameters can be supplied, too: | ||
+ | * The '' | ||
+ | * The '' | ||
+ | * If '' | ||
+ | * If '' | ||
+ | * If a source file requires a dimension not contained in the target file, and the dimension is not fixed, the rule will be generated for all values of this dimension. This means that there will be several competing rules for the same target file. | ||
+ | * Example: The rule defined above is intended for parsing, not training, so it should only operate on test conll files. We thus freeze the TRAINTEST dimension on the test value. | ||
+ | |||
+ | < | ||
+ | .md.rul: mst.conll < blind.conll mst | ||
+ | .md.for: LANGUAGES DE PREPROCESSINGS | ||
+ | .md.fix: test | ||
+ | @echo Run the parser here. | ||
+ | </ | ||
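Since every value belongs to exactly one dimension, a fixed value such as `test` is enough to identify the dimension it freezes. The following Python sketch shows how the set of generated combinations could shrink under `.md.for` and `.md.fix`. It assumes that `.md.for` names the dimensions to iterate over, which is how the example reads; the function names and the `LANGUAGES` value are hypothetical.

```python
from itertools import product

# LANGUAGES is an assumed value; the other lists follow the definitions above.
DIMS = {
    "LANGUAGES": ["cs", "en"],
    "DE": ["d", "e"],
    "TRAINTEST": ["train", "test"],
    "PREPROCESSINGS": ["pre1", "pre2"],
}


def owner_of(value, dims):
    """Find the single dimension that contains the given value."""
    matches = [dim for dim, values in dims.items() if value in values]
    if len(matches) != 1:
        raise ValueError(f"{value!r} does not identify exactly one dimension")
    return matches[0]


def combinations(for_dims, fixed_values, dims):
    """Yield one dict per generated rule: iterated dimensions vary, fixed ones stay constant."""
    fixed = {owner_of(value, dims): value for value in fixed_values}
    for combo in product(*(dims[dim] for dim in for_dims)):
        yield {**dict(zip(for_dims, combo)), **fixed}


if __name__ == "__main__":
    for combo in combinations(["LANGUAGES", "DE", "PREPROCESSINGS"], ["test"], DIMS):
        print(combo)
```

Under these assumptions the rule is generated for the combinations of languages, `DE` values and preprocessings only, each with `TRAINTEST` held at `test`.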
* What are the constraints for the values in the respective dimensions. (Standard way is the '' | * What are the constraints for the values in the respective dimensions. (Standard way is the '' | ||
- | * MD-make will generate many normal rules from the multidimensional rule. In these generated rules, all combinations of all values in all affected dimensions will appear. As these rules are not templatic any more, we don't have to fear that gnu make will encounter cyclic dependencies or other problems. | ||
* New variables '' | * New variables '' | ||
- | * A MD-rule ends obligatorily with an empty line (even at the end of the file). | ||
- | * If there is no parameter '' | ||
- | * The '' | ||
- | * If '' | ||
- | * If '' | ||
- | * If a source file requires a dimension not contained in the target file, and the dimension is not fixed, the rule will be generated for all values of this dimension. This means that there will be several competing rules for the same target file. | ||
* '' | * '' | ||
* '' | * '' | ||
Line 32: | Line 73: | ||
< | < | ||
- | .md.rul mst.conll < blind.conll mst | + | .md.rul: mst.conll < blind.conll mst |
- | .md.dep $(TOOLDIR)/ | + | .md.dep: $(TOOLDIR)/ |
.md.for: LANGUAGES DE PREPROCESSINGS | .md.for: LANGUAGES DE PREPROCESSINGS | ||
.md.fix: test | .md.fix: test | ||
Line 67: | Line 108: | ||
{{: | {{: | ||
+ | |||
+ | ===== Acknowledgements ===== | ||
+ | |||
+ | This research has been supported by grant no. MSM0021620838 of the Czech Ministry of Education. | ||
+ |