... | ... | @@ -44,9 +44,21 @@ This will create the basic structure of a new generator called *my-datagen* proj |
|
|
2. Specify the Data Generator Program
-------------------------------------
|
|
|
|
|
|
The *Myriad* toolkit promotes a general-purpose data generation model centered around the generation of pseudo-random sequences of user-defined domain types. To fully specify a *Myriad* data generator, the user must provide a family of *domain types* and an associated family of *pseudo-random domain type generators (PRDGs)*. At runtime, the PRDG functions are applied iteratively to generate the pseudo-random sequences of the corresponding domain types.
|
|
|
The *Myriad* toolkit promotes a general-purpose data generation model centered around the generation of pseudo-random sequences of user-defined domain types. To fully specify a *Myriad* data generator, the user must provide a family of *domain types* and an associated family of *pseudo-random domain type generators (PRDGs)*, which are essentially programs that map sequences of pseudo-random numbers to sequences of the user-defined domain types.
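To make the idea of a PRDG concrete, here is a minimal, self-contained C++ sketch. It does not use the *Myriad runtime library*: the `Customer` type, the `generateCustomer` and `generateSequence` functions, the seeding scheme, and the value ranges are all hypothetical and chosen only for illustration. It shows a function that maps a position in a seeded pseudo-random number sequence to a record of a user-defined domain type.

```cpp
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical domain type; not part of the Myriad runtime library.
struct Customer {
    std::uint64_t id;
    std::string   segment;
    int           age;
};

// Hypothetical PRDG: derives the record at `position` from pseudo-random
// numbers that depend only on (seed, position), so the same inputs always
// yield the same record.
Customer generateCustomer(std::uint64_t seed, std::uint64_t position) {
    static const std::vector<std::string> segments = {"retail", "business", "public"};

    // Assumption: a simple per-position seed mix, used here only for the sketch.
    std::mt19937_64 rng(seed ^ (position * 0x9E3779B97F4A7C15ULL));

    Customer c;
    c.id      = position;
    c.segment = segments[rng() % segments.size()];
    c.age     = 18 + static_cast<int>(rng() % 63);  // ages in [18, 80]
    return c;
}

// Applies the PRDG iteratively to produce a sequence of domain type records.
std::vector<Customer> generateSequence(std::uint64_t seed, std::uint64_t count) {
    std::vector<Customer> out;
    out.reserve(count);
    for (std::uint64_t i = 0; i < count; ++i) {
        out.push_back(generateCustomer(seed, i));
    }
    return out;
}
```

Calling `generateSequence(42, 5)` twice returns the same five records in this sketch, because each record depends only on the seed and its position.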
|
|
|
|
|
|
The specification can be implemented at one of two levels: as a high-level *XML specification of a data generator prototype*, or directly at the code level in one of the C++ classes extending the *Myriad runtime library*. The XML layer is ideal for rapid prototyping and probably sufficient for simple relational use cases, whereas code-level extensions are useful when tailor-made data generating logic is required.
|
|
|
The specification can be implemented at one of two levels: as a high-level *XML prototype specification*, or directly at the code level in one of the C++ classes extending the *Myriad runtime library*. The XML layer is the recommended entry point for new users, as it is well suited for rapid prototyping and is probably sufficient for simple relational use cases. Code-level extensions are an advanced feature, useful when tailor-made data generating logic is required; they are not discussed further in this section.
|
|
|
|
|
|
The XML specification for a *Myriad*-based data generator project is typically located under `src/config/${my_datagen}-prototype.xml`. To invoke the *Myriad prototype compiler*, run the *compile:prototype* task in the assistant CLI tool:
|
|
|
|
|
|
```bash
./myriad-assistant compile:prototype
```
|
|
|
|
|
|
If you are working from the `build` folder, you can use the corresponding make target as a shortcut instead:
|
|
|
|
|
|
```bash
make prototype
```
|
|
|
|
|
|
When the *Myriad prototype compiler* is invoked for the first time, it will generate three groups of C++ sources:
|
|
|
|
... | ... | @@ -54,9 +66,7 @@ When the *Myriad prototype compiler* is invoked for the first time, it will gene |
|
|
* (B) an associated family of PRDG functions (also called *setter chains*, located under *src/cpp/runtime/setter*), and
|
|
|
* (C) a generator configuration that reads the domains and distributions of the values required by the PRDG functions (located under *src/cpp/config*).
|
|
|
|
|
|
All sources are generated as a pair consisting of a main class and a corresponding base class located in the *base* sub-folder. All logic derived from the XML specification is contained in the base classes, while the main classes serve as extension points where specific base-class methods can be overridden. Subsequent invocations of the compiler do not touch existing main classes, so users can modify and re-compile the XML specification even after adding custom logic at the code level. Code-level extensions are therefore not an alternative to the XML specification but a complement to it. To keep the specification structure clear, we advise users to rely on the XML specification as much as possible and to fall back to code-level extensions only when they are absolutely necessary.
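The base class / main class split described above can be sketched in plain C++ as follows. This is not actual compiler output: the class names `BaseCustomerSetterChain` and `CustomerSetterChain`, the `segmentFor` method, and its rules are hypothetical stand-ins, assumed only to illustrate how a main class can override one base-class method while reusing the rest.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a generated base class. In the scheme described
// above, this file would be regenerated from the XML specification.
class BaseCustomerSetterChain {
public:
    virtual ~BaseCustomerSetterChain() = default;

    // Stands in for logic derived from the XML specification.
    virtual std::string segmentFor(std::uint64_t position) const {
        return position % 2 == 0 ? "retail" : "business";
    }
};

// Hypothetical stand-in for the main class. It is created once and then left
// untouched by later compiler runs, so custom logic placed here survives.
class CustomerSetterChain : public BaseCustomerSetterChain {
public:
    std::string segmentFor(std::uint64_t position) const override {
        if (position == 0) {
            return "internal";  // custom rule added at the code level
        }
        // Everything else falls back to the specification-derived logic.
        return BaseCustomerSetterChain::segmentFor(position);
    }
};
```

Because the override delegates to the base class for every other position, a change to the base-class logic still takes effect for those positions without any edit to the main class.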
|
|
|
|
|
|
You can find out more about the XML dialect supported by the *Myriad compiler* in the [XML Specification Reference Manual](/TU-Berlin-DIMA/myriad-toolkit/wiki/XML-Specification-Reference-Manual).
|
|
|
You can find out more about the *Myriad* XML specification language in the [XML Specification Reference Manual](/TU-Berlin-DIMA/myriad-toolkit/wiki/XML-Specification-Reference-Manual).
|
|
|
|
|
|
|
|
|
3. Build the Data Generator Binary
|
... | ... | |