... | ... | @@ -8,9 +8,9 @@ Myriad Data Generator Toolkit |
|
|
Core Features
-------------
|
|
|
|
|
|
|
|
|
The main functional advantage provided by the toolkit is the built-in parallelization support in all produced generators. Our parallelization approach maps fixed-size chunks from an underlying pseudo-random number generator (PRNG) to a pseudo-random sequence of typed records. The parallel execution model implemented by the toolkit relies on horizontal partitioning of all record sequences. The core runtime library implements the partitioning support generically through `skip-ahead` PRNG operations, which move each generator node directly to the starting position of its assigned record sequences.
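The partitioning idea can be illustrated with a minimal Python sketch. It is not the toolkit's implementation: the `SkipAheadLCG` class, the illustrative LCG constants, the chunk size of two PRNG values per record, and the `generate_partition` helper are all assumptions made for this example. It only shows how a logarithmic-time skip-ahead lets each node start at its own offset in one shared stream.

```python
M = 2**64
A = 6364136223846793005   # illustrative LCG multiplier (assumption)
C = 1442695040888963407   # illustrative LCG increment (assumption)
CHUNK = 2                 # PRNG values consumed per record (assumption)


class SkipAheadLCG:
    """Linear congruential generator with O(log k) skip-ahead."""

    def __init__(self, seed):
        self.state = seed % M

    def next(self):
        self.state = (A * self.state + C) % M
        return self.state

    def skip(self, k):
        # Compose the affine map x -> A*x + C with itself k times
        # by repeated squaring instead of stepping k times.
        acc_a, acc_c = 1, 0
        cur_a, cur_c = A, C
        while k > 0:
            if k & 1:
                acc_a, acc_c = (cur_a * acc_a) % M, (cur_a * acc_c + cur_c) % M
            cur_a, cur_c = (cur_a * cur_a) % M, (cur_a * cur_c + cur_c) % M
            k >>= 1
        self.state = (acc_a * self.state + acc_c) % M


def make_record(rng, seq):
    # Each record consumes exactly CHUNK values; the high bits are used
    # because the low bits of a power-of-two LCG are weak.
    return (seq, (rng.next() >> 32) % 100, (rng.next() >> 32) % 1000)


def generate_partition(seed, node, num_nodes, total):
    """Generate the contiguous slice of records assigned to one node."""
    start = node * total // num_nodes
    end = (node + 1) * total // num_nodes
    rng = SkipAheadLCG(seed)
    rng.skip(start * CHUNK)  # jump straight to this node's first record
    return [make_record(rng, seq) for seq in range(start, end)]
```

Under these assumptions, concatenating the output of `generate_partition(seed, n, 4, total)` for `n = 0..3` yields the same records as a single-node run, so the partitioning is invisible in the generated data.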
|
|
|
|
|
|
|
|
|
Moreover, the same technique enables efficient realization of a broad set of reference-based model restrictions, because the random values of each referenced record depend entirely on its sequence number and are therefore easy to re-compute. You can generate a set of `A`-records and a referencing set of `B`-records simply by sampling arbitrary `a` values from the `A`-sequence for each `b` -- regardless of how the sequences are currently partitioned. A restriction of the form `b.y := a.x` (used, for instance, to set a foreign key in `b`) can be implemented by locally re-computing the required value `a.x` from the position of the `A`-sample.
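A restriction such as `b.y := a.x` can be sketched as follows. This is a hedged illustration rather than the toolkit's code: the `lcg_at` helper, the seeds, the constants, and the record layout (one PRNG value per record) are hypothetical. The point is that `a.x` is a pure function of the position in the `A`-sequence, so any node can re-compute it locally without access to the generated `A`-records.

```python
M = 2**64
A_MUL = 6364136223846793005   # illustrative LCG multiplier (assumption)
C_INC = 1442695040888963407   # illustrative LCG increment (assumption)

SEED_A, SEED_B = 7, 11        # hypothetical per-sequence seeds
NUM_A = 1000                  # hypothetical size of the A-sequence


def lcg_at(seed, n):
    """Return the LCG state n steps after `seed`, computed in O(log n)."""
    a, c = 1, 0
    cur_a, cur_c = A_MUL, C_INC
    while n > 0:
        if n & 1:
            a, c = (cur_a * a) % M, (cur_a * c + cur_c) % M
        cur_a, cur_c = (cur_a * cur_a) % M, (cur_a * cur_c + cur_c) % M
        n >>= 1
    return (a * seed + c) % M


def a_x(pos):
    # Field x of the A-record at sequence position `pos`; it depends only
    # on the position, so it can be re-computed on any node.
    return (lcg_at(SEED_A, pos + 1) >> 32) % 10_000


def b_record(seq):
    # Sample a position in the A-sequence from B's own stream, then set
    # b.y := a.x by re-computing a.x for that position.
    a_pos = (lcg_at(SEED_B, seq + 1) >> 32) % NUM_A
    return {"seq": seq, "a_pos": a_pos, "y": a_x(a_pos)}
```

Because both `a_x` and `b_record` depend only on sequence numbers, the foreign key in each `b` stays consistent with the referenced `a` no matter which node generates either record.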
|
|
|
|
|
|
|
|
|
Extensible Architecture
|
... | ... | |