## About Oligos

*Oligos* is a profiling tool that extracts a generator specification from a given reference database, using the statistics stored by the database. It collects statistics such as table and column cardinalities, most frequent values, distribution statistics and many more, and uses this information to create a dense description of the profiled database. This description is then used to create a *Myriad* XML Specification.

<!--
For more information about Oligos go to the [project homepage](https://bitbucket.org/carabolic/oligos).
-->

## Using Oligos with Myriad

One way to generate a *Myriad* XML Specification is to profile a reference database with *Oligos*. To do so, run the `compile:oligos` task. The only prerequisite for this task is the JDBC driver for your database. There are two ways to add the JDBC driver to your *Myriad* project:

* add the path to the `$CLASSPATH` environment variable, or
* add the path to the `MYRIAD_OLIGOS_CP` property in the `$PROJECT-HOME/.myriad-settings` file.

After setting the path to the JDBC driver you can start using *Myriad* with *Oligos*. The basic syntax of the `compile:oligos` assistant task is as follows:

```
myriad-assistant compile:oligos -h [host] -P [port] -D [database] -u [username] -p [password] [schema]
```

In this command, `host` is the hostname of your database, `port` is the database port, `database` is the name of the database, and `username` and `password` are the credentials used for authentication. The `schema` argument states which tables and columns should be profiled and has the following syntax (in extended Backus-Naur form):

```bnf
SCHEMA            = SCHEMA_DEFINITION { "," SCHEMA_DEFINITION }
SCHEMA_DEFINITION = SCHEMA_ID [ "(" TABLE_SEQUENCE ")" ]
TABLE_SEQUENCE    = TABLE_DEFINITION { "," TABLE_DEFINITION }
TABLE_DEFINITION  = TABLE_ID [ "(" COLUMN_SEQUENCE ")" ]
COLUMN_SEQUENCE   = COLUMN_ID { "," COLUMN_ID }
```

A schema consists of at least a `SCHEMA_ID`, which is the name of the schema you want to profile. A `SCHEMA_ID` is followed by an optional, comma-separated sequence of `TABLE_DEFINITION`s enclosed in parentheses. Each `TABLE_DEFINITION` in turn contains a mandatory `TABLE_ID` (the table name) and an optional, parenthesized sequence of `COLUMN_ID`s. Omitting the `TABLE_SEQUENCE` or `COLUMN_SEQUENCE` clause is interpreted as a wildcard: all tables (respectively, all columns) are profiled.

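To make the wildcard rule concrete, here is a minimal Python sketch of a recursive-descent parser for the grammar above. It is not part of *Oligos* or *Myriad*. The function names (`tokenize`, `parse_schema`) and the identifier pattern (letters, digits, and underscores, with whitespace ignored) are assumptions made for illustration, since the grammar does not define what an identifier looks like. A value of `None` in the result stands for "profile everything at this level".

```python
import re

# Assumption: identifiers are letters, digits and underscores; whitespace is ignored.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[(),])")
PUNCTUATION = {"(", ")", ","}


def tokenize(text):
    """Split a schema argument into identifiers and the symbols ( ) ,"""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_schema(text):
    """Parse SCHEMA into {schema: None | {table: None | [columns]}}.

    None marks an omitted TABLE_SEQUENCE or COLUMN_SEQUENCE (wildcard).
    """
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(symbol):
        nonlocal pos
        if peek() != symbol:
            raise ValueError(f"expected {symbol!r}, found {peek()!r}")
        pos += 1

    def identifier():
        nonlocal pos
        token = peek()
        if token is None or token in PUNCTUATION:
            raise ValueError(f"expected an identifier, found {token!r}")
        pos += 1
        return token

    def sequence(item):
        # item { "," item }
        items = [item()]
        while peek() == ",":
            expect(",")
            items.append(item())
        return items

    def optional_group(body):
        # [ "(" body ")" ]; returns None when the group is omitted
        if peek() != "(":
            return None
        expect("(")
        result = body()
        expect(")")
        return result

    def table_definition():
        name = identifier()
        return name, optional_group(lambda: sequence(identifier))

    def schema_definition():
        name = identifier()
        return name, optional_group(lambda: dict(sequence(table_definition)))

    result = dict(sequence(schema_definition))
    if peek() is not None:
        raise ValueError(f"unexpected trailing input: {peek()!r}")
    return result


if __name__ == "__main__":
    # Illustrative argument: two columns of LINEITEM, all of ORDERS, all of SCHEMA_B.
    print(parse_schema("TPCH(LINEITEM(L_ORDERKEY, L_QUANTITY), ORDERS), SCHEMA_B"))
```

Under these assumptions, the argument `TPCH` alone parses to `{'TPCH': None}`, meaning every table and column of that schema, while `TPCH(ORDERS)` restricts profiling to all columns of a single table.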
<!-- graphical schema description
|
|
|
```
|
... | ... | @@ -45,6 +48,8 @@ SCHEMA_B, |

## Examples

Take a look at the following examples that illustrate the concrete syntax of the `compile:oligos` task.

### Reference Database

All of the following examples use the TPC-H schema. The schema is as follows:

... | ... | |