Add 0.01 percent sample of the Higgs data set

Created by: cviebig

The sampling was performed with a simple Stratosphere sampling job.

Use the following parameters to resample: RunSampling HIGGS.csv HIGGS-0.0001.csv 0.0001

Be aware that the results are currently not reproducible because the seed is not configurable. In a distributed environment with a varying number of parallel workers, a fixed seed alone is not enough: the pseudo-random number generator must also offer skipValue functionality, otherwise the result will still not be reproducible.
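
To illustrate the reproducibility point, here is a minimal Python sketch. It is not the code of the Stratosphere job. Instead of a generator with skipValue, it uses a different technique with the same effect: the keep/drop decision for each record is derived by hashing the seed together with the record's global index, so the decision does not depend on which worker processes the record. The names keep_record and sample_partitioned, the round-robin partitioning, and the use of SHA-256 are all assumptions made for this example.

```python
import hashlib
import struct


def keep_record(seed: int, index: int, fraction: float) -> bool:
    """Decide whether the record at global position `index` is sampled.

    The decision depends only on (seed, index), never on the worker.
    """
    digest = hashlib.sha256(struct.pack("<qq", seed, index)).digest()
    # Use the top 53 bits so the value is an exact float in [0, 1).
    u = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return u < fraction


def sample_partitioned(n_records: int, fraction: float, seed: int,
                       n_workers: int) -> list:
    """Simulate n_workers that each scan a round-robin slice of the input.

    Hypothetical stand-in for a distributed job; returns the sorted
    global indices of the sampled records.
    """
    kept = []
    for worker in range(n_workers):
        for index in range(worker, n_records, n_workers):
            if keep_record(seed, index, fraction):
                kept.append(index)
    return sorted(kept)


if __name__ == "__main__":
    one_worker = sample_partitioned(100_000, 0.0001, seed=42, n_workers=1)
    many_workers = sample_partitioned(100_000, 0.0001, seed=42, n_workers=8)
    print(one_worker == many_workers)
```

With a sequential generator, the same seed yields different samples as soon as the records are split differently across workers. Tying each decision to the record's global position is what a skip-ahead generator achieves as well, provided every worker knows the global index of the records it reads.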