Commit 414b361a authored by Carlos Muniz
STAN: Improve code for reproducibility

__pycache__/
*.py[cod]
*$py.class
.idea
.reuse
data/*/
.vscode/
is-it-worth-it.egg-info/
dist/
build/
*.egg-info/
is-it-worth-it/
is-it-worth-it-benchmark/
!is-it-worth-it-benchmark/conf/model/STAN.yaml
!is-it-worth-it-benchmark/conf/experiment/main/stan.yaml
!is-it-worth-it-benchmark/utsadbench/core/data/dataset.py
!is-it-worth-it-benchmark/utsadbench/data/ucr.py
!is-it-worth-it-benchmark/utsadbench/models/stan.py
# Fast, parameter-free time series anomaly detection
## The STAN Algorithm
STAN (**s**ummary s**ta**tistics e**n**semble) is a parameter-free time series anomaly detection algorithm. It is designed to be simple and fast at detecting the most anomalous region within a univariate time series.
The idea behind STAN is to slide a window over the time series, splitting it into non-overlapping subsequences whose length is derived from the periodic length of the series. STAN then applies a set of summary statistics to each subsequence. For our experiments, we used a set of 8 summary statistics (e.g., Min, Max, Mean, etc.), but feel free to experiment with other aggregate functions if you’d like. Based on the results of the summary statistics, STAN identifies the potentially most anomalous region within the time series and returns the index of this subsequence.
The source code can be found in the file STAN.py. Running this script will execute STAN on the UCR Anomaly Archive and will return:
- The anomaly detection accuracy (UCR-Score),
- The execution time,
- The individual anomaly detection accuracies of each summary statistic.
## Plots
Running the script Plots.py will generate the plots used in the paper. Unfortunately, this script does not reproduce the experiments, as the results were saved manually during the experiments. If you want to generate similar plots, ensure that the results are saved in the required format while running the experiments.
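If you do save your own results, here is a minimal sketch of writing per-anomaly-type accuracies to an Excel file that a script like Plots.py can read; the file name, sheet name, and numbers below are placeholders, not results from the paper:
```python
import pandas as pd

# Placeholder per-anomaly-type accuracies; real values come from the experiments.
results = {
    "STAN":   {"flat": 0.83, "noise": 0.74, "local_peak": 0.68},
    "MERLIN": {"flat": 0.67, "noise": 0.78, "local_peak": 0.72},
}

# Rows = anomaly types, columns = methods, matching the heatmap layout in Plots.py.
df = pd.DataFrame(results)
df.index.name = "Anomaly Type"
df.to_excel("results_by_anomaly_type.xlsx", sheet_name="Sheet1")
```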
## MERLIN
The time series discord is a popular indicator of an anomaly in a time series. One of the best algorithms for computing the time series discord is MERLIN. We ran MERLIN on the UCR Anomaly Archive twice:
1. Using a fixed subsequence size range.
2. Using the periodic length calculated from the highest_autocorrelation() function, which we also use in STAN.py.
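As a rough illustration of the idea behind highest_autocorrelation() (not the exact code in STAN.py), the periodic length can be taken as the lag of the strongest peak of the autocorrelation function:
```python
import numpy as np
from scipy.signal import find_peaks
from statsmodels.tsa.stattools import acf

def estimate_period(ts, max_lag=1000):
    # Autocorrelation up to max_lag, bounded by half the series length.
    corr = acf(ts, nlags=min(max_lag, len(ts) // 2), fft=True)
    # Local maxima of the ACF; the strongest one (excluding lag 0) is the period.
    peaks, _ = find_peaks(corr)
    if len(peaks) == 0:
        return max_lag  # fall back when the series shows no clear periodicity
    return int(peaks[np.argmax(corr[peaks])])
```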
Running the script MERLIN.py will install all dependencies and execute the algorithm MERLIN3_1.m (located in the UCR_Anomaly_Archive folder) on all time series from the UCR Anomaly Archive.
## Dependencies
- Python 3.x
- Required Python libraries: numpy, pandas, matplotlib
- MATLAB (for running MERLIN3_1.m)
# Fast, parameter-free time series anomaly detection
This repository contains the code to reproduce the results of the publication:
> Blagov, Kristiyan, Carlos Enrique Muñiz-Cuza, and Matthias Boehm. "Fast, Parameter-free Time Series Anomaly Detection." BTW 2025.
## The STAN Algorithm
STAN (**s**ummary s**ta**tistics e**n**semble) is a parameter-free time series anomaly detection algorithm. It is designed to be simple and fast at detecting the most anomalous region within a univariate time series.
The idea behind STAN is to slide a window over the time series, splitting it into non-overlapping subsequences whose length is derived from the periodic length of the series. STAN then applies a set of summary statistics to each subsequence. For our experiments, we used a set of 8 summary statistics (e.g., Min, Max, Mean, etc.), but feel free to experiment with other aggregate functions if you’d like. Based on the results of the summary statistics, STAN identifies the potentially most anomalous region within the time series and returns the index of this subsequence.
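To make this concrete, here is a heavily simplified sketch of the ensemble idea. It is not the code in `stan.py`; in particular, the aggregation below, where each statistic votes for the window whose value deviates most from the other windows, is our own simplification for illustration:
```python
import numpy as np
from collections import Counter

# Illustrative subset of summary statistics; the paper uses 8 of them.
STATS = {"min": np.min, "max": np.max, "mean": np.mean, "std": np.std}

def most_anomalous_window(ts, window):
    ts = np.asarray(ts, dtype=float)
    # Split the series into non-overlapping windows of the (periodic) length.
    n = len(ts) // window
    segments = ts[: n * window].reshape(n, window)

    votes = Counter()
    for name, fn in STATS.items():
        values = fn(segments, axis=1)
        # How far each window's statistic deviates from all windows (z-score).
        z = np.abs(values - values.mean()) / (values.std() + 1e-12)
        votes[int(np.argmax(z))] += 1

    # The window flagged by most statistics; return its start index.
    return votes.most_common(1)[0][0] * window
```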
The source code for STAN can be found in `stan.py` as a standalone implementation and in `is-it-worth-it-benchmark/utsadbench/models/stan.py` as part of the benchmark of Rewicki et al. (cited below). We slightly modified the `utsadbench` package to add STAN to the suite. Specifically, we changed the UCR core `TimeSeries` class to hold the start of the training split for each dataset; STAN needs this information, and the change has no impact on the rest of the baselines.
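For illustration only, here is a hypothetical container showing what such a class might look like once it carries the split information; the real class lives in `is-it-worth-it-benchmark/utsadbench/core/data/dataset.py` and differs in naming and fields:
```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TimeSeriesWithSplit:
    """Hypothetical sketch: a series plus the index where its training prefix ends."""
    values: np.ndarray
    train_end: int  # taken from the UCR Anomaly Archive filename

    @property
    def train(self) -> np.ndarray:
        return self.values[: self.train_end]

    @property
    def test(self) -> np.ndarray:
        return self.values[self.train_end:]
```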
To run STAN's standalone implementation, follow the instructions below:
### Install `UCR Anomaly Archive` and `MERLIN3.1`
To download the [UCR Anomaly Archive and MERLIN](https://www.cs.ucr.edu/%7Eeamonn/time_series_data_2018/), run:
```bash
chmod +x ./scripts/install_merlin_ucr.sh
./scripts/install_merlin_ucr.sh
```
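Each file in the archive encodes its train/test split and the labeled anomaly region in its filename: the last three underscore-separated numbers are the end of the training prefix and the begin/end of the anomaly. A small sketch of loading one series under that convention (the example path is only a placeholder):
```python
from pathlib import Path
import numpy as np

def load_ucr_series(path):
    # Filename convention: ..._<train_end>_<anomaly_begin>_<anomaly_end>.txt
    stem = Path(path).stem
    train_end, anomaly_begin, anomaly_end = (int(x) for x in stem.split("_")[-3:])
    values = np.loadtxt(path)
    return values, train_end, (anomaly_begin, anomaly_end)

# Example (placeholder file name):
# ts, train_end, anomaly = load_ucr_series(
#     "data/ucraa/001_UCR_Anomaly_DISTORTED1sddb40_35000_52000_52620.txt")
```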
### Run `STAN ALONE`
-------
We have prepared a Python script that runs STAN over the UCR Anomaly Archive and returns the UCR-Score, the execution time, and the distribution of anomalies detected per summary statistic. To run this script, you need to install a few dependencies:
```
numpy
scipy
statsmodels
scikit-learn
tqdm
```
We have prepared a file with these requirements at `requirements/stan_alone_requirements.txt`, so you can simply execute:
```bash
pip install -r requirements/stan_alone_requirements.txt
```
Once all dependencies are installed, simply run:
```bash
python stan.py
```
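The reported UCR-Score counts how many of the 250 time series are solved. As an illustration only (see `stan.py` for the exact rule used in our experiments), a commonly used formulation accepts a prediction that falls inside the labeled anomaly region extended by a small tolerance on each side:
```python
def ucr_hit(pred_idx, anomaly_begin, anomaly_end, tolerance=100):
    # Illustrative rule: 1 if the predicted index lies inside the labeled
    # anomaly region widened by `tolerance` points on both sides, else 0.
    # The tolerance value is an assumption; papers differ in the exact choice.
    return int(anomaly_begin - tolerance <= pred_idx <= anomaly_end + tolerance)

# The UCR-Score is then the fraction of datasets for which ucr_hit(...) == 1.
```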
### Run `MERLIN3.1 ALONE`
-------
To run MERLIN 3.1 as shared by the original authors, the MATLAB Engine API for Python is required. [Here](https://de.mathworks.com/help/matlab/matlab_external/install-the-matlab-engine-for-python.html) you can find the necessary installation instructions. Once installed, simply run:
```bash
python merlin.py
```
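For orientation, here is a rough sketch of how a MATLAB script such as `MERLIN3_1.m` can be driven from Python via the MATLAB Engine; the argument list of `MERLIN3_1` is only indicated schematically, so consult the `.m` file (and `merlin.py`) for the actual call:
```python
import matlab.engine  # requires the MATLAB Engine API for Python

eng = matlab.engine.start_matlab()
eng.addpath(".")  # directory containing MERLIN3_1.m

# Schematic only -- the real argument list and outputs of MERLIN3_1.m may differ:
# result = eng.MERLIN3_1(matlab.double(ts.tolist()), float(min_len), float(max_len))

eng.quit()
```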
#### Potential Issues with `MERLIN3.1`
MERLIN3_1.m uses the outdated functions `nanmean` and `nanstd` to compute the mean and standard deviation. If you are using R2020b or higher, you will need to change these calls to `mean(T, 'omitnan')` and `std(T, 'omitnan')` to run the script. We only tested the script on Windows 11.
### Run `STAN and ALL BASELINES`
-------
To run STAN against the rest of the baselines, we have included a configuration file to install the `utsadbench` library released as part of the paper:
> Ferdinand Rewicki, Joachim Denzler, and Julia Niebling. 2023. "Is It Worth It? Comparing Six Deep and Classical Methods for Unsupervised Anomaly Detection in Time Series" Applied Sciences 13, no. 3: 1778. https://doi.org/10.3390/app13031778
#### Install `UTSADBENCH`
We use [Anaconda](https://www.anaconda.com/download/) to create a dedicated environment in which to install and run `utsadbench`. We suggest using the script we provide at `install_utsadbench.sh` to install all dependencies:
```bash
chmod +x install_utsadbench.sh
./install_utsadbench.sh
```
To build the [MDI](https://cvjena.github.io/libmaxdiv/) algorithm, the following dependencies are needed:
* [gcc](https://gcc.gnu.org/install/), tested with g++ version 9.4.0
* [cmake](https://cmake.org/), tested with version 3.31.5
* [Eigen3](https://eigen.tuxfamily.org), tested with eigen3 version 3.4
* [pkg-config](https://www.freedesktop.org/wiki/Software/pkg-config/), tested with version 0.29.1
* [OpenMP](https://www.openmp.org/), tested with version 4.5
Once these dependencies are installed, run `utsadbench`'s script to install `libmaxdiv`:
```bash
cd is-it-worth-it-benchmark
chmod +x ./scripts/install_mdi.sh
./scripts/install_mdi.sh
```
#### Run `UTSADBENCH`
## License
Please refer to the `LICENSE.md` file for further information about how the content is licensed.
# The IDs of UCR Anomaly Archive time series sorted into arrays of corresponding anomaly type
amplitude_change = ['013', '014', '037', '042', '044', '053', '057', '066', '091', '100', '104', '121', '122', '145', '150', '152', '161', '165', '174', '199', '205', '206', '207', '215', '217']
flat = ['045', '078', '110', '153', '186', '236']
freq_change = ['023', '026', '032', '033', '034', '040', '048', '099', '101', '131', '134', '140', '141', '142', '148', '156', '202', '222', '223', '224', '227', '228', '229', '244', '245', '246', '247']
local_drop = ['005', '043', '054', '063', '077', '086', '092', '102', '106', '113', '151', '162', '171', '185', '194', '200', '232', '233', '237', '238']
local_peak = ['007', '021', '024', '025', '030', '049', '062', '064', '085', '089', '097', '115', '129', '132', '133', '138', '157', '170', '172', '193', '197', '234', '235', '239', '243']
missing_drop = ['002', '072', '180']
missing_peak = ['004', '019', '035', '036', '059', '060', '094', '112', '127', '143', '144', '167', '168', '248']
noise = ['003', '008', '027', '028', '029', '039', '056', '067', '068', '083', '095', '098', '107', '111', '116', '135', '136', '137', '147', '164', '175', '176', '191']
outlier_datasets = ['011', '012', '015', '018', '070', '071', '084', '119', '120', '123', '126', '178', '179', '192', '213', '216', '220', '226']
reverse = ['020', '022', '038', '052', '055', '065', '090', '103', '128', '130', '146', '160', '163', '173', '198', '201', '203', '209', '212', '225', '230', '242', '249']
reverse_horizontal = ['016', '017', '058', '096', '124', '125', '166', '231']
sampling_rate = ['050', '061', '105', '158', '169']
signal_shift = ['204']
smoothed_increase = ['241']
steep_increase = ['051', '159']
time_shift = ['069', '074', '075', '079', '080', '081', '082', '087', '088', '108', '177', '182', '183', '187', '188', '189', '190', '195', '196', '208']
time_warping = ['031', '076', '139', '184']
unusual_pattern = ['001', '006', '009', '010', '041', '046', '047', '073', '093', '109', '114', '117', '118', '149', '154', '155', '181', '210', '211', '214', '218', '219', '221', '240', '250']
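# Example helper (not part of the original file): invert the lists above into a
# lookup from time-series ID to anomaly type, e.g. to group per-dataset results
# by type as in the heatmaps of Plots.py.
ANOMALY_TYPE_BY_ID = {
    ts_id: type_name
    for type_name, ids in {
        "amplitude_change": amplitude_change, "flat": flat, "freq_change": freq_change,
        "local_drop": local_drop, "local_peak": local_peak, "missing_drop": missing_drop,
        "missing_peak": missing_peak, "noise": noise, "outlier": outlier_datasets,
        "reverse": reverse, "reverse_horizontal": reverse_horizontal,
        "sampling_rate": sampling_rate, "signal_shift": signal_shift,
        "smoothed_increase": smoothed_increase, "steep_increase": steep_increase,
        "time_shift": time_shift, "time_warping": time_warping,
        "unusual_pattern": unusual_pattern,
    }.items()
    for ts_id in ids
}
# e.g. ANOMALY_TYPE_BY_ID["045"] == "flat"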
#!/bin/bash
# We slightly modified this script to preserve MERLIN3_1.m and move it to the root directory
# SPDX-FileCopyrightText: 2022 German Aerospace Center (DLR e.V.), Ferdinand Rewicki
#
# SPDX-License-Identifier: Apache-2.0
DATADIR="data"
TARGET="ucraa"
# check if dataset already installed
if [ -d "$DATADIR/$TARGET" ]; then
printf "UCR Anomaly Archive already installed!\n"
exit 0
fi
# create dataroot dir if it does not exist
mkdir -p $DATADIR
# create tmp dir
mkdir -p tmp && cd tmp
wget https://www.cs.ucr.edu/~eamonn/time_series_data_2018/UCR_TimeSeriesAnomalyDatasets2021.zip
unzip UCR_TimeSeriesAnomalyDatasets2021.zip -d $TARGET
# move to data dir
mv ./ucraa/AnomalyDatasets_2021/UCR_TimeSeriesAnomalyDatasets2021/FilesAreInHere/UCR_Anomaly_FullData ../$DATADIR/$TARGET
# move Merlin to the root
mv ./ucraa/AnomalyDatasets_2021/UCR_TimeSeriesAnomalyDatasets2021/FilesAreInHere/introducingMERLIN/MERLIN3_1.m ../
# clean up
cd ..
rm -rf tmp
printf "UCR Anomaly Archive successfully installed in $DATADIR/$TARGET\n"
printf "MERLIN3_1.m successfully installed in the root directory\n"
#!/bin/bash
wget https://gitlab.com/dlr-dw/is-it-worth-it-benchmark/-/archive/main/is-it-worth-it-benchmark-main.zip
unzip is-it-worth-it-benchmark-main.zip
mv is-it-worth-it-benchmark-main is-it-worth-it-benchmark
rm is-it-worth-it-benchmark-main.zip
conda env create -f requirements/utsadbench_requirements.yml # Create the conda environment
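# Note: `conda activate` can fail in a non-interactive shell; if it does, run
# `source "$(conda info --base)/etc/profile.d/conda.sh"` first (depends on your conda setup).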
conda activate btw2025
pip install merlin@git+https://gitlab.com/dlr-dw/py-merlin.git
cd is-it-worth-it-benchmark
pip install -e . --no-deps # All dependencies are already installed in the conda environment
# Make sure to run the script install_merlin_ucr.sh first to copy the data to the correct location inside the is-it-worth-it-benchmark directory
cp -r ../data/ucraa/ data/ucraa/
cd ..
name: btw2025
channels:
- conda-forge
- defaults
dependencies:
- _libgcc_mutex=0.1=conda_forge
- _openmp_mutex=4.5=2_kmp_llvm
- absl-py=2.1.0=pyhd8ed1ab_1
- aiohappyeyeballs=2.4.6=pyhd8ed1ab_0
- aiohttp=3.11.12=py39h9399b63_0
- aiosignal=1.3.2=pyhd8ed1ab_0
- alembic=1.14.1=pyhd8ed1ab_0
- annotated-types=0.7.0=pyhd8ed1ab_1
- antlr-python-runtime=4.9.3=pyhd8ed1ab_1
- anyio=4.8.0=pyhd8ed1ab_0
- argon2-cffi=23.1.0=pyhd8ed1ab_1
- argon2-cffi-bindings=21.2.0=py39h8cd3c5a_5
- arrow=1.3.0=pyhd8ed1ab_1
- asttokens=3.0.0=pyhd8ed1ab_1
- astunparse=1.6.3=pyhd8ed1ab_3
- async-lru=2.0.4=pyhd8ed1ab_1
- async-timeout=5.0.1=pyhd8ed1ab_1
- attrs=25.1.0=pyh71513ae_0
- aws-c-auth=0.7.22=h96bc93b_2
- aws-c-cal=0.6.14=h88a6e22_1
- aws-c-common=0.9.19=h4ab18f5_0
- aws-c-compression=0.2.18=h83b837d_6
- aws-c-event-stream=0.4.2=ha47c788_12
- aws-c-http=0.8.1=h29d6fba_17
- aws-c-io=0.14.8=h21d4f22_5
- aws-c-mqtt=0.10.4=h759edc4_4
- aws-c-s3=0.5.9=h594631b_3
- aws-c-sdkutils=0.1.16=h83b837d_2
- aws-checksums=0.1.18=h83b837d_6
- aws-crt-cpp=0.26.9=he3a8b3b_0
- aws-sdk-cpp=1.11.329=hba8bd5f_3
- babel=2.17.0=pyhd8ed1ab_0
- bcrypt=4.2.1=py39he612d8f_0
- beautifulsoup4=4.13.3=pyha770c72_0
- bleach=6.2.0=pyh29332c3_4
- bleach-with-css=6.2.0=h82add2a_4
- blinker=1.9.0=pyhff2d567_0
- brotli=1.1.0=hb9d3cd8_2
- brotli-bin=1.1.0=hb9d3cd8_2
- brotli-python=1.1.0=py39hf88036b_2
- bzip2=1.0.8=h4bc722e_7
- c-ares=1.34.4=hb9d3cd8_0
- ca-certificates=2025.1.31=hbcca054_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cachetools=5.5.1=pyhd8ed1ab_0
- certifi=2025.1.31=py39h06a4308_0
- cffi=1.17.1=py39h15c3d72_0
- charset-normalizer=3.4.1=pyhd8ed1ab_0
- click=8.1.8=pyh707e725_0
- cloudpickle=3.0.0=py39h06a4308_0
- colorama=0.4.6=pyhd8ed1ab_1
- colorlog=5.0.1=py39h06a4308_1
- comm=0.2.2=pyhd8ed1ab_1
- contourpy=1.3.0=py39h74842e3_2
- cpython=3.9.21=py39hd8ed1ab_1
- cryptography=44.0.1=py39h7170ec2_0
- cuda-version=11.8=h70ddcb2_3
- cudatoolkit=11.8.0=h4ba93d1_13
- cudnn=8.9.7.29=hbc23b4c_3
- cycler=0.12.1=pyhd8ed1ab_1
- databricks-sdk=0.44.1=pyhd8ed1ab_0
- debugpy=1.8.12=py39hf88036b_0
- decorator=5.1.1=pyhd8ed1ab_1
- defusedxml=0.7.1=pyhd8ed1ab_0
- deprecated=1.2.18=pyhd8ed1ab_0
- dgl=2.3.0=cuda118py39h609a92b_0
- docker-py=7.1.0=pyhd8ed1ab_1
- entrypoints=0.4=pyhd8ed1ab_1
- exceptiongroup=1.2.2=pyhd8ed1ab_1
- executing=2.1.0=pyhd8ed1ab_1
- filelock=3.17.0=pyhd8ed1ab_0
- flask=3.1.0=pyhff2d567_0
- flatbuffers=24.3.25=h59595ed_0
- fonttools=4.56.0=py39h9399b63_0
- fqdn=1.5.1=pyhd8ed1ab_1
- freetype=2.12.1=h267a509_2
- frozenlist=1.5.0=py39h9399b63_1
- fsspec=2025.2.0=pyhd8ed1ab_0
- gast=0.6.0=pyhd8ed1ab_0
- gflags=2.2.2=h5888daf_1005
- giflib=5.2.2=hd590300_0
- gitdb=4.0.12=pyhd8ed1ab_0
- gitpython=3.1.44=pyhff2d567_0
- glog=0.7.1=hbabe93e_0
- gmp=6.3.0=hac33072_2
- gmpy2=2.1.5=py39h7196dd7_3
- google-auth=2.38.0=pyhd8ed1ab_0
- google-pasta=0.2.0=pyhd8ed1ab_2
- graphene=3.4.3=pyhd8ed1ab_1
- graphql-core=3.2.6=pyh29332c3_0
- graphql-relay=3.2.0=pyhd8ed1ab_1
- greenlet=3.1.1=py39hf88036b_1
- grpcio=1.62.2=py39h174d805_0
- gunicorn=23.0.0=py39hf3d152e_1
- h11=0.14.0=pyhd8ed1ab_1
- h2=4.2.0=pyhd8ed1ab_0
- h5py=3.13.0=nompi_py39h30a5a8d_100
- hdf5=1.14.3=nompi_hdf9ad27_105
- hpack=4.1.0=pyhd8ed1ab_0
- httpcore=1.0.7=pyh29332c3_1
- httpx=0.28.1=pyhd8ed1ab_0
- hydra-core=1.3.2=pyhd8ed1ab_1
- hydra-submitit-launcher=1.2.0=pyhd8ed1ab_0
- hyperframe=6.1.0=pyhd8ed1ab_0
- icu=73.2=h59595ed_0
- idna=3.10=pyhd8ed1ab_1
- importlib-metadata=8.6.1=pyha770c72_0
- importlib-resources=6.5.2=pyhd8ed1ab_0
- importlib_resources=6.5.2=pyhd8ed1ab_0
- ipykernel=6.29.5=pyh3099207_0
- ipython=8.18.1=pyh707e725_3
- isoduration=20.11.0=pyhd8ed1ab_1
- itsdangerous=2.2.0=pyhd8ed1ab_1
- jedi=0.19.2=pyhd8ed1ab_1
- jinja2=3.1.5=pyhd8ed1ab_0
- joblib=1.4.2=pyhd8ed1ab_1
- json5=0.10.0=pyhd8ed1ab_1
- jsonpointer=3.0.0=py39hf3d152e_1
- jsonschema=4.23.0=pyhd8ed1ab_1
- jsonschema-specifications=2024.10.1=pyhd8ed1ab_1
- jsonschema-with-format-nongpl=4.23.0=hd8ed1ab_1
- jupyter-lsp=2.2.5=pyhd8ed1ab_1
- jupyter_client=8.6.3=pyhd8ed1ab_1
- jupyter_core=5.7.2=pyh31011fe_1
- jupyter_events=0.12.0=pyh29332c3_0
- jupyter_server=2.15.0=pyhd8ed1ab_0
- jupyter_server_terminals=0.5.3=pyhd8ed1ab_1
- jupyterlab=4.3.5=pyhd8ed1ab_0
- jupyterlab_pygments=0.3.0=pyhd8ed1ab_2
- jupyterlab_server=2.27.3=pyhd8ed1ab_1
- keras=3.8.0=pyh753f3f9_0
- kernel-headers_linux-64=3.10.0=he073ed8_18
- keyutils=1.6.1=h166bdaf_0
- kiwisolver=1.4.7=py39h74842e3_0
- krb5=1.21.3=h659f571_0
- lcms2=2.16=hb7c19ff_0
- ld_impl_linux-64=2.40=h12ee557_0
- lerc=4.0.0=h27087fc_0
- libabseil=20240116.2=cxx17_he02047a_1
- libaec=1.1.3=h59595ed_0
- libarrow=16.1.0=hcb6531f_6_cpu
- libarrow-acero=16.1.0=hac33072_6_cpu
- libarrow-dataset=16.1.0=hac33072_6_cpu
- libarrow-substrait=16.1.0=h7e0c224_6_cpu
- libblas=3.9.0=28_h59b9bed_openblas
- libbrotlicommon=1.1.0=hb9d3cd8_2
- libbrotlidec=1.1.0=hb9d3cd8_2
- libbrotlienc=1.1.0=hb9d3cd8_2
- libcblas=3.9.0=28_he106b2a_openblas
- libcrc32c=1.1.2=h9c3ff4c_0
- libcurl=8.8.0=hca28451_1
- libdeflate=1.20=hd590300_0
- libedit=3.1.20191231=he28a2e2_2
- libev=4.33=hd590300_2
- libevent=2.1.12=hf998b51_1
- libffi=3.4.4=h6a678d5_1
- libgcc=14.2.0=h77fa898_1
- libgcc-ng=14.2.0=h69a702a_1
- libgfortran=14.2.0=h69a702a_1
- libgfortran-ng=14.2.0=h69a702a_1
- libgfortran5=14.2.0=hd5240d6_1
- libgomp=14.2.0=h77fa898_1
- libgoogle-cloud=2.24.0=h2736e30_0
- libgoogle-cloud-storage=2.24.0=h3d9a0c8_0
- libgrpc=1.62.2=h15f2491_0
- libhwloc=2.11.2=default_h0d58e46_1001
- libjpeg-turbo=3.0.0=hd590300_1
- liblapack=3.9.0=28_h7ac8fdf_openblas
- liblapacke=3.9.0=28_he2f377e_openblas
- libmagma=2.7.2=h09b5827_2
- libmagma_sparse=2.7.2=h09b5827_3
- libnghttp2=1.58.0=h47da74e_1
- libopenblas=0.3.28=pthreads_h94d23a6_1
- libparquet=16.1.0=h6a7eafb_6_cpu
- libpng=1.6.43=h2797004_0
- libprotobuf=4.25.3=h08a7969_0
- libre2-11=2023.09.01=h5a48ba9_2
- libsodium=1.0.20=h4ab18f5_0
- libsqlite=3.46.0=hde9e2c9_0
- libssh2=1.11.0=h0841786_0
- libstdcxx=14.2.0=hc0a3c3a_1
- libstdcxx-ng=14.2.0=h4852527_1
- libthrift=0.19.0=hb90f79a_1
- libtiff=4.6.0=h1dd3fc0_3
- libtorch=2.3.1=cuda118_h7aef8b2_300
- liburing=2.7=h434a139_0
- libutf8proc=2.8.0=hf23e847_1
- libuv=1.50.0=hb9d3cd8_0
- libwebp-base=1.5.0=h851e524_0
- libxcb=1.15=h0b41bf4_0
- libxml2=2.13.5=hfdd30dd_0
- libzlib=1.2.13=h4ab18f5_6
- llvm-openmp=19.1.7=h024ca30_0
- lz4-c=1.9.4=hcb278e6_0
- mako=1.3.9=pyhd8ed1ab_0
- markdown=3.6=pyhd8ed1ab_0
- markdown-it-py=3.0.0=pyhd8ed1ab_1
- markupsafe=3.0.2=py39h9399b63_1
- matplotlib-base=3.9.4=py39h16632d1_0
- matplotlib-inline=0.1.7=pyhd8ed1ab_1
- mdurl=0.1.2=pyhd8ed1ab_1
- metis=5.1.1=h59595ed_2
- mistune=3.1.2=pyhd8ed1ab_0
- mkl=2023.2.0=h84fe81f_50496
- ml_dtypes=0.3.2=py39hddac248_0
- mlflow=2.20.2=hf3d152e_0
- mlflow-skinny=2.20.2=py39hf3d152e_0
- mlflow-ui=2.20.2=py39hf3d152e_0
- mpc=1.3.1=h24ddda3_1
- mpfr=4.2.1=h90cbb55_3
- mpmath=1.3.0=pyhd8ed1ab_1
- multidict=6.1.0=py39h9399b63_2
- munkres=1.1.4=pyh9f0ad1d_0
- namex=0.0.8=pyhd8ed1ab_1
- narwhals=1.27.1=pyhd8ed1ab_0
- nbclient=0.10.2=pyhd8ed1ab_0
- nbconvert-core=7.16.6=pyh29332c3_0
- nbformat=5.10.4=pyhd8ed1ab_1
- nccl=2.25.1.1=h03a54cd_0
- ncurses=6.4=h6a678d5_0
- nest-asyncio=1.6.0=pyhd8ed1ab_1
- networkx=3.2.1=pyhd8ed1ab_0
- notebook-shim=0.2.4=pyhd8ed1ab_1
- numpy=1.26.4=py39h474f0d3_0
- omegaconf=2.3.0=pyhd8ed1ab_0
- openjpeg=2.5.2=h488ebb8_0
- openssl=3.4.1=h7b32b05_0
- opentelemetry-api=1.16.0=pyhd8ed1ab_0
- opentelemetry-sdk=1.16.0=pyhd8ed1ab_0
- opentelemetry-semantic-conventions=0.37b0=pyhd8ed1ab_0
- opt_einsum=3.4.0=pyhd8ed1ab_1
- optree=0.14.0=py39h74842e3_1
- orc=2.0.1=h17fec99_1
- overrides=7.7.0=pyhd8ed1ab_1
- packaging=24.2=py39h06a4308_0
- pandas=2.2.3=py39h3b40f6f_2
- pandocfilters=1.5.0=pyhd8ed1ab_0
- paramiko=3.5.1=pyhd8ed1ab_0
- parso=0.8.4=pyhd8ed1ab_1
- patsy=1.0.1=pyhd8ed1ab_1
- pexpect=4.9.0=pyhd8ed1ab_1
- pickleshare=0.7.5=pyhd8ed1ab_1004
- pillow=10.3.0=py39h90c7501_0
- pip=25.0=py39h06a4308_0
- pkgutil-resolve-name=1.3.10=pyhd8ed1ab_2
- platformdirs=4.3.6=pyhd8ed1ab_1
- plotly=6.0.0=pyhd8ed1ab_0
- prometheus_client=0.21.1=pyhd8ed1ab_0
- prometheus_flask_exporter=0.23.1=pyhd8ed1ab_1
- prompt-toolkit=3.0.50=pyha770c72_0
- propcache=0.2.1=py39h9399b63_1
- protobuf=4.25.3=py39hbb4dce6_1
- psutil=6.1.1=py39h8cd3c5a_0
- pthread-stubs=0.4=hb9d3cd8_1002
- ptyprocess=0.7.0=pyhd8ed1ab_1
- pure_eval=0.2.3=pyhd8ed1ab_1
- pyarrow=16.1.0=py39h8003fee_1
- pyarrow-core=16.1.0=py39h176f5a7_1_cpu
- pyasn1=0.6.1=pyhd8ed1ab_2
- pyasn1-modules=0.4.1=pyhd8ed1ab_1
- pycparser=2.22=pyh29332c3_1
- pydantic=2.10.6=pyh3cfb1c2_0
- pydantic-core=2.27.2=py39he612d8f_0
- pygments=2.19.1=pyhd8ed1ab_0
- pynacl=1.5.0=py39h8cd3c5a_4
- pyopenssl=25.0.0=pyhd8ed1ab_0
- pyparsing=3.2.1=pyhd8ed1ab_0
- pysocks=1.7.1=pyha55dd90_7
- python=3.9.21=he870216_1
- python-dateutil=2.9.0.post0=pyhff2d567_1
- python-fastjsonschema=2.21.1=pyhd8ed1ab_0
- python-flatbuffers=25.2.10=pyhbc23db3_0
- python-json-logger=2.0.7=pyhd8ed1ab_0
- python-tzdata=2025.1=pyhd8ed1ab_0
- python_abi=3.9=2_cp39
- pytz=2024.1=pyhd8ed1ab_0
- pyu2f=0.1.5=pyhd8ed1ab_1
- pywin32-on-windows=0.1.0=pyh1179c8e_3
- pyyaml=6.0.2=py39h5eee18b_0
- pyzmq=26.2.1=py39h4e4fb57_0
- qhull=2020.2=h434a139_5
- querystring_parser=1.2.4=pyhd8ed1ab_2
- re2=2023.09.01=h7f4b329_2
- readline=8.2=h5eee18b_0
- referencing=0.36.2=pyh29332c3_0
- requests=2.32.3=pyhd8ed1ab_1
- rfc3339-validator=0.1.4=pyhd8ed1ab_1
- rfc3986-validator=0.1.1=pyh9f0ad1d_0
- rich=13.9.4=pyhd8ed1ab_1
- rpds-py=0.22.3=py39he612d8f_0
- rsa=4.9=pyhd8ed1ab_1
- s2n=1.4.15=he19d79f_0
- scikit-learn=1.6.1=py39h4b7350c_0
- scipy=1.13.1=py39haf93ffa_0
- seaborn=0.13.2=hd8ed1ab_3
- seaborn-base=0.13.2=pyhd8ed1ab_3
- send2trash=1.8.3=pyh0d859eb_1
- setuptools=75.8.0=py39h06a4308_0
- six=1.17.0=pyhd8ed1ab_0
- sleef=3.8=h1b44611_0
- smmap=5.0.2=pyhd8ed1ab_0
- snappy=1.2.1=h8bd8927_1
- sniffio=1.3.1=pyhd8ed1ab_1
- soupsieve=2.5=pyhd8ed1ab_1
- sqlalchemy=2.0.38=py39h8cd3c5a_0
- sqlite=3.45.3=h5eee18b_0
- sqlparse=0.5.3=pyhd8ed1ab_0
- stack_data=0.6.3=pyhd8ed1ab_1
- statsmodels=0.14.4=py39hf3d9206_0
- submitit=1.5.2=pyhd8ed1ab_0
- sysroot_linux-64=2.17=h0157908_18
- tbb=2021.13.0=hceb3a55_1
- tensorboard=2.16.2=pyhd8ed1ab_0
- tensorboard-data-server=0.7.0=py39h7170ec2_2
- tensorflow=2.16.1=cpu_py39hc1df215_0
- tensorflow-base=2.16.1=cpu_py39had76461_0
- tensorflow-estimator=2.16.1=cpu_py39he137130_0
- termcolor=2.5.0=pyhd8ed1ab_1
- terminado=0.18.1=pyh0d859eb_0
- threadpoolctl=3.5.0=pyhc1e730c_0
- tinycss2=1.4.0=pyhd8ed1ab_0
- tk=8.6.14=h39e8969_0
- tomli=2.2.1=pyhd8ed1ab_1
- torchdata=0.7.1=py39hc552c7e_6
- tornado=6.4.2=py39h8cd3c5a_0
- tqdm=4.67.1=pyhd8ed1ab_1
- traitlets=5.14.3=pyhd8ed1ab_1
- types-python-dateutil=2.9.0.20241206=pyhd8ed1ab_0
- typing-extensions=4.12.2=py39h06a4308_0
- typing_extensions=4.12.2=py39h06a4308_0
- typing_utils=0.1.0=pyhd8ed1ab_1
- tzdata=2025a=h04d1e81_0
- unicodedata2=16.0.0=py39h8cd3c5a_0
- uri-template=1.3.0=pyhd8ed1ab_1
- urllib3=2.3.0=pyhd8ed1ab_0
- wcwidth=0.2.13=pyhd8ed1ab_1
- webcolors=24.11.1=pyhd8ed1ab_0
- webencodings=0.5.1=pyhd8ed1ab_3
- websocket-client=1.8.0=pyhd8ed1ab_1
- werkzeug=3.1.3=pyhd8ed1ab_1
- wheel=0.45.1=py39h06a4308_0
- wrapt=1.17.2=py39h8cd3c5a_0
- xorg-libxau=1.0.12=hb9d3cd8_0
- xorg-libxdmcp=1.1.5=hb9d3cd8_0
- xz=5.6.4=h5eee18b_1
- yaml=0.2.5=h7b6447c_0
- yarl=1.18.3=py39h9399b63_1
- zeromq=4.3.5=h3b0a872_7
- zipp=3.21.0=pyhd8ed1ab_1
- zlib=1.2.13=h4ab18f5_6
- zstandard=0.23.0=py39h08a7858_1
- zstd=1.5.6=ha6fb4c9_0
- pip:
- autopage==0.5.2
- cliff==4.8.0
- cmaes==0.11.1
- cmd2==2.5.11
- hydra-optuna-sweeper==1.2.0
- nvidia-cublas-cu12==12.4.5.8
- nvidia-cuda-cupti-cu12==12.4.127
- nvidia-cuda-nvrtc-cu12==12.4.127
- nvidia-cuda-runtime-cu12==12.4.127
- nvidia-cudnn-cu12==9.1.0.70
- nvidia-cufft-cu12==11.2.1.3
- nvidia-curand-cu12==10.3.5.147
- nvidia-cusolver-cu12==11.6.1.9
- nvidia-cusparse-cu12==12.3.1.170
- nvidia-cusparselt-cu12==0.6.2
- nvidia-nccl-cu12==2.21.5
- nvidia-nvjitlink-cu12==12.4.127
- nvidia-nvtx-cu12==12.4.127
- optuna==2.10.1
- pbr==6.1.1
- prettytable==3.14.0
- pyperclip==1.9.0
- rrcf==0.4.4
- stevedore==5.4.0
- sympy==1.13.1
- torch==2.6.0
- torchaudio==2.6.0
- torchvision==0.21.0
- triton==3.2.0
# -*- coding: utf-8 -*-
"""
Created on Tue Jan 14 21:43:36 2025
@author: Kristiyan
"""
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.colors import ListedColormap
custom_palette = sns.color_palette("light:#5A9")
custom_cmap = ListedColormap(custom_palette)
# Figure 3
palette = sns.light_palette("#5A9", n_colors=12)
reversed_palette = list(reversed(palette))
aggfuncdict = {"MERLIN (fixed)": 63.6, "STAN": 60.4, "MERLIN (ACF)": 55.2, "MDI": 47.0, "GANF": 43.0, "AE": 28.0, "TranAD": 16.0, "STUMPY (ACF)": 10.4, "STUMPY (fixed)": 6.8, "RRCF": 3.0}
df123 = pd.DataFrame(list(aggfuncdict.items()), columns=['Aggregate Function', 'Accuracy'])
plt.figure(figsize=(10, 5))
sns.barplot(x='Aggregate Function', y='Accuracy', palette = reversed_palette, data=df123)
plt.xticks(fontsize = 11.5)
plt.xlabel('')
plt.ylabel('UCR Score', fontsize = 11.5)
plt.xticks(rotation=45)
plt.show()
# Figure 4
data = pd.read_excel('SAFE-AD&MERLIN_AnomalyTypes_NEW.xlsx', sheet_name='Sheet1')
datanew = data.set_index(data.columns[0])
dfff = datanew * 100
dfff = dfff.round(0)
plt.figure(figsize=(4, 7))
hi = sns.heatmap(dfff, annot=True, fmt='g', cmap=custom_cmap, cbar_kws={"location": "right"})
hi.set_ylabel('Time Series Anomaly Type')
plt.show()
# Figure 5
data = pd.read_excel('heatmat_functions_new.xlsx', sheet_name='Sheet1')
datanew = data.set_index(data.columns[0]) # Set the first column as index
dfff = datanew * 100
dfff = dfff.round(0)
plt.figure(figsize=(7, 9))
hi = sns.heatmap(dfff, annot=True, fmt='g', cmap=custom_cmap, cbar_kws={"location": "right"})
plt.xticks(rotation=40, ha = "right")
hi.set_ylabel('Time Series Anomaly Type')
plt.title('Anomaly Detection Accuracy')
plt.show()
# Figure 6
palette = sns.light_palette("#5A9", n_colors=12)
reversed_palette = list(reversed(palette))
aggfuncdict = {"SD":43.2,"+Min":52,"+Max": 52,"+Kurtosis":52,"+Skewness":54,"+PointAnomaly":55.6,"+Truning Points":58.4,"+Mean":60.4}
df123 = pd.DataFrame(list(aggfuncdict.items()), columns=['Aggregate Function', 'Accuracy'])
plt.figure()
sns.barplot(x='Aggregate Function', y='Accuracy', palette = reversed_palette, data=df123)
plt.xticks()
plt.xlabel('')
plt.ylabel('UCR Score')
plt.xticks(rotation=45)
plt.show()
# Figure 7a)
palette = sns.light_palette("#5A9", n_colors=12)
reversed_palette = list(reversed(palette))
data = {"STAN": 1.96, "STUMPY (fixed)": 45.0, "STUMPY (ACF)": 53.0, "MDI": 74.0, "MERLIN (fixed)": 82.0, "MERLIN (ACF)": 94.0, "GANF": 96.0, "TranAD": 109.0, "AE": 149.0, "RRCF": 162.0}
df123 = pd.DataFrame(list(data.items()), columns=['Algorithm', 'Runtime'])
plt.figure(figsize=(7, 5))
sns.barplot(x="Algorithm", y="Runtime", palette=reversed_palette, data=df123)
plt.ylabel('Runtime (s)')
plt.xlabel('')
plt.xticks(ha = 'center', rotation = 45)
plt.show()
#Figure 7b)
palette = sns.light_palette("#5A9", n_colors=12)
reversed_palette = list(reversed(palette))
aggfuncdict = {"SD":80, "+Min":90, "+Max":120, "+Kurtosis":190, "+Skewness":250, "+PointAnomaly":400, "+Truning Points":430, "+Mean":490}
df123 = pd.DataFrame(list(aggfuncdict.items()), columns=['Aggregate Function', 'Accuracy'])
plt.figure()
sns.barplot(x='Aggregate Function', y='Accuracy', palette = reversed_palette, data=df123)
plt.xticks()
plt.xlabel('')
plt.ylabel('Runtime (s)')
plt.xticks(rotation=45)
plt.show()
# Figure 8
palette = sns.light_palette("#5A9", n_colors=4)
reversed_palette = list(reversed(palette))
data = {"ACF":60.4, "MWF": 54.8, "FFT":48}
df123 = pd.DataFrame(list(data.items()), columns=['Method', 'Accuracy'])
plt.figure(figsize=(5, 3))
sns.barplot(x="Method", y = "Accuracy", palette = reversed_palette, data = df123)
plt.xlabel("")
plt.ylabel('UCR Score', fontsize = 13)
plt.savefig("window-size-comparison.pdf", format = "pdf")
plt.show()