PPDS Presentation
Deadline for submission: July 7th, EOD
This repository will contain all code required for the presentation, including code for gathering, processing and plotting benchmark data.
Presentation
Outline
This is a WIP proposal; please change it as you see fit.
- Overview
  - Briefly: work steps and results
  - For this, set up criterion with the naive, simple, and adaptive compilers and show the progress in graphs
- Learnings
  - What did the implementation show?
    - What worked?
      - Query rewriting (effect of the Fuse and Predicate passes)
    - What didn't?
      - Portable SIMD (microarchitecture problems)
      - ONC
Backup slides
- Test system specs
- Interfaces & constraints from Rust
Notes on building
This uses a derivative of my LaTeX flake.
- to get a continuous preview (you need to open the PDF yourself):
nix run .#preview
- to build a final PDF:
nix run .#build
- to compress a PDF from .#build with ghostscript:
nix run .#compress
- to clean up build artifacts:
nix run .#clean
You may use it, or your own TeXLive instance, to build the presentation. Please commit an up-to-date PDF build of the presentation along with any TeX changes in each commit.
Benchmarks
We have two days to collect and plot the benchmark data. Currently outstanding:
- What do we want to benchmark? Which queries are we evaluating?
  - Effect of SIMD (on/off tests with the simple vs. the adaptive compiler)
  - Effect of query rewriting (a/b test for some queries)
  - Progress in terms of perf, naive -> simple -> adaptive (SIMD)
- How do we achieve this?
  - Effect of SIMD: tests with the simple and adaptive compilers on a subset of the benchmark queries, e.g., showing the worst and best case?
  - Effect of query rewriting:
    - Show the difference between fused scans and normal scans
    - Show the difference with the predicate rewrite and NOP push-up
  - Progress in terms of perf:
    - Hook up the benchmarks and criterion, run them, and aggregate the data
  - Collect benchmark data:
    - Run for each cluster (see the submission sketch after this list):
      - Cascade Lake AP: standard96:test
      - Genoa: genoa-cpu:all
      - Sapphire Rapids: gpu-pvc:all
      - Ice Lake: gpu-a100:all
  - Postprocess and aggregate:
    - Extract the data from the logs
    - Parse it and dump it into some sort of container (polars or duckdb); see the aggregation sketch below
  - Plot out the aggregated data:
    - Likely plotly in a Jupyter notebook, then dump to PNGs (see the plotting sketch below)
    - We need to do this manually, but we can emulate criterion's plots
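A possible helper for the per-cluster runs above: a minimal sketch that assumes the targets listed are SLURM partitions we submit to via sbatch, and that a hypothetical job.sh batch script runs the benchmark suite on the node.

```python
# Minimal submission sketch (assumptions: the cluster targets are SLURM
# partitions, and a hypothetical job.sh batch script runs the benchmarks).
import subprocess

PARTITIONS = {
    "cascade-lake-ap": "standard96:test",
    "genoa": "genoa-cpu:all",
    "sapphire-rapids": "gpu-pvc:all",
    "icelake": "gpu-a100:all",
}

for label, partition in PARTITIONS.items():
    # One job per microarchitecture; logs land in per-cluster files for later parsing.
    subprocess.run(
        [
            "sbatch",
            f"--partition={partition}",
            f"--job-name=ppds-bench-{label}",
            f"--output=logs/{label}-%j.out",
            "job.sh",
        ],
        check=True,
    )
```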
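For the postprocessing step, a minimal aggregation sketch. It assumes we copy criterion's default output directories (target/criterion/**/new/estimates.json) back from each cluster into a results/<cluster>/ folder; if we end up parsing the textual job logs instead, only the extraction part changes.

```python
# Minimal aggregation sketch (assumption: criterion's default output layout,
# i.e. **/new/estimates.json with point estimates in nanoseconds).
import json
from pathlib import Path

import polars as pl


def collect_estimates(criterion_dir: Path, cluster: str) -> pl.DataFrame:
    """Collect criterion's mean/std-dev point estimates into one frame."""
    rows = []
    for estimates in criterion_dir.glob("**/new/estimates.json"):
        data = json.loads(estimates.read_text())
        rows.append({
            "cluster": cluster,
            # benchmark id = the path between the criterion dir and new/estimates.json
            "benchmark": "/".join(estimates.relative_to(criterion_dir).parts[:-2]),
            "mean_ns": data["mean"]["point_estimate"],
            "stddev_ns": data["std_dev"]["point_estimate"],
        })
    return pl.DataFrame(rows)


if __name__ == "__main__":
    # Hypothetical layout: results/<cluster>/criterion/... copied back from the runs.
    frames = [
        collect_estimates(d / "criterion", d.name) for d in Path("results").iterdir()
    ]
    pl.concat(frames).write_parquet("benchmarks.parquet")
```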
Contains the code used for the benchmarks and (TBD) the notebooks used for generating the plots.
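For the plotting step, a minimal sketch of the kind of notebook cell we could use. It assumes the benchmarks.parquet file and column names from the aggregation sketch above; PNG export via plotly requires the kaleido package.

```python
# Minimal plotting sketch (assumes benchmarks.parquet from the aggregation
# sketch above; fig.write_image needs the kaleido package installed).
import plotly.express as px
import polars as pl

df = pl.read_parquet("benchmarks.parquet")

fig = px.bar(
    df.to_pandas(),            # plotly express works on pandas-style frames
    x="benchmark",
    y="mean_ns",
    color="cluster",
    barmode="group",
    error_y="stddev_ns",
    title="Mean runtime per benchmark and cluster (ns)",
)
fig.write_image("mean_runtime.png")
```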