Aaron Jarmusch

Ph.D. Student · HPC & GPU Systems

About

I'm a Computer Science Ph.D. student at the University of Delaware in the Computational Research Programming Lab (CRPL), advised by Dr. Sunita Chandrasekaran.

My research builds analytical performance models for GPU computing and validation infrastructure for HPC compilers. I characterize new architectures (NVIDIA Blackwell & Hopper, AMD MI300A) with targeted microbenchmarks, and I ship CI/CD pipelines for LLVM-based compilers across H100, GH200, and MI300A nodes.

I also apply LLMs to compiler validation: automating test generation and using an LLM-as-a-judge to grade the resulting tests.

Recent News

[May 2026] Joining Lawrence Livermore National Laboratory (LLNL) as a summer intern.
[May 2026] New preprint "Microbenchmark-Driven Analytical Performance Modeling Across Modern GPU Architectures" on arXiv — an analytical model for NVIDIA Blackwell and AMD CDNA3 GPUs.
[February 2026] Paper accepted to IPDPS 2026.
[November 2025] Passed Ph.D. preliminary examination.
[August 2025] Attended ATPESC 2025 at Argonne National Laboratory.
[August 2024] Paper accepted at WACCPD @ SC24 — LLM-as-a-Judge for Validation & Verification Testsuites.
[August 2024] Paper accepted at IWOMP — CI/CD framework for OpenMP Offloading validation and benchmarking.

Selected Projects & Outcomes

Headline results — for the rest, see Projects.

GPU Architecture Characterization
Blackwell + MI300A

Microbenchmark studies covering FP4/FP6 tensor cores on NVIDIA Blackwell (RTX 5080) and FP8 + 2:4 sparsity on AMD MI300A. Two preprints in 2025–2026, one accepted at IPDPS 2026.

Compiler Validation at Scale
OpenACC 3.0+ · 4 vendors

OpenACC V&V testsuite shipping against NVIDIA HPC SDK, GCC, Clang/LLVM, and Cray — running on Summit, Frontier, and Perlmutter. Best Poster at SC22 SRC and DARWIN '23.

CI/CD for HPC Compilers
−60% manual testing

Automated GitLab CI pipelines for LLVM OpenMP offloading across H100, GH200, MI210, and MI300A. Used by 15+ DOE exascale applications. Published at IWOMP 2024.

Recent Publications

Full list on Publications or Google Scholar.

Aaron Jarmusch, Connor Vitz, Sunita Chandrasekaran (2026). Execution-Centric Characterization of FP8 Matrix Cores, Asynchronous Execution, and Structured Sparsity on AMD MI300A. arXiv preprint. doi: 10.48550/arXiv.2602.10262.

Aaron Jarmusch, Nathan Graddon, Sunita Chandrasekaran (2025). Dissecting the NVIDIA Blackwell Architecture with Microbenchmarks. arXiv preprint. doi: 10.48550/arXiv.2507.10789.

Zachariah Sollenberger, Jay Patel, Christian Munley, Aaron Jarmusch, Sunita Chandrasekaran (2024). Exploring LLM-as-a-Judge for Validation and Verification Testsuites. SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1885-1893. doi: 10.1109/SCW63240.2024.00238.

Recent Blog Posts

Notes on HPC research, GPU computing, and side projects.