Aaron Jarmusch

Ph.D. Student · HPC & GPU Systems

About

I'm a Computer Science Ph.D. student at the University of Delaware in the Computational Research Programming Lab (CRPL), advised by Dr. Sunita Chandrasekaran.

My research builds analytical performance models for GPU computing and validation infrastructure for HPC compilers. I characterize new architectures (NVIDIA Blackwell & Hopper, AMD MI300A) with targeted microbenchmarks, and I ship CI/CD pipelines for LLVM-based compilers across H100, GH200, and MI300A nodes.

I also apply LLMs to compiler validation: automating test generation and using an LLM-as-a-judge approach to grade the resulting tests.

Recent News

[May 2026] Presented "Microbenchmarking NVIDIA's Blackwell Architecture: An in-depth Architectural Analysis" at IPDPS 2026; slides available.
[May 2026] Joining Lawrence Livermore National Laboratory (LLNL) as a summer intern.
[May 2026] New preprint "Microbenchmark-Driven Analytical Performance Modeling Across Modern GPU Architectures" on arXiv — an analytical model for NVIDIA Blackwell and AMD CDNA3 GPUs.
[February 2026] Paper accepted to IPDPS 2026.
[November 2025] Passed Ph.D. preliminary examination.
[August 2025] Attended ATPESC 2025 at Argonne National Laboratory.
[August 2024] Paper accepted at WACCPD @ SC24 — LLM-as-a-Judge for Validation & Verification Testsuites.
[August 2024] Paper accepted at IWOMP — CI/CD framework for OpenMP Offloading validation and benchmarking.

Selected Projects & Outcomes

Headline results — for the rest, see Projects.

GPU Architecture Characterization
Blackwell + MI300A

Microbenchmark studies covering FP4/FP6 tensor cores on NVIDIA Blackwell (RTX 5080) and FP8 + 2:4 sparsity on AMD MI300A. Two preprints in 2025–2026, one accepted at IPDPS 2026.

Compiler Validation at Scale
OpenACC 3.0+ · 4 vendors

OpenACC V&V testsuite shipping against NVIDIA HPC SDK, GCC, Clang/LLVM, and Cray — running on Summit, Frontier, and Perlmutter. Best Poster at SC22 SRC and DARWIN '23.

CI/CD for HPC Compilers
−60% manual testing

Automated GitLab CI pipelines for LLVM OpenMP offloading across H100, GH200, MI210, and MI300A. Used by 15+ DOE exascale applications. Published at IWOMP 2024.

Recent Publications

Full list on Publications or Google Scholar.

Aaron Jarmusch, Sunita Chandrasekaran (2026). Microbenchmarking NVIDIA's Blackwell Architecture: An in-depth Architectural Analysis. IEEE International Parallel and Distributed Processing Symposium (IPDPS 2026).

Aaron Jarmusch, Connor Vitz, Sunita Chandrasekaran (2026). Execution-Centric Characterization of FP8 Matrix Cores, Asynchronous Execution, and Structured Sparsity on AMD MI300A. arXiv preprint. doi: 10.48550/arXiv.2602.10262.

Aaron Jarmusch, Nathan Graddon, Sunita Chandrasekaran (2025). Dissecting the NVIDIA RTX Blackwell Architecture with Microbenchmarks. 2025 IEEE 32nd International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW), pp. 309-310. doi: 10.1109/HiPCW66559.2025.00109.

Recent Blog Posts

Notes on HPC research, GPU computing, and side projects.