Aaron Jarmusch

Projects

Selected work in GPU performance characterization, compiler validation, and CI/CD for HPC. Each project is summarized by what shipped and what was measured — not the methodology behind it. Links go straight to source, papers, and dashboards.

Research Projects

OpenACC Validation and Verification (V&V) Testsuite

Leading the development of a comprehensive validation and verification testsuite for the OpenACC Programming Model. This project ensures compiler compliance and reliability across diverse GPU architectures, supporting the broader HPC community.

OpenACC 3.0+
spec coverage
4 vendors
NVIDIA HPC SDK, GCC, Clang/LLVM, Cray
3 DOE systems
Summit, Frontier, Perlmutter
Best Poster — SC22
ACM SRC, DARWIN '23
OpenACC, CUDA, GPU Computing, Compiler Testing, C/C++

Stewardship for Programming Systems and Tools (S4PST)

Contributing to a predictive ecosystem for HPC software sustainability. Our CI/CD pipeline for the OpenMP offloading implementations in LLVM Clang and the new Flang runs test suites and benchmarks on NVIDIA H100 and GH200 and AMD MI210 and MI300A GPUs.

4 architectures
H100, GH200, MI210, MI300A
15+ apps
DOE exascale workloads validated
−60%
manual testing overhead
IWOMP '24
published, peer reviewed
OpenMP, LLVM, CI/CD, Docker, GitLab, HPC, GPU Benchmarking

Developing novel approaches to leverage Large Language Models for automated test generation and validation in compiler verification workflows. This research explores the intersection of AI and systems software to improve testing efficiency and coverage.

2 papers
FGCS journal + SC24-W workshop
GPT-class
LLM-as-judge for compiler tests
OpenACC + OpenMP
directive coverage
LLMs, GPT Models, Automated Testing, Compiler Validation, Python, AI/ML

Comprehensive continuous integration and deployment framework for validating OpenMP GPU offloading implementations. Focuses on automated testing, performance benchmarking, and cross-platform compatibility across diverse GPU architectures.

IWOMP '24
20th Intl. Workshop on OpenMP
GitLab CI
automated H100/GH200/MI300A runs
LLVM Clang + Flang
compilers under test
OpenMP, CI/CD, GPU Offloading, Performance Testing, LLVM, Clang

Developing unified testing methodologies for GPU computing across NVIDIA (H100, GH200) and AMD (MI300A) architectures. Ensures consistent performance and reliability for HPC applications across diverse hardware platforms.

FP8, 2:4 sparsity
characterized on MI300A
5th-gen TC
Blackwell FP4/FP6 microbenchmarks
IPDPS '26
accepted
CUDA, HIP, ROCm, GPU Computing, Performance Analysis, Cross-Platform Testing

Side Projects & Coursework

Self-contained projects from coursework, hackathons, and exploratory work.

Quantum Error Correction — Surface Code Study

Quantum

Course-driven deep dive on stabilizer codes and the surface code, capped with a 30+ page technical write-up.

30+ pages
technical write-up
Surface code
decoder analysis

MARS — Multi-Architecture Roofline Study

Performance Analysis

Roofline-model performance characterization of GEMM and FDTD on CPU+GPU. Measured cache-hit behavior and bandwidth-bound vs compute-bound regimes across problem sizes.

GEMM + FDTD
kernels measured
CPU + GPU
rooflines compared
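
For readers unfamiliar with the roofline model, the bandwidth-bound versus compute-bound split mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in MARS: the peak and bandwidth figures are made-up placeholders rather than measurements from the study, the function names are hypothetical, and the GEMM traffic estimate assumes ideal caching (each matrix moved once, with C both read and written).

```python
# Roofline sketch. All hardware numbers here are hypothetical placeholders,
# not results from the MARS study.

def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: min(compute peak, memory bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gbs


def gemm_intensity(n, bytes_per_elem=8):
    """FLOP/byte for an n x n x n GEMM under an ideal-caching assumption.

    Work is 2*n^3 FLOPs. Traffic is assumed to be 4*n^2 elements:
    read A, read B, read C, write C.
    """
    flops = 2 * n ** 3
    traffic_bytes = 4 * n * n * bytes_per_elem
    return flops / traffic_bytes


def regime(peak_gflops, bandwidth_gbs, intensity):
    """Classify a kernel by which side of the ridge point it falls on."""
    if intensity < ridge_point(peak_gflops, bandwidth_gbs):
        return "bandwidth-bound"
    return "compute-bound"


# Hypothetical device: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

for n in (16, 256, 1024):
    ai = gemm_intensity(n)
    bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, ai)
    print(n, ai, bound, regime(PEAK_GFLOPS, BANDWIDTH_GBS, ai))
```

Under these assumptions, GEMM intensity grows linearly with the problem size, so small problems sit under the bandwidth roof and large ones reach the compute roof. Real kernels move more data than the ideal estimate, which is why measured cache behavior matters when placing a kernel on the plot.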

HenStreet Hacks 2025 — SallieMae Track

Hackathon

Built a prototype with a team of four in a 48-hour build window for the SallieMae industry track at HenStreet Hacks 2025, University of Delaware.

48-hour
build window
Team of 4
cross-functional
Looking for code? Most of my open-source work lives on GitHub and in the OpenACCV-V repository.