28th Annual
IEEE High Performance Extreme Computing Virtual Conference
23 - 27 September 2024

HPEC 2024 Agenda

All times are EDT (UTC/GMT -04 hours)

Speaker/Presenting Author in Italics

Day | Monday | Tuesday | Wednesday | Thursday | Friday
10:30-11:00am | Session 1-K: Kickoff | Session 2-K: Keynote | Session 3-K: Keynote | Session 4-K: Keynote | Session 5-K: Keynote
11:00am-12:15pm | Session 1-1: Advanced Multicore Software Technologies | Session 2-1: Case Studies, Benchmarking, and Tools 1 | Session 3-1: AI / Machine Learning 1 | Session 4-1: AI at Scale and AI on the Edge | Session 5-1: High Performance Computing 1
12:15-12:30pm | Break; Session 1-P1 (12:15-13:15): Poster Session 1-1 | Break; Session 2-P1 (12:15-13:15): Poster Session 2-1 | Break; Session 3-P1 (12:15-13:15): Poster Session 3-1; Tutorial Session 3-T (12:15-15:45): Spiral Tutorial | Break; Session 4-P1 (12:15-13:15): Poster Session 4-1 | Break; Session 5-P1 (12:15-13:15): Poster Session 5-1
12:30-1:45pm | Session 1-2: Advanced Processor Architectures | Session 2-2: Case Studies, Benchmarking, and Tools 2 | Session 3-2: Scaling Research Computing Education | Session 4-2: Large AI Models | Session 5-2: High Performance Computing 2
1:45-2:15pm | Break; Session 1-P2 (13:45-14:45): Poster Session 1-2 | Break; Session 2-P2 (13:45-14:45): Poster Session 2-2 | Break; Session 3-P2 (13:45-14:45): Poster Session 3-2 | Break; Session 4-P2 (13:45-14:45): Poster Session 4-2 | Break; Session 5-P2 (13:45-14:45): Poster Session 5-2
2:15-3:30pm | Session 1-3: ASIC and FPGA Advances | Session 2-3: Graph Analytics & Network Science 1 | Session 3-3: AI / Machine Learning 2 | Session 4-3: Innovative Computing | Session 5-3: High Performance Computing 3
3:30-3:45pm | Break | Break | Break | Break | Break
3:45-5:00pm | Session 1-4: BRAINS – Building Resilience through Artificial Intelligence for Networked Systems | Session 2-4: Graph Analytics & Network Science 2 | Session 3-4: General Purpose GPU Computing 1 | Session 4-4: Graph Challenge | Session 5-4: High Performance Computing 4
5:00-5:30pm | Break | Break | Break
5:30-7:30pm | - | Session 2-S1: GraphBLAS BoF | Session 3-S1: LLMs: Opportunities & Challenges | - | -

Monday, September 23

1-K: Kickoff Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Kickoff Talk: Where We Stand: Education, Research, and High Performance Computing
Peter Fisher (MIT)

1-1: Advanced Multicore Software Technologies Session (11:00-12:15)

Co-Chairs: A. Conard & C. Byun

Supercomputer 3D Digital Twin for User Focused Real-Time Monitoring [Outstanding Paper Award]
William Bergeron, Matthew Hubbell, Daniel Mojica, Albert Reuther, William Arcand, David Bestor, Daniel Burrill, Chansup Byun, Vijay Gadepally, Michael Houle, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
Dynamic Task Scheduling with Data Dependency Awareness Using Julia
Rabab MA Alomairy, Felipe Tome, Julian Samaroo, Alan Edelman (MIT)
Optimization Strategies to Accelerate BLAS Operations with ARM SVE
Aniket P Garade, Sushil Pratap Singh, Juliya James, Deepika H V, Haribabu Pasupuleti, S A Kumar, Sudarsan S D (C-DAC)
A Highly Scalable Parallel Design for Data Compression
S Biplab Raut (AMD)
Investigating Resilience of Loops in HPC Programs: A Semantic Approach with LLMs
Hailong Jiang, Jianfeng Zhu (Kent State Univ.), Bo Fang (PNNL), Chao Chen (Intel), Qiang Guan (Kent State Univ.)

1-P1 (12:15-13:15): Poster Session 1-1

Chair(s)/Host(s): K. Keville & K. Cain

Performance Benchmarking of H2O AutoML and Individual Models on Malware Detection Tasks
Minakshi Arya (NDSU), Shubhavi Arya (Indiana Univ.), Saatvik Arya (Univ. of Washington)
IOS: A Low Cost Defense to Mitigate Meltdown and Spectre like Attacks
Xin Wang (Virginia Commonwealth Univ.), Wei Zhang (Univ. of Louisville)
Authentication in High Noise Environments using PUF-Based Parallel Probabilistic Searches
Brian Donnelly, Michael Gowanlock (Northern Arizona Univ.)
Intel Xeon Optimization for Efficient Media Workload Acceleration
Karan Puttannaiah, Rajesh Poornachandran (Intel)
Towards an End-to-End Processing-in-DRAM Acceleration of Spectral Library Search
Tianyun Zhang, Eric Tang (Carnegie Mellon Univ.), Farzana A Siddique, Kevin Skadron (Univ. of Virginia), Franz Franchetti (Carnegie Mellon Univ.)
Neuromorphic Circuits with Spiking Astrocytes for Increased Energy Efficiency, Fault Tolerance, and Memory Capacitance
Murat Isik (Drexel Univ.), Kaushal Gawri (SemaAI), Maurizio De Pitta (University Health Network)

1-2: Advanced Processor Architectures Session (12:30-13:45)

Co-Chairs: M. Barnell & K. Gettings

VeBPF Many-Core Architecture for Network Functions in FPGA-based SmartNICs and IoT
Zaid Tahir (Boston Univ.), Ahmed Sanaullah (Red Hat), Sahan Bandara (Boston Univ.), Ulrich Drepper (Red Hat), Martin Herbordt (Boston Univ.)
Hunting the Needle – The Potential of Innovation in Architecture
Peter Kogge (Univ. of Notre Dame), Janice McMahon (Self), Timothy Dysart (Tactical Computing Labs)
Predictive Performance of Photonic SRAM-based In-Memory Computing for Tensor Decomposition [Best Student Paper Award]
Sasindu Wijeratne (USC), Sugeet Sunder (USC Information Sciences Institute), Md Abdullah-Al Kaiser, Akhilesh Jaiswal (Univ. of Wisconsin), Clynn Mathew, Ajey Jacob (USC Information Sciences Institute), Viktor K Prasanna (USC)
A Multilevel Approach For Solving Large-Scale QUBO Problems With Noisy Hybrid Quantum Approximate Optimization
Filip B Maciejewski (NASA/USRA), Bao Gia Bach (Univ. of Delaware), Maxime Dupont (Rigetti Computing), Paul A Lott (Universities Space Research Association), Bhuvanesh Sundar (Rigetti Computing), David Neira (Purdue University/USRA), Ilya Safro (Univ. of Delaware), Davide Venturelli (Universities Space Research Association)

1-P2 (13:45-14:45): Poster Session 1-2

Chair(s)/Host(s): P. Luszczek

Quantum Machine Learning in the Cognitive Domain: Alzheimer’s Disease Study
Emine Akpinar (Yıldız Technical Univ.)
On the Design of the Quantum-Classical Hybrid-Service Architecture
Yi Liu, Yuchou Chang (UMass Dartmouth)
Quantum Computing for Data Calibration in Parallel Magnetic Resonance Imaging Reconstruction
Girish Babu Reddy, Gulfam A Saju, Yi Liu, Yuchou Chang (UMass Dartmouth)
Ultra Low Latency Hardware Optimised Radix-4 FFT for Optical Wireless FPGA Transceiver’s via Hermitian Symmetry Characteristics
Michael Codd, Ciara McDonald (Maynooth Univ.), Yiyue Jiang, Chunan Chen (Northeastern Univ.), Holger Claussen (Tyndall National Institute), Miriam Leeser (Northeastern Univ.), John Dooley (Maynooth Univ.)
Fully Transparent Client-Side Caching for Key-Value Store Applications Using FPGAs
Sahan Bandara, Noah Cherry, Martin Herbordt (Boston Univ.)
Impact of Grid Processing on Signal Cross-Correlation
Rhea Senthil Kumar, Nathan Simard, Jonathan Mathews, Jeremy Kepner, Timothy Collard (MIT Lincoln Laboratory)

1-3: ASIC and FPGA Advances Session (14:15-15:30)

Co-Chairs: C. Long & S. Shankar

A High-Performance Curve25519 and Curve448 Unified Elliptic Curve Cryptography Accelerator
Aniket Banerjee (IISc), Utsav Banerjee (Indian Institute of Science)
Direct RF FPGAs built with Multi-Chip Packaging Overcome Technology Challenges
Marjorie Catt, Dustin J Henderson (Altera)
A Run-Time Configurable NTT Architecture for Homomorphic Encryption Based on 3D Algorithm
Weicong Lu, Xiaojie Chen, Dihu Chen, Tao Su (Sun Yat-Sen Univ.)
Optimizing FPGA Memory Allocation for Matrix-Matrix Multiplication using Bayesian Optimization
Mehmet Gungor, Stratis Ioannidis, Miriam Leeser (Northeastern Univ.)
pc-COP: An Efficient and Configurable 2048-p-Bit Fully-Connected Probabilistic Computing Accelerator for Combinatorial Optimization
Kiran Magar (IISc), Shreya Bharathan (National Inst. of Tech., Tiruchirappalli), Utsav Banerjee (Indian Institute of Science)

1-4: BRAINS – Building Resilience through Artificial Intelligence for Networked Systems Session (15:45-17:30)

Co-Chairs: S. Pisharody & J. Holodnak

Invited Talk: The SWARM Project: Reimagining Workflow and Resource Management Systems with Swarm Intelligence
Prasanna Balaprakash (ORNL)
Invited Talk: The Convergence of Intuitive AI and Exascale Computing: Redefining What’s Possible
Eliu Huerta (ANL)
Invited Talk: The National Cybersecurity Strategy: A Progress Report
Robert Knake (Orkestrel)
Invited Talk: Operational AI/ML Opportunities
Scott Weed (US Air Force)
Hardware Trojan Detection Utilizing Graph Neural Networks and Structural Checking
Hunter Nauman, Jia Di (Univ. of Arkansas)
Break
Composable Mission-Critical Embedded System Architecture for High Assurance
Michael Vai, Eric Simpson, Alice Lee, Huy Nguyen, Jeffrey Hughes, Ben Nahill, Jeffery Lim, Roger Khazan, Sean O’Melia (MIT Lincoln Laboratory), Fred Schneider (Cornell University)
What is Normal? A Big Data Observational Science Model of Anonymized Internet Traffic
Jeremy Kepner, Hayden Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill (MIT Lincoln Laboratory), Aydin Buluc (LBNL), Chansup Byun (MIT Lincoln Laboratory), Timothy Davis (Texas A&M), Vijay Gadepally (MIT Lincoln Laboratory), Daniel Grant (GreyNoise), Michael Houle, Matthew Hubbell, Piotr Luszczek (MIT Lincoln Laboratory), Lauren Milechin (MIT), Chasen Milner, Guillermo Morales (MIT Lincoln Laboratory), Andrew Morris (GreyNoise), Julie Mullen, Ritesh Patel (MIT Lincoln Laboratory), Alex Pentland (MIT), Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Peter Michaleas (MIT Lincoln Laboratory)
Invited Talk: National Centers of Academic Excellence in Cybersecurity Program
Teddy Lynch (NSA)
Keynote Talk: Verification in ML
Shafi Goldwasser (Simons Theory of Computing Institute)

Tuesday, September 24

2-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: Energy Efficiency Scaling for 2 Decades (EES2) Roadmap for Computing
Tina Kaarsberg (Dept. of Energy)

2-1: Case Studies, Benchmarking, and Tools 1 Session (11:00-12:15)

Co-Chairs: B. Raut & C. Byun

A Neural Network Based GCC Cost Model for Faster Compiler Tuning
Hafsah Shahzad (Boston Univ.), Ahmed Sanaullah, Sanjay Arora, Ulrich Drepper (Red Hat), Martin Herbordt (Boston Univ.)
HENNC: Hardware Engine for Artificial Neural Network-Based Chaotic Oscillators
Mobin Vaziri (Polytechnique Montréal), Shervin Vakili (Institut National de la Recherche Scientifique), Mohammad Mehdi Rahimifar (Interdisciplinary Institute for Technological Innovation), Pierre Langlois (Polytechnique Montréal)
A Graph-Based Algorithm for Optimizing GCC Compiler Flag Settings
Reza Sajjadinasab (Boston Univ.), Sanjay Arora, Ulrich Drepper, Ahmed Sanaullah (Red Hat), Martin Herbordt (Boston Univ.)
Analyzing an In-line Compression on the Matrix Matrix Multiplication Kernel
Steven Platt, Jon C Calhoun (Clemson Univ.)
On the Scalability of Computing Genomic Diversity Using SparkLeBLAST: A Feasibility Study [Outstanding Student Paper Award]
Ritvik R Prabhu, Bernard Moussad (Virginia Tech), Karim Youssef (LLNL), Emil Vatai (RIKEN), Wu-chun Feng (Virginia Tech)

2-P1 (12:15-13:15): Poster Session 2-1

Chair(s)/Host(s): K. Keville

Running GraphBLAS on the FABRIC testbed [Outstanding Short Paper Award]
Vaneshi Ramdhony, Hyunsuk Bang, Nik Sultana (Illinois Institute of Technology)
Solving Hard Combinatorial Problems in Parallel Using Lift-and-Project Preconditioning
Bogdan Zavalnij (Renyi Institute)
Community Detection in Stochastic Block Model Variations
Allison I Gunby-Mann, Peter Chin (Dartmouth Coll.)
Hypersparse Traffic Matrices from Suricata Network Flows using GraphBLAS [Outstanding Short Paper Award]
Michael D Houle, Michael Jones (MIT Lincoln Laboratory), Dan Wallmeyer, Risa Brodeur, Justin Burr (Center for Internet Security), Hayden Jananthan (MIT Lincoln Laboratory), Sam Merrell (Center for Internet Security), Peter Michaleas (MIT Lincoln Laboratory), Anthony Perez (Center for Internet Security), Andrew Prout, Jeremy Kepner (MIT Lincoln Laboratory)

2-2: Case Studies, Benchmarking, and Tools 2 Session (12:30-13:45)

Co-Chairs: C. Valentine & C. Byun

Characterization and Optimization of the Fitting of Quantum Correlation Functions
Pi-Yueh Chuang, Niteya M Shah (Virginia Tech), Patrick Barry, Ian Cloet, Emil Constantinescu (Argonne National Laboratory), Nobuo Sato (Jefferson Lab), Wu-chun Feng (Virginia Tech)
Elucidating US Import Supply Chain Dynamics: A Spatial-Temporal Graph Neural Network Approach
Nikolay Aristov (MIT-CTL), Ziyan Li, Thomas Koch, Elenna Dugundji (MIT)
The Genomic Computing Revolution: Defining the Next Decades of Accelerating Genomics
Harisankar Sadasivan (AMD), Artur Klauser (-NA-), Juergen Hench (University Hospital Basel), Yatish Turakhia (UCSD), Gagandeep Singh, Alberto Zeni (AMD), Sarah Beecroft (Pawsey Supercomputing Research Centre), Satish Narayanasamy (University of Michigan), Jeff Nivala (Univ. of Washington Seattle), Bob Robey (AMD), Onur Mutlu (ETH), Kristof Denolf, Gina Sitaraman (AMD)
Comparison of Vectorization Capabilities of Different Compilers for X86 and ARM CPUs
Nazmus Sakib (New Mexico State Univ.), Tarun Prabhu, Nandakishore Santhi (LANL), John Shalf (LBNL), Abdel-Hameed Badawy (New Mexico State Univ.)

2-P2 (13:45-14:45): Poster Session 2-2

Chair(s)/Host(s): TBD

Performance Analysis of Falcon Post-Quantum Cryptography in Embedded Hardware-Software Integration [Outstanding Short Paper Award]
John Biselx (HES-SO), Andrea Guerrieri (HES-SO and EPFL)
A Performance Analysis of GPU-Aware MPI Implementations Over the Slingshot-11 Interconnect
Michael Beebe (Texas Tech Univ.), Rahulkumar Gayatri, Kevin Gott, Adam Lavely (LBNL), Muhammad Haseeb (NVIDIA), Brandon Cook (LBNL), Yong Chen (Texas Tech Univ.)
Application of Virtual Client for Azure Hardware Qualification
Anna Mary Mathew, Bryan DeYoung, Michael Chhor, Sharjil Khan (Microsoft)
Applying Natural Language Processing for Initial Categorizing of Product Descriptions
Nikolay Aristov, Thomas Koch, Elenna Dugundji (MIT-CTL)
Privacy-Preserving AI for Document Understanding with Controlled Unclassified Information
Scott M Sawyer (Paperless Parts, Inc.)

2-3: Graph Analytics & Network Science 1 Session (14:15-15:30)

Co-Chairs: P. Luszczek & O. Green

Multilevel Diffusion Based Spectral Graph Clustering [Outstanding Paper Award]
Malik Lechekhab, Dimosthenis Pasadakis, Olaf Schenk (Univ. della Svizzera italiana)
Batch-Parallel Compressed Sparse Row: A Locality-Optimized Dynamic-Graph Representation [Outstanding Student Paper Award]
Brian Wheatman, Randal Burns (Johns Hopkins Univ.), Helen Xu (Georgia Inst. of Tech.)
Indexed Binary Operations in the GraphBLAS
Tim Mattson (Human Learning Group), Manaswinee Bezbaruah, Matthias Maier (Texas A&M Univ.), Scott McMillan (CMU Software Engineering Institute), Michel Pelletier (Graphegon), Erik Welch (NVIDIA), Timothy A Davis (Texas A&M Univ.)
VF2-PS: Parallel and Scalable Subgraph Monomorphism in Arachne
Mohammad Dindoost, Oliver Alvarado Rodriguez (New Jersey Inst. of Tech.), Sounak Bagchi (Edison Academy Magnet School), Palina Pauliuchenka, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)
MESM: A Query-Agnostic and Memory-Efficient Parallel Subgraph Matching Algorithm
Shubhashish Kar, Shaikh Arifuzzaman (UNLV)

2-4: Graph Analytics & Network Science 2 Session (15:45-17:30)

Co-Chairs: X. Sun & TBD

Constant-Memory Graph Coarsening
Christopher Brissette (NVIDIA), George M Slota (RPI)
Algebraic Vertex Ordering of a Sparse Graph for Adjacency Access Locality and Graph Compression
Dimitris Floros (Duke Univ.), Nikos P Pitsianis (Aristotle Univ. of Thessaloniki), Xiaobai Sun (Duke Univ.)
An Efficient Multi-core Parallel Implementation of SSSP Algorithm with Decreasing Delta-stepping
Rakibul Hassan, Shaikh Arifuzzaman (UNLV)
IRIS-MEMFLOW: Data Flow Enabled Portable Memory Orchestration in IRIS Runtime for Diverse Heterogeneity
Mohammad Alaul Haque Monil, Narasinga Rao Miniskar, Seyong Lee, Beau Johnston, Pedro Valero-Lara, Aaron Young, Keita Teranishi, Jeffrey Vetter (ORNL)
A Deployment Tool for Large Scale Graph Analytics Framework Arachne
Garrett R Gonzalez-Rivas, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)

2-S1: GraphBLAS BoF Special (17:30-19:30)

Organizers: T. Mattson, B. Brock & S. McMillan

Report on the binsparse Specification
Ben Brock (Intel)
SuiteSparse Update
Tim Davis (Texas A&M Univ.)
Python GraphBLAS Update
Julia GraphBLAS Update
Raye Kimmerer (MIT)
Postgres GraphBLAS Update
Michel Pelletier (Graphegon)
Wild and Crazy Ideas for GraphBLAS 3.0
Raye Kimmerer (MIT)
Keynote Talk: The Future of Sparse Computing is Compilers
Fredrik Kjolstad (Stanford)

Wednesday, September 25

3-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: The Building Blocks of Cloud – Research Enablement
Scott Yockel (Harvard Univ.)

3-1: AI / Machine Learning 1 Session (11:00-12:15)

Co-Chairs: P. Luszczek & P. Monticciolo

ModelGauge: Inference Profiling of Deep-Learning Models [Outstanding Paper Award]
Calvin B Gealy (Univ. of Pittsburgh), David Langerman (NSF SHREC), Alan George (NSF Center for High Performance Reconfigurable Computing)
Enhanced Knowledge Graph Attention Networks for Efficient Graph Learning [Outstanding Student Paper Award]
Fernando P Vera Buschmann, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)
Mobile-Optimized Vessel Segmentation for Ultrasound-Guided Surgical Procedures
Mateusz Wolak, Fin Amin, Nancy DeLosa, Brian A Telfer, Benjamin Roop, Lars Gjesteby (MIT Lincoln Laboratory)
GLITCHES: GPU-FPGA LLM Inference Through a Collaborative Heterogeneous System
Fan Yang (Tsinghua Univ., SenseTime Inc.), Xinhao Yang, Hongyi Wang, Zehao Wang, Zhenhua Zhu, Shulin Zeng, Yu Wang (Tsinghua Univ.)
Graphical Learning Optimization and Dimensionality Reduction with Geometric Multi-Resolution Analysis
Felicia Schenkelberg, Allison I Gunby-Mann, Emma Graham (Dartmouth Coll.), Shuoxuan Li (Carnegie Mellon Univ.), Peter Chin (Dartmouth Coll.)

3-P1 (12:15-13:15): Poster Session 3-1

Chair(s)/Host(s): K. Keville

CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers [Outstanding Short Paper Award]
Lakshmi V Nair (Lightmatter)
Efficiency of Data Intensive Computing (DIC) in MEMS Research for Data Processing and Analysis
Yeligay Segizbay (Nazarbayev Univ.)
Capturing the Carbon Impact of Deep Learning
Alexis Corona, Sanmukh Kuppannagari (Case Western Reserve Univ.)
Transfer Learning Assisted Parameter Selection for Water-Fat Separation in Dixon MRI
Alan Okinaka (Ursinus College), Gulfam A Saju, Yuchou Chang (UMass Dartmouth)
Traditional Costume Image Classification for Indian States Using Deep Learning
Sahana R Koti, Sahana Channappa Jatti, Anupama S Nandeppanavar, Medha Kudari (KLE Institute of Technology)
Scalable Approach for Analytic Polynomial Subspace Projection Matrices for a Space-Time Covariance Matrix
Faizan Ahmad Khattak, Mohammed Bakhit, Ian K. Proudler, Stephan Weiss (Univ. of Strathclyde)

Tutorial Session: 3-T (12:15-15:45): Spiral Tutorial

Organizer(s): F. Franchetti and M. Franusich

3-2: Scaling Research Computing Education Session (12:30-13:45)

Co-Chairs: J. Mullen, L. Milechin & H. Jananthan

Invited Talk: Scaling Project-based Learning from Education to Research
Joel Grimm (MIT Lincoln Laboratory)
Invited Talk: Educational Game Dev from Start to Finish: A Short Example
Chasen Milner (USAF)
Invited Talk: HPC-ED: A Federated Catalog to Share and Discover CyberTraining Materials
Susan Mehringer (Cornell Center for Advanced Computing)
Invited Talk: The Wide Area Classroom – 10 Years On
John Urbanic (CMU and Pittsburgh Supercomputing Center)

3-P2 (13:45-14:45): Poster Session 3-2

Chair(s)/Host(s): K. Cain

Gesture Controlled System to Automate Shutdown, Screenshot and Volume Toggle
Prisha Bhosale, Ananya Dandekar, Ria Dcosta, Sri Aishwarya Jonnavittula, Shagufta Rajguru (Fr. Conceicao Rodrigues Institute of Technology)
Machine Learning Application for Smart Network Traffic Prediction
Islam Omar (New Mexico State Univ.), Whit Schonbein (SNL), Hameed Badawy (New Mexico State Univ.)
Model to Predict Inventory Demand in Retail SMEs Using CRISP-DM and Machine Learning
Jhomax R Torres, Diego Moises Carpio Andia, Victor Parasi (Univ. Peruana de Ciencias Aplicadas)
Determination of Game-based Design Equilibria by Using Machine Learning Approach
Sara Karimi, Ehsan Ghotbi (Alfred Univ.)
The Analysis of the Sparse Multi-GPU Parallel Method on the Large Sparse Power Flow Calculation
Lei Zeng, Shadi Alawneh (Oakland Univ.)

3-3: AI / Machine Learning 2 Session (14:15-15:30)

Co-Chairs: H. Badawy & C. Long

A Dynamic Weighting Strategy to Mitigate Worker Node Failure in Distributed Deep Learning
Yuesheng Xu, Arielle Carr (Lehigh Univ.)
P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving
Mohamed R. Elshamy (New Mexico State Univ.), Heba Emara (Pyramids High Institute of Electronic Engineering), Mohamed Nanyang Shoaib (Nanyang Tech. Univ.), Hameed Badawy (New Mexico State Univ.)
Spike-driven YOLO: Ultra Low-Power Object Detection with Neuromorphic Computing
Mark Barnell, Courtney Raymond, Lisa Loomis (AFRL), Francesca Vidal, Daniel Brown, Darrek Isereau (SRC)
Exploring sparse inference with SuiteSparse:GraphBLAS
Deepak Suresh, Timothy A Davis (Texas A&M Univ.)
Improving Regression in Spiking Neural Networks for Oceanographic Data Analysis
Alissa Kane, Yuchou Chang (UMass Dartmouth)

3-4: General Purpose GPU Computing 1 Session (15:45-17:30)

Co-Chairs: S. Gottlieb & N. Prajapati

Benchmarking Thread Block Cluster [Best Paper Award]
Tim Lühnen, Tobias Marschner, Sohan Lal (TU Hamburg)
Understanding the Efficacy of Power Profiles: A Case Study of AMD Instinct MI100 GPU
Ghazanfar Ali, Mert Side (Texas Tech Univ.), Sridutt Bhalachandra (Univ. of North Carolina), Tommy Dang, Alan Sill, Yong Chen (Texas Tech Univ.)
Community Detection for Large Graphs on GPUs with Unified Memory
Emre Dinçer, Işıl Öz (Izmir Institute of Technology)
Invited Talk: From Simple to Hyper Co-Design of HPC Platforms
Gary Grider (Los Alamos National Laboratory)
Invited Talk: Lessons Learned from Implementing the Anonymized Network Sensing Graph Challenge with GPUs and Commodity Software
Siddharth Samsi, Dan Campbell, Emanuel Scoullos, Oded Green (NVIDIA)

3-S1: LLMs: Opportunities & Challenges Special (17:30-19:30)

Organizers: V. Gadepally & D. Burrill

Invited Talk: Generative AI in the DoD: Use Cases and Challenges
Manuel Xavier Lugo (US Navy)
Invited Talk: How to Make an LLM Understand Human Conversation for Fun & Profit
Kartik Talamadupula (Symbl.ai)
Invited Talk: Innovations for Reducing the Environmental Impact of LLMs
Boris Gamazaychikov (Salesforce)
Invited Talk: Tackling Generative AI Productivity and Efficiency Challenge with Intel® Gaudi® 3 AI Accelerators
Vasudev Lal (Intel)

Thursday, September 26

4-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: Convergence across the Computing Continuum: The NSF Leadership Class Computing Facility meets the Edge, Interactive Computing, and Low-Precision AI
Dan Stanzione (Texas Advanced Computing Center)

4-1: AI at Scale and AI on the Edge Session (11:00-12:15)

Co-Chairs: B. Sroka & K. Gettings

Breakthrough Edge AI Inference Performance using NorthPole in 3U VPX Form Factor
Filipp Akopyan, William Risk, John Arthur, Andrew Cassidy, Michael Debole, Carlos Ortega Otero, Jun Sawada, Evan Colgan, Michael Criscolo (IBM Research), Phillip Mann (IBM), Heinz Baier, Kai Schleupen, Arnon Amir (IBM Research), Alexander Andreopoulos (IBM), Rathinakumar Appuswamy, Deepika Bablani, Peter Carlson, Pallab Datta, Steven Esser, Myron Flickner, Rajamohan Gandhasri, Guillaume Garreau, Megumi Ito, Jennifer Klamo, Jeffrey Kusnitz, Nathaniel McClatchey, Neil McGlohon, Jeffrey McKinstry, Yutaka Nakamura (IBM Research), Tapan Nayak (IBM Corporation), Jay Sivagnaname, Daniel Smith, Rafael Sousa, Brian Taba, Ignacio Terrizzano, Takanori Ueda, Dharmendra Modha (IBM Research)
Breakthrough LLM Inference Performance using NorthPole
Rathinakumar Appuswamy, Michael Debole, Brian Taba, Steven Esser, Andrew Cassidy, Arnon Amir (IBM Research), Alexander Andreopoulos (IBM), Deepika Bablani, Pallab Datta, Jeffrey Kusnitz, Nathaniel McClatchey, Neil McGlohon, Jeffrey McKinstry (IBM Research), Tapan Nayak (IBM Corporation), Daniel Smith, Rafael Sousa, Ignacio Terrizzano, Filipp Akopyan, Peter Carlson, Rajamohan Gandhasri, Guillaume Garreau, Nelson Gonzalez, Megumi Ito, Jennifer Klamo, Yutaka Nakamura, Carlos Ortega Otero, William Risk, Jun Sawada, Kai Schleupen, Jay Sivagnaname, Matthew Stallone, Takanori Ueda, Myron Flickner, John Arthur (IBM Research), Rameswar Panda, David Cox (MIT-IBM Watson AI Lab), Dharmendra Modha (IBM Research)
A Framework to Enable Algorithmic Design Choice Exploration in DNNs
Timothy Cronin, Sanmukh Kuppannagari (Case Western Reserve Univ.)
Benchmarking Edge AI Platforms for High-Performance ML Inference
Rakshith Jayanth, Neelesh Gupta, Viktor K Prasanna (USC)
Transformers: A Graph Processing Perspective
Manish Sri Sai Surya Routhu, Sai Dheeraj Yanduru, Nathaniel K Tomczak, Sanmukh Kuppannagari (Case Western Reserve Univ.)

4-P1 (12:15-13:15): Poster Session 4-1

Chair(s)/Host(s): K. Keville

Perspective-Aware Ai (PAi) for Augmenting Critical Decision Making
Marjan Alirezaie, Daniel Platnick (Flybits), Hossein Rahnama, Dava Newman, Alex Pentland (MIT)
Evaluating the Impact of Noisy Blades on PROPELLER MRI Reconstruction Quality
Gulfam A Saju, Marjan Akhi, Yuchou Chang (UMass Dartmouth)
CompJouleS: Energy Estimate Tool for Machine Learning Algorithms for Multiple Applications in CPU, GPU, and FPGA Architectures
Murat Isik (Stanford Univ.), Jens E. Pedersen (SLAC National Accelerator Laboratory), Vedant Karia (Univ. of Texas at San Antonio), Sadasivan Shankar (Stanford Univ.)
Power Efficient Deep Learning Acceleration using Intel Xeon® Processors
Xiaofei Jiang, Mona Minakshi, Rajesh Poornachandran, Shamima Najnin (Intel)
Impact of Estimation Errors of a Matrix of Transfer Functions onto Its Analytic Singular Values and Their Potential Algorithmic Extraction
Mohammed Bakhit, Faizan Ahmad Khattak, Ian K. Proudler, Stephan Weiss (Univ. of Strathclyde)
Disaggregation Patterns for Secure AI Systems
Mohamed Ghamri, Marc A Lacoste, Divi De Lacour (Orange)

4-2: Large AI Models Session (12:30-13:45)

Co-Chairs: N. Pitsianis & B. Sroka

MonoCoder: Domain-Specific Code Language Model for HPC Codes and Tasks [Outstanding Paper Award]
Tal Kadosh (Ben-Gurion Univ., IAEC), Niranjan Hasabnis (Intel), Vy Vo (Intel Labs), Nadav Schneider (Ben-Gurion University), Neva Krien (Independent), Mihai Capotă (Intel Labs), Abdul Wasay, Guy Tamir (Intel), Theodore L Willke, Nesreen Ahmed (Intel Labs), Yuval Pinter (Ben-Gurion University), Tim Mattson (Human Learning Group), Gal Oren (Technion, Stanford Univ.)
LLM Inference Serving: Survey of Recent Advances and Opportunities [Outstanding Paper Award]
Baolin Li, Yankai Jiang (Northeastern Univ.), Vijay Gadepally (MIT Lincoln Laboratory), Devesh Tiwari (Northeastern Univ.)
Enhancing Code Translation in Language Models with Few-Shot Learning via Retrieval-Augmented Generation
Manish Bhattarai, Javier E Santos, Shawn M Jones, Ayan Biswas, Boian Alexandrov, Daniel O’Malley (LANL)
High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures
Xiang Fu, Xinpeng Zhang, Jixiang Ma (Nanchang Hangkong Univ.), Peng Zhao (Microsoft), Shuai Lu (Nanchang Hangkong Univ.), Xu Liu (Univ. of Washington)
Accelerating Sensor Fusion in Neuromorphic Computing: A Case Study on Loihi-2
Murat Isik (Drexel Univ.), Karn Tiwari (IIS Bangalore), Burak Eryilmaz (Bilkent Univ.), Ismail Can Dikmen (TEMSA)

4-P2 (13:45-14:45): Poster Session 4-2

Chair(s)/Host(s): TBD

NeuroVM: Dynamic Neuromorphic Hardware Virtualization
Murat Isik (Drexel Univ.), Kayode Inadagbo (Prairie View A&M Univ.), Ismail Can Dikmen (TEMSA)
LLMs for Closed-Library Multi-Document Query, Test Generation, and Evaluation
Claire Randolph (Dept. of the Air Force), Adam Michaleas, Darrell O Ricke (MIT Lincoln Laboratory)
LLM-Based Task Planning for Navigating Companion Robot from Emotion Signals
Yuchou Chang (UMass Dartmouth), Huy Anh Pham (Intelligent Medical Objects, Inc.), Gulfam A Saju (UMass Dartmouth)
Large Multimodal Model for Simulating Big Training Data in Deep PROPELLER MRI
Gulfam A Saju, Marjan Akhi, Yuchou Chang (UMass Dartmouth)
Artificial Intelligence Solution on Intel Xeon Processor Power and Performance Engineering
Zhongbin Liu, Xiaofei Jiang, Jiajia Zhang (Intel)
Boosting the Performance of Reinforcement Learning-based Task Scheduling using Offline Inference
Chedi Morchdi (Univ. of Utah), Cheng-Hsiang Chiu (Univ. of Wisconsin), Yi Zhou (Univ. of Utah), Tsung-Wei Huang (Univ. of Wisconsin)

4-3: Innovative Computing Session (14:15-15:30)

Co-Chairs: K. Keville & P. Luszczek

Reinforcement Learning-generated Topological Order for Dynamic Task Graph Scheduling
Cheng-Hsiang Chiu (Univ. of Wisconsin), Chedi Morchdi, Yi Zhou (Univ. of Utah), Boyang Zhang, Che Chang, Tsung-Wei Huang (Univ. of Wisconsin)
FPGA Acceleration for Scalable High-Resolution OPIR Target Detection
Daniel C Stumpp (Univ. of Pittsburgh), Alan George (NSF Center for High Performance Reconfigurable Computing)
Hybrid Computing Architecture Based on Analog Phase-Change Memory Chips for Deep Neural Network Training
Zhenhao Jiao (Univ. of Science and Technology of China), Tao Hong, Xiaogang Chen, Weibang Dai (Shanghai Institute of Microsystem and Information Technology, CAS), Chengcai Tu (Donghua University), Shunfen Li, Houpeng Chen, Zhitang Song (Shanghai Institute of Microsystem and Information Technology, CAS)
Exploring the Trade-off Between Repair Time and Reliability in Large Scale Cluster Computers: A Simulation-Based Approach
Leslie Horace (Georgia Inst. of Tech.), Craig Walker, William M Jones (Coastal Carolina Univ.), Nathan DeBardeleben, Vivian Hafener, Steven Senator (LANL)
Experiences with VITIS AI for Deep Reinforcement Learning
Nabayan Chaudhury, Atharva M Gondhalekar, Wu-chun Feng (Virginia Tech)

4-4: Graph Challenge Session (15:45-17:30)

Co-Chairs: J. Kepner & A. Reuther

Mercury: Efficient Subgraph Matching on GPUs with Hybrid Scheduling
Zhiheng Lin (Inst. of Computing Tech, CAS), Changjie Xu (UCAS), Ke Meng, Guangming Tan (Inst. of Computing Tech, CAS)
Towards Faster Graph Partitioning via Pre-training and Inductive Inference
Meng Qin (HKUST), Chaorui Zhang (Huawei), Yu Gao (Independent), Yibing Ding, Weipeng Jiang (Huawei), Weixi Zhang (Huawei Technologies), Wei Han (Huawei), Bo Bai (Huawei Technologies)
Distributed-Memory Sparse Deep Neural Network Inference Using Global Arrays
Sayan Ghosh, Bruce Palmer, Andres Marquez (PNNL)
Anonymized Network Sensing Graph Challenge
Hayden R Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill (MIT Lincoln Laboratory), Aydin Buluc (LBNL), Chansup Byun (MIT Lincoln Laboratory), Timothy Davis (Texas A&M), Vijay Gadepally (MIT Lincoln Laboratory), Daniel Grant (GreyNoise), Michael Houle, Matthew Hubbell, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Chasen Milner, Guillermo Morales (MIT Lincoln Laboratory), Andrew Morris (GreyNoise), Julie Mullen, Ritesh Patel (MIT Lincoln Laboratory), Alex Pentland (MIT), Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge
Zhaoyang Han, Andrew Briasco-Stewart (Northeastern Univ.), Michael Zink (UMass Amherst), Miriam Leeser (Northeastern Univ.)
Sans: Streaming Anonymized Network Sensing
Ketai Zhao, Yuhang Zhou, Hong Xu Pan, Zhibin Wang, Sheng Zhong, Chen Tian (Nanjing Univ.)

Friday, September 27

5-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: AI/ML Applications for Global Security
Eric Evans (MIT)

5-1: High Performance Computing 1 Session (11:00-12:15)

Co-Chairs: D. Ricke & B. Raut

Cycle-Stealing in Load-Imbalanced HPC Applications [Outstanding Student Paper Award]
Po Hao Chen (Brown University), Akshaya Bali, Shining Yang, Pouya Haghi, Carlton Knox, Benjamin Li (Boston Univ.), Amr Abouelmagd, Anthony Skjellum (Tennessee Tech), Martin Herbordt (Boston Univ.)
Tightly-Coupled FPGA Accelerator for Molecular Dynamics Simulation: Hardware-Software Co-Design and Fine-Grained Task Management [Outstanding Student Paper Award]
Zekang Cheng, Zerong S He, Xi Jin (Univ. of Science and Technology of China)
MST in Incremental Graphs Through Tree Contractions
Akanksha Dwivedi, Sameer Sharma, Dip Sankar Banerjee (IIT Jodhpur)
Syndeo: Portable Ray Clusters with Secure Containerization
William Li, Rodney S Lafuente Mercado, Jaime Pena, Ross E Allen (MIT Lincoln Laboratory)
Evaluating One-Sided Communication on Graph500 with MPI-RMA and OpenSHMEM
Jefferson Boothe (Univ. of Pittsburgh), Alan George (NSF Center for High Performance Reconfigurable Computing)

5-P1 (12:15-13:15): Poster Session 5-1

Chair(s)/Host(s): K. Keville

Synthesizing Numerical Linear Algebra using Julia [Best Short Paper Award]
Sophie Xuan, Rabab MA Alomairy, Evelyne Ringoot, Felipe Tome, Julian Samaroo, Alan Edelman (MIT)
Towards LibraryX: A Framework for Cross-Library Call Optimization [Outstanding Short Paper Award]
Sanil Rao, Anant Prakash, Franz Franchetti (Carnegie Mellon Univ.)
Compressed Cannon’s Algorithm
Louis Jencka, Amanda J Bienz (Univ. of New Mexico)
Multiplication of Sparse Matrices and their Transpose using Compressed Sparse Diagonals
Sardar Anisul Haque (Oryx Universal College in Partnership with LJMU (UK)), Mohammad Tanvir Parvez (Qassim Univ.), Shahadat Hossain (Univ. of Northern British Columbia)
Augmenting HPC Profilers with Analysis Capabilities
Abhishek N Patil, Shamjith K V, Senthil Kumar RK, S D Sudarsan (C-DAC)
Explainable DiGCNs for Decomposition of Opaque Node Ranking Functions
Vishal Chandra (MIT Lincoln Laboratory)

5-2: High Performance Computing 2 Session (12:30-13:45)

Co-Chairs: D. Ricke & J. Mullen

An Efficient Multi-DNN Accelerator Based on Multiple Systolic Array
Jianjun Chen, Han Jiao, Wenjin Huang, Yihua Huang (Sun Yat-Sen Univ.)
JACC.shared: Leveraging HPC Metaprogramming and Performance Portability for Computations That Use Shared Memory GPUs
Pedro Valero-Lara, William Godoy, Keita Teranishi, Jeffrey Vetter (ORNL)
OCO-GAT: An Accelerator for Graph Attention Network with Optimized Calculation Order
Qi Liu, Wenjin Huang, Wenlu Peng, Yihua Huang (Sun Yat-Sen Univ.)
Task-Level Parallelism for the Multifrontal Method in Tightly Coupled CPU-FPGA Architectures
Zerong S He, Zekang Cheng, Zhongguang Xu, Xi Jin (Univ. of Science and Technology of China)
LLload: An Easy-to-Use HPC Utilization Tool
Chansup Byun, Albert Reuther, Julie Mullen, LaToya Anderson, William Arcand, Bill Bergeron, David Bestor, Alexander Bonn, Daniel Burrill, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Guillermo Morales, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)

5-P2 (13:45-14:45): Poster Session 5-2

Chair(s)/Host(s): K. Keville

A Framework for Analyzing the Performance of Sparse Matrix and Graph Operations
Khaled Abdelaal, Richard M Veras (Univ. of Oklahoma)
Efficient Eigenvalue Computation of Parahermitian Matrices Using Neural Networks
Diyari A. Hassan (Qaiwan Intl. Univ.), Yunus Egi, Soydan Redif (American Univ. of Middle East)
Towards a RISC-V Instruction Set Extension for Multi-word Arithmetic
Youngjin Eum, Naifeng Zhang, Larry Tang, Franz Franchetti (Carnegie Mellon Univ.)
AstraMQ: Distributed MQTT Broker
Rohan M Doshi, Sanika Inamdar, Tanmay Karmarkar, Madhuri S Wakode (Pune Inst. of Computer Tech.)
Computational and Numerical Properties of a Broadband Subspace-Based Likelihood Ratio Test
Cornelius Pahalson, Louise Crockett, Stephan Weiss (Univ. of Strathclyde)

5-3: High Performance Computing 3 Session (14:15-15:30)

Co-Chairs: J. Mullen & N. Pitsianis

Persistent and Partitioned MPI for Stencil Communication
Gerald Collom (Univ. of New Mexico), Jason Burmark, Olga Pearce (LLNL), Amanda J Bienz (Univ. of New Mexico)
HPC Network Simulation Tuning via Automatic Extraction of Hardware Parameters
Joshua Suetterlein, Stephen Young, Jesun S Firoz, Joseph Manzano, Nathan Tallent, Ryan Friese, Kevin Barker, Timothy Stavenger (PNNL)
Accelerating Multi-Agent DDPG Training on Multi-GPU Platforms
Samuel Wiggins, Viktor K Prasanna (USC)
Binary Bleed: Fast Distributed and Parallel Method for Automatic Model Selection
Ryan C Barron, Maksim Eren, Manish Bhattarai, Ismael Boureima (LANL), Cynthia Matuszek (UMBC), Boian Alexandrov (LANL)
GPU Sharing with Triples Mode
Chansup Byun, Albert Reuther, LaToya Anderson, William Arcand, Bill Bergeron, David Bestor, Alexander Bonn, Daniel Burrill, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Guillermo Morales, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)

5-4: High Performance Computing 4 Session (15:45-17:30)

Co-Chairs: N. Pitsianis & C. Byun

BB-CVXOPT: Basic Block Execution Count Estimation and Extrapolation using Constrained Convex Optimization
Youssef A Aly (New Mexico State Univ.), Atanu Barai, Nandakishore Santhi (LANL), Hameed Badawy (New Mexico State Univ.)
Parallel Online Directed Acyclic Graph Exploration for Atlasing Soft-Matter Assembly Configuration Spaces
Rahul Prabhu, Amit Verma, Meera Sitharam (Univ. of Florida)
Towards Just-in-Time Instruction Generation for Accelerated Sparse Matrix-Matrix Multiplication on GPUs
Seth Kay, H. Howie Huang (George Washington Univ.)
HBM-based Hardware Accelerator for GNN Sampling and Aggregation
Yuchen Gui, Qizhe Wu, Wei Yuan, Huawen Liang, Xiaotian Wang, Xi Jin (Univ. of Science and Technology of China)
GPU Accelerated Construction of Time Respecting Data Structure for Temporal Graphs
Animan Naskar, Venkata Kalyan Tavva (IIT Ropar), Subhasis Banerjee (Shell India Markets Pvt. Ltd.)
Comparative Analysis of GCC and LLVM for Performance Optimization on Aarch64
Mriganka Bezbaruah, Samruddhi Dhakulkar, Prachi Pandey, Haribabu Pasupuleti, S A Kumar, S D Sudarsan (C-DAC)
Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine [Outstanding Student Paper Award]
Zuoning Zhang, Dhruv Parikh (USC), Youning Zhang (UC Berkeley), Viktor K Prasanna (USC)
