HPEC 2024 Agenda

All times are EDT (UTC/GMT -04 hours)

Speaker/Presenting Author in Italics

Day | Monday | Tuesday | Wednesday | Thursday | Friday
10:30-11:00am | Session 1-K: Kickoff | Session 2-K: Keynote | Session 3-K: Keynote | Session 4-K: Keynote | Session 5-K: Keynote
11:00am-12:15pm | Session 1-1: Advanced Multicore Software Technologies | Session 2-1: Case Studies, Benchmarking, and Tools 1 | Session 3-1: AI / Machine Learning 1 | Session 4-1: AI at Scale and AI on the Edge | Session 5-1: High Performance Computing 1
12:15-12:30pm | Break; Session 1-P1 (12:15-13:15): Poster Session 1-1 | Break; Session 2-P1 (12:15-13:15): Poster Session 2-1 | Break; Session 3-P1 (12:15-13:15): Poster Session 3-1; Tutorial Session 3-T (12:15-15:45): Spiral Tutorial | Break; Session 4-P1 (12:15-13:15): Poster Session 4-1 | Break; Session 5-P1 (12:15-13:15): Poster Session 5-1
12:30-1:45pm | Session 1-2: Advanced Processor Architectures | Session 2-2: Case Studies, Benchmarking, and Tools 2 | Session 3-2: Scaling Research Computing Education | Session 4-2: Large AI Models | Session 5-2: High Performance Computing 2
1:45-2:15pm | Break; Session 1-P2 (13:45-14:45): Poster Session 1-2 | Break; Session 2-P2 (13:45-14:45): Poster Session 2-2 | Break; Session 3-P2 (13:45-14:45): Poster Session 3-2 | Break; Session 4-P2 (13:45-14:45): Poster Session 4-2 | Break; Session 5-P2 (13:45-14:45): Poster Session 5-2
2:15-3:30pm | Session 1-3: ASIC and FPGA Advances | Session 2-3: Graph Analytics & Network Science 1 | Session 3-3: AI / Machine Learning 2 | Session 4-3: Innovative Computing | Session 5-3: High Performance Computing 3
3:30-3:45pm | Break | Break | Break | Break | Break
3:45-5:00pm | Session 1-4: BRAINS – Building Resilience through Artificial Intelligence for Networked Systems | Session 2-4: Graph Analytics & Network Science 2 | Session 3-4: General Purpose GPU Computing 1 | Session 4-4: Graph Challenge | Session 5-4: High Performance Computing 4
5:00-5:30pm |  | Break | Break |  | Break
5:30-7:30pm |  | Session 2-S1: GraphBLAS BoF | Session 3-S1: LLMs: Opportunities & Challenges |  | 

Monday, September 23

1-K: Kickoff Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Kickoff Talk: Where We Stand: Education, Research, and High Performance Computing

Peter Fisher (MIT)

1-1: Advanced Multicore Software Technologies Session (11:00-12:15)

Co-Chairs: A. Conard & C. Byun

Supercomputer 3D Digital Twin for User Focused Real-Time Monitoring [Outstanding Paper Award]

William Bergeron, Matthew Hubbell, Daniel Mojica, Albert Reuther, William Arcand, David Bestor, Daniel Burrill, Chansup Byun, Vijay Gadepally, Michael Houle, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
Dynamic Task Scheduling with Data Dependency Awareness Using Julia

Rabab MA Alomairy, Felipe Tome, Julian Samaroo, Alan Edelman (MIT)
Optimization Strategies to Accelerate BLAS Operations with ARM SVE

Aniket P Garade, Sushil Pratap Singh, Juliya James, Deepika H V, Haribabu Pasupuleti, S A Kumar, Sudarsan S D (C-DAC)
A Highly Scalable Parallel Design for Data Compression

S Biplab Raut (AMD)
Investigating Resilience of Loops in HPC Programs: A Semantic Approach with LLMs

Hailong Jiang, Jianfeng Zhu (Kent State Univ.), Bo Fang (PNNL), Chao Chen (Intel), Qiang Guan (Kent State Univ.)

1-P1 (12:15-13:15): Poster Session 1-1

Chair(s)/Host(s): K. Keville & K. Cain

Performance Benchmarking of H2O AutoML and Individual Models on Malware Detection Tasks

Minakshi Arya (NDSU), Shubhavi Arya (Indiana Univ.), Saatvik Arya (Univ. of Washington)
IOS: A Low Cost Defense to Mitigate Meltdown and Spectre like Attacks

Xin Wang (Virginia Commonwealth Univ.), Wei Zhang (Univ. of Louisville)
Authentication in High Noise Environments using PUF-Based Parallel Probabilistic Searches

Brian Donnelly, Michael Gowanlock (Northern Arizona Univ.)
Intel Xeon Optimization for Efficient Media Workload Acceleration

Karan Puttannaiah, Rajesh Poornachandran (Intel)
Towards an End-to-End Processing-in-DRAM Acceleration of Spectral Library Search

Tianyun Zhang, Eric Tang (Carnegie Mellon Univ.), Farzana A Siddique, Kevin Skadron (Univ. of Virginia), Franz Franchetti (Carnegie Mellon Univ.)
Neuromorphic Circuits with Spiking Astrocytes for Increased Energy Efficiency, Fault Tolerance, and Memory Capacitance

Murat Isik (Drexel Univ.), Kaushal Gawri (SemaAI), Maurizio De Pitta (University Health Network)

1-2: Advanced Processor Architectures Session (12:30-13:45)

Co-Chairs: M. Barnell & K. Gettings

VeBPF Many-Core Architecture for Network Functions in FPGA-based SmartNICs and IoT

Zaid Tahir (Boston Univ.), Ahmed Sanaullah (Red Hat), Sahan Bandara (Boston Univ.), Ulrich Drepper (Red Hat), Martin Herbordt (Boston Univ.)
Hunting the Needle – The Potential of Innovation in Architecture

Peter Kogge (Univ. of Notre Dame), Janice McMahon (Self), Timothy Dysart (Tactical Computing Labs)
Predictive Performance of Photonic SRAM-based In-Memory Computing for Tensor Decomposition [Best Student Paper Award]

Sasindu Wijeratne (USC), Sugeet Sunder (USC Information Sciences Institute), Md Abdullah-Al Kaiser, Akhilesh Jaiswal (Univ. of Wisconsin), Clynn Mathew, Ajey Jacob (USC Information Sciences Institute), Viktor K Prasanna (USC)
A Multilevel Approach For Solving Large-Scale QUBO Problems With Noisy Hybrid Quantum Approximate Optimization

Filip B Maciejewski (NASA/USRA), Bao Gia Bach (Univ. of Delaware), Maxime Dupont (Rigetti Computing), Paul A Lott (Universities Space Research Association), Bhuvanesh Sundar (Rigetti Computing), David Neira (Purdue University/USRA), Ilya Safro (Univ. of Delaware), Davide Venturelli (Universities Space Research Association)

1-P2 (13:45-14:45): Poster Session 1-2

Chair(s)/Host(s): P. Luszczek

Quantum Machine Learning in the Cognitive Domain: Alzheimer’s Disease Study

Emine Akpinar (Yıldız Technical Univ.)
On the Design of the Quantum-Classical Hybrid-Service Architecture

Yi Liu, Yuchou Chang (UMass Dartmouth)
Quantum Computing for Data Calibration in Parallel Magnetic Resonance Imaging Reconstruction

Girish Babu Reddy, Gulfam A Saju, Yi Liu, Yuchou Chang (UMass Dartmouth)
Ultra Low Latency Hardware Optimised Radix-4 FFT for Optical Wireless FPGA Transceiver’s via Hermitian Symmetry Characteristics

Michael Codd, Ciara McDonald (Maynooth Univ.), Yiyue Jiang, Chunan Chen (Northeastern Univ.), Holger Claussen (Tyndall National Institute), Miriam Leeser (Northeastern Univ.), John Dooley (Maynooth Univ.)
Fully Transparent Client-Side Caching for Key-Value Store Applications Using FPGAs

Sahan Bandara, Noah Cherry, Martin Herbordt (Boston Univ.)
Impact of Grid Processing on Signal Cross-Correlation

Rhea Senthil Kumar, Nathan Simard, Jonathan Mathews, Jeremy Kepner, Timothy Collard (MIT Lincoln Laboratory)

1-3: ASIC and FPGA Advances Session (14:15-15:30)

Co-Chairs: C. Long & S. Shankar

A High-Performance Curve25519 and Curve448 Unified Elliptic Curve Cryptography Accelerator

Aniket Banerjee (IISc), Utsav Banerjee (Indian Institute of Science)
Direct RF FPGAs built with Multi-Chip Packaging Overcome Technology Challenges

Marjorie Catt, Dustin J Henderson (Altera)
A Run-Time Configurable NTT Architecture for Homomorphic Encryption Based on 3D Algorithm

Weicong Lu, Xiaojie Chen, Dihu Chen, Tao Su (Sun Yat-Sen Univ.)
Optimizing FPGA Memory Allocation for Matrix-Matrix Multiplication using Bayesian Optimization

Mehmet Gungor, Stratis Ioannidis, Miriam Leeser (Northeastern Univ.)
pc-COP: An Efficient and Configurable 2048-p-Bit Fully-Connected Probabilistic Computing Accelerator for Combinatorial Optimization

Kiran Magar (IISc), Shreya Bharathan (National Inst. of Tech., Tiruchirappalli), Utsav Banerjee (Indian Institute of Science)

1-4: BRAINS – Building Resilience through Artificial Intelligence for Networked Systems Session (15:45-17:30)

Co-Chairs: S. Pisharody & J. Holodnak

Invited Talk: The SWARM Project: Reimagining Workflow and Resource Management Systems with Swarm Intelligence

Prasanna Balaprakash (ORNL)
Invited Talk: The Convergence of Intuitive AI and Exascale Computing: Redefining What’s Possible

Eliu Huerta (ANL)
Invited Talk: The National Cybersecurity Strategy: A Progress Report

Robert Knake (Orkestrel)
Invited Talk: Operational AI/ML Opportunities

Scott Weed (US Air Force)
Hardware Trojan Detection Utilizing Graph Neural Networks and Structural Checking

Hunter Nauman, Jia Di (Univ. of Arkansas)
Break

Composable Mission-Critical Embedded System Architecture for High Assurance

Michael Vai, Eric Simpson, Alice Lee, Huy Nguyen, Jeffrey Hughes, Ben Nahill, Jeffery Lim, Roger Khazan, Sean O’Melia (MIT Lincoln Laboratory), Fred Schneider (Cornell University)
What is Normal? A Big Data Observational Science Model of Anonymized Internet Traffic

Jeremy Kepner, Hayden Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill (MIT Lincoln Laboratory), Aydin Buluc (LBNL), Chansup Byun (MIT Lincoln Laboratory), Timothy Davis (Texas A&M), Vijay Gadepally (MIT Lincoln Laboratory), Daniel Grant (GreyNoise), Michael Houle, Matthew Hubbell, Piotr Luszczek (MIT Lincoln Laboratory), Lauren Milechin (MIT), Chasen Milner, Guillermo Morales (MIT Lincoln Laboratory), Andrew Morris (GreyNoise), Julie Mullen, Ritesh Patel (MIT Lincoln Laboratory), Alex Pentland (MIT), Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Peter Michaleas (MIT Lincoln Laboratory)
Invited Talk: National Centers of Academic Excellence in Cybersecurity Program

Teddy Lynch (NSA)
Keynote Talk: Verification in ML

Shafi Goldwasser (Simons Theory of Computing Institute)

Tuesday, September 24

2-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: Energy Efficiency Scaling for 2 Decades (EES2) Roadmap for Computing

Tina Kaarsberg (Dept. of Energy)

2-1: Case Studies, Benchmarking, and Tools 1 Session (11:00-12:15)

Co-Chairs: B. Raut & C. Byun

A Neural Network Based GCC Cost Model for Faster Compiler Tuning

Hafsah Shahzad (Boston Univ.), Ahmed Sanaullah, Sanjay Arora, Ulrich Drepper (Red Hat), Martin Herbordt (Boston Univ.)
HENNC: Hardware Engine for Artificial Neural Network-Based Chaotic Oscillators

Mobin Vaziri (Polytechnique Montréal), Shervin Vakili (Institut National de la Recherche Scientifique), Mohammad Mehdi Rahimifar (Interdisciplinary Institute for Technological Innovation), Pierre Langlois (Polytechnique Montréal)
A Graph-Based Algorithm for Optimizing GCC Compiler Flag Settings

Reza Sajjadinasab (Boston Univ.), Sanjay Arora, Ulrich Drepper, Ahmed Sanaullah (Red Hat), Martin Herbordt (Boston Univ.)
Analyzing an In-line Compression on the Matrix Matrix Multiplication Kernel

Steven Platt, Jon C Calhoun (Clemson Univ.)
On the Scalability of Computing Genomic Diversity Using SparkLeBLAST: A Feasibility Study [Outstanding Student Paper Award]

Ritvik R Prabhu, Bernard Moussad (Virginia Tech), Karim Youssef (LLNL), Emil Vatai (RIKEN), Wu-chun Feng (Virginia Tech)

2-P1 (12:15-13:15): Poster Session 2-1

Chair(s)/Host(s): K. Keville

Running GraphBLAS on the FABRIC testbed [Outstanding Short Paper Award]

Vaneshi Ramdhony, Hyunsuk Bang, Nik Sultana (Illinois Institute of Technology)
Solving Hard Combinatorial Problems in Parallel Using Lift-and-Project Preconditioning

Bogdan Zavalnij (Renyi Institute)
Community Detection in Stochastic Block Model Variations

Allison I Gunby-Mann, Peter Chin (Dartmouth Coll.)
Hypersparse Traffic Matrices from Suricata Network Flows using GraphBLAS [Outstanding Short Paper Award]

Michael D Houle, Michael Jones (MIT Lincoln Laboratory), Dan Wallmeyer, Risa Brodeur, Justin Burr (Center for Internet Security), Hayden Jananthan (MIT Lincoln Laboratory), Sam Merrell (Center for Internet Security), Peter Michaleas (MIT Lincoln Laboratory), Anthony Perez (Center for Internet Security), Andrew Prout, Jeremy Kepner (MIT Lincoln Laboratory)

2-2: Case Studies, Benchmarking, and Tools 2 Session (12:30-13:45)

Co-Chairs: C. Valentine & C. Byun

Characterization and Optimization of the Fitting of Quantum Correlation Functions

Pi-Yueh Chuang, Niteya M Shah (Virginia Tech), Patrick Barry, Ian Cloet, Emil Constantinescu (Argonne National Laboratory), Nobuo Sato (Jefferson Lab), Wu-chun Feng (Virginia Tech)
Elucidating US Import Supply Chain Dynamics: A Spatial-Temporal Graph Neural Network Approach

Nikolay Aristov (MIT-CTL), Ziyan Li, Thomas Koch, Elenna Dugundji (MIT)
The Genomic Computing Revolution: Defining the Next Decades of Accelerating Genomics

Harisankar Sadasivan (AMD), Artur Klauser (-NA-), Juergen Hench (University Hospital Basel), Yatish Turakhia (UCSD), Gagandeep Singh, Alberto Zeni (AMD), Sarah Beecroft (Pawsey Supercomputing Research Centre), Satish Narayanasamy (University of Michigan), Jeff Nivala (Univ. of Washington Seattle), Bob Robey (AMD), Onur Mutlu (ETH), Kristof Denolf, Gina Sitaraman (AMD)
Comparison of Vectorization Capabilities of Different Compilers for X86 and ARM CPUs

Nazmus Sakib (New Mexico State Univ.), Tarun Prabhu, Nandakishore Santhi (LANL), John Shalf (LBNL), Abdel-Hameed Badawy (New Mexico State Univ.)

2-P2 (13:45-14:45): Poster Session 2-2

Chair(s)/Host(s): TBD

Performance Analysis of Falcon Post-Quantum Cryptography in Embedded Hardware-Software Integration [Outstanding Short Paper Award]

John Biselx (HES-SO), Andrea Guerrieri (HES-SO and EPFL)
A Performance Analysis of GPU-Aware MPI Implementations Over the Slingshot-11 Interconnect

Michael Beebe (Texas Tech Univ.), Rahulkumar Gayatri, Kevin Gott, Adam Lavely (LBNL), Muhammad Haseeb (Nvidia), Brandon Cook (LBNL), Yong Chen (Texas Tech Univ.)
Application of Virtual Client for Azure Hardware Qualification

Anna Mary Mathew, Bryan DeYoung, Michael Chhor, Sharjil Khan (Microsoft)
Applying Natural Language Processing for Initial Categorizing of Product Descriptions

Nikolay Aristov, Thomas Koch, Elenna Dugundji (MIT-CTL)
Privacy-Preserving AI for Document Understanding with Controlled Unclassified Information

Scott M Sawyer (Paperless Parts, Inc.)

2-3: Graph Analytics & Network Science 1 Session (14:15-15:30)

Co-Chairs: P. Luszczek & O. Green

Multilevel Diffusion Based Spectral Graph Clustering [Outstanding Paper Award]

Malik Lechekhab, Dimosthenis Pasadakis, Olaf Schenk (Univ. della Svizzera italiana)
Batch-Parallel Compressed Sparse Row: A Locality-Optimized Dynamic-Graph Representation [Outstanding Student Paper Award]

Brian Wheatman, Randal Burns (Johns Hopkins Univ.), Helen Xu (Georgia Inst. of Tech.)
Indexed Binary Operations in the GraphBLAS

Tim Mattson (Human Learning Group), Manaswinee Bezbaruah, Matthias Maier (Texas A&M Univ.), Scott McMillan (CMU Software Engineering Institute), Michel Pelletier (Graphegon), Erik Welch (Nvidia), Timothy A Davis (Texas A&M Univ.)
VF2-PS: Parallel and Scalable Subgraph Monomorphism in Arachne

Mohammad Dindoost, Oliver Alvarado Rodriguez (New Jersey Inst. of Tech.), Sounak Bagchi (Edison Academy Magnet School), Palina Pauliuchenka, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)
MESM: A Query-Agnostic and Memory-Efficient Parallel Subgraph Matching Algorithm

Shubhashish Kar, Shaikh Arifuzzaman (UNLV)

2-4: Graph Analytics & Network Science 2 Session (15:45-17:30)

Co-Chairs: X. Sun & TBD

Constant-Memory Graph Coarsening

Christopher Brissette (NVIDIA), George M Slota (RPI)
Algebraic Vertex Ordering of a Sparse Graph for Adjacency Access Locality and Graph Compression

Dimitris Floros (Duke Univ.), Nikos P Pitsianis (Aristotle Univ. of Thessaloniki), Xiaobai Sun (Duke Univ.)
An Efficient Multi-core Parallel Implementation of SSSP Algorithm with Decreasing Delta-stepping

Rakibul Hassan, Shaikh Arifuzzaman (UNLV)
IRIS-MEMFLOW: Data Flow Enabled Portable Memory Orchestration in IRIS Runtime for Diverse Heterogeneity

Mohammad Alaul Haque Monil, Narasinga Rao Miniskar, Seyong Lee, Beau Johnston, Pedro Valero-Lara, Aaron Young, Keita Teranishi, Jeffrey Vetter (ORNL)
A Deployment Tool for Large Scale Graph Analytics Framework Arachne

Garrett R Gonzalez-Rivas, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)

2-S1: GraphBLAS BoF Special (17:30-19:30)

Organizers: T. Mattson, B. Brock & S. McMillan

Report on the binsparse Specification

Ben Brock (Intel)
SuiteSparse Update

Tim Davis (Texas A&M Univ.)
Python GraphBLAS Update

Julia GraphBLAS Update

Raye Kimmerer (MIT)
Postgres GraphBLAS Update

Michel Pelletier (Graphegon)
Wild and Crazy Ideas for GraphBLAS 3.0

Raye Kimmerer (MIT)
Keynote Talk: The Future of Sparse Computing is Compilers

Fredrik Kjolstad (Stanford)

Wednesday, September 25

3-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: The Building Blocks of Cloud – Research Enablement

Scott Yockel (Harvard Univ.)

3-1: AI / Machine Learning 1 Session (11:00-12:15)

Co-Chairs: P. Luszczek & P. Monticciolo

ModelGauge: Inference Profiling of Deep-Learning Models [Outstanding Paper Award]

Calvin B Gealy (Univ. of Pittsburgh), David Langerman (NSF SHREC), Alan George (NSF Center for High Performance Reconfigurable Computing)
Enhanced Knowledge Graph Attention Networks for Efficient Graph Learning [Outstanding Student Paper Award]

Fernando P Vera Buschmann, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)
Mobile-Optimized Vessel Segmentation for Ultrasound-Guided Surgical Procedures

Mateusz Wolak, Fin Amin, Nancy DeLosa, Brian A Telfer, Benjamin Roop, Lars Gjesteby (MIT Lincoln Laboratory)
GLITCHES: GPU-FPGA LLM Inference Through a Collaborative Heterogeneous System

Fan Yang (Tsinghua Univ., SenseTime Inc.), Xinhao Yang, Hongyi Wang, Zehao Wang, Zhenhua Zhu, Shulin Zeng, Yu Wang (Tsinghua Univ.)
Graphical Learning Optimization and Dimensionality Reduction with Geometric Multi-Resolution Analysis

Felicia Schenkelberg, Allison I Gunby-Mann, Emma Graham (Dartmouth Coll.), Shuoxuan Li (Carnegie Mellon Univ.), Peter Chin (Dartmouth Coll.)

3-P1 (12:15-13:15): Poster Session 3-1

Chair(s)/Host(s): K. Keville

CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers [Outstanding Short Paper Award]

Lakshmi V Nair (Lightmatter)
Efficiency of Data Intensive Computing (DIC) in MEMS Research for Data Processing and Analysis

Yeligay Segizbay (Nazarbayev Univ.)
Capturing the Carbon Impact of Deep Learning

Alexis Corona, Sanmukh Kuppannagari (Case Western Reserve Univ.)
Transfer Learning Assisted Parameter Selection for Water-Fat Separation in Dixon MRI

Alan Okinaka (Ursinus College), Gulfam A Saju, Yuchou Chang (UMass Dartmouth)
Traditional Costume Image Classification for Indian States Using Deep Learning

Sahana R Koti, Sahana Channappa Jatti, Anupama S Nandeppanavar, Medha Kudari (KLE Institute of Technology)
Scalable Approach for Analytic Polynomial Subspace Projection Matrices for a Space-Time Covariance Matrix

Faizan Ahmad Khattak, Mohammed Bakhit, Ian K. Proudler, Stephan Weiss (Univ. of Strathclyde)

Tutorial Session: 3-T (12:15-15:45): Spiral Tutorial

Organizer(s): F. Franchetti and M. Franusich

3-2: Scaling Research Computing Education Session (12:30-13:45)

Co-Chairs: J. Mullen, L. Milechin & H. Jananthan

Invited Talk: Scaling Project-based Learning from Education to Research

Joel Grimm (MIT Lincoln Laboratory)
Invited Talk: Educational Game Dev from Start to Finish: A Short Example

Chasen Milner (USAF)
Invited Talk: HPC-ED: A Federated Catalog to Share and Discover CyberTraining Materials

Susan Mehringer (Cornell Center for Advanced Computing)
Invited Talk: The Wide Area Classroom – 10 Years On

John Urbanic (CMU and Pittsburgh Supercomputing Center)

3-P2 (13:45-14:45): Poster Session 3-2

Chair(s)/Host(s): K. Cain

Gesture Controlled System to Automate Shutdown, Screenshot and Volume Toggle

Prisha Bhosale, Ananya Dandekar, Ria Dcosta, Sri Aishwarya Jonnavittula, Shagufta Rajguru (Fr. Conceicao Rodrigues Institute of Technology)
Machine Learning Application for Smart Network Traffic Prediction

Islam Omar (New Mexico State Univ.), Whit Schonbein (SNL), Hameed Badawy (New Mexico State Univ.)
Model to Predict Inventory Demand in Retail SMEs Using CRISP-DM and Machine Learning

Jhomax R Torres, Diego Moises Carpio Andia, Victor Parasi (Univ. Peruana de Ciencias Aplicadas)
Determination of Game-based Design Equilibria by Using Machine Learning Approach

Sara Karimi, Ehsan Ghotbi (Alfred Univ.)
The Analysis of the Sparse Multi-GPU Parallel Method on the Large Sparse Power Flow Calculation

Lei Zeng, Shadi Alawneh (Oakland Univ.)

3-3: AI / Machine Learning 2 Session (14:15-15:30)

Co-Chairs: H. Badawy & C. Long

A Dynamic Weighting Strategy to Mitigate Worker Node Failure in Distributed Deep Learning

Yuesheng Xu, Arielle Carr (Lehigh Univ.)
P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving

Mohamed R. Elshamy (New Mexico State Univ.), Heba Emara (Pyramids High Institute of Electronic Engineering), Mohamed Nanyang Shoaib (Nanyang Tech. Univ.), Hameed Badawy (New Mexico State Univ.)
Spike-driven YOLO: Ultra Low-Power Object Detection with Neuromorphic Computing

Mark Barnell, Courtney Raymond, Lisa Loomis (AFRL), Francesca Vidal, Daniel Brown, Darrek Isereau (SRC)
Exploring Sparse Inference with SuiteSparse:GraphBLAS

Deepak Suresh, Timothy A Davis (Texas A&M Univ.)
Improving Regression in Spiking Neural Networks for Oceanographic Data Analysis

Alissa Kane, Yuchou Chang (UMass Dartmouth)

3-4: General Purpose GPU Computing 1 Session (15:45-17:30)

Co-Chairs: S. Gottlieb & N. Prajapati

Benchmarking Thread Block Cluster [Best Paper Award]

Tim Lühnen, Tobias Marschner, Sohan Lal (TU Hamburg)
Understanding the Efficacy of Power Profiles: A Case Study of AMD Instinct MI100 GPU

Ghazanfar Ali, Mert Side (Texas Tech Univ.), Sridutt Bhalachandra (Univ. of North Carolina), Tommy Dang, Alan Sill, Yong Chen (Texas Tech Univ.)
Community Detection for Large Graphs on GPUs with Unified Memory

Emre Dinçer, Işıl Öz (Izmir Institute of Technology)
Invited Talk: From Simple to Hyper Co-Design of HPC Platforms

Gary Grider (Los Alamos National Laboratory)
Invited Talk: Lessons Learned from Implementing the Anonymized Network Sensing Graph Challenge with GPUs and Commodity Software

Siddharth Samsi, Dan Campbell, Emanuel Scoullos, Oded Green (NVIDIA)

3-S1: LLMs: Opportunities & Challenges Special (17:30-19:30)

Organizers: V. Gadepally & D. Burrill

Invited Talk: Generative AI in the DoD: Use Cases and Challenges

Manuel Xavier Lugo (US Navy)
Invited Talk: How to Make an LLM Understand Human Conversation for Fun & Profit

Kartik Talamadupula (Symbl.ai)
Invited Talk: Innovations for Reducing the Environmental Impact of LLMs

Boris Gamazaychikov (Salesforce)
Invited Talk: Tackling Generative AI Productivity and Efficiency Challenge with Intel® Gaudi® 3 AI Accelerators

Vasudev Lal (Intel)

Thursday, September 26

4-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: Convergence across the Computing Continuum: The NSF Leadership Class Computing Facility meets the Edge, Interactive Computing, and Low-Precision AI

Dan Stanzione (Texas Advanced Computing Center)

4-1: AI at Scale and AI on the Edge Session (11:00-12:15)

Co-Chairs: B. Sroka & K. Gettings

Breakthrough Edge AI Inference Performance using NorthPole in 3U VPX Form Factor

Filipp Akopyan, William Risk, John Arthur, Andrew Cassidy, Michael Debole, Carlos Ortega Otero, Jun Sawada, Evan Colgan, Michael Criscolo (IBM Research), Phillip Mann (IBM), Heinz Baier, Kai Schleupen, Arnon Amir (IBM Research), Alexander Andreopoulos (IBM), Rathinakumar Appuswamy, Deepika Bablani, Peter Carlson, Pallab Datta, Steven Esser, Myron Flickner, Rajamohan Gandhasri, Guillaume Garreau, Megumi Ito, Jennifer Klamo, Jeffrey Kusnitz, Nathaniel McClatchey, Neil McGlohon, Jeffrey McKinstry, Yutaka Nakamura (IBM Research), Tapan Nayak (IBM Corporation), Jay Sivagnaname, Daniel Smith, Rafael Sousa, Brian Taba, Ignacio Terrizzano, Takanori Ueda, Dharmendra Modha (IBM Research)
Breakthrough LLM Inference Performance using NorthPole

Rathinakumar Appuswamy, Michael Debole, Brian Taba, Steven Esser, Andrew Cassidy, Arnon Amir (IBM Research), Alexander Andreopoulos (IBM), Deepika Bablani, Pallab Datta, Jeffrey Kusnitz, Nathaniel McClatchey, Neil McGlohon, Jeffrey McKinstry (IBM Research), Tapan Nayak (IBM Corporation), Daniel Smith, Rafael Sousa, Ignacio Terrizzano, Filipp Akopyan, Peter Carlson, Rajamohan Gandhasri, Guillaume Garreau, Nelson Gonzalez, Megumi Ito, Jennifer Klamo, Yutaka Nakamura, Carlos Ortega Otero, William Risk, Jun Sawada, Kai Schleupen, Jay Sivagnaname, Matthew Stallone, Takanori Ueda, Myron Flickner, John Arthur (IBM Research), Rameswar Panda, David Cox (MIT-IBM Watson AI Lab), Dharmendra Modha (IBM Research)
A Framework to Enable Algorithmic Design Choice Exploration in DNNs

Timothy Cronin, Sanmukh Kuppannagari (Case Western Reserve Univ.)
Benchmarking Edge AI Platforms for High-Performance ML Inference

Rakshith Jayanth, Neelesh Gupta, Viktor K Prasanna (USC)
Transformers: A Graph Processing Perspective

Manish Sri Sai Surya Routhu, Sai Dheeraj Yanduru, Nathaniel K Tomczak, Sanmukh Kuppannagari (Case Western Reserve Univ.)

4-P1 (12:15-13:15): Poster Session 4-1

Chair(s)/Host(s): K. Keville

Perspective-Aware AI (PAi) for Augmenting Critical Decision Making

Marjan Alirezaie, Daniel Platnick (Flybits), Hossein Rahnama, Dava Newman, Alex Pentland (MIT)
Evaluating the Impact of Noisy Blades on PROPELLER MRI Reconstruction Quality

Gulfam A Saju, Marjan Akhi, Yuchou Chang (UMass Dartmouth)
CompJouleS: Energy Estimate Tool for Machine Learning Algorithms for Multiple Applications in CPU, GPU, and FPGA Architectures

Murat Isik (Stanford Univ.), Jens E. Pedersen (SLAC National Accelerator Laboratory), Vedant Karia (Univ. of Texas at San Antonio), Sadasivan Shankar (Stanford Univ.)
Power Efficient Deep Learning Acceleration using Intel Xeon® Processors

Xiaofei Jiang, Mona Minakshi, Rajesh Poornachandran, Shamima Najnin (Intel)
Impact of Estimation Errors of a Matrix of Transfer Functions onto Its Analytic Singular Values and Their Potential Algorithmic Extraction

Mohammed Bakhit, Faizan Ahmad Khattak, Ian K. Proudler, Stephan Weiss (Univ. of Strathclyde)
Disaggregation Patterns for Secure AI Systems

Mohamed Ghamri, Marc A Lacoste, Divi De Lacour (Orange)

4-2: Large AI Models Session (12:30-13:45)

Co-Chairs: N. Pitsianis & B. Sroka

MonoCoder: Domain-Specific Code Language Model for HPC Codes and Tasks [Outstanding Paper Award]

Tal Kadosh (Ben-Gurion Univ., IAEC), Niranjan Hasabnis (Intel), Vy Vo (Intel Labs), Nadav Schneider (Ben-Gurion University), Neva Krien (Independent), Mihai Capotă (Intel Labs), Abdul Wasay, Guy Tamir (Intel), Theodore L Willke, Nesreen Ahmed (Intel Labs), Yuval Pinter (Ben-Gurion University), Tim Mattson (Human Learning Group), Gal Oren (Technion, Stanford Univ.)
LLM Inference Serving: Survey of Recent Advances and Opportunities [Outstanding Paper Award]

Baolin Li, Yankai Jiang (Northeastern Univ.), Vijay Gadepally (MIT Lincoln Laboratory), Devesh Tiwari (Northeastern Univ.)
Enhancing Code Translation in Language Models with Few-Shot Learning via Retrieval-Augmented Generation

Manish Bhattarai, Javier E Santos, Shawn M Jones, Ayan Biswas, Boian Alexandrov, Daniel O’Malley (LANL)
High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures

Xiang Fu, Xinpeng Zhang, Jixiang Ma (Nanchang Hangkong Univ.), Peng Zhao (Microsoft), Shuai Lu (Nanchang Hangkong Univ.), Xu Liu (Univ. of Washington)
Accelerating Sensor Fusion in Neuromorphic Computing: A Case Study on Loihi-2

Murat Isik (Drexel Univ.), Karn Tiwari (IISc Bangalore), Burak Eryilmaz (Bilkent Univ.), Ismail Can Dikmen (TEMSA)

4-P2 (13:45-14:45): Poster Session 4-2

Chair(s)/Host(s): TBD

NeuroVM: Dynamic Neuromorphic Hardware Virtualization

Murat Isik (Drexel Univ.), Kayode Inadagbo (Prairie View A&M Univ.), Ismail Can Dikmen (TEMSA)
LLMs for Closed-Library Multi-Document Query, Test Generation, and Evaluation

Claire Randolph (Dept. of the Air Force), Adam Michaleas, Darrell O Ricke (MIT Lincoln Laboratory)
LLM-Based Task Planning for Navigating Companion Robot from Emotion Signals

Yuchou Chang (UMass Dartmouth), Huy Anh Pham (Intelligent Medical Objects, Inc.), Gulfam A Saju (UMass Dartmouth)
Large Multimodal Model for Simulating Big Training Data in Deep PROPELLER MRI

Gulfam A Saju, Marjan Akhi, Yuchou Chang (UMass Dartmouth)
Artificial Intelligence Solution on Intel Xeon Processor Power and Performance Engineering

Zhongbin Liu, Xiaofei Jiang, Jiajia Zhang (Intel)
Boosting the Performance of Reinforcement Learning-based Task Scheduling using Offline Inference

Chedi Morchdi (Univ. of Utah), Cheng-Hsiang Chiu (Univ. of Wisconsin), Yi Zhou (Univ. of Utah), Tsung-Wei Huang (Univ. of Wisconsin)

4-3: Innovative Computing Session (14:15-15:30)

Co-Chairs: K. Keville & P. Luszczek

Reinforcement Learning-generated Topological Order for Dynamic Task Graph Scheduling

Cheng-Hsiang Chiu (Univ. of Wisconsin), Chedi Morchdi, Yi Zhou (Univ. of Utah), Boyang Zhang, Che Chang, Tsung-Wei Huang (Univ. of Wisconsin)
FPGA Acceleration for Scalable High-Resolution OPIR Target Detection

Daniel C Stumpp (Univ. of Pittsburgh), Alan George (NSF Center for High Performance Reconfigurable Computing)
Hybrid Computing Architecture Based on Analog Phase-Change Memory Chips for Deep Neural Network Training

Zhenhao Jiao (Univ. of Science and Technology of China), Tao Hong, Xiaogang Chen, Weibang Dai (Shanghai Institute of Microsystem and Information Technology, CAS), Chengcai Tu (Donghua University), Shunfen Li, Houpeng Chen, Zhitang Song (Shanghai Institute of Microsystem and Information Technology, CAS)
Exploring the Trade-off Between Repair Time and Reliability in Large Scale Cluster Computers: A Simulation-Based Approach

Leslie Horace (Georgia Inst. of Tech.), Craig Walker, William M Jones (Coastal Carolina Univ.), Nathan DeBardeleben, Vivian Hafener, Steven Senator (LANL)
Experiences with VITIS AI for Deep Reinforcement Learning

Nabayan Chaudhury, Atharva M Gondhalekar, Wu-chun Feng (Virginia Tech)

4-4: Graph Challenge Session (15:45-17:30)

Co-Chairs: J. Kepner & A. Reuther

Mercury: Efficient Subgraph Matching on GPUs with Hybrid Scheduling

Zhiheng Lin (Inst. of Computing Tech, CAS), Changjie Xu (UCAS), Ke Meng, Guangming Tan (Inst. of Computing Tech, CAS)
Towards Faster Graph Partitioning via Pre-training and Inductive Inference

Meng Qin (HKUST), Chaorui Zhang (Huawei), Yu Gao (Independent), Yibing Ding, Weipeng Jiang (Huawei), Weixi Zhang (Huawei Technologies), Wei Han (Huawei), Bo Bai (Huawei Technologies)
Distributed-Memory Sparse Deep Neural Network Inference Using Global Arrays

Sayan Ghosh, Bruce Palmer, Andres Marquez (PNNL)
Anonymized Network Sensing Graph Challenge

Hayden R Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill (MIT Lincoln Laboratory), Aydin Buluc (LBNL), Chansup Byun (MIT Lincoln Laboratory), Timothy Davis (Texas A&M), Vijay Gadepally (MIT Lincoln Laboratory), Daniel Grant (GreyNoise), Michael Houle, Matthew Hubbell, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Chasen Milner, Guillermo Morales (MIT Lincoln Laboratory), Andrew Morris (GreyNoise), Julie Mullen, Ritesh Patel (MIT Lincoln Laboratory), Alex Pentland (MIT), Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge

Zhaoyang Han, Andrew Briasco-Stewart (Northeastern Univ.), Michael Zink (UMass Amherst), Miriam Leeser (Northeastern Univ.)
Sans: Streaming Anonymized Network Sensing

Ketai Zhao, Yuhang Zhou, Hong Xu Pan, Zhibin Wang, Sheng Zhong, Chen Tian (Nanjing Univ.)

Friday, September 27

5-K: Keynote Session (10:30-11:00)

Co-Chairs: J. Kepner & A. Reuther

Keynote Talk: AI/ML Applications for Global Security

Eric Evans (MIT)

5-1: High Performance Computing 1 Session (11:00-12:15)

Co-Chairs: D. Ricke & B. Raut

Cycle-Stealing in Load-Imbalanced HPC Applications [Outstanding Student Paper Award]

Po Hao Chen (Brown Univ.), Akshaya Bali, Shining Yang, Pouya Haghi, Carlton Knox, Benjamin Li (Boston Univ.), Amr Abouelmagd, Anthony Skjellum (Tennessee Tech), Martin Herbordt (Boston Univ.)
Tightly-Coupled FPGA Accelerator for Molecular Dynamics Simulation: Hardware-Software Co-Design and Fine-Grained Task Management [Outstanding Student Paper Award]

Zekang Cheng, Zerong S He, Xi Jin (Univ. of Science and Technology of China)
MST in Incremental Graphs Through Tree Contractions

Akanksha Dwivedi, Sameer Sharma, Dip Sankar Banerjee (IIT Jodhpur)
Syndeo: Portable Ray Clusters with Secure Containerization

William Li, Rodney S Lafuente Mercado, Jaime Pena, Ross E Allen (MIT Lincoln Laboratory)
Evaluating One-Sided Communication on Graph500 with MPI-RMA and OpenSHMEM

Jefferson Boothe (Univ. of Pittsburgh), Alan George (NSF Center for High Performance Reconfigurable Computing)

5-P1 (12:15-13:15): Poster Session 5-1

Chair(s)/Host(s): K. Keville

Synthesizing Numerical Linear Algebra using Julia [Best Short Paper Award]

Sophie Xuan, Rabab MA Alomairy, Evelyne Ringoot, Felipe Tome, Julian Samaroo, Alan Edelman (MIT)
Towards LibraryX: A Framework for Cross-Library Call Optimization [Outstanding Short Paper Award]

Sanil Rao, Anant Prakash, Franz Franchetti (Carnegie Mellon Univ.)
Compressed Cannon’s Algorithm

Louis Jencka, Amanda J Bienz (Univ. of New Mexico)
Multiplication of Sparse Matrices and their Transpose using Compressed Sparse Diagonals

Sardar Anisul Haque (Oryx Universal College in Partnership with LJMU (UK)), Mohammad Tanvir Parvez (Qassim Univ.), Shahadat Hossain (Univ. of Northern British Columbia)
Augmenting HPC Profilers with Analysis Capabilities

Abhishek N Patil, Shamjith K V, Senthil Kumar RK, S D Sudarsan (C-DAC)
Explainable DiGCNs for Decomposition of Opaque Node Ranking Functions

Vishal Chandra (MIT Lincoln Laboratory)

5-2: High Performance Computing 2 Session (12:30-13:45)

Co-Chairs: D. Ricke & J. Mullen

An Efficient Multi-DNN Accelerator Based on Multiple Systolic Array

Jianjun Chen, Han Jiao, Wenjin Huang, Yihua Huang (Sun Yat-Sen Univ.)
JACC.shared: Leveraging HPC Metaprogramming and Performance Portability for Computations That Use Shared Memory GPUs

Pedro Valero-Lara, William Godoy, Keita Teranishi, Jeffrey Vetter (ORNL)
OCO-GAT: An Accelerator for Graph Attention Network with Optimized Calculation Order

Qi Liu, Wenjin Huang, Wenlu Peng, Yihua Huang (Sun Yat-Sen Univ.)
Task-Level Parallelism for the Multifrontal Method in Tightly Coupled CPU-FPGA Architectures

Zerong S He, Zekang Cheng, Zhongguang Xu, Xi Jin (Univ. of Science and Technology of China)
LLload: An Easy-to-Use HPC Utilization Tool

Chansup Byun, Albert Reuther, Julie Mullen, LaToya Anderson, William Arcand, Bill Bergeron, David Bestor, Alexander Bonn, Daniel Burrill, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Guillermo Morales, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)

5-P2 (13:45-14:45): Poster Session 5-2

Chair(s)/Host(s): K. Keville

A Framework for Analyzing the Performance of Sparse Matrix and Graph Operations

Khaled Abdelaal, Richard M Veras (Univ. of Oklahoma)
Efficient Eigenvalue Computation of Parahermitian Matrices Using Neural Networks

Diyari A. Hassan (Qaiwan Intl. Univ.), Yunus Egi, Soydan Redif (American Univ. of Middle East)
Towards a RISC-V Instruction Set Extension for Multi-word Arithmetic

Youngjin Eum, Naifeng Zhang, Larry Tang, Franz Franchetti (Carnegie Mellon Univ.)
AstraMQ: Distributed MQTT Broker

Rohan M Doshi, Sanika Inamdar, Tanmay Karmarkar, Madhuri S Wakode (Pune Inst. of Computer Tech.)
Computational and Numerical Properties of a Broadband Subspace-Based Likelihood Ratio Test

Cornelius Pahalson, Louise Crockett, Stephan Weiss (Univ. of Strathclyde)

5-3: High Performance Computing 3 Session (14:15-15:30)

Co-Chairs: J. Mullen & N. Pitsianis

Persistent and Partitioned MPI for Stencil Communication

Gerald Collom (Univ. of New Mexico), Jason Burmark, Olga Pearce (LLNL), Amanda J Bienz (Univ. of New Mexico)
HPC Network Simulation Tuning via Automatic Extraction of Hardware Parameters

Joshua Suetterlein, Stephen Young, Jesun S Firoz, Joseph Manzano, Nathan Tallent, Ryan Friese, Kevin Barker, Timothy Stavenger (PNNL)
Accelerating Multi-Agent DDPG Training on Multi-GPU Platforms

Samuel Wiggins, Viktor K Prasanna (USC)
Binary Bleed: Fast Distributed and Parallel Method for Automatic Model Selection

Ryan C Barron, Maksim Eren, Manish Bhattarai, Ismael Boureima (LANL), Cynthia Matuszek (UMBC), Boian Alexandrov (LANL)
GPU Sharing with Triples Mode

Chansup Byun, Albert Reuther, LaToya Anderson, William Arcand, Bill Bergeron, David Bestor, Alexander Bonn, Daniel Burrill, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Guillermo Morales, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)

5-4: High Performance Computing 4 Session (15:45-17:30)

Co-Chairs: N. Pitsianis & C. Byun

BB-CVXOPT: Basic Block Execution Count Estimation and Extrapolation using Constrained Convex Optimization

Youssef A Aly (New Mexico State Univ.), Atanu Barai, Nandakishore Santhi (LANL), Hameed Badawy (New Mexico State Univ.)
Parallel Online Directed Acyclic Graph Exploration for Atlasing Soft-Matter Assembly Configuration Spaces

Rahul Prabhu, Amit Verma, Meera Sitharam (Univ. of Florida)
Towards Just-in-Time Instruction Generation for Accelerated Sparse Matrix-Matrix Multiplication on GPUs

Seth Kay, H. Howie Huang (George Washington Univ.)
HBM-based Hardware Accelerator for GNN Sampling and Aggregation

Yuchen Gui (Univ. of Science and Technology of China), Qizhe Wu, Wei Yuan (USTC), Huawen Liang, Xiaotian Wang, Xi Jin (Univ. of Science and Technology of China)
GPU Accelerated Construction of Time Respecting Data Structure for Temporal Graphs

Animan Naskar, Venkata Kalyan Tavva (IIT Ropar), Subhasis Banerjee (Shell India Markets Pvt. Ltd.)
Comparative Analysis of GCC and LLVM for Performance Optimization on Aarch64

Mriganka Bezbaruah, Samruddhi Dhakulkar, Prachi Pandey, Haribabu Pasupuleti, S A Kumar, S D Sudarsan (C-DAC)
Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine [Outstanding Student Paper Award]

Zuoning Zhang, Dhruv Parikh (USC), Youning Zhang (UC Berkeley), Viktor K Prasanna (USC)
