All times are EDT (UTC/GMT -04 hours)
Speaker/Presenting Author in Italics
Co-Chairs: J. Kepner & A. Reuther
- Kickoff Talk: Where We Stand: Education, Research, and High Performance Computing
- Peter Fisher (MIT)
Co-Chairs: A. Conard & C. Byun
- Supercomputer 3D Digital Twin for User Focused Real-Time Monitoring [Outstanding Paper Award]
- William Bergeron, Matthew Hubbell, Daniel Mojica, Albert Reuther, William Arcand, David Bestor, Daniel Burrill, Chansup Byun, Vijay Gadepally, Michael Houle, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
- Dynamic Task Scheduling with Data Dependency Awareness Using Julia
- Rabab MA Alomairy, Felipe Tome, Julian Samaroo, Alan Edelman (MIT)
- Optimization Strategies to Accelerate BLAS Operations with ARM SVE
- Aniket P Garade, Sushil Pratap Singh, Juliya James, Deepika H V, Haribabu Pasupuleti, S A Kumar, Sudarsan S D (C-DAC)
- A Highly Scalable Parallel Design for Data Compression
- S Biplab Raut (AMD)
- Investigating Resilience of Loops in HPC Programs: A Semantic Approach with LLMs
- Hailong Jiang, Jianfeng Zhu (Kent State Univ.), Bo Fang (PNNL), Chao Chen (Intel), Qiang Guan (Kent State Univ.)
Chair(s)/Host(s): K. Keville & K. Cain
- Performance Benchmarking of H2O AutoML and Individual Models on Malware Detection Tasks
- Minakshi Arya (NDSU), Shubhavi Arya (Indiana Univ.), Saatvik Arya (Univ. of Washington)
- IOS: A Low Cost Defense to Mitigate Meltdown and Spectre like Attacks
- Xin Wang (Virginia Commonwealth Univ.), Wei Zhang (Univ. of Louisville)
- Authentication in High Noise Environments using PUF-Based Parallel Probabilistic Searches
- Brian Donnelly, Michael Gowanlock (Northern Arizona Univ.)
- Intel Xeon Optimization for Efficient Media Workload Acceleration
- Karan Puttannaiah, Rajesh Poornachandran (Intel)
- Towards an End-to-End Processing-in-DRAM Acceleration of Spectral Library Search
- Tianyun Zhang, Eric Tang (Carnegie Mellon Univ.), Farzana A Siddique, Kevin Skadron (Univ. of Virginia), Franz Franchetti (Carnegie Mellon Univ.)
- Neuromorphic Circuits with Spiking Astrocytes for Increased Energy Efficiency, Fault Tolerance, and Memory Capacitance
- Murat Isik (Drexel Univ.), Kaushal Gawri (SemaAI), Maurizio De Pitta (University Health Network)
Co-Chairs: M. Barnell & K. Gettings
- VeBPF Many-Core Architecture for Network Functions in FPGA-based SmartNICs and IoT
- Zaid Tahir (Boston Univ.), Ahmed Sanaullah (Red Hat), Sahan Bandara (Boston Univ.), Ulrich Drepper (Red Hat), Martin Herbordt (Boston Univ.)
- Hunting the Needle – The Potential of Innovation in Architecture
- Peter Kogge (Univ. of Notre Dame), Janice McMahon (Self), Timothy Dysart (Tactical Computing Labs)
- Predictive Performance of Photonic SRAM-based In-Memory Computing for Tensor Decomposition [Best Student Paper Award]
- Sasindu Wijeratne (USC), Sugeet Sunder (USC Information Sciences Institute), Md Abdullah-Al Kaiser, Akhilesh Jaiswal (Univ. of Wisconsin), Clynn Mathew, Ajey Jacob (USC Information Sciences Institute), Viktor K Prasanna (USC)
- A Multilevel Approach For Solving Large-Scale QUBO Problems With Noisy Hybrid Quantum Approximate Optimization
- Filip B Maciejewski (NASA/USRA), Bao Gia Bach (Univ. of Delaware), Maxime Dupont (Rigetti Computing), Paul A Lott (Universities Space Research Association), Bhuvanesh Sundar (Rigetti Computing), David Neira (Purdue University/USRA), Ilya Safro (Univ. of Delaware), Davide Venturelli (Universities Space Research Association)
Chair(s)/Host(s): P. Luszczek
- Quantum Machine Learning in the Cognitive Domain: Alzheimer’s Disease Study
- Emine Akpinar (Yıldız Technical Univ.)
- On the Design of the Quantum-Classical Hybrid-Service Architecture
- Yi Liu, Yuchou Chang (UMass Dartmouth)
- Quantum Computing for Data Calibration in Parallel Magnetic Resonance Imaging Reconstruction
- Girish Babu Reddy, Gulfam A Saju, Yi Liu, Yuchou Chang (UMass Dartmouth)
- Ultra Low Latency Hardware Optimised Radix-4 FFT for Optical Wireless FPGA Transceiver’s via Hermitian Symmetry Characteristics
- Michael Codd, Ciara McDonald (Maynooth Univ.), Yiyue Jiang, Chunan Chen (Northeastern Univ.), Holger Claussen (Tyndall National Institute), Miriam Leeser (Northeastern Univ.), John Dooley (Maynooth Univ.)
- Fully Transparent Client-Side Caching for Key-Value Store Applications Using FPGAs
- Sahan Bandara, Noah Cherry, Martin Herbordt (Boston Univ.)
- Impact of Grid Processing on Signal Cross-Correlation
- Rhea Senthil Kumar, Nathan Simard, Jonathan Mathews, Jeremy Kepner, Timothy Collard (MIT Lincoln Laboratory)
Co-Chairs: C. Long & S. Shankar
- A High-Performance Curve25519 and Curve448 Unified Elliptic Curve Cryptography Accelerator
- Aniket Banerjee (IISc), Utsav Banerjee (Indian Institute of Science)
- Direct RF FPGAs built with Multi-Chip Packaging Overcome Technology Challenges
- Marjorie Catt, Dustin J Henderson (Altera)
- A Run-Time Configurable NTT Architecture for Homomorphic Encryption Based on 3D Algorithm
- Weicong Lu, Xiaojie Chen, Dihu Chen, Tao Su (Sun Yat-Sen Univ.)
- Optimizing FPGA Memory Allocation for Matrix-Matrix Multiplication using Bayesian Optimization
- Mehmet Gungor, Stratis Ioannidis, Miriam Leeser (Northeastern Univ.)
- pc-COP: An Efficient and Configurable 2048-p-Bit Fully-Connected Probabilistic Computing Accelerator for Combinatorial Optimization
- Kiran Magar (IISc), Shreya Bharathan (National Inst. of Tech., Tiruchirappalli), Utsav Banerjee (Indian Institute of Science)
Co-Chairs: S. Pisharody & J. Holodnak
- Invited Talk: The SWARM Project: Reimagining Workflow and Resource Management Systems with Swarm Intelligence
- Prasanna Balaprakash (ORNL)
- Invited Talk: The Convergence of Intuitive AI and Exascale Computing: Redefining What’s Possible
- Eliu Huerta (ANL)
- Invited Talk: The National Cybersecurity Strategy: A Progress Report
- Robert Knake (Orkestrel)
- Invited Talk: Operational AI/ML Opportunities
- Scott Weed (US Air Force)
- Hardware Trojan Detection Utilizing Graph Neural Networks and Structural Checking
- Hunter Nauman, Jia Di (Univ. of Arkansas)
- Break
- Composable Mission-Critical Embedded System Architecture for High Assurance
- Michael Vai, Eric Simpson, Alice Lee, Huy Nguyen, Jeffrey Hughes, Ben Nahill, Jeffery Lim, Roger Khazan, Sean O’Melia (MIT Lincoln Laboratory), Fred Schneider (Cornell University)
- What is Normal? A Big Data Observational Science Model of Anonymized Internet Traffic
- Jeremy Kepner, Hayden Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill (MIT Lincoln Laboratory), Aydin Buluc (LBNL), Chansup Byun (MIT Lincoln Laboratory), Timothy Davis (Texas A&M), Vijay Gadepally (MIT Lincoln Laboratory), Daniel Grant (GreyNoise), Michael Houle, Matthew Hubbell, Piotr Luszczek (MIT Lincoln Laboratory), Lauren Milechin (MIT), Chasen Milner, Guillermo Morales (MIT Lincoln Laboratory), Andrew Morris (GreyNoise), Julie Mullen, Ritesh Patel (MIT Lincoln Laboratory), Alex Pentland (MIT), Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Peter Michaleas (MIT Lincoln Laboratory)
- Invited Talk: National Centers of Academic Excellence in Cybersecurity Program
- Teddy Lynch (NSA)
- Keynote Talk: Verification in ML
- Shafi Goldwasser (Simons Theory of Computing Institute)
Co-Chairs: J. Kepner & A. Reuther
- Keynote Talk: Energy Efficiency Scaling for 2 Decades (EES2) Roadmap for Computing
- Tina Kaarsberg (Dept. of Energy)
Co-Chairs: B. Raut & C. Byun
- A Neural Network Based GCC Cost Model for Faster Compiler Tuning
- Hafsah Shahzad (Boston Univ.), Ahmed Sanaullah, Sanjay Arora, Ulrich Drepper (Red Hat), Martin Herbordt (Boston Univ.)
- HENNC: Hardware Engine for Artificial Neural Network-Based Chaotic Oscillators
- Mobin Vaziri (Polytechnique Montréal), Shervin Vakili (Institut National de la Recherche Scientifique), Mohammad Mehdi Rahimifar (Interdisciplinary Institute for Technological Innovation), Pierre Langlois (Polytechnique Montréal)
- A Graph-Based Algorithm for Optimizing GCC Compiler Flag Settings
- Reza Sajjadinasab (Boston Univ.), Sanjay Arora, Ulrich Drepper, Ahmed Sanaullah (Red Hat), Martin Herbordt (Boston Univ.)
- Analyzing an In-line Compression on the Matrix Matrix Multiplication Kernel
- Steven Platt, Jon C Calhoun (Clemson Univ.)
- On the Scalability of Computing Genomic Diversity Using SparkLeBLAST: A Feasibility Study [Outstanding Student Paper Award]
- Ritvik R Prabhu, Bernard Moussad (Virginia Tech), Karim Youssef (LLNL), Emil Vatai (RIKEN), Wu-chun Feng (Virginia Tech)
Chair(s)/Host(s): K. Keville
- Running GraphBLAS on the FABRIC testbed [Outstanding Short Paper Award]
- Vaneshi Ramdhony, Hyunsuk Bang, Nik Sultana (Illinois Institute of Technology)
- Solving Hard Combinatorial Problems in Parallel Using Lift-and-Project Preconditioning
- Bogdan Zavalnij (Renyi Institute)
- Community Detection in Stochastic Block Model Variations
- Allison I Gunby-Mann, Peter Chin (Dartmouth Coll.)
- Hypersparse Traffic Matrices from Suricata Network Flows using GraphBLAS [Outstanding Short Paper Award]
- Michael D Houle, Michael Jones (MIT Lincoln Laboratory), Dan Wallmeyer, Risa Brodeur, Justin Burr (Center for Internet Security), Hayden Jananthan (MIT Lincoln Laboratory), Sam Merrell (Center for Internet Security), Peter Michaleas (MIT Lincoln Laboratory), Anthony Perez (Center for Internet Security), Andrew Prout, Jeremy Kepner (MIT Lincoln Laboratory)
Co-Chairs: C. Valentine & C. Byun
- Characterization and Optimization of the Fitting of Quantum Correlation Functions
- Pi-Yueh Chuang, Niteya M Shah (Virginia Tech), Patrick Barry, Ian Cloet, Emil Constantinescu (Argonne National Laboratory), Nobuo Sato (Jefferson Lab), Wu-chun Feng (Virginia Tech)
- Elucidating US Import Supply Chain Dynamics: A Spatial-Temporal Graph Neural Network Approach
- Nikolay Aristov (MIT-CTL), Ziyan Li, Thomas Koch, Elenna Dugundji (MIT)
- The Genomic Computing Revolution: Defining the Next Decades of Accelerating Genomics
- Harisankar Sadasivan (AMD), Artur Klauser (-NA-), Juergen Hench (University Hospital Basel), Yatish Turakhia (UCSD), Gagandeep Singh, Alberto Zeni (AMD), Sarah Beecroft (Pawsey Supercomputing Research Centre), Satish Narayanasamy (University of Michigan), Jeff Nivala (Univ. of Washington Seattle), Bob Robey (AMD), Onur Mutlu (ETH), Kristof Denolf, Gina Sitaraman (AMD)
- Comparison of Vectorization Capabilities of Different Compilers for X86 and ARM CPUs
- Nazmus Sakib (New Mexico State Univ.), Tarun Prabhu, Nandakishore Santhi (LANL), John Shalf (LBNL), Abdel-Hameed Badawy (New Mexico State Univ.)
Chair(s)/Host(s): TBD
- Performance Analysis of Falcon Post-Quantum Cryptography in Embedded Hardware-Software Integration [Outstanding Short Paper Award]
- John Biselx (HES-SO), Andrea Guerrieri (HES-SO and EPFL)
- A Performance Analysis of GPU-Aware MPI Implementations Over the Slingshot-11 Interconnect
- Michael Beebe (Texas Tech Univ.), Rahulkumar Gayatri, Kevin Gott, Adam Lavely (LBNL), Muhammad Haseeb (Nvidia), Brandon Cook (LBNL), Yong Chen (Texas Tech Univ.)
- Application of Virtual Client for Azure Hardware Qualification
- Anna Mary Mathew, Bryan DeYoung, Michael Chhor, Sharjil Khan (Microsoft)
- Applying Natural Language Processing for Initial Categorizing of Product Descriptions
- Nikolay Aristov, Thomas Koch, Elenna Dugundji (MIT-CTL)
- Privacy-Preserving AI for Document Understanding with Controlled Unclassified Information
- Scott M Sawyer (Paperless Parts, Inc.)
Co-Chairs: P. Luszczek & O. Green
- Multilevel Diffusion Based Spectral Graph Clustering [Outstanding Paper Award]
- Malik Lechekhab, Dimosthenis Pasadakis, Olaf Schenk (Univ. della Svizzera italiana)
- Batch-Parallel Compressed Sparse Row: A Locality-Optimized Dynamic-Graph Representation [Outstanding Student Paper Award]
- Brian Wheatman, Randal Burns (Johns Hopkins Univ.), Helen Xu (Georgia Inst. of Tech.)
- Indexed Binary Operations in the GraphBLAS
- Tim Mattson (Human Learning Group), Manaswinee Bezbaruah, Matthias Maier (Texas A&M Univ.), Scott McMillan (CMU Software Engineering Institute), Michel Pelletier (Graphegon), Erik Welch (Nvidia), Timothy A Davis (Texas A&M Univ.)
- VF2-PS: Parallel and Scalable Subgraph Monomorphism in Arachne
- Mohammad Dindoost, Oliver Alvarado Rodriguez (New Jersey Inst. of Tech.), Sounak Bagchi (Edison Academy Magnet School), Palina Pauliuchenka, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)
- MESM: A Query-Agnostic and Memory-Efficient Parallel Subgraph Matching Algorithm
- Shubhashish Kar, Shaikh Arifuzzaman (UNLV)
Co-Chairs: X. Sun & TBD
- Constant-Memory Graph Coarsening
- Christopher Brissette (NVIDIA), George M Slota (RPI)
- Algebraic Vertex Ordering of a Sparse Graph for Adjacency Access Locality and Graph Compression
- Dimitris Floros (Duke Univ.), Nikos P Pitsianis (Aristotle Univ. of Thessaloniki), Xiaobai Sun (Duke Univ.)
- An Efficient Multi-core Parallel Implementation of SSSP Algorithm with Decreasing Delta-stepping
- Rakibul Hassan, Shaikh Arifuzzaman (UNLV)
- IRIS-MEMFLOW: Data Flow Enabled Portable Memory Orchestration in IRIS Runtime for Diverse Heterogeneity
- Mohammad Alaul Haque Monil, Narasinga Rao Miniskar, Seyong Lee, Beau Johnston, Pedro Valero-Lara, Aaron Young, Keita Teranishi, Jeffrey Vetter (ORNL)
- A Deployment Tool for Large Scale Graph Analytics Framework Arachne
- Garrett R Gonzalez-Rivas, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)
Organizers: T. Mattson, B. Brock & S. McMillan
- Report on the binsparse Specification
- Ben Brock (Intel)
- SuiteSparse Update
- Tim Davis (Texas A&M Univ.)
- Python GraphBLAS Update
- Julia GraphBLAS Update
- Raye Kimmerer (MIT)
- Postgres GraphBLAS Update
- Michel Pelletier (Graphegon)
- Wild and Crazy Ideas for GraphBLAS 3.0
- Raye Kimmerer (MIT)
- Keynote Talk: The Future of Sparse Computing is Compilers
- Fredrik Kjolstad (Stanford)
Co-Chairs: J. Kepner & A. Reuther
- Keynote Talk: The Building Blocks of Cloud – Research Enablement
- Scott Yockel (Harvard Univ.)
Co-Chairs: P. Luszczek & P. Monticciolo
- ModelGauge: Inference Profiling of Deep-Learning Models [Outstanding Paper Award]
- Calvin B Gealy (Univ. of Pittsburgh), David Langerman (NSF SHREC), Alan George (NSF Center for High Performance Reconfigurable Computing)
- Enhanced Knowledge Graph Attention Networks for Efficient Graph Learning [Outstanding Student Paper Award]
- Fernando P Vera Buschmann, Zhihui Du, David A Bader (New Jersey Inst. of Tech.)
- Mobile-Optimized Vessel Segmentation for Ultrasound-Guided Surgical Procedures
- Mateusz Wolak, Fin Amin, Nancy DeLosa, Brian A Telfer, Benjamin Roop, Lars Gjesteby (MIT Lincoln Laboratory)
- GLITCHES: GPU-FPGA LLM Inference Through a Collaborative Heterogeneous System
- Fan Yang (Tsinghua Univ., SenseTime Inc.), Xinhao Yang, Hongyi Wang, Zehao Wang, Zhenhua Zhu, Shulin Zeng, Yu Wang (Tsinghua Univ.)
- Graphical Learning Optimization and Dimensionality Reduction with Geometric Multi-Resolution Analysis
- Felicia Schenkelberg, Allison I Gunby-Mann, Emma Graham (Dartmouth Coll.), Shuoxuan Li (Carnegie Mellon Univ.), Peter Chin (Dartmouth Coll.)
Chair(s)/Host(s): K. Keville
- CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers [Outstanding Short Paper Award]
- Lakshmi V Nair (Lightmatter)
- Efficiency of Data Intensive Computing (DIC) in MEMS Research for Data Processing and Analysis
- Yeligay Segizbay (Nazarbayev Univ.)
- Capturing the Carbon Impact of Deep Learning
- Alexis Corona, Sanmukh Kuppannagari (Case Western Reserve Univ.)
- Transfer Learning Assisted Parameter Selection for Water-Fat Separation in Dixon MRI
- Alan Okinaka (Ursinus College), Gulfam A Saju, Yuchou Chang (UMass Dartmouth)
- Traditional Costume Image Classification for Indian States Using Deep Learning
- Sahana R Koti, Sahana Channappa Jatti, Anupama S Nandeppanavar, Medha Kudari (KLE Institute of Technology)
- Scalable Approach for Analytic Polynomial Subspace Projection Matrices for a Space-Time Covariance Matrix
- Faizan Ahmad Khattak, Mohammed Bakhit, Ian K. Proudler, Stephan Weiss (Univ. of Strathclyde)
Organizer(s): F. Franchetti and M. Franusich
Co-Chairs: J. Mullen, L. Milechin & H. Jananthan
- Invited Talk: Scaling Project-based Learning from Education to Research
- Joel Grimm (MIT Lincoln Laboratory)
- Invited Talk: Educational Game Dev from Start to Finish: A Short Example
- Chasen Milner (USAF)
- Invited Talk: HPC-ED: A Federated Catalog to Share and Discover CyberTraining Materials
- Susan Mehringer (Cornell Center for Advanced Computing)
- Invited Talk: The Wide Area Classroom – 10 Years On
- John Urbanic (CMU and Pittsburgh Supercomputing Center)
Chair(s)/Host(s): K. Cain
- Gesture Controlled System to Automate Shutdown, Screenshot and Volume Toggle
- Prisha Bhosale, Ananya Dandekar, Ria Dcosta, Sri Aishwarya Jonnavittula, Shagufta Rajguru (Fr. Conceicao Rodrigues Institute of Technology)
- Machine Learning Application for Smart Network Traffic Prediction
- Islam Omar (New Mexico State Univ.), Whit Schonbein (SNL), Hameed Badawy (New Mexico State Univ.)
- Model to Predict Inventory Demand in Retail SMEs Using CRISP-DM and Machine Learning
- Jhomax R Torres, Diego Moises Carpio Andia, Victor Parasi (Univ. Peruana de Ciencias Aplicadas)
- Determination of Game-based Design Equilibria by Using Machine Learning Approach
- Sara Karimi, Ehsan Ghotbi (Alfred Univ.)
- The Analysis of the Sparse Multi-GPU Parallel Method on the Large Sparse Power Flow Calculation
- Lei Zeng, Shadi Alawneh (Oakland Univ.)
Co-Chairs: H. Badawy & C. Long
- A Dynamic Weighting Strategy to Mitigate Worker Node Failure in Distributed Deep Learning
- Yuesheng Xu, Arielle Carr (Lehigh Univ.)
- P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving
- Mohamed R. Elshamy (New Mexico State Univ.), Heba Emara (Pyramids High Institute of Electronic Engineering), Mohamed Nanyang Shoaib (Nanyang Tech. Univ.), Hameed Badawy (New Mexico State Univ.)
- Spike-driven YOLO: Ultra Low-Power Object Detection with Neuromorphic Computing
- Mark Barnell, Courtney Raymond, Lisa Loomis (AFRL), Francesca Vidal, Daniel Brown, Darrek Isereau (SRC)
- Exploring sparse inference with SuiteSparse:GraphBLAS
- Deepak Suresh, Timothy A Davis (Texas A&M Univ.)
- Improving Regression in Spiking Neural Networks for Oceanographic Data Analysis
- Alissa Kane, Yuchou Chang (UMass Dartmouth)
Co-Chairs: S. Gottlieb & N. Prajapati
- Benchmarking Thread Block Cluster [Best Paper Award]
- Tim Lühnen, Tobias Marschner, Sohan Lal (TU Hamburg)
- Understanding the Efficacy of Power Profiles: A Case Study of AMD Instinct MI100 GPU
- Ghazanfar Ali, Mert Side (Texas Tech Univ.), Sridutt Bhalachandra (Univ. of North Carolina), Tommy Dang, Alan Sill, Yong Chen (Texas Tech Univ.)
- Community Detection for Large Graphs on GPUs with Unified Memory
- Emre Dinçer, Işıl Öz (Izmir Institute of Technology)
- Invited Talk: From Simple to Hyper Co-Design of HPC Platforms
- Gary Grider (Los Alamos National Laboratory)
- Invited Talk: Lessons Learned from Implementing the Anonymized Network Sensing Graph Challenge with GPUs and Commodity Software
- Siddharth Samsi, Dan Campbell, Emanuel Scoullos, Oded Green (NVIDIA)
Organizers: V. Gadepally & D. Burrill
- Invited Talk: Generative AI in the DoD: Use Cases and Challenges
- Manuel Xavier Lugo (US Navy)
- Invited Talk: How to Make an LLM Understand Human Conversation for Fun & Profit
- Kartik Talamadupula (Symbl.ai)
- Invited Talk: Innovations for Reducing the Environmental Impact of LLMs
- Boris Gamazaychikov (Salesforce)
- Invited Talk: Tackling Generative AI Productivity and Efficiency Challenge with Intel® Gaudi® 3 AI Accelerators
- Vasudev Lal (Intel)
Co-Chairs: J. Kepner & A. Reuther
- Keynote Talk: Convergence across the Computing Continuum: The NSF Leadership Class Computing Facility meets the Edge, Interactive Computing, and Low-Precision AI
- Dan Stanzione (Texas Advanced Computing Center)
Co-Chairs: B. Sroka & K. Gettings
- Breakthrough Edge AI Inference Performance using NorthPole in 3U VPX Form Factor
- Filipp Akopyan, William Risk, John Arthur, Andrew Cassidy, Michael Debole, Carlos Ortega Otero, Jun Sawada, Evan Colgan, Michael Criscolo (IBM Research), Phillip Mann (IBM), Heinz Baier, Kai Schleupen, Arnon Amir (IBM Research), Alexander Andreopoulos (IBM), Rathinakumar Appuswamy, Deepika Bablani, Peter Carlson, Pallab Datta, Steven Esser, Myron Flickner, Rajamohan Gandhasri, Guillaume Garreau, Megumi Ito, Jennifer Klamo, Jeffrey Kusnitz, Nathaniel McClatchey, Neil McGlohon, Jeffrey McKinstry, Yutaka Nakamura (IBM Research), Tapan Nayak (IBM Corporation), Jay Sivagnaname, Daniel Smith, Rafael Sousa, Brian Taba, Ignacio Terrizzano, Takanori Ueda, Dharmendra Modha (IBM Research)
- Breakthrough LLM Inference Performance using NorthPole
- Rathinakumar Appuswamy, Michael Debole, Brian Taba, Steven Esser, Andrew Cassidy, Arnon Amir (IBM Research), Alexander Andreopoulos (IBM), Deepika Bablani, Pallab Datta, Jeffrey Kusnitz, Nathaniel McClatchey, Neil McGlohon, Jeffrey McKinstry (IBM Research), Tapan Nayak (IBM Corporation), Daniel Smith, Rafael Sousa, Ignacio Terrizzano, Filipp Akopyan, Peter Carlson, Rajamohan Gandhasri, Guillaume Garreau, Nelson Gonzalez, Megumi Ito, Jennifer Klamo, Yutaka Nakamura, Carlos Ortega Otero, William Risk, Jun Sawada, Kai Schleupen, Jay Sivagnaname, Matthew Stallone, Takanori Ueda, Myron Flickner, John Arthur (IBM Research), Rameswar Panda, David Cox (MIT-IBM Watson AI Lab), Dharmendra Modha (IBM Research)
- A Framework to Enable Algorithmic Design Choice Exploration in DNNs
- Timothy Cronin, Sanmukh Kuppannagari (Case Western Reserve Univ.)
- Benchmarking Edge AI Platforms for High-Performance ML Inference
- Rakshith Jayanth, Neelesh Gupta, Viktor K Prasanna (USC)
- Transformers: A Graph Processing Perspective
- Manish Sri Sai Surya Routhu, Sai Dheeraj Yanduru, Nathaniel K Tomczak, Sanmukh Kuppannagari (Case Western Reserve Univ.)
Chair(s)/Host(s): K. Keville
- Perspective-Aware AI (PAi) for Augmenting Critical Decision Making
- Marjan Alirezaie, Daniel Platnick (Flybits), Hossein Rahnama, Dava Newman, Alex Pentland (MIT)
- Evaluating the Impact of Noisy Blades on PROPELLER MRI Reconstruction Quality
- Gulfam A Saju, Marjan Akhi, Yuchou Chang (UMass Dartmouth)
- CompJouleS: Energy Estimate Tool for Machine Learning Algorithms for Multiple Applications in CPU, GPU, and FPGA Architectures
- Murat Isik (Stanford Univ.), Jens E. Pedersen (SLAC National Accelerator Laboratory), Vedant Karia (Univ. of Texas at San Antonio), Sadasivan Shankar (Stanford Univ.)
- Power Efficient Deep Learning Acceleration using Intel Xeon® Processors
- Xiaofei Jiang, Mona Minakshi, Rajesh Poornachandran, Shamima Najnin (Intel)
- Impact of Estimation Errors of a Matrix of Transfer Functions onto Its Analytic Singular Values and Their Potential Algorithmic Extraction
- Mohammed Bakhit, Faizan Ahmad Khattak, Ian K. Proudler, Stephan Weiss (Univ. of Strathclyde)
- Disaggregation Patterns for Secure AI Systems
- Mohamed Ghamri, Marc A Lacoste, Divi De Lacour (Orange)
Co-Chairs: N. Pitsianis & B. Sroka
- MonoCoder: Domain-Specific Code Language Model for HPC Codes and Tasks [Outstanding Paper Award]
- Tal Kadosh (Ben-Gurion Univ., IAEC), Niranjan Hasabnis (Intel), Vy Vo (Intel Labs), Nadav Schneider (Ben-Gurion University), Neva Krien (Independent), Mihai Capotă (Intel Labs), Abdul Wasay, Guy Tamir (Intel), Theodore L Willke, Nesreen Ahmed (Intel Labs), Yuval Pinter (Ben-Gurion University), Tim Mattson (Human Learning Group), Gal Oren (Technion, Stanford Univ.)
- LLM Inference Serving: Survey of Recent Advances and Opportunities [Outstanding Paper Award]
- Baolin Li, Yankai Jiang (Northeastern Univ.), Vijay Gadepally (MIT Lincoln Laboratory), Devesh Tiwari (Northeastern Univ.)
- Enhancing Code Translation in Language Models with Few-Shot Learning via Retrieval-Augmented Generation
- Manish Bhattarai, Javier E Santos, Shawn M Jones, Ayan Biswas, Boian Alexandrov, Daniel O’Malley (LANL)
- High Performance Im2win and Direct Convolutions using Three Tensor Layouts on SIMD Architectures
- Xiang Fu, Xinpeng Zhang, Jixiang Ma (Nanchang Hangkong Univ.), Peng Zhao (Microsoft), Shuai Lu (Nanchang Hangkong Univ.), Xu Liu (Univ. of Washington)
- Accelerating Sensor Fusion in Neuromorphic Computing: A Case Study on Loihi-2
- Murat Isik (Drexel Univ.), Karn Tiwari (IIS Bangalore), Burak Eryilmaz (Bilkent Univ.), Ismail Can Dikmen (TEMSA)
Chair(s)/Host(s): TBD
- NeuroVM: Dynamic Neuromorphic Hardware Virtualization
- Murat Isik (Drexel Univ.), Kayode Inadagbo (Prairie View A&M Univ.), Ismail Can Dikmen (TEMSA)
- LLMs for Closed-Library Multi-Document Query, Test Generation, and Evaluation
- Claire Randolph (Dept. of the Air Force), Adam Michaleas, Darrell O Ricke (MIT Lincoln Laboratory)
- LLM-Based Task Planning for Navigating Companion Robot from Emotion Signals
- Yuchou Chang (UMass Dartmouth), Huy Anh Pham (Intelligent Medical Objects, Inc.), Gulfam A Saju (UMass Dartmouth)
- Large Multimodal Model for Simulating Big Training Data in Deep PROPELLER MRI
- Gulfam A Saju, Marjan Akhi, Yuchou Chang (UMass Dartmouth)
- Artificial Intelligence Solution on Intel Xeon Processor Power and Performance Engineering
- Zhongbin Liu, Xiaofei Jiang, Jiajia Zhang (Intel)
- Boosting the Performance of Reinforcement Learning-based Task Scheduling using Offline Inference
- Chedi Morchdi (Univ. of Utah), Cheng-Hsiang Chiu (Univ. of Wisconsin), Yi Zhou (Univ. of Utah), Tsung-Wei Huang (Univ. of Wisconsin)
Co-Chairs: K. Keville & P. Luszczek
- Reinforcement Learning-generated Topological Order for Dynamic Task Graph Scheduling
- Cheng-Hsiang Chiu (Univ. of Wisconsin), Chedi Morchdi, Yi Zhou (Univ. of Utah), Boyang Zhang, Che Chang, Tsung-Wei Huang (Univ. of Wisconsin)
- FPGA Acceleration for Scalable High-Resolution OPIR Target Detection
- Daniel C Stumpp (Univ. of Pittsburgh), Alan George (NSF Center for High Performance Reconfigurable Computing)
- Hybrid Computing Architecture Based on Analog Phase-Change Memory Chips for Deep Neural Network Training
- Zhenhao Jiao (Univ. of Science and Technology of China), Tao Hong, Xiaogang Chen, Weibang Dai (Shanghai Institute of Microsystem and Information Technology, CAS), Chengcai Tu (Donghua University), Shunfen Li, Houpeng Chen, Zhitang Song (Shanghai Institute of Microsystem and Information Technology, CAS)
- Exploring the Trade-off Between Repair Time and Reliability in Large Scale Cluster Computers: A Simulation-Based Approach
- Leslie Horace (Georgia Inst. of Tech.), Craig Walker, William M Jones (Coastal Carolina Univ.), Nathan DeBardeleben, Vivian Hafener, Steven Senator (LANL)
- Experiences with VITIS AI for Deep Reinforcement Learning
- Nabayan Chaudhury, Atharva M Gondhalekar, Wu-chun Feng (Virginia Tech)
Co-Chairs: J. Kepner & A. Reuther
- Mercury: Efficient Subgraph Matching on GPUs with Hybrid Scheduling
- Zhiheng Lin (Inst. of Computing Tech, CAS), Changjie Xu (UCAS), Ke Meng, Guangming Tan (Inst. of Computing Tech, CAS)
- Towards Faster Graph Partitioning via Pre-training and Inductive Inference
- Meng Qin (HKUST), Chaorui Zhang (Huawei), Yu Gao (Independent), Yibing Ding, Weipeng Jiang (Huawei), Weixi Zhang (Huawei Technologies), Wei Han (Huawei), Bo Bai (Huawei Technologies)
- Distributed-Memory Sparse Deep Neural Network Inference Using Global Arrays
- Sayan Ghosh, Bruce Palmer, Andres Marquez (PNNL)
- Anonymized Network Sensing Graph Challenge
- Hayden R Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill (MIT Lincoln Laboratory), Aydin Buluc (LBNL), Chansup Byun (MIT Lincoln Laboratory), Timothy Davis (Texas A&M), Vijay Gadepally (MIT Lincoln Laboratory), Daniel Grant (GreyNoise), Michael Houle, Matthew Hubbell, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Chasen Milner, Guillermo Morales (MIT Lincoln Laboratory), Andrew Morris (GreyNoise), Julie Mullen, Ritesh Patel (MIT Lincoln Laboratory), Alex Pentland (MIT), Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Gabriel Wachman, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
- Extracting TCPIP Headers at High Speed for the Anonymized Network Traffic Graph Challenge
- Zhaoyang Han, Andrew Briasco-Stewart (Northeastern Univ.), Michael Zink (UMass Amherst), Miriam Leeser (Northeastern Univ.)
- Sans: Streaming Anonymized Network Sensing
- Ketai Zhao, Yuhang Zhou, Hong Xu Pan, Zhibin Wang, Sheng Zhong, Chen Tian (Nanjing Univ.)
Co-Chairs: J. Kepner & A. Reuther
- Keynote Talk: AI/ML Applications for Global Security
- Eric Evans (MIT)
Co-Chairs: D. Ricke & B. Raut
- Cycle-Stealing in Load-Imbalanced HPC Applications [Outstanding Student Paper Award]
- Po Hao Chen (Brown University), Akshaya Bali, Shining Yang, Pouya Haghi, Carlton Knox, Benjamin Li (Boston Univ.), Amr Abouelmagd, Anthony Skjellum (Tennessee Tech), Martin Herbordt (Boston Univ.)
- Tightly-Coupled FPGA Accelerator for Molecular Dynamics Simulation: Hardware-Software Co-Design and Fine-Grained Task Management [Outstanding Student Paper Award]
- Zekang Cheng, Zerong S He, Xi Jin (Univ. of Science and Technology of China)
- MST in Incremental Graphs Through Tree Contractions
- Akanksha Dwivedi, Sameer Sharma, Dip Sankar Banerjee (IIT Jodhpur)
- Syndeo: Portable Ray Clusters with Secure Containerization
- William Li, Rodney S Lafuente Mercado, Jaime Pena, Ross E Allen (MIT Lincoln Laboratory)
- Evaluating One-Sided Communication on Graph500 with MPI-RMA and OpenSHMEM
- Jefferson Boothe (Univ. of Pittsburgh), Alan George (NSF Center for High Performance Reconfigurable Computing)
Chair(s)/Host(s): K. Keville
- Synthesizing Numerical Linear Algebra using Julia [Best Short Paper Award]
- Sophie Xuan, Rabab MA Alomairy, Evelyne Ringoot, Felipe Tome, Julian Samaroo, Alan Edelman (MIT)
- Towards LibraryX: A Framework for Cross-Library Call Optimization [Outstanding Short Paper Award]
- Sanil Rao, Anant Prakash, Franz Franchetti (Carnegie Mellon Univ.)
- Compressed Cannon’s Algorithm
- Louis Jencka, Amanda J Bienz (Univ. of New Mexico)
- Multiplication of Sparse Matrices and their Transpose using Compressed Sparse Diagonals
- Sardar Anisul Haque (Oryx Universal College in Partnership with LJMU (UK)), Mohammad Tanvir Parvez (Qassim Univ.), Shahadat Hossain (Univ. of Northern British Columbia)
- Augmenting HPC Profilers with Analysis Capabilities
- Abhishek N Patil, Shamjith K V, Senthil Kumar RK, S D Sudarsan (C-DAC)
- Explainable DiGCNs for Decomposition of Opaque Node Ranking Functions
- Vishal Chandra (MIT Lincoln Laboratory)
Co-Chairs: D. Ricke & J. Mullen
- An Efficient Multi-DNN Accelerator Based on Multiple Systolic Array
- Jianjun Chen, Han Jiao, Wenjin Huang, Yihua Huang (Sun Yat-Sen Univ.)
- JACC.shared: Leveraging HPC Metaprogramming and Performance Portability for Computations That Use Shared Memory GPUs
- Pedro Valero-Lara, William Godoy, Keita Teranishi, Jeffrey Vetter (ORNL)
- OCO-GAT: An Accelerator for Graph Attention Network with Optimized Calculation Order
- Qi Liu, Wenjin Huang, Wenlu Peng, Yihua Huang (Sun Yat-Sen Univ.)
- Task-Level Parallelism for the Multifrontal Method in Tightly Coupled CPU-FPGA Architectures
- Zerong S He, Zekang Cheng, Zhongguang Xu, Xi Jin (Univ. of Science and Technology of China)
- LLload: An Easy-to-Use HPC Utilization Tool
- Chansup Byun, Albert Reuther, Julie Mullen, LaToya Anderson, William Arcand, Bill Bergeron, David Bestor, Alexander Bonn, Daniel Burrill, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Guillermo Morales, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
Chair(s)/Host(s): K. Keville
- A Framework for Analyzing the Performance of Sparse Matrix and Graph Operations
- Khaled Abdelaal, Richard M Veras (Univ. of Oklahoma)
- Efficient Eigenvalue Computation of Parahermitian Matrices Using Neural Networks
- Diyari A. Hassan (Qaiwan Intl. Univ.), Yunus Egi, Soydan Redif (American Univ. of Middle East)
- Towards a RISC-V Instruction Set Extension for Multi-word Arithmetic
- Youngjin Eum, Naifeng Zhang, Larry Tang, Franz Franchetti (Carnegie Mellon Univ.)
- AstraMQ: Distributed MQTT Broker
- Rohan M Doshi, Sanika Inamdar, Tanmay Karmarkar, Madhuri S Wakode (Pune Inst. of Computer Tech.)
- Computational and Numerical Properties of a Broadband Subspace-Based Likelihood Ratio Test
- Cornelius Pahalson, Louise Crockett, Stephan Weiss (Univ. of Strathclyde)
Co-Chairs: J. Mullen & N. Pitsianis
- Persistent and Partitioned MPI for Stencil Communication
- Gerald Collom (Univ. of New Mexico), Jason Burmark, Olga Pearce (LLNL), Amanda J Bienz (Univ. of New Mexico)
- HPC Network Simulation Tuning via Automatic Extraction of Hardware Parameters
- Joshua Suetterlein, Stephen Young, Jesun S Firoz, Joseph Manzano, Nathan Tallent, Ryan Friese, Kevin Barker, Timothy Stavenger (PNNL)
- Accelerating Multi-Agent DDPG Training on Multi-GPU Platforms
- Samuel Wiggins, Viktor K Prasanna (USC)
- Binary Bleed: Fast Distributed and Parallel Method for Automatic Model Selection
- Ryan C Barron, Maksim Eren (LANL), Manish Bhattarai (Los Alamos National Lab), Ismael Boureima (LANL), Cynthia Matuszek (UMBC), Boian Alexandrov (LANL)
- GPU Sharing with Triples Mode
- Chansup Byun, Albert Reuther, LaToya Anderson, William Arcand, Bill Bergeron, David Bestor, Alexander Bonn, Daniel Burrill, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Piotr Luszczek, Peter Michaleas (MIT Lincoln Laboratory), Lauren Milechin (MIT), Guillermo Morales, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner (MIT Lincoln Laboratory)
Co-Chairs: N. Pitsianis & C. Byun
- BB-CVXOPT: Basic Block Execution Count Estimation and Extrapolation using Constrained Convex Optimization
- Youssef A Aly (New Mexico State Univ.), Atanu Barai, Nandakishore Santhi (LANL), Hameed Badawy (New Mexico State Univ.)
- Parallel Online Directed Acyclic Graph Exploration for Atlasing Soft-Matter Assembly Configuration Spaces
- Rahul Prabhu, Amit Verma, Meera Sitharam (Univ. of Florida)
- Towards Just-in-Time Instruction Generation for Accelerated Sparse Matrix-Matrix Multiplication on GPUs
- Seth Kay, H. Howie Huang (George Washington Univ.)
- HBM-based Hardware Accelerator for GNN Sampling and Aggregation
- Yuchen Gui (Univ. of Science and Technology of China), Qizhe Wu, Wei Yuan (USTC), Huawen Liang, Xiaotian Wang, Xi Jin (Univ. of Science and Technology of China)
- GPU Accelerated Construction of Time Respecting Data Structure for Temporal Graphs
- Animan Naskar, Venkata Kalyan Tavva (IIT Ropar), Subhasis Banerjee (Shell India Markets Pvt. Ltd.)
- Comparative Analysis of GCC and LLVM for Performance Optimization on Aarch64
- Mriganka Bezbaruah, Samruddhi Dhakulkar, Prachi Pandey, Haribabu Pasupuleti, S A Kumar, S D Sudarsan (C-DAC)
- Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine [Outstanding Student Paper Award]
- Zuoning Zhang, Dhruv Parikh (USC), Youning Zhang (UC Berkeley), Viktor K Prasanna (USC)