Proceedings are available online in the ACM Digital Library.
Registered participants: please check the Whova app/site for the detailed agenda. You can also download the proceedings as a password-protected zip file (~110 MB) from: PACT2020 proceedings.
Friday, October 2
11:00 am – 3:00 pm (Tutorial)
- Design Space Exploration
12:00 pm – 5:00 pm (Workshop)
- Machine Learning for Software Hardware Co-Design
Monday, October 5
12:00 pm – 12:15 pm
- Opening Remarks
12:15 pm – 1:15 pm (Keynote, Chair:Vivek Sarkar)
1:15 pm – 2:30 pm (Optimizations for GPUs, Chair:Gilles Pokam)
- cuSZ: An Efficient GPU Based Error-Bounded Lossy Compression Framework for Scientific Data
- TAFE: Thread Address Footprint Estimation for Capturing Data/Thread Locality in GPU Systems
- SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference
- GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU
- Exploring the Design Space of Static and Incremental Connected Components on GPUs
2:30 pm – 2:45 pm (Break)
2:45 pm – 4:15 pm (Compiler optimization and code generation, Chair:Rajiv Gupta)
- Fireiron: A Data-Movement-Aware Scheduling Language for GPUs
- Automatic Generation of Multi-Objective Polyhedral Compiler Transformations
- Bandwidth-Aware Loop Tiling for DMA-Supported Scratch Pad Memory
- Deep Program Structure Modeling Through Multi-Relational Graph-based Learning
- AutoHOOT: Automatic High-Order Optimization for Tensors
- Intelligent Data Placement on Discrete GPU Nodes with Unified Memory
4:15 pm – 5:00 pm (Panel, Moderator: Tushar Krishna)
- How ML influences compiler/architecture research
Michael O’Boyle (The University of Edinburgh),
Albert Cohen (Google),
Xipeng Shen (North Carolina State University),
Taewook Oh (Facebook),
Andreas Moshovos (University of Toronto),
Hadi Esmaeilzadeh (University of California San Diego).
5:00 pm – 6:00 pm (Poster Session, Chair:Adwait Jog)
- Deep Learning Assisted Resource Partitioning for Improving Performance on Commodity Servers
- Decoupled Address Translation for Heterogeneous Memory Systems
- Bandwidth Bottleneck in Network-on-Chip for High-Throughput Processors
- Hybrid Transactional Memory With Just Enough Metadata
- Energy-Aware Differential Neural Architecture Search for Edge Devices
- Relaxed Skiplist for Concurrent Priority Queues
- Novel Reuse Distance Analysis Tool for Multicore
- md_stencil: High-Performance Stencil Computations on CPU and GPU via Multi-Dimensional Homomorphisms
- Archipelago: Architectural Support for Graph Analytics on GPUs
Tuesday, October 6
12:00 pm – 1:00 pm (Keynote, Chair:Hyesoon Kim)
1:00 pm – 2:15 pm (Parallel architectures, Chair:Jose Joao)
- Analyzing and Leveraging Shared L1 Caches in GPUs
- Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration
- Enhancing Address Translations in Throughput Processors via Compression
- Regional Out-of-Order Writes in Total Store Order
- Parallel and Scalable Precise Clustering
2:15 pm – 2:30 pm (Break)
2:30 pm – 3:45 pm (Hardware/software for security & machine learning, Chair:Divya Mahajan)
- SecSched: Flexible Scheduling in Secure Processors
- Clearing the Shadows: Recovering Lost Performance for Invisible Speculative Execution through HW/SW Co-Design
- Fast Convolutional Neural Networks with Fine-Grained FFTs
- Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning
- SparseTrain: Leveraging Dynamic Sparsity in Software for Training DNNs on General-Purpose SIMD Processors
3:45 pm – 4:00 pm (Break)
4:00 pm – 5:00 pm (Best paper, Chair:Hyeran Jeon)
- Helix: Algorithm/Architecture Co-design for Accelerating Nanopore Genome Base-calling
- Opportunistic Early Pipeline Re-steering for Data-dependent Branches
- Model-Based Warp Overlapped Tiling for Image Processing Programs on GPUs
- Low-Latency Proactive Continuous Vision
5:00 pm – 6:00 pm (Poster Session)
- Collective Affinity Aware Computation Mapping
- VTensor: Using Virtual Tensors to Build a Layout-oblivious AI Programming Framework
- Parallelizing Parallel Programs: A Dynamic Pattern Analysis for Modernization of Legacy Parallel Code
- A New Qubits Mapping Mechanism for Multi-programming Quantum Computing
- Exploiting Locality in Scalable Ordered Maps
- DeepSwapper: A Deep Learning Based Page Swap Management Scheme for Hybrid Memory Systems
- VP Float: First Class Treatment for Variable Precision Floating Point Arithmetic
- Approximate Pattern Matching for On-Chip Interconnect Traffic Prediction
Wednesday, October 7
12:00 pm – 1:00 pm (Keynote, Chair:Jeffrey S Young)
1:00 pm – 2:15 pm (Domain/application-specific hardware/software, Chair:Mahmut T. Kandemir)
- The Forward Slice Core Microarchitecture
- A Methodology for Principled Approximation in Visual SLAM
- Memory-Equipped Quantum Architectures – The Power of Random Access
- Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic
- MEPHESTO: Modeling Energy-Performance in Heterogeneous SOCs and Their Trade-Offs
2:15 pm – 2:30 pm (Break)
2:30 pm – 3:45 pm (Memory/storage systems, Chair:Onur Kayiran)
- Ribbon: High Performance Cache Line Flushing for Persistent Memory
- PRISM: Architectural Support for Variable-granularity Memory Metadata
- Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance
- RackMem: A Tailored Caching Layer for Rack Scale Computing
- ATTC(@c): Addressable-TLB based Translation Coherence
3:45 pm – 4:00 pm (Break)
4:00 pm – 4:45 pm (Panel)
- How to broaden compiler/architecture research participants by utilizing modern infrastructures.
Paul V. Gratz (Texas A&M),
Tim Sherwood (University of California Santa Barbara),
Dan Quinlan (Lawrence Livermore National Laboratory),
Daniel Mosse (University of Pittsburgh),
Lena Olson (Google),
Siva Rajamanickam (Sandia National Laboratories).
4:45 pm – 5:00 pm
- Closing Remarks / Award ceremony