Proceedings are available online in the ACM Digital Library.
Registered participants: Please check the Whova app/site for the detailed agenda. You can also download the proceedings as a password-protected zip file (~110 MB) from: PACT2020 proceedings.
Friday, October 2
11:00 am – 3:00 pm (Tutorial)
- Design Space Exploration
12:00 pm – 5:00 pm (Workshop)
- Machine Learning for Software Hardware Co-Design
Monday, October 5
12:00 pm – 12:15 pm
- Opening Remarks
12:15 pm – 1:15 pm (Keynote, Chair: Vivek Sarkar)
1:15 pm – 2:30 pm (Optimizations for GPUs, Chair: Gilles Pokam)
- cuSZ: An Efficient GPU-Based Error-Bounded Lossy Compression Framework for Scientific Data
- TAFE: Thread Address Footprint Estimation for Capturing Data/Thread Locality in GPU Systems
- SparseRT: Accelerating Unstructured Sparsity on GPUs for Deep Learning Inference
- GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU
- Exploring the Design Space of Static and Incremental Connected Components on GPUs
2:30 pm – 2:45 pm (Break)
2:45 pm – 4:15 pm (Compiler optimization and code generation, Chair: Rajiv Gupta)
- Fireiron: A Data-Movement-Aware Scheduling Language for GPUs
- Automatic Generation of Multi-Objective Polyhedral Compiler Transformations
- Bandwidth-Aware Loop Tiling for DMA-Supported Scratch Pad Memory
- Deep Program Structure Modeling Through Multi-Relational Graph-based Learning
- AutoHOOT: Automatic High-Order Optimization for Tensors
- Intelligent Data Placement on Discrete GPU Nodes with Unified Memory
4:15 pm – 5:00 pm (Panel, Moderator: Tushar Krishna)
- How ML influences compiler/architecture research
Panelists:
Michael O’Boyle (The University of Edinburgh),
Albert Cohen (Google),
Xipeng Shen (North Carolina State University),
Taewook Oh (Facebook),
Andreas Moshovos (University of Toronto),
Hadi Esmaeilzadeh (University of California San Diego).
5:00 pm – 6:00 pm (Poster Session, Chair: Adwait Jog)
- Deep Learning Assisted Resource Partitioning for Improving Performance on Commodity Servers
- Decoupled Address Translation for Heterogeneous Memory Systems
- Bandwidth Bottleneck in Network-on-Chip for High-Throughput Processors
- Hybrid Transactional Memory With Just Enough Metadata
- Energy-Aware Differential Neural Architecture Search for Edge Devices
- Relaxed Skiplist for Concurrent Priority Queues
- Novel Reuse Distance Analysis Tool for Multicore
- md_stencil: High-Performance Stencil Computations on CPU and GPU via Multi-Dimensional Homomorphisms
- Archipelago: Architectural Support for Graph Analytics on GPUs
Tuesday, October 6
12:00 pm – 1:00 pm (Keynote, Chair: Hyesoon Kim)
1:00 pm – 2:15 pm (Parallel architectures, Chair: Jose Joao)
- Analyzing and Leveraging Shared L1 Caches in GPUs
- Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration
- Enhancing Address Translations in Throughput Processors via Compression
- Regional Out-of-Order Writes in Total Store Order
- Parallel and Scalable Precise Clustering
2:15 pm – 2:30 pm (Break)
2:30 pm – 3:45 pm (Hardware/software for security & machine learning, Chair: Divya Mahajan)
- SecSched: Flexible Scheduling in Secure Processors
- Clearing the Shadows: Recovering Lost Performance for Invisible Speculative Execution through HW/SW Co-Design
- Fast Convolutional Neural Networks with Fine-Grained FFTs
- Accelerating Sparse CNN Inference on GPUs with Performance-Aware Weight Pruning
- SparseTrain: Leveraging Dynamic Sparsity in Software for Training DNNs on General-Purpose SIMD Processors
3:45 pm – 4:00 pm (Break)
4:00 pm – 5:00 pm (Best paper, Chair: Hyeran Jeon)
- Helix: Algorithm/Architecture Co-design for Accelerating Nanopore Genome Base-calling
- Opportunistic Early Pipeline Re-steering for Data-dependent Branches
- Model-Based Warp Overlapped Tiling for Image Processing Programs on GPUs
- Low-Latency Proactive Continuous Vision
5:00 pm – 6:00 pm (Poster Session)
- Collective Affinity Aware Computation Mapping
- VTensor: Using Virtual Tensors to Build a Layout-oblivious AI Programming Framework
- Parallelizing Parallel Programs: A Dynamic Pattern Analysis for Modernization of Legacy Parallel Code
- A New Qubits Mapping Mechanism for Multi-programming Quantum Computing
- Exploiting Locality in Scalable Ordered Maps
- DeepSwapper: A Deep Learning Based Page Swap Management Scheme for Hybrid Memory Systems
- VP Float: First Class Treatment for Variable Precision Floating Point Arithmetic
- Approximate Pattern Matching for On-Chip Interconnect Traffic Prediction
Wednesday, October 7
12:00 pm – 1:00 pm (Keynote, Chair: Jeffrey S. Young)
1:00 pm – 2:15 pm (Domain/application-specific hardware/software, Chair: Mahmut T. Kandemir)
- The Forward Slice Core Microarchitecture
- A Methodology for Principled Approximation in Visual SLAM
- Memory-Equipped Quantum Architectures – The Power of Random Access
- Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic
- MEPHESTO: Modeling Energy-Performance in Heterogeneous SoCs and Their Trade-Offs
2:15 pm – 2:30 pm (Break)
2:30 pm – 3:45 pm (Memory/storage systems, Chair: Onur Kayiran)
- Ribbon: High Performance Cache Line Flushing for Persistent Memory
- PRISM: Architectural Support for Variable-granularity Memory Metadata
- Valkyrie: Leveraging Inter-TLB Locality to Enhance GPU Performance
- RackMem: A Tailored Caching Layer for Rack Scale Computing
- ATTC(@c): Addressable-TLB based Translation Coherence
3:45 pm – 4:00 pm (Break)
4:00 pm – 4:45 pm (Panel)
- How to broaden participation in compiler/architecture research by utilizing modern infrastructures
Panelists:
Paul V. Gratz (Texas A&M),
Tim Sherwood (University of California Santa Barbara),
Dan Quinlan (Lawrence Livermore National Laboratory),
Daniel Mosse (University of Pittsburgh),
Lena Olson (Google),
Siva Rajamanickam (Sandia National Laboratories).
4:45 pm – 5:00 pm
- Closing Remarks / Award ceremony