Time: Fridays, noon - 1:15pm (PT)
Location: The Internet
Organizers: Lindsey Kuper and Tyler Sorensen
The Languages, Systems, and Data Seminar meets weekly to discuss interesting topics in the areas of programming languages, systems, databases, formal methods, security, software engineering, verification, architecture, and beyond. Our goal is to encourage interactions and discussions between students, researchers, and faculty with interests in these areas. The seminar is open to everyone interested. Participating students should register for the 2-credit course CSE 280O.
For winter 2021, this seminar is completely virtual and will feature a mix of internal and external speakers.
|Jan. 8||None||Introductions and social time|
|Jan. 15||Kenny Foner||Dialectic: Pragmatic, Efficient Session Types for Async Rust|
|Jan. 22||Naorin Hossain||TransForm: Formally Specifying Transistency Models and Synthesizing Enhanced Litmus Tests|
|Jan. 29||Aviral Goel||On the Design, Implementation, and Use of Laziness in R|
|Feb. 5||Victor Ying||Parallelizing Sequential Code with Compiler-Hardware Co-Design|
|Feb. 12||Paulette Koronkevich||The ANF Translation Preserves Dependent Types up to Extensional Equality|
|Feb. 19||Kamala Ramasubramanian||ACT now: Aggregate Comparison of Traces for Incident Localization|
|Feb. 26||Prakash Murali||Closing the Gap between Quantum Algorithms and Hardware using Compilation and Architecture|
|March 5||Vinu Joseph||Programmable Neural Network Compression with Correctness Emphasis|
|March 12||Daniel Bittman||Twizzler: Evolving Operating Systems for Non-volatile Memory|
Introductions and social time
Speaker: Kenny Foner
Title: Dialectic: Pragmatic, Efficient Session Types for Async Rust
Abstract: Session types have long promised the ability to enforce complex temporal invariants about message ordering in distributed systems, guaranteeing that a well-session-typed program always correctly follows a messaging protocol to its end. Despite their potential, these type systems have seen relatively little adoption in the mainstream. In part, this is due to their inherent reliance on linear typing, a feature not present and difficult to emulate in most popular languages. Recently, this has changed with the emergence of Rust, a systems programming language built atop a flexible “ownership type system” to track reference aliasing. Although Rust enforces in actuality a form of affine typing (weaker than the linear typing most formal session-typed calculi require) it’s enough to build a library for session types that statically enforces the slightly-weaker guarantee that a program correctly follows a messaging protocol so long as it is running, but may quit the session early. Of course, this is the best we can hope to get in the real world! Networks go down, computers crash, and hardware just plain breaks. For session types to be broadly adopted, they must survive contact with this real and unpredictable world—and equally, with the real and unpredictable people who want to use them in their programs.
I’ve just released version 0.2 of Dialectic: a library for pragmatic, succinct, efficient session types in Rust. Dialectic embraces Rust’s burgeoning ecosystem of high-performance async networking by being polymorphic over any backend transport used to convey messages between parties. Unlike many libraries for session types, Dialectic assumes that the other party might break protocol or disconnect at any time, and is designed to gracefully handle such failures without compromising type safety. Dialectic is designed to be used for writing specifications and programs of every size from small to large, and provides what’s needed to write modular specifications and implementations of complex protocols—and not merely the regular session types expressible in most libraries, but all context-free session types.
More than merely showing off this cool thing I’m working on, I want to talk about the design process that went into making it. Designing a programming paradigm—whether you want to call it an embedded domain-specific language or merely a library—is a multi-faceted puzzle spanning considerations from psychological familiarity to formal computability. My hope is that in walking through this case study in embedded language design, we can spark more thoughts about how we as PL researchers can play in this field together and design beautiful, useful tools.
Bio: Kenny Foner (email@example.com) is a senior software engineer at Bolt Labs working on privacy-preserving financial technology for everyone. They have a master’s degree in programming languages from the University of Pennsylvania, where they worked on a smorgasbord of fun things from laziness to random testing. When they’re not writing Rust, they’re usually somewhere in the forest.
Speaker: Naorin Hossain
Title: TransForm: Formally Specifying Transistency Models and Synthesizing Enhanced Litmus Tests
Abstract: Memory consistency models (MCMs) have been formulated as a mechanism for expressing the legal ordering and visibility of shared memory accesses in hardware and software. They are fundamental for ensuring heterogeneous components of a system execute and interact as expected to prevent hardware-induced bugs in real-world programs. However, ISA-level MCMs are limited to defining the behavior of only user-facing assembly instructions and do not account for virtual memory implementations that may result in the execution of 1) hardware-level state updates and 2) system-level interactions. Both are capable of accessing memory and may affect program outcomes, thus making them software-visible. As a result, memory transistency models (MTMs) have been coined as a superset of MCMs to additionally capture and enforce virtual memory-aware ordering rules. However, no prior work enabled the formal specification or analysis of MTMs.
TransForm fills this gap by introducing an axiomatic vocabulary for formally specifying MTMs that builds on the standard axiomatic vocabulary traditionally used for describing MCMs. It provides new constructs for modeling transistency-specific features such as hardware-level state updates and system-level interactions. Using this new axiomatic vocabulary, MTMs can be formally specified and used with TransForm’s synthesis engine to synthesize litmus tests enhanced with transistency features, called enhanced litmus tests. This talk will cover TransForm’s axiomatic vocabulary and synthesis engine, as well as a case study performed with TransForm to formally define an approximate MTM for Intel x86 processors.
Speaker: Aviral Goel
Title: On the Design, Implementation, and Use of Laziness in R
Abstract: In this talk, I will present the design and implementation of call-by-need in R, and a data-driven study of how generations of programmers have put laziness to use in their code. In our study, we analyze 16,707 R packages and observe the creation of 270.9 B promises. Our data suggest that there is little supporting evidence to assert that programmers use laziness to avoid unnecessary computation or to operate over infinite data structures. For the most part R code appears to have been written without reliance on and in many cases even knowledge of, delayed argument evaluation. The only significant exception is a small number of packages which leverage call-by-need for meta-programming. I will discuss how we intend to leverage these insights to remove laziness from R and enable non-intrusive migration of code from lazy to eager evaluation.
Bio: Aviral Goel is a Computer Science Ph.D. student at Northeastern University, advised by Professor Jan Vitek. He received his Bachelor’s degree in Electronics and Communication Engineering from Netaji Subhas Institute of Technology, India.
He is interested in improving tools and techniques for data science applications. He is enabling R programmers to write faster and bug-free code by migrating the language from lazy-by-default to lazy-on-demand semantics.
He is also involved in the development of a type system for R.
Speaker: Victor Ying
Title: Parallelizing Sequential Code with Compiler-Hardware Co-Design
Abstract: Today, most code still runs on expensive, power-hungry processors that prioritize single-thread performance. Speculative parallelization is an enticing approach to accelerate computation while retaining the ease of sequential programming, by launching tasks in parallel before knowing if they are independent. Unfortunately, prior speculative parallelizing compilers and architectures achieved limited speedups due to high costs of recovering from misspeculation and hardware scalability bottlenecks.
We present T4, a parallelizing compiler that executes sequential programs as trees of tiny timestamped tasks. T4 targets the recent Swarm architecture, which presents new opportunities and challenges for automatic parallelization. T4 introduces novel compiler techniques to expose parallelism aggressively across the entire program, breaking applications into tiny tasks of tens of instructions each. Task trees unfold their branches in parallel to enable high task-spawn throughput while exploiting selective aborts to recover from misspeculation cheaply. T4 exploits parallelism across function calls, loops, and loop nests; performs new transformations to reduce task spawn costs and avoid false sharing; and exploits data locality among fine-grain tasks. As a result, T4 scales several hard-to-parallelize SPEC CPU2006 benchmarks to tens of cores, where prior work attained little or no speedup.
For more information, please visit swarm.csail.mit.edu
Bio: Victor Ying is a 5th year PhD student at MIT, advised by Daniel Sanchez. He works on parallel architectures, compilers, and programming models. Victor’s recent work focuses on redesigning abstractions between hardware and software to make it as easy to exploit multicore parallelism as it is to write ordinary sequential programs. His prior work includes Boolean satisfiability solvers, scheduling machine learning workloads on hardware accelerators, and embedded and distributed systems.
Speaker: Paulette Koronkevich
Title: The ANF Translation Preserves Dependent Types up to Extensional Equality
Abstract: Many programmers use dependently-typed languages like Coq to machine-verify high-assurance software. However, existing compilers for these languages provide no guarantees after compiling, nor when linking after compilation. Type-preserving compilers preserve guarantees encoded in types, then use type checking to verify compiled code and ensure safe linking with external code. Unfortunately, dependent type systems are highly sensitive to syntactic changes, including compilation, so preserving them through a compiler pass is difficult.
In this talk, I will present some examples of why dependent typing is difficult to preserve through simple syntactic changes. I will also present our solution to preserving dependent types through the ANF translation, a necessary transformation towards compiling a functional language down to machine code. Our ANF translation preserves dependent types, provided that the target type system has a way to encode these syntactic semantics-preserving changes. We encode these by including extensional equality in our target type system.
Bio: Paulette Koronkevich is a second year graduate student (finishing MSc and starting PhD) at the University of British Columbia, working with William J. Bowman. She has a undergraduate degree in computer science from Indiana University. Her interests include compilers, cats, and cooking.
Speaker: Kamala Ramasubramanian
Title: ACT now: Aggregate Comparison of Traces for Incident Localization
Abstract: Incidents in production systems are common and downtime is expensive. Applying an appropriate mitigating action quickly, such as changing a specific firewall rule or routing around a broken network link, saves money. Identifying where to mitigate is time-consuming since a single failure can produce widespread effects. Knowing how different system events relate to each other is necessary to quickly identify where to mitigate. Our approach, Aggregate Comparison of Traces (ACT), localizes incidents by comparing sets of traces (which capture events and their relationships for individual requests) sampled from the most recent steady-state operation and during an incident. In our quantitative experiments, we conduct hundreds of simulations and show that ACT is able to identify exactly where to mitigate in all but a few cases.
Bio: Kamala Ramasubramanian is a PhD candidate at University of California, Santa Cruz advised by Peter Alvaro. She works on understanding, implementing and troubleshooting distributed systems by reasoning about observed system executions. She recently became interested in system verification and how it may overlap with her current work. She is a vegetarian foodie, likes to CrossFit and wants to try surfing.
Speaker: Prakash Murali
Title: Closing the Gap between Quantum Algorithms and Hardware using Compilation and Architecture
Abstract: In recent years, quantum computing (QC) hardware has progressed considerably with small systems being prototyped by industry and academic vendors. However, there is a huge gap between the resource requirements of promising applications and the hardware that is buildable now; qubit counts and operational noise constraints of applications exceed hardware capabilities by 5-6 orders of magnitude. Our work seeks to enable practical QC by bridging this gap: from the top with novel compiler techniques and algorithmic optimizations to reduce application requirements and from the bottom via system architectures efficiently exploiting scarce QC resources.
In this talk, we present two cross-cutting optimizations that narrow the applications-to-hardware resource gap. First, we present noise-adaptive compilation techniques that optimize applications for the spatio-temporal noise variations seen in real QC systems. Using real executions, we demonstrate average fidelity improvements of 3X using noise-adaptivity, compared to industry compiler tools. Second, on the architecture front, we study instruction set design issues considering application requirements and hardware gate calibration overheads. Current QC systems either use ISAs with a single two-qubit gate type or families of continuous gate sets. Using architectural simulations based on Google and Rigetti hardware, we show that QC instruction sets with 4-8 two-qubit gate types are best suited for expressing application requirements, while incurring tractable calibration overheads. In response to our work, several industry vendors have included noise-adaptivity and its extensions as part of their toolflows and adjusted device architecture to expose more native operations and hardware characterization data.
Speaker: Vinu Joseph
Title: Programmable Neural Network Compression with Correctness Emphasis
Abstract: Deep neural networks (DNNs) frequently contain far more weights, represented at higher precision, than are required for the specific task which they are trained to perform. Consequently, they can often be compressed using techniques such as weight pruning and quantization that reduce both the model size and inference time without appreciable loss in accuracy. However, finding the best compression strategy and corresponding target sparsity for a given DNN, hardware platform, and optimization objective currently requires expensive, frequently manual, trial-and-error experimentation. In this talk, we introduce a programmable system for model compression called Condensa. Users programmatically compose simple operators, in Python, to build more complex and practically interesting compression strategies. Given a strategy and user-provided objective (such as minimization of running time), Condensa uses a novel Bayesian optimization-based algorithm to automatically infer desirable sparsities. Our experiments real-world DNNs demonstrate memory footprint and hardware runtime throughput improvements of 188x and 2.59x, respectively, using at most ten samples per search. We have released a reference implementation of Condensa. Next, we will talk about our recent correctness extension where we preserve not just the overall accuracy but also metrics relating to model fairness and interpretability. To achieve this, we augment the compression loss function with terms arising from the teacher-student learning paradigm and show how to automatically tune the associated parameters. We demonstrate the effectiveness of our approach on multiple compression schemes and accuracy recovery algorithms using several different real-world network architectures. We obtain a significant reduction of up to 4.1X in the number of mismatches between the compressed and reference models, while also achieving a significant reduction in class-wise disparate impact metrics compared to the reference models.
Bio: Vinu Joseph is a Ph.D. candidate in Computer Science at the School of Computing at the University of Utah, Salt Lake City, working on efficient deep learning computing, robustness, and security of deep learning algorithms, advised by Prof. Ganesh Gopalakrishnan. He is one of the five recipients of the NVIDIA Graduate fellowship, the recipients were selected based on their academic achievements and area of research. Prior to graduate studies, Vinu worked at ARM Inc. During his tenure at ARM, he was a recipient of the Bravo award for developing the programmer’s model for verifying real-time (‘R’) profile architecture which provides high-performing processors for safety-critical environments. Vinu’s current research focuses on optimizing deep neural network-based systems for performance and scalability. More broadly, His research is at the intersection of systems, programming languages, and machine learning, to create a more efficient, performant, secure, privacy-preserving, and correct software. His Ph.D. research has been mainly focused on deep neural network compression for resource-efficient inference and robustness. He is generously supported by an NVIDIA Ph.D. fellowship, mentored by Saurav Muralidharan and Michael Garland, he developed Condensa: A Programming System for Model Compression and Optimization.
Speaker: Daniel Bittman
Title: Twizzler: Evolving Operating Systems for Non-volatile Memory
Abstract: Byte addressable, non-volatile memory (NVM) demands that we rethink the entire system stack. Twizzler is an operating system designed for the near-future of NVM on the memory bus. It removes the kernel from the I/O path, provides programs with direct access to NVM, enabling simpler and more efficient long-term operations on persistent data. Twizzler provides a clean-slate programming model for persistent data, realizing the vision of Unix in a world of NVM. It does this through a pervasive notion of data identity coupled with an efficient design for persistent pointers that allows programmers to construct persistent data structures in a large, persistent, global address space. This talk will present an overview of Twizzler and discuss upcoming directions for Twizzler in security, distribution, and programming languages.Bio
Bio: Daniel is a PhD candidate at UC Santa Cruz, advised by Ethan Miller and Peter Alvaro. His interests are in kernel programming and design, security, non-volatile memory, and concurrent programming. His current project is on developing an operating system for non-volatile memories and developing operating system designs and interfaces for better programming and data models in such an environment.