Languages, Systems, and Data Seminar (Spring 2024)

Time: Fridays, noon - 1:05pm (PT)
Location: The Internet / The LSD Lab (Engineering 2, Room 398)
Organizers: Lindsey Kuper, Tyler Sorensen, and Gan Shen

The Languages, Systems, and Data Seminar meets weekly to discuss interesting topics in the areas of programming languages, systems, databases, formal methods, security, software engineering, verification, architecture, and beyond. Our goal is to encourage interactions and discussions between students, researchers, and faculty with interests in these areas. The seminar is open to everyone interested. Participating UCSC students should register for the 2-credit course CSE 280O (let the organizers know if you’re an undergrad and need a permission code).

For spring 2024, we will continue to host the LSD Seminar in a hybrid fashion. Anyone can attend on Zoom, and local folks can gather in person in the lab. Speakers can join either in person or on Zoom, whichever is convenient.

Talks will be advertised on the ucsc-lsd-seminar-announce (for anyone) and lsd-group (for UCSC-affiliated people) mailing lists.

Date	Speaker	Title
April 5	Luke Geeson	Compiler Testing with Relaxed Memory Models
April 12	Dan Plyukhin	Ozone: Fully Out-of-Order Choreographies
April 19	Ismail Kuru	Modal Abstractions for Virtualizing Memory Addresses
April 26	Jennifer Switzer	Hardware Repurposing to Reduce the Embodied Carbon of Computing
May 3	Julian Haas	LoRe: Reasoning about Safety and Consistency in Local-First Software
May 10	Robin Brown	WebAssembly Components: The Modular Polyglot Ecosystem We Need
May 17	Zach Sisco	Hardware Decompilation: Recovering Abstraction in Digital Circuits
May 24	Guannan Wei	Types and Metaprogramming for Correct, Safe, and Performant Software Systems
May 31	Canceled
June 7	Canceled

April 5

Speaker: Luke Geeson

Title: Compiler Testing with Relaxed Memory Models

Abstract: Finding bugs is key to the correctness of compilers in wide use today. If the behaviour of a compiled program, as allowed by its architecture memory model, is not a behaviour of the source program under its source model, then there is a bug. This holds for all programs, but we focus on concurrency bugs that occur only with two or more threads of execution. We focus on testing techniques that detect such bugs in C/C++ compilers.

We seek a testing technique that automatically covers concurrency bugs up to fixed bounds on program sizes and that scales to find bugs in compiled programs with many lines of code. Otherwise, a testing technique can miss bugs. Unfortunately, the state-of-the-art techniques are yet to satisfy all of these properties.

We present the Téléchat compiler testing tool for concurrent programs. Téléchat compiles a concurrent C/C++ program and compares source and compiled program behaviours using source and architecture memory models. We make three claims: Téléchat improves the state-of-the-art at finding bugs in code generation for multi-threaded execution, it is the first public description of a compiler testing tool for concurrency that is deployed in industry, and it is the first tool that takes a significant step towards the desired properties. We provide experimental evidence suggesting Téléchat finds bugs missed by other state-of-the-art techniques, case studies indicating that Téléchat satisfies the properties, and reports of our experience deploying Téléchat in industry regression testing.

Based on work to appear in the International Symposium on Code Generation and Optimization (CGO) 2024: https://conf.researchr.org/info/cgo-2024/accepted-papers

And recently presented work at The Future of Weak Memory Workshop (POPL) 2024: https://lukegeeson.com/talks/2024-01-15-POPL24/

Bio: Luke is a Computer Science PhD student at UCL, supervised by James Brotherston, Earl Barr, and Lee Smith. He is developing techniques to find concurrency bugs in C/C++ compilers using formal models of relaxed memory concurrency.

He is based in the compiler teams at Arm in Cambridge where he assists engineers with finding bugs and deploying automated concurrency testing as part of an EPSRC grant. Luke’s opinions are his own and Arm does not endorse his work.

April 12

Speaker: Dan Plyukhin

Title: Ozone: Fully Out-of-Order Choreographies

Abstract: Choreographic programming is a paradigm for writing distributed applications. It allows programmers to write a single program, called a choreography, that can be compiled to generate correct implementations of each process in the application. Although choreographies provide good static guarantees, they can exhibit high latency when messages or processes are delayed. This is because processes in a choreography typically execute in a fixed, deterministic order, and cannot adapt to the order that messages arrive at runtime. In non-choreographic code, programmers can address this problem by allowing processes to execute out of order – for instance by using futures or reactive programming. However, in choreographic code, out-of-order process execution can lead to serious and subtle bugs, called communication integrity violations (CIVs).

In this paper, we develop a model of choreographic programming for out-of-order processes that guarantees absence of CIVs and deadlocks. As an application of our approach, we also introduce an API for safe non-blocking communication via futures in the choreographic programming language Choral. The API allows processes to execute out of order, participate in multiple choreographies concurrently, and to handle unordered or dropped messages as in the UDP transport protocol. We provide an illustrative evaluation of our API, showing that out-of-order execution can reduce latency by overlapping communication with computation.

Bio: Dan works as a postdoc at the University of Southern Denmark and is receiving his PhD from the University of Illinois at Urbana-Champaign (UIUC). He is the author of the UIGC library for Akka, which provides fault-tolerant resource management for distributed actor systems. Dan’s research focuses on making distributed reactive applications easier to write and reason about.

April 19

Speaker: Ismail Kuru

Title: Modal Abstractions for Virtualizing Memory Addresses

Abstract: Virtual Memory Managers are critical pieces of general-purpose OS kernels. They enable virtualizing the addresses of memory regions by realizing the address translation mechanism using CPU’s memory management’s (MMU) kernel-controlled page tables. Operating systems manipulate these virtualized memory mappings to isolate untrusted processes, restrict which memory is accessible to different processes, hide memory limits from user programs, etc.

Unfortunately, verifying them becomes challenging as they are interfaced by the complex hardware: the page tables are updated via writes to those memory locations, using addresses that are themselves virtualized! Prior work on verification of VMM has either only handled a single address space, trusted significant pieces of assembly code, or resorted to direct reasoning over machine semantics rather than exposing a clean logical interface.

In this talk, I will be explaining the logical abstractions, some of which are inspired by Hybrid Logic and allow us to mention resources (virtualized memory addresses) belonging to different address spaces within the same specification.

Bio: I am Ismail Kuru, a final year PhD student at Drexel University, and I am advised by Dr. Colin S. Gordon. Right before coming to Drexel, I was a senior software engineer at CRYTEK Gaming Company. Before then, I had finished my computer science masters courses at TU Munich and graduated with an M.S. degree from Koc University as a Microsoft Research EMEA scholar for graduate studies.

April 26

Speaker: Jennifer Switzer

Title: Hardware Repurposing to Reduce the Embodied Carbon of Computing

Abstract: Sustainable computing efforts have traditionally focused on runtime efficiency. However, a significant fraction of the carbon emissions associated with computing systems are incurred not during use, but rather manufacture. These embodied emissions are responsible for 40% of the lifetime carbon footprint for server-class hardware, and as much as 80% for consumer electronics like smartphones. This work presents a new way of thinking about sustainable computing, in terms of both operational and embodied emissions. It explores one consequence of this thinking—that reducing the demand for newly manufactured hardware is an important strategy for emissions reduction. To this end, we propose the repurposing of consumer-class hardware as general purpose computing or sensing hardware. This is explored through several real-world deployments of repurposed smartphones. We find that repurposed devices can provide a computing platform that is several times more carbon-efficient than the alternative of manufacturing new hardware, and explore the applications for which repurposed devices are best suited.

Bio: Jennifer Switzer is a PhD candidate at UC San Diego. Her research interests lie at the intersection of sustainability and computing systems, and especially efforts to reduce the embodied carbon footprint of computing. She is supported by a Google Fellowship.

May 3

Speaker: Julian Haas

Title: LoRe: Reasoning about Safety and Consistency in Local-First Software

Abstract: The “Local-First Software” movement calls for distributed applications that move data processing from the cloud back to local user devices. This allows for applications that work offline and preserve user privacy while still enabling collaboration and data synchronization. Unfortunately, the distributed and asynchronous nature of such applications makes them hard to reason about and existing programming models provide little to no support for verification.

We propose LoRe, a programming language and compiler that automatically verifies developer-supplied safety properties for local-first applications. LoRe combines the declarative data flow of reactive programming with static analysis and verification techniques to precisely determine concurrent interactions that violate safety invariants and to selectively employ strong consistency through coordination where required. We introduce a formalized proof principle and demonstrate how to automate the process in a prototype implementation that outputs verified executable code.

The talk will be based on work that appeared in the ACM Transactions on Programming Languages and Systems (TOPLAS) 2024: https://dl.acm.org/doi/10.1145/3633769 And at ECOOP 2023: https://doi.org/10.4230/LIPIcs.ECOOP.2023.12

Bio: Julian is a third-year PhD student at TU Darmstadt in Germany, supervised by Mira Mezini and co-supervised by Annette Bieniusa at TU Kaiserslautern. In his research, he is working on programming languages and verification tools for distributed systems, with a focus on privacy-preserving decentralized applications. When not in front of a screen, he enjoys hiking and playing board games.

May 10

Speaker: Robin Brown

Title: WebAssembly Components: The Modular Polyglot Ecosystem We Need

Abstract: With the release of WASI 0.2 and the Component Model, WebAssembly (Wasm) has facilities for high-level interop between Wasm guests and hosts, as well as other guests. While some of this was possible with various project-bespoke ABIs, having a shared standards-track canonical ABI and Component Model enables the creation of a broader ecosystem without fragmentation.

An increasing number of programming languages are able to produce components which implement the standard WASI “worlds” for servers and CLI applications as well as an infinite variety of custom interfaces. There’s also been innovation in tools like WAC that make it possible to compose arbitrary components statically with strong static type checking. This makes Wasm incredibly powerful as a way to build modular composable polyglot systems.

The ecosystem being built around Wasm and the Component Model make it a very promising target for new programming languages especially if they build Component-native toolchains and take the most advantage of WASI, the Warg registry protocol, Wasm-to-Wasm optimizers like Binaryen, and Wasm-based dev tools like Wow. I created a new language called Claw recently to demonstrate this and also provide an ideal “glue code” language for augmenting and testing existing Components.

In this talk, I will * explain what Wasm and Components are, * summarize the state of the Wasm ecosystem, * argue that language designers and developers should consider targetting Wasm, * and give a live demo of Claw and Wow.

Bio: Robin Brown is a co-chair of a Bytecode Alliance group that brings together programming language ecosystems and helps them create Wasm component tooling that feels native to their ecosystem, which is called the Guest Languages SIG. She is also the creator of the compile-to-component programming language Claw and the co-creator of the Warg protocol, which is an open source Wasm registry protocol with a focus on supply chain security, federation, and offline mirroring.

May 17

Speaker: Zach Sisco

Title: Hardware Decompilation: Recovering Abstraction in Digital Circuits

Abstract: My research introduces the problem of “hardware decompilation”. Analogous to software decompilation, hardware decompilation is about analyzing a low-level artifact—in this case a netlist, i.e., a graph of wires and logical gates representing a digital circuit—in order to recover higher-level programming abstractions, and using those abstractions to generate code written in a hardware description language (HDL). The overall problem of hardware decompilation requires a number of pieces. In my initial paper, published at PLDI 2023, I focus on one specific piece of the puzzle: a technique called “hardware loop rerolling”. Hardware loop rerolling leverages clone detection and program synthesis techniques to identify repeated logic in netlists (such as would be synthesized from loops in the original HDL code) and reroll them into syntactic loops in the recovered HDL code. In this talk, I will introduce what hardware decompilation is, and why you would want to use a hardware decompiler. Then, I will describe my solution to the hardware loop rerolling problem, and give a preview of in-progress work tackling more aspects of hardware decompilation.

Bio: Zach Sisco is a PhD candidate at UC Santa Barbara. He is advised by Professors Jonathan Balkind and Ben Hardekopf. Zach’s research is about applying solver-aided programming techniques to problems in hardware design. His website is: https://zsisco.github.io/

May 24

Speaker: Guannan Wei

Title: Types and Metaprogramming for Correct, Safe, and Performant Software Systems

Abstract: In this talk, I will present some novel directions to build correct, safe, and performant software systems using programming languages and metaprogramming techniques. In the first part of the talk, I will present reachability type systems, a family of static type systems to track sharing, separation, and side effects in higher-order imperative programs. Reachability types lead to a smooth combination of Rust-style ownership types with higher-level programming abstractions (such as first-class functions). In the second part, I will discuss how metaprogramming techniques can help build correct, flexible, and performant program analyzers. I will present GenSym, a parallel symbolic-execution compiler that is derived from a high-level definitional symbolic interpreter using program generation techniques. GenSym generates code in continuation-passing style to perform parallel symbolic execution of LLVM IR programs, and significantly outperforms similar state-of-the-art tools. The talk also covers my future research plan to apply reachability types in designing languages for quantum computing.

Bio: Guannan Wei is a postdoctoral researcher at Purdue University. He will join Tufts University as a tenure-track assistant professor in Fall 2025. His research interests lie in programming languages and software engineering. His contributions have been published in flagship programming languages and software engineering venues, such as POPL, OOPSLA, ICFP, ECOOP, ICSE, and ESEC/FSE. Guannan received his PhD degree (2023) in Computer Science from Purdue University, advised by Tiark Rompf. More of Guannan’s work can be found at https://continuation.passing.style.