osdi 2021 accepted papers
First, it enables a caller to push a message to a callee in two hops, using a new way of assigning mailboxes to users that resembles how a post office assigns PO boxes to its customers. (Visa applications can take at least 30 working days to process.) Therefore, developers typically find data locality issues via dynamic profiling and repair them manually. High-performance tensor programs are critical for efficiently deploying deep neural network (DNN) models in real-world tasks. They collectively make the backup fresh, columnar, and fault-tolerant, even facing millions of concurrent transactions per second. Professor Veloso earned Bachelor and Master of Science degrees in Electrical and Computer Engineering from Instituto Superior Tecnico in Lisbon, Portugal, a Master of Arts in Computer Science from Boston University, and a Master of Science and PhD in Computer Science from Carnegie Mellon University. All papers will be available online to registered attendees before the conference. PC members are not required to read supplementary material when reviewing the paper, so each paper should stand alone without it. Mothy joined the Computer Science Department at ETH Zurich in January 2007 and was named Fellow of the ACM in 2013 for contributions to operating systems and networking research. OSDI brings together professionals from academic and industrial backgrounds in a premier forum for discussing the design, implementation, and implications of systems software. The co-chairs may then share that paper with the workshop's organizers and discuss it with them. As a member of ACCT, I have served two years on the bylaws and governance committee and two years on the finance and audit committee. An evaluation of Addra on a cluster of 80 machines on AWS demonstrates that it can serve 32K users with a 99th-percentile message latency of 726 ms, a 7× improvement over a prior system for text messaging in the same threat model.
To this end, we propose GNNAdvisor, an adaptive and efficient runtime system to accelerate various GNN workloads on GPU platforms. Her specialties include network routing protocols and network security. (Oct 2018) Awarded an Intel Faculty Grant for Research on automated performance optimization (Sep. 2018) Our paper on Foreshadow is accepted to appear at USENIX Security. Session Chairs: Deniz Altınbüken, Google, and Rashmi Vinayak, Carnegie Mellon University, Tanvir Ahmed Khan and Ian Neal, University of Michigan; Gilles Pokam, Intel Corporation; Barzan Mozafari and Baris Kasikci, University of Michigan. HotNets provides a venue for discussing innovative ideas and for debating future research agendas in networking. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast-initialization. In particular, I'll argue for re-engaging with what computer hardware really is today and give two suggestions (among many) about how the OS research community can usefully do this, and exploit what is actually a tremendous opportunity. Radia Perlman is a Fellow at Dell Technologies. We describe Fluffy, a multi-transaction differential fuzzer for finding consensus bugs in Ethereum. Machine learning (ML) models trained on personal data have been shown to leak information about users. Computation separation makes it possible to construct a deep, bounded-asynchronous pipeline where graph and tensor parallel tasks can fully overlap, effectively hiding the network latency incurred by Lambdas. The 20th ACM Workshop on Hot Topics in Networks (HotNets 2021) will bring together researchers in computer networks and systems to engage in a lively debate on the theory and practice of computer networking. We present application studies for 8 applications, improving requests-per-second (RPS) by 7.7% and reducing RAM usage by 2.4%.
We focus on NVMe storage devices and show that it is natural to express these semantics in the kernel and the application and only requires a modest two-bit change to the device interface. Secure Computation (SC) is a family of cryptographic primitives for computing on encrypted data in single-party and multi-party settings. We argue that a key-value interface between a file system and an SSD is superior to the legacy block interface by presenting KEVIN. There is no explicit limit to the response, but authors are strongly encouraged to keep it under 500 words; reviewers are neither required nor expected to read excessively long responses. SC is being increasingly adopted by industry for a variety of applications. In this paper, we show how to address this inefficiency without requiring pages to be rewritten or browsers to be modified. Weak Links in Authentication Chains: A Large-scale Analysis of Email Sender Spoofing Attacks We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. 64 papers accepted out of 341 submitted. OSDI brings together professionals from academic and industrial backgrounds in what has become a premier forum for discussing the design, implementation, and implications of systems software. The overhead of GPT is 5% for memory-intensive workloads (e.g., Redis) and negligible for CPU-intensive workloads (e.g., RV8 and Coremarks). These are hard deadlines, and no extensions will be given. Jaehyun Hwang and Midhul Vuppalapati, Cornell University; Simon Peter, UT Austin; Rachit Agarwal, Cornell University. 
Welcome to the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) submissions site. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. Typically, monolithic kernels share state across cores and rely on one-off synchronization patterns that are specialized for each kernel structure or subsystem. However, a plethora of recent data breaches show that even widely trusted service providers can be compromised. Professor Veloso has been recognized with multiple honors, including being a Fellow of the ACM, IEEE, AAAS, and AAAI. Performance experiments show that GoNFS provides similar performance (e.g., at least 90% throughput across several benchmarks on an NVMe disk) to Linux's NFS server exporting an ext4 file system, suggesting that GoJournal is a competitive journaling system. The NAL maintains 1) per-node partial views in PM for serving insert/update/delete operations with failure atomicity and 2) a global view in DRAM for serving lookup operations. In addition, increasing CPU core counts further complicate kernel development. The OSDI Symposium emphasizes innovative research as well as quantified or insightful experiences in systems design and implementation. MAGE outperforms the OS virtual memory system by up to an order of magnitude, and in many cases, runs SC computations that do not fit in memory at nearly the same speed as if the underlying machines had unbounded physical memory to fit the entire computation. For example, talks may be shorter than in prior years, or some parts of the conference may be multi-tracked. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. SOSP: ACM Symposium on Operating Systems Principles. She also has made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDoS prevention techniques.
Upon these two primitives, our system can scale to thousands of concurrent enclaves with high resource utilization and eliminate the high-cost initialization of secure memory using fork-style enclave creation without weakening the security guarantees. For general conference information, see https://www . Our evaluation shows that, compared to existing participant selection mechanisms, Oort improves time-to-accuracy performance by 1.2X-14.1X and final model accuracy by 1.3%-9.8%, while efficiently enforcing developer-specified model testing criteria at the scale of millions of clients. Based on this observation, P3 proposes a new approach for distributed GNN training. Calibrated interrupts increase throughput by up to 35%, reduce CPU consumption by as much as 30%, and achieve up to 37% lower latency when interrupts are coalesced. There are two major GNN training obstacles: 1) it relies on high-end servers with many GPUs which are expensive to purchase and maintain, and 2) limited memory on GPUs cannot scale to today's billion-edge graphs. Moreover, as of October 2020, a review of the 50 most cited empirical papers that list personality as a keyword indicates that all 50 papers were authored by people with institutional affiliations in the United States, Canada, Germany, the UK, and New Zealand, and only three papers included samples outside of these regions (see Supplementary And yet, they continue to rely on centralized search engines and indexers to help users access the content they seek and navigate the apps. The chairs may reject abstracts or papers on the basis of egregious missing or extraneous conflicts. Hence, CLP enables efficient search and analytics on archived logs, something that was impossible without it. Professor Veloso is the Past President of AAAI (the Association for the Advancement of Artificial Intelligence), and the co-founder, Trustee, and Past President of RoboCup.
blk-switch uses this insight to adapt techniques from the computer networking literature (e.g., multiple egress queues, prioritized processing of individual requests, load balancing, and switch scheduling) to the Linux kernel storage stack. Submissions may include as many additional pages as needed for references but not for appendices. She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. AI enables principled representation of knowledge, complex strategy optimization, learning from data, and support to human decision making. Novel system designs, thorough empirical work, well-motivated theoretical results, and new application areas are all . All submissions will be treated as confidential prior to publication on the USENIX OSDI '21 website; rejected submissions will be permanently treated as confidential. Yuke Wang, Boyuan Feng, Gushu Li, Shuangchen Li, Lei Deng, Yuan Xie, and Yufei Ding, University of California, Santa Barbara. Starting with small invariant formulas and strongest possible invariants avoids large SMT queries, improving SMT solver performance. All the times listed below are in Pacific Daylight Time (PDT). This fast path contains programmable hardware support for low latency transport and congestion control as well as hardware support for efficient load balancing of RPCs to cores. She has a PhD in computer science from MIT. Memory allocation represents significant compute cost at the warehouse scale and its optimization can yield considerable cost savings.
Erhu Feng, Xu Lu, Dong Du, Bicheng Yang, and Xueqiang Jiang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Yubin Xia, Binyu Zang, and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. How can we design systems that will be reliable despite misbehaving participants? Our evaluation on the SPEC benchmarks shows that SanRazor can reduce the overhead of sanitizers significantly, from 73.8% to 28.0–62.0% for AddressSanitizer, and from 160.1% to 36.6–124.4% for UndefinedBehaviorSanitizer (depending on the applied reduction scheme). Indeed, it is a prime target for powerful adversaries such as nation states. Mingyu Li, Jinhao Zhu, and Tianxu Zhang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Cheng Tan, Northeastern University; Yubin Xia, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Sebastian Angel, University of Pennsylvania; Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. Across a wide range of pages, phones, and mobile networks covering web workloads in both developed and emerging regions, Horcrux reduces median browser computation delays by 31-44% and page load times by 18-37%. Papers accompanied by nondisclosure agreement forms will not be considered.
We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. Just using Lambdas on top of CPU servers offers up to 2.75× more performance-per-dollar than training only with CPU servers. PET then automatically corrects results to restore full equivalence. To enable FL developers to interpret their results in model testing, Oort enforces their requirements on the distribution of participant data while improving the duration of federated testing by cherry-picking clients. We present DPF (Dominant Private Block Fairness), a variant of the popular Dominant Resource Fairness (DRF) algorithm that is geared toward the non-replenishable privacy resource but enjoys similar theoretical properties as DRF. We present Nap, a black-box approach that converts concurrent persistent memory (PM) indexes into NUMA-aware counterparts. Today, privacy controls are enforced by data curators with full access to data in the clear. This motivates the need for a new approach to data privacy that can provide strong assurance and control to users. The experimental results show that Penglai can support thousands of enclave instances running concurrently and scale up to 512GB secure memory with both encryption and integrity protection. Fluffy found two new consensus bugs in the most popular Geth Ethereum client which were exploitable on the live Ethereum mainnet. Because DistAI starts with the strongest possible invariants, if the SMT solver fails, DistAI does not need to discard failed invariants, but knows to monotonically weaken them and try again with the solver, repeating the process until it eventually succeeds. We convert five state-of-the-art PM indexes using Nap. Web pages today commonly include large amounts of JavaScript code in order to offer users a dynamic experience.
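The weaken-and-retry loop attributed to DistAI in the passage above (start from the strongest candidate invariants, and weaken a failed candidate instead of discarding it) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: invariants are modelled as disjunctions of string literals, `weaken`, `refine`, and `toy_checker` are hypothetical names, and `toy_checker` merely stands in for a real SMT solver query. It is not DistAI's actual implementation.

```python
from typing import Callable, FrozenSet, List, Optional, Set

# Assumption: a candidate invariant is a disjunction of literals.
# Adding a literal to a disjunction makes it logically weaker.
Clause = FrozenSet[str]


def weaken(clause: Clause, literals: List[str]) -> List[Clause]:
    """Return every clause that is exactly one literal weaker than `clause`."""
    return [clause | {lit} for lit in literals if lit not in clause]


def refine(
    initial: Set[Clause],
    literals: List[str],
    find_failed: Callable[[Set[Clause]], Optional[Clause]],
    max_rounds: int = 1000,
) -> Set[Clause]:
    """Repeatedly replace a failed candidate with its weakenings until the checker accepts."""
    candidates = set(initial)
    for _ in range(max_rounds):
        failed = find_failed(candidates)
        if failed is None:
            return candidates  # every remaining candidate passed the check
        candidates.remove(failed)
        candidates.update(weaken(failed, literals))  # weaken instead of discarding
    raise RuntimeError("no fixed point reached within max_rounds")


def toy_checker(candidates: Set[Clause]) -> Optional[Clause]:
    """Hypothetical stand-in for an SMT query: a clause 'holds' iff it mentions literal 'b'."""
    for clause in sorted(candidates, key=lambda c: sorted(c)):
        if "b" not in clause:
            return clause
    return None


result = refine({frozenset({"a"}), frozenset({"b"})}, ["a", "b", "c"], toy_checker)
```

Because each step only ever adds literals, the candidates move monotonically from strong to weak, which is why the loop terminates on a finite literal set.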
Existing decentralized systems like Steemit, OpenBazaar, and the growing number of blockchain apps provide alternatives to existing services. When registering your abstract, you must provide information about conflicts with PC members. NrOS is primarily constructed as a simple, sequential kernel with no concurrency, making it easier to develop and reason about its correctness. Commonly used log archival and compression tools like Gzip provide high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. He joined Intel Research at Berkeley in April 2002 as a principal architect of PlanetLab, an open, shared platform for developing and deploying planetary-scale services. Responses should be limited to clarifying the submitted work. We have implemented a prototype of our design based on Penglai, an open-sourced enclave system for RISC-V. We demonstrate that KEVIN reduces the amount of I/O traffic between the host and the device, and remains particularly robust as the system ages and the data become fragmented. Message from the Program Co-Chairs. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. We also welcome work that explores the interface to related areas such as computer architecture, networking, programming languages, analytics, and databases. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. Sponsored by USENIX in cooperation with ACM SIGOPS. We propose PET, the first DNN framework that optimizes tensor programs with partially equivalent transformations and automated corrections. P3 exposes a simple API that captures many different classes of GNN architectures for generality. 
Prior or concurrent publication in non-peer-reviewed contexts, like arXiv.org, technical reports, talks, and social media posts, is permitted. Thanks to selective profiling, DMon's profiling overhead is 1.36% on average, making it feasible for production use. Collaboration: You have a collaboration on a project, publication, grant proposal, program co-chairship, or editorship within the past two years (December 2018 through March 2021). The NVMe zoned namespace (ZNS) is emerging as a new storage interface, where the logical address space is divided into fixed-sized zones, and each zone must be written sequentially for flash-memory-friendly access. We first introduce two new hardware primitives: 1) Guarded Page Table (GPT), which protects page table pages to support page-level secure memory isolation; 2) Mountable Merkle Tree (MMT), which supports scalable integrity protection for secure memory. Furthermore, by combining SanRazor with an existing sanitizer reduction tool ASAP, we show synergistic effect by reducing the runtime cost to only 7.0% with a reasonable tradeoff of security. For general conference information, see https://www.usenix.org/conference/osdi22. NrOS replicates kernel state on each NUMA node and uses operation logs to maintain strong consistency between replicas. Many application domains can benefit from hybrid transaction/analytical processing (HTAP) by executing queries on real-time datasets produced by concurrent transactions. The copyback-aware block allocation considers different copy costs at different copy paths within the SSD. While several new GNN architectures have been proposed, the scale of real-world graphs, in many cases billions of nodes and edges, poses challenges during model training.
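The NrOS idea mentioned above (a sequential structure replicated per NUMA node, with a shared operation log keeping replicas consistent) can be sketched in a few lines. This is a single-threaded toy under stated assumptions: `OperationLog` and `Replica` are hypothetical names, the replicated state is a plain dictionary, and real systems of this kind additionally need concurrency control and log garbage collection, which are omitted here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, int]  # assumption: the only mutating operation is "set key to value"


@dataclass
class OperationLog:
    """Shared, append-only log of mutating operations."""
    entries: List[Op] = field(default_factory=list)

    def append(self, op: Op) -> None:
        self.entries.append(op)


@dataclass
class Replica:
    """One per-node copy of the state; it is only ever changed by replaying the log."""
    log: OperationLog
    state: Dict[str, int] = field(default_factory=dict)
    applied: int = 0  # number of log entries this replica has replayed so far

    def sync(self) -> None:
        """Replay any log entries this replica has not applied yet, in log order."""
        while self.applied < len(self.log.entries):
            key, value = self.log.entries[self.applied]
            self.state[key] = value
            self.applied += 1

    def write(self, key: str, value: int) -> None:
        """Mutations go through the shared log, then the local copy catches up."""
        self.log.append((key, value))
        self.sync()

    def read(self, key: str) -> Optional[int]:
        """Reads catch up to the log tail first, so they never observe stale state."""
        self.sync()
        return self.state.get(key)


log = OperationLog()
node0 = Replica(log)
node1 = Replica(log)
node0.write("x", 1)
node1.write("x", 2)
```

Since every replica applies the same operations in the same log order, the per-node data structure itself can stay simple and sequential, which is the property the passage highlights.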
Sijie Shen, Rong Chen, Haibo Chen, and Binyu Zang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai Artificial Intelligence Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. Existing systems that hide voice call metadata either require trusted intermediaries in the network or scale to only tens of users. Leveraging this information, Pollux dynamically (re-)assigns resources to improve cluster-wide goodput, while respecting fairness and continually optimizing each DL job to better utilize those resources. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. Here, we focus on hugepage coverage. This paper presents Zeph, a system that enables users to set privacy preferences on how their data can be shared and processed. See the Preview Session page for an overview of the topics covered in the program. Lifting predicates and crash framing make the specification easy to use for developers, and logically atomic crash specifications allow for modular reasoning in GoJournal, making the proof tractable despite complex concurrency and crash interleavings. Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. Foreshadow was chosen as an IEEE Micro Top Pick. At a high level, Addra follows a template in which callers and callees deposit and retrieve messages from private mailboxes hosted at an untrusted server. Submitted November 12, 2021 Accepted January 20, 2022. Shaghayegh Mardani, UCLA; Ayush Goel, University of Michigan; Ronny Ko, Harvard University; Harsha V. Madhyastha, University of Michigan; Ravi Netravali, Princeton University. Abstract registrations that do not provide sufficient information to understand the topic and contribution (e.g., empty abstracts, placeholder abstracts, or trivial abstracts) will be rejected, thereby precluding paper submission.
This paper demonstrates that it is possible to achieve μs-scale latency using Linux kernel storage stack, even when tens of latency-sensitive applications compete for host resources with throughput-bound applications that perform read/write operations at throughput close to hardware capacity. Although the number of submissions is lower than in the past, it's likely only due to the late announcement; being in my first OSDI PC, I think the quality of the submitted and accepted papers remains as high as ever. Finding the inductive invariant of the distributed protocol is a critical step in verifying the correctness of distributed systems, but takes a long time to do even for simple protocols. Welcome to the 2021 USENIX Annual Technical Conference (ATC '21) submissions site! Existing frameworks optimize tensor programs by applying fully equivalent transformations, which maintain equivalence on every element of output tensors. This year, there were only 2 accepted papers from UK institutes. Submitted papers must be no longer than 12 single-spaced 8.5" x 11" pages, including figures and tables, plus as many pages as needed for references, using 10-point type on 12-point (single-spaced) leading, two-column format, Times Roman or a similar font, within a text block 7" wide x 9" deep. (Jan 2019) Our REPT paper won a best paper at OSDI'18 (Oct 2018) I will serve in the SOSP'19 PC. See the USENIX Conference Submissions Policy for details. Used Zotero to organize papers about the stress and diffusion between anode and electrolyte and made a summary. Poor data locality hurts an application's performance. She is the author of the textbook Interconnections (about network layers 2 and 3) and coauthor of Network Security. Welcome to the 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI '22) submissions site. For instance, FAST '21 and NSDI '21 have author-notification dates after the OSDI '21 abstract-registration deadline.
Perennial 2.0 makes this possible by introducing several techniques to formalize GoJournal's specification and to manage the complexity in the proof of GoJournal's implementation. A glance at this year's OSDI program shows that Operating Systems are a small niche topic for this conference, not even meriting their own full session. Mothy's current research centers on Enzian, a powerful hybrid CPU/FPGA machine designed for research into systems software. Our further evaluation on 38 CVEs from 10 commonly-used programs shows that SanRazor-reduced checks suffice to detect at least 33 out of the 38 CVEs. Ethereum is the second-largest blockchain platform next to Bitcoin. Submission of a response is optional. A graph neural network (GNN) enables deep learning on structured graph data. To help more profitably utilize sanitizers, we introduce SanRazor, a practical tool aiming to effectively detect and remove redundant sanitizer checks. With her students, she has led research in AI, with a focus on robotics and machine learning, having concretely researched and developed a variety of autonomous robots, including teams of soccer robots, and mobile service robots. Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer . Papers must be in PDF format and must be submitted via the submission form. The blockchain community considers this hard fork the greatest challenge since the infamous 2016 DAO hack. We build Polyjuice based on our learning framework and evaluate it against several existing algorithms. Tej Chajed, MIT CSAIL; Joseph Tassarotti, Boston College; Mark Theng, MIT CSAIL; Ralf Jung, MPI-SWS; M. Frans Kaashoek and Nickolai Zeldovich, MIT CSAIL. Using selective profiling, we build DMon, a system that can automatically locate data locality problems in production, identify access patterns that hurt locality, and repair such patterns using targeted optimizations.
The OSDI '21 program co-chairs have agreed not to submit their work to OSDI '21. Sanitizers detect unsafe actions such as invalid memory accesses by inserting checks that are validated during a program's execution. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. We present selective profiling, a technique that locates data locality problems with overhead low enough to be suitable for production use. We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture.