Middleware '23: Proceedings of the 24th International Middleware Conference

Full Citation in the ACM Digital Library

EESMR: Energy Efficient BFT---SMR for the masses

Adithya Bhat
Akhil Bandarupalli
Manish Nagaraj
Saurabh Bagchi
Aniket Kate
Michael K. Reiter

Modern Byzantine Fault-Tolerant State Machine Replication (BFT-SMR) solutions focus on reducing communication complexity, improving throughput, or lowering latency. This work explores the energy efficiency of BFT-SMR protocols. First, we propose a novel SMR protocol that optimizes for the steady state, i.e., when the leader is correct. This is done by reducing the number of required signatures per consensus unit and the communication complexity by order of the number of nodes n compared to the state-of-the-art BFT-SMR solutions. Concretely, we employ the idea that a quorum (collection) of signatures on a proposed value is avoidable during the failure-free runs. Second, we model and analyze the energy efficiency of protocols and argue why the steady-state needs to be optimized. Third, we present an application in the cyber-physical system (CPS) setting, where we consider a partially connected system by optionally leveraging wireless multicasts among neighbors. We analytically determine the parameter ranges for when our proposed protocol offers better energy efficiency than communicating with a baseline protocol utilizing an external trusted node. We present a hypergraph-based network model and generalize previous fault tolerance results to the model. Finally, we demonstrate our approach's practicality by analyzing our protocol's energy efficiency through experiments on a CPS test bed. In particular, we observe as high as 64% energy savings when compared to the state-of-the-art SMR solution for n = 10 settings using BLE.

Fast VM Replication on Heterogeneous Hypervisors for Robust Fault Tolerance

Jean-Baptiste Decourcelle
Tu Dinh Ngoc
Boris Teabe
Daniel Hagimont

The reliability of virtualization infrastructures in the face of availability issues is a long-standing problem. Current fault tolerance approaches such as live VM replication are effective at addressing external, accidental issues (e.g. hardware failures, power cuts, environmental disasters); however, against an active attacker exploiting zero-day denial-of-service (DoS) vulnerabilities in the hypervisor itself, these approaches do not address the root cause of said vulnerabilities, and therefore cannot protect against these issues. This is made more relevant by the prevalence of DoS vulnerabilities among many widely used hypervisors.

We introduce heterogeneous replication, a new solution that enhances live VM replication so that VMs can be replicated across different hypervisors. We show that heterogeneous replication not only mitigates accidental failures from the external operational environment, but also mitigates DoS attacks arising from hypervisor vulnerabilities. We further show that heterogeneous replication can be used to increase the security of virtualized infrastructures without sacrificing availability.

We build HERE, our implementation of the heterogeneous replication concept for replicating a protected VM across hypervisor boundaries. We describe the implementation of HERE, including details on the necessary VM state replications, as well as a dynamic checkpoint interval adjustment scheme that maximizes VM protection based on load levels. We evaluate HERE using various benchmarks to show that HERE meets the goal of protecting VMs from availability issues while adapting to the VM's computing load.

Smartpick: Workload Prediction for Serverless-enabled Scalable Data Analytics Systems

Anshuman Das Mohapatra
Kwangsung Oh

Many data analytic systems have adopted a newly emerging compute resource, serverless (SL), to handle data analytics queries in a timely and cost-efficient manner, i.e., serverless data analytics. While these systems can start processing queries quickly thanks to the agility and scalability of SL, they may encounter performance-and cost-bottlenecks based on workloads due to SL's worse performance and more expensive cost than traditional compute resources, e.g., virtual machine (VM). In this paper, we introduce Smartpick, a SL-enabled scalable data analytics system that exploits SL and VM together to realize composite benefits, i.e., agility from SL and better performance with reduced cost from VM. Smartpick uses a machine learning prediction scheme, decision-tree based Random Forest with Bayesian Optimizer, to determine SL and VM configurations, i.e., how many SL and VM instances for queries, that meet cost-performance goals. Smartpick offers a knob for applications to allow them to explore a richer cost-performance tradeoff space opened by exploiting SL and VM together. To maximize the benefits of SL, Smartpick supports a simple but strong mechanism, called relay-instances. Smartpick also supports event-driven prediction model retraining to deal with workload dynamics. A Smartpick prototype was implemented on Spark and deployed on live testbeds, Amazon AWS and Google Cloud Platform. Evaluation results indicate 97.05% and 83.49% prediction accuracies respectively with up to 50% cost reduction as opposed to the baselines. The results also confirm that Smartpick allows data analytics applications to navigate the richer cost-performance tradeoff space efficiently and to handle workload dynamics effectively and automatically.

Sora: A Latency Sensitive Approach for Microservice Soft Resource Adaptation

Jianshu Liu
Qingyang Wang
Shungeng Zhang
Liting Hu
Dilma Da Silva

Fast response time for modern web services that include numerous distributed and lightweight microservices becomes increasingly important due to its business impact. While hardware-only resource scaling approaches (e.g., FIRM [47] and PARSLO [40]) have been proposed to mitigate response time fluctuations on critical microservices, the re-adaptation of soft resources (e.g., threads or connections) that control the concurrency of hardware resource usage has been largely ignored. This paper shows that the soft resource adaptation of critical microservices has a significant impact on system scalability because either under- or over-allocation of soft resources can lead to inefficient usage of underlying hardware resources. We present Sora, an intelligent, fast soft resource adaptation management framework for quickly identifying and adjusting the optimal concurrency level of critical microservices to mitigate service-level objective (SLO) violations. Sora leverages online fine-grained system metrics and the propagated deadline along the critical path of request execution to quickly and accurately provide optimal concurrency setting for critical microservices. Based on six real-world bursty workload traces and two representative microservices benchmarks (Sock Shop and Social Network), our experimental results show that Sora can effectively mitigate large response time fluctuations and reduce the 99th percentile latency by up to 2.5× compared to the hardware-only scaling strategy FIRM [47] and 1.5× to the state-of-the-art concurrency-aware system scaling strategy ConScale.

INSANE: A Unified Middleware for QoS-aware Network Acceleration in Edge Cloud Computing

Lorenzo Rosa
Andrea Garbugli
Antonio Corradi
Paolo Bellavista

Edge cloud computing is a promising programming and deployment paradigm to empower delay-sensitive applications. By executing close to the network edge, distributed applications can have quicker reactions to event occurrence and consequently prompter dynamic adaptations. In addition, recent improvements in connectivity support allow developers to benefit from heterogeneous and alternative communication technologies (e.g., RDMA, DPDK, XDP, etc.) to meet the requirements of network-intensive edge applications. However, exploiting these technologies makes applications statically tailored to a specific network interface; this significantly limits the potential of edge cloud computing, where application components should be able to migrate seamlessly at runtime. INSANE aims at solving that issue by exposing a technology-agnostic middleware API that lets developers simply specify their QoS communication requirements; the dynamic selection of the most appropriate technology on the currently hosting edge node is delegated to INSANE. The paper also presents how it is possible to develop two different INSANE-based applications (a decentralized messaging system and an image streaming framework) with a few lines of code. Finally, an extensive performance evaluation shows that our middleware adds very limited ns-scale overhead to the raw acceleration technologies.

An End-to-End Performance Comparison of Seven Permissioned Blockchain Systems

Frank Christian Geyer
Hans-Arno Jacobsen
Ruben Mayer
Peter Mandl

The emergence of numerous blockchain solutions, offering innovative approaches to optimise performance, scalability, privacy, and governance, complicates performance analysis. Reasons for the difficulty of benchmarking blockchains include, for example, the high number of system parameters to configure and the effort to deploy a blockchain network. In addition, performance data, which mostly comes from system vendors, is often opaque. We provide a reproducible evaluation of the performance of seven permissioned blockchain systems across different parameter settings. We employ an end-to-end approach, where the clients sending the transactions are fully involved in the data collection approach. Our results underscore the unique characteristics and limitations of the systems we examined. Due to the insights given, our work forms the basis for continued research to optimise the performance of blockchain systems.

SinClave: Hardware-assisted Singletons for TEEs

Franz Gregor
Robert Krahn
Do Le Quoc
Christof Fetzer

For trusted execution environments (TEEs), remote attestation permits establishing trust in software executed on a remote host. It requires that the measurement of a remote TEE is both complete and fresh: We need to measure all aspects that might determine the behavior of an application, and this measurement has to be reasonably fresh. Performing measurements only at the start of a TEE simplifies the attestation but enables "reuse" attacks of enclaves. We demonstrate how to perform such reuse attacks for different TEE frameworks. We also show how to address this issue by enforcing freshness -- through the concept of a singleton enclave -- and completeness of the measurements. Completeness of measurements is not trivial since the secrets provisioned to an enclave and the content of the filesystem can both affect the behavior of the software, i.e., can be used to mount reuse attacks. We present mechanisms to include measurements of these two components in the remote attestation. Our evaluation based on real-world applications shows that our approach incurs only negligible overhead ranging from 1.03% to 13.2%.

LO: An Accountable Mempool for MEV Resistance

Bulat Nasrulin
Georgy Ishmaev
Jérémie Decouchant
Johan Pouwelse

Manipulation of user transactions by miners in permissionless blockchain systems is a growing concern. This problem is a pervasive and systemic issue that incurs high costs for users of decentralised applications and is known as Miner Extractable Value (MEV). Furthermore, transaction manipulations create other issues such as congestion, higher fees, and system instability. Detecting transaction manipulations is difficult, even though it is known that they originate from the pre-consensus phase of transaction selection for building blocks, at the base layer of blockchain protocols. In this paper, we summarize known transaction manipulation attacks. We present LO, an accountable base layer protocol designed to detect and mitigate transaction manipulations. LO is built around the accurate detection of transaction manipulations and assignment of blame at the granularity of a single mining node. LO forces miners to log all the transactions they receive into a secure mempool data structure and to process them in a verifiable manner. Overall, LO quickly and efficiently detects censorship, injection or re-ordering attempts. Our performance evaluation shows that LO is also practical and only introduces a marginal performance overhead.

Basalt: A Rock-Solid Byzantine-Tolerant Peer Sampling for Very Large Decentralized Networks

Alex Auvolat
Yérom-David Bromberg
Davide Frey
Djob Mvondo
François Taïani

Recent large-scale Byzantine-Fault-Tolerant (BFT) algorithms provide scalability at a low cost by exploiting a secure Random Peer Sampling (RPS) service: a service that provides a stream of random network nodes where no attacking entity can become over-represented. Unfortunately, producing good peer samples untainted by Byzantine behavior in a large-scale network is particularly difficult, with existing solutions unable to withstand aggressive attacks. In this paper, we propose a novel RPS algorithm, BASALT, that implements what we have termed a stubborn chaotic search over node IDs to counter attackers' attempts at becoming over-represented. Our evaluation based on a theoretical analysis, Monte Carlo simulations, and experiments on a live cryptocurrency network shows that BASALT delivers close-to-optimal protection against malicious behaviors and outperforms state-of-the-art solutions by a wide margin.

PrimCast: A Latency-Efficient Atomic Multicast

Leandro Pacheco
Paulo Coelho
Fernando Pedone

Atomic multicast is a communication abstraction that allows for messages to be addressed to and reliably delivered by multiple process groups, while ensuring a partial order on delivered messages. Strong ordering guarantees can greatly simplify the design and implementation of distributed applications. One critical property for the performance and scalability of an atomic multicast protocol is that of genuineness: a protocol is said to be genuine if only the sender and destinations of a message are involved in ordering the message. This paper presents PrimCast, the first genuine atomic multicast protocol able to deliver messages at every destination in three communication steps. PrimCast uses a primary-based consensus protocol for deciding on message timestamps at each group. Differently from previous work, it does not rely on consensus for advancing and maintaining logical clocks. PrimCast introduces a novel approach, relying on simple quorum intersection, to decide when a multicast message can be delivered. We also show how loosely synchronized clocks can be used to reduce the convoy effect that delays messages under high system load. We present the complete algorithm for PrimCast and evaluate its performance under various scenarios. Our results show that PrimCast achieves lower latency than state-of-the-art approaches while providing higher or comparable throughput.

OrderlessChain: A CRDT-based BFT Coordination-free Blockchain Without Global Order of Transactions

Pezhman Nasirifard
Ruben Mayer
Hans-Arno Jacobsen

Existing permissioned blockchains often rely on coordination-based consensus protocols to ensure the safe execution of applications in a Byzantine environment. Furthermore, the protocols serialize the transactions by ordering them in a global order. The serializ-ability preserves the correctness of the application's state stored on the blockchain. However, coordination-based protocols limit the throughput and scalability and induce high latency. In contrast, application-level correctness requirements exist that are not dependent on the order of transactions, known as invariant-confluence (I-confluence). The I-confluent applications can execute transactions in a coordination-free manner, benefiting from the improved scalability compared to the coordination-based approaches. The safety and liveness of I-confluent applications are studied in non-Byzantine environments, but the correct execution of such applications remains a challenge in Byzantine coordination-free environments. We introduce OrderlessChain, a novel permissioned blockchain based on a novel BFT coordination-free protocol for the safe and live execution of I-confluent applications in a Byzantine environment. We implemented a prototype of our system, and our evaluation results show that our coordination-free approach performs significantly better than coordination-based blockchains.

Characterizing Distributed Machine Learning Workloads on Apache Spark: (Experimentation and Deployment Paper)

Yasmine Djebrouni
Isabelly Rocha
Sara Bouchenak
Lydia Chen
Pascal Felber
Vania Marangozova
Valerio Schiavoni

Distributed machine learning (DML) environments are widely used in many application domains to build decision-making systems. However, the complexity of these environments is overwhelming for novice users. On the one hand, data scientists are more familiar with hyper-parameter tuning and typically lack an understanding of the trade-offs and challenges of parameterizing DML platforms to achieve good performance. On the other hand, system administrators focus on tuning distributed platforms, unaware of the possible implications of the platform on the quality of the learning models. To shed light on such parameter configuration interplay, we run multiple DML workloads on the widely used Apache Spark distributed platform, leveraging 13 popular learning methods and 6 real-world datasets on two distinct clusters. We collect and perform an in-depth analysis of workload execution traces to compare the efficiency of different configuration strategies. We consider tuning only hyper-parameters, tuning only platform parameters, and jointly tuning both hyper-parameters and platform parameters. We publicly release our collected traces and derive key takeaways on DML workloads. Counter-intuitively, platform parameters have a higher impact on the model quality than hyper-parameters. More generally, we show that multi-level parameter configuration can provide better results in terms of model quality and execution time while also optimizing resource costs.

Pravega: A Tiered Storage System for Data Streams

Raúl Gracia-Tinedo
Flavio Junqueira
Tom Kaitchuck
Sachin Joshi

The growing popularity of the data stream abstraction entails new challenging requirements when it comes to data ingestion and storage. Many organizations expect to retain data streams for extended periods of time and to store such stream data in a cost-effective manner. It is also crucial to reconcile apparently opposite properties, like data durability and consistency, along with high performance. Furthermore, data streams should not only deal with a high degree of parallelism, but also adapt to fluctuating workloads with little or no admin intervention. To our knowledge, no storage system for data streams fully copes with all these requirements.

In this paper, we present Pravega: a distributed, tiered storage system for data streams. Pravega streams are unbounded by design and cost-effective, as the system automatically moves data to a long-term storage tier (e.g., S3, NFS) and transparently manages it for the user. Pravega guarantees no duplicate or missing events, as well as per routing-key event ordering, while providing high performance streaming IO and historical reads. As a unique feature, Pravega streams are elastic: they can automatically change their degree of parallelism based on the ingestion workload. We compared the performance of Pravega with Apache Kafka and Apache Pulsar on AWS. Our results certify that Pravega can deliver performance improvements over them in many scenarios.

Bridging the Gap of Timing Assumptions in Byzantine Consensus

Zixuan Chen
Lei Fan
Shengyun Liu
Marko Vukolić
Xiangzhe Wang
Jingjing Zhang

Asynchronous Byzantine Fault-Tolerant (BFT) consensus protocols maintain strong consistency across nodes (i.e., ensure safety) and terminate probabilistically (i.e., ensure liveness) despite unbounded network delay. In contrast to protocols under partial synchrony, asynchronous counterparts pay no extra timing assumptions for electing a special role, and thus is more robust to network issues. To formally study this feature, we propose a new classification method for consensus and accordingly categorize relevant work: timing-balanced protocols are those that do not introduce strictly stronger timing-related assumptions for liveness, compared to ones required by safety.

We further propose ThemiX, a novel timing-balanced protocol under the hybrid model: either there is no corrupt node, or the number of correct and online nodes constitute a majority. ThemiX tolerates f Byzantine faults with a total of n = 2f + 1 nodes, achieving optimal resilience. If every node is honest or benign, ThemiX is an asynchronous protocol. Otherwise, ThemiX relies on timing assumptions for ensuring safety and probabilistic termination. No leader or any special role is elected. To boost performance, we further integrate two practical mechanisms that respectively allow ThemiX to proceed at the pace of actual network speed and bypass the expensive coin-tossing phase (i.e., randomization). Large-scale experiments on Amazon EC2 platform show that ThemiX achieves up to 86% reduction in latency compared to the consensus component of HoneyBadgerBFT and provides sustainable performance under simulated network faults.

Kernel-as-a-Service: A Serverless Programming Model for Heterogeneous Hardware Accelerators

Tobias Pfandzelter
Aditya Dhakal
Eitan Frachtenberg
Sai Rahul Chalamalasetti
Darel Emmot
Ninad Hogade
Rolando Pablo Hong Enriquez
Gourav Rattihalli
David Bermbach
Dejan Milojicic

With the slowing of Moore's law and decline of Dennard scaling, computing systems increasingly rely on specialized hardware accelerators in addition to general-purpose compute units. Increased hardware heterogeneity necessitates disaggregating applications into workflows of fine-grained tasks that run on a diverse set of CPUs and accelerators. Current accelerator delivery models cannot support such applications efficiently, as (1) the overhead of managing accelerators erases performance benefits for fine-grained tasks; (2) exclusive accelerator use per task leads to underutilization; and (3) specialization increases complexity for developers.

We propose adopting concepts from Function-as-a-Service (FaaS), which has solved these challenges for general-purpose CPUs in cloud computing. Kernel-as-a-Service (KaaS) is a novel serverless programming model for generic compute accelerators that aids heterogeneous workflows by combining the ease-of-use of higher-level abstractions with the performance of low-level hand-tuned code. We evaluate KaaS with a focus on the breadth of the idea and its generality to diverse architectures rather than on an in-depth implementation for a single accelerator. Using proof-of-concept prototypes, we show that this programming model provides performance, performance efficiency, and ease-of-use benefits across a diverse range of compute accelerators. Despite increased levels of abstraction, when compared to a naive accelerator implementation, KaaS reduces completion times for fine-grained tasks by up to 96.0% (GPU), 68.4% (FPGA), 98.6% (TPU), and 34.9% (QPU) in our experiments.

SecV: Secure Code Partitioning via Multi-Language Secure Values

Peterson Yuhala
Pascal Felber
Hugo Guiroux
Jean-Pierre Lozi
Alain Tchana
Valerio Schiavoni
Gaël Thomas

Trusted execution environments like Intel SGX provide enclaves, which offer strong security guarantees for applications. Running entire applications inside enclaves is possible, but this approach leads to a large trusted computing base (TCB). As such, various tools have been developed to partition programs written in languages such as C or Java into trusted and untrusted parts, which are run in and out of enclaves respectively. However, those tools depend on language-specific taint-analysis and partitioning techniques. They cannot be reused for other languages and there is thus a need for tools that transcend this language barrier.

We address this challenge by proposing a multi-language technique to specify sensitive code or data, as well as a multi-language tool to analyse and partition the resulting programs for trusted execution environments like Intel SGX. We leverage GraalVM's Truffle framework, which provides a language-agnostic abstract syntax tree (AST) representation for programs, to provide special AST nodes called secure nodes that encapsulate sensitive program information. Secure nodes can easily be embedded into the ASTs of a wide range of languages via Truffle's polyglot API. Our technique includes a multi-language dynamic taint tracking tool to analyse and partition applications based on our generic secure nodes. Our extensive evaluation with micro- and macro-benchmarks shows that we can use our technique for two languages (Javascript and Python), and that partitioned programs can obtain up to 14.5% performance improvement as compared to unpartitioned versions.

Fast, Light-weight, and Accurate Performance Evaluation using Representative Datacenter Behaviors

Jaewon Lee
Dongmoon Min
Ilkwon Byun
Hanhwi Jang
Jangwoo Kim

Datacenters rapidly evolve by adopting new features such as new hardware deployment and software patches. Adopting a new feature requires an accurate evaluation of its impact to minimize the risk to the multi-million dollar computing infrastructure. However, a comprehensive performance analysis of a datacenter is extremely challenging due to its cost and multitenancy. Evaluating the performance in a live datacenter is accurate but prohibitive to prevent any damage to production services. Using conventional load-testing benchmarks on small-scale testbeds is imprecise as they do not consider the effect of other co-located jobs.

In this paper, we propose FLARE, a fast, lightweight, and accurate performance evaluation method using representative datacenter behaviors. The key idea is to extract a small set of representative job colocation scenarios from all possible job colocations in a target datacenter. FLARE systematically characterizes and groups job colocations according to performance and resource metrics, providing high-level insights into the datacenter's behaviors. Then, it reconstructs the colocations on a testbed and allows accurate feature evaluation with load-testing benchmarks. We evaluate FLARE using an in-house datacenter and three features: cache sizing, DVFS, and SMT configurations. FLARE accurately estimates the impact of features with less than 1% errors by incurring 50× and 10× lower evaluation costs compared to full datacenter and sampling-based evaluation, respectively.

Systematic Analysis of DDS Implementations

Vincent Bode
David Buettner
Tobias Preclik
Carsten Trinitis
Martin Schulz

Publish-subscribe messaging is a popular communication paradigm in the (Industrial) Internet of Things, and the Data Distribution Service (DDS) is a well known standard for pub-sub communication middleware. Many vendor implementations of DDS exist, leaving users with the need to choose according to project and performance requirements. However, the wide range of parameters in DDS implementations not covered in the standard specification make this selection difficult and time-consuming. We present DDS-Perf, a novel and versatile cross-vendor benchmarking tool for performance analysis, and use it to provide data from studies on 4 popular DDS implementations (OpenDDS, RTI Connext, FastDDS and CycloneDDS) across a wide range of experimental setups. DDS-Perf allows us to provide a consistent methodology across all vendors, increasing fairness and comparability. Overall, we find that RTI Connext achieves the best all-round performance (exhibiting the best bandwidth and peak sample rate), while FastDDS (best end-to-end latency) and CycloneDDS also show promising results.

Glider: Serverless Ephemeral Stateful Near-Data Computation

Daniel Barcelona-Pons
Pedro García-López
Bernard Metzler

Serverless data analytics generate a large amount of intermediate data during computation stages. However, serverless functions, which are short-lived and lack direct communication, face significant challenges in managing this data effectively. The traditional approach of using object storage to carry the data proves to be slow and costly, as it involves constant movement of data back and forth. Although specialized ephemeral storage solutions have been developed to address this issue, they fail to tackle the fundamental challenge of minimizing data movements. This work focuses on incorporating near-data computation into an ephemeral storage system to reduce the volume of transferred data in serverless analytics. We present Glider with the aim to enhance communication between serverless compute stages, allowing data to smoothly "glide" through the processing pipeline instead of bouncing between different services. Glider achieves this by leveraging stateful near-data execution of complex data-bound operations and an efficient I/O streaming interface. Under evaluation, it reduces data transfers by up to 99.7%, improves storage utilization by up to 99.8%, and enhances performance by up to 2.7×. In sum, Glider improves serverless data analytics by optimizing data movement, streamlining processing, and avoiding redundant transfers.

SnapStore: A Snapshot Storage System for Serverless Systems

Abhisek Panda
Smruti R. Sarangi

Serverless computing is getting increasingly popular because of its fine-grained billing model and autoscaling features. To speed up the process of functions' sandbox creation, cloud providers typically utilize snapshot and restore-based mechanisms for pre-warmed snapshots. This effectively trades off the startup latency with the storage requirements and the overhead of creating/restoring these snapshots. Hence, there is a need to compress the snapshots by identifying identical data chunks across snapshots and then design methods to quickly deduplicate snapshots and retrieve them. We propose SnapStore -- a novel method of finding such duplicates. As opposed to conventional work that relies on better hashing methods, we use the natural structure of the program's memory map to reduce wasted work during deduplication. Furthermore, we sequentialize and minimize disk accesses as much as possible while retrieving a snapshot into a RAM-based cache. Both of these optimizations, yield a reasonably large speedup in the deduplication process as compared to the state-of-the-art (≈ 46% in the snapshot deduplication time and ≈ 82.6% in the retrieval time on HDDs). Upon integration with FaaSnap (a state-of-the-art serverless platform), SnapStore improves the end-to-end latency of serverless functions by 25.9% along with 2.4× storage space reduction over vanilla FaaSnap on HDDs. With SSDs, our deduplication time and retrieval time reduce by 36.2% and 75.8%, respectively, with almost no degradation in the end-to-end latency.

DynaCut: A Framework for Dynamic and Adaptive Program Customization

Abhijit Mahurkar
Xiaoguang Wang
Hang Zhang
Binoy Ravindran

Software is becoming increasingly complex and feature-rich, yet only part of any given codebase is frequently used. Existing software customization and debloating approaches target static binaries, focusing on feature discovery, control-flow analysis, and binary rewriting. As a result, the customized program binary has a smaller attack surface as well as less available functionality. This means that once a software's use scenario changes, the customized binary may not be usable.

This paper presents DynaCut, for dynamic software code customization. DynaCut can disable "not being used" code features during software runtime and re-enable them when required again. DynaCut works at the binary level; no source code is needed. To achieve its goal, DynaCut includes a dynamic process rewriting technique that seamlessly and transparently updates the image of a running process, with specific code features blocked or re-enabled. To help identify potentially unused code, DynaCut employs an execution trace-based differential analysis to pinpoint the code related to specific software features, which can be dynamically turned on/off based on user configuration. We also develop automatic methods to locate code that is only temporally used (e.g., initialization code), which can be dropped in a timely manner (e.g., after the initialization phase).

We prototype DynaCut and evaluate it using 3 widely used server applications and the SPECint2017_speed benchmark suite. The result shows that, compared to existing static binary customization approaches, DynaCut removes an additional 10% of code on average and up to 56% of temporally executed code due to the dynamic code customization.

FlexCast: Genuine Overlay-based Atomic Multicast

Eliã Batista
Paulo Coelho
Eduardo Alchieri
Fernando Dotti
Fernando Pedone

Atomic multicast is a communication abstraction where messages are propagated to groups of processes with reliability and order guarantees. Atomic multicast is at the core of strongly consistent storage and transactional systems. This paper presents FlexCast, the first genuine overlay-based atomic multicast protocol. Genuineness captures the essence of atomic multicast in that only the sender of a message and the message's destinations coordinate to order the message, leading to efficient protocols. Overlay-based protocols restrict how process groups can communicate. Limiting communication leads to simpler protocols and reduces the amount of information each process must keep about the rest of the system. FlexCast implements genuine atomic multicast using a complete DAG overlay. We experimentally evaluate FlexCast in a geographically distributed environment using gTPC-C, a variation of the TPC-C benchmark that takes into account geographical distribution and locality. We show that, by exploiting genuineness and workload locality, FlexCast outperforms well-established atomic multicast protocols without the inherent communication overhead of state-of-the-art non-genuine multicast protocols.

FLIPS: Federated Learning using Intelligent Participant Selection

Rahul Atul Bhope
K. R. Jayaram
Nalini Venkatasubramanian
Ashish Verma
Gegi Thomas

This paper presents the design and implementation of FLIPS, a middleware system to manage data and participant heterogeneity in federated learning (FL) training workloads. In particular, we examine the benefits of label distribution clustering on participant selection in federated learning. FLIPS clusters parties involved in an FL training job based on the label distribution of their data apriori, and during FL training, ensures that each cluster is equitably represented in the participants selected. FLIPS can support the most common FL algorithms, including FedAvg, FedProx, FedDyn, FedOpt and FedYogi. To manage platform heterogeneity and dynamic resource availability, FLIPS incorporates a straggler management mechanism to handle changing capacities in distributed, smart community applications. Privacy of label distributions, clustering and participant selection is ensured through a trusted execution environment (TEE). Our comprehensive empirical evaluation compares FLIPS with random participant selection, as well as three other "smart" selection mechanisms -- Oort [51], TiFL [15] and gradient clustering [27] using four real-world datasets, two different non-IID distributions and three common FL algorithms (FedYogi, FedProx and FedAvg). We demonstrate that FLIPS significantly improves convergence, achieving higher accuracy by 17-20 percentage points with 20-60% lower communication costs, and these benefits endure in the presence of straggler participants.

Trustworthy confidential virtual machines for the masses

Anna Galanou
Khushboo Bindlish
Luca Preibsch
Yvonne-Anne Pignolet
Christof Fetzer
Rüdiger Kapitza

Confidential computing alleviates the concerns of distrustful customers by removing the cloud provider from their trusted computing base and resolves their disincentive to migrate their workloads to the cloud. This is facilitated by new hardware extensions, like AMD's SEV Secure Nested Paging (SEV-SNP), which can run a whole virtual machine with confidentiality and integrity protection against a potentially malicious hypervisor owned by an untrusted cloud provider. However, the assurance of such protection to either the service providers deploying sensitive workloads or the end-users passing sensitive data to services requires sending proof to the interested parties. Service providers can retrieve such proof by performing remote attestation while end-users have typically no means to acquire this proof or validate its correctness and therefore have to rely on the trustworthiness of the service providers.

In this paper, we present Revelio, an approach that features two main contributions: i) it allows confidential virtual machine (VM)-based workloads to be designed and deployed in a way that disallows any tampering even by the service providers and ii) it empowers users to easily validate their integrity. In particular, we focus on web-facing workloads, protect them leveraging SEV-SNP, and enable end-users to remotely attest them seamlessly each time a new web session is established. To highlight the benefits of Revelio, we discuss how a standalone stateful VM that hosts an open-source collaboration office suite can be secured and present a replicated protocol proxy that enables commodity users to securely access the Internet Computer, a decentralized blockchain infrastructure.

The Middleware conference adheres strictly to the ACM policies against discrimination and harassment.

A Twitter List by TwitterDev

OPEN TABLE OF CONTENTS

Middleware '23: Proceedings of the 24th International Middleware Conference

The Middleware conference adheres strictly to the ACM policies against discrimination and harassment.