Publications
Cadar C, Keeton K, Pietzuch P, et al., 2016, Proceedings of the 11th European Conference on Computer Systems, EuroSys 2016: Preface, European Conference on Computer Systems (EuroSys 2016)
Goeschka KM, Oliveira R, Pietzuch P, et al., 2016, Special track on dependable and adaptive distributed systems, ACM SAC, Pages: 490-491
Baguena M, Pamboris A, Pietzuch PR, et al., 2016, Towards Enabling Hyper-Responsive Mobile Apps Through Network Edge Assistance, The 13th Annual IEEE Consumer Communications & Networking Conference, Publisher: IEEE
Poor Internet performance currently undermines the efficiency of hyper-responsive mobile apps such as augmented reality clients and online games, which require low-latency access to real-time backend services. While edge-assisted execution, i.e. moving entire services to the edge of an access network, helps eliminate part of the communication overhead involved, this does not scale to the number of users that share an edge infrastructure. This is due to a mismatch between the scarce availability of resources in access networks and the aggregate demand for computational power from client applications. Instead, this paper proposes a hybrid edge-assisted deployment model in which only part of a service executes on LTE edge servers. We provide insights about the conditions that must hold for such a model to be effective by investigating in simulation different deployment and application scenarios. In particular, we show that using LTE edge servers with modest capabilities, performance can improve significantly as long as at most 50% of client requests are processed at the edge. Moreover, we argue that edge servers should be installed at the core of a mobile network, rather than the mobile base station: the difference in performance is negligible, whereas the latter choice entails high deployment costs. Finally, we verify that, for the proposed model, the impact of user mobility on TCP performance is low.
Pamboris A, Pietzuch P, 2015, C-RAM: breaking mobile device memory barriers using the cloud, IEEE Transactions on Mobile Computing, Vol: 15, Pages: 2692-2705, ISSN: 1558-0660
Mobile applications are constrained by the available memory of mobile devices. We present C-RAM, a system that uses cloud-based memory to extend the memory of mobile devices. It splits application state and its associated computation between a mobile device and a cloud node to allow applications to consume more memory, while minimising the performance impact. C-RAM thus enables developers to realise new applications or port legacy desktop applications with a large memory footprint to mobile platforms without explicitly designing them to account for memory limitations. To handle network failures with partitioned application state, C-RAM uses a new snapshot-based fault tolerance mechanism in which changes to remote memory objects are periodically backed up to the device. After failure, or when network usage exceeds a given limit, the device rolls back execution to continue from the last snapshot. C-RAM supports local execution with an application state that exceeds the available device memory through a user-level virtual memory: objects are loaded on-demand from snapshots in flash memory. Our C-RAM prototype supports Objective-C applications on the unmodified iOS platform. With C-RAM, applications can consume 10× more memory than the device capacity, with a negligible impact on application performance. In some cases, C-RAM even achieves a significant speed-up in execution time (up to 9.7×).
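The snapshot-based fault tolerance described in the abstract can be illustrated with a toy sketch (all class and method names here are hypothetical, not C-RAM's actual API): changes to cloud-held objects are periodically backed up to the device, and after a network failure execution continues from the last snapshot.

```python
import copy

class RemoteObjectStore:
    """Toy sketch of snapshot-based fault tolerance for offloaded state."""

    def __init__(self):
        self.remote = {}          # objects offloaded to the cloud node
        self.local_snapshot = {}  # last snapshot kept on the device

    def write(self, key, value):
        self.remote[key] = value

    def take_snapshot(self):
        # periodically back up remote object changes to device storage
        self.local_snapshot = copy.deepcopy(self.remote)

    def recover(self):
        # after a network failure, roll back to the last snapshot
        self.remote = copy.deepcopy(self.local_snapshot)
        return self.remote

store = RemoteObjectStore()
store.write("a", 1)
store.take_snapshot()
store.write("a", 2)                # change after the snapshot is lost on failure
assert store.recover() == {"a": 1}
```

The trade-off mirrored here is the one the paper describes: work done since the last snapshot is lost on failure, in exchange for cheap recovery without cloud cooperation.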
O'Keeffe D, Salonidis T, Pietzuch PR, 2015, Network-Aware Stream Query Processing in Mobile Ad-Hoc Networks, MILCOM 2015, Publisher: IEEE, Pages: 1335-1340
Many real-time decision support and sensing applications can be expressed as continuous stream queries over time-varying data streams, following a data stream management model. We consider the problem of the efficient and resilient execution of continuous stream queries in tactical edge networks formed from mobile ad-hoc networks (MANETs) with limited backend connectivity. Previous approaches for distributed stream query execution target data center environments in which networks are static, and centralized control is feasible. The distributed, bandwidth-constrained and highly dynamic nature of MANETs renders such approaches insufficient: while a stream query executes in a MANET, changes in the network topology mean that any fixed query plan eventually becomes outdated. We introduce an adaptive, network-aware approach for stream query planning in MANETs, which supports both single- and multi-input windowed stream query operators. The basic idea is to increase the path diversity available when executing stream queries by replicating query operators across many nodes in the MANET. During execution, it becomes possible to dynamically switch between different operator replicas based on connectivity and other network path conditions. We evaluate our approach in emulated MANETs, showing that it can substantially increase the robustness of distributed stream query processing under mobility.
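The replica-switching idea can be sketched in a few lines (the metric and field names below are invented for illustration, not the paper's actual mechanism): an upstream node picks, at each decision point, the operator replica whose network path currently looks best.

```python
# Minimal sketch of network-aware replica selection: choose the operator
# replica reachable over the path with the lowest observed cost.
def pick_replica(replicas):
    return min(replicas, key=lambda r: r["path_cost"])

replicas = [
    {"node": "n3", "path_cost": 4.2},
    {"node": "n7", "path_cost": 1.8},            # currently best-connected
    {"node": "n9", "path_cost": float("inf")},   # partitioned away
]
assert pick_replica(replicas)["node"] == "n7"
```

Because the mapping is re-evaluated as path conditions change, no fixed query plan is ever committed to, which is the property the abstract argues MANETs require.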
Muthukumaran D, O’Keeffe D, Priebe C, et al., 2015, FlowWatcher: Defending against Data Disclosure Vulnerabilities in Web Applications, 22nd ACM Conference on Computer and Communications Security (CCS 2015), Publisher: ACM, Pages: 603-615
Bugs in the authorisation logic of web applications can expose the data of one user to another. Such data disclosure vulnerabilities are common: they can be caused by a single omitted access control check in the application. We make the observation that, while the implementation of the authorisation logic is complex and therefore error-prone, most web applications only use simple access control models, in which each piece of data is accessible by a user or a group of users. This makes it possible to validate the correct operation of the authorisation logic externally, based on the observed data in HTTP traffic to and from an application. We describe FlowWatcher, an HTTP proxy that mitigates data disclosure vulnerabilities in unmodified web applications. FlowWatcher monitors HTTP traffic and shadows part of an application's access control state based on a rule-based specification of the user-data-access (UDA) policy. The UDA policy states the intended data ownership and how it changes based on observed HTTP requests. FlowWatcher detects violations of the UDA policy by tracking data items that are likely to be unique across HTTP requests and responses of different users. We evaluate a prototype implementation of FlowWatcher as a plug-in for the Nginx reverse proxy and show that, with short UDA policies, it can mitigate CVE bugs in six popular web applications.
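The core monitoring idea, tracking which user owns a likely-unique data item and flagging responses to other users that contain it, can be sketched as follows (the class and its interface are hypothetical simplifications, not FlowWatcher's rule language):

```python
class UDAMonitor:
    """Toy sketch of shadowing data ownership from observed HTTP traffic."""

    def __init__(self):
        self.owner = {}  # likely-unique data item -> owning user

    def on_request(self, user, items):
        # a UDA rule states that items created in this request belong to `user`
        for item in items:
            self.owner.setdefault(item, user)

    def on_response(self, user, body):
        # report items present in the response but owned by a different user
        return [i for i, o in self.owner.items() if o != user and i in body]

m = UDAMonitor()
m.on_request("alice", ["alice@example.com"])
leaks = m.on_response("bob", "profile: alice@example.com")
assert leaks == ["alice@example.com"]
```

The point of the approach survives the simplification: the check is external to the application, so it catches an omitted access control check without modifying application code.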
Chen X, Rupprecht L, Osman R, et al., 2015, CloudScope: diagnosing and managing performance interference in multi-tenant clouds, 23rd International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), Publisher: IEEE, Pages: 164-173, ISSN: 1526-7539
Virtual machine consolidation is attractive in cloud computing platforms for several reasons including reduced infrastructure costs, lower energy consumption and ease of management. However, the interference between co-resident workloads caused by virtualization can violate the service level objectives (SLOs) that the cloud platform guarantees. Existing solutions to minimize interference between virtual machines (VMs) are mostly based on comprehensive micro-benchmarks or online training which makes them computationally intensive. In this paper, we present CloudScope, a system for diagnosing interference for multi-tenant cloud systems in a lightweight way. CloudScope employs a discrete-time Markov Chain model for the online prediction of performance interference of co-resident VMs. It uses the results to optimally (re)assign VMs to physical machines and to optimize the hypervisor configuration, e.g. the CPU share it can use, for different workloads. We have implemented CloudScope on top of the Xen hypervisor and conducted experiments using a set of CPU, disk, and network intensive workloads and a real system (MapReduce). Our results show that CloudScope interference prediction achieves an average error of 9%. The interference-aware scheduler improves VM performance by up to 10% compared to the default scheduler. In addition, the hypervisor reconfiguration can improve network throughput by up to 30%.
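The discrete-time Markov chain at the heart of CloudScope's prediction can be illustrated with an invented example (the states and transition probabilities below are made up, not measured values from the paper): model a co-resident VM's interference level as states and predict the state distribution a few monitoring intervals ahead.

```python
import numpy as np

# Illustrative discrete-time Markov chain over interference levels.
states = ["low", "medium", "high"]
P = np.array([[0.8, 0.15, 0.05],   # row i: transition probabilities from state i
              [0.3, 0.5,  0.2 ],
              [0.1, 0.4,  0.5 ]])

dist = np.array([1.0, 0.0, 0.0])   # currently observed: low interference
for _ in range(3):                 # predict three monitoring intervals ahead
    dist = dist @ P                # one step: multiply by the transition matrix

assert abs(dist.sum() - 1.0) < 1e-9  # still a probability distribution
print(dict(zip(states, dist.round(3))))
```

A scheduler can then act on the predicted distribution, e.g. migrating a VM when the predicted probability of high interference crosses a threshold, which is the kind of (re)assignment decision the abstract describes.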
Pamboris A, Báguena M, Wolf AL, et al., 2015, Demo: NOMAD: an edge cloud platform for hyper-responsive mobile apps, 13th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys 2015), Publisher: Association for Computing Machinery, Pages: 459-459
Pamboris A, Pietzuch PR, 2015, EdgeReduce: Eliminating mobile network traffic using application-specific edge proxies, ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft), Publisher: Association for Computing Machinery, Pages: 72-82
Mobile carriers are struggling to cope with the surge in smartphone traffic, which reflects badly on end users who often experience poor connectivity in densely populated urban environments. Data transfers between mobile client applications and their Internet backend services contribute significantly to the contention in radio access networks (RANs). Client applications, however, typically transfer unnecessary data because (i) backend service APIs do not support a fine-grained specification of the data actually required by clients and (ii) clients aggressively prefetch data that is never used. We describe EdgeReduce, an automated approach for reducing the data transmitted from backend services to a mobile device. Based on source-level program analysis, EdgeReduce generates application-specific proxies for mobile client applications that execute part of the application logic at the network edge to filter data returned by backend API calls and only send used data to the client. EdgeReduce also permits the tuning of aggressive prefetching strategies: proxies replace large prefetched objects such as images by futures, whose access by the client triggers the retrieval of the object on-demand. We show that EdgeReduce reduces the RAN traffic for real-world iOS client applications by up to 8×, with only a modest increase in response time.
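The future mechanism for prefetched objects can be sketched in a few lines (the class and callback are hypothetical illustrations, not EdgeReduce's implementation): the proxy ships a lightweight placeholder instead of the object, and the first client access triggers the real retrieval.

```python
class Future:
    """Toy sketch of replacing a prefetched object with an on-demand future."""

    def __init__(self, fetch):
        self._fetch = fetch      # callback that retrieves the object on demand
        self._value = None
        self.resolved = False

    def get(self):
        if not self.resolved:    # first access triggers the actual retrieval
            self._value = self._fetch()
            self.resolved = True
        return self._value

image = Future(lambda: b"\x89PNG")   # placeholder sent instead of image bytes
assert not image.resolved             # prefetch transferred no image data
assert image.get().startswith(b"\x89PNG")
assert image.resolved                 # subsequent accesses reuse the value
```

The RAN saving follows directly: objects the client never accesses are never transferred, at the cost of one extra round trip on first access.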
Goeschka KM, Oliveira R, Pietzuch P, et al., 2015, Special track on dependable and adaptive distributed systems, Pages: 426-427
Fernandez RC, Pietzuch P, Kreps J, et al., 2015, Liquid: Unifying nearline and offline big data integration
With more sophisticated data-parallel processing systems, the new bottleneck in data-intensive companies shifts from the back-end data systems to the data integration stack, which is responsible for the pre-processing of data for back-end applications. The use of back-end data systems with different access latencies and data integration requirements poses new challenges that current data integration stacks based on distributed file systems—proposed a decade ago for batch-oriented processing—cannot address. In this paper, we describe Liquid, a data integration stack that provides low latency data access to support near real-time in addition to batch applications. It supports incremental processing, and is cost-efficient and highly available. Liquid has two layers: a processing layer based on a stateful stream processing model, and a messaging layer with a highly-available publish/subscribe system. We report our experience of a Liquid deployment with back-end data systems at LinkedIn, a data-intensive company with over 300 million users.
Báguena M, Pamboris A, Pietzuch P, et al., 2015, Better performance in LTE networks with edge assistance: The world of warcraft case, Pages: 259-260
To improve the performance of Massively Multiplayer Online Games (MMOGs) in mobile networks, we explore the potential benefits of an edge-assisted deployment model: part of the MMOG backend service executes closer to the end user at the edge of the LTE network. We investigate the impact on game latency of (1) the exact placement of such edge servers; (2) the number of cooperating game clients; (3) the amount of client requests served at the network edge; (4) the hardware capabilities of edge servers; and (5) user roaming. Based on our analysis, we show that edge assistance can in fact increase the performance of online games over LTE networks as long as at most 50% of the user requests are processed at the network edge. Furthermore, we argue that the Packet Data Network Gateway (PGW) is the most appropriate place for hosting edge servers and show that TCP performance in the proposed setting is not affected by user roaming.
Pietzuch PR, Mai L, Rupprecht L, et al., 2014, NetAgg: Middleboxes for Application-Specific Traffic Aggregation, 10th International Conference on Emerging Networking Experiments and Technologies (CoNEXT), Publisher: ACM, Pages: 249-262
Pietzuch PR, Frischbier S, Buchmann A, 2014, Managing Expectations: Runtime Negotiation of Information Quality Requirements in Event-based Systems, 12th International Conference on Service-Oriented Computing (ICSOC), Publisher: Springer, Pages: 199-213, ISSN: 0302-9743
Pietzuch PR, Priebe C, Muthukumaran D, et al., 2014, CloudSafetyNet: Detecting Data Leakage between Cloud Tenants, ACM Cloud Computing Security Workshop (CCSW)
Castro Fernandez R, Pietzuch PR, Koshy J, et al., 2014, Liquid: Unifying Nearline and Offline Big Data Integration, Seventh Biennial Conference on Innovative Data Systems Research (CIDR)
Pietzuch PR, Chakravarthy S, Urban S, et al., 2014, Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems, Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems
Pietzuch PR, Fernandez RC, Migliavacca M, et al., 2014, Making State Explicit for Imperative Big Data Processing, USENIX Annual Technical Conference (USENIX ATC), Publisher: USENIX, Pages: 49-60
Data scientists often implement machine learning algorithms in imperative languages such as Java, Matlab and R. Yet such implementations fail to achieve the performance and scalability of specialised data-parallel processing frameworks. Our goal is to execute imperative Java programs in a data-parallel fashion with high throughput and low latency. This raises two challenges: how to support the arbitrary mutable state of Java programs without compromising scalability, and how to recover that state after failure with low overhead. Our idea is to infer the dataflow and the types of state accesses from a Java program and use this information to generate a stateful dataflow graph (SDG). By explicitly separating data from mutable state, SDGs have specific features to enable this translation: to ensure scalability, distributed state can be partitioned across nodes if computation can occur entirely in parallel; if this is not possible, partial state gives nodes local instances for independent computation, which are reconciled according to application semantics. For fault tolerance, large in-memory state is checkpointed asynchronously without global coordination. We show that the performance of SDGs for several imperative online applications matches that of existing data-parallel processing frameworks.
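The notion of partial state can be illustrated with a toy word count (the function names are hypothetical, and the sketch is in Python rather than the paper's Java setting): each node mutates a local state instance independently, and the instances are reconciled with an application-specific merge.

```python
from collections import Counter

def task(node_state, words):
    # each node mutates its local state instance without coordination
    node_state.update(words)

def reconcile(instances):
    # merge the local instances according to application semantics
    merged = Counter()
    for inst in instances:
        merged += inst
    return merged

nodes = [Counter(), Counter()]          # one partial state instance per node
task(nodes[0], ["state", "dataflow"])   # these two calls could run in parallel
task(nodes[1], ["state"])
assert reconcile(nodes) == Counter({"state": 2, "dataflow": 1})
```

This captures the design choice in the abstract: when state cannot be cleanly partitioned across nodes, giving each node an independent local instance keeps the parallel path scalable and defers consistency to an explicit merge step.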
Song J, Cadar C, Pietzuch P, 2014, SymbexNet: Testing network protocol implementations with symbolic execution and rule-based specifications, IEEE Transactions on Software Engineering, Vol: 40, Pages: 695-709, ISSN: 0098-5589
Pietzuch PR, Fernandez RC, Weidlich M, et al., 2014, Grand Challenge: Scalable Stateful Stream Processing for Smart Grids, 8th ACM International Conference on Distributed Event Based Systems (DEBS)
We describe a solution to the ACM DEBS Grand Challenge 2014, which evaluates event-based systems for smart grid analytics. Our solution follows the paradigm of stateful data stream processing and is implemented on top of the SEEP stream processing platform. It achieves high scalability by massive data-parallel processing and the option of performing semantic load-shedding. In addition, our solution is fault-tolerant, ensuring that the large processing state of stream operators is not lost after failure. Our experimental results show that our solution processes 1 month worth of data for 40 houses in 4 hours. When we scale out the system, the time reduces linearly to 30 minutes before the system bottlenecks at the data source. We then apply semantic load-shedding, maintaining a low median prediction error and reducing the time further to 17 minutes. The system achieves these results with median latencies below 30 ms and a 90th percentile below 50 ms.
Pietzuch PR, Tang Y, Wang T, et al., 2014, IncBM-Tree: Outsourcing Big Stream with Authenticated Freshness (Demo), IEEE International Conference on Data Engineering (ICDE), Publisher: IEEE
Bacon J, Eyers D, Pasquier TFJ-M, et al., 2014, Information Flow Control for Secure Cloud Computing, IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, Vol: 11, Pages: 76-89, ISSN: 1932-4537
Fernandez RC, Weidlich M, Pietzuch P, et al., 2014, Scalable stateful stream processing for smart grids, DEBS 2014 - Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems, Pages: 276-281
Magoutis K, Pietzuch P, 2014, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics): Preface, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol: 8460 LNCS, ISSN: 0302-9743
Tang Y, Wang T, Hu X, et al., 2014, Outsourcing Multi-Version Key-Value Stores with Verifiable Data Freshness, 2014 IEEE 30TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), Pages: 1214-1217, ISSN: 1084-4627
Frischbier S, Turan E, Gesmann M, et al., 2014, Effective runtime monitoring of distributed event-based enterprise systems with ASIA, IEEE 7th International Conference on Service-Oriented Computing and Applications (SOCA), Publisher: IEEE, Pages: 41-48, ISSN: 2163-2871
Mai L, Rupprecht L, Costa P, et al., 2013, Supporting Application-specific In-network Processing in Data Centres, ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, Vol: 43, Pages: 519-520, ISSN: 0146-4833
Mai L, Rupprecht L, Costa P, et al., 2013, Supporting application-specific in-network processing in data centres, SIGCOMM 2013 - Proceedings of the ACM SIGCOMM 2013 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, Pages: 519-520
Ogden P, Thomas DBJ, Pietzuch PR, 2013, Scalable XML query processing using parallel pushdown transducers, Proceedings of the VLDB Endowment, Vol: 6, Pages: 1738-1749, ISSN: 2150-8097
In online social networking, network monitoring and financial applications, there is a need to query high rate streams of XML data, but methods for executing individual XPath queries on streaming XML data have not kept pace with multicore CPUs. For data-parallel processing, a single XML stream is typically split into well-formed fragments, which are then processed independently. Such an approach, however, introduces a sequential bottleneck and suffers from low cache locality, limiting its scalability across CPU cores. We describe a data-parallel approach for the processing of streaming XPath queries based on pushdown transducers. Our approach permits XML data to be split into arbitrarily-sized chunks, with each chunk processed by a parallel automaton instance. Since chunks may be malformed, our automata consider all possible starting states for XML elements and build mappings from starting to finishing states. These mappings can be constructed independently for each chunk by different CPU cores. For streaming queries from the XPathMark benchmark, we show a processing throughput of 2.5 GB/s, with near linear scaling up to 64 CPU cores.
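The chunked-automaton idea can be illustrated with a drastically simplified finite-state version (the paper uses full pushdown transducers; here the only state is whether the scanner is inside an XML tag): each chunk is scanned once for every possible starting state, yielding a start-to-finish state mapping, and the per-chunk mappings compose cheaply in sequence after the chunks were processed in parallel.

```python
def scan(chunk, in_tag):
    # run the tiny automaton over one chunk from a given starting state
    for c in chunk:
        if c == "<":
            in_tag = True
        elif c == ">":
            in_tag = False
    return in_tag

def chunk_mapping(chunk):
    # build the start->finish mapping without knowing the true start state;
    # this is the part that runs independently on each CPU core
    return {s: scan(chunk, s) for s in (False, True)}

doc = "<a>text</a><b"
chunks = [doc[:5], doc[5:10], doc[10:]]        # arbitrarily-sized, malformed
mappings = [chunk_mapping(c) for c in chunks]  # parallelisable per core

state = False                   # the document starts outside a tag
for m in mappings:              # cheap sequential composition of mappings
    state = m[state]
assert state is True            # the stream ends inside the open "<b"
assert state == scan(doc, False)  # matches a single sequential scan
```

The key property shown is that chunks need not be well-formed fragments: enumerating all candidate starting states removes the sequential split bottleneck the abstract describes.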
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.