|
SUNDAY, JUNE 25
|
| 08:00-12:00
|
Parallel Tutorials
Reliability-Aware Microprocessor Architectures
Sarita Adve and Pradip Bose,
University of Illinois and IBM TJ Watson Research
In this tutorial, we present the foundational principles and
methodologies behind the design of microprocessors that meet
market-driven reliability targets, in the face of technological
constraints imposed by the late CMOS era. The emphasis is on
early-stage (pre-RTL) definition at the microarchitecture level,
although relevant details from lower levels of design (e.g.
logic, circuits and below) are also covered where appropriate. In
particular, in order to explain the methodology for modeling the
effects of failures at the microarchitectural level, we delve
into some details of the actual physical failure mechanisms, and
the models that govern their onset and propagation.
We first cover the topic of pre-silicon modeling to estimate
performance, power, temperature and reliability, in the context
of target workloads of interest to the design team. While
estimating failure rates and mean time to failure (MTTF), we
consider the effects of hard (or permanent) failures, as well as
soft (or transient) errors. We address the issues of modeling
accuracy and validation in some detail. In particular, we examine
the basic axioms that guide alternate modeling methodologies, and
discuss the range of validity of these assumptions. We also touch
on the important topic of reliability "metrics": we discuss the
issue of defining appropriate metrics to quantify a given index
of reliability, and the need to guard against fallacies and
pitfalls when trying to interpret a projected reliability value.
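As a minimal illustration of the quantities named above, the sketch below assumes a constant (exponential) failure rate per mechanism and a series system in which any single failure fails the chip. This is one common simplification, not necessarily the model used in the tutorial, and the mechanism names and FIT values are hypothetical.

```python
import math

HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def series_mttf_hours(fit_rates):
    """MTTF of a series system, assuming constant failure rates that add up."""
    total_fit = sum(fit_rates.values())
    return HOURS_PER_FIT_UNIT / total_fit


def survival_probability(t_hours, mttf_hours):
    """R(t) = exp(-t / MTTF) for an exponentially distributed lifetime."""
    return math.exp(-t_hours / mttf_hours)


# Hypothetical per-mechanism FIT budgets (illustrative numbers only)
fits = {"electromigration": 1000.0, "tddb": 1500.0, "soft_errors": 1500.0}
mttf = series_mttf_hours(fits)
at_mttf = survival_probability(mttf, mttf)
print(mttf, at_mttf)
```

Under these assumptions only about 37% of parts (exp(-1)) survive to the MTTF, which is one reason a bare MTTF figure is easy to misread.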
Subsequently, we cover the topic of reliability-aware design at
the microarchitectural level. We discuss power-, area- and
performance-efficient approaches to provide temporal and/or
spatial redundancy support in order to meet required reliability
targets. We also discuss adaptive microarchitectures: those that
are designed to change with variations in the workload, with the
goal of maximizing one or more of: reliability, power-temperature
efficiency and performance. In discussing each topic area within
the tutorial, we provide a brief survey of past techniques and
results, before providing in-depth coverage of more recently
published methodologies.
Dependable Computing over Sensor Networks
Shivakant Mishra,
University of Colorado
Wireless sensor networks are rapidly growing in their importance and
relevance to both the research community and the public at large. In
a wireless sensor network, a distributed collection of sensor nodes
forms a network interconnected by wireless communication links. Each
sensor node acts as an information source, sensing and collecting data
samples from its environment. Sensor nodes perform routing functions,
creating a multi-hop wireless networking fabric that relays data
samples to other sensor nodes and to external destinations.
Applications of wireless sensor networks are numerous, diverse, and
growing. They range from habitat monitoring to indoor monitoring of
semiconductor fabrication processes, and from counter-sniper
localization on battlefields to search and rescue operations. In each
of these application scenarios, lives and livelihoods may depend on
the timeliness and correctness of the sensor data obtained from
dispersed sensor nodes.
The main objective of this tutorial is to provide in-depth coverage
of design and implementation issues in building a dependable wireless
sensor network, and to survey the current state of the art of this
promising technology. The tutorial will first provide a basic
introduction to wireless sensor networks, and then cover five
important technical issues in the design and implementation of a
dependable wireless sensor network. These five issues are
cryptographic key management, secure and intrusion-tolerant routing,
secure in-network processing, secure dynamic reprogramming, and
protection against traffic analysis attacks.
Because of the importance and sensitivity of tasks performed by
wireless sensor networks, a wireless sensor network is a target of
adversaries. There are many security threats to wireless sensor
networks. An adversary can prevent users from getting correct data
from sensor nodes by modifying the contents of the packets, or
spoofing the identity of the sensor nodes. An adversary can block
communication between a base station and sensor nodes by creating
false routing information, or simply generating jamming signals. An
adversary can gain control of an entire sensor network by spoofing
the identity of a base station. An adversary can compromise a sensor
node, get all information from that node, and can even re-program it
to behave like a malicious node.
The design and implementation of a dependable wireless sensor network
must simultaneously address three research challenges: (1) resource
constraints of sensor nodes, namely limited power, limited memory, slow
CPUs, and limited communication bandwidth; (2) vulnerability of
wireless communication to eavesdropping, unauthorized access,
spoofing, replay, and denial-of-service attacks; and (3) the added
physical security risk of individual sensor nodes falling into the
wrong hands and being compromised.
This tutorial will discuss some of the latest techniques that have
been proposed to address these research challenges. Finally, it will
present a case study of a search-and-rescue application built using a
wireless sensor network.
More Reliable Software Faster and Cheaper
John Musa,
Consultant
Stressed out by competitive pressures to deliver more reliable
software faster and cheaper? Want to control the process rather
than have it control you? Software reliability engineering
(SRE), a practice primarily developed on the job in industry, can
help. This unique tutorial will teach you the essentials of how
to apply it.
SRE is based on two powerful ideas:
- Quantitatively characterize expected use and then focus
  resources on the most used and/or most critical functions. This
  increases development efficiency and hence the effective resource
  pool available to add customer value to the product.
- Further increase customer value by setting quantitative
  reliability objectives that precisely balance customer needs
  for reliability, timely delivery, and cost; engineer project
  strategies to meet them; and track reliability in test as a
  release criterion.
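The first idea can be sketched as a small allocation routine: given a usage profile, split a fixed test budget across operations in proportion to how often each is used. This is only an illustrative sketch. The operation names and counts are hypothetical, and the weighting by criticality that the first idea also mentions is left out.

```python
def allocate_tests(usage_counts, total_tests):
    """Split total_tests across operations in proportion to observed usage."""
    weight = sum(usage_counts.values())
    exact = {op: total_tests * n / weight for op, n in usage_counts.items()}
    alloc = {op: int(x) for op, x in exact.items()}
    # Hand any tests lost to truncation to the largest fractional parts
    leftover = total_tests - sum(alloc.values())
    by_fraction = sorted(exact, key=lambda op: exact[op] - alloc[op], reverse=True)
    for op in by_fraction[:leftover]:
        alloc[op] += 1
    return alloc


# Hypothetical usage profile: observed uses per 1000 sessions
profile = {"place_call": 600, "check_voicemail": 300, "change_settings": 100}
plan = allocate_tests(profile, 200)
print(plan)
```

The most heavily used operation receives the largest share of the test budget, so failures that users would hit most often are the most likely to be found first.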
SRE is a standard, proven best practice. It applies not only to
software reliability but to dependability and safety as well.
You can apply it to any system using software and to members of
software component libraries. And you can start with the next
release.
|
| 12:00-13:30
| Lunch for Tutorial Attendees
|
| 13:30-17:30
|
Parallel Tutorials
Architecture Level Evaluation of Soft Errors
Shubu Mukherjee,
Intel Corp.
Tutorial contents:
- The Soft Error problem & Motivation (45 minutes)
- AVF (Architectural Vulnerability Factor) Basics (30 minutes)
- Computing AVF using Statistical Fault Injection (15 minutes)
- Computing AVF using ACE & lifetime analysis (30 minutes)
- Computing AVF of Address-Based Structures (30 minutes)
- Examples & Results (15 minutes)
- AVF Reduction Techniques (30 minutes)
- Future use of AVF techniques (15 minutes)
- Open discussion (15 minutes)
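For orientation, the AVF of a hardware structure is commonly defined as the fraction of its bit-cycles that hold ACE (Architecturally Correct Execution) state, meaning state whose corruption would change the program's visible result. The sketch below computes that ratio and uses it to derate a raw soft error rate. The structure size, residency figure, and raw FIT value are hypothetical.

```python
def avf(ace_bit_cycles, num_bits, total_cycles):
    """AVF = ACE bit-cycles / (bits in the structure * cycles observed)."""
    if num_bits <= 0 or total_cycles <= 0:
        raise ValueError("num_bits and total_cycles must be positive")
    return ace_bit_cycles / (num_bits * total_cycles)


def derated_fit(raw_fit, avf_value):
    """Raw soft error rate scaled by the fraction of upsets that matter."""
    return raw_fit * avf_value


# Hypothetical 64-entry x 32-bit structure observed for 1000 cycles,
# with each bit holding ACE state for 250 of those cycles on average
num_bits = 64 * 32
ace_bit_cycles = num_bits * 250
value = avf(ace_bit_cycles, num_bits, 1000)
print(value, derated_fit(100.0, value))
```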
Erasure Codes for Fault Tolerant Storage
James S. Plank,
University of Tennessee
Tutorial contents:
- To introduce you to the various erasure coding techniques.
  - Reed-Solomon codes.
  - Parity-array codes.
  - LDPC codes.
- To help you understand their tradeoffs.
- To help you evaluate your coding needs.
  - This too is not straightforward.
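As a point of reference for the techniques listed above, the simplest erasure code is a single XOR parity block, which tolerates the loss of any one block in a stripe. The sketch below shows only that baseline and is not Reed-Solomon, parity-array, or LDPC coding, all of which can tolerate more erasures. The block contents are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the stripe: the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the single missing block by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


data = [b"abcd", b"efgh", b"ijkl"]
stripe = encode(data)
print(recover(stripe, 1))
```

Losing two blocks of the same stripe is unrecoverable with this scheme, which is the kind of tradeoff between storage overhead and fault tolerance that stronger codes address.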
Software Dependability: What You Didn't Learn in Kindergarten
John C. Knight and Elisabeth Strunk,
University of Virginia
This tutorial will discuss a number of important topics in
software system dependability with a heavy emphasis on their
practical relevance. The focus will be an example application
domain, digital avionics. Starting with a brief summary of that
domain, the tutorial will explore several topics including: (1)
basic principles of software dependability; (2) the practical use
of formal specification in real systems using Z and VDM; (3) the
practical use of formal verification; (4) static analysis and the
associated role of the programming language; (5) the assessment
of software systems by statistical means and safety cases; and
(6) the role of standards.
This tutorial will be of interest to engineers and managers
engaged in the development of software systems for high-assurance
applications in all domains. No prior technical knowledge is
assumed beyond what the attendee learned in kindergarten.
|
|
|
|
MONDAY, JUNE 26
|
| 08:30-10:00
|
Opening Remarks and Keynote Address
Dr. Ambuj Goyal,
General Manager, Information Management Division, IBM
"Delivering Dependability: A Moving Target"
Are we thinking broadly enough about dependable systems? Just what
are the most influential factors that contribute to dependability?
Is it the manufacturing process? Is it circuit design? Or
software design? Or network capacity? The answer could be none of
these, and it has most certainly changed dramatically from even five
years ago. Technology and best practices around its use have
evolved to the point where dependable systems are gated by factors
that go far beyond one's normal purview. Today, political,
cultural and societal issues may well have an equal if not greater
impact. We must challenge ourselves to consider all the dimensions
of today's world and think about non-obvious relationships. This
keynote will challenge the audience to think outside the box and
consider the realities of today's world and their impact on
dependable systems.
|
| 10:00-10:30
| Coffee Break
|
|
10:30-12:00
|
DCCS
Session 1A: Real-Time and Embedded Systems
Chair: Johan Karlsson
Efficient High Hamming Distance CRCs for Embedded Networks
Justin Ray and Philip Koopman
Memory-Conscious Reliable Execution on Embedded Chip Multiprocessors
G. Chen, M. Kandemir and I. Kolcu
Static Analysis to Enforce Safe Value Flow in Embedded Control Systems
Sumant Kowshik, Grigore Rosu and Lui Sha
|
|
PDS
Session 1B: Dependable Storage
Chair: Zbigniew Kalbarczyk
Dependability Analysis of Virtual Memory Systems
Lakshmi N. Bairavasundaram, Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau
Assessment of the Effect of Memory Page Retirement on System RAS Against Hardware Faults
Dong Tang, Peter Carruthers, Zuheir Totari and Michael Shapiro
Designing Dependable Storage Solutions for Shared Application Environments
Shravan Gaonkar, Kimberly Keeton, Arif Merchant and William H. Sanders
|
|
Session 1C: Industry Session (all day)
|
|
Session 1D: Workshop 1 on Applied Software Reliability (all day)
Reliable Multicast for Time-Critical Systems
M. Balakrishnan and K. Birman
Reliability Requirements of Wireless Sensor Networks for Dynamic Structural Monitoring
M. Cinque, D. Cotroneo, G. De Caro and M. Pelella
Reliability Requirements for Infrastructure System Sensor Networks
M. Bigrigg
Panel: "Reliability Requirements for Emerging Applications"
|
| 12:00-13:30
| Lunch Break
|
|
13:30-15:30
|
DCCS
Session 2A: Safety-Critical Systems
Chair: Neeraj Suri
The Startup Problem in Fault-Tolerant Time-Triggered Communication
Wilfried Steiner and Hermann Kopetz
A Reconfigurable Generic Dual-Core Architecture
Thomas Kottke and Andreas Steininger
A Dependable System Architecture for Safety-Critical Respiratory-Gated Radiation Therapy
Gregory Sharp and Nagarajan Kandasamy
User Interface Defect Detection by Hesitation Analysis
Robert W. Reeder and Roy A. Maxion
|
|
PDS
Session 2B: Attack Prevention and Mitigation
Chair: Paulo Verissimo
A Statistical Analysis of Attack Data to Separate Attacks
Michel Cukier, Robin Berthier, Susmit Panjwani and Stephanie Tan
VoIP Intrusion Detection Through Protocol State Machines
Hemant Sengar, Duminda Wijesekera, Haining Wang and Sushil Jajodia
Mitigating Active Attacks Towards Client Networks Using the Bitmap Filter
Chun-Ying Huang, Kuan-Ta Chen and Chin-Laung Lei
Accurate and Automated System Call Policy-Based Intrusion Prevention
Lap Chung Lam, Wei Li and Tzi-cker Chiueh
|
|
Session 2C: Industry Session (contd.)
|
|
Session 2D: Workshop 1 on Applied Software Reliability (contd.)
Predicting Field Defects Based on Software Test Results
V.B. Mendiratta and J.M. Souza
Providing Automated Detection of Problems in Virtualized Servers using Monitor framework
G. Khanna, S. Bagchi, K. Beaty, A. Kochut and G. Ken
How the Hidden Hand Shapes the Market for Software Reliability
K. Birman, C. Chandersekaran, D. Dolev and R. van Renesse
Model-Centric Development of Highly Available Software Systems
R.W. Buskens and O.J. Gonzalez
Panel: "The Quest for Reliable Software: Paradigms and Factors Driving Industry"
|
| 15:30-16:00
| Coffee Break
|
|
16:00-17:30
|
DCCS
Session 3A: Architecture and Operating Systems
Chair: A. J. Kleinosowski
Dynamic Verification of Memory Consistency in Cache-Coherent Multithreaded Computer Architectures
Albert Meixner and Daniel J. Sorin
Automatic Instruction-Level Software-Only Recovery Methods
Jonathan Chang, George A. Reis and David I. August
Exploring Fault-Tolerant Network-on-Chip Architectures
Dongkook Park, Chrysostomos Nicopoulos, Jongman Kim, N. Vijaykrishnan and Chita R. Das
|
|
PDS
Session 3B: Dependability Models
Chair: Boudewijn Haverkort
BlueGene/L Failure Analysis and Prediction Models
Yinglung Liang, Yanyong Zhang, Morris Jette, Anand Sivasubramaniam and Ramendra Sahoo
Performance Assurance via Software Rejuvenation: Monitoring, Statistics and Algorithms
Alberto Avritzer, Andre Bondi, Michael Grottke, Kishor Trivedi and Elaine J. Weyuker
Automatic Recovery Using Bounded Partially Observable Markov Decision Processes
Kaustubh R. Joshi, Matti A. Hiltunen, William H. Sanders and Richard D. Schlichting
|
|
Session 3C: Industry Session (contd.)
|
|
Session 3D: Workshop 1 on Applied Software Reliability (contd.)
Big Gap from Academic Response to Industry's Demand for Optimized Engineering Efficacy
C. H. Pham, F. Lin, N. Gupta and K. Ma
Be Good (Reliable) or Be Careful (Fault Tolerant)
H. Hecht
Integrating Software Reliability and Software Engineering in Education (or Software Reliability Begins in the Classroom)
L. Bernstein and C. Kintala
Closing the Gap in Failure Analysis
B. Murphy, M. Garzia and N. Suri
Panel: "Closing the Gap between Academic Research and Industry Needs"
|
|
|
|
TUESDAY, JUNE 27
|
| 08:30-10:00
|
Plenary Panel Session
Moderator: Dr. Jeffrey Voas, SAIC Corporation
"Coordinated, Malicious Cyber and Physical Attacks on National Infrastructures"
Panelists: Don O'Neill, Bret Michael, Shashi Phoha, Adam L. Young
This panel will look at a variety of issues that deal with the joint
threat created by both physical and cyber attacks on national and
global infrastructure. For example, is there a way to model such
events, and if so, what language should be used? Further, the panel
will discuss the current state of the field of computer security, and
explore what existing approaches from the dependability community are
relevant to such a joint threat, e.g., fault tolerance.
|
| 10:00-10:30
| Coffee Break
|
|
10:30-12:00
|
DCCS
Session 4A: Byzantine Faults
Chair: Marcos K. Aguilera
Scaling Byzantine Fault-Tolerant Replication to Wide Area Networks
Yair Amir, Claudiu Danilov, Danny Dolev, Jonathan Kirsch, John Lane, Cristina Nita-Rotaru, Josh Olsen and David Zage
Optimal Resilience for Erasure-Coded Byzantine Distributed Storage
Christian Cachin and Stefano Tessaro
Lucky Read/Write Access to Robust Atomic Storage
Rachid Guerraoui, Ron R. Levy and Marko Vukolic
|
|
PDS
Session 4B: Attack Analysis
Chair: Bill Sanders
Using Attack Injection to Discover New Vulnerabilities
Nuno Neves, João Antunes, Miguel Correia, Paulo Veríssimo and Rui Neves
Assessing the Attack Threat due to IRC Channels
Robert Meyer and Michel Cukier
An Approach for Detecting and Distinguishing Errors versus Attacks in Sensor Networks
Claudio Basile, Meeta Gupta, Zbigniew Kalbarczyk and Ravi K. Iyer
|
|
Session 4C: Fast Abstracts
|
|
Workshop 2 on Architecting Dependable Systems (WADS)
Session 4D: Software Architectures and Dependability
Keynote Talk
Professor Mary Shaw (Carnegie Mellon University)
Discussion
|
| 12:00-13:30
| Lunch Break
|
|
13:30-15:30
|
DCCS
Session 5A: Consensus and Leader Election
Chair: Tohru Kikuno
One-Step Consensus with Zero-Degradation
Dan Dobre and Neeraj Suri
Consensus with Byzantine Failures and Little System Synchrony
Marcos K. Aguilera, Carole Delporte-Gallet, Hugues Fauconnier and Sam Toueg
Solving Atomic Broadcast with Indirect Consensus
Richard Ekwall and André Schiper
Eventual Leader Election with Weak Assumptions on Initial Knowledge, Communication Reliability, and Synchrony
Antonio Fernández, Ernesto Jiménez and Michel Raynal
|
|
DCCS
Session 5B: Intrusion Detection and Tolerance
Chair: Mohamed Kaaniche
Hotspots: The Root Causes of Non-Uniformity in Self-Propagating Malware
Evan Cooke, Z. Morley Mao and Farnam Jahanian
A Multi-Resolution Approach for Worm Detection and Containment
Vyas Sekar, Yinglian Xie, Michael K. Reiter and Hui Zhang
Honeypot-Aware Advanced Botnet Construction and Maintenance
Cliff C. Zou and Ryan Cunningham
Barbarians in the Gate: An Experimental Validation of NIC-Based Distributed Firewall Performance and Flood Tolerance
Michael Ihde and William H. Sanders
|
|
Session 5C: Student Forum
|
|
Workshop 2 on Architecting Dependable Systems (WADS)
Session 5D: Fault Tolerance
Invited Talk: The SAE Architecture Analysis and Description Language (AADL) Standard: A Basis for Architecture-Driven Embedded Systems Engineering
Joyce L Tokar (Pyrrhus Software)
An Evaluation of Fault Tolerant TCP-Splice Based Web Server Architectures
Manish Marwah, Jacob Delgado, Shivakant Mishra and Christof Fetzer
Idealised Fault Tolerant Architectural Element
Rogerio de Lemos
Fault-tolerant Smart Sensor Architecture for Integrated Modular Avionics
Stefan Schneele, Klaus Echtle and Josef Schalk
|
| 15:30-16:00
| Coffee Break
|
|
16:00-17:30
|
DCCS
Session 6A: Storage Systems
Chair: James Plank
HoVer Erasure Codes for Disk Arrays
James Lee Hafner
Storage Allocation in Unreliable Peer-to-Peer Systems
John A. Chandy
Reliability for Networked Storage Nodes
KK Rao, James Lee Hafner and Richard A. Golding
|
|
PDS
Session 6B: Measuring and Modeling
Chair: Michel Cukier
A Component-Level Path Composition Approach for Efficient Transient Analysis of Large CTMCs
Vinh V. Lam, Peter Buchholz and William H. Sanders
Evaluating the Performability of Systems with Background Jobs
Qi Zhang, Alma Riska, Erik Riedel, Ningfang Mi and Evgenia Smirni
A Contribution Towards Solving the Web Workload Puzzle
Katerina Goseva-Popstojanova, Fengbin Li, Xuan Wang and Amit Sangle
|
|
Session 6C: Fast Abstracts
|
|
Workshop 2 on Architecting Dependable Systems (WADS)
Session 6D: Infrastructure for Dynamic Change
Invited Talk: TBA
Martin Hiller (Volvo Technology Corporation)
Impact-Sensitive Framework for Dynamic Change-Management
Tudor Dumitras, Daniela Rosu, Asit Dan and Priya Narasimhan
Discussion
Wrap-up / Future Directions
|
| 18:00-21:00
| Dinner Cruise
|
|
|
|
WEDNESDAY, JUNE 28
|
| 08:30-10:00
|
Plenary Panel Session
Moderator: Dr. Richard Schlichting, AT&T Labs
"Global Dependability Collaborations - Challenges and Successes"
|
| 10:00-10:30
| Coffee Break
|
|
10:30-12:00
|
DCCS
Session 7A: Complex and Large Scale Systems
Chair: Elmootazbellah Elnozahy
A Large-Scale Study of Failures in High-Performance-Computing Systems
Bianca Schroeder and Garth A. Gibson
Tracking Probabilistic Correlation of Monitoring Data for Fault Detection and Isolation in Complex Systems
Zhen Guo, Guofei Jiang, Haifeng Chen and Kenji Yoshihira
Efficiently Detecting All Dangling Pointer Uses in Production Servers
Dinakar Dhurjati and Vikram Adve
|
|
PDS
Session 7B: Multiple-Server Systems
Chair: Paul Ezhilchelvan
Empirical and Analytical Evaluation of Systems with Multiple Unreliable Servers
J. Palmer and I. Mitrani
R-Opus: A Composite Framework for Application Performability and QoS in Shared Resource Pools
Ludmila Cherkasova and Jerry Rolia
Cost-Effective Configuration of Content Resiliency Services Under Correlated Failures
Jinliang Fan, Dimitrios Pendarakis, Zhen Liu and Tianying Chang
|
|
DCCS
Session 7C: VLSI
Chair: Cristian Constantinescu
In-Register Duplication: Exploiting Narrow-Width Value for Improving Register File Reliability
Jie Hu, Shuai Wang and Sotirios Ziavras
Run-Time Reconfiguration for Emulating Transient Faults in VLSI Systems
David de Andres, Juan Carlos Ruiz, Daniel Gil and Pedro Gil
CADRE: Cycle-Accurate Deterministic Replay for Processor Debugging
Smruti Sarangi, Brian Greskamp and Josep Torrellas
|
|
Session 7D: Workshop 3 on Empirical Evaluation of Dependability and Security (all day)
Empirical Evaluation of DEPENDABILITY
Safety
Experience and Lessons Learned With Quantitative Safety and Dependability Assessment of Industrial Safety Critical Systems
C. R. Elks and B. W. Johnson (Department of Electrical and Computer Engineering, University of Virginia)
Fault Injection
A field data study on the use of software metrics to define representative fault distribution
R. Moraes (CESET), J. Duraes (CISUC, University of Coimbra, Portugal), E. Martins (IC, State University of Campinas, UNICAMP, Sao Paulo, Brazil), H. Madeira (CISUC, University of Coimbra, Portugal)
Reliability
Empirical Testing of the Handling of a Reliability-Aware Storage Device
Michael W. Bigrigg (Institute for Complex Engineered Systems, Carnegie Mellon University)
Discussion
|
| 12:00-13:30
| Lunch Break
|
|
13:30-15:30
|
DCCS
Session 8A: Networking
Chair: Farnam Jahanian
Collecting and Analyzing Failure Data of Bluetooth Personal Area Networks
Marcello Cinque, Domenico Cotroneo and Stefano Russo
Improving BGP Convergence Delay for Large-Scale Failures
Amit Sahoo, Krishna Kant and Prasant Mohapatra
Secure Split Assignment Trajectory Sampling: A Malicious Router Detection System
Sihyung Lee, Tina Wong and Hyong S. Kim
A General Framework for Scalability and Performance Analysis of DHT Routing Systems
Joseph S. Kong, Jesse S. A. Bridgewater and Vwani P. Roychowdhury
|
|
PDS
Session 8B: Distributed Algorithms
Chair: Michael Reiter
High Throughput Uniform Total Order Broadcast for Cluster Environments
Rachid Guerraoui, Ron R. Levy, Bastian Pochon and Vivien Quema
Improving Fault Resilience of Overlay Multicast for Media Streaming
Guang Tan, Stephen A. Jarvis and Daniel P. Spooner
Randomized Intrusion-Tolerant Asynchronous Services
Henrique Moniz, Nuno Ferreira Neves, Miguel Correia and Paulo Veríssimo
A Performance Study on the Signal-on-Fail Approach to Imposing a Total Order in the Streets of Byzantium
Qurat-ul-Ain Inayat and Paul Devadoss Ezhilchelvan
|
|
Session 8C: Fast Abstracts
|
|
Session 8D: Workshop 3 on Empirical Evaluation of Dependability and Security (contd.)
Empirical Evaluation of SECURITY
Towards Security Evaluation based on Evidence Information Collection and Impact Analysis
Reijo Savola (VTT Technical Research Centre of Finland, Oulu, Finland) and Juha Roning (University of Oulu, Oulu, Finland)
Empirical Analysis and Statistical Modeling of Attack Processes based on Honeypots
M. Kaaniche (LAAS-CNRS, Universite de Toulouse, France), E. Alata (LAAS-CNRS, Universite de Toulouse, France), V. Nicomette (LAAS-CNRS, Universite de Toulouse, France), Y. Deswarte (LAAS-CNRS, Universite de Toulouse, France), M. Dacier (Eurecom, France)
Can We Quantitatively Assess Security?
Boudewijn R. Haverkort (Design and Analysis of Communication Systems, Faculty for Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, the Netherlands)
|
| 15:30-16:00
| Coffee Break
|
| 16:30-17:30
|
IEEE TC-FTC Business Meeting
|