A Proposed Byzantine Fault-Tolerant Voting Architecture using Time-Triggered Ethernet

Paper #: 2017-01-2111
Date: 2017-09-19
This paper proposes a Byzantine fault-tolerant voting architecture using Time-Triggered Ethernet for high-integrity applications. The problem of Byzantine agreement in real-life systems is fundamentally one of data distribution from a single source to multiple receivers. In voting systems designed to increase fault tolerance through the use of redundant processing elements, it is necessary to ensure that the data read from a single shared input device is bitwise identical across all redundant processors (i.e. that the processors have interactive consistency (IC)). More generally, the push for increasingly distributed control systems in spacecraft necessitates robust mechanisms for consensus among virtually all network-connected components. Similar to the Redundancy Management Units (RMUs) in NASA Langley Research Center’s Scalable Processor-Independent Design for Electromagnetic Resilience (SPIDER) project, the network switches act as interstages that broadcast data and ensure interactive consistency [5]. Rather than a classical two-round IC exchange between onboard computers (OBCs), consistency can be achieved according to the following algorithm:
1) A networked device (e.g. a remote interface unit (RIU) or one of the OBCs) broadcasts its data value to all switches.
2) Each switch broadcasts the received value to all OBCs.
3) Each OBC performs a hybrid-majority vote on all values received (i.e. messages that violate the protocol are excluded from the vote).
The current implementation of the voting system performs the hybrid-majority vote in software layered over each OBC’s network interface card (NIC) driver. However, future implementations could realize this function in the NIC itself, reducing the need for host resources. Each host processor is connected to the network through a standard-integrity end system controller whose fault hypothesis is assumed to include arbitrary failures.
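The three-step exchange described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `INVALID`, `hybrid_majority`, and `run_exchange`, and the modeling of each switch as a function, are assumptions made for illustration. Protocol-violating messages are represented by a marker value and dropped before a strict-majority vote over the remaining values.

```python
from collections import Counter

# Hypothetical marker for a message that violated the protocol
# (for example, one that arrived outside its scheduled window).
INVALID = None


def hybrid_majority(values):
    """Return the strict majority among valid values, or None if there is none."""
    valid = [v for v in values if v is not INVALID]
    if not valid:
        return None
    value, count = Counter(valid).most_common(1)[0]
    # Require more than half of the *valid* values to agree.
    return value if count * 2 > len(valid) else None


def run_exchange(source_value, switches, n_obcs=3):
    """Steps 1-3: source -> every switch -> every OBC -> per-OBC vote.

    Each switch is modeled as a function (value, obc_index) -> forwarded value,
    so a faulty switch can send different values to different OBCs.
    """
    results = []
    for obc in range(n_obcs):
        received = [switch(source_value, obc) for switch in switches]
        results.append(hybrid_majority(received))
    return results


def good(value, obc):
    """A correct switch relays the value unchanged."""
    return value


def byzantine(value, obc):
    """An arbitrarily faulty switch: a different wrong value for each OBC."""
    return value + 1 + obc


def silent(value, obc):
    """A switch whose messages are all rejected as protocol violations."""
    return INVALID


print(run_exchange(42, [good, good, byzantine]))  # [42, 42, 42]
print(run_exchange(42, [good, good, silent]))     # [42, 42, 42]
```

In this toy model, a single faulty switch out of three cannot prevent the OBCs from agreeing on the source value, whether it sends inconsistent values or none at all. A faulty source and timing behavior are not modeled.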
If standard-integrity switches are also used, then a one-fault-tolerant (1FT) design is realizable with three switches (one per network plane) and three processors, comprising six fault containment regions (FCRs) in total. Alternatively, the use of high-integrity (HI) switches reduces the hardware requirements by limiting the possible failure modes: a faulty HI switch cannot create a new valid message, nor modify a message to produce one. In that case, a 1FT design is theoretically possible with only two redundant switches. In all cases, however, using three channels rather than two minimizes the number of two-fault combinations that result in system failure.