CHESS SBA Extensions

Leonardo Montecchi edited this page Jan 9, 2018 · 11 revisions

Basic Concepts

In order to execute the State-Based Analysis plugin, the system/software/hardware architecture needs to be extended with information about failure behavior of components, and failure propagation. This is done by means of the CHESS-SBA extensions, which are part of the CHESS Dependability Profile, i.e., a set of UML stereotypes to be attached to architecture elements.

Supported Views

CHESS-SBA can be executed at system, software, or hardware level, or a combination thereof. Information on allocation allows the analysis to be executed on the software and hardware architectures together.

The annotation with CHESS-SBA extensions can be performed on specific views only:

  • System View for system elements.
  • Extra-Functional View for software elements. This view is activated as a sub-view of the Component View, by clicking the Activate Extra-Functional View button in the toolbar.
  • Dependability View for hardware elements. This view is activated as a sub-view of the Deployment View, by clicking the Activate Dependability View button in the toolbar.
  • Dependability Analysis View, for the specification of analysis objectives and metrics. This is a sub-view of the Analysis View, and can be reached via the model explorer.

Numerical Values

  • Scientific notation is supported. So, for example, the number 0.000001 can be written as 1.0E-6.
  • Although some attributes are time-related, units of measurement are never specified in CHESS-SBA models. It is assumed that the same unit of measurement is used across the model for all time-related quantities. For example, if fault occurrence rates are specified in 1/hours, then propagation delays should also be specified in hours.
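As a small illustration of both points, the following Python sketch checks that the scientific and decimal notations denote the same number, and converts a rate given in 1/years to 1/hours so that it can be used next to delays expressed in hours. The helper name and the 8760 hours/year figure are assumptions made for this example; they are not part of CHESS-SBA.

```python
HOURS_PER_YEAR = 8760.0  # assumption: a non-leap year of 365 days


def rate_per_year_to_per_hour(rate_per_year: float) -> float:
    """Convert an occurrence rate from 1/years to 1/hours (hypothetical helper)."""
    return rate_per_year / HOURS_PER_YEAR


# Scientific notation and plain decimal notation denote the same value.
same_value = float("1.0E-6") == 0.000001

# A rate of 8760 faults/year is one fault per hour on average.
one_per_hour = rate_per_year_to_per_hour(8760.0)
```

Doing such conversions once, before annotating the model, keeps every time-related attribute in the same unit.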

Probability Distributions

Time-related quantities, e.g., fault occurrence or delays, can be specified as probability distributions, using MARTE's Value Specification Language (VSL) syntax.

The following distributions are currently supported:

  • Exponential, as exp(lambda) or exponential(lambda).
  • Deterministic, as det(value) or deterministic(value)
  • Uniform, as uni(a,b) or uniform(a,b)
  • Normal, as norm(mean,var) or normal(mean,var)
  • Gamma, as gam(alpha,beta) or gamma(alpha,beta)
  • Weibull, as wei(alpha,beta) or weibull(alpha,beta)
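To make the accepted spellings concrete, here is a minimal Python sketch that recognises the short and long names listed above and extracts the numeric parameters. It is not the parser used by CHESS; the function name, the assumption that names are written in lower case, and the error handling are choices made for this example only.

```python
import re
from typing import List, Tuple

# Short and long names from the list above, mapped to one canonical name.
_ALIASES = {
    "exp": "exponential", "exponential": "exponential",
    "det": "deterministic", "deterministic": "deterministic",
    "uni": "uniform", "uniform": "uniform",
    "norm": "normal", "normal": "normal",
    "gam": "gamma", "gamma": "gamma",
    "wei": "weibull", "weibull": "weibull",
}

# Number of parameters each distribution takes, as shown in the list above.
_ARITY = {
    "exponential": 1, "deterministic": 1, "uniform": 2,
    "normal": 2, "gamma": 2, "weibull": 2,
}

_PATTERN = re.compile(r"^\s*([A-Za-z]+)\s*\(([^()]*)\)\s*$")


def parse_distribution(text: str) -> Tuple[str, List[float]]:
    """Return (canonical name, parameters) for strings such as 'exp(1.0E-6)'."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a distribution expression: {text!r}")
    name = _ALIASES.get(match.group(1))
    if name is None:
        raise ValueError(f"unsupported distribution: {match.group(1)!r}")
    # float() accepts scientific notation and surrounding whitespace.
    params = [float(p) for p in match.group(2).split(",")]
    if len(params) != _ARITY[name]:
        raise ValueError(f"{name} expects {_ARITY[name]} parameter(s)")
    return name, params
```

For instance, `parse_distribution("exp(1.0E-6)")` and `parse_distribution("exponential(0.000001)")` yield the same result.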

Failure Modes

We use the dot notation to reference failure modes occurring on a specific port of a component/block. For example, p1.omission specifies the occurrence of the omission failure mode on port p1.

Assumptions

The Stochastic Petri Nets models generated by CHESS-SBA are based on the following set of assumptions:

  • Internal faults occurring in different components are independent from each other.

  • The activation delay of faults is zero. That is, any fault occurrence immediately generates an error. This is a generalization of two possible situations which are modeled in the same way in CHESS-SBA:

    1. A fault occurs with delay d, and it immediately generates an error; or,
    2. The fault is always present in the component, and it is activated after a delay d (e.g. software development bugs).

    In both cases, we model such delay d as a fault occurrence delay.

  • When two components A and B are connected through a port, every failure mode of A on that port is experienced as an external fault of the same kind by B (and possibly vice-versa, depending on the direction of the port).

CHESS-SBA Stereotypes

Components

Three ways are available to the modeler for adding dependability information to a given component of the system:

  1. «SimpleStochasticBehavior»
  2. «FLABehavior»
  3. «ErrorModelBehavior»

Note that stereotypes 1 and 2 can co-exist on the same component/block. Other combinations (e.g., 1-3 or 2-3) are not allowed.

«SimpleStochasticBehavior»

The «SimpleStochasticBehavior» stereotype is the simplest way to add dependability information to a component/block. When using this stereotype we assume that the component can be affected by only one kind of internal fault, which immediately causes the component to fail. Possibly, the failure can manifest itself with different failure modes with a certain probability.

«SimpleStochasticBehavior» example

Attribute Description
failureOccurrence Specifies the time to the occurrence of a failure in the component. This is specified as a probability distribution.
failureModesDistribution (optional) Specifies the possible failure modes of the component, and their relative probabilities. This attribute accepts expressions in the grammar below.
repairDelay (optional) Specifies the time needed to repair the component, as a probability distribution.

Grammar for the failureModesDistribution attribute:

<FMD> ::= <D> | <PD> | <PD>; <PD>
<PD> ::= <PORT> <D>
<D> ::= { <FP> }
<FP> ::= <F> : <P> | <F> : <P>, <FP>
	<F> is a failure mode, <P> is a probability value,
	<PORT> is a port of the component
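The grammar above can be read with a short Python sketch such as the following. It is an illustration only, not the CHESS implementation: the function name is hypothetical, the failure mode names other than omission are placeholders, and the sketch accepts any number of port-qualified entries separated by semicolons, which is slightly more permissive than the grammar as written.

```python
import re
from typing import Dict, Optional

# One entry: an optional port name followed by a brace-delimited list.
_ENTRY = re.compile(r"^\s*([A-Za-z_]\w*)?\s*\{([^{}]*)\}\s*$")


def parse_fmd(text: str) -> Dict[Optional[str], Dict[str, float]]:
    """Map each port (None when no port is given) to {failure mode: probability}."""
    result: Dict[Optional[str], Dict[str, float]] = {}
    for entry in text.split(";"):
        match = _ENTRY.match(entry)
        if match is None:
            raise ValueError(f"malformed entry: {entry!r}")
        port, body = match.group(1), match.group(2)
        modes: Dict[str, float] = {}
        for pair in body.split(","):
            mode, sep, prob = pair.partition(":")
            if not sep or not mode.strip():
                raise ValueError(f"malformed 'mode: probability' pair: {pair!r}")
            modes[mode.strip()] = float(prob)
        result[port] = modes
    return result


# <D> form: one distribution for the whole component.
whole_component = parse_fmd("{omission: 0.7, commission: 0.3}")

# <PD>; <PD> form: one distribution per port.
per_port = parse_fmd("p1 {omission: 1.0}; p2 {late: 0.5, early: 0.5}")
```

The sketch does not check that the probabilities of one distribution sum to 1, since the text above only describes them as relative probabilities.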

«FLABehavior»

The «FLABehavior» stereotype allows the failure behavior of components to be specified by means of a failure logic specification. That is, it specifies how components propagate and/or transform failure modes experienced at their inputs. This stereotype is primarily used for the CHESS-FLA analysis, but can be processed by CHESS-SBA as well. It is also a convenient way to specify the propagation behavior of components and redundancy.

«FLABehavior» example

Attribute Description
fptc FPTC expression

«ErrorModelBehavior»

The «ErrorModelBehavior» stereotype allows the provision of more details on faults, errors, and failure modes of system/software/hardware elements. Such details are specified using a particular kind of StateMachine, called Error Model, which is described below. The purpose of this stereotype is to link the component/block to the error model state machine.

«ErrorModelBehavior» example

Attribute Description
errorModel Reference to a UML StateMachine with the «ErrorModel» stereotype, which specifies the detailed dependability behavior of the component.

Error Model

The CHESS Error Model is a particular kind of StateMachine diagram containing information on fault/error propagation inside a given component/block. Such StateMachine is stereotyped with the «ErrorModel» stereotype.

«ErrorModel»

The «ErrorModel» stereotype has the purpose of identifying a StateMachine as a CHESS Error Model. As such, it doesn't have any attributes.

«ErrorModel» example

Like ordinary StateMachines, an error model must have an initial Pseudostate. In this context, it represents the initial, "healthy" state of the component.

«ErrorState»

The «ErrorState» stereotype is applied to UML State elements of the error model, to identify erroneous states of the component.

«ErrorState» example

«InternalFault»

The «InternalFault» stereotype is applied to UML Transition elements of the error model. It represents the possible occurrence of an internal fault inside the component. The transition connects two States of the error model. When the fault occurs, the component moves to the destination state of the transition, typically an «ErrorState».

«InternalFault» example

Attribute Description
occurrence The time to the occurrence of the fault in the component. This is specified as a probability distribution.

«InternalPropagation»

The «InternalPropagation» stereotype is applied to UML Transition elements of the error model. It represents an internal error propagation occurring in the component, which may trigger a state transition. Propagation may occur after a certain amount of time, or as a consequence of incoming external faults. When the propagation occurs, the component moves to the destination state of the transition.

«InternalPropagation» example

Attribute Description
delay The time after which propagation occurs. This is specified as a probability distribution. The case of immediate propagation (e.g., in response to incoming external faults) is modeled with deterministic(0).
externalFaults (optional) This attribute is a boolean expression on the occurrence of external faults on input ports of the component, and acts as a guard. That is, the propagation is triggered only when the specified faults occur. If left empty, the transition has no guard.
weight (optional) It may happen that two or more «InternalPropagation» transitions are enabled at the same time. The weight attribute assigns a relative probability of occurrence to the transition, to solve possible non-determinism. If left empty, the default value 1.0 is used.
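The effect of the weight attribute can be illustrated with a few lines of Python. The sketch assumes that relative probabilities are obtained by dividing each weight by the sum of the weights of all simultaneously enabled transitions, which is the usual reading of weights in stochastic Petri nets; the function and transition names are hypothetical.

```python
from typing import Dict, Optional

DEFAULT_WEIGHT = 1.0  # value used when the weight attribute is left empty


def resolve_weights(weights: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Turn the weights of simultaneously enabled transitions into probabilities.

    Assumption: probability = weight / sum of weights of enabled transitions.
    """
    filled = {
        name: DEFAULT_WEIGHT if weight is None else weight
        for name, weight in weights.items()
    }
    total = sum(filled.values())
    return {name: weight / total for name, weight in filled.items()}


# Two transitions enabled together: t1 has weight 3.0, t2 has no weight set.
probabilities = resolve_weights({"t1": 3.0, "t2": None})
```

Under this assumption, t1 is selected three times out of four and t2 once out of four.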

«Failure»

The «Failure» stereotype is applied to UML Transition elements of the error model. It represents the occurrence of a failure of the component, with propagation of errors to its external interfaces (ports). When the transition occurs, it may cause the component to move to another state, but most often the source and target states of a «Failure» transition are the same State.

«Failure» example

Attribute Description
mode The failure modes that the component exhibits when this transition occurs. More than one failure mode can be specified for the same transition.

Propagation Paths

Propagation paths are automatically derived by the transformation algorithm, based on the structure of the defined architecture. Propagation may occur: i) between component/block instances connected through ports, or ii) from hardware to software in case of allocation relations. Propagations are resolved globally, that is, connectors are followed until reaching an atomic component instance. This means the algorithm is able to extract propagation information from complex hierarchical CHESS ML models.

By default, it is assumed that immediate and deterministic propagation will occur through these paths. However, different behaviour may be specified using the «Propagation» stereotype.

«Propagation»

The «Propagation» stereotype applies to all the elements that are considered propagation paths, i.e., UML Connector, SysML Connector and UML Comment with the «Assign» stereotype. This stereotype has two attributes to specify the delay and probability of propagation.

«Propagation» applied to a Connector

«Propagation» applied to an «Assign» Comment

Attribute Description
prob (optional) Probability that propagation actually occurs. Default value is 1.0.
propDelay (optional) Delay after which propagation will occur, specified as a probability distribution. Default value is deterministic(0), i.e., immediate propagation.

Maintenance

When the «SimpleStochasticBehavior» stereotype is used, maintenance information can be attached using the repairDelay attribute, which specifies the (probabilistic) delay after which the component is restored to its original healthy state. While such simple specification of repairs is useful, it is not powerful enough to specify advanced maintenance strategies. CHESS-SBA provides an additional mechanism to model detailed maintenance strategies. The approach is based on the concept of activity: a maintenance strategy is a collection of activities that are performed when certain conditions hold.

Modeling maintenance in this way is also useful for components with the «ErrorModelBehavior» or «FLABehavior» stereotypes, which do not provide any information on repairs by themselves. Maintenance activities can be modeled in the Dependability Analysis View only.

«Repair»

The only maintenance stereotype that is currently supported by transformations is «Repair». It represents a maintenance activity that makes a component instance return to its original healthy state. Depending on the kind of component, it can be used to model different maintenance operations. For example, a «Repair» of a hardware component instance may involve its replacement with a new instance, while for a software component it may simply represent its restart. Furthermore, it can represent both preventive and corrective maintenance, based on the expression specified in the when attribute.

«Repair» example

Attribute Description
targets The component/block instances that are repaired by the activity. Multiple component/block instances can be specified.
when The policy for the execution of the activity, specified using a custom grammar, defined below.
duration (optional) The time required to perform the activity, specified as a probability distribution.
probSuccess (optional) The probability that the activity is successful and the component instances are actually repaired.

Grammar for the when attribute:

<S> ::= <T> [<EX>] | <T> [<EX>] {<L>}
<T> ::= Immediately | AtTime(<RealNumber>) | Periodic(<RealNumber>)
<EX> ::= (<EX> and <EX>) | (<EX> or <EX>) | not <EX> | true | <FD>
<FD> ::= Failed(<FailureMode>) | Detected(<Error>)
<L> ::= Before(<RealNumber>) | After(<RealNumber>) |
        Interval(<RealNumber>,<RealNumber>)

Expressions in this grammar consist of three parts: i) the scheduling of the activity (<T> rule); ii) a condition on the system's state that must hold in order for the activity to be executed (<EX> rule); and iii) an optional condition that restricts the execution of the activity to a predefined interval of time (<L> rule).

The semantics of the symbols in the grammar are the following:

Symbol Semantics
Immediately The activity is executed immediately as soon as the conditions specified by <EX> hold.
AtTime(<RealNumber>) The activity is executed at the instant of time specified by <RealNumber>.
Periodic(<Distribution>) The activity is executed periodically, at intervals of time following the probability distribution that is specified as parameter. The period is counted from the beginning of an activity execution to the beginning of the subsequent one. If the activity duration is greater than this interval of time, the activity is executed immediately as soon as the previous execution completes.
Failed(<FailureMode>) Predicate on the state of a component. This predicate is true if the component has failed with the failure mode specified by <FailureMode>.
Detected(<ErrorState>) Predicate on the state of a component. This predicate is true if the state <ErrorState> has been detected by error detection mechanisms.
Before(<RealNumber>) The activity can be executed only before the instant of time specified by <RealNumber>.
After(<RealNumber>) The activity can be executed only after the instant of time specified by <RealNumber>.
Interval(<RealNumber>,<RealNumber>) The activity can be executed only in the interval of time identified by the two <RealNumber> values. The boundaries are included in the interval.
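Two of the rules in the table lend themselves to a short worked example: the time restriction of the <L> rule and the start time of the next execution of a Periodic activity. The Python sketch below is only an illustration of the semantics described above, not code from the CHESS transformation. The function names are hypothetical, and treating the Before and After bounds as exclusive is an assumption, since the table states inclusive boundaries only for Interval.

```python
from typing import Tuple


def window_allows(kind: str, bounds: Tuple[float, ...], now: float) -> bool:
    """Check the time restriction of the <L> rule at time `now`."""
    if kind == "Before":
        return now < bounds[0]   # assumption: the bound itself is excluded
    if kind == "After":
        return now > bounds[0]   # assumption: the bound itself is excluded
    if kind == "Interval":
        return bounds[0] <= now <= bounds[1]  # boundaries are included
    raise ValueError(f"unknown restriction: {kind!r}")


def next_periodic_start(prev_start: float, period: float, duration: float) -> float:
    """Start time of the next execution of a Periodic activity.

    The period runs from the start of one execution to the start of the next;
    if the execution lasts longer than the period, the next one starts as soon
    as the previous one completes.
    """
    return prev_start + max(period, duration)


# Period of 10 time units: a 3-unit execution leaves the schedule unchanged,
# while a 12-unit execution delays the next start until it completes.
on_schedule = next_periodic_start(0.0, 10.0, 3.0)
delayed = next_periodic_start(0.0, 10.0, 12.0)
```

Here `period` stands for one sampled value of the interval; when the parameter is a probability distribution, a new value would be drawn for each execution.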

Metrics

To actually execute the CHESS-SBA tool we need to define the metrics that it will compute, i.e., the objectives of the analysis. To do this we use the «StateBasedAnalysis» stereotype from CHESS ML.

«StateBasedAnalysis»

The «StateBasedAnalysis» stereotype is an extension of MARTE «GaAnalysisContext» and is used to define the analysis context for CHESS-SBA. Its use is allowed on the Dependability Analysis View only, and it should be applied to UML Component elements.

«StateBasedAnalysis» example

Attribute Description
platform The system or subsystem on which the analysis should be performed. This should be a UML Package generated by the Build Instances command.
measure Definition of the dependability metric to be evaluated on the system. Supported metrics are listed below.
targetDepComponent (optional) The component/block instance for which the metrics should be evaluated. This is typically set to the top-level component instance of the architecture under analysis, or to the component instance that delivers a certain service of interest. If left empty, the top-level component instance in the platform is considered.
targetPort (optional) One or more ports belonging to targetDepComponent. If the attribute is specified, the metric defined in measure will be evaluated considering the component failed only when those specific services (ports) are failed. If left empty, all the output ports of the selected platform are considered.
targetFailureMode (optional) One or more failure modes applicable to ports in targetPort. If the attribute is specified, the metric defined in measure will be evaluated considering the component failed only when that specific failure mode occurs. If left empty, all the failure modes of the selected ports are considered.
measureEvaluationResult (read only) This is where the results of the analysis execution are back-annotated. As such, it should be left empty by the user. See how the analysis is executed for further details.

Currently supported values for the measure attribute are:

Measure Description
Reliability { instantOfTime = t } Instant of time reliability: probability that the component does not fail until time t.
Availability { instantOfTime = t } Instant of time availability: probability that at time t the component is not failed (it also considers repairs).
Availability { intervalEnd = t } Fraction of time that the component is not failed in the interval [0,t].
PFD { t } Probability of failure on demand, computed as 1 – Reliability { instantOfTime = t }.
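The relation between the Reliability and PFD measures can be checked by hand in the simplest case. The Python sketch below assumes a single component whose failureOccurrence is exp(lambda) and which is never repaired, so that its reliability has the closed form e^(-lambda·t). This is a sanity check under that assumption only; CHESS-SBA obtains its results from the generated Stochastic Petri Net model, and the function names here are hypothetical.

```python
import math


def reliability_exponential(rate: float, t: float) -> float:
    """Reliability at time t of one component with exponential failure
    occurrence and no repair (assumption): R(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


def pfd(reliability_at_t: float) -> float:
    """PFD { t } as defined above: 1 - Reliability { instantOfTime = t }."""
    return 1.0 - reliability_at_t


# Failure rate of 1.0E-3 per hour, evaluated at t = 1000 hours.
r = reliability_exponential(1.0e-3, 1000.0)
p = pfd(r)
```

Comparing such a hand-computed value with the tool's result on a one-component model is a quick way to confirm that rates and instants of time use the same unit.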

Now that you know how to annotate your model with CHESS-SBA stereotypes it is time to learn how to execute the analysis.