Model Validation and Uncertainty Quantification, Volume 3

13 Maintenance Planning Under Uncertainties Using a Continuous-State POMDP Framework

$$Q_n(b,a) = \int_s r_a(s)\, b(s) + \gamma \sum_o p(o \mid b,a)\, V_{n-1}\!\left(b^{a,o}\right) \tag{13.6}$$

where $\gamma$ is the discount factor. The optimal planning strategy is then obtained as:

$$V_n(b) = \max_a Q_n(b,a) \tag{13.7}$$

The computation of an additional horizon is termed the backup algorithm. The work performed in [16] proves that, for a discrete state space, the optimal future expected reward can be represented by a set of $\alpha$-vectors. This has been extended to continuous state spaces via the use of $\alpha$-functions, as shown in [13]. The formulas are then

$$V_n(b) = \max_{\{\alpha_n^i\}_i} \int_s \alpha_n^i(s)\, b(s) \tag{13.8}$$

where

$$\left\{\alpha_n^i(s)\right\}_i = \left\{ r_a(s) + \gamma \sum_o \alpha_{a,o,b}(s) \right\}_{a \in A} \tag{13.9}$$

$$\alpha_{a,o,b} = \arg\max_{\{\alpha_{a,o}^j\}_j} \int_s \alpha_{a,o}^j(s)\, b(s) \tag{13.10}$$

$$\alpha_{a,o}^j(s) = \int_{s'} \alpha_{n-1}^j(s')\, p(o \mid s')\, p(s' \mid s,a) \tag{13.11}$$

Evaluating these integrals is computationally expensive, especially as the number of dimensions increases. For this purpose, simplification algorithms are available that represent the continuous belief state as a sum of weighted Gaussian functions [13]:

$$b(s) = \sum_{i=1 \ldots k} w_i\, \phi(s \mid \mu_i, \Sigma_i) \tag{13.12}$$

where the weights satisfy $\sum_{i=1 \ldots k} w_i = 1$ and $w_i > 0,\ \forall i$. The Gaussian function $\phi(s \mid \mu, \Sigma)$ is described by a mean value, $\mu$, and a variance, $\Sigma$. To fully exploit this simplification, the other elements of the tuple describing the POMDP are adjusted accordingly. The transition is modeled as a single Gaussian function for each action, i.e. $\phi(s' \mid s + \Delta(a), \Sigma_a)$. In this formulation the state shift does not depend on the state itself, only on the chosen action $a$ [13]. The observation model is a set of non-normalized mixtures of Gaussians with $\sum_{o \in O} p(o \mid s) = 1,\ \forall s$ and $w_l^o > 0,\ \forall (l, o)$:

$$p(o \mid s) = \sum_l w_l^o\, \phi(s \mid \mu_l^o, \Sigma_l^o) \tag{13.13}$$

The reward function is represented by a non-normalized sum of Gaussians. Positive areas (reward) as well as negative areas (costs) are possible in the same function.
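As a minimal sketch of the mixture representation, the following Python snippet evaluates a normalized belief mixture in the sense of (13.12) and a signed, non-normalized reward mixture in one dimension. The damage-level interpretation, all numerical weights, means and variances, and the helper names `gaussian_pdf` and `mixture` are assumptions chosen for illustration; they are not taken from the chapter.

```python
import math


def gaussian_pdf(s, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance, evaluated at s."""
    return math.exp(-0.5 * (s - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture(s, components):
    """Evaluate sum_i w_i * phi(s | mean_i, var_i).

    `components` is a list of (weight, mean, variance) tuples.
    Weights may be negative, which is what allows costs in a reward mixture.
    """
    return sum(w * gaussian_pdf(s, m, v) for w, m, v in components)


# Hypothetical 1-D state: s is a scalar damage level.
# Belief: weights are positive and sum to one.
belief = [(0.7, 2.0, 0.5), (0.3, 4.0, 1.0)]

# Reward for one action: non-normalized and signed.
# A positive lobe near s = 0 (reward) and a negative lobe near s = 8 (cost).
reward = [(5.0, 0.0, 1.0), (-20.0, 8.0, 2.0)]

belief_weight_sum = sum(w for w, _, _ in belief)
reward_low_damage = mixture(0.0, reward)   # dominated by the positive lobe
reward_high_damage = mixture(8.0, reward)  # dominated by the negative lobe

print(belief_weight_sum, reward_low_damage, reward_high_damage)
```

Storing each mixture as a flat list of (weight, mean, variance) tuples keeps belief, observation model and reward in one common form, which is the property the analytic treatment relies on.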
The absolute value of the reward function describes the effective cost for a given state. In the framework of Gaussian mixtures, the integrals can be calculated analytically, which reduces the computational effort drastically. The set of optimal strategies is then represented by the $\alpha$-functions formulated as Gaussian mixtures:

$$\alpha_{n-1}^j(s') = \sum_k w_k^j\, \phi(s' \mid s_k^j, \Sigma_k^j) \tag{13.14}$$
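To illustrate why Gaussian mixtures make the integrals analytic, the sketch below computes the expected immediate reward, i.e. the integral of reward times belief over the state, in closed form for one-dimensional mixtures and compares it with a brute-force numerical integration. It relies on the standard identity that the integral of the product of two Gaussian densities with means m1, m2 and variances v1, v2 equals a Gaussian density with variance v1 + v2 evaluated at m1 with mean m2. The mixture parameters, integration bounds and function names are assumptions made for this example only.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance, evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture(s, components):
    """Evaluate a (possibly signed) Gaussian mixture given as (weight, mean, var) tuples."""
    return sum(w * gaussian_pdf(s, m, v) for w, m, v in components)


def expected_reward_closed_form(belief, reward):
    """Integral of reward(s) * belief(s) over s, using the product-of-Gaussians identity."""
    return sum(
        wb * wr * gaussian_pdf(mb, mr, vb + vr)
        for wb, mb, vb in belief
        for wr, mr, vr in reward
    )


def expected_reward_numeric(belief, reward, lo=-20.0, hi=30.0, n=50_000):
    """Midpoint-rule approximation of the same integral, used only as a check."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        s = lo + (i + 0.5) * h
        total += mixture(s, reward) * mixture(s, belief)
    return total * h


# Hypothetical mixtures (illustrative values, not from the chapter).
belief = [(0.7, 2.0, 0.5), (0.3, 4.0, 1.0)]
reward = [(5.0, 0.0, 1.0), (-20.0, 8.0, 2.0)]

closed = expected_reward_closed_form(belief, reward)
numeric = expected_reward_numeric(belief, reward)
print(closed, numeric)
```

The closed form costs one density evaluation per pair of components, whereas the numerical check needs a grid over the state space whose size grows rapidly with the number of dimensions.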
