Fig. 8.4 Predictions using parameters from MCMC samples identified in the higher-noise case. Bounds take into account parameter and state uncertainty and the noise variance

An ideal algorithm would carry out both structure detection and parameter identification within a single identification run. Such an algorithm would trade off model complexity against model fidelity and thereby protect against the possibility of overfitting. Algorithms which carry out both model selection and parameter estimation in this way have recently been grouped under the term equation discovery. Bayesian algorithms immediately suggest themselves, as they implicitly implement such a trade-off [22].

What then would such an algorithm look like? Having just seen the power of MCMC for parameter estimation, one might conceive of a similar algorithm in which the Markov chain not only jumps between values in the parameter space but also jumps between models in the candidate set. Such methods do exist, but they are necessarily more complex than the MH algorithm discussed earlier. One of the problems can be seen immediately by considering the situation where the candidate set is {linear, Duffing}, where the linear model structure M1 is given by Eq. (8.1) and the Duffing structure M2 is given by Eq. (8.2). In this case, the parameter space for M1 is three-dimensional, while that for M2 is four-dimensional; the Markov chain would need to move between spaces of different dimension. Designing an MCMC algorithm with such a capability turns out to be quite difficult if one wishes to retain the convergence guarantees that hold for MH, etc. [22]. One algorithm with the desired properties is the reversible-jump MCMC (RJ-MCMC) algorithm [26]. While RJ-MCMC has been applied for NLSI [27, 28], it is difficult to code, is computationally expensive and depends on a number of algorithm parameters (hyperparameters) which require careful tuning.

Fortunately, there is a 'simpler' alternative: approximate Bayesian computing (ABC). The ABC algorithm has been applied in many contexts for equation discovery: genetics [29], biology [30, 31] and psychology [32]. It soon attracted attention as a promising method in NLSI [15, 33]. There are a number of variations on the ABC algorithm; for example, the approach in [33] was based on subset selection, while that of [15] was based on sequential MC (SMC). The SMC variant of the algorithm will be described and applied in the following. ABC offers the possibility of managing larger datasets and higher numbers of competing models with different dimensionalities, thereby circumventing the limitations of RJ-MCMC.

In the core ABC algorithm, the objective is to obtain a good and computationally affordable approximation to the posterior distribution:

p(w|D*, M) ∝ p(D*|w, M) p(w|M)    (8.19)

where M is the model controlled by the set of parameters w, p(w|M) denotes the prior distribution over the parameter space and p(D*|w, M) is the likelihood of the observed data D* for a given parameter vector w. ABC was originally designed as a likelihood-free method, to overcome the intractable likelihood functions encountered in various real-world problems; as such, it relies on systematic comparisons between observed and simulated data.
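To make the likelihood-free idea concrete, the sketch below implements the simplest (rejection) form of ABC for the {linear, Duffing} candidate set; note that the chapter itself applies the SMC variant. Everything numerical here is an assumption made for illustration: the synthetic 'observed' record, the uniform prior ranges, the normalised mean-square-error distance and the acceptance threshold eps are not taken from the chapter.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)

# Known excitation x(t) and a synthetic stand-in for the observed data D*
# (illustrative values only; in practice D* is the measured response).
t = np.linspace(0.0, 10.0, 500)
x_force = np.sin(2.0 * np.pi * 1.5 * t)

def simulate(theta, cubic):
    """Integrate m*y'' + c*y' + k*y (+ k3*y^3) = x(t) from rest."""
    m, c, k = theta[:3]
    k3 = theta[3] if cubic else 0.0
    def rhs(ti, s):
        y, v = s
        f = np.interp(ti, t, x_force)
        return [v, (f - c * v - k * y - k3 * y ** 3) / m]
    sol = solve_ivp(rhs, (t[0], t[-1]), [0.0, 0.0], t_eval=t, rtol=1e-6)
    return sol.y[0]

# 'Observed' data: Duffing truth plus a little noise (illustration only).
y_obs = simulate(np.array([1.0, 0.2, 100.0, 500.0]), cubic=True)
y_obs += 0.05 * np.std(y_obs) * rng.standard_normal(y_obs.size)

def prior_draw(cubic):
    # Uniform priors p(w|M); the ranges are assumptions for the demo.
    theta = rng.uniform([0.8, 0.05, 80.0], [1.2, 0.5, 120.0])
    if cubic:
        theta = np.append(theta, rng.uniform(0.0, 1000.0))
    return theta

def distance(y_sim):
    # Normalised mean-square error between simulated and observed data.
    return np.mean((y_sim - y_obs) ** 2) / np.var(y_obs)

eps, n_draws = 0.3, 300
accepted = {"linear": [], "duffing": []}
for name, cubic in [("linear", False), ("duffing", True)]:
    for _ in range(n_draws):
        theta = prior_draw(cubic)
        if distance(simulate(theta, cubic)) < eps:
            accepted[name].append(theta)

# Acceptance rates indicate relative model plausibility (for equal model
# priors); the accepted draws approximate the posterior p(w|D*, M).
for name, draws in accepted.items():
    print(name, len(draws) / n_draws)
```

Because each model is sampled in its own parameter space and only the scalar distance is compared across models, no trans-dimensional moves are required; this is how ABC sidesteps the dimension-matching difficulty of RJ-MCMC. The SMC variant used later in the chapter improves on this rejection scheme by driving a population of weighted samples through a decreasing schedule of thresholds.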