Nonlinear Structures & Systems, Volume 1

On Affine Symbolic Regression Trees for the Solution of Functional Problems

$$F_{\mathrm{grammar}} = \{\{x, [0, 9]\},\ \{\sin, \cos, \log, \exp, \langle expr \rangle, \langle digit \rangle\},\ \{+, -, \times, \div, \langle func \rangle\},\ \{\langle op \rangle\}\} \tag{21}$$

for the grammar representation (indicative of the study by Tsoulos et al. in [8]). The meta-operations (in angle brackets) are ignored in the computation of $\mu(F_{\mathrm{grammar}})$, resulting in an identical basis. As can be seen in Fig. 4, the size of these spaces grows extremely quickly with the number of nodes in the tree. The number of trees in grammar-based approaches appears to grow more quickly due to the trinary operation. However, it can be seen that the number of expressions representable by these encodings is equivalent.

4 Affine Regression Trees

A novel encoding representation is proposed here: the affine symbolic regression tree. This encoding is essentially an extension of the expression tree with additional structure at each node. The new approach is inspired by two observations. First, the exact discovery of constants in symbolic regression schemes is an open problem with few approaches (there are several notable techniques that specify constants approximately; the reader is directed to [14–16] for examples). The second observation is that adjacent points in $E$ are unlikely to be adjacent in $R$ once projected there via the objective function. Put plainly, it is often not trivial for the SR algorithm to increment towards the true solution despite a high degree of semantic similarity (for example, a constant that is wrong by a small integer/rational value).

To this end, the affine symbolic regression tree is defined in the same manner as the expression tree. However, each node $\eta$ is now a 3-tuple,

$$\eta = \{a, f, b\} \tag{22}$$

where $a$ and $b$ are termed constants.
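As a minimal sketch of the node structure in Eq. (22), the Python below stores each node as the triple (a, f, b) and, anticipating the affine form used during execution, returns a·f + b when evaluated. The names `AffineNode`, `FUNCTIONS` and `evaluate` are illustrative assumptions rather than the authors' implementation. The constants are held here as plain `Fraction` values, with a = 1 and b = 0 by default, so that an untouched node behaves like an ordinary expression-tree node.

```python
import math
import operator
from dataclasses import dataclass, field
from fractions import Fraction
from typing import Callable, Dict, List

# Hypothetical function set, used for illustration only.
FUNCTIONS: Dict[str, Callable[..., float]] = {
    "sin": math.sin, "cos": math.cos, "log": math.log, "exp": math.exp,
    "+": operator.add, "-": operator.sub,
    "*": operator.mul, "/": operator.truediv,
}


@dataclass
class AffineNode:
    """One tree node eta = {a, f, b}; `children` holds the operands of f."""
    f: str                                    # variable name or key in FUNCTIONS
    children: List["AffineNode"] = field(default_factory=list)
    a: Fraction = Fraction(1)                 # multiplicative constant
    b: Fraction = Fraction(0)                 # additive constant

    def evaluate(self, env: Dict[str, float]) -> float:
        """Evaluate the subtree at this node and return a * f + b."""
        if self.f in env:                     # leaf: look the variable up
            raw = env[self.f]
        else:                                 # internal node: apply f to children
            args = [child.evaluate(env) for child in self.children]
            raw = FUNCTIONS[self.f](*args)
        return float(self.a) * raw + float(self.b)


# Two nodes suffice to encode 2*sin(x) + 3.
x = AffineNode("x")
tree = AffineNode("sin", [x], a=Fraction(2), b=Fraction(3))
print(tree.evaluate({"x": 0.5}))
```

In this sketch the expression 2 sin(x) + 3 needs only two nodes, whereas a plain expression tree would also need separate nodes for the multiplication, the addition and both constants.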
During execution, nodes take the affine form,

$$\eta = a f + b \tag{23}$$

In this regard, the representation bears some similarity to the multiple regression approach in [14], whereby linear combinations of all tree subexpressions are used during a meta-optimisation step. However, the current approach differs both in that an affine combination is used and in that the constants are included as a part of the tree structure itself.

Constants have their own structure and are represented by a 4-tuple,

$$\theta = \{s, p, q, r\} \tag{24}$$

where $p$ and $q$ are integers in the ranges $[0, z]$ and $[1, z]$, respectively, $r$ is a member of the set of permitted exponents $R$ with a default value of 1, and $s$ is an element of the set of permitted transcendental constants $S$, also with a default value of 1. Upon execution, constants are evaluated as

$$\theta = s \left( \frac{p}{q} \right)^{r} \tag{25}$$

Upon initialisation, 'a' constants are given a value of 1 ({1, 1, 1, 1}) and 'b' constants a value of 0 ({0, 1, 1, 1}). Initialising the constants in this way provides a bias towards sparse solutions. Care is taken to ensure that $q \neq 0$ so that illegal divisions are avoided by design.

A more granular approach is also possible for the specification of the exponents. In this case, the constants are represented by the 5-tuple

$$\theta = \{s, p, q, r_1, r_2\} \tag{26}$$

and are evaluated as
