
17.3 Three Examples of Domain Adaptation Techniques

Before introducing the domain adaptation techniques, it is worth beginning with the definitions and terminology; Table 17.1 summarises the frequently-used notation and terms.

Table 17.1 Summary of notation and descriptions of terms

Notation        Description                        Notation        Description
X               Input feature space                n               Number of input feature instances
Y               Label space                        P(X)            Marginal distribution
D               Domain                             P(Y)            Label distribution
T               Predictive learning task           P(Y|X)          Conditional distribution
Subscript s     Denotes source                     P(X|Y)          Class-conditional distribution
Subscript t     Denotes target                     φ(·)            Mapping function
x               Input feature vector               f(·)            Objective predictive function
X               Input feature matrix               Dist(·, ·)      Distance in distributions

17.3.1 Transfer Component Analysis (TCA)

TCA was introduced in [14]. The main idea of this method is based on the assumption that there exists a nonlinear transformation from the domain feature space into a Reproducing Kernel Hilbert Space (RKHS), i.e. φ : X → H, which makes P(φ(X_s)) ≈ P(φ(X_t)) whilst P(Y_s | φ(X_s)) = P(Y_t | φ(X_t)). Therefore, the distance between the domains may be expressed as,

\[
\mathrm{Dist}(D_s, D_t) \approx \mathrm{Dist}(P(X_s), P(X_t)) \tag{17.1}
\]

TCA uses the maximum mean discrepancy (MMD) as the distance criterion, computed as,

\[
\mathrm{Dist}(P(\tilde{X}_s), P(\tilde{X}_t)) = \left\| \frac{1}{n_s} \sum_{i=1}^{n_s} \phi(x_{s,i}) - \frac{1}{n_t} \sum_{j=1}^{n_t} \phi(x_{t,j}) \right\|^2_{\mathcal{H}} \tag{17.2}
\]

where X̃ represents the transformed feature matrix. Using the 'kernel trick', k(x_i, x_j) = φ(x_i)^⊤ φ(x_j), Eq. (17.2) may be written as,

\[
\mathrm{Dist}_K(P(X_s), P(X_t)) = \mathrm{tr}(KM) \tag{17.3}
\]

where tr(·) denotes the trace of a matrix, K = φ(X) φ(X)^⊤ ∈ R^{(n_s+n_t)×(n_s+n_t)} is the kernel matrix defined on the combined input feature matrix, X = X_s ∪ X_t ∈ R^{(n_s+n_t)×m}, m is the number of features, and M ∈ R^{(n_s+n_t)×(n_s+n_t)} is the MMD matrix, defined by,

\[
M_{ij} =
\begin{cases}
\dfrac{1}{n_s^2}, & x_i, x_j \in X_s \\[4pt]
\dfrac{1}{n_t^2}, & x_i, x_j \in X_t \\[4pt]
-\dfrac{1}{n_s n_t}, & \text{otherwise}
\end{cases} \tag{17.4}
\]

Instead of learning the kernel k(·, ·), the problem can be solved by considering the kernel matrix K directly. In [15], a semidefinite programming (SDP) approach is formulated to learn the kernel matrix in this way; however, this has several drawbacks, e.g. the SDP solution is very computationally expensive. For this reason, TCA utilises an explicit low-rank representation of the kernel matrix.
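To make the preceding quantities concrete, below is a minimal numpy sketch (not from the chapter; the toy data and the choice of a linear kernel are assumptions for illustration). With a linear kernel, φ is simply the identity map, so the squared MMD of Eq. (17.2) can be computed directly from the sample means and checked against the kernelised form tr(KM) of Eqs. (17.3) and (17.4).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy source/target feature matrices (hypothetical stand-ins for
# damage-sensitive features from two structures); rows are instances.
n_s, n_t, m = 40, 30, 3
X_s = rng.normal(0.0, 1.0, size=(n_s, m))
X_t = rng.normal(0.5, 1.2, size=(n_t, m))

# Eq. (17.2) with phi = identity (linear kernel): squared distance
# between the empirical feature means of the two domains.
diff = X_s.mean(axis=0) - X_t.mean(axis=0)
mmd_direct = diff @ diff

# Eq. (17.4): the MMD matrix M on the stacked (n_s + n_t) instances.
n = n_s + n_t
M = np.full((n, n), -1.0 / (n_s * n_t))   # cross-domain blocks
M[:n_s, :n_s] = 1.0 / n_s**2              # source-source block
M[n_s:, n_s:] = 1.0 / n_t**2              # target-target block

# Eq. (17.3): Dist_K = tr(KM), with the linear kernel matrix K = X X^T
# computed on the combined feature matrix X = [X_s; X_t].
X = np.vstack([X_s, X_t])
K = X @ X.T
mmd_kernel = np.trace(K @ M)

print(np.isclose(mmd_direct, mmd_kernel))  # True: the two forms agree
```

The equality holds for any valid kernel; the linear kernel is used here only so that the feature-space mean in Eq. (17.2) can be written out explicitly.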

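For a nonlinear kernel the map φ is only implicit (for the Gaussian/RBF kernel, H is infinite-dimensional), which is precisely why the trace form of Eq. (17.3) is useful: the MMD can still be evaluated from the kernel matrix alone. The following self-contained sketch repeats the computation with an RBF kernel; the toy data and the median-heuristic bandwidth are again assumptions, not choices made in the chapter.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(1)
X_s = rng.normal(0.0, 1.0, size=(40, 3))   # hypothetical source features
X_t = rng.normal(0.5, 1.2, size=(30, 3))   # hypothetical target features
n_s, n_t = len(X_s), len(X_t)
X = np.vstack([X_s, X_t])

# RBF kernel k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2)); the
# bandwidth sigma uses the common median heuristic (an assumed choice).
sigma = np.median(cdist(X, X))
K = np.exp(-cdist(X, X, "sqeuclidean") / (2.0 * sigma**2))

# The MMD matrix of Eq. (17.4) is unchanged; only the kernel differs.
M = np.full((n_s + n_t, n_s + n_t), -1.0 / (n_s * n_t))
M[:n_s, :n_s] = 1.0 / n_s**2
M[n_s:, n_s:] = 1.0 / n_t**2

print(np.trace(K @ M))  # squared MMD in the RBF-induced RKHS, Eq. (17.3)
```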