Assume that x˙=f(x) is a smooth dynamical system and that (X˙,X) consists of a finite subset of {(x˙(t),x(t))}t. If Θ(X) contains all terms (bases) of f, then there exists Ξ such that ε=X˙−Θ(X)Ξ=0. Conversely, ε≫0 signals a violation of these assumptions: f is not a smooth function, or Θ is not a complete set of bases, or the noise in (X˙,X) is not negligible.
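As a minimal sketch of this statement, the NumPy snippet below fits Ξ by least squares and reports the mean squared residual. The polynomial library in `theta`, the helper names `fit_xi` and `mse`, and the example system x˙=2x−x² are all assumptions chosen for illustration; they do not come from the text. It also assumes that "MSE" means the mean of the squared entries of X˙−Θ(X)Ξ.

```python
import numpy as np

def theta(X):
    """Hypothetical candidate library: columns [1, x, x^2] for 1-D state samples."""
    x = X[:, 0]
    return np.column_stack([np.ones_like(x), x, x**2])

def fit_xi(X_dot, X):
    """Least-squares minimizer of ||X_dot - Theta(X) Xi||."""
    xi, *_ = np.linalg.lstsq(theta(X), X_dot, rcond=None)
    return xi

def mse(X_dot, X, xi):
    """Mean squared entry of the residual X_dot - Theta(X) Xi (assumed meaning of MSE)."""
    r = X_dot - theta(X) @ xi
    return float(np.mean(r**2))

# Noise-free samples of a smooth system whose terms are all in the library.
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
X_dot = 2.0 * X - X**2          # illustrative choice: x' = 2x - x^2
xi = fit_xi(X_dot, X)           # expected to be close to [0, 2, -1]
eps = mse(X_dot, X, xi)         # expected to be numerically zero

# Same data, but f(x) = x^3 is NOT in the library: the residual cannot vanish.
X_dot_bad = X**3
eps_bad = mse(X_dot_bad, X, fit_xi(X_dot_bad, X))

print(eps, eps_bad)
```

The second fit illustrates one of the three failure modes listed above, an incomplete set of bases, under the same assumed library.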
Let X1 and X2 be two disjoint datasets from a non-smooth dynamical system, and let Ξ1 and Ξ2 denote the minimizers (over Ξ) of X˙1−Θ(X1)Ξ and X˙2−Θ(X2)Ξ, respectively.
Denote the concatenation of two matrices M1 and M2 as
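$$
M_{12} = \begin{bmatrix} M_1 \\ M_2 \end{bmatrix}, \qquad
\Theta(M_{12}) = \begin{bmatrix} \Theta(M_1) \\ \Theta(M_2) \end{bmatrix}
$$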
Let A and B be the matrices giving the differences of Ξ1 and Ξ2 from Ξ12, respectively:
Ξ12=Ξ1−A=Ξ2−B
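Then the MSE of the concatenated dataset X12 is given by the following chain, where the last equality assumes that each dataset is fitted exactly by its own minimizer, that is, X˙1−Θ(X1)Ξ1=0 and X˙2−Θ(X2)Ξ2=0: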
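$$
\begin{aligned}
\varepsilon &= \dot{X}_{12} - \Theta(X_{12})\,\Xi_{12} \\
&= \begin{bmatrix} \dot{X}_1 \\ \dot{X}_2 \end{bmatrix} - \begin{bmatrix} \Theta(X_1) \\ \Theta(X_2) \end{bmatrix} \Xi_{12} \\
&= \begin{bmatrix} \dot{X}_1 \\ \dot{X}_2 \end{bmatrix} - \begin{bmatrix} \Theta(X_1)\,\Xi_{12} \\ \Theta(X_2)\,\Xi_{12} \end{bmatrix} \\
&= \begin{bmatrix} \dot{X}_1 - \Theta(X_1)\,\Xi_{12} \\ \dot{X}_2 - \Theta(X_2)\,\Xi_{12} \end{bmatrix} \\
&= \begin{bmatrix} \dot{X}_1 - \Theta(X_1)\,\Xi_1 + \Theta(X_1)\,A \\ \dot{X}_2 - \Theta(X_2)\,\Xi_2 + \Theta(X_2)\,B \end{bmatrix} \\
&= \begin{bmatrix} \Theta(X_1)\,A \\ \Theta(X_2)\,B \end{bmatrix}
\end{aligned}
$$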
If X1 and X2 are from the same subsystem, then Ξ12=Ξ1=Ξ2⟹A=B=O and X˙12−Θ(X12)Ξ12=0; that is, Ξ12 is a minimizer for both datasets.
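The identity above can be checked numerically. The sketch below is only an illustration under assumed choices: a linear library [1, x], two made-up subsystems (x˙=2x and x˙=1−x), and helper names `theta`, `fit`, `col` and `concat_residual` that are not part of the text. It compares a same-subsystem concatenation with a cross-subsystem one, then rebuilds the cross-subsystem residual from Θ(X1)A and Θ(X3)B.

```python
import numpy as np

def theta(X):
    """Hypothetical candidate library: columns [1, x]."""
    x = X[:, 0]
    return np.column_stack([np.ones_like(x), x])

def fit(X_dot, X):
    """Least-squares minimizer Xi for one dataset."""
    xi, *_ = np.linalg.lstsq(theta(X), X_dot, rcond=None)
    return xi

def col(a, b, n=10):
    """n evenly spaced samples on [a, b] as a column vector."""
    return np.linspace(a, b, n).reshape(-1, 1)

# Disjoint state samples; X1 and X2 share a subsystem, X3 does not.
X1, X2, X3 = col(0, 1), col(2, 3), col(-2, -1)
dX1, dX2 = 2.0 * X1, 2.0 * X2   # assumed subsystem a: x' = 2x
dX3 = 1.0 - X3                  # assumed subsystem b: x' = 1 - x

def concat_residual(Xa, dXa, Xb, dXb):
    """Residual and minimizer for the vertical concatenation of two datasets."""
    X, dX = np.vstack([Xa, Xb]), np.vstack([dXa, dXb])
    xi = fit(dX, X)
    return dX - theta(X) @ xi, xi

r12, xi12 = concat_residual(X1, dX1, X2, dX2)   # same subsystem
r13, xi13 = concat_residual(X1, dX1, X3, dX3)   # different subsystems

# A and B as defined in the text: Xi_13 = Xi_1 - A = Xi_3 - B.
A = fit(dX1, X1) - xi13
B = fit(dX3, X3) - xi13
stacked = np.vstack([theta(X1) @ A, theta(X3) @ B])

print(np.mean(r12**2), np.mean(r13**2), np.allclose(r13, stacked))
```

Because each individual dataset is noise-free and fitted exactly here, the stacked blocks reproduce the concatenated residual; with noisy data the match would only be approximate.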
Method
1. Let s=1.
2. Build the whole dataset X by vertically concatenating all datasets and calculate the MSE ε=X˙−Θ(X)Ξ. This MSE is used as the threshold for stopping the algorithm.
3. Take an arbitrary dataset Xi and calculate the MSE εik=X˙ik−Θ(Xik)Ξ for all indices k, where Xik is the concatenation of Xi and Xk. Note that εik indicates how much the k-th dataset differs from the i-th dataset.
4. Find j=argmaxkεik and calculate the MSE εjk=X˙jk−Θ(Xjk)Ξ for all indices k. If εjk<ε, we conclude that even the most different dataset Xj is similar to Xi, so there is no further need to check the j-th dataset.
5. Compare εik and εjk for all indices k. If εik<εjk, then the k-th dataset is a candidate dataset for the s-th subsystem. This step separates the datasets into two groups: candidates and non-candidates.
6. In the candidate group, pick the dataset with the largest MSE, l=argmaxkεik, and calculate εlj for all indices j in the non-candidate group. If εil>minj{εlj}, then the non-candidate group contains at least one dataset that is more similar to the l-th dataset than the i-th dataset is, so the l-th dataset is no longer a candidate. Repeat this step until εil<minj{εlj}.
7. Once εil<minj{εlj}, we conclude that the datasets in the candidate group are from the s-th subsystem and remove them.
8. Update s←s+1 and repeat from step 2 until all datasets are labeled.
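The splitting stage of the method (choosing a reference dataset, finding its most different partner, and separating candidates from non-candidates) can be sketched as follows. This is a partial illustration, not the full procedure: the threshold test against ε, the pruning loop over the candidate group, and the outer loop over s are omitted. The linear library, the toy subsystems, and the names `pair_mse` and `split_candidates` are assumptions made for this example.

```python
import numpy as np

def theta(X):
    """Hypothetical candidate library: columns [1, x]."""
    x = X[:, 0]
    return np.column_stack([np.ones_like(x), x])

def pair_mse(data, i, k):
    """MSE of the least-squares fit on the concatenation of datasets i and k."""
    X = np.vstack([data[i][0], data[k][0]])
    dX = np.vstack([data[i][1], data[k][1]])
    T = theta(X)
    xi, *_ = np.linalg.lstsq(T, dX, rcond=None)
    return float(np.mean((dX - T @ xi) ** 2))

def split_candidates(data, i=0):
    """Split dataset indices into candidates (closer to i) and the rest (closer to j)."""
    n = len(data)
    eps_i = np.array([pair_mse(data, i, k) for k in range(n)])
    j = int(np.argmax(eps_i))                      # dataset most different from i
    eps_j = np.array([pair_mse(data, j, k) for k in range(n)])
    candidates = [k for k in range(n) if eps_i[k] < eps_j[k]]
    others = [k for k in range(n) if eps_i[k] >= eps_j[k]]
    return j, candidates, others

def col(a, b, n=10):
    """n evenly spaced samples on [a, b] as a column vector."""
    return np.linspace(a, b, n).reshape(-1, 1)

# Toy data as (X, X_dot) pairs: indices 0-2 follow x' = 2x, indices 3-4 follow x' = 1 - x.
states_a = [col(0, 1), col(2, 3), col(4, 5)]
states_b = [col(-2, -1), col(-4, -3)]
data = [(X, 2.0 * X) for X in states_a] + [(X, 1.0 - X) for X in states_b]

j, candidates, others = split_candidates(data, i=0)
print(j, candidates, others)
```

On this noise-free toy data the comparison of εik with εjk is unambiguous; with noisy measurements near-ties are possible, which is presumably where the subsequent pruning step and the threshold ε become important.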