Locally weighted linear regression (1): Python implementation

  • Algorithm features:
    Each point to be fitted has its own linear regression equation. The line at a query point is determined by weighted least squares over all sample points, where each sample's weight depends on its distance from the query point in the input space.
  • Algorithm derivation:
    Equation to be fitted:
    \begin{equation}\label{eq_1}
    h_{\theta}(x) = x^T \theta
    \end{equation}
    Weighted least squares:
    \begin{equation}\label{eq_2}
    \min_{\theta} \ \frac{1}{2} (X^T \theta - \bar{Y})^T W (X^T \theta - \bar{Y})
    \end{equation}
    where $X = [x^1, x^2, \dots]$ is the matrix formed by the input values of the data to be fitted; each data point is a column vector whose last element is set to 1 to accommodate the constant term of the linear fit. $\bar{Y}$ is the column vector formed by the output values of the data to be fitted. $W$ is the diagonal matrix of weights, whose $i$-th diagonal element is $\exp(-(x_i - x)^2 / (2\tau^2))$, $x$ being the query point at which the fit is evaluated. $\theta$ is the optimization variable to be solved for, a column vector.

    The optimization problem (\ref{eq_2}) above is an unconstrained quadratic program and has an approximate analytical solution:
    \begin{equation}\label{eq_3}
    \theta = (XWX^T + \varepsilon I)^{-1} XW\bar{Y}
    \end{equation}
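    As a sanity check, equation (\ref{eq_3}) can be evaluated directly with NumPy. The sketch below uses an illustrative helper `lw_theta` (not part of the code that follows) and assumes the design-matrix layout described above: one column per data point, with a final row of ones for the constant term. On exactly linear data the fit recovers the line regardless of the weights:

    ```python
    import numpy as np

    def lw_theta(x0, X, Y, tau, eps=1e-9):
        # X: 2 x n design matrix -- first row holds the x values,
        # second row is all ones (the constant term)
        w = np.exp(-(X[0, :] - x0) ** 2 / (2 * tau ** 2))  # Gaussian weights
        W = np.diag(w)
        A = X @ W @ X.T + eps * np.eye(X.shape[0])         # X W X^T + eps*I
        return np.linalg.inv(A) @ X @ W @ Y                # eq. (3)

    # on exactly linear data y = 2x + 1 the fit recovers the line
    xs = np.linspace(0.0, 3.0, 50)
    X = np.vstack((xs, np.ones_like(xs)))
    Y = (2 * xs + 1).reshape(-1, 1)
    theta = lw_theta(1.5, X, Y, tau=0.5)
    print(theta.ravel())  # close to [2, 1]
    ```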
  • Code implementation :
# locally weighted linear regression
# cross-validation to find the tau minimizing the generalization error


import numpy
from matplotlib import pyplot as plt


# noise-free target function to be fitted
def oriFunc(x):
    y = numpy.exp(-x) * numpy.sin(10 * x)
    return y

# target function with noise to be fitted
def traFunc(x, sigma=0.03):
    y = oriFunc(x) + numpy.random.normal(0, sigma, numpy.array(x).size)
    return y


# implementation of locally weighted linear regression
class LW(object):

    def __init__(self, xBound=(0, 3), number=500, tauBound=(0.001, 100), epsilon=1.e-3):
        self.__xBound = xBound                # sampling boundary
        self.__number = number                # number of samples
        self.__tauBound = tauBound            # search boundary for tau
        self.__epsilon = epsilon              # accuracy of the search for tau


    def get_data(self):
        '''
        generate the data to be fitted according to the target function
        '''
        X = numpy.linspace(*self.__xBound, self.__number)
        oriY_ = oriFunc(X)                    # error-free response
        traY_ = traFunc(X)                    # response with error

        self.X = numpy.vstack((X.reshape((1, -1)), numpy.ones((1, X.shape[0]))))
        self.oriY_ = oriY_.reshape((-1, 1))
        self.traY_ = traY_.reshape((-1, 1))

        return self.X, self.oriY_, self.traY_


    def lw_fitting(self, tau=None):
        if not hasattr(self, "X"):
            self.get_data()
        if tau is None:
            if hasattr(self, "bestTau"):
                tau = self.bestTau
            else:
                tau = self.get_tau()

        xList, yList = list(), list()
        for val in numpy.linspace(*self.__xBound, self.__number * 5):
            x = numpy.array([[val], [1]])
            theta = self.__fitting(x, self.X, self.traY_, tau)
            y = numpy.matmul(theta.T, x)
            xList.append(x[0, 0])
            yList.append(y[0, 0])

        resiList = list()                     # residual calculation
        for idx in range(self.__number):
            x = self.X[:, idx:idx+1]
            theta = self.__fitting(x, self.X, self.traY_, tau)
            y = numpy.matmul(theta.T, x)
            resiList.append(self.traY_[idx, 0] - y[0, 0])

        return xList, yList, self.X[0, :].tolist(), resiList


    def show(self):
        '''
        plot the overall fitting result
        '''
        xList, yList, sampleXList, sampleResiList = self.lw_fitting()
        y2List = oriFunc(numpy.array(xList))
        fig = plt.figure(figsize=(8, 14))
        ax1 = plt.subplot(2, 1, 1)
        ax2 = plt.subplot(2, 1, 2)

        ax1.scatter(self.X[0, :], self.traY_[:, 0], c="green", alpha=0.7, label="samples with noise")
        ax1.plot(xList, y2List, c="red", lw=4, alpha=0.7, label="standard curve")
        ax1.plot(xList, yList, c="black", lw=2, linestyle="--", label="fitting curve")
        ax1.set(xlabel="$x$", ylabel="$y$")
        ax1.legend()

        ax2.scatter(sampleXList, sampleResiList, c="blue", s=10)
        ax2.set(xlabel="$x$", ylabel="$\\epsilon$", title="residual distribution")

        fig.tight_layout()
        fig.savefig("lw.png", dpi=300)
        plt.show()
        plt.close()


    def __fitting(self, x, X, Y_, tau, epsilon=1.e-9):
        tmpX = X[0:1, :]
        tmpW = (-(tmpX - x[0, 0]) ** 2 / tau ** 2 / 2).reshape(-1)
        W = numpy.diag(numpy.exp(tmpW))

        item1 = numpy.matmul(numpy.matmul(X, W), X.T)
        item2 = numpy.linalg.inv(item1 + epsilon * numpy.identity(item1.shape[0]))
        item3 = numpy.matmul(numpy.matmul(X, W), Y_)

        theta = numpy.matmul(item2, item3)

        return theta


    def get_tau(self):
        '''
        return the best tau by cross-validation,
        using golden-section search for the optimum
        '''
        if not hasattr(self, "X"):
            self.get_data()

        lowerBound, upperBound = self.__tauBound
        lowerTau = self.__calc_lowerTau(lowerBound, upperBound)
        upperTau = self.__calc_upperTau(lowerBound, upperBound)
        lowerErr = self.__calc_generalErr(self.X, self.traY_, lowerTau)
        upperErr = self.__calc_generalErr(self.X, self.traY_, upperTau)

        while (upperTau - lowerTau) > self.__epsilon:
            if lowerErr > upperErr:
                lowerBound = lowerTau
                lowerTau = upperTau
                lowerErr = upperErr
                upperTau = self.__calc_upperTau(lowerBound, upperBound)
                upperErr = self.__calc_generalErr(self.X, self.traY_, upperTau)
            else:
                upperBound = upperTau
                upperTau = lowerTau
                upperErr = lowerErr
                lowerTau = self.__calc_lowerTau(lowerBound, upperBound)
                lowerErr = self.__calc_generalErr(self.X, self.traY_, lowerTau)

        self.bestTau = (upperTau + lowerTau) / 2
        return self.bestTau


    def __calc_generalErr(self, X, Y_, tau):
        generalErr = 0

        # leave-one-out cross-validation
        for idx in range(X.shape[1]):
            tmpx = X[:, idx:idx+1]
            tmpy_ = Y_[idx:idx+1, :]
            tmpX = numpy.hstack((X[:, 0:idx], X[:, idx+1:]))
            tmpY_ = numpy.vstack((Y_[0:idx, :], Y_[idx+1:, :]))

            theta = self.__fitting(tmpx, tmpX, tmpY_, tau)
            tmpy = numpy.matmul(theta.T, tmpx)
            generalErr += (tmpy_[0, 0] - tmpy[0, 0]) ** 2

        return generalErr


    def __calc_lowerTau(self, lowerBound, upperBound):
        delta = upperBound - lowerBound
        lowerTau = upperBound - delta * 0.618
        return lowerTau


    def __calc_upperTau(self, lowerBound, upperBound):
        delta = upperBound - lowerBound
        upperTau = lowerBound + delta * 0.618
        return upperTau


if __name__ == '__main__':
    obj = LW()
    obj.show()
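The tau search in `get_tau` above is a golden-section search over the cross-validation error. A minimal standalone sketch of the same bracketing scheme (the `golden_min` helper is illustrative, not part of the class) may make the update logic easier to follow:

```python
def golden_min(f, lo, hi, eps=1e-6):
    # golden-section search for the minimum of a unimodal f on [lo, hi]
    x1 = hi - 0.618 * (hi - lo)
    x2 = lo + 0.618 * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while (x2 - x1) > eps:
        if f1 > f2:                       # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2       # reuse the surviving evaluation
            x2 = lo + 0.618 * (hi - lo)
            f2 = f(x2)
        else:                             # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - 0.618 * (hi - lo)
            f1 = f(x1)
    return (x1 + x2) / 2

xmin = golden_min(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
print(xmin)  # close to 2.0
```

Each iteration shrinks the bracket by the golden ratio while reusing one of the two previous function evaluations, so only one new (expensive) cross-validation error is computed per step.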

    The example function used:
    \begin{equation}\label{eq_4}
    f(x) = e^{-x} \sin(10x)
    \end{equation}

  • Results: running the script produces the fitted-curve and residual-distribution plots (saved as lw.png).
  • Recommendations:
    1. Locally weighted linear regression is primarily a smoothing tool; its ability to predict outside the region covered by the samples is low.
    2. In particular, when no explicit fitting formula is available, locally weighted linear regression can yield an acceptable fitted curve.
    3. With an appropriately chosen weighting scheme, relatively reliable 0th- and 1st-order function information can be computed at a specified point.
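Recommendation 3 follows from the local fit being a line: at a query point $x$, $\theta^T x$ estimates the function value (0th order) and the first component of $\theta$ estimates the derivative (1st order). A hedged sketch on $y = x^2$, where both are known exactly (the `lw_info` helper name is illustrative):

```python
import numpy as np

def lw_info(x0, X, Y, tau, eps=1e-9):
    # locally weighted linear fit at x0: returns (value, slope) estimates
    w = np.exp(-(X[0, :] - x0) ** 2 / (2 * tau ** 2))
    W = np.diag(w)
    A = X @ W @ X.T + eps * np.eye(2)
    theta = np.linalg.inv(A) @ X @ W @ Y
    return theta[0, 0] * x0 + theta[1, 0], theta[0, 0]

xs = np.linspace(0.0, 2.0, 2001)
X = np.vstack((xs, np.ones_like(xs)))
Y = (xs ** 2).reshape(-1, 1)
value, slope = lw_info(1.0, X, Y, tau=0.05)
print(value, slope)  # near f(1) = 1 and f'(1) = 2
```

Note the small upward bias in the value estimate (of order $\tau^2$ for a quadratic target): a narrower kernel reduces this bias but amplifies noise, which is exactly the trade-off the cross-validated tau balances.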


Origin www.cnblogs.com/xxhbdk/p/11614217.html