Detailed explanation of semivariance function

1 Introduction

Tobler’s First Law of Geography states, “Everything is related to everything else, but things near are more related than things far away.”

In the case of semivariograms, things that are closer together are more predictable and vary less, whereas things that are far apart are harder to predict and less related.

For example, the terrain at your current location is more likely to be similar to the terrain 1 meter ahead than to the terrain 100 meters away.

The semivariogram plot shows how sample values (pollution, elevation, noise, etc.) change with distance.

Next, a set of soil moisture samples is used as a case study.

The case contains 73 soil moisture samples from a 10-acre field. In the northwestern corner, the samples are wetter, with a higher water content, but in the eastern quadrant they are much drier, as shown below.

The picture above raises the following questions:

  • How predictable are the values from one location to another?
  • Are samples that are close together more similar than samples that are far apart?

This idea can be described in terms of statistical dependence, or autocorrelation. Spatial autocorrelation (things that are closer together are more similar than things that are farther apart) also provides valuable information for prediction.

2 Principle of semivariogram

The semivariogram can be used to estimate spatial dependence. It takes 2 sampling positions at a time and calls the distance between the two points h.

On the x-axis, it plots distance (h) in units of lag, where a lag is just a grouping of distances. For each pair of sample locations, the variance between their values of the response variable (here, the moisture content of the soil) is measured and plotted on the y-axis.

At first glance, the semivariogram looks like a bunch of points. For example, the soil moisture semivariogram looks like this:

But you can do some detective work by selecting a single point. When you pick this point on the semivariogram:

You can see on the map which 2 sample locations it represents. It makes sense that they are quite far apart from each other, because the point sits at the far right of the semivariogram. This is the point highlighted below:

The two samples also differ significantly from the mean value for that lag distance. The higher the semivariance, the higher the point sits on the y-axis. In general, the semivariance is small when the lag distance is short, and it grows as the lag distance increases.

We are studying every pair of samples: the distance between them and how much their values differ. The semivariogram considers all pairs of points, their separation distance, and their variance.

That's why there are so many points on the semivariogram. The figure shows a subset of the dataset above, to illustrate all the different pairs of points that can be plotted in the semivariogram.
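The pairing described above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name `variogram_cloud` and the four sample coordinates and moisture values are hypothetical and are not taken from the 73-sample dataset.

```python
import math
from itertools import combinations


def variogram_cloud(coords, values):
    """Return one (distance, semivariance) tuple for every pair of samples."""
    cloud = []
    for i, j in combinations(range(len(coords)), 2):
        h = math.dist(coords[i], coords[j])       # separation distance of the pair
        sv = 0.5 * (values[i] - values[j]) ** 2   # semivariance of the pair
        cloud.append((h, sv))
    return cloud


# Hypothetical samples: (x, y) in meters and soil moisture in percent.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 10.0)]
values = [30.0, 28.0, 22.0, 25.0]

cloud = variogram_cloud(coords, values)
for h, sv in sorted(cloud):
    print(f"h = {h:5.2f}  semivariance = {sv:5.2f}")
```

Each tuple is one point of the plot. With n samples there are n(n-1)/2 pairs, so 73 samples already produce 2628 points, which is why the plot looks so crowded.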

3 Semivariogram calculation

The semivariance function is a function of the distance $h$ and also of the direction $\alpha$. A variable that is distributed in space is called a regionalized variable, and the semivariance function is half the mathematical expectation of the squared increment between the regionalized variable values $Z(x_i)$ and $Z(x_i+h)$, that is, half the variance of the increment of the regionalized variable. Its calculation formula is:

$$r(h)=\frac{1}{2N(h)}\sum_{i=1}^{N(h)}[Z(x_i)-Z(x_i+h)]^2$$

In this expression, $r(h)$ is the estimated semivariance at separation distance $h$; $N(h)$ is the number of pairs of sample points separated by the distance $h$; $Z(x_i)$ is the measured value at sample point $x_i$, and $Z(x_i+h)$ is the measured value at sample point $x_i+h$.
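The formula can be turned into a small estimator. The sketch below is a minimal, standard-library version and rests on a few assumptions that are not in the text above: pairs are grouped into equal-width lag bins, direction is ignored (isotropic case), and each bin is reported at the mean distance of its pairs. The function name and the transect data are hypothetical.

```python
import math
from itertools import combinations


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Estimate r(h) per lag bin; returns (mean distance, r(h), N(h)) tuples."""
    sq_diff_sum = [0.0] * n_lags   # sum of [Z(x_i) - Z(x_i + h)]^2 per bin
    dist_sum = [0.0] * n_lags      # sum of pair distances per bin
    pair_count = [0] * n_lags      # N(h) per bin

    for i, j in combinations(range(len(coords)), 2):
        h = math.dist(coords[i], coords[j])
        k = int(h // lag_width)    # index of the lag bin this pair falls into
        if k < n_lags:             # pairs beyond the last bin are ignored
            sq_diff_sum[k] += (values[i] - values[j]) ** 2
            dist_sum[k] += h
            pair_count[k] += 1

    result = []
    for k in range(n_lags):
        if pair_count[k] == 0:
            continue               # r(h) is undefined for an empty bin
        n = pair_count[k]
        result.append((dist_sum[k] / n, sq_diff_sum[k] / (2 * n), n))
    return result


# Hypothetical transect: four samples spaced 1 m apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]

semivariogram = empirical_semivariogram(coords, values, lag_width=1.0, n_lags=4)
for h, r, n in semivariogram:
    print(f"h = {h:.1f}  r(h) = {r:.3f}  N(h) = {n}")
```

Note how N(h) shrinks as h grows: the estimate at the largest lag here rests on a single pair, so the far end of an empirical semivariogram is usually the least reliable.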

The semivariogram is the graph of $r(h)$ as a function of the distance $h$, with values taken in a specific direction. Its four most important parameters are:

  • Range (RANGE): the separation distance at which the value of the variogram reaches equilibrium, reflecting the scope of influence of the regionalized variable.
  • Nugget $C_0$ (NUGGET): the intercept obtained when the variogram curve is extended to a separation distance of zero, reflecting the degree of internal randomness of the regionalized variable.
  • Sill $C_0+C$ (SILL): the value of the variogram when it reaches equilibrium, reflecting the magnitude of the variable's variation.
  • Spatial variation ratio $C_0/(C_0+C)$: reflects the degree of spatial variation of the variable. The higher the value, the larger the share of spatial heterogeneity caused by the random part; the lower the value, the larger the share of spatial variation caused by the spatially autocorrelated part. If the ratio is close to 1, the variable has constant variation across the entire scale. From the perspective of structural factors, the ratio represents the degree of spatial correlation of the system variable: a ratio below 25% indicates strong spatial correlation, a ratio between 25% and 75% indicates moderate spatial correlation, and a ratio above 75% indicates weak spatial correlation.
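The 25% and 75% thresholds for the spatial variation ratio can be written as a small helper. This is a sketch only: the function names are hypothetical, `partial_sill` stands for $C$ (so that the sill is $C_0+C$), and the example values are made up.

```python
def nugget_to_sill_ratio(nugget, partial_sill):
    """Return C0 / (C0 + C) for a nugget C0 and a partial sill C."""
    sill = nugget + partial_sill
    if sill <= 0:
        raise ValueError("the sill C0 + C must be positive")
    return nugget / sill


def classify_spatial_correlation(ratio):
    """Apply the thresholds quoted above: <25% strong, 25-75% moderate, >75% weak."""
    if ratio < 0.25:
        return "strong"
    if ratio <= 0.75:
        return "moderate"
    return "weak"


# Hypothetical fitted parameters: nugget C0 = 2, partial sill C = 8.
ratio = nugget_to_sill_ratio(2.0, 8.0)
label = classify_spatial_correlation(ratio)
print(f"C0/(C0+C) = {ratio:.2f} -> {label} spatial correlation")
```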

With $r(h)$ as the vertical axis and $h$ as the horizontal axis, the curve showing how $r(h)$ changes as $h$ increases is the semivariogram.

It can be seen from the figure:

  • At closely spaced sampling points, the difference in values between points tends to be small. In other words, the semivariance is small.
  • Once sampling points are far enough apart, there is no longer a relationship between them. Their semivariance starts to level off and the sample values are unrelated to each other.
  • Two sampling points at the same location can be expected to have the same value, so the nugget should be zero. Sometimes they do not, which adds randomness. Up to the point where the graph starts to level off, the values are spatially autocorrelated.
  • As the distance increases, the semivariance increases. There are fewer pairs of points that are far apart, and the correlation between those sample points is lower.
  • The semivariogram eventually reaches a flat, asymptotic level. A function can be fitted to model this behavior.

4 Semivariogram fitting model

Calculate the variogram for all possible distance intervals within the sampling range, draw the function curve, and then fit a theoretical variogram model to it.

Theoretical models commonly used in geostatistics to fit actual variation curves include spherical models, exponential models, Gaussian models, linear models, etc.
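For orientation, the four model families can be written as plain Python functions. The parameterization below is one common textbook convention and is an assumption here, since the text does not give formulas: `nugget` is $C_0$, `psill` is the partial sill $C$, and `a` is the range parameter. With this convention the exponential and Gaussian models only approach the sill, so their practical range is larger than `a`; other sources and libraries scale `a` differently.

```python
import math


def spherical(h, nugget, psill, a):
    """Spherical model: rises until h = a, then stays exactly at the sill."""
    if h <= 0:
        return 0.0                       # r(0) = 0 by definition
    if h >= a:
        return nugget + psill
    ratio = h / a
    return nugget + psill * (1.5 * ratio - 0.5 * ratio ** 3)


def exponential(h, nugget, psill, a):
    """Exponential model: approaches the sill asymptotically."""
    if h <= 0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-h / a))


def gaussian(h, nugget, psill, a):
    """Gaussian model: parabolic near the origin, asymptotic sill."""
    if h <= 0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-(h / a) ** 2))


def linear(h, nugget, slope):
    """Linear model: no sill, semivariance keeps growing with distance."""
    if h <= 0:
        return 0.0
    return nugget + slope * h


# Hypothetical parameters: nugget 2, partial sill 8, range parameter 10.
for h in (1.0, 5.0, 10.0, 30.0):
    print(h, spherical(h, 2.0, 8.0, 10.0), exponential(h, 2.0, 8.0, 10.0),
          gaussian(h, 2.0, 8.0, 10.0))
```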

  • Generally, the spherical model indicates that the studied population has an aggregated distribution: until the distance between sample points reaches the range, the spatial dependence of sample points decreases as the distance between them increases
  • The exponential model is similar to the spherical model, but its sill is an asymptote
  • For a random distribution, $r(h)$ changes linearly
  • A non-horizontal linear model indicates that the population has a moderately aggregated distribution and that its range of spatial dependence exceeds the research scale
  • For completely random or uniform data, the curve behaves as a pure nugget variogram: $r(h)$ is a horizontal straight line or has only a slight slope, indicating no spatial correlation

Choosing which model to fit to the sample semivariogram is a complex process; the choice is generally based on the shape of the sample semivariogram or on the purpose of the research.
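One way to make the choice less subjective is to fit candidate models and compare their residuals. The sketch below fits only a spherical model, and it uses a brute-force grid search over candidate parameters instead of a proper nonlinear least-squares solver, purely to stay within the standard library. The function names, the parameter grids, and the synthetic "observed" values (generated from a spherical model with nugget 2, partial sill 8, and range 10) are all assumptions for illustration.

```python
def spherical(h, nugget, psill, a):
    """Spherical model with nugget C0, partial sill C, and range a."""
    if h <= 0:
        return 0.0
    if h >= a:
        return nugget + psill
    ratio = h / a
    return nugget + psill * (1.5 * ratio - 0.5 * ratio ** 3)


def fit_spherical(lags, semivariances, nuggets, psills, ranges):
    """Return the grid point with the smallest sum of squared errors."""
    best_params, best_sse = None, float("inf")
    for c0 in nuggets:
        for c in psills:
            for a in ranges:
                sse = sum((spherical(h, c0, c, a) - r) ** 2
                          for h, r in zip(lags, semivariances))
                if sse < best_sse:
                    best_params, best_sse = (c0, c, a), sse
    return best_params, best_sse


# Synthetic empirical semivariogram at lags 1..15 (no noise added).
lags = [float(h) for h in range(1, 16)]
observed = [spherical(h, 2.0, 8.0, 10.0) for h in lags]

# Candidate parameter values (hypothetical search grids).
nugget_grid = [0.5 * k for k in range(11)]     # 0.0 .. 5.0
psill_grid = [0.5 * k for k in range(2, 25)]   # 1.0 .. 12.0
range_grid = [float(k) for k in range(2, 21)]  # 2 .. 20

best_params, best_sse = fit_spherical(lags, observed,
                                      nugget_grid, psill_grid, range_grid)
print("nugget, partial sill, range:", best_params, "SSE:", best_sse)
```

Running the same search with an exponential or Gaussian model function and comparing the resulting sums of squared errors gives a simple numerical basis for choosing between models. With real data, weighting each lag by its pair count N(h) is a common refinement.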

The spatial distribution of many biotic and abiotic factors in nature is closely related to direction; corresponding anisotropic models have therefore been developed.

Some regionalized variables contain changes at several scales or levels. In the semivariance function, their structure then appears not as a single model structure but as a composite structure in which multiple model structures are superposed.

Python implementation:

https://scikit-gstat.readthedocs.io/en/latest/userguide/variogram.html#the-variogram

Reference:
https://gisgeography.com/semi-variogram-nugget-range-sill/

Origin blog.csdn.net/mengjizhiyou/article/details/134147915