Calculate Root Squared Error of xarray dataset

Shalin Barot :

I have an xarray dataset monthly_data that contains only January values, with the following info:

lat: float64 (192)
lon: float64 (288)
time: object (1200)(monthly data)

Data Variables:
tas: (time, lat, lon)[[[45,78,...],...]...]

I also have a ground truth dataset grnd_trth, which holds the true January data:

Coordinates:
lat: float64 (192)
lon: float64 (288)

Data Variables:
tas: (lat, lon)

Now I want to calculate the root squared error of each month in monthly_data with respect to grnd_trth. I tried using loops and I think it works, here's my attempt:

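rms = []

for i in range(1200):
  err = 0
  # walk through the difference array for month i, pixel by pixel
  for j in (grnd_trth.tas[0] - monthly_data.tas[i]).values:
    for k in j:
      err += k**2
  # parentheses are needed here: err**1/2 evaluates as (err**1)/2
  rms.append(err**(1/2))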

Is there a more efficient way, or a built-in function, to do this?

Edit:

Output of monthly_data.tas:

xarray.DataArray 'tas': (time: 1200, lat: 192, lon: 288)
array([[[45,46,45,4....],....]...]

Coordinates:
lat:
array([-90. , -89.75,...])

lon:
array([0., 1.25.,.... ])

time:
array([cftime.DatetimeNoLeap(0001-01-15 12:00:00),
       cftime.DatetimeNoLeap(0002-01-15 12:00:00),
       cftime.DatetimeNoLeap(0003-01-15 12:00:00), ...,
       cftime.DatetimeNoLeap(1198-01-15 12:00:00),
       cftime.DatetimeNoLeap(1199-01-15 12:00:00),
       cftime.DatetimeNoLeap(1200-01-15 12:00:00)]

Output of grnd_trth.tas:

xarray.DataArray 'tas': (lat: 192, lon: 288)
array([[45,46,45,4....],....]

Coordinates:
lat:
array([-90. , -89.75,...])

lon:
array([0., 1.25.,.... ])

time:
array([cftime.DatetimeNoLeap(0001-01-15 12:00:00)]

But when I use .values, it only returns the array of tas values!

AWilliams3142 :

In terms of doing this in a more 'efficient' way, there are two things to point out.

1) You're allowed to do arithmetic operations directly on xarray objects, e.g.

for time_idx in range(1200):
    # For each time idx, find the root squared error at
    # each pixel between grnd_trth and monthly_data

    err2 = (grnd_trth.tas - monthly_data.tas[time_idx,...])**2
    err  = err2**(1/2)
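
Since xarray broadcasts by dimension name, the same per-pixel arithmetic can also be applied to every time step in a single expression. Below is a minimal sketch of that idea. It uses small synthetic stand-ins, with the hypothetical names monthly_tas and truth_tas in place of monthly_data.tas and grnd_trth.tas; the shapes are shrunk and the random values are made up for illustration only.

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins for the arrays in the question (sizes shrunk for
# illustration; the random values are not real data).
rng = np.random.default_rng(0)
lat = np.linspace(-90.0, 90.0, 4)
lon = np.linspace(0.0, 270.0, 6)

monthly_tas = xr.DataArray(
    rng.normal(size=(5, 4, 6)),
    dims=("time", "lat", "lon"),
    coords={"time": np.arange(5), "lat": lat, "lon": lon},
    name="tas",
)
truth_tas = xr.DataArray(
    rng.normal(size=(4, 6)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
    name="tas",
)

# truth_tas has no "time" dimension, so it is broadcast against every
# time step of monthly_tas; no explicit loop over time is needed.
err2 = (monthly_tas - truth_tas) ** 2
err = err2 ** (1 / 2)  # per-pixel root squared error, i.e. |difference|
```

Here err keeps the (time, lat, lon) dimensions and coordinates of monthly_tas, so err.isel(time=0) corresponds to the err computed in the first pass of the loop.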

2) There's a method called .sum() which sums all the elements in an array, so you won't need the for k in j: line to sum over the pixels, e.g.

rms = []

for time_idx in range(1200):
    # same two lines as before
    err2 = (grnd_trth.tas - monthly_data.tas[time_idx,...])**2
    err  = err2**(1/2)

    # sum over every pixel and extract the value from the DataArray
    err_tot = err.sum().values

    # Append this month's total to the list
    rms.append(err_tot)

Now, one thing to point out here is that, by simply extracting the values from the DataArray, you lose all of the metadata about the array! So this isn't really best practice, but for now I think this answers your question?
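
One way to keep the metadata is to sum over the named spatial dimensions instead of calling .values, which leaves a DataArray indexed by time. The sketch below is only illustrative: monthly_tas and truth_tas are hypothetical synthetic stand-ins for monthly_data.tas and grnd_trth.tas, with made-up random values. It also computes two different quantities: the sum of per-pixel root squared errors that the loop above accumulates, and the root of the summed squared errors that the loop in the question is aiming at.

```python
import numpy as np
import xarray as xr

# Synthetic stand-ins (shrunk shapes, made-up values) for the real arrays.
rng = np.random.default_rng(1)
monthly_tas = xr.DataArray(
    rng.normal(size=(5, 4, 6)),
    dims=("time", "lat", "lon"),
    coords={"time": np.arange(5)},
    name="tas",
)
truth_tas = xr.DataArray(
    rng.normal(size=(4, 6)),
    dims=("lat", "lon"),
    name="tas",
)

diff = monthly_tas - truth_tas  # broadcast over "time"

# Sum of per-pixel root squared errors: one value per time step.
abs_err_total = np.abs(diff).sum(dim=("lat", "lon"))

# Root of the summed squared errors: also one value per time step.
root_sq_err = np.sqrt((diff ** 2).sum(dim=("lat", "lon")))
```

Both results are still DataArrays with a time dimension and its coordinate, so they can be selected or plotted by time, and .values is only needed at the point where a plain NumPy array is really required.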
