DolphinDB Just-In-Time (JIT) Compilation Explained

DolphinDB is a high-performance distributed time-series database with a rich set of built-in computing functions and a powerful multi-paradigm programming language. To improve the execution efficiency of DolphinDB scripts, DolphinDB supports just-in-time (JIT) compilation starting from version 1.01.

1 Introduction to JIT

Just-in-time (JIT) compilation, also known as dynamic or real-time compilation, is a form of dynamic compilation that can improve the execution efficiency of a program.

Programs are usually run in one of two ways: compiled execution or interpreted execution. With compiled execution, the entire program is translated into machine code before it runs, which yields high execution efficiency; C/C++ is a typical example. With interpreted execution, an interpreter parses and executes the program statement by statement, which is more flexible but slower; Python is a typical example.

Just-in-time compilation combines the advantages of both: by translating code into machine code at runtime, it can approach the execution efficiency of statically compiled languages. PyPy, a third-party implementation of Python, significantly improves interpreter performance through JIT, and the vast majority of Java implementations rely on JIT to improve code efficiency.

2 The role of JIT in DolphinDB

DolphinDB's programming language is interpreted. When a program runs, it is first parsed into a syntax tree, which is then executed recursively. When vectorization cannot be used, the interpretation overhead is relatively high. Because DolphinDB's underlying implementation is in C++, a single function call in a script turns into multiple virtual function calls in C++. In statements such as for loops, while loops, and if-else, these repeated function calls are very time-consuming and may not meet real-time requirements in some scenarios.

Just-in-time compilation in DolphinDB significantly improves the running speed of for loops, while loops, and if-else statements. It is especially suitable for scenarios where vectorized operations cannot be used but extremely high speed is required, such as high-frequency factor calculation and real-time streaming data processing.

In the following example, we compare the time taken to compute the sum of the integers from 1 to 1,000,000 with a do-while loop, repeated 100 times, with and without JIT.

def sum_without_jit(v) {
  s = 0l
  i = 0
  n = size(v)
  do {
    s += v[i]
    i += 1
  } while(i < n)
  return s
}

@jit
def sum_with_jit(v) {
  s = 0l
  i = 0
  n = size(v)
  do {
    s += v[i]
    i += 1
  } while(i < n)
  return s
}

vec = 1..1000000

timer(100) sum_without_jit(vec)     // 120552.740 ms
timer(100) sum_with_jit(vec)        //    290.065 ms
timer(100) sum(vec)                 //     48.922 ms

Without JIT, the computation takes about 415 times as long as with JIT, while the built-in sum function takes roughly one-sixth of the JIT time. The built-in function is faster than the JIT version because the JIT-generated code contains many instructions that check for NULL values, whereas the built-in sum skips this step when the input array contains no NULL values.

vec[100] = NULL
timer(100) sum(vec)        // 118.063 ms

Once a NULL value is added, the built-in sum is only about 2.5 times as fast as the JIT version, because the built-in sum has also been hand-optimized. When a function involves more complex calculations, JIT can even outperform vectorized operations, as we will see below.

If a task can be vectorized, JIT may or may not be the better choice, depending on the case. In practical applications such as high-frequency factor generation, however, converting a loop-based calculation into a vectorized one often requires considerable skill.

In a column on Zhihu, we showed how to use vectorized operations in DolphinDB. There, the formula for calculating buy and sell signals is as follows:

direction = (iif(signal>t1, 1h, iif(signal<t10, 0h, 00h)) - iif(signal<t2, 1h, iif(signal>t20, 0h, 00h))).ffill().nullFill(0h)

A user who is new to DolphinDB needs to understand the iif function to write the statement above. It is easier to rewrite it with a for loop:

@jit
def calculate_with_jit(signal, n, t1, t10, t20, t2) {
  cur = 0
  idx = 0
  output = array(INT, n, n)
  for (s in signal) {
    if(s > t1) {           // (t1, inf)
      cur = 1
    } else if(s >= t10) {  // [t10, t1]
      if(cur == -1) cur = 0
    } else if(s > t20) {   // [t20, t10)
      cur = 0
    } else if(s >= t2) {   // [t2, t20]
      if(cur == 1) cur = 0
    } else {               // (-inf, t2)
      cur = -1
    }
    output[idx] = cur
    idx += 1
  }
  return output
}

Removing @jit yields a user-defined function calculate_without_jit that does not use JIT. Compare the time consumed by the three methods:

n = 10000000
t1= 60
t10 = 50
t20 = 30
t2 = 20
signal = rand(100.0, n)

timer(100) (iif(signal >t1, 1h, iif(signal < t10, 0h, 00h)) - iif(signal <t2, 1h, iif(signal > t20, 0h, 00h))).ffill().nullFill(0h) // 41092.019 ms
timer(100) calculate_with_jit(signal, size(signal), t1, t10, t20, t2)                 //    17075.127 ms
timer(100) calculate_without_jit(signal, size(signal), t1, t10, t20, t2)               //  1404406.413 ms

In this example, the JIT version is about 2.4 times as fast as the vectorized operation and about 82 times as fast as the non-JIT version. JIT is faster than vectorization here because the vectorized expression calls DolphinDB built-in functions many times and produces many intermediate results, which involves multiple memory allocations and virtual function calls; the code generated by JIT has none of this extra overhead.

In other cases, vectorization simply cannot be used. For example, calculating the implied volatility of an option requires Newton's method, which cannot be vectorized. If a certain level of real-time performance is needed in such cases, you can either write a DolphinDB plug-in or use JIT. The difference is that plug-ins can be used in any scenario but must be written in C++, which is more complicated; JIT is easier to write but applies to more limited scenarios. The running speed of JIT is very close to that of a C++ plug-in.

3 How to use JIT in DolphinDB

3.1 How to use

DolphinDB currently supports JIT only for user-defined functions. Simply add the @jit marker on the line before the function definition:

@jit
def myFunc(/* arguments */) {
  /* implementation */
}

When the user calls this function, DolphinDB compiles the function's code into machine code on the fly and executes it.

3.2 Supported statements

DolphinDB currently supports the following statements in JIT:

  • Assignment statements, for example:
@jit
def func() {
  y = 1
}

Please note that multiple assignment is currently not supported. For example:

@jit
def func() {
  a, b = 1, 2
}
func()

Running the above statement will throw an exception.

  • return statement, for example:
@jit
def func() {
  return 1
}
  • if-else statement, such as:
@jit
def myAbs(x) {
  if(x > 0) return x
  else return -x
}
  • do-while statement, for example:
@jit
def mySqrt(x) {
    diff = 0.0000001
    guess = 1.0
    guess = (x / guess + guess) / 2.0
    do {
        guess = (x / guess + guess) / 2.0
    } while(abs(guess * guess - x) >= diff)
    return guess
}
  • for statement, for example:
@jit
def mySum(vec) {
  s = 0
  for(i in vec) {
    s += i
  }
  return s
}

DolphinDB supports arbitrary nesting of the above statements in JIT.

3.3 Supported operators and functions

DolphinDB currently supports the following operators in JIT: add(+), sub(-), multiply(*), divide(/), and(&&), or(||), bitand(&), bitor(|), bitxor(^), eq(==), neq(!=), ge(>=), gt(>), le(<=), lt(<), neg(-), mod(%), seq(..), at([]). For all data types, these operators behave the same as in the non-JIT implementation.
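
As a small illustration (sumEven is our own example name), the following JIT function combines several of these operators — mod (%), eq (==), add (+=), and the sequence operator (..) — inside nested for and if statements:

@jit
def sumEven(v) {
  s = 0
  for (x in v) {
    if ((x % 2) == 0) {
      s += x
    }
  }
  return s
}

sumEven(1..10)    // expected result: 30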

DolphinDB currently supports the following mathematical functions in JIT: exp, log, sin, asin, cos, acos, tan, atan, abs, ceil, floor, sqrt. When one of these functions appears in a JIT function and its argument is a scalar, the generated machine code calls the corresponding function in glibc or an optimized C implementation; when the argument is an array, DolphinDB's built-in mathematical functions are called. The advantage of this approach is that calling C implementations directly improves efficiency and reduces unnecessary virtual function calls and memory allocations.
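
For example, in the sketch below (the function names are ours), sqrt receives a scalar in the first function and an array in the second; the first is expected to compile to a direct call to the C implementation, while the second falls back to DolphinDB's built-in vectorized sqrt:

@jit
def hypotenuse(a, b) {
  return sqrt(a*a + b*b)      // scalar argument: the C implementation is called directly
}

@jit
def rootAll(v) {
  return sqrt(v)              // array argument: DolphinDB's built-in sqrt is called
}

hypotenuse(3.0, 4.0)    // 5
rootAll(1.0 4.0 9.0)    // [1, 2, 3]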

DolphinDB currently supports the following built-in functions in JIT: take, array, size, isValid, rand, and cdfNormal.

Note that the first parameter of the array function must specify a concrete data type directly; it cannot be passed in through a variable. This is because the types of all variables must be known when JIT compilation takes place, and the type of the result returned by array is determined by its first parameter, so that value must be known at compile time.
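
A minimal sketch of the valid pattern (makeBuffer is our own illustrative name); the invalid variant, in which the type is passed in through a parameter, is shown only as a comment:

@jit
def makeBuffer(n) {
  buf = array(DOUBLE, n, n)   // OK: DOUBLE is written literally, so the type is known at compile time
  return buf
}

makeBuffer(5)

// Not allowed: passing the element type through a variable, e.g.
//   def makeBufferBad(t, n) { return array(t, n, n) }
// fails because t is not known when the function is JIT-compiled.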

3.4 Handling of null values

All functions and operators in JIT handle NULL values in the same way as their native counterparts: each data type uses the minimum value of that type to represent its NULL value, and users do not need to handle NULL values specially.
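
For instance, in the sketch below (addOne is our own example), adding 1 to a NULL element yields NULL, exactly as the native + operator does, without any special handling in the function body:

@jit
def addOne(v) {
  n = size(v)
  r = array(INT, n, n)
  i = 0
  do {
    r[i] = v[i] + 1     // NULL + 1 remains NULL, same as the native + operator
    i += 1
  } while(i < n)
  return r
}

addOne([1, NULL, 3])    // expected result: [2, NULL, 4]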

3.5 Calls between JIT functions

A JIT function in DolphinDB can call another JIT function. For example:

@jit
def myfunc1(x) {
  return sqrt(x) + exp(x)
}

@jit
def myfunc2(x) {
  return myfunc1(x)
}

myfunc2(1.5)

In the above example, when myfunc2 is compiled, myfunc1 is compiled first, producing a native function with the signature double myfunc1(double). The machine code generated for myfunc2 calls this native function directly, rather than checking at runtime whether myfunc1 is a JIT function before executing it, which achieves the highest execution efficiency.

Please note that non-JIT user-defined functions cannot be called inside JIT functions, because type inference cannot be performed for them. Type inference is discussed below.
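
A short sketch of what happens (doubleIt and callsPlain are our own illustrative names): the inner function below is an ordinary user-defined function without @jit, so calling the outer JIT function is expected to throw an exception.

def doubleIt(x) {         // ordinary user-defined function, no @jit
  return x * 2
}

@jit
def callsPlain(x) {
  return doubleIt(x) + 1  // not allowed: the return type of doubleIt cannot be inferred
}

callsPlain(3)             // expected to throw an exception at the call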

3.6 JIT compilation cost and caching mechanism

DolphinDB's JIT is implemented on top of LLVM. Each user-defined function generates its own module when compiled, independent of all others. Compilation mainly includes the following steps:

  1. Initialization of LLVM related variables and environment
  2. Generate LLVM IR based on the syntax tree of the DolphinDB script
  3. Call LLVM to optimize the IR generated in the second step, and then compile it into machine code

The first step generally takes less than 5 ms, and the time taken by the other two steps is proportional to the complexity of the script. Overall, compilation time is usually under 50 ms.

For a given JIT function and combination of parameter types, DolphinDB compiles only once and caches the result. When a user calls a JIT function, the system derives a string from the data types of the provided arguments and looks up the corresponding compilation result in a hash table. If it exists, the compiled code is called directly; if not, the function is compiled, the result is stored in the hash table, and then executed.

For tasks that need to be executed repeatedly, or whose running time far exceeds the compilation time, JIT significantly increases running speed.
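
The following sketch (scale is our own example; actual timings depend on the machine, so none are shown) illustrates the caching behaviour: the first call with a given combination of parameter types includes compilation, later calls with the same types reuse the cached machine code, and a new type combination triggers a fresh compilation.

@jit
def scale(v, k) {
  n = size(v)
  r = array(DOUBLE, n, n)
  i = 0
  do {
    r[i] = v[i] * k
    i += 1
  } while(i < n)
  return r
}

v = rand(1.0, 1000000)
timer(1) scale(v, 2.0)   // first call with (DOUBLE vector, DOUBLE scalar): compiles, then runs
timer(1) scale(v, 2.0)   // same type combination: reuses the cached machine code
timer(1) scale(v, 2)     // (DOUBLE vector, INT scalar): new combination, compiled and cached again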

3.7 Limitations

At present, the applicable scenarios of JIT in DolphinDB are still relatively limited:

  1. JIT only supports user-defined functions.
  2. Only scalar and array parameters are accepted; other types such as table, dict, pair, string, and symbol are not currently supported (see the sketch after this list).
  3. Subarray is not accepted as a parameter.
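
As a quick sketch of limitation 2 (firstElement is our own example), passing an unsupported container type such as a dict or a table is expected to throw an exception, just like the unsupported types shown in the type-inference examples below:

@jit
def firstElement(x) {
  return x[0]
}

firstElement(3 1 2)                  // works: array parameter
firstElement(dict(1 2 3, 4 5 6))     // expected to throw: dict parameters are not supported
firstElement(table(1 2 3 as id))     // expected to throw: table parameters are not supported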

4 Type inference

Before LLVM can be used to generate IR, the types of all variables in the script must be known. This step is type inference. DolphinDB's JIT uses partial type inference, for example:

@jit
def foo() {
  x = 1
  y = 1.1
  z = x + y
  return z
}

From x = 1 we determine that the type of x is int; from y = 1.1 that the type of y is double; from z = x + y and the types of x and y derived above that the type of z is also double; and from return z that the return type of foo is double.

If the function has parameters, such as:

@jit
def foo(x) {
  return x + 1
}

The return type of foo depends on the type of the input value x.

We have listed the data types currently supported by JIT above. If an unsupported type appears inside a function, or an argument of an unsupported type is passed in, type inference fails for the entire function and an exception is thrown at runtime. For example:

@jit
def foo(x) {
  return x + 1
}

foo(123)             // executes normally
foo("abc")           // throws an exception because STRING is not yet supported
foo(1:2)             // throws an exception because pair is not yet supported
foo((1 2, 3 4, 5 6)) // throws an exception because tuple is not yet supported

@jit
def foo(x) {
  y = cumprod(x)
  z = y + 1
  return z
}

foo(1..10)             // throws an exception because cumprod is not yet supported: its return type is unknown, so type inference fails

Therefore, to use JIT functions properly, users should avoid unsupported types such as tuple or string in function bodies or parameters, and should not use functions that are not yet supported.

5 Examples

5.1 Calculating implied volatility

As mentioned above, some calculations cannot be vectorized. The calculation of implied volatility is an example:

@jit
def GBlackScholes(future_price, strike, input_ttm, risk_rate, b_rate, input_vol, is_call) {
  ttm = input_ttm + 0.000000000000001;
  vol = input_vol + 0.000000000000001;

  d1 = (log(future_price/strike) + (b_rate + vol*vol/2) * ttm) / (vol * sqrt(ttm));
  d2 = d1 - vol * sqrt(ttm);

  if (is_call) {
    return future_price * exp((b_rate - risk_rate) * ttm) * cdfNormal(0, 1, d1) - strike * exp(-risk_rate*ttm) * cdfNormal(0, 1, d2);
  } else {
    return strike * exp(-risk_rate*ttm) * cdfNormal(0, 1, -d2) - future_price * exp((b_rate - risk_rate) * ttm) * cdfNormal(0, 1, -d1);
  }
}

@jit
def ImpliedVolatility(future_price, strike, ttm, risk_rate, b_rate, option_price, is_call) {
  high=5.0;
  low = 0.0;

  do {
    if (GBlackScholes(future_price, strike, ttm, risk_rate, b_rate, (high+low)/2, is_call) > option_price) {
      high = (high+low)/2;
    } else {
      low = (high + low) /2;
    }
  } while ((high-low) > 0.00001);

  return (high + low) /2;
}

@jit
def test_jit(future_price, strike, ttm, risk_rate, b_rate, option_price, is_call) {
	n = size(future_price)
	ret = array(DOUBLE, n, n)
	i = 0
	do {
		ret[i] = ImpliedVolatility(future_price[i], strike[i], ttm[i], risk_rate[i], b_rate[i], option_price[i], is_call[i])
		i += 1
	} while(i < n)
	return ret
}

n = 100000
future_price=take(rand(10.0,1)[0], n)
strike_price=take(rand(10.0,1)[0], n)
strike=take(rand(10.0,1)[0], n)
input_ttm=take(rand(10.0,1)[0], n)
risk_rate=take(rand(10.0,1)[0], n)
b_rate=take(rand(10.0,1)[0], n)
vol=take(rand(10.0,1)[0], n)
input_vol=take(rand(10.0,1)[0], n)
multi=take(rand(10.0,1)[0], n)
is_call=take(rand(10.0,1)[0], n)
ttm=take(rand(10.0,1)[0], n)
option_price=take(rand(10.0,1)[0], n)

timer(10) test_jit(future_price, strike, ttm, risk_rate, b_rate, option_price, is_call)          //  2621.73 ms
timer(10) test_non_jit(future_price, strike, ttm, risk_rate, b_rate, option_price, is_call)      //   302714.74 ms

In the above example, ImpliedVolatility calls GBlackScholes. The function test_non_jit is obtained by removing @jit from the definitions used by test_jit. The JIT version test_jit runs about 115 times faster than the non-JIT version test_non_jit.

5.2 Calculating Greeks

Greeks are often used in quantitative finance for risk assessment. The following uses Charm as an example to demonstrate the use of JIT:

@jit
def myMax(a,b){
	if(a>b){
		return a
	}else{
		return b
	}
}

@jit
def NormDist(x) {
  return cdfNormal(0, 1, x);
}

@jit
def ND(x) {
  return (1.0/sqrt(2*pi)) * exp(-(x*x)/2.0)
}

@jit
def CalculateCharm(future_price, strike_price, input_ttm, risk_rate, b_rate, vol, multi, is_call) {
  day_year = 245.0;

  d1 = (log(future_price/strike_price) + (b_rate + (vol*vol)/2.0) * input_ttm) / (myMax(vol,0.00001) * sqrt(input_ttm));
  d2 = d1 - vol * sqrt(input_ttm);

  if (is_call) {
    return -exp((b_rate - risk_rate) * input_ttm) * (ND(d1) * (b_rate/vol/sqrt(input_ttm) - d2/2.0/input_ttm) + (b_rate-risk_rate) * NormDist(d1)) * future_price * multi / day_year;
  } else {
    return -exp((b_rate - risk_rate) * input_ttm) * (ND(d1) * (b_rate/vol/sqrt(input_ttm) - d2/2.0/input_ttm) - (b_rate-risk_rate) * NormDist(-d1)) * future_price * multi / day_year;
  }
}

@jit
def test_jit(future_price, strike_price, input_ttm, risk_rate, b_rate, vol, multi, is_call) {
	n = size(future_price)
	ret = array(DOUBLE, n, n)
	i = 0
	do {
		ret[i] = CalculateCharm(future_price[i], strike_price[i], input_ttm[i], risk_rate[i], b_rate[i], vol[i], multi[i], is_call[i])
		i += 1
	} while(i < n)
	return ret
}


def ND_validate(x) {
  return (1.0/sqrt(2*pi)) * exp(-(x*x)/2.0)
}

def NormDist_validate(x) {
  return cdfNormal(0, 1, x);
}

def CalculateCharm_vectorized(future_price, strike_price, input_ttm, risk_rate, b_rate, vol, multi, is_call) {
	day_year = 245.0;

	d1 = (log(future_price/strike_price) + (b_rate + pow(vol, 2)/2.0) * input_ttm) / (max(vol, 0.00001) * sqrt(input_ttm));
	d2 = d1 - vol * sqrt(input_ttm);
	return iif(is_call,-exp((b_rate - risk_rate) * input_ttm) * (ND_validate(d1) * (b_rate/vol/sqrt(input_ttm) - d2/2.0/input_ttm) + (b_rate-risk_rate) * NormDist_validate(d1)) * future_price * multi / day_year,-exp((b_rate - risk_rate) * input_ttm) * (ND_validate(d1) * (b_rate/vol/sqrt(input_ttm) - d2/2.0/input_ttm) - (b_rate-risk_rate) * NormDist_validate(-d1)) * future_price * multi / day_year)
}

n = 1000000
future_price=rand(10.0,n)
strike_price=rand(10.0,n)
strike=rand(10.0,n)
input_ttm=rand(10.0,n)
risk_rate=rand(10.0,n)
b_rate=rand(10.0,n)
vol=rand(10.0,n)
input_vol=rand(10.0,n)
multi=rand(10.0,n)
is_call=rand(true false,n)
ttm=rand(10.0,n)
option_price=rand(10.0,n)

timer(10) test_jit(future_price, strike_price, input_ttm, risk_rate, b_rate, vol, multi, is_call)                     //   1834.342 ms
timer(10) test_none_jit(future_price, strike_price, input_ttm, risk_rate, b_rate, vol, multi, is_call)                // 224099.805 ms
timer(10) CalculateCharm_vectorized(future_price, strike_price, input_ttm, risk_rate, b_rate, vol, multi, is_call)    //   3117.761 ms

The above is a more complex example, involving more function calls and more involved calculations. The JIT version is about 121 times faster than the non-JIT version and about 70% faster than the vectorized version.

5.3 Calculating stop loss (stoploss)

In this Zhihu column, we showed how to use DolphinDB for technical-signal backtesting. Below, we implement the stop-loss function with JIT:

@jit
def stoploss_JIT(ret, threshold) {
	n = ret.size()
	i = 0
	curRet = 1.0
	curMaxRet = 1.0
	indicator = take(true, n)

	do {
		indicator[i] = false
		curRet *= (1 + ret[i])
		if(curRet > curMaxRet) { curMaxRet = curRet }
		drawDown = 1 - curRet / curMaxRet;
		if(drawDown >= threshold) {
			i = n // break is not supported for now
		}
		i += 1
	} while(i < n)

	return indicator
}

def stoploss_no_JIT(ret, threshold) {
	n = ret.size()
	i = 0
	curRet = 1.0
	curMaxRet = 1.0
	indicator = take(true, n)

	do {
		indicator[i] = false
		curRet *= (1 + ret[i])
		if(curRet > curMaxRet) { curMaxRet = curRet }
		drawDown = 1 - curRet / curMaxRet;
		if(drawDown >= threshold) {
			i = n // break is not supported for now
		}
		i += 1
	} while(i < n)

	return indicator
}

def stoploss_vectorization(ret, threshold){
	cumret = cumprod(1+ret)
 	drawDown = 1 - cumret / cumret.cummax()
	firstCutIndex = at(drawDown >= threshold).first() + 1
	indicator = take(false, ret.size())
	if(isValid(firstCutIndex) and firstCutIndex < ret.size())
		indicator[firstCutIndex:] = true
	return indicator
}
ret = take(0.0008 -0.0008, 1000000)
threshold = 0.10
timer(10) stoploss_JIT(ret, threshold)              //      58.674 ms
timer(10) stoploss_no_JIT(ret, threshold)           //   14622.142 ms
timer(10) stoploss_vectorization(ret, threshold)    //     151.884 ms

The stop-loss function only needs to find the first day on which the drawdown exceeds the threshold; it does not need to compute the full cumprod and cummax series. As a result, the JIT version is about 2.5 times as fast as the vectorized version and about 250 times as fast as the non-JIT version.

If the stop loss is not triggered until the last day of the data, the JIT version is about as fast as the vectorized version, but still much faster than the non-JIT version.

6 Future plans

In subsequent versions, we plan to gradually support the following features:

  1. Support break and continue in for, do-while statements.
  2. Support data structures such as dictionary and data types such as string.
  3. Support more mathematical and statistical functions.
  4. Enhanced type inference function, able to identify more data types returned by DolphinDB built-in functions.
  5. Support declaring data types for input parameters, return values, and local variables in user-defined functions.

7 Summary

DolphinDB now supports just-in-time compilation and execution of user-defined functions, which significantly improves the running speed of for loops, while loops, and if-else statements. It is especially suitable for scenarios where vectorized operations cannot be used but extremely high speed is required, such as high-frequency factor calculation and real-time streaming data processing.


Source: blog.csdn.net/qq_41996852/article/details/112002998