How to quickly build a factor platform based on high-frequency data

In factor investment research and production, stateful, complex indicators often have to be derived from a large number of factors, for example real-time K-lines, MACD, and RSI.

Suppose 1,000 factors have to be calculated, each with its own implementation logic and its own configuration, such as window-closing signals and calculation window boundaries. Building a separate stream processing pipeline for each factor and calculating the same intermediate variables over and over is bound to be very inefficient. Is there a way to standardize the stream calculation of a large number of factors and manage it as an engineering process?

DolphinDB has launched a prototype of a stream-batch integrated factor calculation platform that is convenient, fast, scalable, and compatible. It calculates minute-level factors from snapshot data and further processes those minute factors into complex factors. Users can follow the provided scripts and deployment tutorial to set it up and debug it quickly.

With this stream-batch integrated factor calculation platform, business users do not need to understand the underlying architecture of the DolphinDB stream computing framework. They only need to write function expressions that follow the business logic of each factor and then call the platform's calculation interface to complete the factor calculation.

At the same time, developers no longer need to rewrite code when moving from research to production. One system and one script serve both, with seamless switching between them, which greatly reduces development and operations costs and improves the efficiency of the whole factor production process.
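As a rough illustration of why recalculating the same intermediate variable for every factor is wasteful, the Python sketch below writes factors as plain functions and lets a small cache build each shared intermediate only once. It is not DolphinDB code and does not reflect the platform's internals; the names `Intermediates`, `build_returns`, and the three factor functions are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical minute closing prices, used only for illustration.
closes: List[float] = [10.0, 10.2, 10.1, 10.4, 10.3]

# Counts how many times the shared intermediate is actually built.
calls = {"returns": 0}


class Intermediates:
    """Builds each intermediate series on first request and caches it."""

    def __init__(self, builders: Dict[str, Callable[[], List[float]]]):
        self._builders = builders
        self._cache: Dict[str, List[float]] = {}

    def get(self, name: str) -> List[float]:
        if name not in self._cache:
            self._cache[name] = self._builders[name]()
        return self._cache[name]


def build_returns() -> List[float]:
    """Simple bar-to-bar returns; an intermediate shared by several factors."""
    calls["returns"] += 1
    return [b / a - 1.0 for a, b in zip(closes, closes[1:])]


# Three hypothetical factors that all depend on the same intermediate.
def mean_return(im: Intermediates) -> float:
    r = im.get("returns")
    return sum(r) / len(r)


def max_return(im: Intermediates) -> float:
    return max(im.get("returns"))


def up_ratio(im: Intermediates) -> float:
    r = im.get("returns")
    return sum(1 for x in r if x > 0) / len(r)


factors = {"mean_return": mean_return, "max_return": max_return, "up_ratio": up_ratio}
im = Intermediates({"returns": build_returns})
results = {name: func(im) for name, func in factors.items()}
```

The factor author writes only the three small functions; the returns series is built a single time no matter how many factors request it.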

Architecture and Functionality of Factor Calculation Platform

The architecture of the stream-batch integrated factor calculation platform for Level 2 snapshot data is shown in the figure below:

[Figure: architecture of the factor calculation platform]

It mainly includes the following functional modules:

  • Low-latency real-time data access: an API interface for writing real-time data, real-time market data plug-ins, and message middleware subscription plug-ins;

  • Historical data replay: single-table and multi-table replay in strict time order, which replays historical data stored in the DolphinDB database as a stream;

  • Stream computing engines: rolling-window aggregation over snapshot data uses the time-series aggregation engine, and the further processing into complex factors relies on the reactive state engine;

  • Integrated development environment: the DolphinDB GUI and the DolphinDB VS Code extension are used to develop and debug factor expression code, while task scheduling and job execution are handled through the API;

  • Low-latency publishing to message buses: the platform connects to various message queue middleware and pushes real-time calculation results to Kafka, ZeroMQ, RabbitMQ, MQTT, etc.
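The two-stage stream computation in the list above can be pictured with a minimal Python sketch: a tumbling one-minute window turns snapshots into minute bars, and a stateful step then updates a per-symbol indicator on each closed bar. This is only an analogy for the time-series aggregation engine and the reactive state engine, not their implementation. The class names, the choice of an exponential moving average as the stateful factor, and the rule that a window closes when a snapshot from a later minute arrives are all assumptions made for illustration; out-of-order data is not handled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Bar:
    symbol: str
    minute: int  # window start, in seconds since midnight
    open: float
    high: float
    low: float
    close: float


class MinuteBarAggregator:
    """Stage 1: tumbling 60-second window per symbol."""

    def __init__(self) -> None:
        self._open_bars: Dict[str, Bar] = {}

    def on_snapshot(self, symbol: str, ts: int, price: float) -> Optional[Bar]:
        minute = ts - ts % 60
        bar = self._open_bars.get(symbol)
        if bar is not None and bar.minute == minute:
            # Same window: update the running aggregates.
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price
            return None
        # A snapshot from a new minute closes the previous window (if any).
        self._open_bars[symbol] = Bar(symbol, minute, price, price, price, price)
        return bar


class EmaState:
    """Stage 2: per-symbol state that is updated once per closed bar."""

    def __init__(self, span: int) -> None:
        self._alpha = 2.0 / (span + 1)
        self._value: Dict[str, float] = {}

    def on_bar(self, bar: Bar) -> float:
        prev = self._value.get(bar.symbol)
        if prev is None:
            ema = bar.close
        else:
            ema = self._alpha * bar.close + (1.0 - self._alpha) * prev
        self._value[bar.symbol] = ema
        return ema


# Hypothetical snapshots: (symbol, seconds since midnight, last price).
snapshots = [
    ("A", 34200, 10.0), ("A", 34203, 10.2), ("A", 34257, 10.1),
    ("A", 34260, 10.4), ("A", 34290, 10.3), ("A", 34320, 10.5),
]

aggregator = MinuteBarAggregator()
ema = EmaState(span=3)
bars: List[Bar] = []
outputs: List[Tuple[str, int, float]] = []

for symbol, ts, price in snapshots:
    closed = aggregator.on_snapshot(symbol, ts, price)
    if closed is not None:
        bars.append(closed)
        outputs.append((closed.symbol, closed.minute, ema.on_bar(closed)))
```

The bar for the last minute stays open because no later snapshot has arrived to close it, which is why window-closing rules are part of each factor's configuration.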

How to use the factor calculation platform?

After the DolphinDB-based factor calculation platform has been deployed according to the tutorial, debugging during the factor development stage, which runs on historical data, proceeds as follows:

[Figure: factor development based on historical data]

Factor developers only need to write the function expression for a factor in the integrated development environment provided by DolphinDB and then call the calculation interface of the factor calculation platform to debug it. If the factor expression conforms to DolphinDB syntax, it executes successfully and the calculation result is returned. If it does not, an error is reported and execution is interrupted.
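Debugging at this stage runs on historical data replayed as a stream. To give a feel for what time-ordered multi-table replay means, the Python sketch below merges two already-sorted tables into one stream by timestamp using the standard library's `heapq.merge`. It is not DolphinDB's replay implementation; the two tables, the row layout, and the `replay` helper are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Row layout (assumed): (timestamp in seconds, source table, symbol, value).
Row = Tuple[int, str, str, float]

# Two hypothetical historical tables, each already sorted by timestamp.
snapshots: List[Row] = [
    (34200, "snapshot", "A", 10.00),
    (34203, "snapshot", "A", 10.20),
    (34206, "snapshot", "A", 10.10),
]
trades: List[Row] = [
    (34201, "trade", "A", 10.10),
    (34203, "trade", "A", 10.20),
    (34205, "trade", "A", 10.15),
]


def replay(*tables: Iterable[Row]) -> Iterator[Row]:
    """Yield rows from several time-sorted tables as one time-ordered stream."""
    # heapq.merge is lazy and stable: on equal timestamps, rows from the
    # table passed first are yielded first.
    return heapq.merge(*tables, key=lambda row: row[0])


stream = list(replay(snapshots, trades))
timestamps = [row[0] for row in stream]
```

Because the merge is lazy, a factor under test can consume rows one at a time, much as it would consume a live feed.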

After a certain number of factors have been developed, real-time computing services need to be deployed in the production environment. The deployment process is as follows:

[Figure: production environment deployment based on real-time data]

Factor developers only need to call the packaged function that starts the real-time factor calculation service from a client to complete the deployment. After it runs, the DolphinDB server exposes the input of the stream computing service, which is a table object, and data can be written into it through the real-time data access tools provided by DolphinDB. The output of the stream computing service is created automatically at the same time. It is also a table object and stores the calculation results.
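The input-table and output-table arrangement can be mimicked in a few lines of Python. This is a conceptual sketch only, not the DolphinDB service or its API: `StreamTable`, `deploy_service`, and the `spread` factor are hypothetical names, and the computation runs synchronously inside `append`.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


class StreamTable:
    """A minimal append-only table that notifies subscribers of new rows."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self._subscribers: List[Callable[[Row], None]] = []

    def subscribe(self, handler: Callable[[Row], None]) -> None:
        self._subscribers.append(handler)

    def append(self, row: Row) -> None:
        self.rows.append(row)
        for handler in self._subscribers:
            handler(row)


def deploy_service(factor: Callable[[Row], float]) -> Tuple[StreamTable, StreamTable]:
    """Create the input table, wire the factor to it, and create the output table."""
    input_table = StreamTable()
    output_table = StreamTable()

    def handler(row: Row) -> None:
        output_table.append(
            {"symbol": row["symbol"], "time": row["time"], "value": factor(row)}
        )

    input_table.subscribe(handler)
    return input_table, output_table


def spread(row: Row) -> float:
    """Hypothetical factor: quoted bid-ask spread."""
    return float(row["ask"]) - float(row["bid"])


input_table, output_table = deploy_service(spread)
# A data access tool would write here; one hand-made row is used instead.
input_table.append({"symbol": "A", "time": 34200, "bid": 10.0, "ask": 10.5})
```

After the single append, `output_table.rows` holds one result row for symbol "A".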

In short:

  • Write the factor function;

  • Generate a JSON configuration file;

  • Submit the new JSON file.

You're done!
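The article does not show the platform's JSON schema, so the following Python sketch only illustrates the general pattern of driving factor calculation from a configuration file. Every field name (`factors`, `name`, `inputs`, `windowSeconds`) and the `run_from_config` helper are assumptions and are not taken from DolphinDB's scripts.

```python
import json
from typing import Callable, Dict


# Step 1: write the factor function (hypothetical example).
def minute_return(open_price: float, close_price: float) -> float:
    return close_price / open_price - 1.0


FACTOR_FUNCTIONS: Dict[str, Callable[..., float]] = {"minute_return": minute_return}

# Step 2: generate a JSON configuration (the schema here is assumed).
config = {
    "factors": [
        {"name": "minute_return", "inputs": ["open", "close"], "windowSeconds": 60}
    ]
}
config_text = json.dumps(config, indent=2)


# Step 3: submit the configuration to a runner that looks up each factor
# by name and feeds it the configured input columns.
def run_from_config(text: str, bar: Dict[str, float]) -> Dict[str, float]:
    cfg = json.loads(text)
    results: Dict[str, float] = {}
    for spec in cfg["factors"]:
        func = FACTOR_FUNCTIONS[spec["name"]]
        args = [bar[column] for column in spec["inputs"]]
        results[spec["name"]] = func(*args)
    return results


result = run_from_config(config_text, {"open": 10.0, "close": 10.1})
```

Keeping the factor list in a configuration file means that adding a factor changes only the function registry and the JSON, not the runner.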

It is worth noting that this release of the factor calculation platform covers only minute-frequency factor calculation. DolphinDB's computing capabilities are not limited to this, and factor calculation at snapshot frequency, 1-second frequency, and even higher frequencies will be released in succession. Stay tuned for the best-practice tutorials on building the platform!

Origin blog.csdn.net/m0_56236921/article/details/132401343