Easily solve data indexing latency issues

What is data indexing latency?


Data indexing latency is the time between when a piece of data is generated and when it is finally indexed in Yanhuang Data Platform.


If newly generated data cannot be searched in Yanhuang Data Platform promptly, data indexing latency may be the cause.
 

Under what circumstances does data indexing delay occur?

● Data is generated faster than it is transmitted. For example, a third-party data collector sends data more slowly than the data is produced, so the data does not reach Yanhuang Data Platform in time and indexing is delayed.


● Data is generated faster than Yanhuang Data Platform can index it. For example, when a large amount of data is suddenly generated within a short period and the platform's import performance cannot keep up, the data is not consumed in time and a delay occurs.


● The Yanhuang Data Platform service goes down. Data produced during the outage is indexed only after the service is restored, so its indexing is delayed.


● Network latency.


● The data sent was not generated in real time, such as historical log archives.


● Timestamp parsing errors, for example when the timestamp in the data is not parsed correctly.
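
To illustrate the last cause, here is a minimal Python sketch of how a mis-parsed timestamp can look like a large indexing delay even when the data arrives promptly. The scenario is an assumption chosen for illustration (a log line written in UTC but parsed as UTC+8); the sample timestamp and variable names are hypothetical, and this is not how Yanhuang Data Platform itself parses time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical log timestamp. Assumption: the producer wrote it in UTC.
RAW = "2023-06-19 10:00:00"
FMT = "%Y-%m-%d %H:%M:%S"

naive = datetime.strptime(RAW, FMT)

# Correct parse: the text is interpreted as UTC, the zone it was written in.
correct_time = naive.replace(tzinfo=timezone.utc)

# Wrong parse: the same text is interpreted as UTC+8,
# which places the event 8 hours earlier than it really happened.
wrong_time = naive.replace(tzinfo=timezone(timedelta(hours=8)))

# Assume the event is actually indexed 2 seconds after it was generated.
ingestion_time = correct_time + timedelta(seconds=2)

real_delay = (ingestion_time - correct_time).total_seconds()
apparent_delay = (ingestion_time - wrong_time).total_seconds()

print(f"real delay:     {real_delay:.0f} s")
print(f"apparent delay: {apparent_delay:.0f} s")
```

In this sketch the real delay is 2 seconds, but the wrongly parsed time makes it appear to be 8 hours and 2 seconds. A delay that sits close to a whole number of hours is therefore a hint to check time zone handling before tuning performance.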

How to tell if there is data indexing delay

After each piece of data is imported into Yanhuang Data Platform, it has two fields: _time and _ingestion_time.


● _time: the time parsed from the data; in most cases, the time at which the data was generated


● _ingestion_time: the time at which the data was indexed in Yanhuang Data Platform


The index delay is the actual indexing time (_ingestion_time) minus the generation time (_time).

Use the following query to view the data index delay time in a certain data set.
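
The calculation behind such a check can also be sketched outside the platform. The following is a minimal Python illustration, not Yanhuang Data Platform query syntax. It assumes the events have already been exported with _time and _ingestion_time as timezone-aware datetimes; the helper name index_delay_seconds and the sample values are hypothetical.

```python
from datetime import datetime, timezone
from statistics import mean


def index_delay_seconds(event):
    """Return _ingestion_time minus _time for one event, in seconds."""
    return (event["_ingestion_time"] - event["_time"]).total_seconds()


def utc(hour, minute, second):
    """Build a UTC datetime on a fixed, made-up sample day."""
    return datetime(2023, 6, 19, hour, minute, second, tzinfo=timezone.utc)


# Hypothetical sample events; the values are invented for illustration.
events = [
    {"_time": utc(10, 0, 0), "_ingestion_time": utc(10, 0, 2)},
    {"_time": utc(10, 0, 1), "_ingestion_time": utc(10, 0, 6)},
    {"_time": utc(10, 0, 2), "_ingestion_time": utc(10, 1, 7)},
]

delays = [index_delay_seconds(e) for e in events]

# Summarize the delay across the sample, as a per-data-set report might.
summary = {"min": min(delays), "avg": mean(delays), "max": max(delays)}
print(summary)
```

Looking at the minimum, average, and maximum together helps separate a constant offset (often a parsing problem) from occasional spikes (often bursts of data or an outage).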

How to reduce data indexing latency


● Rule out network problems, and use faster network equipment if needed
● Ensure that _time is extracted correctly
● Tune the data collector so that it sends data fast enough
● Tune Yanhuang Data Platform to achieve faster indexing
● Use a host with a higher hardware configuration to achieve faster indexing

Origin blog.csdn.net/Yhpdata888/article/details/131290136