Importing txt or JSON text into Elasticsearch

Some time ago I had to do this for work; now that I have some free time, I am writing it down.


Business requirement: restore data from local files into ES. The local files are large: about 10 GB of data after decompression.


Approach: to meet this requirement, I tried three methods in total.

  First, bulk: ES's built-in way to import local data in batches. The recommended size per file is about 10-15 MB, and a single file should not exceed about 200 MB (I am not certain of this figure; the actual cap on one request is the http.max_content_length setting, which defaults to 100 MB).

  Second, Logstash: another official Elastic product, which reads text data and feeds it into ES.

  Third, Java: using Spring Data Elasticsearch. This method combines a thread pool, a buffering queue, and Spring Data's wrapper around ES; I will add it in a later update.
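Since a 10 GB file is far above the recommended 10-15 MB per bulk request, it has to be split before it can be sent. A minimal Python sketch of that splitting, assuming the input is already in bulk format (two lines per document, as shown in the bulk section) and contains no blank lines. The names `chunk_pairs` and `write_chunks` and the 10 MB default are illustrative choices, not part of ES:

```python
from typing import Iterable, Iterator, List


def chunk_pairs(lines: Iterable[str], max_bytes: int = 10 * 1024 * 1024) -> Iterator[List[str]]:
    """Yield lists of lines made of whole (action, source) pairs.

    Each list stays at or under max_bytes (UTF-8) unless a single pair
    is larger than max_bytes, in which case that pair is yielded alone.
    """
    chunk: List[str] = []
    size = 0
    it = iter(lines)
    for action in it:
        source = next(it, None)
        if source is None:
            raise ValueError("odd number of lines: action line without a source line")
        # Normalise line endings so every line ends with exactly one "\n".
        pair = [action.rstrip("\r\n") + "\n", source.rstrip("\r\n") + "\n"]
        pair_size = sum(len(p.encode("utf-8")) for p in pair)
        if chunk and size + pair_size > max_bytes:
            yield chunk
            chunk, size = [], 0
        chunk.extend(pair)
        size += pair_size
    if chunk:
        yield chunk


def write_chunks(src_path: str, out_prefix: str, max_bytes: int = 10 * 1024 * 1024) -> int:
    """Split src_path into out_prefix0000.json, out_prefix0001.json, ...

    Returns the number of files written.
    """
    count = 0
    with open(src_path, encoding="utf-8") as src:
        for count, chunk in enumerate(chunk_pairs(src, max_bytes), start=1):
            name = f"{out_prefix}{count - 1:04d}.json"
            with open(name, "w", encoding="utf-8", newline="\n") as out:
                out.writelines(chunk)
    return count
```

Each resulting file can then be posted with its own bulk request. Splitting on pair boundaries matters because an action line separated from its source line makes the whole request invalid.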


First: using bulk (Win7 + ES 6.6.1 + JSON text)

1. Prepare JSON data in the correct format

ES is very strict about the format of the JSON text. Each document takes two lines: an action line whose metadata is nested under the "index" key, followed by the document itself on a single line. A valid file looks like this:

{"index":{"_index":"demo","_id":"0"}}
{"id":null,"dev_id":"1","rcv_time":1557303257,"date":null,"dname":null,"logtype":"1","pri":null,"mod":"pf","sa":null,"sport":null,"ttype":null,"da":null,"dport":null,"code":null,"proto":null,"policy":null,"duration":"0","rcvd":null,"sent":null,"fwlog":null,"dsp_msg":"包过滤日志","failmsg":null,"custom":null,"smac":null,"dmac":null,"type":null,"in_traffic":"52","out_traffic":"52","gen_time":"1557303257","src_ip":"710191296","dest_ip":"896426877","src_port":"51411","dest_port":"443","protocol_id":"1","action_id":"2","filter_policy_id":"0","sat_ip":"0","sat_port":"0","i_ip":"0","i_port":"0","insert_time":"0","p_ip":"0","p_port":"0","rulename_id":"3","min_id":"25955054","svm":null,"dvm":null,"repeat_num":null,"event_type_id":216001001,"event_level_id":1,"org_log":"devid=2 date=\"2019/05/08 16:14:17\" dname=venus logtype=1 pri=5 ver=0.3.0 rule_name=网关产品线 mod=pf sa=192.168.84.42 sport=51411 type=NULL da=125.99.110.53 dport=443 code=NULL proto=IPPROTO_TCP policy=allow duration=0 rcvd=52 sent=52 fwlog=0 dsp_msg=\"packet filtering log\"","stauts":"success","failMsg":null}
{"index":{"_index":"demo","_id":"1"}}
{"id":null,"dev_id":"1","rcv_time":1557303257,"date":null,"dname":null,"logtype":"1","pri":null,"mod":"pf","sa":null,"sport":null,"ttype":null,"da":null,"dport":null,"code":null,"proto":null,"policy":null,"duration":"0","rcvd":null,"sent":null,"fwlog":null,"dsp_msg":"包过滤日志","failmsg":null,"custom":null,"smac":null,"dmac":null,"type":null,"in_traffic":"52","out_traffic":"52","gen_time":"1557303257","src_ip":"710191296","dest_ip":"896426877","src_port":"51411","dest_port":"443","protocol_id":"1","action_id":"2","filter_policy_id":"0","sat_ip":"0","sat_port":"0","i_ip":"0","i_port":"0","insert_time":"0","p_ip":"0","p_port":"0","rulename_id":"3","min_id":"25955054","svm":null,"dvm":null,"repeat_num":null,"event_type_id":216001001,"event_level_id":1,"org_log":"devid=2 date=\"2019/05/08 16:14:17\" dname=venus logtype=1 pri=5 ver=0.3.0 rule_name=网关产品线 mod=pf sa=192.168.84.42 sport=51411 type=NULL da=125.99.110.53 dport=443 code=NULL proto=IPPROTO_TCP policy=allow duration=0 rcvd=52 sent=52 fwlog=0 dsp_msg=\"packet filtering log\"","stauts":"success","failMsg":null}
{"index":{"_index":"demo","_id":"2"}}
{"id":null,"dev_id":"1","rcv_time":1557303257,"date":null,"dname":null,"logtype":"1","pri":null,"mod":"pf","sa":null,"sport":null,"ttype":null,"da":null,"dport":null,"code":null,"proto":null,"policy":null,"duration":"0","rcvd":null,"sent":null,"fwlog":null,"dsp_msg":"包过滤日志","failmsg":null,"custom":null,"smac":null,"dmac":null,"type":null,"in_traffic":"52","out_traffic":"52","gen_time":"1557303257","src_ip":"710191296","dest_ip":"896426877","src_port":"51411","dest_port":"443","protocol_id":"1","action_id":"2","filter_policy_id":"0","sat_ip":"0","sat_port":"0","i_ip":"0","i_port":"0","insert_time":"0","p_ip":"0","p_port":"0","rulename_id":"3","min_id":"25955054","svm":null,"dvm":null,"repeat_num":null,"event_type_id":216001001,"event_level_id":1,"org_log":"devid=2 date=\"2019/05/08 16:14:17\" dname=venus logtype=1 pri=5 ver=0.3.0 rule_name=网关产品线 mod=pf sa=192.168.84.42 sport=51411 type=NULL da=125.99.110.53 dport=443 code=NULL proto=IPPROTO_TCP policy=allow duration=0 rcvd=52 sent=52 fwlog=0 dsp_msg=\"packet filtering log\"","stauts":"success","failMsg":null}

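The official format requires that neither the action line nor the document contain a line break inside it, and that the file end with a newline after the last document.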
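To make the two-line layout concrete, here is a small Python sketch that turns plain records into bulk lines, with the action metadata nested under the `index` key as the bulk API expects. It assumes sequential ids starting at 0 and the index name `demo` as in the sample above, and that the type is supplied by the request URL (as in the curl command in the next step). `to_bulk_lines` is a hypothetical helper name:

```python
import json
from typing import Any, Dict, Iterable, Iterator


def to_bulk_lines(docs: Iterable[Dict[str, Any]], index: str, start_id: int = 0) -> Iterator[str]:
    """Yield an action line and a source line for every document.

    json.dumps escapes any newline inside a value, so each document
    always ends up on exactly one physical line.
    """
    for offset, doc in enumerate(docs):
        action = {"index": {"_index": index, "_id": str(start_id + offset)}}
        yield json.dumps(action, separators=(",", ":")) + "\n"
        # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes.
        yield json.dumps(doc, ensure_ascii=False, separators=(",", ":")) + "\n"
```

Writing the yielded lines to a file opened with `encoding="utf-8"` gives a file that already ends with the required trailing newline.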

2. Run it from cmd (if curl is missing or misbehaves on Windows, search online, e.g. on Baidu, for a curl download)

curl -H "Content-Type: application/json" -XPOST localhost:9200/index/mapping/_bulk --data-binary @xxx.json

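Note: when the cmd window suddenly starts scrolling with output, the request has gone through! To be sure every document was indexed, check that the response contains "errors":false.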
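Scrolling output only shows that ES answered. The bulk response is itself JSON, with a top-level `errors` flag and one entry per action under `items`, so a short Python sketch can list the documents that failed. `failed_items` is a hypothetical helper name, and it assumes the response was saved to a file or string first:

```python
import json
from typing import Any, Dict, List


def failed_items(response_text: str) -> List[Dict[str, Any]]:
    """Return one summary dict per failed action in a bulk response."""
    body = json.loads(response_text)
    if not body.get("errors"):
        return []  # every action succeeded
    failures = []
    for item in body.get("items", []):
        # Each item has a single key naming the action, e.g. "index".
        for op, result in item.items():
            if "error" in result:
                failures.append({
                    "op": op,
                    "id": result.get("_id"),
                    "status": result.get("status"),
                    "error": result["error"],
                })
    return failures
```

One way to use it is to redirect the curl output to a file (for example by appending `> response.json` to the command) and pass the file's contents to `failed_items`; an empty list means nothing was rejected.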


Second: using Logstash

1. Install Logstash (download it from the official website)

2. Go to Logstash's bin directory and create a logstash_def.conf file (the configuration file that Logstash loads at startup)

3. The file contents are as follows:

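input {
	file {
		# Local JSON file, one JSON object per line
		path => "D:/log/packet.json"
		type => "log"
		# Start at the beginning of the file on first read (the default is the end)
		start_position => "beginning"
		# Parse each line as JSON, decoded as UTF-8
		codec => json {
			charset => "UTF-8"
		}
	}
}

output {
	elasticsearch {
		# Address of the running ES node
		hosts => "http://127.0.0.1:9200"
		# Target index name
		index => "venus"
		# Target type name (ES 6.x)
		document_type => "log_packet"
	}
}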
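Because the `json` codec parses the file line by line, a single malformed line is enough to produce a bad event. A small Python sketch for checking the file before starting Logstash; it assumes the file holds one JSON object per line, and `find_bad_lines` is a hypothetical helper name:

```python
import json
from typing import Iterable, List, Tuple


def find_bad_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for every line that is not a JSON object.

    Blank lines are skipped; line numbers start at 1.
    """
    bad: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue
        try:
            value = json.loads(text)
        except json.JSONDecodeError as exc:
            bad.append((number, exc.msg))
            continue
        if not isinstance(value, dict):
            bad.append((number, "not a JSON object"))
    return bad
```

Opening `D:/log/packet.json` with `encoding="utf-8"` and passing the file object to `find_bad_lines` reports the offending line numbers, which is quicker than hunting for them in the Logstash console output.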

4. In cmd, go to Logstash's bin directory (ES must already be running)

Command: logstash -f logstash_def.conf

Note: if the import fails, errors scroll past in the console; otherwise Logstash keeps running and loading data. You can use the head plugin to check the status and watch the document count grow.
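If the head plugin is not at hand, the document count can also be read from ES's `_count` endpoint. A minimal Python sketch, assuming ES listens on `http://127.0.0.1:9200` and the index is `venus` as in the config above; `parse_count`, `fetch_count` and `watch_count` are hypothetical helper names:

```python
import json
import time
import urllib.request


def parse_count(response_text: str) -> int:
    """Extract the document count from a _count response body."""
    return int(json.loads(response_text)["count"])


def fetch_count(base_url: str = "http://127.0.0.1:9200", index: str = "venus") -> int:
    """Ask ES how many documents the index currently holds."""
    with urllib.request.urlopen(f"{base_url}/{index}/_count") as resp:
        return parse_count(resp.read().decode("utf-8"))


def watch_count(polls: int = 10, interval_s: float = 5.0) -> None:
    """Print the count several times so growth during the import is visible."""
    for _ in range(polls):
        print(fetch_count())
        time.sleep(interval_s)
```

Calling `watch_count()` in a second cmd window while Logstash runs should show the number rising; a count that stays flat suggests the import has finished or stalled.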

 


Origin www.cnblogs.com/ttzsqwq/p/11077574.html