1. First, check the character set of the log file
[logStash@hadoop3 gamelogs]$ file game.txt
game.txt: UTF-8 Unicode (with BOM) text, with CRLF line terminators
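When there are many log files, the same check can be scripted. The following is a minimal sketch, not part of the original setup: the function names `guess_logstash_charset` and `guess_file_charset` are hypothetical, and it assumes, as the steps in this note do, that any file that is not valid UTF-8 is GB2312.

```python
import codecs


def guess_logstash_charset(raw: bytes) -> str:
    """Return "UTF-8" if the bytes are valid UTF-8, otherwise "GB2312".

    Assumption (taken from this note): non-UTF-8 logs are GB2312.
    """
    # Drop a leading UTF-8 byte order mark, which `file` reports as "(with BOM)".
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return "GB2312"
    return "UTF-8"


def guess_file_charset(path: str) -> str:
    """Read the whole file as bytes and classify it (hypothetical helper)."""
    # Reading the whole file avoids cutting a multi-byte character in half.
    with open(path, "rb") as handle:
        return guess_logstash_charset(handle.read())
```

Calling `guess_file_charset("game.txt")` on the file above should give the value to put in `charset`. Pure ASCII files are reported as UTF-8, which is harmless because ASCII bytes decode identically under both encodings.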
2. If the file is UTF-8 encoded, set charset to UTF-8 in the Logstash configuration file
input {
  file {
    path => "/home/logStash/gamelogs/*.txt"
    start_position => "beginning"
    stat_interval => 10
    type => "gameLogs"
    codec => plain {
      charset => "UTF-8"
    }
  }
}
output {
  kafka {
    topic_id => "gameLogTopic"
    codec => plain {
      format => "%{message}"
      charset => "UTF-8"
    }
    bootstrap_servers => "192.168.25.100:9092,192.168.25.102:9092"
  }
}
3. If the file is not UTF-8, set charset to GB2312 in both the input and output codecs
input {
  file {
    path => "/home/logStash/gamelogs/*.txt"
    start_position => "beginning"
    stat_interval => 10
    type => "gameLogs"
    codec => plain {
      charset => "GB2312"
    }
  }
}
output {
  kafka {
    topic_id => "gameLogTopic"
    codec => plain {
      format => "%{message}"
      charset => "GB2312"
    }
    bootstrap_servers => "192.168.25.100:9092,192.168.25.102:9092"
  }
}
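To see why the charset has to match the file, here is a small Python sketch of what happens when GB2312 bytes are decoded with the right and the wrong charset. It only imitates the decoding step in plain Python and does not call Logstash; the helper name `decode_with_charset` is hypothetical.

```python
def decode_with_charset(raw: bytes, charset: str) -> str:
    """Decode bytes with the given charset (hypothetical helper).

    Undecodable bytes become U+FFFD, the Unicode replacement character,
    instead of raising an error.
    """
    return raw.decode(charset, errors="replace")


# Bytes as they would sit on disk in a GB2312 log file.
gb_bytes = "中文".encode("gb2312")

# Matching charset: the original text comes back intact.
decoded = decode_with_charset(gb_bytes, "gb2312")

# Mismatched charset: the bytes are not valid UTF-8, so the text is garbled.
garbled = decode_with_charset(gb_bytes, "utf-8")

print(decoded)
print(garbled)
```

If the messages arriving in the Kafka topic look like the second line, the most likely cause is that the `charset` in the input codec does not match the real encoding of the file.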