A wrong locale setting crashes Impala


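Today, while building the cdh5.16.2 version of Impala and loading data, I found that all three impalad processes had crashed at the same time. The error message shows up in impalad.ERROR: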

terminate called after throwing an instance of 'std::runtime_error'
  what():  locale::facet::_S_create_c_locale name not valid
Wrote minidump to /home/quanlong/workspace/Impala-CDH/logs/data_loading/minidumps/impalad/07f37528-511c-4325-654b5bb6-1ae3dbd5.dmp

Since a minidump file was generated, I might as well look at where the crash happened. The resolved dump looks like this:

Operating system: Linux
                  0.0.0 Linux 4.15.0-99-generic #100~16.04.1-Ubuntu SMP Wed Apr 22 23:56:30 UTC 2020 x86_64
CPU: amd64
     family 6 model 158 stepping 10
     1 CPU

GPU: UNKNOWN

Crash reason:  SIGABRT
Crash address: 0x3e8000026e5
Process uptime: not available

Thread 213 (crashed)
 0  libc-2.23.so + 0x35428
 1  libc-2.23.so + 0x3702a
 2  libstdc++.so.6.0.21 + 0x1381c0
 3  libc-2.23.so + 0x79242
 4  libc-2.23.so + 0x79242
 5  libstdc++.so.6.0.21 + 0x8c880
 6  libstdc++.so.6.0.21 + 0x8f84d
 7  libstdc++.so.6.0.21 + 0xa2b80
 8  libstdc++.so.6.0.21 + 0x8d6b6
 9  libstdc++.so.6.0.21 + 0x8d701
10  libstdc++.so.6.0.21 + 0x8d919
11  libstdc++.so.6.0.21 + 0x138968
12  libstdc++.so.6.0.21 + 0xb65af
13  libstdc++.so.6.0.21 + 0xb0714
14  libstdc++.so.6.0.21 + 0xa126c
15  libstdc++.so.6.0.21 + 0xa24d8
16  impalad!std::_Rb_tree_const_iterator<std::pair<long const, impala::HdfsPartitionDescriptor*> >::operator->() const [stl_tree.h : 278 + 0xf]
17  impalad!boost::filesystem::path::codecvt() + 0x53
18  impalad!impala::HdfsScanNodeBase::Prepare(impala::RuntimeState*) [hdfs-scan-node-base.cc : 209 + 0x5]
19  impalad!std::allocator<impala::ScalarExpr*>::allocator() [allocator.h : 113 + 0x3]
20  impalad!std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> >::vector() [stl_vector.h : 257 + 0xc]
21  impalad!std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >::pair<int const&, 0ul>(std::tuple<int const&>&, std::tuple<>&, std::_Index_tuple<0ul>, std::_Index_tuple<>) [tuple : 1102 + 0x18]
22  impalad!std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >::pair<int const&>(std::piecewise_construct_t, std::tuple<int const&>, std::tuple<>) [tuple : 1091 + 0x1b]
23  impalad!std::tuple<int const&>::tuple(std::tuple<int const&>&&) [tuple : 409 + 0x13]
24  impalad!__gnu_cxx::new_allocator<std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > > >::construct<std::pair<int const, std::vector<impala::ScalarExpr*> >, const std::piecewise_construct_t&, std::tuple<int const&>, std::tuple<> > [new_allocator.h : 120 + 0x15]
25  impalad!__gnu_cxx::__aligned_buffer<std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > > >::_M_ptr() [aligned_buffer.h : 64 + 0xc]
26  impalad!std::tuple_element<0ul, std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > > >::type& std::get<0ul, int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >(std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >&) [utility : 144 + 0xc]
27  impalad!std::__detail::_Equal_helper<int, std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >, std::__detail::_Select1st, std::equal_to<int>, unsigned long, false>::_S_equals(std::equal_to<int> const&, std::__detail::_Select1st const&, int const&, unsigned long, std::__detail::_Hash_node<std::pair<int const, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >, false>*) [hashtable_policy.h : 1337 + 0x34]
28  impalad!impala::ScalarExpr** std::__copy_move_a<false, impala::ScalarExpr* const*, impala::ScalarExpr**>(impala::ScalarExpr* const*, impala::ScalarExpr* const*, impala::ScalarExpr**) [stl_algobase.h : 396 + 0x17]
29  impalad!__gnu_cxx::__normal_iterator<impala::ScalarExpr**, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > > std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<impala::ScalarExpr* const*, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >, __gnu_cxx::__normal_iterator<impala::ScalarExpr**, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > > >(__gnu_cxx::__normal_iterator<impala::ScalarExpr* const*, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >, __gnu_cxx::__normal_iterator<impala::ScalarExpr* const*, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >, __gnu_cxx::__normal_iterator<impala::ScalarExpr**, std::vector<impala::ScalarExpr*, std::allocator<impala::ScalarExpr*> > >) [stl_algobase.h : 434 + 0x4f]
30  impalad!std::allocator<std::_Rb_tree_node<std::string> >::allocator(std::allocator<std::_Rb_tree_node<std::string> > const&) [allocator.h : 116 + 0x13]
31  impalad!std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >::_Rb_tree_impl<std::less<std::string>, true>::_Rb_tree_impl(std::less<std::string> const&, std::allocator<std::_Rb_tree_node<std::string> > const&) [stl_tree.h : 469 + 0xc]
32  impalad!std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >::_Rb_tree(std::_Rb_tree<std::string, std::string, std::_Identity<std::string>, std::less<std::string>, std::allocator<std::string> >&&) [stl_tree.h : 703 + 0xc]
33  impalad!std::set<std::string, std::less<std::string>, std::allocator<std::string> >::set(std::set<std::string, std::less<std::string>, std::allocator<std::string> >&&) [stl_set.h : 209 + 0x1e]
34  impalad!std::pair<std::string const, std::set<std::string, std::less<std::string>, std::allocator<std::string> > >::pair<std::string, std::set<std::string, std::less<std::string>, std::allocator<std::string> >, void>(std::pair<std::string, std::set<std::string, std::less<std::string>, std::allocator<std::string> > >&&) [stl_pair.h : 152 + 0x35]
35  impalad!__gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::set<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > >::construct<std::pair<const std::basic_string<char>, std::set<std::basic_string<char> > >, std::pair<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::set<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > [new_allocator.h : 120 + 0xb]
36  impalad!std::allocator_traits<std::allocator<std::_Rb_tree_node<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::set<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > > >::_S_construct<std::pair<const std::basic_string<char>, std::set<std::basic_string<char> > >, std::pair<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::set<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > > [alloc_traits.h : 253 + 0x22]
37  impalad!__gnu_cxx::new_allocator<std::_Rb_tree_node<std::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::construct<std::basic_string<char>, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&> [new_allocator.h : 120 + 0xb]
38  impalad!__gnu_cxx::__aligned_buffer<std::pair<std::string const, impala::RuntimeProfile::Counter*> >::_M_ptr() const [aligned_buffer.h : 68 + 0xc]
39  libstdc++.so.6.0.21 + 0xce9e8
40  impalad!bool std::operator< <char, std::char_traits<char>, std::allocator<char> >(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [basic_string.h : 2590 + 0x13]
41  impalad!std::less<std::string>::operator()(std::string const&, std::string const&) const [stl_function.h : 371 + 0x13]
42  impalad!std::_Rb_tree<std::string, std::pair<std::string const, impala::RuntimeProfile::Counter*>, std::_Select1st<std::pair<std::string const, impala::RuntimeProfile::Counter*> >, std::less<std::string>, std::allocator<std::pair<std::string const, impala::RuntimeProfile::Counter*> > >::_M_lower_bound(std::_Rb_tree_node<std::pair<std::string const, impala::RuntimeProfile::Counter*> >*, std::_Rb_tree_node<std::pair<std::string const, impala::RuntimeProfile::Counter*> >*, std::string const&) [stl_tree.h : 1265 + 0x13]
43  linux-gate.so + 0xc45
44  libc-2.23.so + 0x115876
45  libc-2.23.so + 0x115876
46  impalad!impala::MonotonicStopWatch::Start() [stopwatch.h : 110 + 0x5]
47  impalad!impala::ScopedTimer<impala::MonotonicStopWatch>::ScopedTimer(impala::RuntimeProfile::Counter*, bool const*) [runtime-profile-counters.h : 486 + 0xc]
48  impalad!impala::ThreadResourceMgr::ResourcePool::num_optional_threads() const [thread-resource-mgr.h : 134 + 0xe]

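The crash happens while reading data: HdfsScanNodeBase::Prepare() calls boost's filesystem::path::codecvt() (frames 17 and 18), and the process aborts. A quick Google search shows that this is a known issue:
https://impala.apache.org/docs/build/html/topics/impala_known_issues.html
There is an entry titled "Impala should tolerate bad locale settings":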

If the LC_* environment variables specify an unsupported locale, Impala does not start.
Apache Issue: IMPALA-532
Workaround: Add LC_ALL="C" to the environment settings for both the Impala daemon and the Statestore daemon. See Modifying Impala Startup Options for details about modifying these environment settings.
Resolution: Fixing this issue would require an upgrade to Boost 1.47 in the Impala distribution.

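In other words, a wrong locale setting is supposed to make Impala fail at startup. In my case, however, startup succeeded and only reading data failed. Following the link to IMPALA-532, the comments mention that although the boost version Impala depends on has already been upgraded past 1.47 (Impala in cdh5.16.2 uses boost-1.57.0-p3), the problem can still occur when reading data. The fix is to give the impala processes a valid locale, so I now start the cluster like this: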

export LC_ALL=en_US.UTF-8
bin/start-impala-cluster.py

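That is, set LC_ALL before starting the Impala minicluster, and the problem no longer appears. If you hit this in production, you would probably need to change the Impala startup scripts or the system-wide locale settings.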
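A minimal sketch of a pre-flight check for this workaround, assuming a bash-like shell and that "locale -a" lists the installed locales. The helper names (canon, locale_installed, pick_locale) are made up for this illustration and are not part of Impala. When the preferred locale is not installed it falls back to "C", the value suggested by the known-issues entry quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; not part of Impala.

# Loose canonical form so that "en_US.UTF-8" and "en_US.utf8" compare equal.
canon() {
  printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr -d '-'
}

# locale_installed NAME: succeeds if NAME appears in the list read from stdin
# (one locale per line, the format printed by `locale -a`).
locale_installed() {
  local want
  want="$(canon "$1")"
  tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF -- "$want"
}

# pick_locale NAME...: print the first installed candidate, or "C" if none is.
pick_locale() {
  local candidate
  for candidate in "$@"; do
    if locale -a 2>/dev/null | locale_installed "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  printf 'C\n'
}

# Intended use before starting the minicluster:
#   export LC_ALL="$(pick_locale en_US.UTF-8)"
#   bin/start-impala-cluster.py
```

The comparison is deliberately loose (case-insensitive, hyphens ignored), because "locale -a" often prints "en_US.utf8" where the environment says "en_US.UTF-8".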

Open questions

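This machine did not have the problem before. It recently froze and I force-rebooted it twice, and after that the problem appeared. I have another Ubuntu machine that works fine, so I compared the locale output of the two machines:
(Image: locale differences)
The left side is the output on the healthy machine and the right side is the output on the problematic one. The problematic machine also prints a warning: "locale: Cannot set LC_ALL to default locale: No such file or directory".
The LANGUAGE environment variable seems unrelated: the problem still reproduces after setting it. Perhaps some system file was corrupted by my forced reboots, so that the default value cannot be read when LC_ALL is empty. The current fix (manually setting a non-empty LC_ALL) is really just a workaround. Applications other than Impala may be affected too, so the system-level problem still needs to be fixed.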

Problem solved

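A few weeks later my machine rebooted again, and when I logged back in and built Impala, the same problem came back. This time I noticed the following warning: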

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LC_PAPER = "zh_CN.UTF-8",
        LC_ADDRESS = "zh_CN.UTF-8",
        LC_MONETARY = "zh_CN.UTF-8",
        LC_NUMERIC = "zh_CN.UTF-8",
        LC_TELEPHONE = "zh_CN.UTF-8",
        LC_IDENTIFICATION = "zh_CN.UTF-8",
        LC_MEASUREMENT = "zh_CN.UTF-8",
        LC_TIME = "zh_CN.UTF-8",
        LC_NAME = "zh_CN.UTF-8",
        LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to a fallback locale ("en_US.UTF-8").

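I realized that this machine does not have the Chinese locale installed, so it cannot recognize "zh_CN.UTF-8"! As for locale settings, reading this article makes it fairly clear what to do: https://www.linuxbabe.com/linux-server/fix-ssh-locale-environment-variable-error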
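One way to confirm this kind of mismatch is to compare the locale variables in the environment against the installed locales. The following is a rough sketch under assumptions: find_missing_locales is a hypothetical helper name, the list of installed locales is expected in a file in the format printed by "locale -a", and the variables are read from stdin in the VAR=value format printed by "locale" or "env".

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only.
# find_missing_locales FILE: read VAR=value lines from stdin and print every
# LANG / LC_* assignment whose value is not listed in FILE (one installed
# locale per line, the format printed by `locale -a`).
find_missing_locales() {
  awk -F= -v list="$1" '
    # Normalize the codeset so "zh_CN.UTF-8" and "zh_CN.utf8" compare equal.
    function norm(s,   i, cs) {
      i = index(s, ".")
      if (i == 0) return s
      cs = tolower(substr(s, i + 1))
      gsub(/-/, "", cs)
      return substr(s, 1, i - 1) "." cs
    }
    BEGIN {
      while ((getline line < list) > 0) installed[norm(line)] = 1
    }
    # LANGUAGE is skipped on purpose: it is a priority list, not a locale name.
    /^(LANG|LC_[A-Z_]+)=/ {
      value = $2
      gsub(/"/, "", value)   # `locale` quotes values it derived from LANG
      if (value != "" && !(norm(value) in installed))
        print $1 "=" value
    }
  ' -
}

# Intended use on the target machine (the temporary file name is arbitrary):
#   locale -a > /tmp/installed-locales.txt
#   locale | find_missing_locales /tmp/installed-locales.txt
```

With the variables from the perl warning above, every LC_* variable set to "zh_CN.UTF-8" would be reported on a machine whose "locale -a" output has no zh_CN entry.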
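In my case, I ssh from an Ubuntu machine that has Chinese installed into an Ubuntu machine that does not. The ssh connection carries the locale environment variables along, and the target machine has no Chinese locale, hence the problem. Following the article, there are three possible fixes: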

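  1. Install the Chinese locale on the target machine
  2. Make the target machine reject the locale settings sent by the ssh client
  3. Make the ssh client on the local machine stop sending its locale settings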

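I have many other target machines without Chinese installed, so I chose to change the ssh client configuration on my local machine. Concretely, run "sudo vim /etc/ssh/ssh_config" and comment out the line "SendEnv LANG LC_*". After logging in over ssh again, the locale variables look different and no longer use "zh_CN.UTF-8":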

$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

Everything works normally again.
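To double-check from the local machine that the ssh client no longer forwards locale variables, the effective client configuration can be inspected. This is only a sketch under assumptions: it relies on "ssh -G", which prints the resolved client options in recent OpenSSH versions, and on SendEnv entries appearing there as "sendenv" lines. The function names and the host name "target-host" are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only.

# Read `ssh -G <host>` output on stdin and print every SendEnv pattern.
forwarded_env_patterns() {
  awk 'tolower($1) == "sendenv" { for (i = 2; i <= NF; i++) print $i }'
}

# Succeeds if any forwarded pattern covers LANG or the LC_* variables.
forwards_locale_vars() {
  forwarded_env_patterns | grep -qE '^(LANG|LC_)'
}

# Intended use ("target-host" is a placeholder):
#   ssh -G target-host | forwarded_env_patterns
#   if ssh -G target-host | forwards_locale_vars; then
#     echo "locale variables are still being forwarded"
#   fi
```

If the "SendEnv LANG LC_*" line has been commented out and no other config file adds it back, the first command should print nothing.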


Reposted from blog.csdn.net/huang_quanlong/article/details/106570953