A few tricks for using LTTng and log analysis with Ceph

LTTng (Linux Trace Toolkit Next Generation) is a system software package for tracing the Linux kernel, applications, and libraries. LTTng consists mainly of kernel modules and dynamic libraries (used to trace applications and shared libraries), and it is controlled by a session daemon that accepts commands from a command-line interface. The babeltrace project translates the trace data into human-readable logs and provides a trace-reading library, libbabeltrace. Tracepoints are embedded in the Ceph code, and LTTng is used to trace them.

Configure and enable the tracing function

First, install LTTng on Linux with apt or yum.

The apt way:

$ sudo apt-get update
$ sudo apt-get install lttng-tools lttng-modules-dkms babeltrace

The yum way:

$ sudo yum install lttng-tools lttng-ust
$ sudo yum install babeltrace ## tool for viewing the trace results
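
Whichever package manager you use, it is worth confirming that the tools are actually available before going further; a quick sanity check:

$ lttng --version
$ which babeltrace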

Run ceph daemon /var/run/ceph/ceph-osd.0.asok config show from the command line to see the tracing options and the modules they correspond to; the following takes librbd as an example.

"event_tracing": "false"
"osd_function_tracing": "false"
"osd_objectstore_tracing": "false"
"osd_tracing": "false"
"rados_tracing": "false"
"rbd_tracing": "false"

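Since config show dumps every option, it is convenient to filter for just the tracing-related ones; the admin socket path below is the same OSD socket used above, so adjust it to the daemon you are inspecting:

ceph daemon /var/run/ceph/ceph-osd.0.asok config show | grep tracing
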
Modify the default ceph.conf configuration and set rbd_tracing to true


~# vim /etc/ceph/ceph.conf

[global]
fsid = xxxxxxxx
public_network = xxxxxx/24
cluster_network = xxxxx/24
mon_initial_members = xxxxxxx, xxxxxx, xxxxxxx
mon_host = xxxxxx,xxxxxx,xxxxxx
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd_pool_default_size=3
osd_pool_default_min_size=2
osd_journal_size=100
filestore_xattr_use_omap=true
osd_pool_default_pg_num=128
osd_pool_default_pgp_num=128
osd_crush_chooseleaf_type=0
rbd_cache=false
rbd_tracing=true ##------- set this to true

Note: if this option cannot be changed at runtime with ceph tell mon.* injectargs '--rbd_tracing=true', you have to modify the configuration file and restart the cluster for the change to take effect.
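
For reference, a hedged sketch of both paths mentioned in the note; whether injectargs accepts this option at runtime depends on your Ceph version, and the systemctl target assumes a systemd-managed cluster:

ceph tell mon.* injectargs '--rbd_tracing=true'   ## try runtime injection first
sudo systemctl restart ceph.target                ## fallback: edit ceph.conf, then restart the daemons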

Using lttng to list the traceable points

LTTng tracing requires the target process to be running the whole time. For mon and osd this goes without saying; if what you are tracing is librbd, you must make sure that a process which has loaded librbd.so keeps running.


 ./rbd_example  ## keep this running
 lttng list -u

UST events:
-------------

PID: 12039 - Name: ./rbd_example
      lttng_ust_tracelog:TRACE_DEBUG (loglevel: TRACE_DEBUG (14)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_DEBUG_LINE (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_DEBUG_FUNCTION (loglevel: TRACE_DEBUG_FUNCTION (12)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_DEBUG_UNIT (loglevel: TRACE_DEBUG_UNIT (11)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_DEBUG_MODULE (loglevel: TRACE_DEBUG_MODULE (10)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_DEBUG_PROCESS (loglevel: TRACE_DEBUG_PROCESS (9)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_DEBUG_PROGRAM (loglevel: TRACE_DEBUG_PROGRAM (8)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_DEBUG_SYSTEM (loglevel: TRACE_DEBUG_SYSTEM (7)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_INFO (loglevel: TRACE_INFO (6)) (type: tracepoint)
      lttng_ust_tracelog:TRACE_NOTICE (loglevel: TRACE_NOTICE (5)) (type: tracepoint)
        ...
      librbd:open_image_enter (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:write_exit (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:write2_enter (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:write_enter (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:read_iterate2_exit (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:read_iterate2_enter (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:read_iterate_exit (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:read_iterate_enter (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:read_exit (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:read2_enter (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
      librbd:read_enter (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
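
The rbd_example binary above stands in for any client that has loaded librbd.so. If you do not have such a program handy, a long-running rbd CLI command can play the same role, since the rbd tool itself links librbd; a minimal sketch, where the pool rbd and the image name librbd_test are assumptions:

rbd create rbd/librbd_test --size 1024   ## create a small test image
rbd bench-write rbd/librbd_test          ## keeps issuing writes (newer releases: rbd bench --io-type write)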

Start tracing and collect the information

mkdir -p traces ## create the directory that will hold the traces
lttng create -o traces librbd ## create a trace session
lttng enable-event -u 'librbd:*' ## enable the events of interest
lttng add-context -u -t pthread_id ## add thread information to the context
lttng start ## start tracing
# run RBD workload here
lttng stop ## stop tracing

lttng destroy ## destroy the session

You can check the traces directory to confirm that the corresponding trace records have been generated.
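
A quick way to confirm this; the exact sub-directory layout under traces depends on the LTTng version:

ls -R traces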

Use babeltrace to read the results

babeltrace traces > result.all

[10:17:31.802322370] (+?.?????????) XXXXXXXXX librbd:aio_complete_enter: { cpu_id = 2 }, { pthread_id = 139658738509568 }, { completion = 0x5635FA045920, rval = 0 }
[10:17:31.802361060] (+0.000038690) XXXXXXXXX librbd:aio_get_return_value_enter: { cpu_id = 2 }, { pthread_id = 139658738509568 }, { completion = 0x5635FA045920 }
[10:17:31.802362582] (+0.000001522) XXXXXXXXX librbd:aio_get_return_value_exit: { cpu_id = 2 }, { pthread_id = 139658738509568 }, { retval = 0 }
[10:17:31.802399704] (+0.000037122) XXXXXXXXX librbd:aio_complete_exit: { cpu_id = 2 }, { pthread_id = 139658738509568 }, { }
[10:17:31.802522131] (+0.000122427) XXXXXXXXX librbd:write_exit: { cpu_id = 2 }, { pthread_id = 139659290902208 }, { retval = 10485760 }
...

[10:17:34.397260832] (+0.000000840) XXXXXXXXX librbd:aio_get_return_value_exit: { cpu_id = 2 }, { pthread_id = 139658738509568 }, { retval = 0 }
[10:17:34.397273502] (+0.000012670) XXXXXXXXX librbd:aio_complete_exit: { cpu_id = 2 }, { pthread_id = 139658738509568 }, { }
[10:17:34.397364800] (+0.000091298) XXXXXXXXX librbd:write_exit: { cpu_id = 2 }, { pthread_id = 139659290902208 }, { retval = 10485760 }
[10:17:34.650313545] (+0.252948745) XXXXXXXXX librbd:write_enter: { cpu_id = 2 }, { pthread_id = 139659290902208 }, { imagectx = 0x5635FA04B4B0, name = "librbd_test", snap_name = "", read_only = 0, off = 0, buf = 0x5635FA053330, buf_isnull = 0, buf_len = 10485760 }
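
Because result.all can easily contain thousands of lines, plain text tools are enough for a first pass; a small sketch that counts how often each tracepoint fired and then follows only the write calls (field 4 of each line is the event name):

awk '{print $4}' result.all | sort | uniq -c | sort -rn | head
grep -E 'librbd:(write_enter|write_exit)' result.all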

Activating the log function

To use it, modify ceph.conf as well and add the [client] section shown below.

~# vim /etc/ceph/ceph.conf

[global]
## ... same [global] settings as shown above, with rbd_tracing=true ...

[client]
debug rbd = 20 ## module to log, and its log level
debug rados = 20 ## module to log, and its log level
log file = /var/log/ceph/ceph-client.log ## where the log is written
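
Once the client runs again with these settings, the log can be inspected directly; a quick sketch, assuming the client process has permission to write to that path:

tail -f /var/log/ceph/ceph-client.log
grep librbd /var/log/ceph/ceph-client.log | head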

Concluding remarks

Using LTTng together with the logs makes it much easier to analyze how Ceph operates.

Origin blog.51cto.com/15024210/2582605