osd心跳机制
SimpleMessenger
osd心跳peers更新时机
handle_pg_create() -> maybe_update_heartbeat_peers()
主要对应pg创建的处理。
handle_osd_map() -> maybe_update_heartbeat_peers()
主要对应osdmap变更的处理,osd承载的pg可能需要重新peering,导致osd状态可能会变为STATE_WAITING_FOR_HEALTHY。
OSD::tick() -> maybe_update_heartbeat_peers()
定时器处理线程: OSD::init() -> tick_timer.init() -> SafeTimerThread::entry() -> SafeTimer::timer_thread() -> (Context)callback::complete() -> C_Tick::finish() -> OSD::tick()
osd peer心跳检测策略
OPTION(osd_heartbeat_interval, OPT_INT, 6) // (seconds) how oftenwe ping peers
这个参数目前在osd中只在OSD::init函数中使用,表示启动后多久开始调用OSD::tick(),后续在OSD::tick()函数中设置的定时器间隔都是1s。
tick_timer.add_event_after(1.0, new C_Tick(this));
OPTION(osd_heartbeat_grace, OPT_INT,20) // (seconds) how longbefore we decide a peer has failed
超过osd_heartbeat_grace秒,osd没有收到peer的心跳,则会向monitor汇报peer down。
扫描二维码关注公众号,回复: 1645725 查看本文章
mon心跳检测策略
OPTION(mon_osd_min_down_reporters,OPT_INT, 1) // number of OSDs who need to report a down OSD for itto count
最少需要有多少个不同的osd汇报同一个osd down。
bool OSDMonitor::check_failure(utime_t now, int target_osd, failure_info_t& fi) { ... ... if (failed_for >= grace && ((int)fi.reporters.size() >= g_conf->mon_osd_min_down_reporters) && (fi.num_reports >= g_conf->mon_osd_min_down_reports)) { pending_inc.new_state[target_osd] = CEPH_OSD_UP; return true; } }
OPTION(mon_osd_min_down_reports, OPT_INT, 3) // numberof times a down OSD must be reported for it to count
针对一个osd, 需要最少接收到多少个 down report。 down reort是来自所有peers汇报的总和。
MOSDFailure *add_report(int who, utime_t failed_since, MOSDFailure *msg) { map<int, failure_reporter_t>::iterator p = reporters.find(who); if (p == reporters.end()) { if (max_failed_since == utime_t()) max_failed_since = failed_since; else if (max_failed_since < failed_since) max_failed_since = failed_since; p = reporters.insert(map<int, failure_reporter_t>::value_type(who, failure_reporter_t(failed_since))).first; } else { p->second.num_reports++; } num_reports++; MOSDFailure *ret = p->second.msg; p->second.msg = msg; return ret; }