MySQL Replication -- How the replication delay (Seconds_Behind_Master) is calculated 01

I do not fully understand the MySQL source code. What follows is purely my own guesswork, and I take no responsibility if it misleads you!

In the file sql/rpl_slave.cc, the code that computes time_diff is:

/*
      The pseudo code to compute Seconds_Behind_Master:
        if (SQL thread is running)
        {
          if (SQL thread processed all the available relay log)
          {
            if (IO thread is running)
              print 0;
            else
              print NULL;
          }
          else
            compute Seconds_Behind_Master;
        }
        else
          print NULL;
    */
    if (mi->rli->slave_running)
    {
      /* Check if SQL thread is at the end of relay log
           Checking should be done using two conditions
           condition1: compare the log positions and
           condition2: compare the file names (to handle rotation case)
      */
      if ((mi->get_master_log_pos() == mi->rli->get_group_master_log_pos()) &&
           (!strcmp(mi->get_master_log_name(), mi->rli->get_group_master_log_name())))
      {
        if (mi->slave_running == MYSQL_SLAVE_RUN_CONNECT)
          protocol->store(0LL);
        else
          protocol->store_null();
      }
      else
      {
        long time_diff= ((long)(time(0) - mi->rli->last_master_timestamp)
                                - mi->clock_diff_with_master);
        /*
          Apparently on some systems time_diff can be <0. Here are possible
          reasons related to MySQL:
          - the master is itself a slave of another master whose time is ahead.
          - somebody used an explicit SET TIMESTAMP on the master.
          Possible reason related to granularity-to-second of time functions
          (nothing to do with MySQL), which can explain a value of -1:
          assume the master's and slave's time are perfectly synchronized, and
          that at slave's connection time, when the master's timestamp is read,
          it is at the very end of second 1, and (a very short time later) when
          the slave's timestamp is read it is at the very beginning of second
          2. Then the recorded value for master is 1 and the recorded value for
          slave is 2. At SHOW SLAVE STATUS time, assume that the difference
          between timestamp of slave and rli->last_master_timestamp is 0
          (i.e. they are in the same second), then we get 0-(2-1)=-1 as a result.
          This confuses users, so we don't go below 0: hence the max().

          last_master_timestamp == 0 (an "impossible" timestamp 1970) is a
          special marker to say "consider we have caught up".
        */
        protocol->store((longlong)(mi->rli->last_master_timestamp ?
                                   max(0L, time_diff) : 0));
      }
    }
    else
    {
      protocol->store_null();
    }

1. When the SQL thread is stopped, NULL is returned.

2. When the SQL thread is running and has executed up to the last position of the relay log, 0 is returned if the IO thread is running; otherwise NULL is returned.

3. When the SQL thread is running and still has relay log left to process, delay = the slave's current system time (time(0)) - the timestamp of the last binlog event processed by the SQL thread (mi->rli->last_master_timestamp) - the clock difference between master and slave (mi->clock_diff_with_master). A negative result is reported as 0.
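
The three cases above can be condensed into one small standalone function. This is only a sketch under my own assumptions, not the server code: the `ReplicaState` and `Result` structs and their field names are made up for illustration and stand in for the values the server reads from `mi` and `mi->rli`, and `is_null` stands in for the NULL printed by SHOW SLAVE STATUS.

```cpp
#include <algorithm>
#include <cassert>
#include <ctime>

// Hypothetical snapshot of the values the server reads from mi / mi->rli.
struct ReplicaState {
  bool sql_thread_running;            // mi->rli->slave_running
  bool io_thread_connected;           // mi->slave_running == MYSQL_SLAVE_RUN_CONNECT
  bool relay_log_fully_applied;       // log file names and positions both match
  std::time_t last_master_timestamp;  // 0 means "consider we have caught up"
  long clock_diff_with_master;        // slave clock minus master clock, seconds
};

// Hypothetical result type: is_null == true models the NULL column value.
struct Result {
  bool is_null;
  long long value;
};

Result seconds_behind_master(const ReplicaState& s, std::time_t now) {
  if (!s.sql_thread_running) return Result{true, 0};   // case 1
  if (s.relay_log_fully_applied) {                     // case 2
    if (s.io_thread_connected) return Result{false, 0};
    return Result{true, 0};
  }
  // Case 3: there is still relay log left to apply.
  assert(s.last_master_timestamp >= 0);
  if (s.last_master_timestamp == 0) return Result{false, 0};
  long time_diff = static_cast<long>(now - s.last_master_timestamp) -
                   s.clock_diff_with_master;
  // Negative values are clamped to 0, as the max() in the server code does.
  return Result{false, static_cast<long long>(std::max(0L, time_diff))};
}
```

For example, with last_master_timestamp = 1000, a slave clock reading of 1100 and a clock difference of 5, the sketch reports 100 - 5 = 95 seconds. With both timestamps in the same second and a clock difference of 1, the raw value is -1 and the reported value is 0, which matches the case described in the source comment.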

 

Clock difference between master and slave (mi->clock_diff_with_master)

In sql/rpl_slave.cc, the code that computes the clock difference between master and slave is as follows:

  /*
    Compare the master and slave's clock. Do not die if master's clock is
    unavailable (very old master not supporting UNIX_TIMESTAMP()?).
  */

  DBUG_EXECUTE_IF("dbug.before_get_UNIX_TIMESTAMP",
                  {
                    const char act[]=
                      "now "
                      "wait_for signal.get_unix_timestamp";
                    DBUG_ASSERT(opt_debug_sync_timeout > 0);
                    DBUG_ASSERT(!debug_sync_set_action(current_thd,
                                                       STRING_WITH_LEN(act)));
                  };);

  master_res= NULL;
  if (!mysql_real_query(mysql, STRING_WITH_LEN("SELECT UNIX_TIMESTAMP()")) &&
      (master_res= mysql_store_result(mysql)) &&
      (master_row= mysql_fetch_row(master_res)))
  {
    mysql_mutex_lock(&mi->data_lock);
    mi->clock_diff_with_master=
      (long) (time((time_t*) 0) - strtoul(master_row[0], 0, 10));
    mysql_mutex_unlock(&mi->data_lock);
  }
  else if (check_io_slave_killed(mi->info_thd, mi, NULL))
    goto slave_killed_err;
  else if (is_network_error(mysql_errno(mysql)))
  {
    mi->report(WARNING_LEVEL, mysql_errno(mysql),
               "Get master clock failed with error: %s", mysql_error(mysql));
    goto network_err;
  }
  else 
  {
    mysql_mutex_lock(&mi->data_lock);
    mi->clock_diff_with_master= 0; /* The "most sensible" value */
    mysql_mutex_unlock(&mi->data_lock);
    sql_print_warning("\"SELECT UNIX_TIMESTAMP()\" failed on master, "
                      "do not trust column Seconds_Behind_Master of SHOW "
                      "SLAVE STATUS. Error: %s (%d)",
                      mysql_error(mysql), mysql_errno(mysql));
  }
  if (master_res)
  {
    mysql_free_result(master_res);
    master_res= NULL;
  }

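Clock difference between master and slave = the slave's current time (time((time_t*) 0)) - the master's time (UNIX_TIMESTAMP()). The master's time is obtained by running SELECT UNIX_TIMESTAMP() on the master and parsing the returned value with strtoul(master_row[0], 0, 10). If the query fails for a reason other than a network error or the slave being killed, the difference is set to 0 and a warning is logged that Seconds_Behind_Master should not be trusted.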
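
The subtraction can be tried out in isolation. The helper below is a sketch with a name and parameters of my own choosing, not a function from the server: `slave_now` plays the role of time(0) on the slave, and `master_unix_timestamp` plays the role of master_row[0], the text returned by SELECT UNIX_TIMESTAMP().

```cpp
#include <cstdlib>
#include <ctime>

// Hypothetical helper mirroring the clock_diff_with_master assignment.
// A positive result means the slave's clock is ahead of the master's.
long clock_diff_with_master(std::time_t slave_now,
                            const char* master_unix_timestamp) {
  // The master's answer arrives as text, so it is parsed as base-10 digits.
  unsigned long master_now = std::strtoul(master_unix_timestamp, nullptr, 10);
  return static_cast<long>(slave_now - static_cast<std::time_t>(master_now));
}
```

With a slave clock of 1700000010 and a master answer of "1700000000", the difference is 10. According to the source comment quoted earlier, this value is read at the slave's connection time, so clock drift after that point is presumably not reflected until the IO thread reconnects.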

 

Timestamp of the last binlog event processed by the SQL thread (mi->rli->last_master_timestamp)

In the exec_relay_log_event function in sql/rpl_slave.cc, the code that computes last_master_timestamp is as follows:

/**
  Top-level function for executing the next event in the relay log.
  This is called from the SQL thread.

  This function reads the event from the relay log, executes it, and
  advances the relay log position.  It also handles errors, etc.

  This function may fail to apply the event for the following reasons:

   - The position specfied by the UNTIL condition of the START SLAVE
     command is reached.

   - It was not possible to read the event from the log.

   - The slave is killed.

   - An error occurred when applying the event, and the event has been
     tried slave_trans_retries times.  If the event has been retried
     fewer times, 0 is returned.

   - init_info or init_relay_log_pos failed. (These are called
     if a failure occurs when applying the event.)

   - An error occurred when updating the binlog position.

  @retval 0 The event was applied.

  @retval 1 The event was not applied.
*/
static int exec_relay_log_event(THD* thd, Relay_log_info* rli)
{
  DBUG_ENTER("exec_relay_log_event");

  /*
     We acquire this mutex since we need it for all operations except
     event execution. But we will release it in places where we will
     wait for something for example inside of next_event().
   */
  mysql_mutex_lock(&rli->data_lock);

  /*
    UNTIL_SQL_AFTER_GTIDS requires special handling since we have to check
    whether the until_condition is satisfied *before* the SQL threads goes on
    a wait inside next_event() for the relay log to grow. This is reuired since
    if we have already applied the last event in the waiting set but since he
    check happens only at the start of the next event we may end up waiting
    forever the next event is not available or is delayed.
  */
  if (rli->until_condition == Relay_log_info::UNTIL_SQL_AFTER_GTIDS &&
       rli->is_until_satisfied(thd, NULL))
  {
    rli->abort_slave= 1;
    mysql_mutex_unlock(&rli->data_lock);
    DBUG_RETURN(1);
  }

  Log_event *ev = next_event(rli), **ptr_ev;

  DBUG_ASSERT(rli->info_thd==thd);

  if (sql_slave_killed(thd,rli))
  {
    mysql_mutex_unlock(&rli->data_lock);
    delete ev;
    DBUG_RETURN(1);
  }
  if (ev)
  {
    enum enum_slave_apply_event_and_update_pos_retval exec_res;

    ptr_ev= &ev;
    /*
      Even if we don't execute this event, we keep the master timestamp,
      so that seconds behind master shows correct delta (there are events
      that are not replayed, so we keep falling behind).

      If it is an artificial event, or a relay log event (IO thread generated
      event) or ev->when is set to 0, or a FD from master, or a heartbeat
      event with server_id '0' then  we don't update the last_master_timestamp.
    */
    if (!(rli->is_parallel_exec() ||
          ev->is_artificial_event() || ev->is_relay_log_event() ||
          ev->when.tv_sec == 0 || ev->get_type_code() == FORMAT_DESCRIPTION_EVENT ||
          ev->server_id == 0))
    {
      rli->last_master_timestamp= ev->when.tv_sec + (time_t) ev->exec_time;
      DBUG_ASSERT(rli->last_master_timestamp >= 0);
    }
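
The update rule in this excerpt can be restated as a plain function. This is a simplified sketch: the `EventInfo` struct and its field names are hypothetical and only model the properties that the condition inspects. Note that the code also skips the update when rli->is_parallel_exec() is true, which the comment above it does not list.

```cpp
#include <ctime>

// Hypothetical, simplified view of the event properties the check looks at.
struct EventInfo {
  bool is_artificial;          // ev->is_artificial_event()
  bool is_relay_log_event;     // ev->is_relay_log_event()
  bool is_format_description;  // type code == FORMAT_DESCRIPTION_EVENT
  unsigned server_id;          // ev->server_id (0 for the skipped heartbeat)
  std::time_t when_sec;        // ev->when.tv_sec
  unsigned long exec_time;     // ev->exec_time, in seconds
};

// Returns the new last_master_timestamp, or `current` when the event
// must not move it.
std::time_t update_last_master_timestamp(std::time_t current,
                                         const EventInfo& ev,
                                         bool parallel_exec) {
  bool skip = parallel_exec || ev.is_artificial || ev.is_relay_log_event ||
              ev.when_sec == 0 || ev.is_format_description ||
              ev.server_id == 0;
  if (skip) return current;
  // Event timestamp plus the time the statement took on the master.
  return ev.when_sec + static_cast<std::time_t>(ev.exec_time);
}
```

An ordinary event with when_sec = 1000 and exec_time = 3 moves the timestamp to 1003, while an event with server_id 0 leaves the previous value untouched.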

ev->when.tv_sec normally carries the timestamp stored in the event itself. Only when it is 0 is it filled in with the slave's current time, as can be seen in the slave_worker_exec_job function in sql/rpl_rli_pdb.cc:

  ev= static_cast<Log_event*>(job_item->data);
  thd->server_id = ev->server_id;
  thd->set_time();
  thd->lex->current_select= 0;
  if (!ev->when.tv_sec)
    ev->when.tv_sec= my_time(0);
  ev->thd= thd; // todo: assert because up to this point, ev->thd == 0
  ev->worker= worker;

my_time(0) returns the system time. Its code is in mysys/my_getsystime.cc:

/**
  Return current time.

  @param  flags   If MY_WME is set, write error if time call fails.

  @retval current time.
*/

time_t my_time(myf flags)
{
  time_t t;
  /* The following loop is here beacuse time() may fail on some systems */
  while ((t= time(0)) == (time_t) -1)
  {
    if (flags & MY_WME)
      fprintf(stderr, "%s: Warning: time() call failed\n", my_progname);
  }
  return t;
}

ev->exec_time is computed in sql/log_event.cc as follows:

  /*
  exec_time calculation has changed to use the same method that is used
  to fill out "thd_arg->start_time"
  */

  struct timeval end_time;
  ulonglong micro_end_time= my_micro_time();
  my_micro_time_to_timeval(micro_end_time, &end_time);

  exec_time= end_time.tv_sec - thd_arg->start_time.tv_sec;

The comment describing exec_time in sql/log_event.h is as follows:

  <tr>
    <td>exec_time</td>
    <td>4 byte unsigned integer</td>
    <td>The time from when the query started to when it was logged in the binlog, in seconds.</td>
  </tr>
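
Because exec_time is the difference between two whole-second values (end_time.tv_sec and start_time.tv_sec), it is not the same as the elapsed time rounded to seconds. The sketch below illustrates this with hypothetical helper names; the inputs are microsecond timestamps of the kind my_micro_time() returns.

```cpp
#include <cstdint>

// Microseconds since the epoch -> whole seconds, as a tv_sec field would hold.
long long whole_seconds(std::uint64_t micros) {
  return static_cast<long long>(micros / 1000000ULL);
}

// Sketch of exec_time = end_time.tv_sec - start_time.tv_sec.
long long exec_time_seconds(std::uint64_t start_micros,
                            std::uint64_t end_micros) {
  return whole_seconds(end_micros) - whole_seconds(start_micros);
}
```

A statement that runs from 1000.9 s to 1001.1 s lasts only 0.2 s but gets exec_time = 1, while one that runs from 1000.1 s to 1000.9 s lasts 0.8 s and gets exec_time = 0. Since last_master_timestamp adds exec_time to the event timestamp, the reported delay can be off by about a second for this reason alone.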

 

Origin www.cnblogs.com/gaogao67/p/11075648.html