MySQL DDL - gh-ost learning

How gh-ost works

1. Create a ghost table with the same structure as the source table.
2. Run the ALTER TABLE statement against the ghost table.
3.1. Act as a replica of the master and pull its binlog (in ROW format the binlog carries the full before and after images of every changed row), parse the events, and apply the resulting statements to the ghost table.
3.2. Determine the range of data in the source table (for example, the minimum and maximum primary key values), split it into batches, and copy each batch into the ghost table.
4. Lock the source table so that users cannot modify its data.
5. Rename the source table, then rename the ghost table to the source table's name.
6. Release the table lock and clean up the tables the tool created.
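
The batching in step 3.2 can be sketched as follows. This is a minimal illustration, assuming an integer primary key; the function name chunk_ranges is made up for this example, and the real tool picks its chunk boundaries by querying the table rather than by plain arithmetic.

```python
def chunk_ranges(min_key, max_key, chunk_size):
    """Yield inclusive (low, high) key ranges that cover [min_key, max_key]."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    low = min_key
    while low <= max_key:
        # Clamp the last range so it never runs past the maximum key.
        high = min(low + chunk_size - 1, max_key)
        yield (low, high)
        low = high + 1


# Each range would become one small "copy rows where id between low and high" batch.
for low, high in chunk_ranges(1, 2500, 1000):
    print(low, high)
```

Copying in small ranges keeps each batch short, which is what lets the copy be paused or throttled between batches.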

 

gh-ost compared with pt-osc

How change data is captured:
gh-ost reads the binlog to capture the changes made while data is being copied. The binlog must be in ROW format. This is the safer approach.
pt-osc relies on triggers to capture changes. It requires that the original table have no triggers of its own, and it is the more intrusive approach.

Effect of an abnormal termination:
If the gh-ost process ends abnormally, it only leaves temporary tables behind on the master, which does not affect the business.
If pt-osc ends abnormally, its triggers remain on the original table on the master. Unless they are cleaned up promptly, they degrade performance.

Impact of the tool on business performance:
pt-osc's triggers slow down operations on the table, and the effect is pronounced under high concurrency.
gh-ost captures changes from the binlog, so it adds no trigger overhead to DML operations.

Effect of business load on the tool:
Under low load, pt-osc finishes sooner (in some scenarios it takes about half as long as gh-ost).
Under high load, pt-osc runs slowly but does finish, whereas gh-ost may parse the binlog more slowly than the binlog is generated, in which case it never completes.
Under high load with innodb_autoinc_lock_mode = 1 (the default before MySQL 8.0), pt-osc is prone to deadlocks that make it fail, whereas gh-ost does not cause deadlocks.

 

gh-ost operating modes

1. Connect to the master and make the change directly

Connect directly to the master.
Create the ghost table on the master.
Run the ALTER directly on the new (ghost) table.
Migrate the data from the original table into the new table.
Pull and parse binlog events and apply them to the new table.
In the cut-over stage, replace the original table with the new table.

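2. Connect to a replica and apply the changes to the master

Connect to the replica.
After validation, create the new table on the master.
Migrate the data from the original table into the new table.
Act as a replica of the replica: pull and parse the incremental binlog and apply it to the master.
In the cut-over stage, replace the original table with the new table.

The difference between the two modes is that making the change through a replica connection keeps the performance impact on the master to a minimum.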

 

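Data consistency

Data obtained from the binlog is always as new as, or newer than, data copied from the source table. Therefore:
1. When applying changes from the binlog, UPDATE and DELETE are applied to the ghost table as they are, while INSERT is rewritten as REPLACE INTO before being applied to the ghost table.
2. When copying source rows to the ghost table, INSERT IGNORE is used to skip rows that already exist in the ghost table.
3. For DELETE operations that occur while gh-ost is running:
  A: If the row was copied to the ghost table before it was deleted from the source table, the row in the ghost table is removed when the DELETE from the binlog is applied.
  B: If the row was deleted before it was copied to the ghost table, it is never copied, and applying the DELETE from the binlog does not raise an error.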
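
The rules above mean the ghost table ends up matching the source table whichever of the two activities touches a row first. The sketch below checks this on a toy table. It is only an illustration: it uses Python's built-in sqlite3 as a stand-in for MySQL (INSERT OR IGNORE plays the role of INSERT IGNORE), and the table names, column names, and the EVENTS list are invented for the example.

```python
import sqlite3

# Changes made to the source table during the migration, as the binlog would report them.
EVENTS = [("UPDATE", 1, "a2"), ("DELETE", 2, None), ("INSERT", 3, "c")]


def run_dml_on_source(db):
    db.execute("UPDATE src SET v = 'a2' WHERE id = 1")
    db.execute("DELETE FROM src WHERE id = 2")
    db.execute("INSERT INTO src VALUES (3, 'c')")


def apply_binlog(db):
    for op, key, value in EVENTS:
        if op == "INSERT":
            # INSERT is rewritten as REPLACE INTO.
            db.execute("REPLACE INTO ghost VALUES (?, ?)", (key, value))
        elif op == "UPDATE":
            db.execute("UPDATE ghost SET v = ? WHERE id = ?", (value, key))
        else:
            # Deleting a row that was never copied is a harmless no-op.
            db.execute("DELETE FROM ghost WHERE id = ?", (key,))


def copy_rows(db):
    # Row copy skips rows that the binlog applier already wrote.
    db.execute("INSERT OR IGNORE INTO ghost SELECT id, v FROM src")


def migrate(copy_first):
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, v TEXT)")
    db.execute("CREATE TABLE ghost (id INTEGER PRIMARY KEY, v TEXT)")
    db.executemany("INSERT INTO src VALUES (?, ?)", [(1, "a"), (2, "b")])
    if copy_first:
        copy_rows(db)
        run_dml_on_source(db)
        apply_binlog(db)
    else:
        run_dml_on_source(db)
        apply_binlog(db)
        copy_rows(db)
    src = db.execute("SELECT id, v FROM src ORDER BY id").fetchall()
    ghost = db.execute("SELECT id, v FROM ghost ORDER BY id").fetchall()
    return src, ghost


for copy_first in (True, False):
    src, ghost = migrate(copy_first)
    print(copy_first, src == ghost)
```

Both orderings print True: when the copy runs first, the binlog events bring the copied rows up to date, and when the binlog runs first, the copy only fills in rows the ghost table does not yet have.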

 

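Running gh-ost from another server

Suppose there is a replication pair A1 --> A2, where A1 is the master and A2 is the replica, and a separate server B1 has gh-ost installed. A table on A1 can be altered from B1:
1. For the row copy, B1 first queries A1 for the minimum and maximum values, splits the range into batches on B1, and then sends commands from B1 to A1 to copy one small range at a time.
2. For binlog parsing, B1 sets itself up as a replica of A1, pulls the binlog from A1 to B1, parses it into SQL statements on B1, and sends those statements to A1 for execution.

Running gh-ost from another server moves a large volume of data between the database server and the server running the command, so network bandwidth and network stability have to be taken into account.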

 

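Unique indexes

If gh-ost is used to add a unique index, REPLACE INTO and INSERT IGNORE are both subject to the unique index on the ghost table. When the indexed columns contain duplicate values, data is lost.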
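
The data loss can be reproduced on a toy table. The sketch below is an illustration only: it uses Python's built-in sqlite3 as a stand-in for MySQL (INSERT OR IGNORE plays the role of INSERT IGNORE), and the table and column names are invented. It also shows a duplicate check that could be run before adding the index.

```python
import sqlite3


def demo():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE src (id INTEGER PRIMARY KEY, email TEXT)")
    # The ghost table already carries the new unique index.
    db.execute("CREATE TABLE ghost (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    db.executemany(
        "INSERT INTO src VALUES (?, ?)",
        [(1, "a@x.test"), (2, "a@x.test"), (3, "b@x.test")],
    )

    # Pre-check: which values would violate the new unique index?
    duplicates = db.execute(
        "SELECT email, COUNT(*) FROM src GROUP BY email HAVING COUNT(*) > 1"
    ).fetchall()

    # Row copy: the second 'a@x.test' row is silently skipped.
    db.execute("INSERT OR IGNORE INTO ghost SELECT id, email FROM src ORDER BY id")
    after_copy = [r[0] for r in db.execute("SELECT id FROM ghost ORDER BY id")]

    # Binlog apply: a new row with an existing email replaces the old row.
    db.execute("REPLACE INTO ghost VALUES (4, 'b@x.test')")
    after_replace = [r[0] for r in db.execute("SELECT id FROM ghost ORDER BY id")]
    return duplicates, after_copy, after_replace


duplicates, after_copy, after_replace = demo()
print(duplicates)     # values that appear more than once in the source table
print(after_copy)     # ids that survived the row copy
print(after_replace)  # ids left after the REPLACE INTO
```

Neither statement reports an error, which is why the loss is easy to miss. Running a duplicate check like the one above before the migration is one way to catch it.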

 

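How the rename works
With pt-osc or online DDL, the final rename is usually quick. But if a long-running query arrives while the schema change is in progress, the rename has to wait for a metadata lock (MDL), and at peak times that is a serious problem. So how does gh-ost handle it?

gh-ost relies on a MySQL behaviour: among all blocked requests, an atomic rename always has the highest priority.

gh-ost builds its cut-over on this. One connection locks the original table, and a second connection attempts the rename, which is blocked at that point. When the lock is released, the rename runs first, and the other blocked requests then go on to run against the new table.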

 

Common gh-ost parameters

-critical-load string
Comma delimited status-name=threshold, same format as --max-load. When status exceeds threshold, app panics and quits
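
The sketch below shows how the two thresholds relate: going over --max-load makes gh-ost throttle, and going over --critical-load makes it quit. It is a simplified illustration, not gh-ost's code. The function names parse_thresholds and decide are invented, and treating a value exactly equal to the threshold as still acceptable is an assumption of this sketch.

```python
def parse_thresholds(spec):
    """Parse 'Name=10,Other=20' into {'Name': 10, 'Other': 20}."""
    thresholds = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected status-name=threshold, got {item!r}")
        thresholds[name.strip()] = int(value)
    return thresholds


def decide(status, max_load, critical_load):
    """Return 'abort', 'throttle', or 'run' for one sample of status values."""
    # Critical load is checked first because it is the stronger reaction.
    if any(status.get(n, 0) > limit for n, limit in critical_load.items()):
        return "abort"
    if any(status.get(n, 0) > limit for n, limit in max_load.items()):
        return "throttle"
    return "run"


max_load = parse_thresholds("Threads_running=25")
critical_load = parse_thresholds("Threads_running=64")
for running in (10, 30, 80):
    print(running, decide({"Threads_running": running}, max_load, critical_load))
```

With these values, 10 running threads lets the migration proceed, 30 throttles it, and 80 aborts it.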

Whether to drop the old table once the migration completes.
-ok-to-drop-table
Shall the tool drop the old table at end of operation. DROPping tables can be a long locking operation, which is why I'm not doing it by default. I'm an online tool, yes?     
        
If an old table is found before the run starts, drop it or abort? The default is to abort.
-initially-drop-old-table
Drop a possibly existing OLD table (remains from a previous run?) before beginning operation. Default is to panic and abort if such table exists           

Does the port option apply only to host? Everywhere else the port has to be written out in full.
-port int
MySQL port (preferably a replica, not the master) (default 3306)

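gh-ost has no equivalent of pt-osc's recursion-method=processlist, so every replica that should be checked has to be listed with throttle-control-replicas. Note that the port option above applies only to the host option, so each replica must be written in full as host:port.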
-throttle-control-replicas string
List of replicas on which to check for lag; comma delimited. Example: myhost1.com:3306,myhost2.com,myhost3.com:3307
    	
Time to sleep after copying each chunk = nice-ratio * time spent copying the chunk. The default is 0, which means no sleep.
-nice-ratio float
force being 'nice', imply sleep time per chunk time; range: [0.0..100.0]. Example values: 0 is aggressive. 1.5: for every ms spend in a rowcopy chunk, spend 1.5ms sleeping immediately after    
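
The formula above is easy to turn into numbers. The sketch below is illustrative only; the function names are invented, and the chunk times are made-up inputs rather than measurements.

```python
def sleep_after_chunk(copy_seconds, nice_ratio):
    """Sleep time after one chunk: nice-ratio times the chunk's copy time."""
    if not 0.0 <= nice_ratio <= 100.0:
        raise ValueError("nice_ratio must be within [0.0, 100.0]")
    return copy_seconds * nice_ratio


def total_time(chunk_seconds, nice_ratio):
    """Copy time plus sleep time over a list of per-chunk copy times."""
    return sum(t + sleep_after_chunk(t, nice_ratio) for t in chunk_seconds)


print(sleep_after_chunk(2.0, 1.5))   # 3.0
print(total_time([1.0, 2.0], 0.0))   # 3.0
print(total_time([1.0, 2.0], 0.5))   # 4.5
```

In other words, a nice-ratio of r stretches the row-copy phase to roughly (1 + r) times its unthrottled duration.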
  
By default the socket file is not deleted, so a second run fails with an error saying that the socket file already exists.
-initially-drop-socket-file
Should gh-ost forcibly delete an existing socket file. Be careful: this might drop the socket file of a running migration!    	

-exact-rowcount
actually count table rows as opposed to estimate them (results in more accurate progress estimation)

 

gh-ost command template

./gh-ost \
--max-load=Threads_running=25 \
--critical-load=Threads_running=64 \
--chunk-size=1000 \
--throttle-control-replicas="test02:3306" \
--max-lag-millis=1500 \
--initially-drop-old-table \
--initially-drop-ghost-table \
--initially-drop-socket-file \
--ok-to-drop-table \
--conf="./my.cnf" \
--host="test02" \
--port=3306 \
--user="admin" \
--password="admin" \
--database="test" \
--table="test" \
--verbose \
--alter="drop index id1" \
--switch-to-rbr \
--allow-master-master \
--cut-over=default \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--execute

 

Reference
https://rj03hou.github.io/mysql/gh-ost/

Origin www.cnblogs.com/gaogao67/p/11210212.html