Foreword
While delivering a dedicated (private) cloud deployment, we recently ran into a thorny problem: a customer from two years ago needed to upgrade to the latest release. Because of the long gap, both the product code and the database schema had changed dramatically. The product code was under version management and had a clear upgrade path. The database was not managed as code, so it had no upgrade path, and the whole upgrade was very difficult. To make future upgrades go smoothly, we needed a unified database-as-code scheme.
What we needed to do was to change our mindset of how we treated our database. We had to stop treating it like some special artifact or some unique scenario, and we started looking at it through the same perspective that we were treating our web code.
State-based vs. migration-based
State-based and migration-based are the two common ways to manage a database as code. Each is described below.
State-based
In the state-based model, we only maintain the target state of the database. Each table, stored procedure, view, and trigger is saved as a separate SQL file that is a true representation of that database object's state. The scripts needed to upgrade the database are generated automatically by a tool, which greatly reduces maintenance cost.
However, this model does not handle data migration well, for example splitting the name column of a user table into first name and last name fields. Table data is often context-dependent, so the tool cannot make reliable assumptions about the data when it generates the upgrade script.
Migration-based
In the migration-based model, we maintain the change scripts that take the database from one version to the next ourselves. Compared with the state-based model, this increases maintenance cost and complexity, but it gives us more direct control over the migration process, so we can handle context-dependent scenarios such as data migration. And because changes are described imperatively, we can review them earlier. Representative tools for the migration-based model include Liquibase and Flyway.
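The name-splitting scenario is a case where a hand-written migration carries knowledge that a schema diff cannot. Below is a minimal sketch in Python, using SQLite as a stand-in database. The `users` table, the `split_name_migration` function, and the rule "first token is the first name" are all assumptions made for illustration.

```python
import sqlite3


def split_name_migration(conn):
    """Hypothetical migration: split users.name into first_name and last_name."""
    conn.execute("ALTER TABLE users ADD COLUMN first_name TEXT")
    conn.execute("ALTER TABLE users ADD COLUMN last_name TEXT")
    for user_id, name in conn.execute("SELECT id, name FROM users").fetchall():
        # Assumed business rule: first token is the first name, the rest is the last name.
        first, _, last = name.partition(" ")
        conn.execute(
            "UPDATE users SET first_name = ?, last_name = ? WHERE id = ?",
            (first, last, user_id),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Ada Lovelace",), ("Plato",)])
split_name_migration(conn)
rows = conn.execute("SELECT first_name, last_name FROM users ORDER BY id").fetchall()
print(rows)
```

A diff tool comparing the old and new table definitions would only see one column disappear and two appear. The splitting rule exists only in the migration itself.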
Introduction to Flyway
This chapter introduces Flyway, a representative migration-based tool. The version used in this article is 6.0.8.
What is Flyway
Flyway is an open source database migration tool that helps with both fresh deployments and incremental upgrades of a database. It has the following features:
- It can be embedded in an application or run as a standalone tool.
- It tracks the migrations that have already been executed.
- It executes new migrations.
- It validates the state of the database.
How Flyway works
Flyway works as follows:
- On the first run, it creates a metadata table named flyway_schema_history in the database, which records the migrations that have been executed.
- It scans the user-defined migration script directories and sorts the scripts by version number from low to high.
- It applies the migration scripts to the database in order, updating the metadata table as it goes.
- On the next incremental upgrade, Flyway uses the records in the metadata table to find the new migration scripts and executes them in order.
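The steps above can be mimicked with a toy runner. This is a sketch of the idea only, not Flyway's implementation: it uses SQLite, a two-column stand-in for the metadata table, and hypothetical script contents. `parse_name`, `version_key`, and `migrate` are names invented for this example.

```python
import sqlite3


def parse_name(filename):
    """Split 'V0.001__b.sql' into version '0.001' and description 'b'."""
    stem = filename[1:].rsplit(".sql", 1)[0]
    version, description = stem.split("__", 1)
    return version, description


def version_key(version):
    """Turn '2019.11.11.000' into (2019, 11, 11, 0) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def migrate(conn, scripts):
    """Apply scripts that are not yet recorded; return the versions applied."""
    # Simplified stand-in for the metadata table; the real one records more.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flyway_schema_history "
        "(version TEXT PRIMARY KEY, description TEXT)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM flyway_schema_history")}
    newly_applied = []
    for filename in sorted(scripts, key=lambda f: version_key(parse_name(f)[0])):
        version, description = parse_name(filename)
        if version in applied:
            continue  # already executed on an earlier run
        conn.executescript(scripts[filename])
        conn.execute(
            "INSERT INTO flyway_schema_history VALUES (?, ?)", (version, description)
        )
        conn.commit()
        newly_applied.append(version)
    return newly_applied


conn = sqlite3.connect(":memory:")
scripts = {  # hypothetical script contents keyed by file name
    "V0.001__b.sql": "ALTER TABLE car ADD COLUMN color TEXT;",
    "V0.000__a.sql": "CREATE TABLE car (id INTEGER PRIMARY KEY);",
}
first_run = migrate(conn, scripts)
scripts["V2019.11.11.000__alter_TB_car_add_column.sql"] = (
    "ALTER TABLE car ADD COLUMN price REAL;"
)
second_run = migrate(conn, scripts)
print(first_run, second_run)
```

The second call skips the two versions already recorded in the history table and applies only the new script, which is the incremental behavior described in the last step.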
Database-as-code scheme
Because we need to handle not only schema changes but also some data migration scenarios, we chose the migration-based model and use Flyway to implement database as code.
Migration script organization
To cover the public cloud, the private cloud, and upgrades from old private cloud versions, we designed the following directory structure for managing SQL migration scripts.
|--{db1}
|  |--flyway.conf
|  |--base_sql
|  |  |--V0.000__a.sql
|  |  |--V0.001__b.sql
|  |  |...
|  |  `--V0.025__z.sql
|  |--upgrade_legacy_private_cloud_sql
|  |  |--V2.000__create_TB_t1.sql
|  |  |--V2.001__alter_TB_a.sql
|  |  `--V2.002__TB_b_update.sql
|  |--upgrade_sql
|  |  |--V2019.11.11.000__alter_TB_b_add_column.sql
|  |  |--V2019.11.11.001__TB_c_insert.properties
|  |  |--V2019.11.11.001__TB_c_insert.sql
|  |  `--V2019.11.13.000__mix.sql
|--{db2}
|  |--flyway.conf
|  ...
|--common
|  `--procedure.sql
The structure is described below:
1. Each top-level directory corresponds to one database and contains that database's migration scripts and configuration.
2. The flyway.conf file in each database directory contains the database's connection, authentication, and baseline settings.
3. The base_sql subdirectory stores the existing schema, with version numbers in the format 0.xxx. Once finalized, the contents of this directory must not be changed.
4. The upgrade_legacy_private_cloud_sql subdirectory stores the migration scripts that take an old private cloud version to the new version, with version numbers in the format 2.xxx.
5. The upgrade_sql subdirectory stores all subsequent migration scripts for both the public cloud and the private cloud. Because the public cloud has no version numbers and private cloud versions are numbered differently, migration script versions here use the date as a prefix, in the format yyyy.mm.dd.index.
6. DML may differ between environments. In that case, the differing parts are written as placeholders in the migration script, and the actual values are read from a properties file with the same name as the script, for example V{yyyy.mm.dd.index}__xxx.properties. The properties files are stored in the baseline, where each environment can set its own values. At run time the file for the current environment is copied into the upgrade_sql directory.
7. The file common/procedure.sql contains stored procedures for changing indexes, columns, and keys. These stored procedures make the changes idempotent.
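The article does not say which tool fills in the placeholders of item 6. The sketch below only illustrates the idea in Python, assuming the `${name}` placeholder syntax (Flyway's default) and simple `key=value` properties files. The table `c`, the keys `region` and `quota`, and the helper names are hypothetical.

```python
from string import Template


def load_properties(text):
    """Parse simple 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def render(sql, properties_text):
    """Fill ${name} placeholders; substitute() raises KeyError if one has no value."""
    return Template(sql).substitute(load_properties(properties_text))


# Hypothetical contents of V2019.11.11.001__TB_c_insert.sql
sql = "INSERT INTO c (region, quota) VALUES ('${region}', ${quota});"
# Hypothetical per-environment versions of V2019.11.11.001__TB_c_insert.properties
public_cloud = "# public cloud\nregion=public\nquota=100\n"
private_cloud = "region=private\nquota=20\n"

public_sql = render(sql, public_cloud)
private_sql = render(sql, private_cloud)
print(public_sql)
print(private_sql)
```

Failing loudly on a missing key is a deliberate choice in this sketch: a script with an unfilled placeholder should not reach the database.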
Migration script granularity
In general, we recommend that one script contain only one kind of operation against one target. For example, Vxxx__TB_car_add_column.sql adds columns to the car table, and Vyyy__TB_car_insert.sql inserts data into the car table. This follows the single responsibility principle and greatly reduces the number and complexity of merge conflicts.
Execution flow
Based on the migration script layout described above, the execution flow for the different public cloud and private cloud scenarios is as follows:
Environment | Scenario | Execution flow |
---|---|---|
Public cloud | New deployment | 1. Run the migration scripts in base_sql. 2. Run the migration scripts in upgrade_sql. |
Public cloud | Subsequent upgrade | 1. Get the current version x. 2. Run, in order, the migration scripts in upgrade_sql whose version number is greater than x. |
Public cloud | Existing database upgrade | 1. Manually align the existing database with base_sql. 2. Set the baseline to 2000. 3. Run, in order, the migration scripts in upgrade_sql whose version number is greater than 2000. |
Private cloud | New deployment | 1. Run the migration scripts in base_sql. 2. Run the migration scripts in upgrade_sql. |
Private cloud | Subsequent upgrade | 1. Get the current version x. 2. Run, in order, the migration scripts in upgrade_sql whose version number is greater than x. |
Private cloud | Existing database upgrade | 1. Prepare the migration scripts that take the old private cloud version to the new version and put them in the upgrade_legacy_private_cloud_sql directory. 2. Set the baseline to 1. 3. Run, in order, the migration scripts in upgrade_legacy_private_cloud_sql and upgrade_sql whose version number is greater than 1. |
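The effect of the two baseline values can be checked with a small sketch. It assumes versions are compared part by part as integers, with missing parts treated as zero. The version lists mirror the example directory tree, and `version_key` and `plan` are hypothetical helpers.

```python
BASE = ["0.000", "0.001", "0.025"]
LEGACY = ["2.000", "2.001", "2.002"]
UPGRADE = ["2019.11.11.000", "2019.11.11.001", "2019.11.13.000"]


def version_key(version, width=4):
    """Parse '2.001' into (2, 1, 0, 0): integer parts, zero-padded to `width`."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def plan(candidates, after=None):
    """Return the versions to run, in order, skipping anything at or below `after`."""
    ordered = sorted(candidates, key=version_key)
    if after is None:  # new deployment: nothing has been applied yet
        return ordered
    return [v for v in ordered if version_key(v) > version_key(after)]


new_deployment = plan(BASE + UPGRADE)
subsequent_upgrade = plan(BASE + UPGRADE, after="2019.11.11.000")
public_existing = plan(BASE + UPGRADE, after="2000")
private_existing = plan(BASE + LEGACY + UPGRADE, after="1")
print(public_existing)
print(private_existing)
```

A baseline of 2000 sits above every 0.xxx and 2.xxx version but below the date-based versions, so only upgrade_sql runs. A baseline of 1 skips base_sql and lets both the legacy scripts and upgrade_sql run.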
Idempotency practices
Ideally, each migration script runs only once per database. But if a particular migration fails, you may need to re-run migration steps that already succeeded in order to restore the database to the desired state. Writing migration scripts in an idempotent way is very helpful in that situation. Here we summarize best practices for making different kinds of DDL and DML idempotent.
SQL Type | Object | Action | SQL script naming convention | Best Practices |
---|---|---|---|---|
DDL | Table | Create Table | V{yyyy.mm.dd.index}__create_TB_{table_name}.sql Example: V2019.11.08.000__create_TB_car.sql | CREATE TABLE IF NOT EXISTS {table_name}; |
DDL | Table | Drop Table | V{yyyy.mm.dd.index}__drop_TB_{table_name}.sql Example: V2019.11.08.000__drop_TB_car.sql | DROP TABLE IF EXISTS {table_name}; |
DDL | Column | Add Column | V{yyyy.mm.dd.index}__alter_TB_{table_name}_add_column.sql Example: V2019.11.08.000__alter_TB_car_add_column.sql | Perform the operation only if the target column does not exist (wrapped in a stored procedure). |
DDL | Column | Drop Column | V{yyyy.mm.dd.index}__alter_TB_{table_name}_drop_column.sql Example: V2019.11.08.000__alter_TB_car_drop_column.sql | Perform the operation only if the target column exists (wrapped in a stored procedure). |
DDL | Column | Change Column | V{yyyy.mm.dd.index}__alter_TB_{table_name}_change_column.sql Example: V2019.11.08.000__alter_TB_car_change_column.sql | Perform the operation only if the target column exists (wrapped in a stored procedure). |
DDL | Keys and Indexes | Add | V{yyyy.mm.dd.index}__alter_TB_{table_name}_add_index.sql Example: V2019.11.08.000__alter_TB_car_add_index.sql | Perform the operation only if the target key or index does not exist (wrapped in a stored procedure). |
DDL | Keys and Indexes | Drop | V{yyyy.mm.dd.index}__alter_TB_{table_name}_drop_index.sql Example: V2019.11.08.000__alter_TB_car_drop_index.sql | Perform the operation only if the target key or index exists (wrapped in a stored procedure). |
DML | Row | Insert | V{yyyy.mm.dd.index}__TB_{table_name}_insert.sql Example: V2019.11.08.001__TB_car_insert.sql | |
DML | Row | Update | V{yyyy.mm.dd.index}__TB_{table_name}_update.sql Example: V2019.11.08.001__TB_car_update.sql | |
DML | Row | Delete | V{yyyy.mm.dd.index}__TB_{table_name}_delete.sql Example: V2019.11.08.001__TB_car_delete.sql | Naturally idempotent. |
DML | Row | Multiple tables, multiple actions | V{yyyy.mm.dd.index}__mix.sql Example: V2019.11.08.001__mix.sql | If the operations need to be transactional, the SQL can be put in a single mix script. |
Taking a column change as an example, the change is wrapped in the following stored procedure.
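DELIMITER $$
-- Change a column only if it exists, so that the migration can be re-run safely.
CREATE PROCEDURE `SAFE_CHANGE_COLUMN`(IN i_table_name VARCHAR(128), IN i_col_name VARCHAR(128), IN i_col_def VARCHAR(256))
BEGIN
    SET @tableName = i_table_name;
    SET @colName = i_col_name;
    SET @colDef = i_col_def;
    SET @colExists = 0;
    -- Restrict the lookup to the current database so a same-named table elsewhere is not matched.
    SELECT 1 INTO @colExists FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = @tableName AND COLUMN_NAME = @colName LIMIT 1;
    IF @colExists THEN
        -- i_col_def carries the new column name followed by its definition, as CHANGE COLUMN expects.
        SET @query = CONCAT('ALTER TABLE ', @tableName, ' CHANGE COLUMN ', @colName, ' ', @colDef);
        PREPARE stmt FROM @query;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    END IF;
END$$
DELIMITER ;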
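The same check-then-alter pattern applies to the "Add Column" row of the table. As a rough illustration only, here it is in Python with SQLite standing in for MySQL. `safe_add_column` is a hypothetical helper and is not part of common/procedure.sql.

```python
import sqlite3


def safe_add_column(conn, table, column, definition):
    """Add `column` to `table` only if it is missing; return True if it was added."""
    # Identifiers are interpolated directly, so pass trusted names only.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False  # already there: re-running the migration is a no-op
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {definition}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (id INTEGER PRIMARY KEY)")
first = safe_add_column(conn, "car", "color", "TEXT")
second = safe_add_column(conn, "car", "color", "TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(car)")]
print(first, second, columns)
```

Running the helper twice leaves the table in the same state as running it once, which is the property the stored procedures are meant to provide.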
CI/CD pipeline
We store the database migration scripts and the application code in the same code repository, and they share the same CI/CD process. This best fits the collaboration, testing, fast feedback, and continuous improvement that DevOps advocates, and lets the product ship faster, more often, and more reliably.
Summary
With a database-as-code scheme, maintaining and upgrading the database becomes easy.
- However long the gap, an upgrade can be handled calmly, because there is a clear upgrade path between any two versions.
- Database schema changes and data migrations become auditable.
- Database migrations share the CI/CD pipeline with the application code, which speeds up product delivery.