Skip to content

Sharding

Yong Zhu edited this page May 7, 2020 · 2 revisions
Master-slave separation

jSqlBox supports master-slave separation, DbContext and related configuration methods are:

setSlaves(DbContext[]) sets a set of slave libraries
setMasterSlaveOption(SqlOption) Sets the master-slave library access policy

For each DbContext instance, you can set a set of slave libraries and set their access policies with the setMasterSlaveOption method. The parameters have the following options:

SqlOption.USE_MASTER read and write are in the current main library
The SqlOption.USE_AUTO write operation is only on the current main library. The read operation is randomly selected from the library, but when the transaction is started (the TinyTx transaction or the Spring transaction in jTransactions), the read operation is also on the main library.
SqlOption.USE_BOTH write operation in the current main library and all slave libraries (note that all slave libraries!), read operations only on the main library
SqlOption.USE_SLAVE write operation in all slave libraries (note that all slave libraries!), read operations are randomly selected from the library

If the slaves (setslaves) are configured, the default option for jSqlBox is the USER_AUTO policy.

Note that the main library and slave library mentioned above are relative to the current DbContext instance, because each DbContext can only be configured with one data source, which corresponds to a database. The library here is equivalent to the DbContext instance.

In addition, several options of SqlOption can also be passed directly to the SQL method of jSqlBox as a parameter, forcing to control which strategy each SQL method runs under, as some examples (source code see MasterSlaveTest.java for unit test), usually option Use static introduction:

//Auto mode, written on the main library
ctx.pUpdate("update TheUser set name=? where id=3", "NewValue");

//Auto mode, randomly read a slave library
TheUser u1 = new TheUser().useContext(ctx).put("id", 3L).load();

//force mode, written on all slave libraries
ctx.pUpdate("update TheUser set name=? where id=3", USE_SLAVE, "NewValue");

/ / Force mode, read the main library
TheUser u1 = new TheUser().useContext(ctx).put("id", 3L).load(USE_MASTER);

/ / Force mode, randomly read a slave library
TheUser u2 = user.loadById(TheUser.class, 3L, USE_SLAVE)
}

Regardless of the policy set by setMasterSlaveOption in the DbContext, the mandatory mode has the highest execution right regardless of whether it is in a transaction.

Sharding

jSqlBox supports sub-database sub-tables, which is usually called Sharding, but it is not intelligent. If a SQL execution cross-database or cross-table phenomenon, it will not automatically perform complex SQL intelligent analysis like some Sharding tools. , SQL spin-off execution, automatic summary of query results, but simply throw a runtime exception. Because the author believes that a SQL cross-table or even cross-database operations, it is not normal, business personnel should try to avoid designing such SQL as much as possible. If the business personnel know that an operation must be operated across libraries or across tables, which is a business requirement and cannot be avoided, then the business personnel must also be able to manually perform SQL splitting, summary operations, and even performance optimization for such services. At present, some Sharding tools, such as Sharding-JDBC, have the functions of automatic SQL splitting and summarization, but there are also many problems, such as excessive complexity, no support for native SQL, support for only one database, and control of transaction control. It takes a heavy price for a problem that is basically not a problem. I personally think it is unnecessary. It is enough to provide some basic Sharding means at the DAO layer. Sharding at the DAO layer or at the database proxy layer has its pros and cons. For a comparison of this, you can also refer to [this article] (https://github.com/QNJR-GROUP/sharding-method). To take a step back, even if the sharding tool of the database proxy layer does a good job, there is no conflict with jSqlBox, which is a tool for doing sharding in the DAO layer, because they are at different levels, one is the driver layer and the other is the DAO layer. Developers can choose their own favorite way.

The configuration methods related to the sub-library table in DbContext are:

setMasters(DbContext[]) sets a set of main libraries
setShardingTools(ShardingTool[]) Sets a set of Sharding tools
setSnowflakeCreator(SnowflakeCreator) sets a distributed primary key generator

In addition, there are three entity annotations @ShardTable, @ShardDatabase, @Snowflake, ShardTB and ShardDB two methods for calling in SQL, as described below:

分分分表 Step 1: Add annotations to entity fields
Public static class TheUser extends ActiveRecord {
        @ShardTable({ "MOD", "8" })
        @Snowflake
        @Id
        Private Long id;
 
        @ShardDatabase({ "RANGE", "10" })
        @Id
        Private Long databaseId;
        ...

Description: @ShardTable({ "MOD", "8" }) indicates that id is a sub-table key, and the sub-table strategy is based on 8 for example. For example, when the id value is 8, the corresponding database table name is theuser_0, when the id is At 9 o'clock, corresponding to the user_1 database table, note that as long as the sub-table is used, all table names must be the naming format of "table name" + underscore + serial number. @ShardDabase({ "RANGE", "10" }) Annotation indicates that databaseId is a sub-library key, and the sub-library strategy is in the range of 10, for example, when the databaseId value is 0 to 9, it is stored in the 0th database. For 10 to 19, it is stored in the second database... and so on. The value of the @Snowflake tag current field is a Snowflake distributed key, which is of type Long and its value will be filled in by jSqlBox, see below.

@ShardTable annotations and @ShardDatabase annotations can be superimposed on the same column or used separately, but must be placed on the primary key column. If single or two are used at the same time, it is usually recommended to put them on a unique primary key column. This avoids the appearance of composite primary keys and helps simplify programming.

分库分表Step 2: Configuring the main library array

Configure the main library array, a library corresponds to a data source, that is, corresponding to a DbContext instance, form an array structure, and then each DbContext instance calls setMasters () method to assign this array to itself:

DbContext[] masters = new DbContext[MASTER_DATABASE_QTY];
For (int i = 0; i < MASTER_DATABASE_QTY; i++) {
    Masters[i] = new DbContext(dataSource[i]);
    Masters[i].setMasters(masters);
    Masters[i].setSnowflakeCreator(new SnowflakeCreator(5, 5, 0, i));
}

TableModel model = TableModelUtils.entity2Model(TheUser.class);
For (int i = 0; i < MASTER_DATABASE_QTY; i++) {
    For (int j = 0; j < TABLE_QTY; j++) {//By the way, the empty table is built.
        model.setTableName("TheUser" + "_" + j);//Change the table name
        For (String ddl : masters[i].toCreateDDL(model))
            Masters[i].iExecute(ddl);//Build a table
    }
}
DbContext.setGlobalDbContext(masters[0]);

The array is ordered, that is, if you need to find the number 5 library, it is equivalent to any of the arrays in the array call the following method: masters[arbitrary].getMasters[5] In the above example, the empty table is built by the way. The last line is to select a main library as the current library. The ActiveRecord operation requires a default library. (see the ActiveRecord section)

Each main library can also be configured with a bunch of slave libraries. Each master library and slave library can create a heap table table_0, table_1..., and some students may ask, if these master and slave libraries hang up, how can they be hanged? do? jSqlBox does not solve this problem. As a lightweight DAO tool, it only provides the most basic platform for building such a structure. How to achieve fault tolerance must be through business methods and secondary development (such as custom fault tolerance). The master-slave, sub-library, and sub-table logic), or the exception is passed to the upper-level architecture such as micro-services to solve.

Step 3: Normally use the pintea method in the program, the sub-database table rules will automatically take effect
For (i:=0;i<100;i++)
 New TheUser().put("databaseId", i).insert();

Run it, TheUser is stored in various databases and sub-tables according to the Sharding rules described in the first step.

The above is the ActiveRecord usage in the pintea system. The following is an example of using the SQL method in the pintea system:

Masters[2].iExecute(TheUser.class, "insert into ", shardTB(10), shardDB(3),
                "(id, name, databaseId) values(?,?,?)", param(10, "u1", 3), USE_BOTH, new PrintSqlHandler());
Assert.assertEquals(1, masters[2].iQueryForLongValue(TheUser.class, "select count(*) from ", shardTB(10),
                shardDB(3), USE_SLAVE, new PrintSqlHandler()));
Assert.assertEquals(1,
                Masters[2].iQueryForLongValue(TheUser.class, "select count(*) from ", shardTB(10), shardDB(3)));
Masters[2].iExecute(TheUser.class, "insert into ", shard(3), " (id, name) values(?,?)", param(10, "u1"),
                USE_BOTH, new PrintSqlHandler());

The ShardTB method can return the table name corresponding to the current entity (that is, a string) according to the given table key value and the part table policy. The ShardDB method can return the library corresponding to the current entity (that is, the DbContext instance) according to the given database key value and the sub-database strategy. When a DbContext instance appears in the Sql method, it will force the execution of the SQL execution right. The purpose of the sub-library.

Starting with version 2.0.4, when @ShardTable and @ShardDatabase are marked on the same single primary key, you can use the Shard (primary key value) method, which returns both the table name and the library instance. The parameters of the above three methods are different according to the sub-database. When calculating, the parameters can receive Collection, array, and multiple parameters, but it should be noted that if multiple table names or multiple libraries are calculated, they will be directly thrown. A runtime exception, because jSqlBox does not support automatic cross-database, cross-table summary SQL results, which requires programmers to manually cross-database cross-table SQL summary, or try to avoid this kind of operation in business design.

Configuration of sub-library table

Now let's go back and talk about the two configuration methods in the DbContext: setShardingTools(ShardingTool[]) Used to set a set of Sharding tools, for @ShardTable({ "MOD", "8" }) and @ShardDatabase({ "RANGE", "10" }) These two annotations are provided by jSqlBox with default An instance of the ShardingTool interface to handle. However, jSqlBox only supports the simplest MOD and RANGE strategies. If you want to add other user-defined sharding strategies, such as @ShardTable({ "Unknow", "8" }), you must call the setShardingTools method. To configure your own ShardingTool, it can accept a ShardingTool array to handle different strategies. Note that the setShardingTools method will clear the strategy that comes with jSqlBox. If you want to keep the two policies that come with it, you must set it like this: ctx.setShardingTools(new ShardingModTool(), new ShardingRangeTool(), new custom strategy 1(), new custom strategy 2()...}

setSnowflakeCreator method to set distributed primary key generator jSqlBox comes with a distributed primary key generator named SnowflakeCreator, which has four build parameters, meaning the data center digits (09), workstation digits (09), data center ID (0511). ), workstation ID (0511), the sum of the first two parameters must be 10, and the sum of the last two parameters must be less than 1024. For more information about the Snowflake algorithm, see [here] (https://www.cnblogs.com/relucent/p/4955340.html), because it marks the unique ID of the workstation, so the SnowflakeCreator of each DbContextcan not be the same, only the following method can be called at runtime to dynamically configure a distribution for each main library at runtime. Primary key generator: Masters[i].setSnowflakeCreator(new SnowflakeCreator(5, 5, 0, i)); The advantage of SnowflakeCreator distributed primary key compared with UUID is that it is a long type value. If it is carefully maintained, the machine id of each application is guaranteed to be different, and the time of each machine is not called back, it can be implemented in the whole distributed environment. The same id will not appear.

The source code corresponding to the above sub-table content is in the two files ShardingModToolTest.java and ShardingRangeToolTest.java.

Transaction in sharding

The table Sharding because all the tables are in one library, there is no transaction trouble. In the case of sub-databases, especially in the case of multi-master databases(not the case of read-write separation), there is a problem of distributed transactions. For distributed transactions, the general principle is to avoid distributed transactions in the business design, such as storing the relevant records of the same user in the same database.

In the demo\jsqlbox-xa-atomikos directory, there is a demonstration of distributed transactions, mainly using Atomikos and Spring support for XA transactions, demonstrating the use of jSqlBox's library combined with distributed XA transactions. The XA transaction is based on two-phase commit and needs to be supported by the database itself. If the concurrency is large, the lock table conflict probability becomes large, which may cause performance problems. Therefore, the XA transaction is more suitable for small projects.

TCC distributed transactions are based on the principle of “first commit-post-compensation”. Both theoretically and practically, the performance exceeds the XA transaction, but the implementation is more complicated, and often requires a dedicated database (such as Alibaba Cloud's GTS depends on a proprietary database). At present, the open source community does not have a mature ready-made solution, and the author is also in the exploration stage.

Clone this wiki locally