-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
MDEV-36733 Clone: Support Local Clone #4248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
INSTALL PLUGIN clone SONAME 'mysql_clone.so'; CLONE LOCAL DATA DIRECTORY [=] 'clone_dir'; CLONE INSTANCE FROM 'user'@'host':port IDENTIFIED BY 'password' [DATA DIRECTORY [=] 'clone_dir'] [REQUIRE [NO] SSL]; UNINSTALL PLUGIN CLONE;
…tion is in progress. - Added test case in redo_log_resize
dr-m
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still a draft, so I am posting just some superficial comments.
| case DB_ABORT_INCOMPLETE_CLONE: | ||
| return("Incomplete cloned data directory."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make use of DB_INTERRUPTED instead of introducing a new, rarely used, error code?
storage/innobase/ut/ut0ut.cc
Outdated
| void ut_format_byte_value(uint64_t data_bytes, std::string &data_str) | ||
| { | ||
| int unit_sz= 1024; | ||
| auto exp = static_cast<int>( | ||
| (data_bytes == 0) ? 0 : std::log(data_bytes) / std::log(unit_sz)); | ||
| auto data_value=static_cast<double>(data_bytes) / std::pow(unit_sz, exp); | ||
|
|
||
| char unit[]= " KMGTPE"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /*************************************************************//** | ||
| Report a failed assertion. */ | ||
|
|
||
| static std::function<void()> assert_callback; | ||
|
|
||
| void ut_set_assert_callback(std::function<void()> &callback) { | ||
| assert_callback = callback; | ||
| } | ||
|
|
||
| ATTRIBUTE_NORETURN | ||
| void | ||
| ut_dbg_assertion_failed( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is "hijacking" the Doxygen comment of ut_dbg_assertion_failed(). Furthermore, we do not seem to need this indirection.
Instead, we could just invoke clone_files_fatal_error() whenever needed. What is the purpose of that function? Why would we want to invoke it on ut_a or ut_error, but not on any fatal signal? Could clone_files_fatal_error be registered as part of the signal handler? What is its purpose?
| Clone_notify notifier(Clone_notify::Type::SPACE_UNDO_TRUNCATE, | ||
| UINT32_MAX, true); | ||
| if (notifier.failed()) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reduce the scope of this object?
if (Clone_notify(Clone_notify::Type::SPACE_UNDO_TRUNCATE,
UINT32_MAX, true).failed())
// some comment that explains what is going on
return;There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that Clone_notify is a RAII object and that we probably want to block concurrent clone activity while the undo log truncation is in progress. Some comments and debug assertions would be helpful:
/* Prevent innodb_undo_log_truncate=ON conflict with clone activity. */
Clone_notify notifier(Clone_notify::Type::SPACE_UNDO_TRUNCATE,
UINT32_MAX, true);
if (ut_d(int err=) notifier.get_error())
{
ut_ad(err == ER_CLONE_IN_PROGRESS);
return;
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Come to think of it, do we actually need to block a concurrent clone operation? I understand that in MySQL 5.7 and possibly also later versions, innodb_undo_log_truncate=ON will execute a log checkpoint and create a separate recovery log file. The undo tablespace truncation is just using regular write-ahead logging and some special logic in recv_sys_t::trim(). Is there any reason why this would not work when applying the log on the cloned files? Can we remove this code altogether here?
dr-m
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be quite a bit of dead code included. I think that we need to remove or at least #if 0 it out.
https://www.theregister.com/2025/08/30/programmers_watch_your_weight/
https://www.nongnu.org/lzip/the_emperors_old_clothes.txt
You include only those features which you know to be needed
| /** @file include/clone0api.h | ||
| Innodb Clone Interface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s usually spelled InnoDB.
| #ifndef CLONE_API_INCLUDE | ||
| #define CLONE_API_INCLUDE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, let us use #pragma once.
| #include "univ.i" | ||
| #ifndef UNIV_HOTBACKUP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UNIV_HOTBACKUP should have been removed in 63574f1 (MDEV-11690).
Do we really have to include univ.i in this code?
| #include "handler.h" | ||
| #include <fil0fil.h> | ||
|
|
||
| using space_id_t = decltype(fil_space_t::id); | ||
| using page_no_t = uint32_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really have to include both headers in this code? The fil0fil.h only seems to be needed for this definition of space_id_t. Starting with ca501ff we can simply use uint32_t.
| /** Notification type. Currently used by various DDL commands. */ | ||
| enum class Type { | ||
| /* Space is being created. */ | ||
| SPACE_CREATE, | ||
| /* Space is being dropped. */ | ||
| SPACE_DROP, | ||
| /* Space is being renamed. */ | ||
| SPACE_RENAME, | ||
| /* Space is being discarded or imported. */ | ||
| SPACE_IMPORT, | ||
| /* Space encryption property is altered. */ | ||
| SPACE_ALTER_ENCRYPT, | ||
| /* Space encryption property of general tablespace is altered. */ | ||
| SPACE_ALTER_ENCRYPT_GENERAL, | ||
| /* Space encryption flags of general tablespace are altered. */ | ||
| SPACE_ALTER_ENCRYPT_GENERAL_FLAGS, | ||
| /* In place Alter general notification. */ | ||
| SPACE_ALTER_INPLACE, | ||
| /* Inplace Alter bulk operation. */ | ||
| SPACE_ALTER_INPLACE_BULK, | ||
| /* Special consideration is needed for UNDO as these DDLs | ||
| don't use DDL log and needs special consideration during recovery. */ | ||
| SPACE_UNDO_TRUNCATE, | ||
| /* Redo log resizing */ | ||
| SYSTEM_REDO_RESIZE | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enum constants are missing Doxygen-style comments (which start with /** instead of /*). Furthermore, only the last 2 types are actually being used at the moment. I would suggest to add #if 0 around any other constants, or removing this parameter for now altogether.
I would actually suggest removing or #if 0’ing this enum class Type altogether for now because the SYSTEM_REDO_RESIZE can be quite different from anything else, not related to any particular tablespace. I see that there is quite a bit of dead code in Clone_notify::Clone_notify(). Anything after the following is currently unreachable:
if (type == Type::SYSTEM_REDO_RESIZE || type == Type::SPACE_IMPORT ||
type == Type::SPACE_UNDO_TRUNCATE)
{
// …
return;
}All parameters of Clone_notify::Clone_notify() as well as the related data members can (and should) be removed for now.
| /** Notification wait type set. */ | ||
| enum class Wait_at { | ||
| /* Clone doesn't need to wait. */ | ||
| NONE, | ||
| /* Clone needs to wait before entering. */ | ||
| ENTER, | ||
| /* Clone needs to wait before state change. */ | ||
| STATE_CHANGE, | ||
| /* Clone needs to abort. */ | ||
| ABORT | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each value is missing Doxygen comments. Are any other values than ABORT actually assigned for now? If not, the Wait_at as well as m_wait need to be #if 0’d or removed.
| /** Tablespace ID for which notification is sent. */ | ||
| space_id_t m_space_id; | ||
|
|
||
| /** Notification type. */ | ||
| Type m_type; | ||
|
|
||
| /** Wait type set. */ | ||
| Wait_at m_wait; | ||
|
|
||
| /** Blocked clone state if clone is blocked. */ | ||
| uint32_t m_blocked_state; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like all these can be #if 0’d out. m_blocked_state is being never assigned, because Clone_notify is only being constructed with type == Type::SYSTEM_REDO_RESIZE or type == Type::SPACE_UNDO_TRUNCATE, and that case will assign m_wait= Wait_at::ABORT (unless the first component returned by clone_sys->check_active_clone() holds).
| std::tuple<bool, Clone_Handle *> Clone_Sys::check_active_clone() { | ||
| mysql_mutex_assert_owner(&m_clone_sys_mutex); | ||
|
|
||
| bool active_clone = false; | ||
| Clone_Handle *active_handle = nullptr; | ||
|
|
||
| /* Check for active clone operations. */ | ||
| for (int idx = 0; idx < CLONE_ARR_SIZE; idx++) { | ||
| auto clone_hdl = m_clone_arr[idx]; | ||
|
|
||
| if (clone_hdl != nullptr && clone_hdl->is_copy_clone()) { | ||
| active_clone = true; | ||
| active_handle = clone_hdl; | ||
| break; | ||
| } | ||
| } | ||
| return std::make_tuple(active_clone, active_handle); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first component of the tuple is completely redundant, and it will incur runtime overhead. Please let us just return a pointer:
Clone_handle *Clone_sys::check_active_clone()
{
mysql_mutex_assert_owner(&m_clone_sys_mutex);
for (Clone_handle *h: m_clone_arr)
if (h && h->is_copy_clone())
return h;
return nullptr;
}| bool Clone_Sys::check_active_clone(bool print_alert) { | ||
| bool active_clone = false; | ||
| std::tie(active_clone, std::ignore) = check_active_clone(); | ||
|
|
||
| if (active_clone && print_alert) { | ||
| /* purecov: begin inspected */ | ||
| ib::info() << "DDL waiting for CLONE to abort"; | ||
| /* purecov: end */ | ||
| } | ||
| return (active_clone); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like mostly dead code; we’re always invoking it with print_alert=false.
| /** Check if any active clone is running. | ||
| @param[in] print_alert print alert message | ||
| @return true, if concurrent clone in progress */ | ||
| bool check_active_clone(bool print_alert); | ||
|
|
||
| /** Check if any active clone is running. | ||
| @return (true, handle) if concurrent clone in progress */ | ||
| std::tuple<bool, Clone_Handle *> check_active_clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the first function is only being invoked with print_alert=false, we can remove this function altogether, and just invoke the more complex check_active_clone(), which must be revised:
/**
@return handle of active clone
@retval nullptr if no clone is in progress */
Clone_handle *check_active_clone();
dr-m
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that part of the code that is currently in storage/innobase/include/chunk* and storage/innobase/chunkcould be better be a part ofplugin/clone. I still don’t have a good overview of everything. I think that storage/maria/ma_clone.cc`, which I did not read yet, should give an idea of what to except from a minimal storage engine implementation.
| /** Directory under data directory for all clone status files. */ | ||
| #define CLONE_FILES_DIR OS_FILE_PREFIX "clone" OS_PATH_SEPARATOR_STR | ||
|
|
||
| /** Clone in progress file name length. */ | ||
| const size_t CLONE_INNODB_FILE_LEN = 64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These feel like unnecessary obfuscation. The const should also cause run-time overhead; constexpr would be better. It is not clear which file name is being referred to, and whether the name would include a CLONE_FILES_DIR prefix.
The definition of CLONE_FILES_DIR is overly complex. This patch is introducing OS_FILE_PREFIX in a different header file, as well as reintroducing OS_PATH_SEPARATOR and OS_PATH_SEPARATOR_STR, which had been removed for a good reason in cf552f5.
It would be much shorter and more obvious to just write something like
extern const char clone_files_dir[];and define it in clone/clone0api.cc (which has 2 references to the constant) or clone/clone0clone.cc (1 reference):
const char clone_files_dir[]{"#clone/"}
I have a feeling that Clone_Handle should actually belong to the clone plugin, and not to InnoDB. We would want to minimize the overhead when the clone plugin is not being loaded.
| #ifdef UNIV_DEBUG | ||
| /** Clone simulate recovery error file name. */ | ||
| const char CLONE_INNODB_RECOVERY_CRASH_POINT[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "status_crash_point"; | ||
| #endif | ||
|
|
||
| /** Clone in progress file name. */ | ||
| const char CLONE_INNODB_IN_PROGRESS_FILE[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "status_in_progress"; | ||
|
|
||
| /** Clone error file name. */ | ||
| const char CLONE_INNODB_ERROR_FILE[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "status_error"; | ||
|
|
||
| /** Clone fix up file name. Present when clone needs table fix up. */ | ||
| const char CLONE_INNODB_FIXUP_FILE[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "status_fix"; | ||
|
|
||
| /** Clone recovery status. */ | ||
| const char CLONE_INNODB_RECOVERY_FILE[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "status_recovery"; | ||
|
|
||
| /** Clone file name for list of files cloned in place. */ | ||
| const char CLONE_INNODB_NEW_FILES[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "new_files"; | ||
|
|
||
| /** Clone file name for list of files to be replaced. */ | ||
| const char CLONE_INNODB_REPLACED_FILES[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "replace_files"; | ||
|
|
||
| /** Clone file name for list of old files to be removed. */ | ||
| const char CLONE_INNODB_OLD_FILES[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "old_files"; | ||
|
|
||
| /** Clone file name for list of temp files renamed by ddl. */ | ||
| const char CLONE_INNODB_DDL_FILES[] = | ||
| CLONE_FILES_DIR OS_FILE_PREFIX "ddl_files"; | ||
|
|
||
| /** Clone file extension for files to be replaced. */ | ||
| const char CLONE_INNODB_REPLACED_FILE_EXTN[] = "." OS_FILE_PREFIX "clone"; | ||
|
|
||
| /** Clone file extension for saved old files. */ | ||
| const char CLONE_INNODB_SAVED_FILE_EXTN[] = "." OS_FILE_PREFIX "clone_save"; | ||
|
|
||
| /** Clone file extension for temporary renamed file. */ | ||
| const char CLONE_INNODB_DDL_FILE_EXTN[] = "." OS_FILE_PREFIX "clone_ddl"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these had better be declared extern here and defined somewhere else. Most of these seem to be specific to clone0api.cc. It would be cleaner to declare these as static data members of an object, say, Clone_Sys or Clone_Handle.
Even better, if clone0api.cc exposed some interfaces to clone0apply.cc and clone0snapshot.cc, then only clone0api.cc would have to know about those file names.
For the _EXTN constants, it could be cleaner to try to use suffixes that are already in use by mariadb-backup: .new, .del, .ren. Those suffixes should avoid a risk of exceeding the maximum file name length limit, compared to the .ibd suffix that the being-copied server is using. The limit is only 255 bytes on Linux extfs and its derivatives.
| using Clone_Msec = std::chrono::milliseconds; | ||
| using Clone_Sec = std::chrono::seconds; | ||
| using Clone_Min = std::chrono::minutes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like unnecessary obfuscation. Why can’t we just use those longer names in place of these aliases?
| /** Default sleep time while waiting: 100 ms */ | ||
| const Clone_Msec CLONE_DEF_SLEEP{100}; | ||
|
|
||
| /** Default alert interval in multiple of sleep time: 5 seconds */ | ||
| const Clone_Sec CLONE_DEF_ALERT_INTERVAL{5}; | ||
|
|
||
| /** Default timeout in multiple of sleep time: 30 minutes */ | ||
| const Clone_Min CLONE_DEF_TIMEOUT{30}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these are only used in the wrapper Clone_Sys::wait_default. I think that it would be clearer to just use the literals directly.
| /** Clone system global */ | ||
| extern Clone_Sys *clone_sys; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Singleton objects are best allocated statically, like we do for trx_sys, dict_sys, lock_sys, buf_pool, recv_sys.
The functions clone_init() and clone_free() should be replaced with Clone_Sys::init() and Clone_Sys::free().
| /** If all innodb tablespaces are initialized. */ | ||
| std::atomic<bool> m_space_initialized; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unclear why this member is needed. Can innodb_clone_begin() be invoked several times, "incrementally"?
| /** Clone system mutex */ | ||
| mysql_mutex_t m_clone_sys_mutex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which data members or operations does this protect?
| /** Map for current block number for unfinished chunks. Used during | ||
| restart from incomplete clone operation. */ | ||
| using Chunk_Map = std::map<uint32_t, uint32_t>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is an incomplete clone operation and how does the restart happen? Could the comment mention some relevant function names or SQL statements? What is protecting the Chunk_Info::m_incomplete_chunks? Why do we have this alias at all?
| /** Bitmap for completed chunks in current state */ | ||
| class Chnunk_Bitmap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a chunk in this context? The class name is mistyped: it should be Chunk_Bitmap.
| operator bool() const { | ||
| auto &val = m_bitmap_ref[m_map_index]; | ||
|
|
||
| if ((val & m_bit_mask) == 0) { | ||
| return (false); | ||
| } | ||
|
|
||
| return (true); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a rather complex way of writing
operator bool() const { return m_bitmap_ref[m_map_index] & m_bit_mask; }The entire Bitmap_Operator_Impl seems to a potential pessimization. I didn’t check the generated code yet. But, I think that we should consider using the existing template<uint> class Bitmap, instead of introducing a reimplementation of something similar.
| /** Clone plugin name */ | ||
| std::string m_plugin_name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see any reason for this to be anything else than const char *const. It is being copied from compile-time constant strings. Why don’t we just have a pointer to the compile-time constant?
Is this the name of plugin/clone? There is only one implementation, right? Do we really need this data member at this point of time?
| /** True if clone provisioning in progress. */ | ||
| static std::atomic<int> s_provision_in_progress; | ||
|
|
||
| /** True, if any user data is dropped by clone. */ | ||
| static std::atomic<bool> s_is_data_dropped; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could these be behind the m_plugin_handle interface? That is, can we have as little code in the server core when the plugin has not been loaded?
| #include <atomic> | ||
| #include <string> | ||
| #include <set> | ||
| #include <tuple> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need every single one of these headers here?
- Works correctly for clang compiler
7febedd to
1f64eef
Compare
| usr/lib/mysql/plugin/sql_errlog.so | ||
| usr/lib/mysql/plugin/type_mysql_json.so | ||
| usr/lib/mysql/plugin/wsrep_info.so | ||
| usr/lib/mysql/plugin/mariadb_clone.so |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sigabrt found on debug build whilee testing main-MDEV-36733
main-MDEV-36733, origin/main-MDEV-36733 7cc113f555cecc6ade9cadd3e3ebae3c9129e0aa 2025-09-02T18:48:45+05:30
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=139922541356608) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=139922541356608) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=139922541356608, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 __GI_abort () at ./stdlib/abort.c:79
#5 safe_mutex_destroy (mp=<myclone::Client::s_table_mutex>, file=<optimized out>, line=<optimized out>) at /data/Server/MDEV-36733/mysys/thr_mutex.c:597
#6 inline_mysql_mutex_destroy (src_line=712, src_file="/data/Server/MDEV-36733/plugin/clone/src/clone_client.cc", that=<optimized out>)
at /data/Server/MDEV-36733/include/mysql/psi/mysql_thread.h:724
#7 myclone::Client::uninit_pfs () at /data/Server/MDEV-36733/plugin/clone/src/clone_client.cc:712
#8 myclone::Table_pfs::drop_proxy_tables () at /data/Server/MDEV-36733/plugin/clone/src/clone_status.cc:139
#9 myclone::Table_pfs::release_services () at /data/Server/MDEV-36733/plugin/clone/src/clone_status.cc:225
#10 plugin_clone_deinit (plugin_info=<optimized out>) at /data/Server/MDEV-36733/plugin/clone/src/clone_plugin.cc:408
#11 plugin_deinitialize (plugin=, ref_check=ref_check@entry=true) at /data/Server/MDEV-36733/sql/sql_plugin.cc:1279
#12 reap_plugins () at /data/Server/MDEV-36733/sql/sql_plugin.cc:1350
#13 mysql_uninstall_plugin (thd=thd@entry=, name=<optimized out>, dl_arg=<optimized out>) at /data/Server/MDEV-36733/sql/sql_plugin.cc:2494
#14 mysql_execute_command (thd=thd@entry=, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/MDEV-36733/sql/sql_parse.cc:5790
#15 mysql_parse (thd=thd@entry=, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=) at /data/Server/MDEV-36733/sql/sql_parse.cc:7914
#16 dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=, packet=packet@entry="UNINSTALL PLUGIN IF EXISTS clone /* E_R Thread6 QNO 2774 CON_ID 23 */ ",
packet_length=packet_length@entry=70, blocking=blocking@entry=true) at /data/Server/MDEV-36733/sql/sql_parse.cc:1896
#17 do_command (thd=thd@entry=, blocking=blocking@entry=true) at /data/Server/MDEV-36733/sql/sql_parse.cc:1419
#18 do_handle_one_connection (connect=<optimized out>, connect@entry=, put_in_cache=put_in_cache@entry=true) at /data/Server/MDEV-36733/sql/sql_connect.cc:1414
#19 handle_one_connection (arg=) at /data/Server/MDEV-36733/sql/sql_connect.cc:1326
#20 start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#21 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
RR trace is present at pluto:-
/data/results/1756833677/TBR-2326
Raised the bug:-
https://jira.mariadb.org/browse/MDEV-37551
| partition.m_index_file, &m_cap, block, copy_buffer) == | ||
| HA_ERR_END_OF_FILE)) | ||
| break; | ||
| error= aria_read_index(partition.m_index_file, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertion found on main-MDEV-36733 on debug build
main-MDEV-36733, origin/main-MDEV-36733 7cc113f555cecc6ade9cadd3e3ebae3c9129e0aa 2025-09-02T18:48:45+05:30
# 2025-09-02T17:29:54 [2496309] | mariadbd: /data/Server/MDEV-36733/sql/sql_plugin.cc:1396: void intern_plugin_unlock(LEX*, plugin_ref): Assertion `pi->ref_count' failed.
Stacktrace
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140442117543488) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140442117543488) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140442117543488, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 __GI_abort () at ./stdlib/abort.c:79
#5 __assert_fail_base (fmt="%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion="pi->ref_count", file="/data/Server/MDEV-36733/sql/sql_plugin.cc", line=1396,
function=<optimized out>) at ./assert/assert.c:92
#6 __GI___assert_fail (assertion="pi->ref_count", file="/data/Server/MDEV-36733/sql/sql_plugin.cc", line=1396, function="void intern_plugin_unlock(LEX*, plugin_ref)")
at ./assert/assert.c:101
#7 intern_plugin_unlock (lex=lex@entry=, plugin=) at /data/Server/MDEV-36733/sql/sql_plugin.cc:1396
#8 plugin_unlock_list (thd=thd@entry=, list=, list@entry=, count=0, count@entry=86) at /data/Server/MDEV-36733/sql/sql_plugin.cc:1438
#9 plugin_foreach_with_mask (thd=<optimized out>, func=func@entry=<show_plugins(THD*, plugin_ref, void*)>, type=type@entry=-1, state_mask=state_mask@entry=4294967294,
arg=<optimized out>) at /data/Server/MDEV-36733/sql/sql_plugin.cc:2555
#10 fill_plugins (thd=<optimized out>, tables=<optimized out>, cond=<optimized out>) at /data/Server/MDEV-36733/sql/sql_show.cc:346
#11 get_schema_tables_result (join=join@entry=, executed_place=executed_place@entry=PROCESSED_BY_JOIN_EXEC) at /data/Server/MDEV-36733/sql/sql_show.cc:9740
#12 JOIN::exec_inner (this=this@entry=) at /data/Server/MDEV-36733/sql/sql_select.cc:5047
#13 JOIN::exec (this=this@entry=) at /data/Server/MDEV-36733/sql/sql_select.cc:4874
#14 mysql_select (thd=thd@entry=, tables=, fields=..., conds=, og_num=0, order=, group=, having=, proc_param=, select_options=2701396736, result=, unit=, select_lex=)
at /data/Server/MDEV-36733/sql/sql_select.cc:5402
#15 handle_select (thd=thd@entry=, lex=lex@entry=, result=result@entry=, setup_tables_done_option=setup_tables_done_option@entry=0) at /data/Server/MDEV-36733/sql/sql_select.cc:634
#16 execute_sqlcom_select (thd=thd@entry=, all_tables=) at /data/Server/MDEV-36733/sql/sql_parse.cc:6198
#17 mysql_execute_command (thd=thd@entry=, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/MDEV-36733/sql/sql_parse.cc:3975
#18 mysql_parse (thd=thd@entry=, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=) at /data/Server/MDEV-36733/sql/sql_parse.cc:7914
#19 dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=,
packet=packet@entry="SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS WHERE PLUGIN_NAME = 'clone' /* E_R Thread10 QNO 469 CON_ID 26 */ ",
packet_length=packet_length@entry=131, blocking=blocking@entry=true) at /data/Server/MDEV-36733/sql/sql_parse.cc:1896
#20 do_command (thd=thd@entry=, blocking=blocking@entry=true) at /data/Server/MDEV-36733/sql/sql_parse.cc:1419
#21 do_handle_one_connection (connect=<optimized out>, connect@entry=, put_in_cache=put_in_cache@entry=true) at /data/Server/MDEV-36733/sql/sql_connect.cc:1414
#22 handle_one_connection (arg=) at /data/Server/MDEV-36733/sql/sql_connect.cc:1326
#23 start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#24 clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
RR trace is present on pluto:
/data/results/1756833677/TBR-2327
Raised the bug:-
https://jira.mariadb.org/browse/MDEV-37552
dr-m
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still not a complete review.
It seems to me that we should refactor the log archiving. I am not happy to see a large number of include/arch* and arch/* files. Do we really need a new file format for the archived log? Also, I do not understand what a doublewrite buffer would be useful (even in the case that the archiving covers actual persistent data files and not only the write-ahead log).
| #endif // NEVER | ||
|
|
||
| Progress_pfs::Progress_pfs() : Table_pfs(S_NUM_ROWS) | ||
| { | ||
| // auto table = get_proxy_share(); | ||
| // table->m_table_name = "clone_progress"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this logic would be needed in MDEV-36742 Clone: Support Remote Clone. I would suggest to remove all #ifdef NEVER and commented-out lines, to make this reviewable. It’s easier to review and understand just the bare minimum code that is needed.
| status_file << state << " " << m_threads[cur_index] << " " | ||
| << m_start_time[cur_index] << " " << m_end_time[cur_index] | ||
| << " " << m_estimate[cur_index] << " " << m_complete[cur_index] | ||
| << " " << m_network[cur_index] << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code (which is not enclosed in #ifdef NEVER) really used or reachable? It seems that m_network is irrelevant for this task (local clone).
| | CLONE_SYM INSTANCE_SYM FROM user ':' ulong_num | ||
| IDENTIFIED_SYM BY TEXT_STRING | ||
| opt_datadir_ssl | ||
| { | ||
| Lex->sql_command= SQLCOM_CLONE; | ||
| /* TODO: Reject space characters around ':' */ | ||
| $4->auth= new (thd->mem_root) USER_AUTH(); | ||
| $4->auth->pwtext= $9; | ||
|
|
||
| Lex->m_sql_cmd= new (thd->mem_root) | ||
| Sql_cmd_clone($4, $6, $10); | ||
|
|
||
| if (Lex->m_sql_cmd == nullptr) | ||
| MYSQL_YYABORT; | ||
| } | ||
| ; | ||
|
|
||
| opt_datadir_ssl: | ||
| opt_ssl | ||
| { | ||
| $$= null_clex_str; | ||
| } | ||
| | DATA_SYM DIRECTORY_SYM opt_equal TEXT_STRING_filesystem opt_ssl | ||
| { | ||
| $$= $4; | ||
| } | ||
| ; | ||
|
|
||
| opt_ssl: | ||
| /* empty */ | ||
| { | ||
| Lex->account_options.ssl_type= SSL_TYPE_NOT_SPECIFIED; | ||
| } | ||
| | REQUIRE_SYM SSL_SYM | ||
| { | ||
| Lex->account_options.ssl_type= SSL_TYPE_SPECIFIED; | ||
| } | ||
| | REQUIRE_SYM NO_SYM SSL_SYM | ||
| { | ||
| Lex->account_options.ssl_type= SSL_TYPE_NONE; | ||
| } | ||
| ; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (as well as the introduction of the terminal symbol INSTANCE_SYM) seems to be needed for a future task MDEV-36742 (remote clone), not this one.
| while (arch_sys && arch_sys->signal_archiver()) | ||
| { | ||
| std::this_thread::sleep_for(sleep_time); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of sleeping, can we wait for a condition variable, possibly with a timeout? There seems to be a function Arch_Sys::archiver_wait(), which could be extended with a timeout argument.
| mysql_mutex_lock(&m_mutex); | ||
| if (!m_archiver_active) | ||
| { | ||
| try | ||
| { | ||
| std::thread(Arch_Sys::archiver).detach(); | ||
| m_archiver_active= true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner to set m_archiver_active before creating the thread. Also, m_archiver_active could be Atomic_relaxed<bool>.
Furthermore, I think that it would be better to use the tpool subsystem for scheduling these tasks, like we do for the purge and most other InnoDB tasks.
| bool alive= false; | ||
| mysql_mutex_lock(&m_mutex); | ||
| if (m_archiver_active) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simpler:
mysql_mutex_lock(&m_mutex);
const bool alive{m_archiver_active};
if (alive)| /* First file in the archive group. */ | ||
| ut_ad(m_file_ctx.get_count() == 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is only one ib_logfile0 in MariaDB, and this is about log archiving. What is an "archive group"? The parameter innodb_log_files_in_group was removed from MariaDB.
Also, if this subsystem is only about log archiving, why don’t we just append to the archived log file in log_t::write_buf(), similar to how we maintain an extra copy re_write_buf for ib_logfile101 during log resizing?
| Arch_Group::write_to_doublewrite_file(from_file, from_buffer, write_size, | ||
| dblwr_offset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have to write any doublewrite buffer? mariadb-backup does not deal with any doublewrite buffer either.
| bool Arch_File_Ctx::delete_file(uint file_index, lsn_t begin_lsn) | ||
| { | ||
| bool success; | ||
| char file_name[MAX_ARCH_PAGE_FILE_NAME_LEN]; | ||
|
|
||
| build_name(file_index, begin_lsn, file_name, MAX_ARCH_PAGE_FILE_NAME_LEN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no comment of the purpose of the begin_lsn (not even in arch0arch.h). Apparently it may form a part of the file name. It is also completely unclear what a "group" is, in the context of begin_lsn. There are no high-level source code comments about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, Arch_File_Ctx::delete_file() is not being invoked at all while executing the regression tests.
| #ifndef EMBEDDED_LIBRARY | ||
| Clone_notify *notifier= new Clone_notify( | ||
| Clone_notify::Type::SYSTEM_REDO_RESIZE, UINT32_MAX, true); | ||
| if (notifier->failed()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is completely unclear to me why log resizing would interfere with the clone functionality. If we replicated the log in log_t::write_buf(), then we should be completely indifferent to any log resizing.
Log resizing currently interferes with mariadb-backup --backup because once the log resizing completes, the old ib_logfile0 (which will be replaced by the ib_logfile101 that was created for the resized log) will cease to receive updates. So, from the point of view of mariadb-backup --backup which is holding a file handle to the old ib_logfile0, the server will appear to hang (stop writing any log to the old file).
In the clone, as far as I understand, we could just create an append-only archive log file, starting from the latest checkpoint LSN or the current LSN. It should basically be just like the ib_logfile101 that we create for log resizing, only an independent copy, using its own log_sys.clone_lsn instead of log_sys.resize_lsn. It should not matter if the server is concurrently resizing its main log file; we can just keep appending to the clone/ib_logfile0 or whatever we call that file.
| typedef struct st_net_server NET_SERVER; | ||
| typedef struct st_mysql MYSQL; | ||
| typedef struct st_mysql_socket MYSQL_SOCKET; | ||
|
|
||
| // #include "mysql_com_server.h" | ||
|
|
||
| /** Connection parameters including SSL */ | ||
| typedef struct mysql_clone_ssl_context_t { | ||
| /** Clone ssl mode. Same as mysql client --ssl-mode */ | ||
| int m_ssl_mode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not seem to need any of this for the local clone functionality. Do we need this file at all? Also, I see that many unrlated .h.pp files for other plugins were changed. That does not seem right to me.
| ../sql/opt_hints.cc ../sql/opt_hints.h | ||
| ../sql/opt_trace_ddl_info.cc ../sql/opt_trace_ddl_info.h | ||
| ${GEN_SOURCES} | ||
| ${GEN_SOURCES} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be only this white-space-only change to this file. I would suggest to revert it.
| performance-schema-max-stage-classes 160 | ||
| performance-schema-max-statement-classes 227 | ||
| performance-schema-max-stage-classes 175 | ||
| performance-schema-max-statement-classes 229 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need all these classes for the local clone functionality?
| # 2. Test Remote Clone command. Default is local clone. | ||
| # --let remote_clone = 1 | ||
| # | ||
| # 3. Remote Clone command is expected to return error and the error number | ||
| # is different from local clone. | ||
| # --let clone_remote_err = <error number> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are not relevant here. I would suggest to omit the numbering "1.", "2.", "3." as well, to avoid the need of any future merge conflicts or adjustments.
| # 4. Skip clone_valid_donor_list configuration for testing error cases | ||
| # --let skip_donor_config = 1 | ||
| # | ||
| # 5. Test clone automatic tuning of threads | ||
| # --let clone_auto_tune = 1 | ||
| # | ||
| # 6. Test clone command forcing SSL [REQUIRES SSL] | ||
| # --let clone_require_ssl = 1 | ||
| # | ||
| # 7. Test clone command forcing insecure connection [REQUIRES NO SSL] | ||
| # --let clone_require_no_ssl = 1 | ||
| # | ||
| # 8. Test clone command forcing SSL certificate validation | ||
| # --let clone_require_ssl_certificate = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about 4. and 5., but the SSL related options definitely do not seem to be applicable to the local clone. I would suggest to trim this file to the minimum.
| if ($ddl_encryption) { | ||
| SET GLOBAL DEBUG = '+d,log_redo_with_invalid_master_key'; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no matching DBUG_EXECUTE_IF instrumentation.
| if ($ddl_redo_encrypt) { | ||
| --let $skip_ddl = 1 | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we need this? The log_t::write_buf(), which in my opinion is the correct place for replicating or "archiving" the log, would just see and copy the encrypted data as is if innodb_encrypt_log=ON.
| --source include/big_test.inc | ||
| --source include/not_parallel.inc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these for? Are the tests that include this file being run on buildbot at all?
| --source include/count_sessions.inc | ||
|
|
||
| ## Get rid of the binary logs from previous run. | ||
| RESET BINARY LOGS AND GTIDS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This syntax seems to be specific to MySQL and should throw a syntax error on MariaDB. I did not find any .result file with this output, or any reference to this file ddl_common.inc in any other file.
| if ($remote_clone) { | ||
| SET GLOBAL DEBUG = '+d,remote_release_clone_file_pin'; | ||
| } | ||
|
|
||
| if (!$remote_clone) { | ||
| SET GLOBAL DEBUG = '+d,local_release_clone_file_pin'; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DBUG_EXECUTE_IF for both these instrumentations exist in the source code, but they are not covered by any other than this one (which apparently is not being included in any test).
| ut_ad(latch_have_wr()); | ||
|
|
||
| if (arch_sys) | ||
| arch_sys->log_sys()->wait_archiver(lsn); | ||
|
|
||
| lsn_t old= flushed_to_disk_lsn.load(std::memory_order_relaxed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be a race condition here, because Arch_Log_Sys::wait_archiver() may release log_sys.latch. I think that it would be easiest to just not support the clone function with memory-mapped log writes (PMEM). An easy way to do this might be to have log_mmap() return MAP_FAILED when the clone plugin is loaded.
| if (arch_sys) | ||
| arch_sys->log_sys()->wait_archiver(lsn); | ||
|
|
||
| if (resizing != RETAIN_LATCH) | ||
| latch.wr_unlock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arch_Sys::archiver_wait() may release and reacquire log_sys.latch. This would break the assumption that is being made in log_write_and_flush().
I think that it would be cleaner to introduce data members to log_t related to archiving the log, such as clone_lsn and clone_log (similar to resize_lsn and resize_log), and introduce a log_t::clone_write_buf() as well. In that way, cloning the log would simply rely on the existing group commit mechanism and there would be fewer context switches and no potential race condition scenarios.
| /** Last written LSN; protected by latch */ | ||
| lsn_t write_lsn; | ||
| Atomic_relaxed<lsn_t> write_lsn; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this change made? I think that it would be cleanest to have log_sys.latch cover also the cloning of the log, which would be reflected by new data members clone_lsn and clone_log, similar to resize_lsn and resize_log.
- Fixing failed test case
| MYSQL_SYSVAR(delay_after_data_drop), | ||
| #endif | ||
| nullptr}; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertion found on debug build
main-MDEV-36733, origin/main-MDEV-36733 7cc113f555cecc6ade9cadd3e3ebae3c9129e0aa 2025-09-02T18:48:45+05:30
# 2025-09-02T20:19:12 [2626295] | mariadbd: /data/Server/MDEV-36733/plugin/clone/src/clone_client.cc:625: int myclone::Client::pfs_begin_state(): Assertion `s_num_clones == 1' failed.
Stack
# 2025-09-02T20:27:48 [2626295] | Thread 3 (Thread 2635011.2671291 (one_connection)):
# 2025-09-02T20:27:48 [2626295] | #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140496569935424) at ./nptl/pthread_kill.c:44
# 2025-09-02T20:27:48 [2626295] | #1 __pthread_kill_internal (signo=6, threadid=140496569935424) at ./nptl/pthread_kill.c:78
# 2025-09-02T20:27:48 [2626295] | #2 __GI___pthread_kill (threadid=140496569935424, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
# 2025-09-02T20:27:48 [2626295] | #3 0x000060bc66901476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
# 2025-09-02T20:27:48 [2626295] | #4 0x000060bc668e77f3 in __GI_abort () at ./stdlib/abort.c:79
# 2025-09-02T20:27:48 [2626295] | #5 0x000060bc668e771b in __assert_fail_base (fmt=0x60bc66a9c130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x1e99687ea0de "s_num_clones == 1", file=0x1e99687ea4a8 "/data/Server/MDEV-36733/plugin/clone/src/clone_client.cc", line=625, function=<optimized out>) at ./assert/assert.c:92
# 2025-09-02T20:27:48 [2626295] | #6 0x000060bc668f8e96 in __GI___assert_fail (assertion=0x1e99687ea0de "s_num_clones == 1", file=0x1e99687ea4a8 "/data/Server/MDEV-36733/plugin/clone/src/clone_client.cc", line=625, function=0x1e99687ea798 "int myclone::Client::pfs_begin_state()") at ./assert/assert.c:101
# 2025-09-02T20:27:48 [2626295] | #7 0x00001e99687b6092 in myclone::Client::pfs_begin_state (this=this@entry=0x7fc7e82481a8) at /data/Server/MDEV-36733/plugin/clone/src/clone_client.cc:625
# 2025-09-02T20:27:48 [2626295] | #8 0x00001e99687d2337 in myclone::Local::clone (this=this@entry=0x7fc7e82481a0) at /data/Server/MDEV-36733/plugin/clone/src/clone_local.cc:64
# 2025-09-02T20:27:48 [2626295] | #9 0x00001e99687b34a6 in plugin_clone_local (thd=0x631360020158, data_dir=<optimized out>) at /data/Server/MDEV-36733/plugin/clone/src/clone_plugin.cc:429
# 2025-09-02T20:27:48 [2626295] | #10 0x000055a258a17842 in Clone_handler::clone_local (this=0x1359200d7010, thd=thd@entry=0x631360020158, data_dir=<optimized out>) at /data/Server/MDEV-36733/sql/clone_handler.cc:124
# 2025-09-02T20:27:48 [2626295] | #11 0x000055a25888128c in Sql_cmd_clone::execute (this=0x6313600356f8, thd=0x631360020158) at /data/Server/MDEV-36733/sql/sql_admin.cc:1800
# 2025-09-02T20:27:48 [2626295] | #12 0x000055a2586e2b61 in mysql_execute_command (thd=thd@entry=0x631360020158, is_called_from_prepared_stmt=is_called_from_prepared_stmt@entry=false) at /data/Server/MDEV-36733/sql/sql_parse.cc:5889
# 2025-09-02T20:27:48 [2626295] | #13 0x000055a2586e3871 in mysql_parse (thd=thd@entry=0x631360020158, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x7fc7e8249320) at /data/Server/MDEV-36733/sql/sql_parse.cc:7914
# 2025-09-02T20:27:48 [2626295] | #14 0x000055a2586e59ed in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x631360020158, packet=packet@entry=0x63136002a7b9 "CLONE LOCAL DATA DIRECTORY = '/invalid/nonexistent/path' /* E_R Thread13 QNO 631 CON_ID 27 */ ", packet_length=packet_length@entry=94, blocking=blocking@entry=true) at /data/Server/MDEV-36733/sql/sql_parse.cc:1896
# 2025-09-02T20:27:48 [2626295] | #15 0x000055a2586e7a3e in do_command (thd=thd@entry=0x631360020158, blocking=blocking@entry=true) at /data/Server/MDEV-36733/sql/sql_parse.cc:1419
# 2025-09-02T20:27:48 [2626295] | #16 0x000055a25886ea17 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x55a25c6186a8, put_in_cache=put_in_cache@entry=true) at /data/Server/MDEV-36733/sql/sql_connect.cc:1414
# 2025-09-02T20:27:48 [2626295] | #17 0x000055a25886ec57 in handle_one_connection (arg=0x55a25c6186a8) at /data/Server/MDEV-36733/sql/sql_connect.cc:1326
# 2025-09-02T20:27:48 [2626295] | #18 0x000060bc66953ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
# 2025-09-02T20:27:48 [2626295] | #19 0x000060bc669e4a04 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:100
RR trace is present on pluto:-
/data/results/1756833677/TBR-2328
Bug:-https://jira.mariadb.org/browse/MDEV-37561
|
I compiled both 4fd2564 and the baseline e02f4d7 with
The We can see that 4fd2564 is actually adding much more code to the server (373 KiB) rather than the clone plugin (219 KiB). In an earlier review comment I expressed the impression that some code in Because the local clone implements a small subset of what |
dr-m
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started to look at code coverage, as well as some things that I noticed while checking reports of code that is not being covered by the tests. This report only covers the storage/innobase/arch subdirectory. I will check storage/innobase/clone later.
| # Monitor clone operations using performance schema's stage and statement events. | ||
| --source include/have_innodb.inc | ||
| --source include/have_debug.inc | ||
| --source include/have_debug_sync.inc | ||
| --source include/count_sessions.inc | ||
| --source include/not_embedded.inc | ||
|
|
||
| # Disable PFS monitoring for threads by default | ||
| connect (con1,localhost,root,,); | ||
| CALL sys.ps_setup_disable_thread(CONNECTION_ID()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--source include/have_perfschema.incis missing here, causing a failure for cmake -DPLUGIN_PERFSCHEMA=NO builds:
CURRENT_TEST: clone.monitor_progress
mysqltest: At line 10: query 'CALL sys.ps_setup_disable_thread(CONNECTION_ID())' failed: ER_SP_DOES_NOT_EXIST (1305): PROCEDURE sys.ps_setup_disable_thread does not exist
| if (++sleep_count == 10 && backoff_count < 3) | ||
| { | ||
| sleep_time*= 10; | ||
| sleep_count= 0; | ||
| ++backoff_count; | ||
| continue; | ||
| } | ||
| if (sleep_count == 30 ) | ||
| ib::warn() << "Archiver still running: Waited 30 seconds."; | ||
|
|
||
| else if (sleep_count >= 600) | ||
| ib::fatal() << "Archiver still running: Waited for 10 minutes."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These conditions do not hold in the regression test suite, in a cmake -DCMAKE_BUILD_TYPE=Debug -DENABLE_GCOV=ON build.
| if (!os_file_create_directory(ARCH_DIR, false)) | ||
| { | ||
| my_error(ER_CANT_CREATE_FILE, MYF(0), ARCH_DIR, errno); | ||
| return ER_CANT_CREATE_FILE; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition is not being covered. It would seem to be relatively simple to cover, for example, by creating a read-only parent directory and asking the directory to be created there.
| catch (...) | ||
| { | ||
| my_error(ER_CANT_CREATE_THREAD, MYF(0), errno); | ||
| m_archiver_active= false; | ||
| err= ER_CANT_CREATE_THREAD; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not reachable. (But, we should use tpool anyway.)
| void Arch_Sys::remove_dir(const char *dir_path, const char *dir_name) | ||
| { | ||
| char path[MAX_ARCH_DIR_NAME_LEN]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is not being invoked when running the regression tests. Why would we need this code at all?
| /** @return stop block position of the group. */ | ||
| Arch_Page_Pos get_stop_pos() const { return (m_stop_pos); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreachable
| static lsn_t align_lsn(lsn_t lsn, lsn_t ref_lsn) | ||
| { | ||
| lsn_t lsn_diff = (ref_lsn <= lsn) ? (lsn - ref_lsn) : | ||
| (ref_lsn - lsn + OS_FILE_LOG_BLOCK_SIZE); | ||
| lsn_diff = ut_uint64_align_down(lsn_diff, OS_FILE_LOG_BLOCK_SIZE); | ||
| lsn_t r = (ref_lsn <= lsn) ? (ref_lsn + lsn_diff) : (ref_lsn - lsn_diff); | ||
| ut_ad(r <= lsn); | ||
| return r; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we reintroduce OS_FILE_LOG_BLOCK_SIZE and this complex math? Thankfully, this function appears to be unreacahble (at least in some expansions of this inline function).
| /** Align down LSN value to multiples of OS_FILE_LOG_BLOCK_SIZE with respect | ||
| to groups first LSN value. | ||
| @param[in] lsn LSN value to align | ||
| @return aligned LSN value. */ | ||
| lsn_t align_lsn(lsn_t lsn) const | ||
| { | ||
| ut_ad(m_first_lsn <= lsn); | ||
| return Arch_Group::align_lsn(lsn, m_first_lsn); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we do not want to reintroduce the 512-byte log blocks after 685d958 got rid of them.
| void get_dir_name(char *name_buf, uint buf_len) | ||
| { | ||
| m_file_ctx.build_dir_name(m_begin_lsn, name_buf, buf_len); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreachable
| /** Get the mutex protecting concurrent start, stop operations required | ||
| for initialising group during recovery. | ||
| @return mutex */ | ||
| mysql_mutex_t *get_mutex() { return (&m_mutex); } | ||
|
|
||
| /** @return operation mutex */ | ||
| mysql_mutex_t *get_oper_mutex() { return (&m_oper_mutex); } | ||
|
|
||
| /** Fetch the system client context. | ||
| @return system client context. */ | ||
| Page_Arch_Client_Ctx *get_sys_client() const { return (m_ctx); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The get_mutex() appears to be unreachable. What is the point of these? We could declare each data member public. If m_ctx really needs the pointer indirection, then it could be declared as Page_Arch_Client_Ctx *const m_ctx; and be public.
dr-m
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked clone0api, clone0apply, clone0clone. Lots of unreachable code. I will abort the code coverage analysis for now.
| /** Rename clone status file. The operation is expected to be atomic | ||
| when the files belong to same directory. | ||
| @param[in] from_file name of current file | ||
| @param[in] to_file name of new file */ | ||
| static void rename_file(const std::string &from_file, | ||
| const std::string &to_file) | ||
| { | ||
| auto ret= std::rename(from_file.c_str(), to_file.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused function. This is also duplicating os_file_rename_func().
| static void create_file(std::string &file_name) | ||
| { | ||
| std::ofstream file(file_name.c_str()); | ||
|
|
||
| if (file.is_open()) | ||
| { | ||
| file.close(); | ||
| return; | ||
| } | ||
| ib::error() << "Error creating file : " << file_name.c_str(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is duplicating os_file_create(), and it is hiding the status from the caller. The error output is unreachable.
| /** Delete clone status file or directory. | ||
| @param[in] file name of file */ | ||
| static void remove_file(const std::string &file) | ||
| { | ||
| os_file_type_t file_type; | ||
| bool exists; | ||
|
|
||
| if (!os_file_status(file.c_str(), &exists, &file_type)) | ||
| { | ||
| ib::error() << "Error checking a file to remove : " << file.c_str(); | ||
| return; | ||
| } | ||
| /* Allow non existent file, as the server could have crashed or returned | ||
| with error before creating the file. This is needed during error cleanup. */ | ||
| if (!exists) | ||
| return; | ||
|
|
||
| /* In C++17 there will be std::filesystem::remove_all and the | ||
| code below will no longer be required. */ | ||
| if (file_type == OS_FILE_TYPE_DIR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is quite a bit of unnecessary logic here. Why do we invoke stat before unlink? Also, we already have os_file_delete_func(). Also, we do depend on C++17 and std::filesystem elsewhere. If this is essentially rm -fr (removing a directory tree), the function name remove_file is misleading. I think that the code around this would have to be refactored.
Thankfully, the file_type == OS_FILE_TYPE_DIR code path seems to be unreachable. Maybe this entire function could omitted and the caller could invoke os_file_delete() or even better, IF_WIN(DeleteFile(file.c_str()), unlink(file.c_str())) if all we want is "delete if exists" semantics?
| if (clone->replace_datadir()) | ||
| { | ||
| /* Create error file for rollback. */ | ||
| file_name.assign(CLONE_INNODB_ERROR_FILE); | ||
| create_file(file_name); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition is unreachable. But, I think that replacing the target directory is something that would have to be explicitly asked for in the syntax. mariadb-backup --backup would fail if --target-dir already exists.
| file_name.assign(path); | ||
| /* Add path separator if needed. */ | ||
| if (file_name.back() != OS_PATH_SEPARATOR) | ||
| file_name.append(OS_PATH_SEPARATOR_STR); | ||
|
|
||
| file_name.append(CLONE_INNODB_IN_PROGRESS_FILE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if condition is unreachable. We might just unconditionally do some snprintf() with "%s/%s"; an extra / will not actually hurt. We might also use openat and friends.
| auto err_exit= [&](int in_err) | ||
| { | ||
| char errbuf[MYSYS_STRERROR_SIZE]; | ||
| my_error(in_err, MYF(0), file_name.c_str(), errno, | ||
| my_strerror(errbuf, sizeof(errbuf), errno)); | ||
| return in_err; | ||
| }; | ||
|
|
||
| if (!status) | ||
| return err_exit(ER_CANT_OPEN_FILE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All err_exit are unreachable. A simple goto err_exit should produce less code.
| bool success= false; | ||
| auto handle= os_file_create(innodb_clone_file_key, file_name.c_str(), option, | ||
| file_type, read_only, &success); | ||
|
|
||
| int err= 0; | ||
| if (!success) | ||
| err= (option == OS_FILE_OPEN) ? ER_CANT_OPEN_FILE : ER_CANT_CREATE_FILE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreachable error path
| auto db_err= init_cbk(handle); | ||
|
|
||
| if (db_err != DB_SUCCESS) | ||
| { | ||
| os_file_close(handle); | ||
| err= ER_ERROR_ON_WRITE; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreachable condition
| int Clone_Handle::close_file(Clone_Task *task) { | ||
| bool success = true; | ||
|
|
||
| /* Close file, if opened. */ | ||
| if (task->m_current_file_des.m_file != OS_FILE_CLOSED) { | ||
| success = os_file_close(task->m_current_file_des); | ||
| } | ||
|
|
||
| task->m_current_file_des.m_file = OS_FILE_CLOSED; | ||
| task->m_file_cache = true; | ||
|
|
||
| if (!success) { | ||
| my_error(ER_INTERNAL_ERROR, MYF(0), "Innodb error while closing file"); | ||
| return (ER_INTERNAL_ERROR); | ||
| } | ||
|
|
||
| return (0); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this much code? Why not just unconditionally invoke os_file_close() and let it do any error reporting? The my_error is unreachable here.
| } else if (buf_cbk) { | ||
| unsigned char *data_buf = nullptr; | ||
| uint32_t data_len = 0; | ||
| /* Get data buffer */ | ||
| err = cbk->apply_buffer_cbk(data_buf, data_len); | ||
| if (err == 0) { | ||
| /* Modify and write data buffer to file. */ | ||
| err = modify_and_write(task, offset, data_buf, data_len); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreachable condition. Can the bool buf_cbk=false parameter be removed?
Description
MDEV-36733 Clone: Implement support for local cloning
This commit introduces the CLONE LOCAL DATA DIRECTORY command, enabling the creation of a
physical copy of a MariaDB instance on the same server.
This new functionality provides a fast and efficient way to duplicate a MariaDB data directory.
Limitation:
How can this PR be tested?
Added --suite=clone
Basing the PR against the correct MariaDB version
mainbranch.PR quality check