Author: JD Retail, Shi Lei
As a typical representative of open-source NewSQL databases, TiDB supports SQL and ACID transactions. At the protocol level, TiDB is fully compatible with the MySQL wire protocol and aims to be compatible with MySQL syntax, so systems built on MySQL can be migrated to TiDB with minimal code changes. For users, the migration cost is low and the transition feels natural.

However, some MySQL features and behaviors are still unsupported in TiDB or differ from MySQL. In addition, TiDB provides some extended syntax and features for user convenience.
TiDB is still developing rapidly, and its support for MySQL functionality and behavior is advancing according to its roadmap.
Compatibility strategy
First, the compatibility strategy of TiDB with respect to MySQL is summarized in the following table:
Communication protocol | SQL syntax | Functionality and behavior
---|---|---
Fully compatible | Compatible with the vast majority | Compatible with most
As of version 4.0, the differences between TiDB and MySQL are summarized in the following table:
Feature | MySQL | TiDB
---|---|---
Isolation level | Supports read uncommitted, read committed, repeatable read, and serializable; the default is repeatable read | Optimistic transactions support snapshot isolation; pessimistic transactions support snapshot isolation and read committed
Locking mechanism | Pessimistic locking | Optimistic locking and pessimistic locking
Stored procedures | Supported | Not supported
Triggers | Supported | Not supported
Events | Supported | Not supported
User-defined functions | Supported | Not supported
Window functions | Supported | Partially supported
JSON | Supported | Some functions added in MySQL 8.0 are not supported
Foreign key constraints | Supported | Parsed but ignored
Character set | Supports a wide range of character sets | Only supports ascii, latin1, binary, utf8, utf8mb4
Adding/dropping a primary key | Supported | Supported via the alter-primary-key configuration option
CREATE TABLE tblName AS SELECT stmt | Supported | Not supported
CREATE TEMPORARY TABLE | Supported | TiDB ignores the TEMPORARY keyword and creates a regular table
DML affected-rows count | Supported | Not supported
AutoRandom column attribute | Not supported | Supported
Sequence generator | Not supported | Supported
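A quick sanity check while working through these differences is to confirm which engine a given connection actually reaches. TiDB answers SELECT VERSION() with a MySQL-compatible version string that embeds a "TiDB" marker (e.g. "5.7.25-TiDB-v4.0.0"), so a small helper can branch on it; this is only a sketch, and the sample version strings below are illustrative.

```python
def is_tidb(version_string: str) -> bool:
    """Return True when a MySQL-protocol version string comes from TiDB.

    TiDB reports a MySQL-compatible version with an embedded "TiDB"
    marker, e.g. "5.7.25-TiDB-v4.0.0"; plain MySQL does not.
    """
    return "tidb" in version_string.lower()

# In Django this could be fed from a raw cursor (not executed here):
#   cursor.execute("SELECT VERSION()"); version, = cursor.fetchone()
```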
Comparison of three migration solutions
Dual-write solution: write data to MySQL and TiDB simultaneously, keeping the two databases fully in sync
•Advantage: this is the safest solution; as a fallback it leaves no worries about rollback, because the data is fully consistent and the system can switch back to MySQL seamlessly
•Disadvantage: the solution is new and has to be researched and implemented from scratch, which is costly
Read-write splitting: data is written to MySQL and read from TiDB. The plan is to keep read-write splitting for about a week after going live; that week is used to confirm that TiDB has no problems, after which writes are switched to TiDB as well
•Advantage: during the switch, MySQL and TiDB stay in sync, so rolling the data back to MySQL remains possible
•Disadvantage: replication from MySQL to TiDB has a delay, which may cause query failures for freshly written data that must be read in real time; and once the switch to TiDB is complete, it is not possible to switch back to MySQL
Direct switch: switch to TiDB in a single step
•Advantage: the switching process is the simplest and the cost is the lowest
•Disadvantage: there is no fallback plan; after switching to TiDB there is no way to switch back to MySQL or sync data back to it, which is high risk, and data usability cannot be guaranteed
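For the read-write-splitting solution, Django's database-router hook is a natural fit. Below is a minimal sketch, assuming MySQL is registered under the "default" alias and the TiDB connection under a "replica" alias (the alias names and routing policy are assumptions for illustration, not part of the original setup):

```python
class MigrationRouter:
    """Route reads to TiDB and writes to MySQL during the trial period."""

    def db_for_read(self, model, **hints):
        # During the trial week, all reads are served by TiDB.
        return "replica"

    def db_for_write(self, model, **hints):
        # Writes stay on MySQL until TiDB is judged stable.
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases hold the same replicated data set.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Schema migrations are applied to MySQL only.
        return db == "default"
```

The router would be activated via DATABASE_ROUTERS = ["path.to.MigrationRouter"] in settings.py; pointing db_for_write at TiDB later completes the cutover.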
Django duplicate write strategy for mysql and tidb
Add the following configuration in settings.py:
```python
# Dev Database settings
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'name',
        'USER': 'root',
        'PASSWORD': '123456',
        'HOST': 'db',
    },
    'replica': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'name',
        'USER': 'root',
        'PASSWORD': '123456',
        'HOST': 'db',
    },
    'bak': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'name',
        'USER': 'root',
        'PASSWORD': '123456',
        'HOST': 'db',
    },
}

# Configuration for multiple database writes
MULTI_WRITE_DB = "bak"
```
Dual-write base class: basemodel.py
```python
import copy
import logging
import traceback

from django.db import models, transaction, router
from django.db.models.deletion import Collector
from django.db.models import sql
from django.db.models.sql.constants import CURSOR

from jcdp.settings import MULTI_WRITE_DB, DATABASES

multi_write_db = MULTI_WRITE_DB


# Override QuerySet so write operations are repeated on the backup database
class BaseQuerySet(models.QuerySet):
    def create(self, **kwargs):
        # create() goes through Model.save(), which BaseModel.save_base dual-writes
        return super().create(**kwargs)

    def update(self, **kwargs):
        try:
            rows = super().update(**kwargs)
            if multi_write_db in DATABASES:
                self._for_write = True
                query = self.query.chain(sql.UpdateQuery)
                query.add_update_values(kwargs)
                with transaction.mark_for_rollback_on_error(using=multi_write_db):
                    query.get_compiler(multi_write_db).execute_sql(CURSOR)
        except Exception:
            logging.error(traceback.format_exc())
            raise
        return rows

    def delete(self):
        try:
            deleted, _rows_count = super().delete()
            if multi_write_db in DATABASES:
                del_query = self._chain()
                del_query._for_write = True
                del_query.query.select_for_update = False
                del_query.query.select_related = False
                collector = Collector(using=multi_write_db)
                collector.collect(del_query)
                collector.delete()
        except Exception:
            logging.error(traceback.format_exc())
            raise
        return deleted, _rows_count

    def raw(self, raw_query, params=None, translations=None, using=None):
        try:
            qs = super().raw(raw_query, params=params, translations=translations, using=using)
            if multi_write_db in DATABASES:
                # Note: RawQuerySet is lazy; the duplicated statement only runs
                # against the backup database when the result is iterated.
                super().raw(raw_query, params=params, translations=translations, using=multi_write_db)
        except Exception:
            logging.error(traceback.format_exc())
            raise
        return qs

    def bulk_create(self, objs, batch_size=None, ignore_conflicts=False):
        try:
            # Fall back to per-object save() so that each insert is dual-written
            for obj in objs:
                obj.save()
        except Exception:
            logging.error(traceback.format_exc())
            raise
        # objs = super().bulk_create(objs, batch_size=batch_size, ignore_conflicts=ignore_conflicts)
        # if multi_write_db in DATABASES:
        #     self._db = multi_write_db
        #     super().bulk_create(objs, batch_size=batch_size, ignore_conflicts=ignore_conflicts)
        return objs

    def bulk_update(self, objs, fields, batch_size=None):
        try:
            super().bulk_update(objs, fields, batch_size=batch_size)
            if multi_write_db in DATABASES:
                self._db = multi_write_db
                super().bulk_update(objs, fields, batch_size=batch_size)
        except Exception:
            logging.error(traceback.format_exc())
            raise


class BaseManager(models.Manager):
    _queryset_class = BaseQuerySet


class BaseModel(models.Model):
    objects = BaseManager()

    class Meta:
        abstract = True

    def delete(self, using=None, *args, **kwargs):
        try:
            instance = copy.deepcopy(self)
            super().delete(using=using, *args, **kwargs)
            if multi_write_db in DATABASES:
                super(BaseModel, instance).delete(using=multi_write_db, *args, **kwargs)
        except Exception:
            logging.error(traceback.format_exc())
            raise

    def save_base(self, raw=False, force_insert=False,
                  force_update=False, using=None, update_fields=None):
        try:
            using = using or router.db_for_write(self.__class__, instance=self)
            assert not (force_insert and (force_update or update_fields))
            assert update_fields is None or update_fields
            cls = self.__class__
            # Skip proxies, but keep the origin as the proxy model.
            if cls._meta.proxy:
                cls = cls._meta.concrete_model
            meta = cls._meta
            # A transaction isn't needed if only one query is issued.
            if meta.parents:
                context_manager = transaction.atomic(using=using, savepoint=False)
            else:
                context_manager = transaction.mark_for_rollback_on_error(using=using)
            with context_manager:
                parent_inserted = False
                if not raw:
                    parent_inserted = self._save_parents(cls, using, update_fields)
                self._save_table(
                    raw, cls, force_insert or parent_inserted,
                    force_update, using, update_fields,
                )
            # Repeat the save on the backup database
            if multi_write_db in DATABASES:
                super().save_base(raw=raw,
                                  force_insert=force_insert,
                                  force_update=force_update,
                                  using=multi_write_db,
                                  update_fields=update_fields)
            # Store the database on which the object was saved
            self._state.db = using
            # Once saved, this is no longer a to-be-added instance.
            self._state.adding = False
        except Exception:
            logging.error(traceback.format_exc())
            raise
```
Once the above configuration is in place, dual writes can be enabled by using the new BaseModel class as the model base class in each application's models.py:
```python
class DirectoryStructure(BaseModel):
    """
    Directory structure
    """
    view = models.CharField(max_length=128, db_index=True)  # View name, e.g. Department View, Project View
    sub_view = models.CharField(max_length=128, unique=True, db_index=True)  # Sub-view name
    sub_view_num = models.IntegerField()  # Sub-view sequence number
```
Note: this approach does not currently cover dual writes for many-to-many relations. If the business needs it, the ManyToManyField class must be overridden as well, for example via monkey patching.
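The monkey-patching route mentioned above boils down to swapping a method on an existing class at import time: keep a reference to the original, install a wrapper that calls it and repeats the write against the backup. A generic, framework-free sketch of the pattern (the Backend class and the "default"/"bak" names are hypothetical stand-ins, not Django APIs):

```python
written = []  # records every (backend, value) write for demonstration

class Backend:
    """Hypothetical stand-in for a database-bound operation."""
    def __init__(self, name):
        self.name = name

    def write(self, value):
        written.append((self.name, value))

# Keep a reference to the original method, then replace it with a
# wrapper that repeats the write against a second backend.
_original_write = Backend.write

def _dual_write(self, value):
    _original_write(self, value)            # primary write
    _original_write(Backend("bak"), value)  # duplicated write

Backend.write = _dual_write

Backend("default").write(42)
# written now holds both the primary and the duplicated write
```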
Record of pitfalls encountered during the database migration process
TiDB configuration differences: confirm the database configuration, e.g. that ONLY_FULL_GROUP_BY is disabled to match the MySQL side.
TiDB does not support transaction savepoints; savepoint=False must be set explicitly in the code (e.g. transaction.atomic(savepoint=False)).
Because TiDB is a distributed database, its auto-increment strategy for auto-increment primary key columns differs from MySQL's: IDs are unique but not guaranteed to be consecutive or monotonically increasing across nodes. If business logic depends on the primary key id, take care.
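The auto-increment caveat can be made concrete with a toy simulation: each tidb-server pre-allocates a batch of IDs (30000 by default per the TiDB docs), so rows inserted through different nodes get unique but non-monotonic IDs. Everything below except the batch size is made up for illustration:

```python
BATCH = 30000  # default range each tidb-server reserves for auto-increment

class Node:
    """Toy model of one tidb-server caching a range of auto-increment IDs."""
    _next_base = 1

    def __init__(self):
        # Reserve the next BATCH-sized range for this node.
        self.next_id = Node._next_base
        self.limit = Node._next_base + BATCH
        Node._next_base = self.limit

    def insert(self):
        # Hand out the next ID from this node's cached range.
        nid = self.next_id
        self.next_id += 1
        return nid

a, b = Node(), Node()
ids = [a.insert(), b.insert(), a.insert()]  # inserts routed to different nodes
# ids == [1, 30001, 2]: unique, but not monotonically increasing
```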