MySQL data sharding based on geolocation

3

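What is the common approach to sharding data by region? This is also known as GDPR enforcement: EU data stays in the EU.

If I store users' emails in a users table, I need to somehow keep the data of US and EU persons separate. For example, a MySQL table: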

CREATE TABLE users(
        id INT NOT NULL AUTO_INCREMENT, 
        PRIMARY KEY(id),
        name VARCHAR(30), 
        email VARCHAR(30), 
        otherSensetiveData VARCHAR(30))
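  • Is it normal to have one server in Europe and one in the US?
  • In that case, how does auto-increment work, and what about join queries and transactions?

Overall, I just want to know how to approach this problem.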


MySQL has no built-in sharding support. What do you want to use for it instead? - Akina
1 Answer

5
If you have a data residency requirement for the EU, then you either need two servers or you need to store all of your data in the EU.
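
With two servers, the application has to decide which one a given user's row belongs to. The following is a minimal sketch of that routing step, not a complete solution; the host names and the `shard_host_for` helper are hypothetical, and it assumes each user record already carries a region code of `EU` or `US`.

```python
# Hypothetical host names; substitute the real EU and US MySQL servers.
SHARD_HOSTS = {
    "EU": "mysql-eu.example.internal",
    "US": "mysql-us.example.internal",
}


def shard_host_for(region: str) -> str:
    """Return the MySQL host that stores users of the given region."""
    try:
        return SHARD_HOSTS[region.upper()]
    except KeyError:
        # Refuse to guess: writing to the wrong shard would defeat the residency goal.
        raise ValueError(f"no shard configured for region {region!r}") from None


print(shard_host_for("eu"))
```

Failing on an unknown region, rather than falling back to a default server, is one possible design choice here; it keeps a row from silently landing outside its required region.
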
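If you shard data (split it across multiple servers), then unique keys in general become more complicated.
There are at least four popular solutions for generating globally unique id values: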
  • Use auto-increment, but ensure the shards never allocate the same id values by setting auto_increment_increment to the number of shards and auto_increment_offset to a distinct value between 1 and the number of shards (MySQL treats a value of 0 for either variable as 1, so offsets must start at 1). For example, if you have 2 shards, auto_increment_increment would be set to 2 on both shards, and auto_increment_offset would be set to 1 on the US shard and 2 on the EU shard.

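  • Use a compound primary key, one column being auto-increment, and the other column being constrained to a distinct shardid. It's up to you to define the table differently on each shard, so that each shard's CHECK constraint pins shardid to that shard's own value. Note that MySQL only enforces CHECK constraints from version 8.0.16 onward; earlier versions parse and ignore them.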

    CREATE TABLE users(
      id INT NOT NULL AUTO_INCREMENT, 
      shardid INT NOT NULL CHECK (shardid = 1),
      PRIMARY KEY(id, shardid)
    );
    
  • Do not use the built-in auto-increment features of MySQL, but instead create a globally unique id generator service, which both the US and the EU app instances call to get the next id. This is something the client app should call, and then pass the value as a query parameter to an INSERT statement. If it's too slow for the remote side to call this service on every INSERT, then the remote app may fetch a batch of id values in advance and store them locally, always keeping a "supply" of id values to use.

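  • Use a UUID or globally unique string. MySQL's UUID() function returns a version 1 UUID, built from a timestamp and a node identifier of the host, so values generated on different shards are designed not to collide. The id column then has to be a string type such as CHAR(36) instead of INT. You could use a trigger in your MySQL database to fill in the primary key with a UUID.

    CREATE TRIGGER t BEFORE INSERT ON users FOR EACH ROW SET NEW.id = UUID();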
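
To see why the auto-increment option in the first bullet cannot collide, it helps to write out the values each shard hands out. The snippet below is a simplified model of MySQL's behaviour on an empty table, not MySQL itself: each shard produces `offset`, `offset + increment`, `offset + 2 * increment`, and so on. The function name `auto_increment_values` is made up for illustration. Both variables accept values from 1 upward, so the sketch uses offsets 1 and 2 for two shards.

```python
from itertools import islice


def auto_increment_values(offset: int, increment: int):
    """Yield the ids one shard would allocate: offset, offset + increment, ..."""
    if not 1 <= offset <= increment:
        raise ValueError("offset must be between 1 and increment")
    value = offset
    while True:
        yield value
        value += increment


SHARD_COUNT = 2  # corresponds to auto_increment_increment = 2 on both servers

# auto_increment_offset = 1 on the US shard, 2 on the EU shard
us_ids = list(islice(auto_increment_values(offset=1, increment=SHARD_COUNT), 5))
eu_ids = list(islice(auto_increment_values(offset=2, increment=SHARD_COUNT), 5))

print(us_ids)  # [1, 3, 5, 7, 9]
print(eu_ids)  # [2, 4, 6, 8, 10]
```

The two sequences interleave and never overlap. The trade-off is that the shard count is baked into the settings, so adding a third shard later means changing the increment on every server.
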
    

Sharding is a complex topic, and you need to choose the solution that best fits your application.
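
As one illustration, the id generator service option with locally cached batches might look roughly like this. It is a sketch under assumptions: `IdService` and `BatchedIdClient` are hypothetical names, and the service runs in-process here, whereas in a real deployment `reserve` would be a network call to a single central service.

```python
import threading


class IdService:
    """Hypothetical stand-in for the central id generator service."""

    def __init__(self, start: int = 1):
        self._next = start
        self._lock = threading.Lock()

    def reserve(self, count: int) -> range:
        """Atomically reserve `count` consecutive ids and return them."""
        with self._lock:
            first = self._next
            self._next += count
        return range(first, first + count)


class BatchedIdClient:
    """Keeps a local supply of ids so most INSERTs need no remote call.

    Not thread-safe; a real app would use one client per worker or add a lock.
    """

    def __init__(self, service: IdService, batch_size: int = 100):
        self._service = service
        self._batch_size = batch_size
        self._supply = iter(())  # empty until the first request

    def next_id(self) -> int:
        try:
            return next(self._supply)
        except StopIteration:
            # Local supply exhausted: fetch a fresh batch from the service.
            self._supply = iter(self._service.reserve(self._batch_size))
            return next(self._supply)


service = IdService()
us_app = BatchedIdClient(service, batch_size=3)
eu_app = BatchedIdClient(service, batch_size=3)

us_ids = [us_app.next_id() for _ in range(4)]
eu_ids = [eu_app.next_id() for _ in range(4)]
print(us_ids)  # [1, 2, 3, 4]
print(eu_ids)  # [7, 8, 9, 10]
```

Each returned value would then be passed as a query parameter to the INSERT statement. Ids that remain in a client's supply when it shuts down are simply never used, so the sequence can have gaps, and ids are no longer ordered by insertion time across shards.
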

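I recommend that you first talk to a qualified legal professional who is familiar with the GDPR, to confirm that you actually need data residency. In some cases you don't, according to articles such as https://www.mcafee.com/blogs/enterprise/data-security/data-residency-a-concept-not-found-in-the-gdpr/ (although that article is not legal advice).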


Content provided by Stack Overflow.