MySQL导入--使用线程

Question

MySQL导入--使用线程

7

我有一个大型数据库需要复制到从服务器上。尝试通过常规的mysqldump导入它（约15GB）花费了2天时间并且失败了。所以我正在尝试使用mysqldump --tab技巧。

我还想使用--use-threads进行导入，但似乎它并没有同时处理多个表。有没有办法确定它是否在工作？

mysqldump --single-transaction --quick --hex-blob --master-data=1 --tab=/tmp/backup/ apm

在从服务器上：

cat *.sql | mysql apm
mysqlimport --lock-tables --use-threads=4 apm /tmp/backup/*.txt

此外：有没有办法在不编辑配置文件和重新启动服务器的情况下禁用binlog？看起来mysql再次将所有数据复制到binlog中似乎有点愚蠢和缓慢。

- Kevin

5个回答

3

你是否在使用 MySQL 5.1.7 或更高版本？

如果你想测试事物是否按预期进行，为什么不使用一个测试模式和只有数据样本，这样可以使其运行更快呢？

更新关于 --use-threads 是否起作用，我不确定有什么明确的方法来检查。但是，我刚刚运行了一些测试，在 ~130M 数据中没有看到任何真正的差异：

mysqlimport --lock-tables --debug-info --use-threads=2 test /tmp/test/*.txt
Records: 2076063  Deleted: 0  Skipped: 0  Warnings: 0

User time 0.02, System time 0.08
Maximum resident set size 3884, Integral resident set size 0
Non-physical pagefaults 737, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 1340, Involuntary context switches 17

----

mysqlimport --lock-tables --debug-info --use-threads=4 test /tmp/test/*.txt
Records: 2076063  Deleted: 0  Skipped: 0  Warnings: 0

User time 0.03, System time 0.09
Maximum resident set size 3884, Integral resident set size 0
Non-physical pagefaults 738, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 1343, Involuntary context switches 41

----

mysqlimport --lock-tables --debug-info --use-threads=8 test /tmp/test/*.txt
Records: 2076063  Deleted: 0  Skipped: 0  Warnings: 0

User time 0.02, System time 0.09
Maximum resident set size 3884, Integral resident set size 0
Non-physical pagefaults 738, Physical pagefaults 0, Swaps 0
Blocks in 0 out 0, Messages in 0 out 0, Signals 0
Voluntary context switches 1341, Involuntary context switches 30

- cEz

我是。但是，我怎么能通过一个小样本告诉它仍在并行执行呢？ - Kevin

是的，我使用的是5.5.8版本。令人惊讶的是，在线上很少提到--use-threads，没有人备份/恢复大型数据库吗？！似乎任何超过几个G的数据都需要花费数天时间才能完成innodb操作。Percona有一个数据复制工具，但它会复制整个数据库，并需要更多的设置。 - Kevin

抱歉，我将“有没有办法知道它是否在工作？”误读为整个问题的一部分，而不仅仅是关于线程的使用。样例建议是快速检查您的进程是否正在工作，因为您没有表明它已知道如何工作。 - cEz

文档说明：--lock-tables 锁定所有写入表（这将禁用线程）。 - Ommadawn

2

我和 @bill-karwin 一样，也使用 mydumper / myloader 工具。

在 Ubuntu 上：

sudo apt-get install mydumper

对于其他任何情况，请按照以下说明操作：http://centminmod.com/mydumper.html

然后，进行备份：

mydumper -h [host] -u [user] -p [pass] -P [port] -B [database] -c -C --regex '^(?!excluded_table_1|excluded_table_2)' -v 3

然后，进行恢复操作：

myloader -d [dir_created_by_mydumper] -h [host] -u [user] -p [pass] -P [port] -o -C -v 3

注意：

-C 标志压缩 MySQL 协议，仅在外部托管的数据库上使用
-c 使用 gzip 压缩，建议始终保留
-v 3 提供详细输出（以便您了解正在发生的事情），如果关闭此选项，则脚本运行速度将更快。
--regex 可以是任何有效的正则表达式。
创建一个 ~/.my.cnf 文件，您就不需要再输入主机、用户名、密码和端口了。
mydumper --help 和 myloader --help 将为您提供所有选项的完整列表

- Silas Palmer

我使用那个工具时出现了分段错误，它不够稳定。 - John

2

根据源代码，我认为您应该禁用选项--lock-tables，请参见以下内容：

 #ifdef HAVE_LIBPTHREAD
  if (opt_use_threads && !lock_tables)
  {
    pthread_t mainthread;            /* Thread descriptor */
    pthread_attr_t attr;          /* Thread attributes */
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr,
                                PTHREAD_CREATE_DETACHED);

    VOID(pthread_mutex_init(&counter_mutex, NULL));
    VOID(pthread_cond_init(&count_threshhold, NULL));

    for (counter= 0; *argv != NULL; argv++) /* Loop through tables */
    {
      pthread_mutex_lock(&counter_mutex);
      while (counter == opt_use_threads)
      {
        struct timespec abstime;

        set_timespec(abstime, 3);
        pthread_cond_timedwait(&count_threshhold, &counter_mutex, &abstime);
      }
      /* Before exiting the lock we set ourselves up for the next thread */
      counter++;
      pthread_mutex_unlock(&counter_mutex);
      /* now create the thread */
      if (pthread_create(&mainthread, &attr, worker_thread, 
                         (void *)*argv) != 0)
      {
        pthread_mutex_lock(&counter_mutex);
        counter--;
        pthread_mutex_unlock(&counter_mutex);
        fprintf(stderr,"%s: Could not create thread\n",
                my_progname);
      }
    }

    /*
      We loop until we know that all children have cleaned up.
    */
    pthread_mutex_lock(&counter_mutex);
    while (counter)
    {
      struct timespec abstime;

      set_timespec(abstime, 3);
      pthread_cond_timedwait(&count_threshhold, &counter_mutex, &abstime);
    }
    pthread_mutex_unlock(&counter_mutex);
    VOID(pthread_mutex_destroy(&counter_mutex));
    VOID(pthread_cond_destroy(&count_threshhold));
    pthread_attr_destroy(&attr);
  }

- emlaggr

1

多么有趣，谢谢 :) 实际上，我过去一年左右一直在使用Percona MySQL的热备份脚本。它们可以创建一个完整的“可引导”MySQL数据目录，因此无需导入任何内容。 - Kevin

是的，从mysqlimport --help -> -l, --lock-tables 锁定所有写表（这将禁用线程）。 - tiomno

0

我已经使用了mydumper作为mysqldump/mysqlimport的替代品。

Mydumper使用多个线程创建.sql格式的逻辑备份文件，并且myloader可以使用多个线程导入它们。

唯一的缺点是你必须从源代码构建它自己，因为似乎没有可用的二进制包。不过在CentOS上构建非常简单。

https://launchpad.net/mydumper

- Bill Karwin

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- rubricize.me · Accepted Answer

事实上，似乎只有在指定导入多个文件（到多个表）时，线程才会启动——它对于单个大文件没有帮助。要确定是否实际上使用了线程，请查看SHOW PROCESSLIST输出。在这里，您可以看到它实际上正在工作，尽管如上所述，是在不同的表上。

mysql> show processlist;
+-------+--------+------------------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+
| Id    | User   | Host             | db   | Command | Time | State     | Info                                                                                                 |
+-------+--------+------------------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+
|  4097 | root   | 127.0.0.1:33372  | test | Query   |    0 | executing | show processlist                                                                                     | 
|  6145 | root   | 10.2.13.44:44182 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls2.tsv' INTO TABLE `qpcrecpls2` IGNORE 0 LINES                           | 
|  7169 | root   | 10.2.13.44:44186 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls1.tsv' INTO TABLE `qpcrecpls1` IGNORE 0 LINES                           | 
|  8193 | root   | 10.2.13.44:44184 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls3.tsv' INTO TABLE `qpcrecpls3` IGNORE 0 LINES                           | 
|  9217 | root   | 10.2.13.44:44188 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls4.tsv' INTO TABLE `qpcrecpls4` IGNORE 0 LINES                           | 
| 10241 | root   | 10.2.13.44:44183 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls5.tsv' INTO TABLE `qpcrecpls5` IGNORE 0 LINES                           | 
| 11265 | root   | 10.2.13.44:44185 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls.tsv' INTO TABLE `qpcrecpls` IGNORE 0 LINES                             | 
| 12289 | root   | 10.2.13.44:44189 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls6.tsv' INTO TABLE `qpcrecpls6` IGNORE 0 LINES                           | 
| 13313 | root   | 10.2.13.44:44190 | test | Query   |    3 | executing | LOAD DATA   INFILE 'qpcrecpls7.tsv' INTO TABLE `qpcrecpls7` IGNORE 0 LINES                           | 
+-------+--------+------------------+------+---------+------+-----------+------------------------------------------------------------------------------------------------------+

--verbose（详细）输出也很有启发性。