技术干货 | 如何将大表在线改造为分区表并释放空间
<h1 id="技术干货--如何将大表在线改造为分区表并释放空间">技术干货 | 如何将大表在线改造为分区表并释放空间</h1><h2 id="出现麻烦">出现麻烦</h2>
<p>你是否遇到过如此令人麻烦的问题,比如:由于前期规划不当,后期库表数据量猛增;或者由于“年久失修”而造成的表数据积累不断。等等诸如此类 <strong>大表</strong> 问题。</p>
<p>面对这些大表,<strong>是删也不能删,清理又困难</strong>。</p>
<p>即使你不嫌麻烦,一点一点的通过 delete 清理了数据,但是还会有令人头疼的 <strong>表碎片</strong> 问题。</p>
<p>面对 表ibd 文件的只增不减,也许想到了可以通过这些方法解决:</p>
<ul>
<li><code>ALTER TABLE xxx engine=innodb;</code></li>
<li><code>OPTIMIZE TABLE xxx;</code></li>
</ul>
<p>这两种方法虽然在清理表数据后可以释放空间,但是会造成锁表问题。</p>
<h2 id="如何优雅在线的清理数据并释放空间呢">如何优雅在线的清理数据并释放空间呢?</h2>
<p>清理数据方法有很多,比如 [删除分区] 等,其实最简单有效的方法就是 <strong>删除分区</strong> 。</p>
<p>有人会问:当初数据库中没有设计为分区表,谁也没想到这个表现在数据量这么大,这该怎么办?</p>
<p>再也不用抱怨你接手了一个烂摊子,也不用频繁接收表文件过大造成的文件系统告警了......</p>
<h4 id="我们在线把大表改造成分区表不就好了">我们"在线"把大表改造成分区表不就好了!</h4>
<p>这将优雅的实现:</p>
<ul>
<li>在线 ——> 不影响业务</li>
<li>分区 ——> 便于管理</li>
<li>空间 ——> 删除分区直接释放</li>
</ul>
<h2 id="闪亮登场">闪亮登场</h2>
<p>实现我们的大目标,那就需要借助专业的小工具。他就是</p>
<h4 id="percona-toolkit-之-pt-online-schema-change">percona-toolkit 之 pt-online-schema-change</h4>
<h3 id="概要">概要</h3>
<p>在线修改表结构,特点是修改过程中不会造成读写阻塞。</p>
<h3 id="原理">原理</h3>
<p>工作原理是创建要更改的表的空副本,根据需要对其进行修改,然后将原始表中的行复制到新表中。复制完成后,它会移走原始表并用新表替换。默认情况下,它还会删除原始表。</p>
<h3 id="用法">用法</h3>
<pre><code class="language-bash">pt-online-schema-change DSN
</code></pre>
<h2 id="选项">选项</h2>
<blockquote>
<p>详见:https://greatsql.cn/docs/8.0.32-27/6-oper-guide/10-5-pt-development.html#pt-online-schema-change</p>
</blockquote>
<h2 id="最佳实践">最佳实践</h2>
<h3 id="1-测试表">1. 测试表</h3>
<pre><code class="language-sql">CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`data_value` varchar(255) DEFAULT NULL,
`expired_date` int unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1;
</code></pre>
<h3 id="2-测试数据">2. 测试数据</h3>
<p>表中存在数据 10000000</p>
<pre><code class="language-sql">greatsql> SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
| 10000000 |
+----------+
1 row in set (3.03 sec)
</code></pre>
<h3 id="3-查看表文件大小">3. 查看表文件大小</h3>
<p>t1.ibd 文件大小有 468M</p>
<pre><code class="language-sql">-rw-r----- 1 mysql mysql 468M May5 06:07 t1.ibd
</code></pre>
<h2 id="4-修改主键">4. 修改主键</h2>
<p>如果使用分区表,分区键需要包含在主键中,使用 pt-online-schema-change 可以 <strong>online</strong> 修改主键。</p>
<p><strong>首先,使用</strong> <strong><code>--dry-run</code></strong> <strong>进行执行前测试</strong></p>
<pre><code class="language-sql">pt-online-schema-change h=192.168.105.101,P=3306,u=root,p=root,D=test,t=t1 --alter "DROP PRIMARY KEY,ADD PRIMARY KEY (id, expired_date)" --recursion-method=none --no-check-replication-filters --no-check-unique-key-change --alter-foreign-keys-method auto --print --dry-run
</code></pre>
<p>执行结果:</p>
<pre><code class="language-sql">Operation, tries, wait:
......
2025-05-05T06:59:52 Dropping new table...
DROP TABLE IF EXISTS `test`.`_t1_new`;
2025-05-05T06:59:52 Dropped new table OK.
Dry run complete.`test`.`t1` was not altered.
</code></pre>
<p><strong>然后,使用</strong> <strong><code>--execute</code></strong> <strong>执行</strong></p>
<pre><code class="language-sql">pt-online-schema-change h=192.168.105.101,P=3306,u=root,p=root,D=test,t=t1 --alter "DROP PRIMARY KEY,ADD PRIMARY KEY (id, expired_date)" --no-check-alter --recursion-method=none --no-check-replication-filters --no-check-unique-key-change --alter-foreign-keys-method auto --print --execute
</code></pre>
<p>执行结果:</p>
<pre><code class="language-sql">Operation, tries, wait:
......
Copying `test`.`t1`:49% 00:30 remain
Copying `test`.`t1`:96% 00:02 remain
2025-05-05T07:06:55 Copied rows OK.
2025-05-05T07:06:55 Analyzing new table...
2025-05-05T07:06:55 Swapping tables...
RENAME TABLE `test`.`t1` TO `test`.`_t1_old`, `test`.`_t1_new` TO `test`.`t1`
2025-05-05T07:06:55 Swapped original and new tables OK.
2025-05-05T07:06:55 Dropping old table...
DROP TABLE IF EXISTS `test`.`_t1_old`
2025-05-05T07:06:55 Dropped old table `test`.`_t1_old` OK.
2025-05-05T07:06:55 Dropping triggers...
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_del`
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_upd`
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_ins`
2025-05-05T07:06:55 Dropped triggers OK.
Successfully altered `test`.`t1`.
</code></pre>
<p><strong>查看表结构</strong></p>
<pre><code class="language-sql">greatsql> SHOW CREATE TABLE t1\G;
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`data_value` varchar(255) COLLATE utf8mb4_bin DEFAULT NULL,
`expired_date` int unsigned NOT NULL,
PRIMARY KEY (`id`,`expired_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
</code></pre>
<p>此时可以看到,表 t1 的主键已经修改为 <code>(id,expired_date)</code> 了。</p>
<h2 id="5-调整为分区表">5. 调整为分区表</h2>
<p>使用 pt-online-schema-change 可以 <strong>online</strong> 调整为分区表。</p>
<p><strong>首先,使用</strong> <strong><code>--dry-run</code></strong> <strong>进行执行前测试</strong></p>
<pre><code class="language-sql">pt-online-schema-change h=192.168.105.101,P=3306,u=root,p=root,D=test,t=t1 \
--alter "
PARTITION BY RANGE (expired_date) (
PARTITION p202501 VALUES LESS THAN (1738339200),-- 2025-02-01 00:00:00
PARTITION p202502 VALUES LESS THAN (1740758400),-- 2025-03-01 00:00:00
PARTITION p202503 VALUES LESS THAN (1743436800),-- 2025-04-01 00:00:00
PARTITION p202504 VALUES LESS THAN (1746028800),-- 2025-05-01 00:00:00
PARTITION p202505 VALUES LESS THAN (1748707200),-- 2025-06-01 00:00:00
PARTITION p_max VALUES LESS THAN MAXVALUE
)" \
--recursion-method=none --no-check-replication-filters --no-check-unique-key-change --alter-foreign-keys-method auto --print --dry-run
</code></pre>
<p>执行结果:</p>
<pre><code class="language-sql">Operation, tries, wait:
......
2025-05-05T07:18:17 Dropping new table...
DROP TABLE IF EXISTS `test`.`_t1_new`;
2025-05-05T07:18:17 Dropped new table OK.
Dry run complete.`test`.`t1` was not altered.
</code></pre>
<p><strong>然后,使用</strong> <strong><code>--execute</code></strong> <strong>执行</strong></p>
<pre><code class="language-sql">pt-online-schema-change h=192.168.105.101,P=3306,u=root,p=root,D=test,t=t1 \
--alter "
PARTITION BY RANGE (expired_date) (
PARTITION p202501 VALUES LESS THAN (1738339200),-- 2025-02-01 00:00:00
PARTITION p202502 VALUES LESS THAN (1740758400),-- 2025-03-01 00:00:00
PARTITION p202503 VALUES LESS THAN (1743436800),-- 2025-04-01 00:00:00
PARTITION p202504 VALUES LESS THAN (1746028800),-- 2025-05-01 00:00:00
PARTITION p202505 VALUES LESS THAN (1748707200),-- 2025-06-01 00:00:00
PARTITION p_max VALUES LESS THAN MAXVALUE
)" \
--recursion-method=none --no-check-replication-filters --no-check-unique-key-change --alter-foreign-keys-method auto --print --execute
</code></pre>
<p>执行结果:</p>
<pre><code class="language-sql">Operation, tries, wait:
......
Operation, tries, wait:
......
Copying `test`.`t1`:24% 01:30 remain
Copying `test`.`t1`:59% 00:40 remain
Copying `test`.`t1`:97% 00:02 remain
2025-05-05T07:22:02 Copied rows OK.
2025-05-05T07:22:02 Analyzing new table...
2025-05-05T07:22:02 Swapping tables...
RENAME TABLE `test`.`t1` TO `test`.`_t1_old`, `test`.`_t1_new` TO `test`.`t1`
2025-05-05T07:22:02 Swapped original and new tables OK.
2025-05-05T07:22:02 Dropping old table...
DROP TABLE IF EXISTS `test`.`_t1_old`
2025-05-05T07:22:02 Dropped old table `test`.`_t1_old` OK.
2025-05-05T07:22:02 Dropping triggers...
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_del`
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_upd`
DROP TRIGGER IF EXISTS `test`.`pt_osc_test_t1_ins`
2025-05-05T07:22:02 Dropped triggers OK.
Successfully altered `test`.`t1`.
</code></pre>
<p><strong>查看表结构</strong></p>
<pre><code class="language-sql">greatsql> SHOW CREATE TABLE t1\G
*************************** 1. row ***************************
Table: t1
Create Table: CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`data_value` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
`expired_date` int unsigned NOT NULL,
PRIMARY KEY (`id`,`expired_date`)
) ENGINE=InnoDB AUTO_INCREMENT=10000001 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
/*!50100 PARTITION BY RANGE (`expired_date`)
(PARTITION p202501 VALUES LESS THAN (1738339200) ENGINE = InnoDB,
PARTITION p202502 VALUES LESS THAN (1740758400) ENGINE = InnoDB,
PARTITION p202503 VALUES LESS THAN (1743436800) ENGINE = InnoDB,
PARTITION p202504 VALUES LESS THAN (1746028800) ENGINE = InnoDB,
PARTITION p202505 VALUES LESS THAN (1748707200) ENGINE = InnoDB,
PARTITION p_max VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)
</code></pre>
<p>此时可以看到,表 t1 已经调整为分区表了。</p>
<h2 id="6-测试验证">6. 测试验证</h2>
<ol>
<li>查询表分区及数据情况</li>
</ol>
<pre><code class="language-sql">-- 查询数据
greatsql> SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
| 10000000 |
+----------+
1 row in set (3.39 sec)
-- 查询分区
greatsql> SELECT table_schema,table_name,partition_name,partition_expression,partition_description FROM information_schema.partitions WHERE table_name = 't1' order by partition_name asc;
+--------------+------------+----------------+----------------------+-----------------------+
| TABLE_SCHEMA | TABLE_NAME | PARTITION_NAME | PARTITION_EXPRESSION | PARTITION_DESCRIPTION |
+--------------+------------+----------------+----------------------+-----------------------+
| test | t1 | p202501 | `expired_date` | 1738339200 |
| test | t1 | p202502 | `expired_date` | 1740758400 |
| test | t1 | p202503 | `expired_date` | 1743436800 |
| test | t1 | p202504 | `expired_date` | 1746028800 |
| test | t1 | p202505 | `expired_date` | 1748707200 |
| test | t1 | p_max | `expired_date` | MAXVALUE |
+--------------+------------+----------------+----------------------+-----------------------+
6 rows in set (0.01 sec)
</code></pre>
<ol start="2">
<li>查询表文件情况</li>
</ol>
<pre><code class="language-sql">total 509M
-rw-r----- 1 mysql mysql 116M May5 07:22 t1#p#p202501.ibd
-rw-r----- 1 mysql mysql 104M May5 07:22 t1#p#p202502.ibd
-rw-r----- 1 mysql mysql 116M May5 07:22 t1#p#p202503.ibd
-rw-r----- 1 mysql mysql 112M May5 07:22 t1#p#p202504.ibd
-rw-r----- 1 mysql mysql60M May5 07:22 t1#p#p202505.ibd
-rw-r----- 1 mysql mysql 112K May5 07:22 t1#p#p_max.ibd
</code></pre>
<p>可以看到,t1.ibd 文件已经不存在了,每个 ibd 文件即为 分区文件。</p>
<ol start="3">
<li>删除数据,释放空间<br>
如果需要删除一些数据,直接删除分区即可</li>
</ol>
<pre><code class="language-sql">-- 删除 202501 分区数据
greatsql> ALTER TABLE t1 DROP PARTITION p202501;
Query OK, 0 rows affected (0.07 sec)
Records: 0Duplicates: 0Warnings: 0
greatsql> SELECT COUNT(*) FROM t1;
+----------+
| COUNT(*) |
+----------+
|7727745 |
+----------+
1 row in set (2.53 sec)
</code></pre>
<p>数据删除完成,看看文件系统所占空间是否释放?</p>
<pre><code class="language-sql">total 393M
-rw-r----- 1 mysql mysql 104M May5 07:42 t1#p#p202502.ibd
-rw-r----- 1 mysql mysql 116M May5 07:22 t1#p#p202503.ibd
-rw-r----- 1 mysql mysql 112M May5 07:22 t1#p#p202504.ibd
-rw-r----- 1 mysql mysql60M May5 07:22 t1#p#p202505.ibd
-rw-r----- 1 mysql mysql 112K May5 07:22 t1#p#p_max.ibd
</code></pre>
<p>可以看到,分区 p202501 所使用的 ibd 文件也不存在了,文件系统空间得到了释放。</p>
<hr>
<p>Enjoy GreatSQL 😃</p>
<h2 id="关于-greatsql">关于 GreatSQL</h2>
<p>GreatSQL是适用于金融级应用的国内自主开源数据库,具备高性能、高可靠、高易用性、高安全等多个核心特性,可以作为MySQL或Percona Server的可选替换,用于线上生产环境,且完全免费并兼容MySQL或Percona Server。</p>
<p>相关链接: GreatSQL社区 Gitee GitHub Bilibili</p>
<h2 id="greatsql社区">GreatSQL社区:</h2>
<blockquote>
<p>社区博客有奖征稿详情:https://greatsql.cn/thread-100-1-1.html</p>
</blockquote>
<p><img src="https://img2024.cnblogs.com/other/2630741/202509/2630741-20250905095426835-2000879909.png"></p>
<h2 id="技术交流群">技术交流群:</h2>
<blockquote>
<p>微信:扫码添加<code>GreatSQL社区助手</code>微信好友,发送验证信息<code>加群</code>。</p>
</blockquote>
<p><img src="https://img2024.cnblogs.com/other/2630741/202509/2630741-20250905095427086-1822557387.png"></p><br><br>
来源:https://www.cnblogs.com/greatsql/p/19074890
頁:
[1]