How do I speed up deletes from a large database table?




Problem description

Here's the problem I am trying to solve: I have recently completed a data layer re-design that allows me to load-balance my database across multiple shards. In order to keep shards balanced, I need to be able to migrate data from one shard to another, which involves copying from shard A to shard B, and then deleting the records from shard A. But I have several tables that are very big, and have many foreign keys pointed to them, so deleting a single record from the table can take more than one second.

In some cases I need to delete millions of records from the tables, and it just takes too long to be practical.

Disabling foreign keys is not an option. Deleting large batches of rows is also not an option because this is a production application and large deletes lock too many resources, causing failures. I'm using Sql Server, and I know about partitioned tables, but the restrictions on partitioning (and the license fees for enterprise edition) are so unrealistic that they are not possible.

When I began working on this problem I thought the hard part would be writing the algorithm that figures out how to delete rows from the leaf level up to the top of the data model, so that no foreign key constraints get violated along the way. But solving that problem did me no good since it takes weeks to delete records that need to disappear overnight.
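The leaf-to-root ordering described here amounts to a topological sort of the foreign-key graph, and SQL Server's catalog views can produce it directly. A hedged sketch (assumes no circular or self-referencing foreign keys, which would make the recursion run forever):

```sql
-- Sketch: derive a child-first delete order from the FK catalog.
-- Anchor: user tables that reference no other table (delete last).
-- Recursive step: a table referencing a depth-d table gets depth d+1,
-- so deleting in descending depth order never violates an FK.
WITH fk_depth AS (
    SELECT o.object_id, 0 AS depth
    FROM sys.objects o
    WHERE o.type = 'U'
      AND NOT EXISTS (SELECT 1 FROM sys.foreign_keys fk
                      WHERE fk.parent_object_id = o.object_id)
    UNION ALL
    SELECT fk.parent_object_id, d.depth + 1
    FROM sys.foreign_keys fk
    JOIN fk_depth d ON fk.referenced_object_id = d.object_id
)
SELECT OBJECT_NAME(object_id) AS table_name, MAX(depth) AS depth
FROM fk_depth
GROUP BY object_id
ORDER BY depth DESC;   -- deepest (most child-like) tables first
```

MAX(depth) handles tables reachable through more than one FK path by taking the most conservative (deepest) position in the order.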

I already built in a way to mark data as virtually deleted, so as far as the application is concerned, the data is gone, but I'm still dealing with large data files, large backups, and slower queries because of the sheer size of the tables.

Any ideas? I have already read older related posts here and found nothing that would help.

Recommended answer

Please see: Optimizing Delete on SQL Server

This MS support article might be of interest: How to resolve blocking problems that are caused by lock escalation in SQL Server:

Break up large batch operations into several smaller operations. For example, suppose you ran the following query to remove several hundred thousand old records from an audit table, and then you found that it caused a lock escalation that blocked other users:

DELETE FROM LogMessages WHERE LogDate < '2/1/2002'    

By removing these records a few hundred at a time, you can dramatically reduce the number of locks that accumulate per transaction and prevent lock escalation. For example:

SET ROWCOUNT 500
delete_more:
     DELETE FROM LogMessages WHERE LogDate < '2/1/2002'
IF @@ROWCOUNT > 0 GOTO delete_more
SET ROWCOUNT 0

Reduce the query's lock footprint by making the query as efficient as possible. Large scans or large numbers of Bookmark Lookups may increase the chance of lock escalation; additionally, it increases the chance of deadlocks, and generally adversely affects concurrency and performance.
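For instance, if the delete's predicate is on LogDate as in the example above, an index on that column (the index name below is made up) lets each batch seek directly to the qualifying rows instead of scanning the whole table:

```sql
-- Illustrative: supports the WHERE LogDate < ... predicate so each
-- batched DELETE seeks to the oldest rows rather than scanning.
CREATE INDEX IX_LogMessages_LogDate ON LogMessages (LogDate);
```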

Another recommended answer

delete_more:
     DELETE TOP(500) FROM LogMessages WHERE LogDate < '2/1/2002'
IF @@ROWCOUNT > 0 GOTO delete_more

You could achieve the same result using SET ROWCOUNT as suggested by Mitch, but according to MSDN it won't be supported for DELETE and some other operations in future versions of SQL Server:

Using SET ROWCOUNT will not affect DELETE, INSERT, and UPDATE statements in a future release of SQL Server. Avoid using SET ROWCOUNT with DELETE, INSERT, and UPDATE statements in new development work, and plan to modify applications that currently use it. For a similar behavior, use the TOP syntax. For more information, see TOP (Transact-SQL).
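The same batching idea can also be written as a structured loop, avoiding both GOTO and the deprecated SET ROWCOUNT (a sketch using the table from the example above; the batch size of 500 is arbitrary):

```sql
-- Batched delete with DELETE TOP in a WHILE loop. Outside an explicit
-- transaction, each iteration commits on its own, so locks never
-- accumulate enough to trigger escalation.
WHILE 1 = 1
BEGIN
    DELETE TOP (500) FROM LogMessages WHERE LogDate < '2/1/2002';
    IF @@ROWCOUNT = 0 BREAK;   -- stop once no rows remain to delete
END
```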

Another recommended answer

You could create new tables, copy all but the "deleted" rows into them, then swap the table names. Finally, drop the old tables. If you're deleting a large percentage of the records, then this may actually be faster.
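A minimal sketch of the copy-and-swap idea, reusing the LogMessages example from the earlier answer. Note that indexes, constraints, permissions, and any inbound foreign keys must be re-created on the new table, which is why this works best for standalone tables:

```sql
-- Copy only the rows to keep into a new table...
SELECT * INTO LogMessages_keep
FROM LogMessages
WHERE LogDate >= '2/1/2002';

-- ...swap names so the application sees the trimmed table...
EXEC sp_rename 'LogMessages', 'LogMessages_old';
EXEC sp_rename 'LogMessages_keep', 'LogMessages';

-- ...then drop the old table to reclaim the space.
DROP TABLE LogMessages_old;
```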