Handling large databases

Problem description

I have been working on a web project (ASP.NET) for around six months. The final product is about to go live. The project uses SQL Server as the database. We have done performance testing with some large volumes of data, and the results show that performance degrades when the data becomes too large, say 2 million rows (timeout issues, delayed responses, etc.). At first we were using a fully normalized database, but we have now partially denormalized it due to performance issues (to reduce joins). First of all, is this the right decision? Also, what are the possible solutions when the data size becomes very large, as the number of clients increases in the future?

I would like to add further:

  • The 2 million rows are in entity tables; the tables resolving the relations have far more rows.
  • Performance degrades as the data and the number of users increase.
  • Denormalization was done after identifying the heavily used queries.
  • We are also making heavy use of XML columns and XQuery. Can this be the cause?
  • A bit off topic: some folks on my project say that a dynamic SQL query is faster than a stored procedure approach. They have done some kind of performance testing to prove their point. I think the opposite is true. Some of the heavily used queries are dynamically created, whereas most of the other queries are encapsulated in stored procedures.
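
On the dynamic SQL versus stored procedure point, here is a minimal sketch (table, column, and procedure names are hypothetical) of the same lookup written both ways. Dynamic SQL submitted through sp_executesql is parameterized and gets a cached, reusable plan much like a stored procedure does, so in practice the difference usually comes down to parameterization and plan reuse rather than one approach being inherently faster.

    -- Parameterized dynamic SQL via sp_executesql: the plan is cached and
    -- reused across calls with different parameter values.
    DECLARE @sql nvarchar(max);
    SET @sql = N'SELECT OrderID, OrderDate, TotalDue
                 FROM dbo.Orders
                 WHERE CustomerID = @CustomerID';
    EXEC sys.sp_executesql @sql, N'@CustomerID int', @CustomerID = 42;
    GO

    -- The equivalent stored procedure.
    CREATE PROCEDURE dbo.GetOrdersForCustomer
        @CustomerID int
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT OrderID, OrderDate, TotalDue
        FROM dbo.Orders
        WHERE CustomerID = @CustomerID;
    END;
    GO

    EXEC dbo.GetOrdersForCustomer @CustomerID = 42;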

Recommended answer

In the scheme of things, a few million rows is not a particularly large database.

Assuming we are talking about an OLTP database, denormalising without first identifying the root cause of your bottlenecks is a very, very bad idea.

The first thing you need to do is profile your query workload over a representative time period to identify where most of the work is being done (for instance, using SQL Profiler, if you are using SQL Server). Look at the number of logical reads a query performs multiplied by the number of times executed. Once you have identified the top ten worst performing queries, you need to examine the query execution plans in detail.
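
If a Profiler trace is not convenient, a rough alternative (assuming SQL Server 2005 or later) is to query the plan-cache statistics DMV and rank statements by total logical reads, which is effectively reads per execution multiplied by the number of executions:

    -- Rank cached statements by total logical reads (SQL Server 2005+).
    -- Only plans still in the cache are counted, so treat this as a rough guide.
    SELECT TOP (10)
        qs.total_logical_reads,
        qs.execution_count,
        qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
        SUBSTRING(st.text,
                  (qs.statement_start_offset / 2) + 1,
                  ((CASE qs.statement_end_offset
                        WHEN -1 THEN DATALENGTH(st.text)
                        ELSE qs.statement_end_offset
                    END - qs.statement_start_offset) / 2) + 1) AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_logical_reads DESC;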

I'm going to go out on a limb here (because it is usually the case), but I would be surprised if your problem is not either

  1. Absence of the 'right' covering indexes for the costly queries (a covering-index sketch follows this list)
  2. A poorly configured or under-specified disk subsystem
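
To illustrate point 1, a covering index carries every column a query touches, so the query can be answered from the index alone without key lookups against the base table. The table and column names below are hypothetical:

    -- Hypothetical query to cover:
    --   SELECT OrderDate, TotalDue FROM dbo.Orders WHERE CustomerID = @CustomerID;
    -- The index seeks on CustomerID and stores the other two columns at the
    -- leaf level (INCLUDE requires SQL Server 2005 or later).
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerID_Covering
        ON dbo.Orders (CustomerID)
        INCLUDE (OrderDate, TotalDue);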

This SO answer describes how to profile to find the worst performing queries in a workload.

Other recommended answer

As the old saying goes "normalize till it hurts, denormalise till it works".

I love this one! This is typically the kind of thing that must not be accepted any more. I can imagine that back in dBASE III times, when you could not open more than 4 tables at a time (unless you changed some of your AUTOEXEC.BAT parameters AND rebooted your computer, ahah! ...), there was some interest in denormalisation.

But nowadays I see this solution as similar to a gardener waiting for a tsunami to water his lawn. Please use the available watering can (SQL Profiler).

And don't forget that each time you denormalize part of your database, your capacity to adapt it further decreases and the risk of bugs in the code increases, making the whole system less and less sustainable.

Other recommended answer

2 million rows is normally not a Very Large Database, depending on what kind of information you store. Usually, when performance degrades, you should verify your indexing strategy. The SQL Server Database Engine Tuning Advisor may be of help there.
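
Besides the Tuning Advisor, one rough starting point for reviewing an indexing strategy (assuming SQL Server 2005 or later) is the missing-index DMVs, which record index suggestions the optimizer produced while compiling queries. Treat them as hints to investigate, not as indexes to create blindly:

    -- Index suggestions accumulated since the last instance restart.
    -- Validate each one against the real workload before creating anything.
    SELECT
        mid.statement AS table_name,
        mid.equality_columns,
        mid.inequality_columns,
        mid.included_columns,
        migs.user_seeks,
        migs.avg_total_user_cost * migs.avg_user_impact * migs.user_seeks AS rough_benefit
    FROM sys.dm_db_missing_index_details AS mid
    JOIN sys.dm_db_missing_index_groups AS mig
        ON mig.index_handle = mid.index_handle
    JOIN sys.dm_db_missing_index_group_stats AS migs
        ON migs.group_handle = mig.index_group_handle
    ORDER BY rough_benefit DESC;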