MySQL JOIN Abuse? How bad can it get?

Source: https://www.itbaoku.cn/post/597481.html

Question

I've been reading a lot about relational databases that use many JOIN statements on every SELECT. However, I've been wondering whether there is any performance problem in the long run when abusing this method.

For example, let's say we have a users table. I would usually add the "most used" data to it instead of doing any extra JOINs. By "most used" data I mean, for instance, the username, display picture and location.

This data would always be needed when displaying any user interaction on the website, for example on every comments table JOIN for articles. Instead of doing a JOIN on the users & users_profiles tables to get the 'location' and 'display', I just use the information in the users table.

That's my approach; however, I know there are a lot of excellent and experienced programmers who can give me a word of advice on this matter.

My questions are:

Should I try to be conservative with JOINs, or should I use them more? Why?

Are there any performance problems in the long run when using JOINs a lot?

Note: I must clarify that I'm not trying to avoid JOINs altogether. I use them only when needed. In this example, that would be comment/article authors, extra profile information that only displays on user profile pages, etc.

Recommended answer

My advice on data modeling is:

  • Generally speaking, you should favour optional (nullable) columns over 1:1 joins. There are still instances where 1:1 makes sense, usually revolving around subtyping. Oddly, people tend to be more squeamish about nullable columns than about joins;
  • Don't make a model too indirect unless really justified (more on this below);
  • Favour joins over aggregation. This can vary, so it needs to be tested. See Oracle vs MySQL vs SQL Server: Aggregation vs Joins for an example of this;
  • Joins are better than N+1 selects. An N+1 select is, for example, selecting an order from a database table and then issuing a separate query to get all the line items for that order;
  • The scalability of joins is usually only an issue when you're doing mass selects. If you select a single row and then join that to a few things, this is rarely a problem (but sometimes it is);
  • Foreign keys should always be indexed unless you're dealing with a trivially small table.
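
The N+1 point above can be made concrete with a small sketch. It is only an illustration: SQLite (via Python's standard sqlite3 module) stands in for MySQL so the code runs without a server, and the orders and line_items tables, their columns and the sample rows are all hypothetical.

```python
import sqlite3

# SQLite is used as a stand-in for MySQL; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE line_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        product TEXT
    );
    -- Index the foreign key, as the last bullet recommends.
    CREATE INDEX idx_line_items_order_id ON line_items(order_id);
    INSERT INTO orders VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO line_items (order_id, product) VALUES
        (1, 'pen'), (1, 'ink'), (2, 'paper'), (3, 'stapler');
""")

def n_plus_one(conn):
    """One query for the orders, then one extra query per order."""
    queries = 0
    result = {}
    orders = conn.execute("SELECT id FROM orders ORDER BY id").fetchall()
    queries += 1
    for (order_id,) in orders:
        rows = conn.execute(
            "SELECT product FROM line_items WHERE order_id = ? ORDER BY id",
            (order_id,),
        ).fetchall()
        queries += 1
        result[order_id] = [product for (product,) in rows]
    return result, queries

def single_join(conn):
    """The same data fetched in one round trip with a JOIN."""
    rows = conn.execute("""
        SELECT o.id, li.product
        FROM orders AS o
        JOIN line_items AS li ON li.order_id = o.id
        ORDER BY o.id, li.id
    """).fetchall()
    result = {}
    for order_id, product in rows:
        result.setdefault(order_id, []).append(product)
    return result, 1

nested, nested_queries = n_plus_one(conn)
joined, joined_queries = single_join(conn)
```

With three orders the first function issues four queries and the second issues one, and both return the same mapping. Note that the inner JOIN would drop orders without line items; a LEFT JOIN would be needed to keep them.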

More in Database Development Mistakes Made by Application Developers.

Now as for directness of a model, let me give you an example. Let's say you're designing a system for authentication and authorization of users. An overengineered solution might look something like this:

  • Alias (id, username, user_id);
  • User (id, ...);
  • Email (id, user_id, email address);
  • Login (id, user_id, ...);
  • Login Roles (id, login_id, role_id);
  • Role (id, name);
  • Role Privilege (id, role_id, privilege_id);
  • Privilege (id, name).
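
As a rough sketch of what this schema costs at query time, the lookup from a username to privilege names might look like the following. SQLite (through Python's sqlite3 module) is used in place of MySQL so the example is self-contained; the snake_case table names, the elided "..." columns being left out, and the sample rows are assumptions made for illustration. The Email table is omitted because it is not on the path from username to privileges.

```python
import sqlite3

# Hypothetical rendering of the overengineered schema; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE alias (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    CREATE TABLE login (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE privilege (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE login_role (
        id INTEGER PRIMARY KEY,
        login_id INTEGER NOT NULL REFERENCES login(id),
        role_id INTEGER NOT NULL REFERENCES role(id)
    );
    CREATE TABLE role_privilege (
        id INTEGER PRIMARY KEY,
        role_id INTEGER NOT NULL REFERENCES role(id),
        privilege_id INTEGER NOT NULL REFERENCES privilege(id)
    );
    INSERT INTO users VALUES (1);
    INSERT INTO alias VALUES (1, 'alice', 1);
    INSERT INTO login VALUES (1, 1);
    INSERT INTO role VALUES (1, 'editor');
    INSERT INTO privilege VALUES (1, 'edit_post'), (2, 'publish_post');
    INSERT INTO login_role VALUES (1, 1, 1);
    INSERT INTO role_privilege VALUES (1, 1, 1), (2, 1, 2);
""")

# Every hop in the model becomes one JOIN in the query.
SIX_JOINS = """
    SELECT p.name
    FROM alias AS a
    JOIN users AS u ON u.id = a.user_id
    JOIN login AS l ON l.user_id = u.id
    JOIN login_role AS lr ON lr.login_id = l.id
    JOIN role AS r ON r.id = lr.role_id
    JOIN role_privilege AS rp ON rp.role_id = r.id
    JOIN privilege AS p ON p.id = rp.privilege_id
    WHERE a.username = ?
    ORDER BY p.name
"""

privileges = [name for (name,) in conn.execute(SIX_JOINS, ("alice",))]
```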

So you need 6 joins to get from the username entered to the actual privileges. Sure, there might be an actual requirement for this, but more often than not this kind of system is put in because of hand-wringing by some developer who thinks they might someday need it, even though every user has only one alias, user to login is 1:1, and so on. A simpler solution is:

  • User (id, username, email address, user type)

And, well, that's it. Perhaps you do need a complex role system, but it's also quite possible that you don't. If you do, it's reasonably easy to slot one in (user type becomes a foreign key into a user types or roles table), and it's generally straightforward to map the old model to the new.
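
One way to picture "slotting it in" later is sketched below. This is an assumption-laden illustration, not a prescribed migration: SQLite stands in for MySQL, and the user_types table, the user_type_id column and the sample users are hypothetical.

```python
import sqlite3

# Start with the simple single-table model (SQLite stands in for MySQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE,
        email_address TEXT NOT NULL,
        user_type TEXT NOT NULL
    );
    INSERT INTO users (username, email_address, user_type) VALUES
        ('alice', 'alice@example.com', 'admin'),
        ('bob', 'bob@example.com', 'member');
""")

# With the simple model, authorization needs no join at all.
user_type = conn.execute(
    "SELECT user_type FROM users WHERE username = ?", ("alice",)
).fetchone()[0]

# Later, if a richer role system is really needed, map the old column
# onto a lookup table and point a foreign key at it.
conn.executescript("""
    CREATE TABLE user_types (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    INSERT INTO user_types (name) SELECT DISTINCT user_type FROM users;
    ALTER TABLE users ADD COLUMN user_type_id INTEGER REFERENCES user_types(id);
    UPDATE users
    SET user_type_id = (
        SELECT id FROM user_types WHERE name = users.user_type
    );
""")

# After the migration, one join recovers the same information.
migrated = conn.execute("""
    SELECT u.username, t.name
    FROM users AS u
    JOIN user_types AS t ON t.id = u.user_type_id
    ORDER BY u.username
""").fetchall()
```

The old user_type column is left in place here; dropping it would be a separate step once nothing reads it any more.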

This is the thing about complexity: it's easy to add and hard to remove. Usually it's a constant vigil against unintended complexity, which is bad enough without making it worse by adding unnecessary complexity.

Other recommended answer

Some bright person once said:

Normalize until it hurts, denormalize until it works!

It all depends on the type of joins and the join conditions, but there is nothing wrong with them. Joins ON table1.PK = table2.FK are very efficient.
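
A quick way to check this for a given query is to look at the query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in; on MySQL the analogous tool is EXPLAIN, and its output format differs. The users and comments tables, the index name and the sample rows are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        body TEXT
    );
    -- The foreign key column is indexed so the join can seek instead of scan.
    CREATE INDEX idx_comments_user_id ON comments(user_id);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO comments (user_id, body) VALUES
        (1, 'first'), (1, 'second'), (2, 'third');
""")

# A join on primary key = foreign key, restricted to one user.
QUERY = """
    SELECT u.username, c.body
    FROM users AS u
    JOIN comments AS c ON c.user_id = u.id
    WHERE u.id = ?
"""

rows = conn.execute(QUERY, (1,)).fetchall()

# The fourth column of each EXPLAIN QUERY PLAN row is a human-readable step.
plan_steps = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,))]
for step in plan_steps:
    print(step)
```

If the plan shows a full table scan on the joined table instead of an index or primary-key search, that is usually the first thing to fix before blaming the JOIN itself.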

Other recommended answer

If the data is 1 <-> 1 and you will not have many null fields, don't over-normalize. You can still specify only the required fields (the "most used" data) in the SELECT statements.