News feed database design as in Facebook


Article URL: https://www.itbaoku.cn/post/597729.html

Question

How would I make a news-feed-"friendly" database design, so that it isn't extremely expensive to query all of the items that go into the news feed? The only way I can think of would involve UNIONing nearly every table (representing groups, notes, friends, etc.) and getting the dates and such. That seems like it would be a really expensive query to run for each user, and it would be pretty hard to cache something like that, since everyone's feed is different.

Recommended answer

Firstly, consider doing a performance prototype to check your hunch that the union would be too expensive. You may be prematurely optimising something that is not an issue.

If it is a real issue, consider a table designed purely to hold the event feed data, that must be updated in parallel with the other tables.

E.g. when you create a Note record, also create an event record in the Event table with the date, description, and user involved.

Consider indexing the Event table on UserId (or on UserId and Date). Also consider clearing old data when it is no longer required.

This isn't a normalised schema, but it may be faster if getting an event feed is a frequent operation.
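
The event-table idea above could be sketched roughly as follows, using Python's built-in sqlite3 module. The table and column names (`notes`, `events`, `user_id`, `created_at`), the helper function names, and the use of integer timestamps are all assumptions made for illustration; a real schema will differ, and a feed that shows friends' activity would filter on a list of user IDs rather than a single one.

```python
import sqlite3

def create_schema(conn):
    # Hypothetical schema: one source table (notes) plus the denormalised events table.
    conn.executescript("""
        CREATE TABLE notes (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL,
            body TEXT NOT NULL
        );
        CREATE TABLE events (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL,
            created_at INTEGER NOT NULL,
            description TEXT NOT NULL
        );
        -- Index on (user, date) so a per-user feed lookup avoids a full scan.
        CREATE INDEX idx_events_user_date ON events (user_id, created_at);
    """)

def create_note(conn, user_id, body, created_at):
    # One transaction: the note and its event row are committed together,
    # or both are rolled back if either insert fails.
    with conn:
        cur = conn.execute(
            "INSERT INTO notes (user_id, body) VALUES (?, ?)", (user_id, body)
        )
        note_id = cur.lastrowid
        conn.execute(
            "INSERT INTO events (user_id, created_at, description) VALUES (?, ?, ?)",
            (user_id, created_at, f"created note {note_id}"),
        )
    return note_id

def recent_events(conn, user_id, limit=10):
    # The feed is read from the events table only; no UNION over source tables.
    return conn.execute(
        "SELECT created_at, description FROM events "
        "WHERE user_id = ? ORDER BY created_at DESC LIMIT ?",
        (user_id, limit),
    ).fetchall()

def purge_old_events(conn, cutoff):
    # Remove events older than the cutoff; returns the number of rows deleted.
    with conn:
        cur = conn.execute("DELETE FROM events WHERE created_at < ?", (cutoff,))
    return cur.rowcount
```

The trade-off is the one the answer names: every write to a source table costs an extra insert, in exchange for a cheap single-table read when building the feed.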

Other recommended answer

It's hard to answer this question without a schema, but my hunch is that a UNION involving 10 or more properly indexed tables is nothing.
A typical LAMP application like WordPress or phpBB runs more than 10 queries per page view without problems. So don't worry.

Other recommended answer

UNION = expensive, because the complete result set is subject to a DISTINCT operation. UNION ALL = cheaper, because it is effectively multiple queries for which the results of each are appended together.
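
To make the difference concrete, here is a small sketch using Python's built-in sqlite3 module. The two tables and their contents are made up for illustration; the point is only that UNION removes the duplicate row while UNION ALL keeps it.

```python
import sqlite3

def union_demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables that happen to share one title ('shared').
    conn.executescript("""
        CREATE TABLE notes (title TEXT);
        CREATE TABLE groups_joined (title TEXT);
        INSERT INTO notes VALUES ('hello'), ('shared');
        INSERT INTO groups_joined VALUES ('shared'), ('chess club');
    """)
    # UNION de-duplicates the combined result set.
    distinct_rows = conn.execute(
        "SELECT title FROM notes UNION SELECT title FROM groups_joined"
    ).fetchall()
    # UNION ALL simply appends the results of each query.
    all_rows = conn.execute(
        "SELECT title FROM notes UNION ALL SELECT title FROM groups_joined"
    ).fetchall()
    conn.close()
    return distinct_rows, all_rows

if __name__ == "__main__":
    distinct_rows, all_rows = union_demo()
    print(len(distinct_rows), len(all_rows))
```

For a news feed, each branch would normally carry a source label or ID, so rows from different tables are distinct anyway and UNION ALL loses nothing.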

It depends on the data volume, of course.

The main driver of efficiency would be the individual queries that are unioned together, but there's no reason why selecting the most recent (say) 10 records from each of 10 tables should take more than a small fraction of a second.
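
One way to express "the most recent N records from each table" is sketched below with Python's built-in sqlite3 module. The table names, the columns (`title`, `created_at`), and the function names are assumptions for illustration only. Each branch is wrapped in a subquery so that it can have its own ORDER BY and LIMIT before the branches are combined with UNION ALL.

```python
import sqlite3

# Hypothetical source tables; a real application would list its own.
SOURCES = ["notes", "groups_joined", "friendships"]

def make_demo_db():
    # Builds a tiny in-memory database with three rows per table.
    conn = sqlite3.connect(":memory:")
    for offset, table in enumerate(SOURCES, start=1):
        conn.execute(
            f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, title TEXT, created_at INTEGER)"
        )
        # An index on the date column lets each branch find its latest rows cheaply.
        conn.execute(f"CREATE INDEX idx_{table}_date ON {table} (created_at)")
        conn.executemany(
            f"INSERT INTO {table} (title, created_at) VALUES (?, ?)",
            [(f"{table[0]}{ts}", ts) for ts in (offset, offset + 3, offset + 6)],
        )
    conn.commit()
    return conn

def build_feed_sql(tables, per_table):
    # Table names come from a fixed list here, never from user input.
    branches = [
        f"SELECT * FROM (SELECT '{t}' AS source, title, created_at FROM {t} "
        f"ORDER BY created_at DESC LIMIT {int(per_table)})"
        for t in tables
    ]
    # Combine the small per-table results, then pick the newest overall.
    return (
        "SELECT source, title, created_at FROM ("
        + " UNION ALL ".join(branches)
        + ") ORDER BY created_at DESC LIMIT ?"
    )

def latest_feed(conn, tables=SOURCES, per_table=10, limit=10):
    return conn.execute(build_feed_sql(tables, per_table), (limit,)).fetchall()

if __name__ == "__main__":
    for row in latest_feed(make_demo_db(), per_table=2, limit=3):
        print(row)
```

Because each branch returns at most `per_table` rows, the final sort works on a small set regardless of how large the underlying tables are.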