Google's Bigtable vs. a Relational Database

Question

I don't know much about Google's Bigtable but am wondering what the difference between Google's Bigtable and relational databases like MySQL is. What are the limitations of both?

Recommended answer

Bigtable is Google's invention for handling the massive amounts of information that the company regularly deals with. A Bigtable dataset can grow to immense size (many petabytes), with storage distributed across a large number of servers. Systems that use Bigtable include projects like Google's web index and Google Earth.

According to Google's whitepaper on the subject:

A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes.
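
To make the quoted definition concrete, here is a minimal in-memory sketch of a sorted map keyed by (row key, column key, timestamp) whose values are raw bytes. It is only a toy: the class name `ToySortedMap`, its methods, and the sample keys are made up for illustration and are not Bigtable's API, and nothing here is distributed or persistent.

```python
import bisect


class ToySortedMap:
    """Toy model of a (row, column, timestamp) -> bytes sorted map."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # key tuple -> bytes

    def put(self, row, column, timestamp, value):
        # Negate the timestamp so newer versions of a cell sort first.
        key = (row, column, -timestamp)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = bytes(value)

    def get_latest(self, row, column):
        # The first key at or after (row, column, -inf) is the newest version.
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan_rows(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1


table = ToySortedMap()
table.put("com.example/index", "contents:html", 1, b"<html>v1</html>")
table.put("com.example/index", "contents:html", 2, b"<html>v2</html>")
# Sparse: this row has a column the other row lacks, and vice versa.
table.put("com.example/about", "anchor:home", 1, b"About")
print(table.get_latest("com.example/index", "contents:html"))
```

Because the keys are kept in sorted order, a range of adjacent rows can be read with a single scan, and a cell that was never written simply has no entry, which is what "sparse" means here.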

The internal mechanics of Bigtable versus, say, MySQL are so dissimilar as to make comparison difficult, and the intended goals don't overlap much either. But you can think of Bigtable a bit like a single-table database. Imagine, for example, the difficulties you would run into if you tried to implement Google's entire web search system with a MySQL database -- Bigtable was built around solving those problems.

Bigtable datasets can be queried from services like App Engine using a language called GQL ("gee-kwal"), which is based on a subset of SQL. Conspicuously missing from GQL is any sort of JOIN command. Because of the distributed nature of a Bigtable database, performing a join between two tables would be terribly inefficient. Instead, programmers have to implement such logic in the application, or design the application so that it does not need joins.
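
As a rough illustration of implementing join logic in the application, the sketch below joins two separate collections by hand. The data, field names, and the function `posts_with_author_names` are hypothetical, and plain Python dicts and lists stand in for whatever datastore lookups a real application would make.

```python
# Hypothetical data: two separate "tables" in a store with no JOIN support.
authors = {
    "a1": {"name": "Ada"},
    "a2": {"name": "Grace"},
}
posts = [
    {"id": "p1", "author_id": "a1", "title": "On engines"},
    {"id": "p2", "author_id": "a2", "title": "On compilers"},
    {"id": "p3", "author_id": "a1", "title": "On notes"},
]


def posts_with_author_names(posts, authors):
    """Join posts to authors in application code instead of in the query."""
    # Collect the distinct author keys first so each author is looked up
    # once, rather than once per post.
    needed = {post["author_id"] for post in posts}
    fetched = {key: authors[key] for key in needed if key in authors}
    return [
        {
            "title": post["title"],
            # None when the referenced author does not exist.
            "author": fetched.get(post["author_id"], {}).get("name"),
        }
        for post in posts
    ]


for row in posts_with_author_names(posts, authors):
    print(row["title"], "by", row["author"])
```

The alternative mentioned above, designing the application so it does not need the join, would mean storing the author's name on each post when the post is written.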

Other answers

Google's Bigtable and other similar projects (e.g., CouchDB, HBase) are database systems designed around data that is mostly denormalized (i.e., duplicated and grouped).

The main advantages are:

- Join operations are less costly because of the denormalization: related data is already stored together, so fewer joins are needed.
- Replication/distribution of data is less costly because of data independence (i.e., if you distribute data across two nodes, you are unlikely to end up with an entity on one node and a related entity on another, because related data is grouped together).
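
A small sketch of what "duplicated and grouped" can look like in practice, under made-up data: customer fields are copied into every order record so that a read needs one lookup and no join. The names `denormalize` and `move_customer` are illustrative only. The second function shows the price of the duplication, which is that an update has to rewrite every copy.

```python
# Normalized layout: each customer is stored once; orders point to it by id.
customers = {"c1": {"name": "Ada", "city": "London"}}
orders = [
    {"order_id": "o1", "customer_id": "c1", "total": 30},
    {"order_id": "o2", "customer_id": "c1", "total": 12},
]


def denormalize(orders, customers):
    """Return order records keyed by order_id with customer fields copied in."""
    result = {}
    for order in orders:
        record = dict(order)  # copy so the normalized input is left untouched
        customer = customers[order["customer_id"]]
        record["customer_name"] = customer["name"]
        record["customer_city"] = customer["city"]
        result[order["order_id"]] = record
    return result


def move_customer(denormalized, customer_id, new_city):
    """Update every duplicated copy; return how many records were rewritten."""
    touched = 0
    for record in denormalized.values():
        if record["customer_id"] == customer_id:
            record["customer_city"] = new_city
            touched += 1
    return touched


wide_orders = denormalize(orders, customers)
print(wide_orders["o1"]["customer_name"])         # one lookup, no join
print(move_customer(wide_orders, "c1", "Paris"))  # number of copies rewritten
```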

This kind of system is suited to applications that need to scale out well (i.e., you add more nodes to the system and performance increases roughly proportionally). In an RDBMS like MySQL or Oracle, once you start adding nodes, joining two tables that do not live on the same node costs more. This becomes important when you are dealing with high volumes.
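
The effect of data placement on join cost can be sketched with a toy count of cross-node lookups. Everything here is an assumption for illustration: rows are assigned to nodes by a CRC32 of their key, which is not how Bigtable, MySQL, or Oracle actually place data, and `node_for` and `count_cross_node_lookups` are hypothetical helpers.

```python
import zlib


def node_for(key, num_nodes):
    """Pick a node from a stable checksum of the key (illustrative only)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def count_cross_node_lookups(orders, num_nodes, colocate):
    """Count orders whose customer record lives on a different node.

    orders is a list of (order_id, customer_id) pairs. With colocate=True an
    order is stored on its customer's node (related data grouped together);
    otherwise each table is partitioned independently by its own key.
    """
    remote = 0
    for order_id, customer_id in orders:
        customer_node = node_for(customer_id, num_nodes)
        if colocate:
            order_node = customer_node
        else:
            order_node = node_for(order_id, num_nodes)
        if order_node != customer_node:
            remote += 1
    return remote


sample = [("o%d" % i, "c%d" % (i % 5)) for i in range(100)]
for nodes in (1, 2, 4, 8):
    print(
        nodes,
        count_cross_node_lookups(sample, nodes, colocate=False),
        count_cross_node_lookups(sample, nodes, colocate=True),
    )
```

With grouped placement the count stays at zero however many nodes are added, whereas independent partitioning sends some joins across the network as soon as there is more than one node.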

RDBMSs are nice because of the richness of the storage model (tables, joins, foreign keys). Distributed databases are nice because of the ease of scaling.