What is a columnar database?



Question

I have been working with warehousing for a while now.

I am intrigued by columnar databases and the speed they offer for data retrieval.

I have a multi-part question:

  • How do Columnar Databases work?
  • How do they differ from relational databases?

Recommended answer

How do Columnar Databases work?
A columnar database is a concept rather than a particular architecture or implementation. In other words, there isn't one particular description of how these databases work; indeed, several are built upon a traditional, row-oriented DBMS, simply storing the data in tables with one (or, quite often, two) columns and adding the layer needed to access the columnar data easily.
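As a rough illustration of the "narrow tables on top of a row-oriented DBMS" idea, here is a minimal sketch using Python's built-in sqlite3 module. The table names, the key column and the sample data are hypothetical, and real products add far more machinery than this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One narrow (key, value) table per logical column.
# Table and column names are hypothetical, chosen only for this sketch.
conn.executescript("""
    CREATE TABLE supplier_status (sno TEXT PRIMARY KEY, value INTEGER);
    CREATE TABLE supplier_city   (sno TEXT PRIMARY KEY, value TEXT);
""")
conn.executemany("INSERT INTO supplier_status VALUES (?, ?)",
                 [("S1", 20), ("S2", 10), ("S3", 30)])
conn.executemany("INSERT INTO supplier_city VALUES (?, ?)",
                 [("S1", "London"), ("S2", "Paris"), ("S3", "Paris")])

# An aggregate over one logical column reads only that column's table.
avg_status = conn.execute(
    "SELECT AVG(value) FROM supplier_status").fetchone()[0]

# Reassembling a full record needs a join across the per-column tables.
record = conn.execute("""
    SELECT s.sno, s.value, c.value
    FROM supplier_status AS s
    JOIN supplier_city AS c ON c.sno = s.sno
    WHERE s.sno = ?
""", ("S2",)).fetchone()

print(avg_status, record)
```

The join in the last query is the "necessary layer" in miniature: the per-column tables make single-column scans cheap, and something has to stitch them back together when a whole record is requested.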

How do they differ from relational databases?
They generally differ from traditional (row-oriented) databases with regard to ...

  • performance...
  • storage requirements ...
  • ease of modification of the schema ...

... in specific use cases of DBMSes.
In particular, they offer advantages in the areas mentioned when the typical use is to compute aggregate values over a limited number of columns, as opposed to retrieving all or most columns for a given entity.
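One reason storage requirements can shrink is that values of a single column sit next to each other and often repeat, so simple schemes such as run-length encoding work well. The sketch below is only an illustration of that idea with made-up data; it does not describe the compression used by any particular product.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]

# A hypothetical CITY column whose equal values happen to be adjacent.
city_column = ["London", "London", "London", "Paris", "Paris", "Athens"]

encoded = rle_encode(city_column)
print(encoded)  # [('London', 3), ('Paris', 2), ('Athens', 1)]
```

In a row-oriented layout the city values would be interleaved with the other fields of each record, so runs like these would not be adjacent on disk.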

Is there a trial version of a columnar database I can install to play around with? (I am on Windows 7)
Yes, there are commercial, free and open-source implementations of columnar databases. See the list at the end of the Wikipedia article for starters.
Beware that several of these implementations were introduced to address a particular need (say, a very small footprint, a highly compressible distribution of data, or sparse matrix emulation) rather than to provide a general-purpose column-oriented DBMS per se.

Note: The remark about the "single purpose orientation" of several columnar DBMSes is not a critique of these implementations, but rather an additional indication that this approach strays from the more "natural" (and certainly more broadly used) approach to storing record entities. As a result, it is used when the row-oriented approach isn't satisfactory, and therefore tends to
a) be targeted at a particular purpose, and
b) receive fewer resources and less interest than work on the "general purpose", "tried and tested" tabular approach.

Tentatively, the Entity-Attribute-Value (EAV) data model may be an alternative storage strategy worth considering. Although distinct from the "pure" columnar DB model, EAV shares several of the characteristics of columnar DBs.
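To make the EAV idea concrete, here is a minimal in-memory sketch in Python. The entity ids, attribute names and helper functions are hypothetical; a real EAV store would normally be a three-column table in a DBMS.

```python
# Each fact is one (entity, attribute, value) triple; the data is made up.
eav = [
    ("S1", "STATUS", 20), ("S1", "CITY", "London"),
    ("S2", "STATUS", 10), ("S2", "CITY", "Paris"),
    ("S3", "STATUS", 30),  # no CITY recorded: a missing value stores nothing
]

def column(triples, attribute):
    """Collect every value of one attribute, much like scanning one column."""
    return {e: v for e, a, v in triples if a == attribute}

def entity(triples, entity_id):
    """Reassemble one record by gathering all of its triples."""
    return {a: v for e, a, v in triples if e == entity_id}

# Adding a new attribute needs no schema change, only new triples.
eav.append(("S3", "SNAME", "Blake"))

print(column(eav, "STATUS"))
print(entity(eav, "S3"))
```

As with column-oriented storage, reading one attribute across all entities is straightforward, while rebuilding a complete record means collecting its pieces from several places.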

Other recommended answer

How do columnar databases work?
The defining concept of a column-store is that the values of a table are stored contiguously by column. Thus the classic supplier table from C. J. Date's suppliers-and-parts database:

SNO  STATUS CITY    SNAME
---  ------ ----    -----
S1       20 London  Smith
S2       10 Paris   Jones
S3       30 Paris   Blake
S4       20 London  Clark
S5       30 Athens  Adams

would be stored on disk or in memory something like:

S1S2S3S4S5;2010302030;LondonParisParisLondonAthens;SmithJonesBlakeClarkAdams 

This is in contrast to a traditional rowstore which would store the data more like this:

S120LondonSmith;S210ParisJones;S330ParisBlake;S420LondonClark;S530AthensAdams

From this simple concept flow all of the fundamental differences in performance, for better or worse, between a column-store and a row-store. For example, a column-store will excel at aggregations such as totals and averages, but inserting a single row can be expensive, while the inverse holds true for row-stores. This should be apparent from the two layouts above.
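The trade-off can be sketched in a few lines of Python using the supplier table above. Plain lists stand in for on-disk storage here, and the function names are hypothetical; the point is only which values each operation has to touch.

```python
COLUMNS = ("SNO", "STATUS", "CITY", "SNAME")

# Row-oriented layout: one tuple per supplier.
rows = [
    ("S1", 20, "London", "Smith"),
    ("S2", 10, "Paris", "Jones"),
    ("S3", 30, "Paris", "Blake"),
    ("S4", 20, "London", "Clark"),
    ("S5", 30, "Athens", "Adams"),
]

def to_column_store(rows):
    """Pivot a list of row tuples into one contiguous list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}

def average_status_rowwise(rows):
    # Row layout: every full row is visited although only STATUS is needed.
    return sum(row[1] for row in rows) / len(rows)

def average_status_columnwise(store):
    # Column layout: only the contiguous STATUS values are touched.
    status = store["STATUS"]
    return sum(status) / len(status)

def insert_row(store, row):
    # Column layout: one append per column, so one write per column segment,
    # whereas the row layout needs a single append of the whole tuple.
    for name, value in zip(COLUMNS, row):
        store[name].append(value)

store = to_column_store(rows)
print(store["CITY"])
print(average_status_columnwise(store))  # 22.0
```

With four columns the difference is trivial, but with hundreds of columns and millions of rows the aggregate reads a small fraction of the data in the column layout, while a single-row insert has to touch every column.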

How do they differ from relational databases?
A relational database is a logical concept. A columnar database, or column-store, is a physical concept. Thus the two terms are not comparable in any meaningful way. Column-oriented DBMSs may be relational or not, just as row-oriented DBMSs may adhere more or less to relational principles.

Other recommended answer

I would say the best way to learn about column-oriented databases is to look at HBase (Apache HBase). You can check out the code and explore further to find out about the implementation.