How to store data with a dynamic number of attributes in a database

Article URL: https://www.itbaoku.cn/post/597440.html

Question

I have a number of different objects with a varying number of attributes. Until now I have saved the data in XML files, which easily allow for an ever-changing number of attributes. Now I am trying to move the data to a database.

What would be your preferred way to store this data?

A few strategies I have identified so far:

  • Having a single field named "attributes" in the object's table and storing the data there, serialized or as JSON.
  • Storing the data in two tables (objects, attributes) and using a third to save the relations, making it a true n:m relation. A very clean solution, but fetching an entire object with all its attributes is possibly very expensive.
  • Identifying the attributes all objects have in common and creating fields for these in the object's table, then storing the remaining attributes as serialized data in another field. This has an advantage over the first strategy in that it makes searches easier.

Any ideas?

Recommended answer

If you ever plan on searching for specific attributes, it's a bad idea to serialize them into a single column, since you'll have to use per-row functions to get the information out - this rarely scales well.
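
To illustrate the point about per-row functions, here is a minimal sketch using Python's built-in sqlite3 and json modules. The table layout, the sample data and the helper name find_by_attribute are assumptions made for this illustration, not part of the original answer. Because the attributes live in one serialized column, every row has to be read and decoded before it can be tested.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout for strategy 1: all attributes in one serialized column.
conn.execute(
    "CREATE TABLE objects (object_id INTEGER PRIMARY KEY, attributes TEXT)"
)
conn.executemany(
    "INSERT INTO objects VALUES (?, ?)",
    [
        (1, json.dumps({"color": "blue", "weight": "10"})),
        (2, json.dumps({"color": "red"})),
        (3, json.dumps({"size": "large", "color": "blue"})),
    ],
)


def find_by_attribute(conn, name, value):
    """Return the ids of objects whose serialized attributes hold name=value."""
    matches = []
    query = "SELECT object_id, attributes FROM objects ORDER BY object_id"
    # Every row is fetched and deserialized; an index on the column cannot
    # help, because the database only sees an opaque string.
    for object_id, blob in conn.execute(query):
        if json.loads(blob).get(name) == value:
            matches.append(object_id)
    return matches


blue_ids = find_by_attribute(conn, "color", "blue")
print(blue_ids)
```

The cost of this search grows with the total number of rows, no matter how few of them match.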

I would opt for your second choice. Have a list of attributes in an attributes table, the objects in their own table, and a many-to-many relationship table called object_attributes.

For example:

CREATE TABLE objects (
    object_id    INTEGER PRIMARY KEY,
    object_name  VARCHAR(20)
);
CREATE TABLE attributes (
    attr_id      INTEGER PRIMARY KEY,
    attr_name    VARCHAR(20)
);
CREATE TABLE object_attributes (
    object_id    INTEGER REFERENCES objects (object_id),
    attr_id      INTEGER REFERENCES attributes (attr_id),
    oa_value     VARCHAR(20),
    PRIMARY KEY (object_id, attr_id)
);
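
The schema above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the sample rows and the helper names fetch_object and find_objects are assumptions made for illustration. It shows that fetching a whole object takes one join, and that searching by a specific attribute is an ordinary WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (
    object_id   INTEGER PRIMARY KEY,
    object_name VARCHAR(20)
);
CREATE TABLE attributes (
    attr_id   INTEGER PRIMARY KEY,
    attr_name VARCHAR(20)
);
CREATE TABLE object_attributes (
    object_id INTEGER REFERENCES objects (object_id),
    attr_id   INTEGER REFERENCES attributes (attr_id),
    oa_value  VARCHAR(20),
    PRIMARY KEY (object_id, attr_id)
);
-- Sample data, invented for this sketch.
INSERT INTO objects VALUES (1, 'chair'), (2, 'table');
INSERT INTO attributes VALUES (1, 'color'), (2, 'legs');
INSERT INTO object_attributes VALUES (1, 1, 'red'), (1, 2, '4'), (2, 1, 'brown');
""")


def fetch_object(conn, object_id):
    """Return all attributes of one object as a name -> value dict."""
    rows = conn.execute(
        """
        SELECT a.attr_name, oa.oa_value
        FROM object_attributes AS oa
        JOIN attributes AS a ON a.attr_id = oa.attr_id
        WHERE oa.object_id = ?
        """,
        (object_id,),
    ).fetchall()
    return dict(rows)


def find_objects(conn, attr_name, value):
    """Return the names of objects that have attr_name set to value."""
    rows = conn.execute(
        """
        SELECT o.object_name
        FROM objects AS o
        JOIN object_attributes AS oa ON oa.object_id = o.object_id
        JOIN attributes AS a ON a.attr_id = oa.attr_id
        WHERE a.attr_name = ? AND oa.oa_value = ?
        ORDER BY o.object_name
        """,
        (attr_name, value),
    ).fetchall()
    return [name for (name,) in rows]


chair = fetch_object(conn, 1)
brown_things = find_objects(conn, "color", "brown")
```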

Your concern about performance is noted but, in my experience, it's always more costly to split a column than to combine multiple columns. If it turns out that there are performance problems, it's perfectly acceptable to break 3NF for performance reasons.

In that case I would store it the same way but also have a column with the raw serialized data. Provided you use insert/update triggers to keep the columnar and combined data in sync, you won't have any problems. But you shouldn't worry about that until an actual problem surfaces.

By using those triggers, you limit the extra work to the moments when the data actually changes. By extracting sub-column information at query time instead, you do unnecessary work on every select.
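
One way to picture the trigger approach is the following SQLite sketch, run through Python's sqlite3 module. The raw_attrs column, the trigger names and the "name=value;..." format are assumptions chosen for this example; the original answer does not prescribe a serialization format. A production version would probably also need a delete trigger and would have to handle updates that move a row to another object_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (
    object_id   INTEGER PRIMARY KEY,
    object_name VARCHAR(20),
    raw_attrs   TEXT  -- hypothetical denormalized copy, kept up to date by triggers
);
CREATE TABLE attributes (
    attr_id   INTEGER PRIMARY KEY,
    attr_name VARCHAR(20)
);
CREATE TABLE object_attributes (
    object_id INTEGER REFERENCES objects (object_id),
    attr_id   INTEGER REFERENCES attributes (attr_id),
    oa_value  VARCHAR(20),
    PRIMARY KEY (object_id, attr_id)
);

-- Rebuild the serialized copy whenever an attribute row is inserted.
CREATE TRIGGER oa_after_insert AFTER INSERT ON object_attributes
BEGIN
    UPDATE objects
    SET raw_attrs = (
        SELECT group_concat(a.attr_name || '=' || oa.oa_value, ';')
        FROM object_attributes AS oa
        JOIN attributes AS a ON a.attr_id = oa.attr_id
        WHERE oa.object_id = NEW.object_id
    )
    WHERE object_id = NEW.object_id;
END;

-- Same rebuild when an existing attribute value is updated.
CREATE TRIGGER oa_after_update AFTER UPDATE ON object_attributes
BEGIN
    UPDATE objects
    SET raw_attrs = (
        SELECT group_concat(a.attr_name || '=' || oa.oa_value, ';')
        FROM object_attributes AS oa
        JOIN attributes AS a ON a.attr_id = oa.attr_id
        WHERE oa.object_id = NEW.object_id
    )
    WHERE object_id = NEW.object_id;
END;
""")

conn.execute("INSERT INTO objects (object_id, object_name) VALUES (1, 'chair')")
conn.executemany(
    "INSERT INTO attributes VALUES (?, ?)", [(1, "color"), (2, "legs")]
)
conn.execute("INSERT INTO object_attributes VALUES (1, 1, 'red')")
conn.execute("INSERT INTO object_attributes VALUES (1, 2, '4')")
conn.execute(
    "UPDATE object_attributes SET oa_value = 'blue' "
    "WHERE object_id = 1 AND attr_id = 1"
)


def raw_attrs(conn, object_id):
    """Read the serialized copy without touching the attribute tables."""
    (raw,) = conn.execute(
        "SELECT raw_attrs FROM objects WHERE object_id = ?", (object_id,)
    ).fetchone()
    return raw


serialized = raw_attrs(conn, 1)
```

Reading the whole object is now a single-row lookup, while the columnar tables remain available for searches. The order of the pairs inside raw_attrs is not guaranteed by group_concat, so consumers should not rely on it.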

Other recommended answer

A variation on your 2nd solution uses just two tables (assuming all attributes are of a single type):

T1: |Object data columns|Object_id|

T2: |Object_id|attribute_name|attribute_value| (unique index on first 2 columns)

This is even more efficient when combined with your 3rd solution, i.e. all of the common fields go into T1.
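
A minimal sketch of this two-table layout, again with Python's sqlite3 module. The column called name stands in for the "common fields", and the sample rows and the helper names_with are assumptions made for illustration. The unique constraint on (object_id, attribute_name) plays the role of the unique index on the first two columns of T2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (
    object_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL  -- a common field shared by every object
);
CREATE TABLE t2 (
    object_id       INTEGER NOT NULL REFERENCES t1 (object_id),
    attribute_name  TEXT NOT NULL,
    attribute_value TEXT,
    UNIQUE (object_id, attribute_name)
);
""")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, "chair"), (2, "lamp")])
conn.executemany(
    "INSERT INTO t2 VALUES (?, ?, ?)",
    [(1, "color", "red"), (2, "color", "red"), (2, "watts", "40")],
)


def names_with(conn, attribute_name, attribute_value):
    """Filter objects by one dynamic attribute and return their names."""
    rows = conn.execute(
        """
        SELECT t1.name
        FROM t1
        JOIN t2 ON t2.object_id = t1.object_id
        WHERE t2.attribute_name = ? AND t2.attribute_value = ?
        ORDER BY t1.name
        """,
        (attribute_name, attribute_value),
    ).fetchall()
    return [name for (name,) in rows]


red_names = names_with(conn, "color", "red")

# The unique constraint rejects a second 'color' row for the same object.
try:
    conn.execute("INSERT INTO t2 VALUES (1, 'color', 'green')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```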

Stuffing more than one attribute into the same blob is not recommended: you cannot filter by attributes, and you cannot update them efficiently.

Other recommended answer

Let me make what DVK was saying a bit more concrete.

Assuming all values are of the same type, the table would look like this (good luck, I feel you're going to need it):

dynamic_attribute_table
------------------------
id         NUMBER
key        VARCHAR
value      SOMETYPE?

Example (cars):

|id|    key   |   value   |
---------------------------
| 1|'Make'    |'Ford'     |
| 1|'Model'   |'Edge'     |
| 1|'Color'   |'Blue'     |
| 2|'Make'    |'Chevrolet'|
| 2|'Model'   |'Malibu'   |
| 2|'MaxSpeed'|'110mph'   |

Thus,
entity 1 = { ('Make', 'Ford'), ('Model', 'Edge'), ('Color', 'Blue') }
and,
entity 2 = { ('Make', 'Chevrolet'), ('Model', 'Malibu'), ('MaxSpeed', '110mph') }.
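
As a sketch of how such rows turn back into entities, the following Python groups (id, key, value) rows by id. The rows are the car example from the table above. The function name build_entities is an assumption for this illustration, and a dict is used per entity instead of a set of pairs so that a value can be looked up by key.

```python
from collections import defaultdict

# Rows as they would come back from dynamic_attribute_table.
rows = [
    (1, "Make", "Ford"),
    (1, "Model", "Edge"),
    (1, "Color", "Blue"),
    (2, "Make", "Chevrolet"),
    (2, "Model", "Malibu"),
    (2, "MaxSpeed", "110mph"),
]


def build_entities(rows):
    """Group key/value rows by entity id into one dict per entity."""
    entities = defaultdict(dict)
    for entity_id, key, value in rows:
        entities[entity_id][key] = value
    return dict(entities)


entities = build_entities(rows)
print(entities[1])
print(entities[2])
```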