What column type should be used to store serialized data in a MySQL DB?

Question

What column type should be used to store serialized data in a MySQL database? I know you can use VARBINARY, BLOB, or TEXT. What's considered the best, and why?

Edit: I understand it is not "good" to store serialized data. I need to do it in this one case though. Please just trust me on this and focus on the question if you have an answer. Thanks!

Recommended answer

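To answer: TEXT seems to be deprecated in some DBMSs (SQL Server, for example, deprecates its text type in favor of varchar(max)), although MySQL itself still supports it. So it's better to use either a BLOB or a VARCHAR with a high limit. With BLOB you also won't get any encoding issues, which are a major hassle with VARCHAR and TEXT.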

Also, as pointed out in this thread at the MySQL forums, hard drives are cheaper than software, so you'd better design your software and make it work first, and only optimize storage if space actually becomes an issue. So don't try to over-optimize the size of your column too early on; better to set the size larger at first (plus this will avoid security issues).

About the various comments: Too much SQL fanaticism here. Despite the fact that I am greatly fond of SQL and relational models, they also have their pitfalls.

Storing serialized data into the database as-is (such as storing JSON or XML formatted data) has a few advantages:

  • You can have a more flexible format for your data: adding and removing fields on the fly, changing the specification of the fields on the fly, etc...
  • Less impedance mismatch with the object model: you store and you fetch the data just as it is in your program, compared to fetching the data and then having to process and convert it between your program objects' structures and your relational database's structures.

There are many other advantages too, so please, no fanboyism: relational databases are a great tool, but let's not dismiss the other tools we can get. The more tools, the better.

As for a concrete example of use, I tend to add a JSON field in my database to store extra parameters of a record where the columns (properties) of the JSON data will never be SELECT'd individually, but only used when the right record is already selected. In this case, I can still discriminate my records with the relational columns, and when the right record is selected, I can just use the extra parameters for whatever purpose I want.

So my advice, to retain the best of both worlds (speed, serializability and structural flexibility), is to use a few standard relational columns as unique keys to discriminate between your rows, and then a BLOB/VARCHAR column where your serialized data will be inserted. Usually only two or three columns are required for a unique key, so this won't be a major overhead.
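As a rough illustration of that layout, here is a minimal sketch in Python. Everything named in it is an assumption made up for the example: the table `device_settings`, its columns, and the helpers `save_extra` and `load_extra`. The standard-library `sqlite3` module is used only as a stand-in so the sketch runs without a server; in MySQL the payload column would be declared BLOB or MEDIUMBLOB, and the upsert would be written as `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE` instead of SQLite's `INSERT OR REPLACE`.

```python
import json
import sqlite3

# sqlite3 is only a stand-in so this runs without a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE device_settings (      -- hypothetical table name
        account_id INTEGER NOT NULL,    -- relational key column
        device_id  INTEGER NOT NULL,    -- relational key column
        extra      BLOB    NOT NULL,    -- serialized payload, opaque to SQL
        PRIMARY KEY (account_id, device_id)
    )
    """
)


def save_extra(account_id, device_id, params):
    """Serialize params to UTF-8 JSON bytes and store them under the key."""
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO device_settings VALUES (?, ?, ?)",
        (account_id, device_id, payload),
    )


def load_extra(account_id, device_id):
    """Select the row by its relational key, then deserialize the payload."""
    row = conn.execute(
        "SELECT extra FROM device_settings "
        "WHERE account_id = ? AND device_id = ?",
        (account_id, device_id),
    ).fetchone()
    if row is None:
        return None
    return json.loads(bytes(row[0]).decode("utf-8"))


save_extra(1, 42, {"theme": "dark", "retries": 3})
restored = load_extra(1, 42)
```

The row is always found through the relational key columns; the payload is only decoded after the right row has been selected, which matches the usage described above.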

Also, you may be interested in PostgreSQL, which now has a JSON datatype, and in the PostSQL project, which processes JSON fields directly, just like relational columns.

Other recommended answer

How much do you plan to store? Check out the specs for the string types at the MySQL docs and their sizes. The key here is that you don't care about indexing this column, but you also never want it to overflow and get truncated, since then your JSON is unreadable.

  • TINYTEXT L < 2^8
  • TEXT L < 2^16
  • MEDIUMTEXT L < 2^24
  • LONGTEXT L < 2^32

Where L is the length in bytes, not characters, so multibyte characters reduce how many characters fit.

Plain TEXT should be enough, but go bigger if you are storing more. In that case, though, you might not want to be storing it in the DB at all.
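To make the size check concrete, here is a small sketch that picks the smallest type from the list above for a given payload. The helper name `smallest_text_type` is made up for this example; the limits are the ones listed above, measured in bytes of the encoded payload. The BLOB family (TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB) has the same byte limits.

```python
# Maximum lengths in bytes, from the list above (L < 2^n).
TEXT_LIMITS = [
    ("TINYTEXT", 2**8 - 1),
    ("TEXT", 2**16 - 1),
    ("MEDIUMTEXT", 2**24 - 1),
    ("LONGTEXT", 2**32 - 1),
]


def smallest_text_type(max_payload_bytes):
    """Return the smallest type whose limit covers max_payload_bytes."""
    for name, limit in TEXT_LIMITS:
        if max_payload_bytes <= limit:
            return name
    raise ValueError("payload too large for any TEXT type")


# Measure the encoded size, not the character count:
# 100 accented characters take 200 bytes in UTF-8.
sample = "é" * 100
encoded_size = len(sample.encode("utf-8"))
chosen = smallest_text_type(encoded_size)
```

Sizing against the largest payload you expect, with some headroom, is what keeps the serialized data from being truncated.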

Other recommended answer

The length limits that @Twisted Pear mentions are good reasons.

Also consider that TEXT and its ilk have a charset associated with them, whereas BLOB data types do not. If you're just storing raw bytes of data, you might as well use BLOB instead of TEXT.

Note that you can still store textual data in a BLOB, you just can't do any SQL operations on it that take charset into account; it's just bytes to SQL. But that's probably not an issue in your case, since it's serialized data with structure unknown to SQL anyway. All you need to do is store bytes and fetch bytes. The interpretation of the bytes is up to your app.
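A small sketch of why the charset matters for serialized data, using only Python's standard library and no database. It is only an analogy for what a charset-aware column or connection may do to your bytes, not a reproduction of MySQL's behavior; the `record` value and the `is_valid_utf8` helper are invented for the example.

```python
import pickle

record = {"id": 7, "tags": ["a", "b"]}

# A binary serialization format: the output starts with byte 0x80.
payload = pickle.dumps(record, protocol=2)

# BLOB-style handling: the same bytes come back, so the data round-trips.
stored = bytes(payload)
round_tripped = pickle.loads(stored)


def is_valid_utf8(data):
    """Return True if data can be read as UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# TEXT-style handling: the bytes are interpreted under a charset.
# Reading them as latin-1 and re-encoding them as UTF-8 changes the bytes.
transcoded = payload.decode("latin-1").encode("utf-8")
```

Here `payload` is not valid UTF-8 at all, and `transcoded` no longer equals `payload`, which is the kind of silent damage a binary column avoids. If your serialized format is plain text in a known encoding (JSON in UTF-8, say), a TEXT column with a matching charset does not have this problem.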

I have also had trouble with LONGBLOB or LONGTEXT in certain client libraries (e.g. PHP), because the client tries to allocate a buffer as large as the largest possible value for the data type, not knowing how large the content will be on any given row until it's fetched. This caused PHP to burst into flames as it tried to allocate a 4GB buffer. I don't know what client you're using, or whether it suffers from the same behavior.

The workaround: use MEDIUMBLOB or just BLOB, as long as those types are sufficient to store your serialized data.


On the issue of people telling you not to do this, I'm not going to tell you that (in spite of the fact that I'm an SQL advocate). It's true you can't use SQL expressions to perform operations on individual elements within the serialized data, but that's not your purpose. What you do gain by putting that data into the database includes:

  • Associate serialized data with other more relational data.
  • Ability to store and fetch serialized data according to transaction scope, COMMIT, ROLLBACK.
  • Store all your relational and non-relational data in one place, to make it easier to replicate to slaves, back up and restore, etc.
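The transaction point can be sketched as follows. As before, `sqlite3` is only a stand-in so the example runs without a MySQL server, and the tables `orders` and `order_payload` and the helper `place_order` are made-up names. The idea is the same in MySQL with a transactional engine: the relational row and the serialized payload are committed or rolled back together.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE order_payload (order_id INTEGER PRIMARY KEY, body BLOB NOT NULL)"
)


def place_order(order_id, payload, fail=False):
    """Insert the relational row and its serialized payload atomically."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders VALUES (?, ?)", (order_id, "new")
            )
            conn.execute(
                "INSERT INTO order_payload VALUES (?, ?)",
                (order_id, json.dumps(payload).encode("utf-8")),
            )
            if fail:
                raise RuntimeError("simulated failure after both inserts")
    except RuntimeError:
        return False
    return True


def count(table):
    """Count rows in one of the two tables above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


committed = place_order(1, {"items": [1, 2]})
rolled_back = place_order(2, {"items": [3]}, fail=True)
```

After the simulated failure, neither table contains order 2, so the serialized payload never gets out of step with the relational data it belongs to.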