Homemade vs. Java Serialization

















I have a POJO that needs to be persisted in a database. The current design gives it a single string column, and adding additional columns to the table is not an option.

This means the object needs to be serialized in some way. For a basic implementation I designed my own serialized form: all of its fields concatenated into one string, separated by a delimiter I chose. But this is rather ugly and can cause problems, for example if one of the fields contains my delimiter.
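To make the delimiter problem concrete, here is a minimal sketch of a homemade format that escapes the delimiter instead of forbidding it. The class name `DelimitedCodec`, the `|` delimiter, and the backslash escape are all assumptions chosen for illustration, not part of the original design.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: joins fields with '|' and prefixes any '|' or '\'
// found inside a field with a backslash, so decoding stays unambiguous.
// Assumes non-null fields and at least one field per record.
final class DelimitedCodec {
    private static final char DELIM = '|';
    private static final char ESC = '\\';

    static String encode(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(DELIM);
            }
            for (char c : fields.get(i).toCharArray()) {
                if (c == DELIM || c == ESC) {
                    sb.append(ESC); // mark the next character as literal
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    static List<String> decode(String s) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean escaped = false;
        for (char c : s.toCharArray()) {
            if (escaped) {
                current.append(c); // literal character after an escape
                escaped = false;
            } else if (c == ESC) {
                escaped = true;
            } else if (c == DELIM) {
                fields.add(current.toString()); // unescaped delimiter ends a field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

With this scheme a field such as `a|b` is stored as `a\|b` and reads back unchanged, at the cost of one pass over each field in both directions.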

So I tried basic Java serialization, but from a basic test I conducted, this somehow becomes a very costly operation (building a ByteArrayOutputStream, an ObjectOutputStream, and so on, same for the deserialization).
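For reference, the round trip described above looks roughly like the following sketch. `SamplePojo` is a hypothetical stand-in for the real class, and the Base64 step is an assumption added here because the target column is a string while Java serialization produces bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical POJO used only for illustration.
final class SamplePojo implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int age;

    SamplePojo(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

final class JavaSerializationSketch {
    // Serializes the object to bytes, then Base64-encodes them for a string column.
    static String toColumn(Serializable obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverses toColumn: Base64-decodes the column value and reads the object back.
    static Object fromColumn(String column) {
        byte[] raw = Base64.getDecoder().decode(column);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each call allocates two stream objects and writes class metadata alongside the field values, which is where much of the per-object cost comes from.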

So what are my options? What is the preferred way for serializing objects to go on a database?

Edit: this is going to be a very common operation in my project, so overhead must be kept to a minimum, and performance is crucial. Also, third-party solutions are nice but not relevant here (they usually add the kind of overhead I am trying to avoid).


Elliotte Rusty Harold wrote up a nice argument against using Java Object serialization for the objects in his XOM library. The same principles apply to you. The built-in Java serialization is Java-specific, fragile, and slow, and so is best avoided.

You have roughly the right idea in using a String-based format. The problem, as you state, is that you're running into formatting/syntax problems with delimiters. The solution is to use a format that is already built to handle this. If this is a standardized format, then you can also potentially use other libraries/languages to manipulate it. Also, a string-based format means that you have a hope of understanding it just by eyeballing the data; binary formats remove that option.

XML and JSON are two great options here; they're standardized, text-based, flexible, readable, and have lots of library support. They can also perform surprisingly well, in some cases even faster than Java serialization.


You might try Protocol Buffers, an open-source project from Google. It is said to be fast (it generates a shorter serialized form than XML and works faster). It also handles the addition of new fields gracefully: fields missing from older data are read back with default values.


You need to consider versioning in your solution. Data incompatibility is a problem you will experience with any solution that involves the use of a binary serialization of the Object. How do you load an older row of data into a newer version of the object?

So, the solutions above that involve serializing to name/value pairs are probably the approach you want to use.

One solution is to include a version number as one of the field values. As fields are added, modified, or removed, the version number can be incremented.

When deserializing the data, you can have different deserialization handlers for each version which can be used to convert data from one version to another.
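A minimal sketch of this idea follows, assuming a hypothetical `Customer` POJO whose version 1 format stored a single `name` field and whose version 2 format splits it into `firstName` and `lastName`. It uses `java.util.Properties` as the name/value text format only because it ships with the JDK and escapes special characters; any name/value format would do.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical POJO used only for illustration.
final class Customer {
    final String firstName;
    final String lastName;

    Customer(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

final class VersionedCustomerCodec {
    static final int CURRENT_VERSION = 2;

    // Always writes the current format, tagged with its version number.
    static String serialize(Customer c) {
        Properties p = new Properties();
        p.setProperty("version", Integer.toString(CURRENT_VERSION));
        p.setProperty("firstName", c.firstName);
        p.setProperty("lastName", c.lastName);
        StringWriter w = new StringWriter();
        try {
            p.store(w, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w.toString();
    }

    // Reads the version field first, then dispatches to the matching handler.
    static Customer deserialize(String column) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(column));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Assumption: rows written before versioning existed count as version 1.
        int version = Integer.parseInt(p.getProperty("version", "1"));
        switch (version) {
            case 1:
                return fromV1(p);
            case 2:
                return fromV2(p);
            default:
                throw new IllegalArgumentException("Unknown version: " + version);
        }
    }

    // Version 1 stored one "name" field; split it at the first space.
    private static Customer fromV1(Properties p) {
        String[] parts = p.getProperty("name", "").split(" ", 2);
        return new Customer(parts[0], parts.length > 1 ? parts[1] : "");
    }

    private static Customer fromV2(Properties p) {
        return new Customer(p.getProperty("firstName", ""), p.getProperty("lastName", ""));
    }
}
```

Old rows keep loading through `fromV1` without a data migration, and a later format change only needs a new `case` and a new handler.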