Problem Description
I've written some really nice, funky libraries for use in LinqToSql. (Some day when I have time to think about it I might make it open source... :) )
Anyway, I'm not sure if this is related to my libraries or not, but I've discovered that when I have a large number of changed objects in one transaction, and then call DataContext.GetChangeSet(), things start getting reaalllly slooowwwww. When I break into the code, I find that my program is spinning its wheels doing an awful lot of Equals() comparisons between the objects in the change set. I can't guarantee this is true, but I suspect that if there are n objects in the change set, then the call to GetChangeSet() is causing every object to be compared to every other object for equivalence, i.e. at best (n^2-n)/2 calls to Equals()...
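If it helps to check that suspicion, here is a small diagnostic sketch (not from the original post) that counts Equals() calls while GetChangeSet() runs. BatchItem is assumed to be a designer-generated LINQ to SQL entity, and MyDataContext and the Measure helper are made-up names; the only point is to see whether the counter grows roughly like (n^2-n)/2 as n increases.

using System;
using System.Data.Linq;
using System.Diagnostics;
using System.Threading;

// Extra partial-class part for the generated entity: count every Equals() call.
public partial class BatchItem
{
    public static long EqualsCalls;   // shared counter, reset before each measurement

    public override bool Equals(object obj)
    {
        Interlocked.Increment(ref EqualsCalls);
        return base.Equals(obj);      // keep plain reference equality so tracking still behaves
    }

    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}

static class ChangeSetProbe
{
    // Insert n items, then time GetChangeSet() and report how many Equals() calls it made.
    public static void Measure(int n)
    {
        using (var context = new MyDataContext())   // assumed DataContext name
        {
            for (int i = 0; i < n; i++)
                context.GetTable<BatchItem>().InsertOnSubmit(new BatchItem { /* set required columns */ });

            BatchItem.EqualsCalls = 0;
            var sw = Stopwatch.StartNew();
            context.GetChangeSet();
            sw.Stop();

            Console.WriteLine("n={0}: {1} Equals() calls in {2} ms",
                n, BatchItem.EqualsCalls, sw.ElapsedMilliseconds);
        }
    }
}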
Yes, of course I could commit each object separately, but that kinda defeats the purpose of transactions. And in the program I'm writing, I could have a batch job containing 100,000 separate items, that all need to be committed together. Around 5 billion comparisons there.
So the question is: (1) is my assessment of the situation correct? Do you get this behavior in pure, textbook LinqToSql, or is this something my libraries are doing? And (2) is there a standard/reasonable workaround so that I can create my batch without making the program geometrically slower with every extra object in the change set?
Recommended Answer
In the end I decided to rewrite the batches so that each individual item is saved independently, all within one big transaction. In other words, instead of:
var b = new Batch { ... };
while (addNewItems)
{
    ...
    var i = new BatchItem { ... };
    b.BatchItems.Add(i);
}
b.Insert();  // that's a function in my library that calls SubmitChanges()
...you have to do something like this:
context.BeginTransaction();  // another one of my library functions
try
{
    var b = new Batch { ... };
    b.Insert();  // save the batch record immediately
    while (addNewItems)
    {
        ...
        var i = new BatchItem { ... };
        b.BatchItems.Add(i);
        i.Insert();  // send the SQL on each iteration
    }
    context.CommitTransaction();  // and only commit the transaction when everything is done
}
catch
{
    context.RollbackTransaction();
    throw;
}
You can see why the first code block is just cleaner and more natural to use, and it's a pity I got forced into using the second structure...
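For reference, BeginTransaction/CommitTransaction/RollbackTransaction above are helpers from the author's own library, not stock LINQ to SQL. The following is a hedged sketch of how the same pattern can be expressed with the plain DataContext API, by opening the connection and assigning DataContext.Transaction so that each per-item SubmitChanges() runs inside one shared transaction; BatchItem is the entity from the post and the class/method names here are made up.

using System;
using System.Collections.Generic;
using System.Data.Common;
using System.Data.Linq;

static class BatchSaver
{
    public static void SaveItemsInOneTransaction(DataContext context, IEnumerable<BatchItem> items)
    {
        context.Connection.Open();
        DbTransaction tx = context.Connection.BeginTransaction();
        context.Transaction = tx;   // SubmitChanges() will enlist in this transaction
        try
        {
            foreach (BatchItem item in items)
            {
                context.GetTable<BatchItem>().InsertOnSubmit(item);
                context.SubmitChanges();   // flush immediately, so the change set stays tiny
            }
            tx.Commit();                   // all the inserts become visible together
        }
        catch
        {
            tx.Rollback();
            throw;
        }
        finally
        {
            context.Transaction = null;
            tx.Dispose();
            context.Connection.Close();
        }
    }
}

The design trade-off is the same one the answer describes: the change set never holds more than one pending item, so GetChangeSet() stays cheap, while the outer DbTransaction still gives you all-or-nothing semantics for the whole batch.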