Problem Description
I have a list that keeps growing, and I call addBatch based on the list size. I forgot to put a limit on when executeBatch is called for a given batch size.
The program has been running for hours. I don't want to stop it, fix it, and restart it right now.
My questions: what determines the size of the batch being built up? What is the maximum number of statements executeBatch() can run in one call? How many times can I call addBatch() without calling executeBatch()?
Recommended Answer
PgJDBC has some limitations regarding batches:
- All request values, and all results, must be accumulated in memory. This includes large blob/clob results. So free memory is the main limiting factor for batch size (see the chunking sketch after this list).
- Until PgJDBC 9.4 (not yet released), batches that return generated keys always do a round trip for every entry, so they're no better than individual statement executions.
- Even in 9.4, batches that return generated keys only offer a benefit if the generated values are size-limited. A single text, bytea, or unconstrained varchar field in the requested result will force the driver to do a round trip for every execution.
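Since free memory is the practical ceiling, the usual workaround is to flush the batch every N rows instead of accumulating everything. A minimal sketch, assuming a hypothetical users table and an arbitrarily chosen batch size of 1000:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class ChunkedBatchInsert {
    private static final int BATCH_SIZE = 1000; // arbitrary cap; tune against available heap

    public static void insertAll(Connection conn, List<String> names) throws SQLException {
        // Hypothetical table: CREATE TABLE users (name text)
        try (PreparedStatement ps = conn.prepareStatement("INSERT INTO users (name) VALUES (?)")) {
            int pending = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
                // Flush periodically so queued statements don't pile up in memory
                if (++pending % BATCH_SIZE == 0) {
                    ps.executeBatch();
                }
            }
            ps.executeBatch(); // flush the remainder (a no-op if the batch is empty)
        }
    }
}
```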
The benefit of batching is a reduction in network round trips. So there's much less point if your DB is local to your app server. There's a diminishing return with increasing batch size, because the total time spent in network waits falls off quickly, so it's often not worth stressing about trying to make batches as big as possible.
If you're bulk-loading data, seriously consider using the COPY API instead, via PgJDBC's CopyManager, obtained via the PgConnection interface. It lets you stream CSV-like data to the server for rapid bulk-loading with very few client/server round trips. Unfortunately, it's remarkably under-documented - it doesn't appear in the main PgJDBC docs at all, only in the API docs.
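A minimal sketch of the COPY approach; the table name and the inline CSV payload are made up for illustration, and error handling is omitted:

```java
import java.io.StringReader;
import java.sql.Connection;
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

public class CopyExample {
    public static void bulkLoad(Connection conn) throws Exception {
        // Unwrap the PgJDBC-specific connection to reach the COPY API
        CopyManager copier = conn.unwrap(PGConnection.class).getCopyAPI();

        // Hypothetical table: CREATE TABLE users (id int, name text)
        String csv = "1,alice\n2,bob\n";

        // Streams the CSV rows to the server in a single COPY operation
        long rowsLoaded = copier.copyIn(
                "COPY users (id, name) FROM STDIN WITH (FORMAT csv)",
                new StringReader(csv));
        System.out.println("Loaded " + rowsLoaded + " rows");
    }
}
```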
Other Recommended Answer
AFAIK there is no limit besides the memory issue. Regarding your question: the statements are sent to the DB only on executeBatch, so until you execute the batch, memory will keep growing until you either get a Java heap space error or the batch is sent to the DB.
Other Recommended Answer
There may be a maximum number of parameter markers, depending on the JDBC implementation.
For instance, the PostgreSQL driver represents the number of parameters as a 2-byte signed integer, which in Java has a maximum value of 32767.
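To make the arithmetic concrete: this limit applies to the placeholders in a single prepared statement, so it matters if you construct multi-row INSERT statements by hand. A small sizing sketch under that assumption:

```java
public class ParameterBudget {
    // PgJDBC sends the bind-parameter count as a 2-byte signed integer
    private static final int MAX_PARAMS = 32767;

    // Maximum rows a single hand-built multi-row INSERT can carry
    // for a given column count, e.g. 5 columns -> 32767 / 5 = 6553 rows.
    public static int maxRowsPerStatement(int columnsPerRow) {
        return MAX_PARAMS / columnsPerRow;
    }

    public static void main(String[] args) {
        System.out.println(maxRowsPerStatement(5)); // prints 6553
    }
}
```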