Connection Pool Strategy: Good, Bad or Ugly?

Article URL: https://www.itbaoku.cn/post/597532.html

Question

I'm in charge of developing and maintaining a group of web applications that are centered around similar data. The architecture I decided on at the time was that each application would have its own database and web-root application. Each application maintains a connection pool to its own database and to a central database for shared data (logins, etc.).

A co-worker has been arguing that this strategy will not scale because there are too many separate connection pools. He wants us to refactor the database so that all of the applications use a single central database, with any modifications unique to one system reflected in that one database, and then use a single pool managed by Tomcat. He also claims that a lot of "metadata" goes back and forth across the network to maintain a connection pool.

My understanding is that, with proper tuning to use only as many connections as necessary across the different pools (low-volume apps getting fewer connections, high-volume apps getting more, etc.), the number of pools doesn't matter compared to the number of connections. More formally, the difference in overhead between maintaining 3 pools of 10 connections and 1 pool of 30 connections is negligible.
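As a rough illustration of that kind of tuning, the sketch below splits a fixed connection budget across pools in proportion to each application's relative load. The function name `size_pools`, the application names, and the load figures are all hypothetical; nothing here comes from a real pool library.

```python
def size_pools(total_connections, relative_load, minimum=1):
    """Split a fixed connection budget across pools in proportion to load."""
    if total_connections < minimum * len(relative_load):
        raise ValueError("budget too small to give every pool its minimum")
    total_load = sum(relative_load.values())
    if total_load <= 0:
        raise ValueError("relative loads must add up to a positive number")

    # Give every pool its floor first, then share out what is left.
    sizes = {app: minimum for app in relative_load}
    spare = total_connections - minimum * len(relative_load)

    # Exact fractional share of the spare connections for each application.
    shares = {app: spare * load / total_load for app, load in relative_load.items()}
    for app, share in shares.items():
        sizes[app] += int(share)

    # Hand out the connections lost to rounding, largest remainder first.
    leftover = total_connections - sum(sizes.values())
    by_remainder = sorted(shares, key=lambda app: shares[app] - int(shares[app]),
                          reverse=True)
    for app in by_remainder[:leftover]:
        sizes[app] += 1
    return sizes


# Hypothetical apps and load weights: 30 connections in total, however many pools.
pools = size_pools(30, {"reports": 1, "orders": 6, "logins": 3})
print(pools)
```

Whatever the split, the total stays at the budget, which is the number the database server actually sees.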

The reasoning behind initially breaking the systems into a one-app-one-database design was that there are likely going to be differences between the apps and that each system could make modifications on the schema as needed. Similarly, it eliminated the possibility of system data bleeding through to other apps.

Unfortunately there is not strong leadership in the company to make a hard decision. Although my co-worker is backing up his worries only with vagueness, I want to make sure I understand the ramifications of multiple small databases/connections versus one large database/connection pool.

Recommended answer

Your original design is based on sound principles. If it helps your case, this strategy is known as horizontal partitioning or sharding. It provides:

1) Greater scalability - because each shard can live on separate hardware if need be.

2) Greater availability - because the failure of a single shard doesn't impact the other shards.

3) Greater performance - because the tables being searched have fewer rows and therefore smaller indexes, which yield faster searches.

Your colleague's suggestion moves you to a setup with a single point of failure.

As for your question about 3 connection pools of size 10 vs 1 connection pool of size 30, the best way to settle that debate is with a benchmark. Configure your app each way, then do some stress testing with ab (ApacheBench) and see which way performs better. I suspect there won't be a significant difference, but do the benchmark to prove it.
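Before wiring up a full load test, the shape of that comparison can be sketched in a few lines. This is only a toy harness under loud assumptions: `FakePool` is a hypothetical stand-in for a real connection pool, the "query" is a `time.sleep`, and the timings say nothing about your actual database. It is no substitute for running ab against the real application.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor


class FakePool:
    """Fixed-size pool of placeholder connections (stand-in for a real pool)."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for n in range(size):
            self._idle.put(f"conn-{n}")

    def run_query(self, seconds):
        conn = self._idle.get()       # blocks while every connection is busy
        try:
            time.sleep(seconds)       # pretend the database is working
        finally:
            self._idle.put(conn)      # always return the connection to the pool


def benchmark(pools, requests_per_pool, query_seconds=0.001, workers=60):
    """Return wall-clock seconds needed to push the requests through the pools."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(pool.run_query, query_seconds)
                   for pool in pools
                   for _ in range(requests_per_pool)]
        for future in futures:
            future.result()           # re-raise any exception from a worker
    return time.perf_counter() - started


# Same total work (300 requests) and same total connections (30) both ways.
three_small = benchmark([FakePool(10) for _ in range(3)], requests_per_pool=100)
one_large = benchmark([FakePool(30)], requests_per_pool=300)
print(f"3 pools x 10: {three_small:.3f}s   1 pool x 30: {one_large:.3f}s")
```

The point of the harness is the structure: hold the total request count and the total connection count constant, and vary only how the connections are grouped.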

Other answer

If you have a single database, and two connection pools, with 5 connections each, you have 10 connections to the database. If you have 5 connection pools with 2 connections each, you have 10 connections to the database. In the end, you have 10 connections to the database. The database has no idea that your pool exists, no awareness.

Any metadata exchanged between the pool and the DB is going to happen on each connection: when the connection is started, when the connection is torn down, etc. So, if you have 10 connections, this traffic will happen 10 times (at a minimum, assuming they all stay healthy for the life of the pool). This will happen whether you have 1 pool or 10 pools.
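The counting argument above can be checked with a toy model. In this sketch, `CountingDatabase` and `Pool` are hypothetical stand-ins: the "database" only counts connections and the setup traffic each one causes, which is all the argument assumes it can see.

```python
class CountingDatabase:
    """Stand-in for a DB server that only ever sees individual connections."""

    def __init__(self):
        self.open_connections = 0
        self.handshakes = 0

    def connect(self):
        self.handshakes += 1          # per-connection setup traffic
        self.open_connections += 1
        return object()               # placeholder connection handle


class Pool:
    """A pool is purely client-side bookkeeping around its connections."""

    def __init__(self, database, size):
        self.connections = [database.connect() for _ in range(size)]


def totals(pool_sizes):
    """Open one pool per entry and report what the database observed."""
    db = CountingDatabase()
    pools = [Pool(db, size) for size in pool_sizes]
    return db.open_connections, db.handshakes


print(totals([5, 5]), totals([2, 2, 2, 2, 2]))   # (10, 10) (10, 10)
print(totals([10, 10, 10]), totals([30]))        # (30, 30) (30, 30)
```

Real pools add validation queries and reconnects, but those are also per-connection events, so grouping the same connections differently does not change the count.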

As for "1 DB per app", if you're not talking to a separate instance of the database server for each DB, then it basically doesn't matter.

If you have a DB server hosting 5 databases, and you have connections to each database (say, 2 connections each), this will consume more overhead and memory than the same server hosting a single database. But that overhead is marginal at best, and utterly insignificant on modern machines with GB-sized data buffers. Beyond a certain point, all the database cares about is mapping and copying pages of data from disk to RAM and back again.

If you had a large redundant table duplicated across all of the DBs, then that could potentially be wasteful.

Finally, when I use the word "database", I mean the logical entity the server uses to group tables. For example, Oracle really likes to have one "database" per server, broken up into "schemas". Postgres has several DBs, each of which can have schemas. But in any case, all of the modern servers have logical boundaries of data that they can use. I'm just using the word "database" here.

So, as long as you're hitting a single instance of the DB server for all of your apps, the connection pools et al don't really matter in the big picture as the server will share all of the memory and resources across the clients as necessary.

Other answer

Excellent question. I don't know which way is better, but have you considered designing the code in such a way that you can switch from one strategy to the other with the least amount of pain possible? Maybe some lightweight database proxy objects could be used to mask this design decision from higher-level code. Just in case.
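A minimal sketch of what such a proxy could look like, assuming higher-level code only ever asks for "the pool for this app". The class name `ConnectionRouter`, the app names, and the string placeholders standing in for real pool objects are all hypothetical.

```python
class ConnectionRouter:
    """Hides from callers whether apps share one pool or each have their own."""

    def __init__(self, pools, default=None):
        self._pools = dict(pools)     # app name -> pool object
        self._default = default       # shared pool for apps without their own

    def pool_for(self, app):
        pool = self._pools.get(app, self._default)
        if pool is None:
            raise KeyError(f"no pool configured for {app!r}")
        return pool


# Strategy 1: one pool per application (strings stand in for real pool objects).
per_app = ConnectionRouter({"orders": "orders-pool", "reports": "reports-pool"})

# Strategy 2: every application falls through to a single shared pool.
shared = ConnectionRouter({}, default="central-pool")

print(per_app.pool_for("orders"), shared.pool_for("orders"))
```

Switching strategies then means changing how the router is constructed, for example from configuration, while the calling code keeps using `pool_for` unchanged.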