What algorithms are there for failover in a distributed system?

Question

I'm planning to build a distributed database system using a shared-nothing architecture and multiversion concurrency control. Redundancy will be achieved through asynchronous replication (losing some recent changes in a failure is acceptable, as long as the data in the system remains consistent). For each database entry, one node holds the master copy (only that node has write access to it), and one or more other nodes hold read-only secondary copies of the entry for scalability and redundancy. When the master copy of an entry is updated, it is timestamped and sent asynchronously to the nodes with secondary copies, so that they eventually get the latest version of the entry. The node that holds the master copy can change at any time: if another node needs to write that entry, it asks the current owner to hand over ownership of the entry's master copy, and once it has received ownership it can write the entry (all transactions and writes are local).

Lately I've been thinking about what to do when a node in the cluster goes down, and what strategy to use for failover. Here are some questions. I hope you know of available alternatives for at least some of them.

  • What algorithms are there for doing failover in a distributed system?
  • What algorithms are there for consensus in a distributed system?
  • How should the nodes in the cluster determine that a node is down?
  • How should the nodes determine which database entries had their master copy on the failed node at the time of failure, so that other nodes can recover those entries?
  • How to decide which node(s) have the latest secondary copy of some entry?
  • How to decide which node's secondary copy should be promoted to be the new master copy?
  • How to handle it if a node that was thought to be down suddenly comes back as if nothing happened?
  • How to avoid split-brain scenarios, where the network is temporarily split into two and both sides think that the other side has died?

Recommended answer

* What algorithms are there for doing failover in a distributed system?

Possibly not algorithms, so much as systems. You need to design your architecture around the questions you've asked.

* What algorithms are there for consensus in a distributed system?

You probably want to implement Paxos. Simple Paxos is not too hard to get right. If you're trying to make it bulletproof, read Google's 'Paxos Made Live' paper. If you're hoping to make it high-performance, look at Multi-Paxos.
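To make "simple Paxos" a bit more concrete, here is a minimal in-memory sketch of single-decree Paxos. It is only an illustration under heavy assumptions: the `Acceptor` class and `propose` function are hypothetical names, there is no network, no message loss and no persistent state (all of which a real implementation has to deal with), and ballot numbers are assumed to be unique per proposer.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                           # highest ballot promised so far
    accepted: Optional[Tuple[int, Any]] = None   # (ballot, value) last accepted, if any

    def prepare(self, ballot: int) -> Tuple[bool, Optional[Tuple[int, Any]]]:
        """Phase 1b: promise to ignore lower ballots and report any accepted value."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot: int, value: Any) -> bool:
        """Phase 2b: accept unless a higher ballot has already been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def propose(acceptors: List[Acceptor], ballot: int, value: Any) -> Optional[Any]:
    """Run one proposer round. Returns the chosen value, or None if the round failed."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: collect promises from a majority of acceptors.
    replies = [a.prepare(ballot) for a in acceptors]
    granted = [acc for ok, acc in replies if ok]
    if len(granted) < majority:
        return None

    # If any promising acceptor already accepted a value, the proposer must
    # adopt the one with the highest ballot instead of its own value.
    previous = [acc for acc in granted if acc is not None]
    if previous:
        value = max(previous, key=lambda bv: bv[0])[1]

    # Phase 2: ask the acceptors to accept the (possibly adopted) value.
    votes = sum(a.accept(ballot, value) for a in acceptors)
    return value if votes >= majority else None
```

The point to notice is that once a value has been chosen, a later proposer with a higher ballot ends up re-proposing that same value rather than its own, which is what keeps the decision stable.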

* How should the nodes in the cluster determine that a node is down?

Depends. Heartbeats are actually a pretty good way to do this. The problem is that you get false positives, but that's more or less unavoidable, and in a cluster on the same LAN with manageable load heartbeats are accurate. The good thing about Paxos is that false positives are dealt with automatically. However, if you need failure information for some other purpose, you need to make sure it's OK to flag a node as failed when it is actually just under load and slow to respond to a heartbeat.
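As a rough sketch of the heartbeat idea, the detector below records the last time each node was heard from and reports nodes that have been silent for longer than a timeout. The class name, the `timeout` parameter and the injectable `clock` are assumptions made for illustration. Note that it returns nodes that are *suspected*, not nodes that are known to be dead, which is exactly the false-positive caveat above.

```python
import time
from typing import Callable, Dict, Set


class HeartbeatDetector:
    """Timeout-based failure detector: it can only suspect nodes, never be sure."""

    def __init__(self, timeout: float, clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout                  # seconds of silence before suspicion
        self.clock = clock                      # injectable so the logic can be tested
        self.last_seen: Dict[str, float] = {}   # node id -> time of last heartbeat

    def heartbeat(self, node_id: str) -> None:
        """Record that a heartbeat from node_id has just arrived."""
        self.last_seen[node_id] = self.clock()

    def suspected(self) -> Set[str]:
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return {n for n, t in self.last_seen.items() if now - t > self.timeout}
```

A node that was merely slow drops out of the suspected set again as soon as its next heartbeat arrives, so whatever consumes this information has to tolerate suspicions being withdrawn.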

* How should the nodes determine which database entries had their master copy on the failed node at the time of failure, so that other nodes can recover those entries?
* How to decide which node(s) have the latest secondary copy of some entry?
* How to decide which node's secondary copy should be promoted to be the new master copy?

I think you might really benefit from reading the Google File System (GFS) paper. In GFS there's a dedicated master node which keeps track of which nodes have which blocks. This scheme might work for you, but the key is to keep accesses to this master minimal.

If you don't store this information on a dedicated node, you're going to have to store it everywhere. Try tagging the data with the master holder's ID.
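Here is one way the tagging idea could look, as a sketch only. It assumes each copy carries the ID of the node it believes is the master plus a per-entry version counter assigned by the master on every write; `ReplicaInfo`, `plan_promotions` and the field names are all hypothetical. Given the metadata reported by the surviving nodes, it finds the entries that were mastered on the failed node and picks the most up-to-date surviving copy of each.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ReplicaInfo:
    node_id: str     # node holding this copy
    key: str         # database entry key
    master_id: str   # node this copy believes is the master (the "tag")
    version: int     # per-entry counter assigned by the master on each write


def plan_promotions(replicas: List[ReplicaInfo], failed: str) -> Dict[str, ReplicaInfo]:
    """For each entry mastered on `failed`, pick the surviving copy with the highest version."""
    best: Dict[str, ReplicaInfo] = {}
    for r in replicas:
        # Skip entries mastered elsewhere and copies that live on the failed node itself.
        if r.master_id != failed or r.node_id == failed:
            continue
        current = best.get(r.key)
        # Ties on version are broken by node id so that every node computes the same plan.
        if current is None or (r.version, r.node_id) > (current.version, current.node_id):
            best[r.key] = r
    return best
```

With asynchronous replication the chosen copy may still be behind what the failed master had written, which matches the question's stated tolerance for losing some recent changes. The nodes also still have to agree on the resulting plan, which is where a consensus protocol comes in.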

* How to handle it if a node that was thought to be down suddenly comes back as if nothing happened?

See above, but the basic point is that you have to be careful, because a node that is no longer the master might still think that it is. One thing that I don't think you've solved: how does an update get to the master, i.e. how does a client know which node to send the update to?
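One common way to guard against a stale master is to attach an epoch number to mastership and have the replicas reject anything from an older epoch. The sketch below assumes the failover protocol bumps a per-entry epoch each time mastership changes; the `SecondaryReplica` class and its fields are hypothetical, not something from the question's design.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SecondaryReplica:
    epochs: Dict[str, int] = field(default_factory=dict)   # key -> highest master epoch seen
    values: Dict[str, str] = field(default_factory=dict)   # key -> latest applied value

    def apply(self, key: str, master_epoch: int, value: str) -> bool:
        """Apply a replicated update unless it comes from a superseded master."""
        if master_epoch < self.epochs.get(key, 0):
            return False                # stale master: reject, do not overwrite
        self.epochs[key] = master_epoch
        self.values[key] = value
        return True
```

The rejection also gives the returning node a signal that it has been superseded, so it can stop acting as master and resynchronise.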

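* How to avoid split-brain scenarios, where the network is temporarily split into two and both sides think that the other side has died?

Paxos works here by preventing progress in the case of a perfect split: a value can only be chosen by a majority of the nodes, so if neither side of the split holds a majority, neither side can decide anything. Otherwise, as before, you have to be very careful.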
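The majority rule behind this is small enough to sketch directly. The function name and the idea of passing in the set of currently reachable nodes are assumptions for illustration; how a node learns which peers are reachable is a separate problem.

```python
from typing import Iterable, Set


def has_quorum(reachable: Iterable[str], cluster: Set[str]) -> bool:
    """True if the reachable nodes (including this one) form a strict majority of the cluster."""
    return len(set(reachable) & cluster) > len(cluster) // 2
```

With four nodes split two and two, neither side has a quorum, so neither side may promote a new master. With an odd number of nodes, at most one side of any two-way split can proceed.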

In general, solve the question of knowing which node is the master for which data item, and you'll be a long way towards fixing your architecture. Note that you can't just have the node receiving the update be the master - what if two updates happen concurrently? Don't rely on a synchronised global clock either - that way madness lies. You probably want to avoid running consensus on every write if you can help it, so instead perhaps have a slow master-failover protocol and a fast write path.
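A sketch of how the fast write path could avoid both clocks and per-write consensus: the master stamps each write with an `(epoch, seq)` pair, where the epoch is assumed to come from the slow failover protocol and the sequence number is a local counter. The `Master` class and the `outbox` list are hypothetical stand-ins for the real node and its asynchronous replication queue.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (epoch, seq) pairs compare lexicographically, so no wall clock is involved.
Version = Tuple[int, int]


@dataclass
class Master:
    epoch: int    # assumed to be assigned by the slow failover protocol
    seq: int = 0  # local counter, bumped on every write
    outbox: List[Tuple[str, Version, str]] = field(default_factory=list)

    def write(self, key: str, value: str) -> Version:
        """Fast path: a purely local write, with no consensus round."""
        self.seq += 1
        version = (self.epoch, self.seq)
        # Queued for asynchronous shipping to the secondaries.
        self.outbox.append((key, version, value))
        return version
```

Any write made under a newer epoch compares greater than every write from an older epoch, regardless of what the machines' clocks say, so secondaries can order updates by comparing versions alone.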

Feel free to shoot me a mail offline if you want to know more details. My blog http://the-paper-trail.org deals with a lot of this stuff.

cheers,

Henry

Other recommended answer

You are asking an absolutely massive question, and a lot of what you want to know is still in active research.

Some thoughts:

  • Distributed systems are difficult, because there are no foolproof systems to deal with failures; in an asynchronous system, there is no way to be sure whether a node is down or whether there is just network delay. This may sound trivial, but it really isn't.
  • Achieving consensus can be done by the Paxos family of algorithms, versions of which are used in Google's Chubby lock service (which Bigtable relies on), and in other places.

You'll want to delve into a distributed systems textbook (or several). I like Tanenbaum's Distributed Systems: Principles and Paradigms.

Other recommended answer

A great blog that talks a lot about distributed systems and distributed algorithms -- including implementing Paxos -- is http://the-paper-trail.org/