================================================================================
VMS VAXcluster Primitives

Connection Manager
================================================================================
VAXcluster Goals

  o Lock integrity
      - Strong notion of membership
      - Avoid "PARTITION"
  o Automatic configuration
      - Avoid detailed global configuration information
  o Avoid single points of failure
================================================================================
Connection Manager

  o Deals only with VAX nodes
  o Supports distributed services
      - Distributed lock manager
      - $ MOUNT /CLUSTER
      - OPCOM
      - $BRKTHRU
================================================================================
Principles

  o VAXcluster member must have connections to all other members
      - Connection Manager seeks this state
  o Two VAXclusters may not intersect
      - Shut down nodes when this happens
================================================================================
Connection Manager Structure

  o Two Levels
  o Lower Level
      - Node-to-node protocol
      - Built upon SCS
      - Acknowledged messages
  o Upper Level
      - Deals with cluster as a whole
      - Coordinates state transitions
================================================================================
Lower Level

  o Form connection to visible nodes
  o Repair broken connections
  o Message guarantees
      - Sequential delivery
      - No lost messages
  o Only one failure type
      - Time-out of repair attempt
      - "Last Gasp" from remote node
================================================================================
Upper Level

  o Prevent partition
  o Orchestrates cluster state changes
      - Formation of cluster
      - Addition of node
      - Reconfiguration of cluster
      - Add / remove quorum disk
      - Manually adjust quorum
================================================================================
Avoiding Partition

  o Partition exists if the cluster splits into two independently
    running pieces
  o Solution
      - Votes and quorum
      - Quorum required is > half of total votes
      - No activity without quorum
================================================================================
Quorum Details

  o Each node has 0 or more votes
  o Each node suggests quorum
  o Cluster has sum of nodes' votes
  o Cluster has max of suggested quorums
  o Block activity if votes < quorum
  o Quorum ratchets up
  o Quorum always greater than 1/2 votes
================================================================================
Quorum Example

  o Three nodes -- A, B, C
      - Each has 1 vote
      - Each suggests quorum of 2
  o Cluster has
      - Votes: 3
      - Quorum: 2
      - Activity allowed
================================================================================
Quorum Example Continued

  o If one node is lost
      - Votes remaining: 2
      - Quorum: 2
      - Activity is allowed
================================================================================
Quorum Example Continued

  o If second node is lost
      - Votes remaining: 1
      - Quorum: 2
      - Activity is blocked
================================================================================
Extended Quorum

  o Needed for two node case
  o Designate disk to serve as "node"
  o Disk assigned votes by each node
  o Disk votes count only if
      - All nodes designate same disk
      - All nodes can read / write disk
  o File: [000000]QUORUM.DAT
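
The vote and quorum rules from the preceding slides can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Connection Manager's actual code: the names Node, cluster_quorum, quorum_disk_votes and activity_allowed, and the disk name "DISK1", are hypothetical, and treating the quorum disk's contribution as a single vote count is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Node:
    name: str
    votes: int             # each node has 0 or more votes
    suggested_quorum: int  # each node suggests a quorum


def cluster_votes(members: Sequence[Node]) -> int:
    """Cluster votes are the sum of the members' votes."""
    return sum(n.votes for n in members)


def cluster_quorum(members: Sequence[Node], previous_quorum: int = 0) -> int:
    """Cluster quorum is the max of the suggested quorums.

    Passing the previous value models the ratchet: quorum never drops
    just because members are lost.
    """
    suggested = max((n.suggested_quorum for n in members), default=0)
    return max(previous_quorum, suggested)


def quorum_disk_votes(disk_votes: int,
                      designated: Sequence[str],
                      can_access: Sequence[bool]) -> int:
    """Disk votes count only if all nodes name the same disk and can use it."""
    same_disk = len(set(designated)) == 1
    return disk_votes if same_disk and all(can_access) else 0


def activity_allowed(members: Sequence[Node], quorum: int,
                     extra_votes: int = 0) -> bool:
    """Activity is blocked whenever votes < quorum."""
    return cluster_votes(members) + extra_votes >= quorum


# Three-node example from the slides: A, B, C with one vote each.
a, b, c = (Node(name, votes=1, suggested_quorum=2) for name in "ABC")
quorum = cluster_quorum([a, b, c])                           # 2
three_ok = activity_allowed([a, b, c], quorum)               # True  (3 >= 2)
quorum = cluster_quorum([a, b], previous_quorum=quorum)      # still 2
two_ok = activity_allowed([a, b], quorum)                    # True  (2 >= 2)
one_ok = activity_allowed([a], quorum)                       # False (1 < 2)

# Two-node case with a quorum disk: the surviving node plus the disk
# still reaches quorum.
disk = quorum_disk_votes(1, designated=["DISK1"], can_access=[True])
survivor_ok = activity_allowed([a], quorum, extra_votes=disk)  # True (2 >= 2)
```

The last line shows why the quorum disk matters in the two-node case: without the disk's vote, the loss of either node would leave the survivor below quorum.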
================================================================================
Cluster State Changes

  o Form cluster
  o Join cluster
  o Reconfigure cluster
  o Add / remove quorum disk
  o Manually adjust quorum
================================================================================
State Change Mechanism

  o Node requests "coordinator" role
  o Proposes state change
  o Every participating node must agree
  o State change executed
  o Two-phase commit protocol assures atomicity
================================================================================
Booting

  o Early decision regarding cluster
  o Booting waits until
      - new cluster formed
      - existing cluster joined
  o Leave cluster only by shutting down
================================================================================
Cluster Reconfiguration

  o Resolves connectivity defect
  o New cluster
      - Subset of old cluster
      - Preserves maximum votes
  o Orchestrate lock manager rebuild
================================================================================
Performance

  o Detection of failed node
      - Operator shutdown: immediate
      - Software bugcheck: usually immediate
      - Other: up to 90 seconds + RECNXINTERVAL
  o Reconfiguration
      - 8 seconds minimum
      - 10 seconds typical
================================================================================
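
The State Change Mechanism slide describes a coordinator that proposes a change, collects agreement from every participating node, and only then executes it. A minimal in-process sketch of that two-phase commit pattern follows. It is a generic illustration rather than the VAXcluster protocol itself: Participant, propose_state_change and the "accepts" callback are hypothetical names, and the sketch assumes reliable local calls in place of SCS messages and does not model coordinator failure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Participant:
    name: str
    accepts: Callable[[str], bool]          # hypothetical local veto check
    applied: List[str] = field(default_factory=list)
    pending: Optional[str] = None

    def prepare(self, change: str) -> bool:
        """Phase 1: vote on the proposal and hold it if acceptable."""
        if self.accepts(change):
            self.pending = change
            return True
        return False

    def commit(self) -> None:
        """Phase 2: execute the change that was held in phase 1."""
        if self.pending is not None:
            self.applied.append(self.pending)
            self.pending = None

    def abort(self) -> None:
        """Phase 2 (failure path): discard the held change."""
        self.pending = None


def propose_state_change(participants: List[Participant], change: str) -> bool:
    """Coordinator logic: commit everywhere or nowhere."""
    prepared: List[Participant] = []
    for p in participants:
        if p.prepare(change):
            prepared.append(p)
        else:
            # One refusal is enough: roll back everyone who had agreed.
            for q in prepared:
                q.abort()
            return False
    for p in participants:
        p.commit()
    return True


# Every node agrees, so the change is executed on all of them.
nodes = [Participant(n, accepts=lambda change: True) for n in "ABC"]
joined = propose_state_change(nodes, "add node D")

# One node refuses, so no node executes the change.
picky = nodes + [Participant("E", accepts=lambda change: change != "remove quorum disk")]
removed = propose_state_change(picky, "remove quorum disk")
```

After the first call every participant has recorded "add node D"; after the second call none has recorded "remove quorum disk", which is the atomicity property the slide attributes to two-phase commit.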