Clusters
If your site serves many domains, you may want to install several independent CommuniGate Pro Servers and distribute the load by distributing domains between the servers. In this case you do not need to employ the special Cluster Support features. However, if you have one or more domains with 100,000 or more accounts each, if you cannot guarantee that clients will always connect to the proper server, or if you need dynamic load balancing and very high availability, you should implement a CommuniGate Pro Cluster on your site.
Many vendors use the term Cluster for simple fail-over or hot stand-by configurations. The CommuniGate Pro software can be used in fail-over as well as in Distributed Domains configurations; however, these configurations are not referred to as Cluster configurations.
A CommuniGate Pro Cluster is a set of server computers that handle the site mail load together. Each Cluster Server hosts a set of regular, non-shared domains (the CommuniGate Pro Main Domain is always a non-shared one), and it also serves (together with other Cluster Servers) a set of Shared Domains.
To use CommuniGate Pro servers in a Cluster, you need a special CommuniGate Pro Cluster License.
Please read the Scalability section first to learn how to estimate your mail server load, and how to get the most out of each CommuniGate Pro Server in your multi-server (Cluster) site.
There are two main types of Cluster configurations: Static and Dynamic.
Each Account in a Shared Domain served with a Static Cluster is created (hosted) on a certain Server,
and only that Server can access the account data directly. When a Static Cluster Server needs to perform any
operation with an account hosted on a different Server, it establishes a TCP/IP connection with
the account Host Server and accesses account data via that Host Server. This architecture allows you
to use local (i.e. non-shared) storage devices for account data.
Note: some vendors have "Mail Multiplexor"-type products. Those products usually implement
a subset of Static Cluster frontend functionality.
Accounts in Shared Domains served with a Dynamic Cluster are stored on a shared storage, so each Cluster Server (except for frontend Servers, see below) can access the account data directly. At any given moment, one of the Cluster Servers acts as a Cluster Controller synchronizing access to Accounts in Shared Domains. When a Dynamic Cluster Server needs to perform any operation with an account currently opened on a different Server, it establishes a TCP/IP connection with that "current host" Server and accesses account data via that Server. This architecture provides the highest availability (all accounts can be accessed as long as at least one Server is running), and does not require file-locking operations on the storage device.
Clusters of both types are usually equipped with frontend Servers. Frontend Servers cannot access account data directly - they always open connections to other (backend) Servers to perform any operation with account data.
Frontend servers accept TCP/IP connections from client computers (usually - from the Internet). In a pure Frontend-Backend configuration no accounts are created on any Frontend Server, but nothing prohibits you from serving some domains (with accounts and mailing lists) directly on the Frontend servers.
When a client establishes a connection with one of the Frontend Servers and sends the authentication information (the account name), the Frontend server detects on which Backend server the addressed account actually resides, and establishes a connection with that Backend Server.
If the Frontend Servers are directly exposed to the Internet, and the security of a Frontend Server operating system is compromised, so someone gets unauthorized access to that Server OS, the security of the site is not totally compromised. Frontend Servers do not keep any Account information (mailboxes, passwords) on their disks. The "cracker" would then have to go through the firewall and break the security of the Backend Server OS in order to get access to any account information. Since the network between Frontend and Backend Servers can be disabled for all types of communications except the CommuniGate Pro inter-server communications, breaking the Backend Server OS is virtually impossible.
Both Static and Dynamic Clusters can work without dedicated Frontend Servers. This is called a symmetric configuration, where each Cluster Server implements both Frontend and Backend functions.
In the example below, the domain1.dom and domain2.dom domain Accounts are distributed between three
Static Cluster Servers, and each Server accepts incoming connections for these domains.
If the Server SV1 receives a connection for the account kate@domain1.dom located on
the Server SV2, the Server SV1 starts to operate as a Frontend Server, connecting to the
Server SV2 as the Backend Server hosting the addressed Account.
At the same time, an external connection established with the server SV2 can request access to
the ada@domain1.dom account located on the Server SV1. The Server SV2 acting as
a Frontend Server will open a connection to the Server SV1 and will use it as the Backend Server hosting
the addressed account.
In a symmetric configuration, the number of inter-server connections can be equal to the number of external (user) access-type (POP, IMAP, HTTP) connections. For a symmetric Static Cluster, the average number of inter-server connections is M*(N-1)/N, where M is the number of external (user) connections, and N is the number of Servers in the Static Cluster. For a symmetric Dynamic Cluster, the average number of inter-Server connections is M*(N-1)/N * A/T, where T is the total number of Accounts in Shared Domains, and A is the average number of Accounts opened on each Server. For large ISP-type and portal-type sites, the A/T ratio is small (usually - not more than 1:100).
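The two estimates above can be sketched as simple functions. This is only an illustration of the formulas from the text; the sample numbers (30,000 user connections, 4 Servers, an A/T ratio of 1:100) are made up:

```python
# Estimate the average number of inter-server connections in a
# symmetric Cluster, using the formulas from the text.
# m: external (user) connections, n: Servers in the Cluster,
# a: average Accounts opened per Server, t: total Accounts in Shared Domains.

def static_cluster_connections(m: int, n: int) -> float:
    """Symmetric Static Cluster: M*(N-1)/N."""
    return m * (n - 1) / n

def dynamic_cluster_connections(m: int, n: int, a: int, t: int) -> float:
    """Symmetric Dynamic Cluster: M*(N-1)/N * A/T."""
    return m * (n - 1) / n * a / t

# 30,000 user connections spread over 4 Servers:
print(static_cluster_connections(30000, 4))                  # 22500.0
# The same site as a Dynamic Cluster with A/T = 1:100:
print(dynamic_cluster_connections(30000, 4, 1000, 100000))   # 225.0
```

Note how the A/T factor makes a symmetric Dynamic Cluster open far fewer inter-server connections than a symmetric Static Cluster of the same size.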
In a pure Frontend-Backend configuration, the number of inter-server connections is usually the same as the number of external (user) connections: for each external connection, a Frontend Server opens a connection to a Backend Server. A small number of inter-server connections can be opened between Backend Servers, too.
Access to all Shared Domain Accounts is provided without interruption as long as at least one Frontend Server is running.
If a Frontend server fails, no account becomes unavailable and no mail is lost. While POP and IMAP sessions conducted via the failed Frontend server are interrupted, all WebUser Interface sessions remain active, and WebUser Interface clients can continue to work via the remaining Frontend Servers. POP and IMAP users can immediately re-establish their connections via the remaining Frontend Servers.
First, install CommuniGate Pro Software on all Servers that will take part in your Cluster. Specify the Main Domain Name for all Cluster Servers. Those names should differ in the first domain name element only:
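For example (these server names are hypothetical), the Main Domain Names might be:

```
sv1.example.dom
sv2.example.dom
sv3.example.dom
```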
Use the WebAdmin Interface to open the Settings->General->Cluster page on each Backend Server, and enter all Frontend and Backend Server IP addresses. Backend CommuniGate Pro Servers will accept Cluster connections from the specified IP addresses only. If the Frontend Servers use dedicated Network Interface Cards (NICs) to communicate with Backend Servers, specify the IP addresses the Frontend Servers have on that internal network:
If your Backend Servers use non-standard port numbers for mail services, change the Backend Server Ports table on the Settings->General->Cluster page:
For example, if your Backend Servers accept WebUser Interface connections not on the port number 8100, but on the standard HTTP port 80, set 80 in the HTTP User field and click the Update button.
The Cluster and SMTP ports, as well as their Cache values, should be specified for the Dynamic Cluster Servers only.
If an address is routed to a domain listed in this table, the CommuniGate Pro Server uses its Clustering mechanism to connect to the Backend server at the specified address and performs the requested operations on that Backend server.
The logical setup of the Backend and Frontend Servers is the same - you simply do not create Shared Domain Accounts on any Frontend Server, but create them on your Backend Servers.
Computers in a Static Cluster can use different operating systems.
A complete Frontend-Backend Static Cluster configuration uses Load Balancers and several separate networks:
In a simplified configuration, you can connect Frontend Servers directly to the Internet, and balance the load using the DNS round-robin mechanism. In this case, it is highly recommended to install a firewall between Frontend and Backend Servers.
To add a Server to a Static Cluster:
After a new Frontend Server is configured and added to the Static Cluster, reconfigure the Load Balancer or the round-robin DNS server to direct incoming requests to the new Server, too.
After a new Backend Server is configured and added to the Static Cluster, you can start creating Accounts in its Shared Domains.
If you decide to shut down a Static Cluster Backend Server, all Accounts hosted on that Server become unavailable. Incoming messages to unavailable Accounts will be collected in the Frontend Server queues, and they will be delivered as soon as the Backend Server is added back or these Accounts become available on a different Backend Server (see below).
To restore access to the Accounts hosted on the failed Server, its Account Storage should be connected to any other Backend server. You can either:
After a sibling Backend server gets physical access to the Account Storage of the failed server, you should modify the Directory so all Servers will contact the new "home" for Accounts in that Storage. This can be done by an LDAP utility that modifies all records in the Domains Subtree that contain the name of the failed Server as the hostServer attribute value. The utility should set the attribute value to the name of the new Host Server, and should add the oldHostServer attribute with the name of the original Host Server. This additional attribute will allow you to restore the hostServer attribute value after the original Host Server is restored and the Account Storage is reconnected to it. If CommuniGate Pro is used as the site Directory Server, 100,000 Directory records can be modified within 1-2 minutes.
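The attribute rewrite such a utility performs can be sketched as a pure function. Directory records are modeled here as plain Python dicts and the server names are invented; a real utility would apply the same logic to the Domains Subtree through an LDAP client:

```python
# Sketch of the Directory update performed after moving a failed
# Backend Server's Account Storage to another Backend Server.
# Each record carries a hostServer attribute; records hosted on the
# failed Server are repointed, and the original host is remembered
# in oldHostServer so it can be restored later.

def reassign_host(records, failed_server, new_server):
    """Repoint every record hosted on failed_server at new_server."""
    changed = 0
    for rec in records:
        if rec.get("hostServer") == failed_server:
            rec["oldHostServer"] = rec["hostServer"]
            rec["hostServer"] = new_server
            changed += 1
    return changed

records = [
    {"uid": "kate", "hostServer": "sv2.example.dom"},
    {"uid": "ada",  "hostServer": "sv1.example.dom"},
]
print(reassign_host(records, "sv2.example.dom", "sv3.example.dom"))  # 1
print(records[0]["oldHostServer"])  # sv2.example.dom
```

Restoring the original configuration later is the inverse operation: copy oldHostServer back into hostServer and remove the extra attribute.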
If it is necessary to provide 100% site uptime and 24x7 access to all Accounts even when some of the Backend Servers fail, the Dynamic Cluster should be deployed.
The main difference between Static and Dynamic Clusters is the account hosting. While each account in a Static Cluster has its Host Server, and only that Server can access the Account data directly, all Backend Servers in a Dynamic Cluster can access the Account data directly. The most common method to implement a Dynamic Cluster shared Account Storage is employing dedicated File Servers.
Many legacy mail servers can employ file servers for account storage. Since those servers are usually implemented as multi-process systems (under Unix), they use the same synchronisation methods in both single-server and multi-server environments: file locks implemented on the Operating System/File System level.
This method has the following problems:
In the attempt to decrease the negative effect of file-locking, some legacy mail servers support the MailDir mailbox format only (one file per message), and they rely on the "atomic" nature of file directory operations (rather than on file-level locks). This approach theoretically can solve some of the outlined problems (in real-life implementations it hardly solves any), but it results in wasting most of the file server storage: many high-end file servers use 64Kbyte blocks for files, while an average mail message size is about 4Kb, and storing each message in a separate file results in wasting more than 90% of the file server disk space, and overloads file server internal file tables. Also, performance of File Servers severely declines when an application uses many smaller files instead of few larger files.
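The "more than 90%" figure follows directly from the block and message sizes quoted above:

```python
# Rough storage-waste estimate for one-message-per-file (MailDir-style)
# storage, using the figures quoted in the text: 64KB file-server
# allocation blocks and an average 4KB message.
block, message = 64 * 1024, 4 * 1024
wasted = (block - message) / block
print(f"{wasted:.1%}")  # 93.8%
```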
While simple clustering based on Operating System/File System multi-access capabilities works fine for Web servers (where the data is not modified too often), it does not work well for Mail servers where the data modification traffic is almost the same as the data retrieval traffic.
Simple Clustering does not provide any additional value (like Single Service Image), so administering a 10-Server cluster is more difficult than administering 10 independent Servers.
The CommuniGate Pro software supports the External INBOX feature, so a file-based clustering can be implemented with CommuniGate Pro, too. But because of the problems outlined above, it is highly recommended to avoid this type of solution and use the real CommuniGate Pro Dynamic Cluster instead.
This architecture provides the maximum uptime: if a Backend Server fails, all Accounts can be accessed via other Backend Servers - without any manual operator intervention, and without any downtime. The site continues to operate and provide access to all its Accounts as long as at least one Backend Server is running.
One of the Backend Servers in a Dynamic Cluster acts as the Cluster Controller. It synchronizes all other Servers in the Cluster and executes operations such as creating Shared Domains, creating and removing accounts in the shared domains, etc. The Cluster Controller also provides the Single Service Image functionality: not only a site user, but also a site administrator can connect to any Server in the Dynamic Cluster and perform any Account operation (even if the Account is currently opened on a different Server), as well as any Domain-level operations (like Domain Settings modification), and all modifications will be automatically propagated to all Cluster Servers.
Note: most of the Domain-level update operations, such as updating Domain Settings, Default Account Settings, WebUser Interface Settings, and Domain-Level Alerts may take up to 30 seconds to propagate to all Servers in the Cluster. Account-Level modifications come into effect on all Servers immediately.
The Cluster Controller collects the load level information from the Backend Servers. When a Frontend Server receives a session request for an Account not currently opened on any Backend Server, the Controller directs the Frontend Server to the least loaded Backend Server. This second-level load balancing for Backend Servers is based on actual load levels and it supplements the basic first-level Frontend load balancing (DNS round-robin or traffic-based).
While the Dynamic Cluster can maintain a Directory with Account records, the Dynamic Cluster functionality does not rely on the Directory. If the Directory is used, it should be implemented as a Shared Directory.
A complete Frontend-Backend Dynamic Cluster configuration uses Load Balancers and several separate networks:
Since all Backend Servers in a Dynamic Cluster have direct access to Account data, they should run operating systems that use the same EOL (end-of-line) conventions. This means that all Backend Servers should run the same or different flavors of the Unix OS, or they all should run the same or different flavors of the MS Windows OS. Frontend Servers do not have direct access to the Account data, so you can use any OS for your Frontend Servers (for example, a site can use the Solaris OS for Backend Servers and Microsoft Windows 2000 for Frontend Servers).
Note: if creating symbolic links is problematic (as on MS Windows platforms), you should specify the location of the "mounted" file directory as the --SharedBase Command Line Option:
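A hypothetical invocation might look like the line below; the server binary name and the shared directory path are assumptions for illustration only:

```
CGServer --SharedBase D:\SharedDomains
```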
Use the WebAdmin Interface of this first Backend Server to verify that the Cluster Controller is running. Open the Domains page to check that:
Use the Create Shared Domain button to create additional Shared Domains to be served with the Dynamic Cluster.
When the Cluster Controller is running, the site can start serving clients (if you do not use Frontend Servers). If your configuration employs Frontend servers, at least one Frontend Server should be started (see below).
Additional Backend Servers can be added to the Cluster at any moment. They should be pre-configured in exactly the same way as the first Backend Server was configured.
To add a Backend Server to your Dynamic Cluster, start it with the --ClusterMember address Command Line option (it can be added to the CommuniGatePro startup script). The address parameter should specify the IP address of the current Cluster Controller Server.
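A hypothetical startup line could look like this; the server binary name and the Controller address are assumptions for illustration only:

```
CGServer --ClusterMember 192.168.10.1
```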
Use the WebAdmin interface to verify that the Backend Server is running. Use the Domains page to check that all Shared Domains are visible and that you can administer Accounts in the Shared Domains.
When the Cluster Controller and at least one Backend Server are running, they both can serve all accounts in the Shared Domains. If you do not use Frontend Servers, load-balancing should be implemented using a regular load-balancer switch, DNS round-robin, or similar technique that distributes incoming requests between all Backend Servers.
Install and Configure the CommuniGate Pro software on a Frontend Server computer. Since Frontend Servers do not access Account data directly, there is no need to make the SharedDomains file directory available ("mounted" or "mapped") to any Frontend Server.
To add a Frontend Server to your Dynamic Cluster, start it with the --ClusterFrontend address Command Line option (it can be added to the CommuniGatePro startup script). The address parameter should specify the IP address of the current Cluster Controller Server.
Use the WebAdmin interface to verify that the Frontend Server is running. Use the Domains page to check that all Shared Domains are visible.
When Frontend Servers try to open one of the Shared Domain accounts, the Controller directs them to one of the running Backend Servers, distributing the load between all available Backend Servers.
If you use Frontend Servers, only Frontend Servers should have dedicated IP Addresses for Shared Domains. Inter-server communications always use full account names (accountname@domainname), so there is no need to dedicate IP Addresses to Shared Domains on Backend Servers.
If you use the DNS round-robin mechanisms to distribute the site load, you need to assign N IP addresses to each Shared Domain that needs dedicated IP addresses, where N is the number of your Frontend Servers. Configure the DNS Server to return these addresses in the round-robin manner:
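A DNS zone fragment for such a setup could look like the sketch below, with two Shared Domains and three Frontend Servers (all addresses here are illustrative, taken from the documentation range):

```
domain1.dom.    IN A 192.0.2.1
domain1.dom.    IN A 192.0.2.2
domain1.dom.    IN A 192.0.2.3
domain2.dom.    IN A 192.0.2.4
domain2.dom.    IN A 192.0.2.5
domain2.dom.    IN A 192.0.2.6
```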
In this example, the Cluster is serving two Shared Domains: domain1.dom and domain2.dom, and the Cluster has three Frontend Servers. Three IP addresses are assigned to each domain name in the DNS server tables, and the DNS server returns all three addresses when a client requests A-records for one of these domain names. The DNS server "rotates" the order of the IP addresses in each response, implementing the DNS "round-robin" load balancing (client applications usually use the first address in the DNS server response, and use other addresses only if an attempt to establish a TCP/IP connection with the first address fails).
When configuring these Shared Domains in your CommuniGate Pro Servers, you assign all three IP addresses to each Domain.
If you use a Load Balancer to distribute the site load, you need to assign only one IP address to each Shared Domain. You assign a unique IP address (in your internal LAN address range) for each Shared Domain on each Frontend Server:
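A sketch of such a setup is shown below, with two Shared Domains and three Frontend Servers (all addresses here are illustrative, taken from the documentation ranges):

```
; DNS: one external Load Balancer address per Shared Domain
domain1.dom.    IN A 198.51.100.1
domain2.dom.    IN A 198.51.100.2

; Load Balancer: each external address is distributed to the three
; internal Frontend Server addresses for that domain, e.g.
; 198.51.100.1 -> 192.168.10.1, 192.168.10.2, 192.168.10.3
; 198.51.100.2 -> 192.168.10.4, 192.168.10.5, 192.168.10.6
```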
In this example, the Cluster is serving two Shared Domains: domain1.dom and domain2.dom, and the Cluster has three Frontend Servers. One IP address is assigned to each Shared Domain in the DNS server tables, and those addresses are external (Internet) addresses of your Load Balancer. You should instruct the Load Balancer to distribute connections received on each of its external IP addresses to three internal IP addresses - the addresses assigned to your Frontend Servers.
When configuring these Shared Domains in your CommuniGate Pro Servers, you assign these three internal IP addresses to each Domain.
DNS MX-records for Shared Domains can point to their A-records.
To protect the site from these "cracks":
The outgoing mail traffic generated with regular (POP/IMAP) clients is submitted to the site using the A-records of the site Domains. As a result, the submitted messages go to the Frontend Servers and the messages are distributed from there.
Messages generated with WebUser clients and messages generated automatically (using the Automated Rules) are generated on the Backend Servers. Since usually the Backend servers are behind the firewall and since you usually do not want the Backend Servers to spend their resources maintaining SMTP queues, it is recommended to use the forwarding feature of the CommuniGate Pro SMTP module. Select the Forward to option and specify the domain name that resolves into the IP addresses of all (or some) Frontend Servers. In this case all mail generated on the Backend Server will be quickly sent to the Frontend Servers and it will be distributed from there.
Frontend Servers can be grouped into subsets for traffic segmentation. Each subset can have its own load balancer(s), and a switch that connects this Frontend Subset with every Backend Dynamic Cluster.
If you plan to deploy many (50 and more) Frontend Servers, the Directory Server itself can become the main site bottleneck. To remove this bottleneck and to provide redundancy on the Directory level, you can deploy several Directory Servers (each serving one or several Frontend subsets). Backend Dynamic Clusters can be configured to update only one "Master" Directory Server, and other Directory Servers can use replication mechanisms to synchronize with the Master Directory Server, or the Backend Clusters can be configured to modify all Directory Servers at once.