Slide 1 : LVS : Linux Virtual Server Presented by : Abhishek Chib
Slide 2 : Cluster
A computer cluster is a group of tightly coupled computers that work together so closely that in many respects they can be viewed as a single computer.
The components of a cluster are commonly, but not always, connected to each other through fast local area networks.
Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.
Slide 3 : Types Of Cluster
High-availability (HA) clusters
HA cluster implementations attempt to manage the redundancy inherent in a cluster to eliminate single points of failure.
Example: Linux-HA. Many commercial HA products are also available.
Load-balancing clusters
Load-balancing clusters operate by having all workload come through one or more load-balancing front ends, which then distribute it to a collection of back end servers.
Such a cluster of computers is sometimes referred to as a server farm
Slide 4 : Example: Commercial Products
Platform LSF HPC, Sun Grid Engine, Moab Cluster Suite and Maui Cluster Scheduler
Open Source Products
The Linux Virtual Server project provides one commonly used free software package for the Linux OS.
High-performance computing (HPC) clusters
HPC clusters are implemented primarily to provide increased performance by splitting a computational task across many different nodes in the cluster, and are most commonly used in scientific computing.
Slide 5 : Grid computing
Grid computing, or grid clusters, is a technology closely related to cluster computing. The key differences between grids and traditional clusters (by definitions that distinguish the two at all) are that grids connect collections of computers which do not fully trust each other, or which are geographically dispersed.
Examples:
Folding@home project, used by researchers to search for cures for diseases such as cancer.
SETI@home project, the largest distributed cluster in existence. It uses approximately three million home computers all over the world to analyze data from the Arecibo Observatory radio telescope, searching for evidence of extraterrestrial intelligence.
Slide 6 : Introduction to Linux Virtual Server
Slide 7 : Agenda
1) Explains the Linux Virtual Server technology used by Red Hat Enterprise Linux to create a load-balancing cluster
2) Explains how to configure a Red Hat Enterprise Linux LVS cluster
3) Guides you through the Piranha Configuration Tool, a graphical interface used for configuring and monitoring an LVS cluster
Slide 8 : Technology Overview
Compute clustering (such as Beowulf) uses multiple machines to provide greater computing power for computationally intensive tasks. This type of clustering is not addressed by Red Hat Enterprise Linux.
High-availability (HA) clustering uses multiple machines to add an extra level of reliability for a service or group of services.
Load-balance clustering uses specialized routing techniques to dispatch traffic to a pool of servers.
Slide 9 : What is LVS, how does it work, and why should we use it?
Linux Virtual Server (LVS) appears as a single server to the user.
Two layered architecture
For higher throughput. The cost of increasing throughput by adding real servers to an LVS grows linearly, whereas the cost of increasing throughput by buying a larger single machine grows faster than linearly.
For redundancy. Individual machines can be switched out of the LVS, upgraded and brought back online without interruption of service to the clients.
For adaptability. If the throughput is expected to change gradually (as a business builds up) or quickly (for an event), the number of servers can be increased (and then decreased) transparently to the clients.
Slide 10 : A Basic LVS Configuration
Slide 11 : Data Replication and Data Sharing Between Real Servers
LVS clustering has no built-in component for sharing the same data between the real servers.
Solutions
Synchronize the data across the real server pool
Add a third layer to the topology for shared data access
(mainly for busy FTP and HTTP servers)
Slide 12 : A Three Tiered LVS Configuration
Slide 13 : LVS Scheduling Overview
Round-Robin Scheduling
Weighted Round-Robin Scheduling
Least-Connection Scheduling
Weighted Least-Connection Scheduling (default)
Locality-Based Least-Connection Scheduling
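As a rough illustration of how weighted least-connection scheduling chooses a server, the bash sketch below picks the entry with the lowest ratio of active connections to weight. The pick_wlc function and the name:connections:weight input format are assumptions made for this example only. The real IPVS scheduler runs in the kernel and also takes inactive connections into account.

```bash
# pick_wlc: print the name of the server with the lowest connections/weight
# ratio. Arguments are hypothetical "name:active_connections:weight" triples.
pick_wlc() {
    local best_name best_conns best_weight entry name rest conns weight
    best_name=""
    best_conns=0
    best_weight=0
    for entry in "$@"; do
        name=${entry%%:*}      # text before the first colon
        rest=${entry#*:}       # text after the first colon
        conns=${rest%%:*}
        weight=${rest#*:}
        # A weight of 0 means the server should receive no new connections.
        [ "$weight" -gt 0 ] || continue
        # Compare conns/weight with best_conns/best_weight by cross-multiplying,
        # which avoids floating-point arithmetic.
        if [ -z "$best_name" ] || [ $((conns * best_weight)) -lt $((best_conns * weight)) ]; then
            best_name=$name
            best_conns=$conns
            best_weight=$weight
        fi
    done
    [ -n "$best_name" ] && printf '%s\n' "$best_name"
}

# rs1 has ratio 10/1, rs2 has ratio 15/2, rs3 is skipped (weight 0).
pick_wlc rs1:10:1 rs2:15:2 rs3:4:0    # prints rs2
```

On ties the first server listed wins, which is a choice made for this sketch rather than a documented property of LVS.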
Slide 14 : Routing Methods
Slide 15 : NAT-ROUTING
Slide 16 : Direct Routing
Slide 17 : Direct Routing and the ARP Limitation
1) In a direct routing LVS setup, a client request to an IP address must be associated with a MAC address before it can be handled, so the virtual IP address (VIP) of the LVS system must also be associated with a MAC address.
2) However, because the LVS router and the real servers all have the same VIP, the ARP request is broadcast to every machine associated with the VIP.
Problem
The VIP can end up associated directly with one of the real servers, which then processes requests itself. This bypasses the LVS router completely and defeats the purpose of the LVS setup.
Solution
Use an LVS router with a powerful CPU so that it can respond to ARP requests faster (this is still a problem when the router is under load).
Slide 18 : To solve this issue, incoming requests should associate the VIP only with the LVS router, which then processes the requests and sends them to the real server pool.
Use arptables_jf
Persistence and Firewall Marks
Persistence
Persistence acts like a timer. When a client connects to a service, LVS remembers the last connection for a specified period of time, and a new connection from the same client within that period goes to the same real server. By default, the persistence timeout value is 300 seconds.
Firewall Marks
Firewall marks are an easy and efficient way to group the ports used for a protocol or a group of related protocols.
Example: HTTP and HTTPS for e-commerce site.
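The HTTP and HTTPS example can be sketched as follows. The helper only prints the commands so that they can be reviewed before being run as root on the LVS router. The function name fwmark_commands, the address 192.0.2.10 and the mark value 80 are assumptions chosen for illustration; the iptables and ipvsadm options are the standard ones for marking packets (-j MARK --set-mark) and for a firewall-mark virtual service with persistence (-f, -p).

```bash
# fwmark_commands: print (do not run) the commands that group several TCP
# ports under one firewall mark and attach a persistent virtual service to it.
# Usage: fwmark_commands VIP MARK TIMEOUT PORT...
fwmark_commands() {
    local vip="$1" mark="$2" timeout="$3" port
    shift 3
    for port in "$@"; do
        # Mark packets addressed to the VIP on each port with the same value.
        printf 'iptables -t mangle -A PREROUTING -p tcp -d %s/32 --dport %s -j MARK --set-mark %s\n' \
            "$vip" "$port" "$mark"
    done
    # One firewall-mark virtual service (-f) with persistence (-p, in seconds).
    printf 'ipvsadm -A -f %s -s wlc -p %s\n' "$mark" "$timeout"
}

# Example values only: group ports 80 and 443 under mark 80, 300 s persistence.
fwmark_commands 192.0.2.10 80 300 80 443
```

Because both ports share one mark and one persistence timer, a client that starts on HTTP and moves to HTTPS stays on the same real server.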
Slide 19 : LVS Cluster — A Block Diagram
Slide 20 : Components of an LVS Cluster
pulse
This is the controlling process which starts all other daemons related to the LVS router.
It is controlled by the /etc/rc.d/init.d/pulse script and reads the configuration file /etc/sysconfig/ha/lvs.cf.
lvs
The lvs daemon runs on the active LVS router once called by pulse. It reads the configuration file /etc/sysconfig/ha/lvs.cf, calls the ipvsadm utility to build and maintain the IPVS routing table, and assigns a nanny process for each configured LVS service. If nanny reports that a real server is down, lvs instructs the ipvsadm utility to remove the real server from the IPVS routing table.
Slide 21 : ipvsadm
This utility updates the IPVS routing table in the kernel. The lvs daemon sets up and administers an LVS cluster by calling ipvsadm to add, change, or delete entries in the IPVS routing table.
nanny
The nanny monitoring daemon runs on the active LVS router. Through this daemon, the active router determines the health of each real server and, optionally, monitors its workload. A separate process runs for each service defined on each real server.
send_arp
This program sends out ARP broadcasts when the floating IP address changes from one node to another during failover.
Slide 22 : Initial LVS Configuration
Configuring Services on the LVS Routers
The following services should be running on the LVS routers:
piranha-gui (only on the primary router)
pulse
sshd
Enable each one at boot with chkconfig --level <runlevels> <daemon> on
Configure Password & Start Piranha Configuration Tool
/usr/sbin/piranha-passwd
service piranha-gui start (also enable it with chkconfig)
Slide 23 : The piranha-gui service depends on the httpd process. If you restart httpd, start the piranha-gui service again as well.
The Piranha Configuration Tool runs on port 3636 by default, but this is configurable.
Launch Piranha Configuration Tool via any Web browser
http://localhost:3636
Limiting Access To the Piranha Configuration Tool
Edit the /etc/sysconfig/ha/web/secure/.htaccess file as needed
Slide 24 : Turning on Packet Forwarding
cat /proc/sys/net/ipv4/ip_forward
0
echo 1 > /proc/sys/net/ipv4/ip_forward
vi /etc/sysctl.conf
net.ipv4.ip_forward=1
sysctl -p|grep ip_forward
or
sysctl net.ipv4.ip_forward
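A small check along these lines can confirm the setting before the rest of the LVS configuration is attempted. This is a minimal sketch: the function name forwarding_enabled is hypothetical, and the optional file argument exists only so that the check can be tried against a test file instead of the live kernel setting.

```bash
# forwarding_enabled: succeed if the given file (default: the kernel's
# ip_forward switch) is readable and contains 1.
forwarding_enabled() {
    local file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is on"
else
    echo "IPv4 forwarding is off (or could not be read)"
fi
```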
Slide 25 : Configure Services on the Real Servers
HTTPD
FTP
DNS
Slide 26 : Setting Up a Red Hat Enterprise Linux LVS NAT Cluster
Slide 27 : If eth0 is connected to the Internet and eth1 is connected to your local area network, you can turn on masquerading with the following commands:
iptables -t nat -P POSTROUTING DROP
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
service iptables save
Create a virtual service and choose the scheduling algorithm
ipvsadm -A -t VIP:80 -s wlc
Replace VIP with your virtual IP address, and RIP1 and RIP2 with the IP addresses of your real servers
ipvsadm -a -t VIP:80 -r RIP1:80 -m
ipvsadm -a -t VIP:80 -r RIP2:80 -m -w 2
Or you can use the Piranha Configuration Tool.
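When there are more than a couple of real servers, the ipvsadm lines can be generated rather than typed. The sketch below only prints the commands for review. The function name nat_service_commands, the RIP[:WEIGHT] argument format and the addresses are assumptions for this example; the ipvsadm options used (-A, -a, -t, -r, -s, -m, -w) are the same ones shown on the slide.

```bash
# nat_service_commands: print (do not run) ipvsadm commands for one NAT
# virtual service. Usage: nat_service_commands VIP PORT RIP[:WEIGHT]...
nat_service_commands() {
    local vip="$1" port="$2" entry rip weight
    shift 2
    # -A adds the virtual service; -s wlc selects weighted least-connection.
    printf 'ipvsadm -A -t %s:%s -s wlc\n' "$vip" "$port"
    for entry in "$@"; do
        rip=${entry%%:*}
        weight=1
        # Use the part after the colon as the weight when one is given.
        case $entry in
            *:*) weight=${entry#*:} ;;
        esac
        # -a adds a real server; -m selects masquerading (NAT); -w sets the weight.
        printf 'ipvsadm -a -t %s:%s -r %s:%s -m -w %s\n' \
            "$vip" "$port" "$rip" "$port" "$weight"
    done
}

# Example values only: two real servers, the second with weight 2.
nat_service_commands 192.0.2.10 80 10.0.0.2 10.0.0.3:2
```

After the printed commands have been run as root, ipvsadm -L -n lists the resulting table with numeric addresses.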
Slide 28 : FTP In an LVS Cluster
File Transfer Protocol (FTP) is an old and complex multi-port protocol that presents a distinct set of challenges to a clustered environment. To understand the nature of these challenges, you must first understand some key things about how FTP works.
FTP is exclusively a TCP-based service. There is no UDP component to FTP.
FTP is an unusual service in that it uses two ports, a 'data' port and a 'command' port (also known as the control port). Traditionally these are port 21 for the command port and port 20 for the data port. The confusion begins, however, when we find that depending on the mode, the data port is not always port 20.
In active mode FTP the client connects from a random unprivileged port (N > 1023) to the FTP server's command port, port 21. Then the client starts listening on port N+1 and sends the FTP command PORT N+1 to the FTP server. The server then connects back to the client's specified data port from its local data port, which is port 20.
Active FTP
Slide 29 : In step 1, the client's command port contacts the server's command port and sends the command PORT 1027. The server then sends an ACK back to the client's command port in step 2. In step 3 the server initiates a connection on its local data port to the data port the client specified earlier. Finally, the client sends an ACK back as shown in step 4.
Slide 30 : Passive FTP
In step 1, the client contacts the server on the command port and issues the PASV command. The server then replies in step 2 with PORT 2024, telling the client which port it is listening on for the data connection. In step 3 the client initiates the data connection from its data port to the specified server data port. Finally, the server sends back an ACK in step 4 to the client's data port.
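The slides write the port as a single number (PORT 1027, PORT 2024). On the wire, FTP sends the address and port as six comma-separated decimal bytes, h1,h2,h3,h4,p1,p2, where the port is p1*256 + p2 (RFC 959). The sketch below shows that arithmetic; the function names and the placeholder address bytes are assumptions for this example.

```bash
# ftp_encode_port: split a TCP port into the two decimal bytes (high,low)
# that FTP sends after the four address bytes.
ftp_encode_port() {
    local port="$1"
    printf '%d,%d\n' $((port / 256)) $((port % 256))
}

# ftp_decode_port: read the port back from an "h1,h2,h3,h4,p1,p2" string.
ftp_decode_port() {
    local p1 p2 rest
    p2=${1##*,}        # last field
    rest=${1%,*}       # everything before the last comma
    p1=${rest##*,}     # second-to-last field
    printf '%d\n' $((p1 * 256 + p2))
}

ftp_encode_port 1027                 # prints 4,3   (4*256 + 3 = 1027)
ftp_decode_port 192,0,2,10,7,232     # prints 2024  (7*256 + 232 = 2024)
```

This encoding is one reason FTP needs special handling on the LVS router: the data port is negotiated inside the command connection and cannot be read from the packet headers alone.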
Slide 31 : Active & Passive FTP
Slide 32 : Two important things to note about all of this with regard to clustering:
1. The client determines the type of connection, not the server. This means that, to cluster FTP effectively, you must configure the LVS routers to handle both active and passive connections.
2. The FTP client/server relationship can potentially open a large number of ports that the Piranha Configuration Tool and IPVS do not know about.
How This Affects LVS Routing
IPVS packet forwarding only allows connections in and out of the cluster based on it recognizing its port number or its firewall mark. If a client from outside the cluster attempts to open a port IPVS is not configured to handle, it drops the connection. Similarly, if the real server attempts to open a connection back out to the Internet on a port IPVS does not know about, it drops the connection. This means all connections from FTP clients on the Internet must have the same firewall mark assigned to them and all connections from the FTP server must be properly forwarded to the Internet using network packet filtering rules.
Slide 33 : Iptables rule for FTP
Rules for Active Connections
/sbin/iptables -t nat -A POSTROUTING -p tcp -s 192.168.1.0/24 --sport 20 -j MASQUERADE
Rules for Passive Connections
/sbin/iptables -t mangle -A PREROUTING -p tcp -d 172.16.1.5/32 --dport 21 -j MARK --set-mark 21
/sbin/iptables -t mangle -A PREROUTING -p tcp -d 172.16.1.5/32 --dport 10000:20000 -j MARK --set-mark 21
You must also use the VIRTUAL SERVER subsection of the Piranha Configuration Tool to configure a virtual server for port 21 with a value of 21 in the Firewall Mark field.
Replace 192.168.1.0/24 with the network of your LAN NAT interface.
Replace 172.16.1.5 with the floating IP for the FTP virtual server defined in the VIRTUAL SERVER section.
Slide 34 : Load IPVS FTP module
lsmod | grep ip_vs (check which ip_vs modules are loaded)
modprobe ip_vs_ftp (load the IPVS FTP module for netfilter)
Changes in Real FTP Servers
/etc/vsftpd.conf:
pasv_min_port=10000
pasv_max_port=20000
pasv_address=172.16.1.5
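The passive port range in vsftpd.conf has to match the range marked by the iptables rule on the LVS router, so it can help to derive one from the other. The sketch below reads the two pasv_ settings from a file and prints the matching rule for review. The function names conf_value and passive_mark_rule are hypothetical, and the demo writes the slide's values to a temporary file rather than reading a real /etc/vsftpd.conf.

```bash
# conf_value: print the value of KEY from a "key=value" file such as vsftpd.conf.
conf_value() {
    local key="$1" file="$2"
    sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# passive_mark_rule: print (do not run) the iptables rule that marks the
# passive port range configured in a vsftpd.conf file.
# Usage: passive_mark_rule VSFTPD_CONF VIP MARK
passive_mark_rule() {
    local file="$1" vip="$2" mark="$3" min max
    min=$(conf_value pasv_min_port "$file")
    max=$(conf_value pasv_max_port "$file")
    if [ -z "$min" ] || [ -z "$max" ]; then
        echo "passive port range not set in $file" >&2
        return 1
    fi
    printf 'iptables -t mangle -A PREROUTING -p tcp -d %s/32 --dport %s:%s -j MARK --set-mark %s\n' \
        "$vip" "$min" "$max" "$mark"
}

# Demo with a temporary file holding the values from the slide.
demo_conf=$(mktemp)
printf 'pasv_min_port=10000\npasv_max_port=20000\npasv_address=172.16.1.5\n' > "$demo_conf"
passive_mark_rule "$demo_conf" 172.16.1.5 21
rm -f "$demo_conf"
```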
Slide 35 : Direct Routing and IPTables
[Diagram: Internet Client, Gateway, LVS Router, Real Server1, Real Server2. Labels: DIP 172.16.1.1, VIP 172.16.1.5, 172.16.1.15, 172.16.1.16, 172.16.1.254, 10.10.10.2, Gw 172.16.1.254 (shown twice)]
iptables -t nat -A PREROUTING -p tcp -d 172.16.1.5 --dport 80 -j REDIRECT