UNIT – 3: Network Layer
Subject: CCICE, Bhawanipatna
Notes by Soumya Sourabha Patnaik, B.Tech (CSE)

Network Layer
• It is responsible for host-to-host delivery and for routing packets through the routers or switches.
• Delivering data from host to host requires a routing mechanism.
• Whether the network layer provides a datagram service (in which case different packets between a given source-destination pair may take different routes) or a virtual-circuit service (in which case all packets between the pair follow the same path), the network layer must nonetheless determine the path for a packet. This is the job of the network layer routing protocol.
• The network protocol in the Internet is called the Internet Protocol, or more commonly, IP.

IP (Internet Protocol)
• A host (also called an end system) has one link into the network. When IP in the host wants to send a datagram, it passes the datagram to its link. The boundary between the host and the link is called the interface.
• A router is fundamentally different from a host in that it has two or more links that connect to it. When a router forwards a datagram, it forwards the datagram over one of its links.
• The boundary between the router and any one of its links is also called an interface. Thus, a router has multiple interfaces, one for each of its links. Because every interface (for a host or router) is capable of sending and receiving IP datagrams, IP requires each interface to have an IP address.
• Each IP address is 32 bits (equivalently, four bytes) long. IP addresses are typically written in so-called "dot-decimal notation", whereby each byte of the address is written in its decimal form and is separated from the next by a period. For example, a typical IP address would be 193.32.216.9. The 193 is the decimal equivalent of the first 8 bits of the address; the 32 is the decimal equivalent of the second 8 bits of the address, etc.
Thus, the address 193.32.216.9 in binary notation is:
11000001 00100000 11011000 00001001
(A space has been added between the bytes for readability.) Because each IP address is 32 bits long, there are 2^32 possible IP addresses.
• In the above figure, one router interconnects three LANs.
• Each of these LANs is called an IP network or, more simply, a "network". There are several things to observe from this diagram. First, the router has three interfaces, labeled 1, 2 and 3. Each of the router interfaces has its own IP address, as shown in the figure.
• In other words, each address has two parts: the first part (the first three bytes in this example) specifies the network, and the second part (the last byte in this example) addresses a specific host on the network.

Address Resolution Protocol (ARP)
Address Resolution Protocol (ARP) is a telecommunications protocol used for resolution of network layer addresses into link layer addresses, a critical function in multiple-access networks.

Packet structure
The Address Resolution Protocol uses a simple message format that contains one address resolution request or response. The size of the ARP message depends on the upper layer and lower layer address sizes, which are given by the type of networking protocol. The message header specifies these types, as well as the size of addresses of each. The message header is completed with the operation code: 1 for a request and 2 for a reply. The payload of the packet consists of four addresses: the hardware and protocol addresses of the sender and receiver hosts.
Internet Protocol (IPv4) over Ethernet ARP packet
bit offset   0 – 7                                           8 – 15
0            Hardware type (HTYPE)
16           Protocol type (PTYPE)
32           Hardware address length (HLEN)                  Protocol address length (PLEN)
48           Operation (OPER)
64           Sender hardware address (SHA) (first 16 bits)
80           (next 16 bits)
96           (last 16 bits)
112          Sender protocol address (SPA) (first 16 bits)
128          (last 16 bits)
144          Target hardware address (THA) (first 16 bits)
160          (next 16 bits)
176          (last 16 bits)
192          Target protocol address (TPA) (first 16 bits)
208          (last 16 bits)

Hardware type (HTYPE)
This field specifies the link-layer (hardware) protocol type. Example: Ethernet is 1.
Protocol type (PTYPE)
This field specifies the internetwork protocol for which the ARP request is intended. For IPv4, this has the value 0x0800. The permitted PTYPE values share a numbering space with those for EtherType.
Hardware length (HLEN)
This field stores the length (in octets) of a hardware address. Example: an Ethernet address is 6 octets long.
Protocol length (PLEN)
This field stores the length (in octets) of addresses used in the upper layer protocol. Example: an IPv4 address is 4 octets long.
Operation
This field specifies the operation that the sender is performing: 1 for request, 2 for reply.
Sender hardware address (SHA)
This field stores the media address of the sender.
Sender protocol address (SPA)
This field stores the internetwork address of the sender.
Target hardware address (THA)
This field stores the media address of the intended receiver. This field is ignored in requests.
Target protocol address (TPA)
This field stores the internetwork address of the intended receiver.
ARP protocol parameter values have been standardized and are maintained by the Internet Assigned Numbers Authority (IANA).
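The field layout above can be made concrete by packing an ARP request with Python's standard `struct` module. This is only a sketch: the function name `build_arp_request` and the MAC and IP addresses are made up for illustration, and the packet is built in memory, not sent on a network.

```python
import socket
import struct

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request for IPv4 over Ethernet (hypothetical helper)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",              # "!" = network byte order, no padding
        1,                             # HTYPE: Ethernet
        0x0800,                        # PTYPE: IPv4
        6,                             # HLEN: Ethernet address length in octets
        4,                             # PLEN: IPv4 address length in octets
        1,                             # OPER: 1 = request
        sender_mac,                    # SHA
        socket.inet_aton(sender_ip),   # SPA
        b"\x00" * 6,                   # THA: ignored in requests
        socket.inet_aton(target_ip),   # TPA
    )

# Illustrative addresses only.
packet = build_arp_request(bytes.fromhex("001122334455"), "192.168.0.1", "192.168.0.2")
print(len(packet), packet.hex())
```

The resulting message is 28 bytes long, which matches the table: the last field (TPA) starts at bit offset 192 and occupies 32 bits.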
IPv4 Packet structure
An IP packet consists of a header section and a data section.

Header
The IPv4 packet header consists of 14 fields, of which 13 are required. The 14th field is optional and is named options. The most significant bit (MSB) is numbered 0, so the version field, for example, is found in the four most significant bits of the first byte.

bit offset    fields (bit positions within each 32-bit word)
0             Version (0–3), Header Length (4–7), Differentiated Services Code Point (8–13), Explicit Congestion Notification (14–15), Total Length (16–31)
32            Identification (0–15), Flags (16–18), Fragment Offset (19–31)
64            Time to Live (0–7), Protocol (8–15), Header Checksum (16–31)
96            Source IP Address
128           Destination IP Address
160           Options (if Header Length > 5)
160 or 192+   Data

Version:
The first header field in an IP packet is the four-bit version field. For IPv4, this has a value of 4 (hence the name IPv4).
Internet Header Length (IHL):
The second field (4 bits) is the Internet Header Length (IHL), which gives the number of 32-bit words in the header. The minimum value for this field is 5, which is a length of 5×32 = 160 bits = 20 bytes. Being a 4-bit value, the maximum length is 15 words (15×32 bits), or 480 bits = 60 bytes.
Differentiated Services Code Point (DSCP):
Originally defined as the Type of Service field, this field is now defined by RFC 2474 for Differentiated Services (DiffServ). New technologies are emerging that require real-time data streaming and therefore make use of the DSCP field. An example is Voice over IP (VoIP), which is used for interactive voice exchange.
Explicit Congestion Notification (ECN):
Defined in RFC 3168, this field allows end-to-end notification of network congestion without dropping packets. ECN is an optional feature that is only used when both endpoints support it and are willing to use it.
It is only effective when supported by the underlying network.
Total Length:
This 16-bit field defines the entire datagram size, including header and data, in bytes. The minimum-length datagram is 20 bytes (20-byte header + 0 bytes data) and the maximum is 65,535 bytes, the maximum value of a 16-bit word. The minimum size datagram that any host is required to be able to handle is 576 bytes, but most modern hosts handle much larger packets. Sometimes subnetworks impose further restrictions on the size, in which case datagrams must be fragmented. In IPv4, fragmentation is handled in either the host or the packet switch (router).
Identification:
This field is primarily used for uniquely identifying the fragments of an original IP datagram. Some experimental work has suggested using the ID field for other purposes, such as adding packet-tracing information to datagrams in order to help trace back datagrams with spoofed source addresses.
Flags:
A three-bit field follows and is used to control or identify fragments. They are (in order, from high order to low order):
• bit 0: Reserved; must be zero.
• bit 1: Don't Fragment (DF)
• bit 2: More Fragments (MF)
If the DF flag is set and fragmentation is required to route the packet, then the packet is dropped. This can be used when sending packets to a host that does not have sufficient resources to handle fragmentation. It can also be used for Path MTU Discovery, either automatically by the host IP software, or manually using diagnostic tools such as ping or traceroute. For unfragmented packets, the MF flag is cleared. For fragmented packets, all fragments except the last have the MF flag set. The last fragment has a non-zero Fragment Offset field, differentiating it from an unfragmented packet.
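The MF and fragment-offset rules above can be sketched as a small decoder. It assumes the three flag bits and the 13-bit Fragment Offset share one 16-bit word, as laid out in the header table; the names `decode_flags_word` and `classify` and the sample values are illustrative, not taken from any library.

```python
def decode_flags_word(word: int) -> dict:
    """Split the 16-bit Flags + Fragment Offset word into its parts."""
    return {
        "reserved": bool(word & 0x8000),   # bit 0: must be zero
        "DF": bool(word & 0x4000),         # bit 1: Don't Fragment
        "MF": bool(word & 0x2000),         # bit 2: More Fragments
        "fragment_offset": word & 0x1FFF,  # remaining 13 bits
    }

def classify(word: int) -> str:
    """Tell fragments apart using only MF and the offset field."""
    fields = decode_flags_word(word)
    if fields["MF"]:
        return "first fragment" if fields["fragment_offset"] == 0 else "middle fragment"
    return "unfragmented" if fields["fragment_offset"] == 0 else "last fragment"

for sample in (0x4000, 0x2000, 0x20B9, 0x00B9):
    print(hex(sample), decode_flags_word(sample), classify(sample))
```

A receiver needs both fields: MF alone cannot separate the last fragment from an unfragmented packet, because both have it cleared.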
Fragment Offset:
The fragment offset field, measured in units of eight-byte blocks, is 13 bits long and specifies the offset of a particular fragment relative to the beginning of the original unfragmented IP datagram. The first fragment has an offset of zero. This allows a maximum offset of (2^13 – 1) × 8 = 65,528 bytes, which would exceed the maximum IP packet length of 65,535 bytes with the header length included (65,528 + 20 = 65,548 bytes).
Time to Live (TTL)
An eight-bit time to live field helps prevent datagrams from persisting (e.g. going in circles) on an internet. This field limits a datagram's lifetime. It is specified in seconds, but time intervals less than 1 second are rounded up to 1. With the latencies typical in practice, it has come to be a hop count field. Each router that a datagram crosses decrements the TTL field by one. When the TTL field hits zero, the packet is no longer forwarded by a packet switch and is discarded. Typically, an ICMP Time Exceeded message is sent back to the sender to inform it that the packet has been discarded. The reception of these ICMP messages is at the heart of how traceroute works.
Protocol
This field defines the protocol used in the data portion of the IP datagram. The Internet Assigned Numbers Authority maintains a list of IP protocol numbers, which was originally defined in RFC 790.
Header Checksum
The 16-bit checksum field is used for error-checking of the header. At each hop, the checksum of the header must be compared to the value of this field. If a header checksum is found to be mismatched, then the packet is discarded. Errors in the data field must be handled by the encapsulated protocol; both UDP and TCP have checksum fields. As the TTL field is decremented on each hop, a new checksum must be computed each time.
The method used to compute the checksum is defined by RFC 1071: the checksum field is the 16-bit one's complement of the one's complement sum of all 16-bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero.
For example, take the 20-byte IP header (in hex) 4500003044224000800600008c7c19acae241e2b:
4500 + 0030 + 4422 + 4000 + 8006 + 0000 + 8c7c + 19ac + ae24 + 1e2b = 2BBCF
Adding the carry back into the low 16 bits: 2 + BBCF = BBD1 = 1011101111010001
The one's complement of this sum is 0100010000101110 = 442E, which is the checksum.
To validate a header's checksum the same algorithm may be used: the checksum of a header which contains a correct checksum field is a word containing all zeros (value 0):
2BBCF + 442E = 2FFFD
2 + FFFD = FFFF
The one's complement of FFFF is 0.
Source address
An IPv4 address indicating the sender of the packet. Note that this address may be changed in transit by a network address translation device.
Destination address
An IPv4 address indicating the receiver of the packet. As with the source address, this may be changed in transit by a network address translation device.
Options
Additional header fields may follow the destination address field, but these are not often used. Note that the value in the IHL field must include enough extra 32-bit words to hold all the options (plus any padding needed to ensure that the header contains an integral number of 32-bit words). The list of options may be terminated with an EOL (End of Options List, 0x00) option; this is only necessary if the end of the options would not otherwise coincide with the end of the header. The possible options that can be put in the header are as follows:

IPv6
Internet Protocol version 6 (IPv6) is a version of the Internet Protocol (IP). It is designed to succeed the Internet Protocol version 4 (IPv4).
The Internet operates by transferring data between hosts in small packets that are independently routed across networks as specified by an international communications protocol known as the Internet Protocol.

Comparison to IPv4
IPv6 specifies a new packet format, designed to minimize packet header processing by routers. Because the headers of IPv4 packets and IPv6 packets are significantly different, the two protocols are not interoperable. However, in most respects, IPv6 is a conservative extension of IPv4. Most transport and application-layer protocols need little or no change to operate over IPv6.

Packet Structure of IPv6
The IPv6 packet is composed of two parts:
• The packet header
• The payload
The header consists of a fixed portion with the minimal functionality required for all packets and may contain optional extensions to implement special features. The fixed header occupies the first 40 octets (320 bits) of the IPv6 packet. It contains the source and destination addresses, traffic classification options, a hop counter, and a pointer to extension headers, if any. The Next Header field, present in each extension as well, points to the next element in the chain of extensions. The last field points to the upper-layer protocol that is carried in the packet's payload.
Extension headers carry options that are used for special treatment of a packet in the network, e.g., for routing, fragmentation, and security using the IPsec framework.
The payload can have a size of up to 64 KB without special options, or larger with a jumbo payload option in a Hop-By-Hop Options extension header.
Unlike in IPv4, fragmentation is handled only in the end points of a communication session; routers never fragment a packet, and hosts are expected to use Path MTU Discovery to select a packet size that can traverse the entire communications path.
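As a rough illustration of the 40-octet fixed header, the sketch below packs one with Python's `struct` and `ipaddress` modules. The notes do not list the individual field widths, so the layout used here (4-bit version, 8-bit traffic class, 20-bit flow label, 16-bit payload length, 8-bit next header, 8-bit hop limit, two 128-bit addresses) is taken from the standard IPv6 header definition. The helper name `build_ipv6_fixed_header` and the sample values are assumptions made for the example.

```python
import ipaddress
import struct

def build_ipv6_fixed_header(payload_len: int, next_header: int, hop_limit: int,
                            src: str, dst: str,
                            traffic_class: int = 0, flow_label: int = 0) -> bytes:
    """Pack the fixed part of an IPv6 header (hypothetical helper)."""
    # Version (6), traffic class and flow label share the first 32-bit word.
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return struct.pack(
        "!IHBB16s16s",
        first_word,
        payload_len,                         # payload size in octets
        next_header,                         # next extension or upper-layer protocol
        hop_limit,                           # the hop counter
        ipaddress.IPv6Address(src).packed,   # 128-bit source address
        ipaddress.IPv6Address(dst).packed,   # 128-bit destination address
    )

# Sample values: next header 17 is UDP; the addresses use the documentation prefix.
header = build_ipv6_fixed_header(8, 17, 64, "2001:db8::1", "2001:db8::2")
print(len(header), header.hex())
```

With no extension headers, the Next Header field names the upper-layer protocol directly, which is the end of the chain described above.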
Address Format
IPv6 addresses have two logical parts: a 64-bit network prefix and a 64-bit host address part. (The host address is often automatically generated from the interface MAC address.) An IPv6 address is represented by 8 groups of 16-bit hexadecimal values separated by colons (:), as follows:
2001:0db8:85a3:0000:0000:8a2e:0370:7334
The hexadecimal digits are case-insensitive.
The 128-bit IPv6 address can be abbreviated with the following rules:
• Rule one: Leading zeroes within a 16-bit value may be omitted. For example, the address fe80:0000:0000:0000:0202:b3ff:fe1e:8329 may be written as fe80:0:0:0:202:b3ff:fe1e:8329
• Rule two: One group of consecutive zeroes within an address may be replaced by a double colon. For example, fe80:0:0:0:202:b3ff:fe1e:8329 becomes fe80::202:b3ff:fe1e:8329

Tunneling
In order to reach the IPv6 Internet, an isolated host or network must use the existing IPv4 infrastructure to carry IPv6 packets. This is done using a technique known as tunneling, which consists of encapsulating IPv6 packets within IPv4, in effect using IPv4 as a link layer for IPv6.
The direct encapsulation of IPv6 datagrams within IPv4 packets is indicated by IP protocol number 41. IPv6 can also be encapsulated within UDP packets, e.g. in order to cross a router or NAT device that blocks protocol 41 traffic.

ICMP
The Internet Control Message Protocol (ICMP) is one of the core protocols of the Internet Protocol Suite. It is chiefly used by the operating systems of networked computers to send error messages indicating, for example, that a requested service is not available or that a host or router could not be reached. ICMP can also be used to relay query messages.
ICMP for Internet Protocol version 4 (IPv4) is also known as ICMPv4. IPv6 has a similar protocol, ICMPv6.
ICMP messages are typically generated in response to errors in IP datagrams or for diagnostic or routing purposes. ICMP errors are always reported to the original source IP address of the originating datagram. Each ICMP message is encapsulated directly within a single IP datagram, and thus ICMP is unreliable.

ICMP segment structure
Header
The ICMP header starts after the IPv4 header. All ICMP packets have an 8-byte header and a variable-sized data section. The first 4 bytes of the header are consistent: the first byte is the ICMP type, the second byte is the ICMP code, and the third and fourth bytes are a checksum of the entire ICMP message. The contents of the remaining 4 bytes of the header vary based on the ICMP type and code.
ICMP error messages contain a data section that includes the entire IP header plus the first 8 bytes of data from the IP datagram that caused the error message. The ICMP datagram is then encapsulated in a new IP datagram.
• Type – ICMP type as specified below.
• Code – Subtype of the given type.
• Checksum – Error checking data, calculated from the ICMP header and data with the value 0 substituted for this field. The checksum algorithm is specified in RFC 1071.
• Rest of Header – Four-byte field whose contents vary based on the ICMP type and code.
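The header layout and the RFC 1071 checksum can be tried out together in a short sketch. The checksum function follows the description in these notes: sum the 16-bit words, fold the carries back in, and take the one's complement. The Echo Request layout is an assumption drawn from the ICMP specification, not from these notes: type 8, code 0, and an identifier and sequence number in the rest-of-header field. The names `internet_checksum` and `build_echo_request` are illustrative.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                          # pad odd-length input
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                           # fold carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP Echo Request (type 8, code 0); hypothetical helper."""
    # First pass: checksum field set to 0, as the notes require.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    # Second pass: same header with the computed checksum filled in.
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

# The IPv4 header from the worked checksum example earlier in these notes.
ipv4_header = bytes.fromhex("4500003044224000800600008c7c19acae241e2b")
print(hex(internet_checksum(ipv4_header)))

packet = build_echo_request(0x1234, 1, b"\x00" * 32)
print(len(packet), internet_checksum(packet))
```

Running the checksum over a message that already carries a correct checksum yields 0, which is how a receiver validates it.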
Padding data
Padding data follows the ICMP header (in octets):
• Windows "ping.exe" adds, by default, 32 bytes of padding
• The Linux "ping" utility adds, by default, 56 bytes of padding

List of permitted control messages (incomplete list)
Type                                    Code   Description
0 – Echo Reply                          0      Echo reply (used to ping)
1 and 2                                        Reserved
3 – Destination Unreachable             0      Destination network unreachable
                                        1      Destination host unreachable
                                        2      Destination protocol unreachable
                                        3      Destination port unreachable
                                        4      Fragmentation required, and DF flag set
                                        5      Source route failed
                                        6      Destination network unknown
                                        7      Destination host unknown
                                        8      Source host isolated
                                        9      Network administratively prohibited
                                        10     Host administratively prohibited
                                        11     Network unreachable for TOS
                                        12     Host unreachable for TOS
                                        13     Communication administratively prohibited
4 – Source Quench                       0      Source quench (congestion control)
5 – Redirect Message                    0      Redirect Datagram for the Network
                                        1      Redirect Datagram for the Host
                                        2      Redirect Datagram for the TOS & network
                                        3      Redirect Datagram for the TOS & host
6                                              Alternate Host Address
7                                              Reserved
8 – Echo Request                        0      Echo request (used to ping)
9 – Router Advertisement                0      Router Advertisement
10 – Router Solicitation                0      Router discovery/selection/solicitation
11 – Time Exceeded                      0      TTL expired in transit
                                        1      Fragment reassembly time exceeded
12 – Parameter Problem: Bad IP header   0      Pointer indicates the error
                                        1      Missing a required option
                                        2      Bad length
13 – Timestamp                          0      Timestamp
14 – Timestamp Reply                    0      Timestamp reply
15 – Information Request                0      Information Request
16 – Information Reply                  0      Information Reply
17 – Address Mask Request               0      Address Mask Request
18 – Address Mask Reply                 0      Address Mask Reply
19                                             Reserved for security
20 through 29                                  Reserved for robustness experiment
30 – Traceroute                         0      Information Request
31                                             Datagram Conversion Error
32                                             Mobile Host Redirect
33                                             Where-Are-You (originally meant for IPv6)
34                                             Here-I-Am (originally meant for IPv6)
35                                             Mobile Registration Request
36                                             Mobile Registration Reply
37                                             Domain Name Request
38                                             Domain Name Reply
39                                             SKIP Algorithm Discovery Protocol, Simple Key-Management for Internet Protocol
40                                             Photuris, Security failures
41                                             ICMP for experimental mobility protocols such as Seamoby [RFC4065]
42 through 255                                 Reserved

Transport Layer

UDP (User Datagram Protocol)
With UDP, computer applications can send messages, in this case referred to as datagrams, to other hosts on an Internet Protocol (IP) network without requiring prior communications to set up special transmission channels or data paths. The protocol was designed by David P. Reed in 1980 and formally defined in RFC 768.
UDP uses a simple transmission model without implicit handshaking dialogues for providing reliability, ordering, or data integrity. Thus, UDP provides an unreliable service, and datagrams may arrive out of order, appear duplicated, or go missing without notice. UDP assumes that error checking and correction is either not necessary or performed in the application, avoiding the overhead of such processing at the network interface level. Time-sensitive applications often use UDP because dropping packets is preferable to waiting for delayed packets, which may not be an option in a real-time system.

Service ports
UDP applications use datagram sockets to establish host-to-host communications. An application binds a socket to its endpoint of data transmission, which is a combination of an IP address and a service port. A port is a software structure that is identified by the port number, a 16-bit integer value, allowing for port numbers between 0 and 65535.
Port 0 is reserved, but is a permissible source port value if the sending process does not expect messages in response.

Packet structure
UDP is a minimal message-oriented Transport Layer protocol that is documented in IETF RFC 768.
UDP provides no guarantees to the upper layer protocol for message delivery, and the UDP protocol layer retains no state of UDP messages once sent. For this reason, UDP is sometimes referred to as Unreliable Datagram Protocol.
UDP provides application multiplexing (via port numbers) and integrity verification (via checksum) of the header and payload. If transmission reliability is desired, it must be implemented in the user's application.

offset (bits)   0 – 15               16 – 31
0               Source Port Number   Destination Port Number
32              Length               Checksum
64+             Data

The UDP header consists of 4 fields, each of which is 2 bytes (16 bits).[1] The use of two of those (source port and checksum) is optional in IPv4. In IPv6 only the source port is optional (see below).
Source port number
This field identifies the sender's port when meaningful and should be assumed to be the port to reply to if needed. If not used, then it should be zero. If the source host is the client, the port number is likely to be an ephemeral port number. If the source host is the server, the port number is likely to be a well-known port number.
Destination port number
This field identifies the receiver's port and is required. Similar to the source port number, if the client is the destination host then the port number will likely be an ephemeral port number, and if the destination host is the server then the port number will likely be a well-known port number.
Length
A field that specifies the length in bytes of the entire datagram: header and data. The minimum length is 8 bytes, since that is the length of the header.
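The four 16-bit header fields can be illustrated by packing a datagram by hand. This is a sketch under stated assumptions: the helper name `build_udp_datagram`, the port numbers and the payload are invented for the example, and the checksum is left at zero, which over IPv4 means "not computed".

```python
import struct

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prefix `payload` with an 8-byte UDP header (hypothetical helper)."""
    length = 8 + len(payload)            # Length covers header and data
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload

# Illustrative values: a high-numbered client port sending to port 53.
datagram = build_udp_datagram(49152, 53, b"hello")
src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
print(src_port, dst_port, length, checksum)
```

An empty payload gives a Length of exactly 8, the minimum mentioned above.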
The field size sets a theoretical limit of 65,535 bytes (8-byte header + 65,527 bytes of data) for a UDP datagram. The practical limit for the data length, which is imposed by the underlying IPv4 protocol, is 65,507 bytes (65,535 − 8-byte UDP header − 20-byte IP header).
Checksum
The checksum field is used for error-checking of the header and data. If no checksum is generated by the transmitter, the field uses the value all-zeros.[5] This field is not optional for IPv6.

APPLICATIONS OF UDP
Numerous key Internet applications use UDP, including: the Domain Name System (DNS), where queries must be fast and only consist of a single request followed by a single reply packet; the Simple Network Management Protocol (SNMP); the Routing Information Protocol (RIP)[1]; and the Dynamic Host Configuration Protocol (DHCP).
Voice and video traffic is generally transmitted using UDP. Real-time video and audio streaming protocols are designed to handle occasional lost packets, so only slight degradation in quality occurs, rather than the large delays that would result if lost packets were retransmitted. Because both TCP and UDP run over the same network, many businesses are finding that a recent increase in UDP traffic from these real-time applications is hindering the performance of applications using TCP, such as point of sale, accounting, and database systems. When TCP detects packet loss, it throttles back its data rate. Since both real-time and business applications are important to businesses, developing quality of service solutions is seen as crucial by some.

Comparison of UDP and TCP
Transmission Control Protocol is a connection-oriented protocol, which means that it requires handshaking to set up end-to-end communications. Once a connection is set up, user data may be sent bi-directionally over the connection.
• Reliable – TCP manages message acknowledgment, retransmission and timeout. Multiple attempts to deliver the message are made. If part of the data gets lost along the way, the lost part will be retransmitted. In TCP, there is either no missing data, or, in case of multiple timeouts, the connection is dropped.
• Ordered – If two messages are sent over a connection in sequence, the first message will reach the receiving application first. When data segments arrive in the wrong order, TCP buffers the out-of-order data until all data can be properly re-ordered and delivered to the application.
• Heavyweight – TCP requires three packets to set up a socket connection before any user data can be sent. TCP handles reliability and congestion control.
• Streaming – Data is read as a byte stream; no distinguishing indications are transmitted to signal message (segment) boundaries.
UDP is a simpler message-based connectionless protocol. Connectionless protocols do not set up a dedicated end-to-end connection. Communication is achieved by transmitting information in one direction from source to destination without verifying the readiness or state of the receiver. One primary benefit of UDP over TCP is its application to voice over internet protocol (VoIP), where any handshaking would hinder clear voice communication. In VoIP over UDP, it is assumed that the end users provide any necessary real-time confirmation that the message has been received.
• Unreliable – When a message is sent, it cannot be known if it will reach its destination; it could get lost along the way. There is no concept of acknowledgment, retransmission or timeout.
• Not ordered – If two messages are sent to the same recipient, the order in which they arrive cannot be predicted.
• Lightweight – There is no ordering of messages, no tracking of connections, etc. It is a small transport layer designed on top of IP.
• Datagrams – Packets are sent individually and are checked for integrity only if they arrive. Packets have definite boundaries which are honored upon receipt, meaning a read operation at the receiver socket will yield an entire message as it was originally sent.
• No congestion control – UDP itself does not avoid congestion, and it is possible for high bandwidth applications to trigger congestion collapse unless they implement congestion control measures at the application level.

TCP NETWORK CONGESTION
Transmission Control Protocol (TCP) uses a network congestion avoidance algorithm that includes various aspects of an additive increase/multiplicative decrease (AIMD) scheme, with other schemes such as slow-start, in order to achieve congestion avoidance.
The TCP congestion avoidance algorithm is the primary basis for congestion control in the Internet.

Naming history
Two variations of the algorithm are those offered by TCP Tahoe and Reno. The two algorithms were retrospectively named after the 4.3BSD operating system releases in which each first appeared (which were themselves named after Lake Tahoe and the city of Reno, Nevada). The “Tahoe” algorithm first appeared in 4.3BSD-Tahoe (which was made to support the CCI Power 6/32 “Tahoe” minicomputer), and was made available to non-AT&T licensees as part of the “4.3BSD Networking Release 1”; this ensured its wide distribution and implementation. Improvements, described below, were made in 4.3BSD-Reno and subsequently released to the public as “Networking Release 2” and later 4.4BSD-Lite. The “TCP Foo” names for the algorithms appear to have originated in a 1996 paper by Kevin Fall and Sally Floyd.[6]

TCP Tahoe and Reno
To avoid congestion collapse, TCP uses a multi-faceted congestion control strategy. For each connection, TCP maintains a congestion window, limiting the total number of unacknowledged packets that may be in transit end-to-end.
This is somewhat analogous to TCP's sliding window used for flow control. TCP uses a mechanism called slow start[7] to increase the congestion window after a connection is initialized and after a timeout. It starts with a window of two times the maximum segment size (MSS). Although the initial rate is low, the rate of increase is very rapid: for every packet acknowledged, the congestion window increases by 1 MSS so that the congestion window effectively doubles for every round trip time (RTT). When the congestion window exceeds a threshold ssthresh the algorithm enters a new state, called congestion avoidance. In some implementations (e.g., Linux), the initial ssthresh is large, and so the first slow start usually ends after a loss. However, ssthresh is updated at the end of each slow start, and will often affect subsequent slow starts triggered by timeouts. Congestion avoidance: As long as non-duplicate ACKs are received, the congestion window is additively increased by one MSS every round trip time. When a packet is lost, the likelihood of duplicate ACKs being received is very high (it's possible though unlikely that the stream just underwent extreme packet reordering, which would also prompt duplicate ACKs). The behavior of Tahoe and Reno differ in how they detect and react to packet loss: UIT – 3: etwork Layer Subject: CCICE, Bhawanipatna otes by Soumya Sourabha Patnaik, B.Tech (CSE) 22 • Tahoe: Triple duplicate ACKS are treated the same as a timeout. Tahoe will perform "fast retransmit", reduce congestion window to 1 MSS, and reset to slow-start state.[8] • Reno: If three duplicate ACKs are received (i.e., four ACKs acknowledging the same packet, which are not piggybacked on data, and do not change the receiver's advertised window), Reno will halve the congestion window, perform a fast retransmit, and enter a phase called Fast Recovery. If an ACK times out, slow start is used as it is with Tahoe.[8] Fast Recovery. 
(Reno Only) In this state, TCP retransmits the missing packet that was signaled by three duplicate ACKs, and waits for an acknowledgment of the entire transmit window before returning to congestion avoidance. If there is no acknowledgment, TCP Reno experiences a timeout and enters the slow-start state.

Both algorithms reduce the congestion window to 1 MSS on a timeout event.

TCP Vegas

Until the mid-1990s, all of TCP's set timeouts and measured round-trip delays were based upon only the last transmitted packet in the transmit buffer. University of Arizona researchers Larry Peterson and Lawrence Brakmo introduced TCP Vegas, in which timeouts were set and round-trip delays were measured for every packet in the transmit buffer. In addition, TCP Vegas uses additive increases in the congestion window. This variant was not widely deployed outside Peterson's laboratory. However, TCP Vegas was deployed as the default congestion control method for DD-WRT firmwares v24 SP2.

TCP New Reno

TCP New Reno improves retransmission during the fast recovery phase of TCP Reno. During fast recovery, for every duplicate ACK that is returned to TCP New Reno, a new unsent packet from the end of the congestion window is sent, to keep the transmit window full. For every ACK that makes partial progress in the sequence space, the sender assumes that the ACK points to a new hole, and the next packet beyond the ACKed sequence number is sent.

Because the timeout timer is reset whenever there is progress in the transmit buffer, New Reno can fill large holes, or multiple holes, in the sequence space, much like TCP SACK. Because New Reno can send new packets at the end of the congestion window during fast recovery, high throughput is maintained during the hole-filling process, even when there are multiple holes of multiple packets each. When TCP enters fast recovery it records the highest outstanding unacknowledged packet sequence number.
When this sequence number is acknowledged, TCP returns to the congestion avoidance state.

A problem occurs with New Reno when there are no packet losses but, instead, packets are reordered by more than 3 packet sequence numbers. When this happens, New Reno mistakenly enters fast recovery, but when the reordered packet is delivered, ACK sequence-number progress occurs and, from there until the end of fast recovery, every bit of sequence-number progress produces a duplicate and needless retransmission that is immediately ACKed.

New Reno performs as well as SACK at low packet error rates, and substantially outperforms Reno at high error rates.

TCP Hybla

TCP Hybla aims to eliminate the penalization of TCP connections that incorporate a high-latency terrestrial or satellite radio link, which arises from their longer round trip times. It stems from an analytical evaluation of the congestion window dynamics, which suggests the necessary modifications to remove the performance dependence on RTT.

TCP BIC

Binary Increase Congestion control (BIC) is an implementation of TCP with an optimized congestion control algorithm for high-speed networks with high latency (called LFN, long fat networks, in RFC 1072). BIC is used by default in Linux kernels 2.6.8 through 2.6.18.

TCP CUBIC

CUBIC is a less aggressive and more systematic derivative of BIC, in which the window is a cubic function of time since the last congestion event, with the inflection point set to the window prior to the event. CUBIC is used by default in Linux kernels since version 2.6.19.

Compound TCP

Compound TCP is a Microsoft implementation of TCP which maintains two different congestion windows simultaneously, with the goal of achieving good performance on LFNs while not impairing fairness.
It has been widely deployed with Microsoft Windows Vista and Windows Server 2008 and has been ported to older Microsoft Windows versions as well as Linux.
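
The slow-start, congestion-avoidance and loss-reaction rules described earlier for TCP Tahoe and Reno can be sketched as a small round-by-round simulation. This is only an illustrative sketch, not a real TCP implementation. The function name `simulate`, its parameters, the one-step-per-RTT granularity, the cap at ssthresh during slow start, and the rule that ssthresh is set to half the window on every loss are all assumptions made for the example. Fast recovery is collapsed into a single step.

```python
def simulate(variant, rounds, dupack_rounds=(), timeout_rounds=(),
             initial_ssthresh=64):
    """Return the congestion window (in MSS) seen at each round trip.

    variant        -- "tahoe" or "reno" (hypothetical labels for this sketch)
    dupack_rounds  -- rounds in which three duplicate ACKs are observed
    timeout_rounds -- rounds in which a retransmission timeout occurs
    """
    if variant not in ("tahoe", "reno"):
        raise ValueError("variant must be 'tahoe' or 'reno'")

    cwnd = 2                     # initial window of two MSS, as in the text
    ssthresh = initial_ssthresh
    history = []

    for rtt in range(rounds):
        history.append(cwnd)

        if rtt in timeout_rounds:
            # Both variants fall back to 1 MSS and slow start on a timeout.
            ssthresh = max(cwnd // 2, 2)   # assumed ssthresh update
            cwnd = 1
        elif rtt in dupack_rounds:
            ssthresh = max(cwnd // 2, 2)   # assumed ssthresh update
            if variant == "tahoe":
                cwnd = 1                   # treated like a timeout
            else:
                cwnd = ssthresh            # Reno halves the window
        elif cwnd < ssthresh:
            # Slow start: the window doubles every RTT (capped here for
            # simplicity so it never overshoots ssthresh).
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: additive increase of 1 MSS per RTT.
            cwnd += 1

    return history


if __name__ == "__main__":
    for name in ("tahoe", "reno"):
        print(name, simulate(name, 8, dupack_rounds={4}, initial_ssthresh=16))
```

With a triple duplicate ACK in round 4, when the window has reached 17 MSS, the Tahoe trace restarts from 1 MSS in slow start, while the Reno trace continues from 8 MSS in congestion avoidance.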