
TechFest Ethernet Technical Summary

Copyright © 1999 All Rights Reserved. Do not duplicate or redistribute in any form.


3.0 Ethernet Media Access Control

This section describes the two media access control protocols defined for Ethernet: "half-duplex" and "full-duplex".

3.1 Half-Duplex Ethernet (CSMA/CD Access Protocol)

Half-Duplex Ethernet is the traditional form of Ethernet that uses the CSMA/CD (Carrier Sense Multiple Access with Collision Detection) protocol. With CSMA/CD, two or more stations share a common transmission medium. To transmit a frame, a station must wait for an idle period on the medium when no other station is transmitting. It then transmits the frame by broadcasting it over the medium such that it is "heard" by all the other stations on the network. If another station transmits at the same time, the signals interfere and a "collision" is said to occur. On detecting the collision, the transmitting station intentionally transmits a "jam sequence" to ensure all stations are notified that the frame transmission failed. The station then remains silent for a random period of time before attempting to transmit again. This process is repeated until the frame is eventually transmitted successfully.

The basic rules for transmitting a frame are as follows:

  1. The network is monitored for a "carrier", or presence of a transmitting station. This process is known as "carrier sense".
  2. If an active carrier is detected, then transmission is deferred. The station continues to monitor the network until the carrier ceases.
  3. If an active carrier is not detected, and the period of no carrier is equal to or greater than the interframe gap, then the station immediately begins transmission of the frame.
  4. While the transmitting station is sending the frame, it monitors the medium for a collision.
  5. If a collision is detected, the transmitting station stops sending the frame data and sends a 32-bit "jam sequence". If the collision is detected very early in the frame transmission, the transmitting station completes sending of the frame preamble before starting transmission of the jam sequence. The jam sequence is transmitted to ensure that the length of the collision is sufficient to be noticed by the other transmitting stations.
  6. After sending the jam sequence the transmitting station waits a random period of time chosen using a random number generator before starting the transmission process over from step 1 above. This process is called "backoff". The probability of a repeated collision is reduced by having the colliding stations wait a random period of time before retransmitting.
  7. If repeated collisions occur, then transmission is repeated, but the random delay is increased with each attempt. This further reduces the probability of another collision.
  8. This process repeats until a station transmits a frame without collision. Once a station successfully transmits a frame, it clears the collision counter it uses to increase the backoff time after each repeated collision.
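
The transmit rules above can be sketched in Python. This is an illustrative simulation, not driver code: `medium_collides` is a hypothetical stand-in for the shared medium, and the carrier-sense, interframe-gap, jam, and waiting steps are reduced to comments.

```python
import random

MAX_ATTEMPTS = 16     # after 16 attempts the frame is dropped (step 8 never reached)
BACKOFF_LIMIT = 10    # the backoff range stops growing after 10 attempts

def backoff_slots(attempt):
    """Truncated binary exponential backoff: pick 0 .. 2^k - 1 slot times."""
    k = min(attempt, BACKOFF_LIMIT)
    return random.randint(0, (1 << k) - 1)

def transmit(medium_collides):
    """Sketch of the CSMA/CD transmit loop (steps 1-8 above).

    `medium_collides` models the shared medium: it returns True if this
    transmission attempt suffers a collision.  Returns the number of
    attempts used, or raises on excessive collisions.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        # Steps 1-3: carrier sense and the interframe gap are assumed to
        # have passed; we go straight to transmitting the frame.
        if not medium_collides():         # steps 4-5: collision detection
            return attempt                # step 8: success clears the counter
        # Step 5: a real station would send the 32-bit jam sequence here.
        wait = backoff_slots(attempt)     # steps 6-7: growing random backoff
        # A real station would now remain silent for `wait` slot times.
    raise RuntimeError("excessive collision error: frame dropped")
```

For example, `transmit(lambda: random.random() < 0.3)` models a medium where 30% of attempts collide.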

3.1.1 Slot Time

The "slot time" is a key parameter for half-duplex Ethernet network operation. It is defined as 512 bit times for Ethernet networks operating at 10 and 100 Mb/s, and 4096 bit times for Gigabit Ethernet. In order for each transmitter to reliably detect collisions, the minimum transmission time for a complete frame must be at least one slot time, and the time required for collisions to propagate to all stations on the network must be less than one slot time. Thus, a station cannot finish transmission of a frame before detecting that a collision has occurred.

The signals transmitted by Ethernet stations encounter delays as they travel through the network. These delays consist of the time required for signals to travel across the cable segments, and the logic delays encountered when the signals pass through electronic components in Network Interface Cards (NICs) and repeating hubs. The longer the cable segments and the more hubs in the network, the longer it takes for a signal to propagate from one end of the network to the other. The time it takes a signal to travel between the two stations that are furthest apart in the network is known as the maximum "propagation delay" of the network.

For a station to detect that the frame it is transmitting has encountered a collision, its signal must propagate across the network to another station that detects the collision. This station must transmit a jam sequence to indicate a collision has been detected. The jam sequence must then propagate back across the network before being detected by the transmitting station. The sum of a network's maximum "round trip propagation delay" and the time required to transmit a jam sequence are the components that define the length of the Ethernet slot time.

Slot time is an important parameter for the following reasons:

  1. It establishes the minimum frame transmission time: a frame must take at least one slot time to send, so the transmitter is still transmitting when a collision is detected.
  2. It bounds the maximum size of the network: the round-trip propagation delay plus the time to transmit a jam sequence must fit within one slot time.
  3. It serves as the unit of time for the backoff algorithm described below.

For Gigabit Ethernet, the slot time had to be increased from 512 to 4096 bit times. Due to the higher data rate of Gigabit Ethernet, signals propagate only a very small distance within 512 bit times. At gigabit speeds a 512-bit slot time would support a maximum network size of about 20 meters. A network that small is clearly impractical, so the concept of "carrier extension" was introduced to increase the slot time to 4096 bits. By increasing the size of the slot time and limiting the number of repeaters in a network to one, a network size of 200 meters can be supported by Gigabit Ethernet.
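
A rough calculation shows why the original 512-bit slot time is too small at gigabit speed. The sketch below considers only cable propagation delay, assuming a signal speed of about 2×10^8 m/s in copper; real budgets also include NIC and repeater logic delays, which is why the practical figures (about 20 m and 200 m) are smaller than these cable-only bounds.

```python
C_CABLE = 2.0e8   # assumed signal speed in copper, roughly 0.66 c (m/s)

def max_diameter_m(slot_bits, rate_bps):
    """Cable-only upper bound on network diameter for a given slot time.

    The round-trip propagation delay must fit within one slot time:
        2 * d / C_CABLE <= slot_bits / rate_bps
    Solving for d gives the bound below.  Real networks are smaller,
    because logic delays also consume part of the slot-time budget.
    """
    slot_seconds = slot_bits / rate_bps
    return C_CABLE * slot_seconds / 2

print(f"10 Mb/s,  512 bits: {max_diameter_m(512, 10e6):7.1f} m")
print(f"1 Gb/s,   512 bits: {max_diameter_m(512, 1e9):7.1f} m")   # why 512 bits is too small
print(f"1 Gb/s,  4096 bits: {max_diameter_m(4096, 1e9):7.1f} m")  # the extended slot time
```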

3.1.2 Backoff

Backoff is the process by which a transmitting station determines how long to wait following a collision before attempting to retransmit the frame. If all stations waited the same length of time before retransmission, then another collision would inevitably occur. This is avoided by having each station generate a random number which determines the length of time it must wait before testing the carrier. This time period is known as the station's "backoff delay".

The backoff algorithm implemented in Ethernet is officially known as "truncated binary exponential backoff". Following a collision, each station generates a random number that falls within a specified range of values. It then waits that number of slot times before attempting retransmission. The range of values increases exponentially after each failed retransmission. For the first attempt the range is 0 to 1; for the second attempt, 0 to 3; for the third, 0 to 7 and so on. If repeated collisions occur, the range continues to expand until after 10 attempts when it reaches 0 to 1023. After that the range of values stays fixed from 0 to 1023. If a station is unsuccessful in transmitting after 16 attempts, the MAC function reports an "excessive collision error". The frame being transmitted is then dropped, requiring that application software detect its loss and initiate a retransmission.
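
The growth of the backoff range can be expressed directly; this short sketch reproduces the ranges quoted above.

```python
def backoff_range(attempt):
    """Upper bound of the backoff range after the nth collision (1-based).

    The station picks a random number of slot times from 0 to this bound,
    inclusive.  The range doubles with each attempt but is truncated at
    2**10 - 1 = 1023.
    """
    return min(2 ** attempt, 2 ** 10) - 1

# The ranges quoted above: 0-1, 0-3, 0-7, ... fixed at 0-1023 from attempt 10 on.
for n in (1, 2, 3, 10, 11, 15):
    print(f"attempt {n:2d}: backoff range 0-{backoff_range(n)}")
```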

Binary exponential backoff results in minimum delays before retransmission when traffic on the LAN is light. When traffic is high, repeated collisions cause the range of numbers to increase, thus lessening the chance of further collisions. In a network where the traffic is extremely high, repeated collisions will begin to cause excessive collision errors to be generated. Excessive collision errors are an indication that the traffic load has increased to the point that it can no longer be efficiently handled on a single Ethernet network.

3.1.3 Capture Effect

When the network is operating under a heavy load, the binary exponential backoff algorithm can exhibit an unfairness problem known as the "capture effect". The problem stems from the handling of the collision counters. Each station updates its collision counter independently, and only the winner of a contention zeroes its collision counter after a successful packet transmission. This approach benefits a single busy station, permitting it to "capture" the network for an extended period of time.

A simple example of the capture effect consists of two stations that have a lot of data to send and can send data as fast as allowed. They both collide on their first transmission attempt and choose a backoff of 0 or 1. Station A chooses 0, and station B chooses 1. Station A gets to transmit while station B waits for one slot time. After Station A completes its transmission and the interframe gap passes, both stations are ready to transmit again and another collision occurs. This is station A's first collision for this frame, so it chooses a backoff of 0 or 1. However, this is station B's second collision for this frame, so it chooses a backoff between 0 and 3. Thus station A has a higher probability of transmitting while station B waits again. If they happen to pick the same number and collide again, the odds for station B get even worse.

The same scenario can repeat over and over again possibly ending only when station A's queue is finally empty or when station B finally reaches 16 attempts. After 16 attempts station B will reset its collision counter allowing it to compete more aggressively again. But it also discards the frame it was attempting to transmit, requiring that it be queued for transmission again by software.
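
A small Monte Carlo sketch illustrates the capture effect. Two saturated stations contend each round; the winner clears its collision counter while the loser's keeps growing (capped at 15 here rather than modeling the 16-attempt frame drop, a simplification). The estimated probability that the previous winner wins again comes out well above one half.

```python
import random

def capture_probability(rounds=20000, seed=1):
    """Estimate P(previous winner wins again) for two saturated stations.

    On a collision, each station draws a backoff from its own
    truncated-exponential range; the smaller draw transmits, and ties
    re-collide with both counters incremented.
    """
    rng = random.Random(seed)
    counters = [0, 0]          # per-station collision counters
    prev = None
    repeats = total = 0
    for _ in range(rounds):
        while True:
            draws = [rng.randint(0, 2 ** min(c + 1, 10) - 1) for c in counters]
            if draws[0] != draws[1]:
                break
            counters = [min(c + 1, 15) for c in counters]   # tie: collide again
        winner = draws.index(min(draws))
        if prev is not None:
            total += 1
            repeats += winner == prev
        prev = winner
        counters[winner] = 0                                # winner clears its counter
        counters[1 - winner] = min(counters[1 - winner] + 1, 15)
    return repeats / total

print(f"previous winner wins again: {capture_probability():.2f}")
```

With the winner drawing from 0-1 and the loser from at least 0-3, the winner retains the channel in at least five of the six non-tie outcomes, which is why the estimate sits far above 0.5.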

In 1994 a new backoff algorithm called "binary logarithmic arbitration method" (BLAM) was proposed to alleviate the capture effect problem. An IEEE 802.3w working group was formed to add BLAM as an optional feature of the Ethernet standard. Although simulation results proved that BLAM offered a definite improvement over the binary exponential backoff algorithm, the work to incorporate it into the Ethernet standard was never completed due to a shift in focus to full-duplex Ethernet and a lack of interest in updating the half-duplex hardware.

3.2 Full-Duplex Ethernet

The release of the IEEE 802.3x standard defined a second mode of operation for Ethernet, called "full-duplex", that bypasses the CSMA/CD protocol. The CSMA/CD protocol is "half-duplex": a station may either transmit or receive data, but never both at the same time. Full-duplex mode allows two stations to simultaneously exchange data over a point-to-point link that provides independent transmit and receive paths. Since each station can simultaneously transmit and receive data, the aggregate throughput of the link is effectively doubled. A 10 Mb/s station operating in full-duplex mode provides a maximum bandwidth of 20 Mb/s, and a full-duplex 100 Mb/s station provides 200 Mb/s of bandwidth.

Full-duplex operation is restricted to links meeting the following criteria:

  1. The transmission medium must provide independent transmit and receive paths that can be active simultaneously.
  2. The link must be a point-to-point connection between exactly two stations, so there is no contention for a shared medium.
  3. Both stations must be capable of, and configured for, full-duplex operation.

Full-duplex operation offers several major advantages:

  1. The aggregate bandwidth of the link is effectively doubled, since both directions can carry traffic at once.
  2. Collisions cannot occur, so the CSMA/CD overhead of deferral, jam sequences, and backoff is eliminated.
  3. Link length is no longer constrained by the round-trip timing requirements of the slot time; maximum distance is limited only by the characteristics of the transmission medium.

3.2.1 PAUSE Frames

The addition of full-duplex mode to the Ethernet standard included an optional flow control operation known as "PAUSE" frames. PAUSE frames permit one end station to temporarily stop all traffic from the other end station (except MAC Control frames).

For example, assume a full-duplex link that connects two devices called "Station A" and "Station B". Suppose Station A transmits frames at a rate that causes Station B to enter into a state of congestion (i.e. no buffer space remaining to receive additional frames). Station B may transmit a PAUSE frame to Station A requesting that Station A stop transmitting frames for a specified period of time. Upon receiving the PAUSE frame, Station A will suspend further frame transmission until the specified time period has elapsed. This will allow Station B time to recover from the congestion state. At the end of the specified time period, Station A will resume normal transmission of frames.

Note that the PAUSE frame protocol is bi-directional. Station A may send frames to pause Station B, and Station B may send frames to pause Station A. A PAUSE frame is the one type of frame that a station is allowed to send even if it is currently in the paused state. Support for PAUSE frames is optional among devices that implement the full-duplex protocol (the use of PAUSE frames is not supported in a half-duplex environment). It is valid for a device to support only half of the protocol; i.e. it may transmit PAUSE frames without having the capability to decode them on the receive side, and vice-versa. Devices use the Auto-Negotiation protocol to discover the PAUSE frame capabilities of the device at the other end of the link. This permits interoperability between devices that do or do not support one or both halves of the protocol.

The format of a PAUSE frame is illustrated below. It conforms to the standard Ethernet frame format but includes a unique type field and other parameters as follows:

  Preamble (7 bytes)
  Start Frame Delimiter (1 byte)
  Destination MAC Address (6 bytes) = 01-80-C2-00-00-01 (reserved multicast address), or the unique DA of the destination station
  Source MAC Address (6 bytes)
  Length/Type (2 bytes) = 88-08 (802.3 MAC Control)
  MAC Control Opcode (2 bytes) = 00-01 (PAUSE)
  MAC Control Parameter (2 bytes) = 00-00 to FF-FF (pause time, in units of 512 bit times)
  Reserved (42 bytes) = all zeros
  Frame Check Sequence (4 bytes)
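
As a sketch, the PAUSE frame layout can be assembled in Python. The destination address, type field, and opcode are the standard 802.3x constants; the Ethernet FCS is the standard CRC-32, which Python's `zlib.crc32` computes, transmitted least-significant byte first.

```python
import struct
import zlib

PAUSE_DEST = bytes.fromhex("0180c2000001")   # reserved multicast address
MAC_CONTROL_TYPE = 0x8808                    # Length/Type for 802.3 MAC Control
PAUSE_OPCODE = 0x0001

def build_pause_frame(src_mac: bytes, pause_time: int) -> bytes:
    """Build a PAUSE frame, excluding the preamble and start frame delimiter.

    `pause_time` is in units of 512 bit times, 0x0000-0xFFFF.
    """
    body = PAUSE_DEST + src_mac
    body += struct.pack("!HHH", MAC_CONTROL_TYPE, PAUSE_OPCODE, pause_time)
    body += bytes(42)                        # reserved field, padded with zeros
    fcs = zlib.crc32(body) & 0xFFFFFFFF
    return body + fcs.to_bytes(4, "little")  # FCS sent least-significant byte first

# The source MAC below is a made-up example address.
frame = build_pause_frame(bytes.fromhex("00a0c9112233"), pause_time=0xFFFF)
print(len(frame))   # 64 bytes: the minimum Ethernet frame size
```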

3.2.2 Link Aggregation

"Link Aggregation", or "Trunking", is another Ethernet feature that applies only to the full-duplex mode of operation. It provides for increased link availability and bandwidth between two Ethernet stations by allowing multiple "physical" links to be combined to operate as a single "logical" link. The Link Aggregation specification was recently developed by the IEEE 802.3ad Working Group. It is in its final stages of approval and is expected to be formally released as an addition to the Ethernet standard in early 2000.

Prior to Link Aggregation it was difficult, if not impossible, to have multiple links between two Ethernet stations. The "spanning tree" algorithm used in Ethernet bridging disables parallel paths to prevent "loops" in the network. An end station could have multiple Ethernet links only if the links were attached to different networks, or to different VLANs within a network.

Link Aggregation resolves this limitation by allowing multiple parallel links between any two Ethernet stations. The links may be between two switches, between a switch and a server, or between a switch and an end-user station. The following advantages are provided:

  1. Increased bandwidth: the capacity of the individual links is combined into a single logical link.
  2. Increased availability: if one link in the group fails, its traffic is redistributed across the remaining links.
  3. Incremental growth: bandwidth can be increased by adding links to the group, rather than jumping to the next higher link speed.

Link Aggregation operates by adding a new layer of function between the Ethernet MACs and the higher layer protocols above. Each of the underlying Ethernet ports in an aggregated group transmits and receives frames with its own unique MAC address. As frames pass through the Link Aggregation layer, addresses are manipulated so the aggregated ports appear as a single link with one MAC address. This makes the Link Aggregation function completely transparent to all higher layer protocols and functions including the spanning tree algorithm, VLANs, SNMP and routers.

As the Link Aggregation layer distributes frames among the multiple links within the group, it must ensure the frames arrive at the other end in the correct order. To do this, the Link Aggregation algorithm creates sessions, called "conversations", that consist of Ethernet frames with identical sources and destinations. All frames from a conversation are restricted to a single link within the aggregation group. By grouping traffic in this fashion, frames are guaranteed to arrive at the destination in the proper sequence.
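
The conversation-to-link mapping can be sketched as a hash of the address pair. The standard does not mandate a particular distribution function; the CRC-based choice below is illustrative only.

```python
import zlib

def pick_link(src_mac: bytes, dst_mac: bytes, n_links: int) -> int:
    """Map a conversation (source/destination pair) to one physical link.

    Hashing the address pair keeps every frame of a conversation on the
    same link, which preserves frame order at the receiver.
    """
    return zlib.crc32(src_mac + dst_mac) % n_links

# All frames between the same pair of stations always use the same link
# (the addresses below are made-up examples):
a = bytes.fromhex("00a0c9000001")
b = bytes.fromhex("00a0c9000002")
assert pick_link(a, b, 4) == pick_link(a, b, 4)
```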

The use of Link Aggregation is restricted as follows:

  1. All links in an aggregated group must operate in full-duplex mode.
  2. All links in a group must operate at the same data rate.
  3. All links in a group must connect the same pair of devices; aggregation is strictly a point-to-point function.

