Measuring network throughput

Throughput of a network can be measured using various tools available on different platforms. This page explains the theory behind what these tools set out to measure and the issues regarding these measurements.

Reasons for measuring throughput in networks

People are often concerned about measuring the maximum data throughput in bits per second of a communications link or network access. A typical method of performing a measurement is to transfer a 'large' file from one system to another system and measure the time required to complete the transfer or copy of the file. The throughput is then calculated by dividing the file size by the time to get the throughput in megabits, kilobits, or bits per second.
Unfortunately, such an exercise usually yields the goodput, which is less than the maximum theoretical data throughput, leading people to believe that their communications link is not operating correctly.
In fact, many overheads beyond the transmission overheads affect the result, including latency, TCP Receive Window size and system limitations, so the calculated goodput does not reflect the maximum achievable throughput.
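The file-transfer measurement described above can be sketched as follows. This is a minimal illustration, not a measurement tool: the function name and the sample figures (10,000,000 bytes in 10 seconds) are hypothetical, and the result is the goodput seen by the application rather than the raw capacity of the link.

```python
def throughput_bits_per_second(file_size_bytes: int, transfer_seconds: float) -> float:
    """Observed throughput (goodput) of a timed file transfer, in bit/s."""
    if transfer_seconds <= 0:
        raise ValueError("transfer time must be positive")
    # File sizes are in bytes, link rates in bits: multiply by 8.
    return file_size_bytes * 8 / transfer_seconds


# Hypothetical measurement: 10,000,000 bytes copied in 10 seconds.
rate = throughput_bits_per_second(10_000_000, 10.0)
print(f"{rate / 1e6:.1f} Mbit/s")  # prints "8.0 Mbit/s"
```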

Theory: Short summary

The maximum bandwidth can be calculated as follows:
$$\text{Max bandwidth} = \frac{\text{RWIN}}{\text{RTT}}$$
where RWIN is the TCP Receive Window and RTT is the round-trip time for the path.
The maximum TCP window size in the absence of the TCP window scale option is 65,535 bytes. Example: Max Bandwidth = 65,535 bytes / 0.220 s = 297,886.36 B/s × 8 = 2.383 Mbit/s. Over a single TCP connection between those endpoints, the tested bandwidth will be restricted to this value even if the contracted bandwidth is greater.
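The window-limited bound (receive window divided by round-trip time) can be checked numerically. The sketch below uses the figures from the example, a 65,535-byte window and a 0.220 s round trip; the function name is an illustrative assumption.

```python
def max_tcp_throughput_bps(rwin_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput, in bit/s.

    At most one receive window of data can be in flight per round trip.
    """
    return rwin_bytes * 8 / rtt_seconds


limit = max_tcp_throughput_bps(65_535, 0.220)
print(f"{limit / 1e6:.3f} Mbit/s")  # prints "2.383 Mbit/s"
```

Doubling the round-trip time halves the bound, which is why long-distance paths often test well below their contracted rate on a single connection.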

Bandwidth test software

Bandwidth test software is used to determine the maximum bandwidth of a network or internet connection. A test typically attempts to download or upload the maximum amount of data in a certain period of time, or a certain amount of data in the minimum amount of time. For this reason, bandwidth tests can delay other traffic on the internet connection while they run, and can cause inflated data charges.

Nomenclature

The throughput of communications links is measured in bits per second, kilobits per second, megabits per second and gigabits per second. In this application, kilo, mega and giga are the standard SI prefixes indicating multiplication by 1000, 1,000,000 and 1,000,000,000.
File sizes are typically measured in bytes, with kilobytes, megabytes and gigabytes being usual, where a byte is eight bits. In modern textbooks one kilobyte is defined as 1000 bytes, one megabyte as 1,000,000 bytes, etc., in accordance with the 1998 International Electrotechnical Commission standard. However, the convention adopted by Windows systems is to define 1 kilobyte as 1024 bytes, which is equal to 1 kibibyte. Similarly, a file size of 1 megabyte is 1024 × 1024 bytes, equal to 1 mebibyte, and 1 gigabyte is 1024 × 1024 × 1024 bytes, equal to 1 gibibyte.

Confusing and inconsistent use of suffixes

It is usual for people to abbreviate commonly used expressions. For file sizes, it is usual for someone to say that they have a 64 k file, or a 100 meg file. When talking about circuit bit rates, people will interchangeably use the terms throughput, bandwidth and speed, and refer to a circuit as being a 64 k circuit, or a 2 meg circuit, meaning 64 kbit/s or 2 Mbit/s. However, a 64 k circuit will not transmit a 64 k file in one second. This may not be obvious to those unfamiliar with telecommunications and computing, so misunderstandings sometimes arise. In fact, a 64 kilobyte file is 64 × 1024 × 8 bits in size and the 64 k circuit will transmit bits at a rate of 64 × 1000 bit/s, so the time taken to transmit a 64 kilobyte file over the 64 k circuit will be at least (64 × 1024 × 8) / (64 × 1000) seconds, which works out to be 8.192 seconds.
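The arithmetic for the 64 k file on the 64 k circuit can be reproduced in a few lines. This is only a sketch of the unit conversion: the constant and function names are illustrative, and the file size uses the 1024-byte kilobyte convention while the link rate uses the SI prefix.

```python
KIBIBYTE = 1024  # bytes, the binary convention used for file sizes
KILOBIT = 1000   # bits, the SI convention used for link rates


def min_transfer_seconds(file_bytes: int, link_bits_per_second: int) -> float:
    """Theoretical minimum transfer time, ignoring all protocol overhead."""
    return file_bytes * 8 / link_bits_per_second


t = min_transfer_seconds(64 * KIBIBYTE, 64 * KILOBIT)
print(f"{t:.3f} s")  # prints "8.192 s"
```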

Compression

Some equipment can improve matters by compressing the data as it is sent. This is a feature of most analog modems and of several popular operating systems. If the 64 k file can be shrunk by compression, the time taken to transmit can be reduced. This can be done invisibly to the user, so a highly compressible file may be transmitted considerably faster than expected. As this invisible compression cannot easily be disabled, when measuring throughput by timing file transfers one should use files that cannot be compressed. Typically, this is done using a file of random data, which becomes harder to compress the closer to truly random it is.
Assuming your data cannot be compressed, the 8.192 seconds to transmit a 64-kilobyte file over a 64 kbit/s communications link is a theoretical minimum time that will not be achieved in practice. This is due to the effect of overheads, which are used to format the data in an agreed manner so that both ends of a connection have a consistent view of the data.
There are at least two issues that aren't immediately obvious for transmitting compressed files:

The throughput of the network itself isn't improved by compression. From the end-to-end perspective compression does improve throughput. That's because information content for the same amount of transmission is increased through compression of files.
Compressing files at the server and client takes more processor resources at both ends. The server has to use its processor to compress the files, if they aren't already compressed. The client has to decompress the files upon receipt. This can be considered an expense for the benefit of increased end-to-end throughput.
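One way to check that a test payload really is incompressible, as recommended above, is to try compressing it before using it. The sketch below uses Python's standard zlib module as a stand-in for whatever compression a modem or operating system might apply; that substitution, the helper name and the 64 KiB size are assumptions for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; near 1.0 means incompressible."""
    return len(zlib.compress(data, 9)) / len(data)


random_payload = os.urandom(64 * 1024)   # 64 KiB from the OS random source
repetitive_payload = b"A" * (64 * 1024)  # 64 KiB of a single repeated byte

print(f"random:     {compression_ratio(random_payload):.3f}")
print(f"repetitive: {compression_ratio(repetitive_payload):.3f}")
```

The random payload should come out at roughly its original size, while the repetitive one shrinks to a small fraction, which is the effect that would distort a timed transfer.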

Overheads and data formats

A common communications link used by many people is the asynchronous start-stop, or just asynchronous, serial link. If you have an external modem attached to your home or office computer, the chances are that the connection is over an asynchronous serial connection. Its advantage is that it is simple — it can be implemented using only three wires: Send, Receive and Signal Ground. In an RS-232 interface, an idle connection has a continuous negative voltage applied. A zero bit is represented as a positive voltage difference with respect to the Signal Ground and a one bit is a negative voltage with respect to signal ground, thus indistinguishable from the idle state. This means you need to know when a one bit starts to distinguish it from idle. This is done by agreeing in advance how fast data will be transmitted over a link, then using a start bit to signal the start of a byte — this start bit will be a zero bit. Stop bits are one bits i.e., negative voltage.
Actually, more things will have been agreed in advance — the speed of bit transmission, the number of bits per character, the parity and the number of stop bits. So a designation of 9600-8-E-2 would be 9600 bits per second, with eight bits per character, even parity and two stop bits.
A common set-up of an asynchronous serial connection would be 9600-8-N-1, a total of 10 bits transmitted to send one 8-bit character. The two framing bits make up 20% of the bits sent, so a 9600 bit/s asynchronous serial link will not transmit data at 9600/8 bytes per second but, in this case, at 9600/10 bytes per second, which is considerably slower than expected.
It can get worse. If parity is specified and we use 2 stop bits, the overhead for carrying one 8-bit character is 4 bits (start, parity and two stop bits), or 50% relative to the data bits. In this case a 9600 bit/s connection will carry 9600/12 byte/s. Asynchronous serial interfaces commonly support considerably higher bit rates; whatever the rate, with no parity and one stop bit the byte transmission rate is one tenth of the bit rate.
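The character rates for the two line settings discussed above follow directly from the number of bits sent per character. A minimal sketch, with an illustrative function name and parameters:

```python
def async_chars_per_second(bit_rate: int, data_bits: int = 8,
                           parity_bits: int = 0, stop_bits: int = 1) -> float:
    """Characters per second on an asynchronous start-stop serial link."""
    # Every character is framed by exactly one start bit.
    bits_per_character = 1 + data_bits + parity_bits + stop_bits
    return bit_rate / bits_per_character


print(async_chars_per_second(9600))                             # 9600-8-N-1: 960.0
print(async_chars_per_second(9600, parity_bits=1, stop_bits=2)) # 9600-8-E-2: 800.0
```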
The advantage of the asynchronous serial connection is its simplicity. One disadvantage is its low efficiency in carrying data. This can be overcome by using a synchronous interface. In this type of interface, a clock signal is added on a separate wire, and the bits are transmitted in synchrony with the clock, so the interface no longer has to look for the start and stop bits of each individual character. However, it is necessary to have a mechanism to ensure the sending and receiving clocks are kept in synchrony, so data is divided up into frames of multiple characters separated by known delimiters. There are three common coding schemes for framed communications: HDLC, PPP, and Ethernet.

HDLC

When using HDLC, rather than each byte having a start, optional parity, and one or two stop bits, the bytes are gathered together into a frame. The start and end of the frame are signalled by the 'flag', and error detection is carried out by the frame check sequence. If the frame has a maximum-sized address of 32 bits, a maximum-sized control part of 16 bits and a maximum-sized frame check sequence of 16 bits, the overhead per frame could be as high as 64 bits. If each frame carried but a single byte, the data throughput efficiency would be extremely low. However, the bytes are normally gathered together, so that even with a maximal overhead of 64 bits, frames carrying more than 32 bytes are more efficient than an 8-N-1 asynchronous serial connection, which spends 2 overhead bits on every 8 data bits. As frames can carry different numbers of data bytes, the overhead of an HDLC connection is not a fixed proportion of the data.
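The efficiency comparison between HDLC framing and asynchronous serial can be sketched numerically. The code below assumes the worst-case 64 bits of per-frame overhead quoted above, ignores flag bytes and bit stuffing, and compares against an 8-N-1 asynchronous link (8 data bits in every 10 sent); the names are illustrative.

```python
HDLC_OVERHEAD_BITS = 64        # worst case: 32 address + 16 control + 16 FCS
ASYNC_8N1_EFFICIENCY = 8 / 10  # 8 data bits per 10 bits on the wire


def hdlc_efficiency(payload_bytes: int,
                    overhead_bits: int = HDLC_OVERHEAD_BITS) -> float:
    """Fraction of transmitted bits that are payload in one HDLC frame."""
    payload_bits = payload_bytes * 8
    return payload_bits / (payload_bits + overhead_bits)


for n in (1, 16, 32, 64, 1500):
    verdict = "better" if hdlc_efficiency(n) > ASYNC_8N1_EFFICIENCY else "not better"
    print(f"{n:5d} bytes: {hdlc_efficiency(n):.3f} ({verdict} than 8-N-1)")
```

Under these assumptions the two schemes break even at a 32-byte payload, and HDLC pulls ahead for anything larger.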

PPP

The Point-to-Point Protocol is defined by the Internet Request For Comments documents RFC 1570, RFC 1661 and RFC 1662. With respect to the framing of packets, PPP is quite similar to HDLC, but supports both bit-oriented and byte-oriented methods of delimiting frames while maintaining data transparency.

Ethernet

Ethernet is a "local area network" technology, which is also framed. The way the frame is electrically defined on a connection between two systems differs from the way it is defined in the typically wide-area networking technologies that use HDLC or PPP, but these details are not important for throughput calculations. Ethernet is a shared medium, so there is no guarantee that the two systems transferring a file between themselves will have exclusive access to the connection. If several systems are attempting to communicate simultaneously, the throughput between any pair can be substantially lower than the nominal bandwidth available.

Other low-level protocols

Dedicated point-to-point links are not the only option for many connections between systems. Frame Relay, ATM, and MPLS based services can also be used. When calculating or estimating data throughputs, the details of the frame/cell/packet format and the technology's detailed implementation need to be understood.

Frame relay

Frame Relay uses a modified HDLC format to define the frame format that carries data.

Asynchronous Transfer Mode

Asynchronous Transfer Mode uses a radically different method of carrying data. Rather than using variable-length frames or packets, data is carried in fixed size cells. Each cell is 53 bytes long, with the first 5 bytes defined as the header, and the following 48 bytes as payload. Data networking commonly requires packets of data that are larger than 48 bytes, so there is a defined adaptation process that specifies how larger packets of data should be divided up in a standard manner to be carried by the smaller cells. This process varies according to the data carried, so in ATM nomenclature, there are different ATM Adaptation Layers. The process defined for most data is named ATM Adaptation Layer No. 5 or AAL5.
Understanding throughput on ATM links requires a knowledge of which ATM adaptation layer has been used for the data being carried.
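To illustrate why the adaptation layer matters, the sketch below estimates how many 53-byte cells a packet needs under AAL5. It assumes the usual AAL5 behaviour of appending an 8-byte trailer and padding the result to a whole number of 48-byte cell payloads; those two details are not stated in the text above, and the function names are illustrative.

```python
ATM_CELL_BYTES = 53      # 5-byte header + 48-byte payload
ATM_PAYLOAD_BYTES = 48
AAL5_TRAILER_BYTES = 8   # assumption: standard AAL5 trailer size


def aal5_cells(packet_bytes: int) -> int:
    """Number of ATM cells needed to carry one packet over AAL5."""
    total = packet_bytes + AAL5_TRAILER_BYTES
    # Integer ceiling division: the last cell is padded out to 48 bytes.
    return -(-total // ATM_PAYLOAD_BYTES)


def aal5_efficiency(packet_bytes: int) -> float:
    """Packet bytes divided by bytes actually sent on the wire."""
    return packet_bytes / (aal5_cells(packet_bytes) * ATM_CELL_BYTES)


for size in (40, 41, 1500):
    print(f"{size:5d} bytes: {aal5_cells(size):3d} cells, "
          f"efficiency {aal5_efficiency(size):.3f}")
```

A 40-byte packet fits in one cell, but a 41-byte packet spills into a second, so efficiency on ATM depends sharply on packet size.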

MPLS

Multiprotocol Label Switching adds a standard tag or header known as a 'label' to existing packets of data. In certain situations it is possible to use MPLS in a 'stacked' manner, so that labels are added to packets that have already been labelled. Connections between MPLS systems can also be 'native', with no underlying transport protocol, or MPLS labelled packets can be carried inside frame relay or HDLC packets as payloads. Correct throughput calculations need to take such configurations into account. For example, a data packet could have two MPLS labels attached via 'label-stacking', then be placed as payload inside an HDLC frame. This generates more overhead to be taken into account than a single MPLS label attached to a packet which is then sent 'natively', with no underlying protocol, to a receiving system.
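The two configurations in the example can be compared with a rough calculation. The sketch assumes each MPLS label occupies 4 bytes (one 32-bit label stack entry) and reuses the worst-case 64 bits of HDLC overhead from the HDLC section; both figures and all names are assumptions for illustration, and real deployments add further overheads.

```python
MPLS_LABEL_BYTES = 4     # assumption: one 32-bit label stack entry
HDLC_OVERHEAD_BYTES = 8  # assumption: worst-case 64 bits per HDLC frame


def encapsulation_overhead_bytes(labels: int, inside_hdlc: bool) -> int:
    """Bytes added to a packet by MPLS labels and optional HDLC framing."""
    overhead = labels * MPLS_LABEL_BYTES
    if inside_hdlc:
        overhead += HDLC_OVERHEAD_BYTES
    return overhead


def efficiency(payload_bytes: int, labels: int, inside_hdlc: bool) -> float:
    """Fraction of transmitted bytes that belong to the original packet."""
    overhead = encapsulation_overhead_bytes(labels, inside_hdlc)
    return payload_bytes / (payload_bytes + overhead)


print(encapsulation_overhead_bytes(2, inside_hdlc=True))   # 16 bytes
print(encapsulation_overhead_bytes(1, inside_hdlc=False))  # 4 bytes
print(f"{efficiency(64, 2, True):.3f} vs {efficiency(64, 1, False):.3f}")
```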

Higher-level protocols

Few systems transfer files and data by simply copying the contents of the file into the 'Data' field of HDLC or PPP frames — another protocol layer is used to format the data inside the 'Data' field of the HDLC or PPP frame. The most commonly used such protocol is Internet Protocol, defined by RFC 791. This imposes its own overheads.
Again, few systems simply copy the contents of files into IP packets, but use yet another protocol that manages the connection between two systems: TCP, defined by RFC 793. This adds its own overhead.
Finally, another protocol layer manages the actual data transfer process. A commonly used protocol for this is the File Transfer Protocol.
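The combined effect of the IP and TCP layers on a single packet can be estimated as below. The sketch assumes the minimum header sizes of 20 bytes for IPv4 and 20 bytes for TCP, with no options; it ignores link-layer framing, acknowledgements and anything added by the file transfer protocol itself, and the names are illustrative.

```python
IPV4_HEADER_BYTES = 20  # assumption: minimum IPv4 header, no options
TCP_HEADER_BYTES = 20   # assumption: minimum TCP header, no options


def tcp_ip_efficiency(payload_bytes: int) -> float:
    """Fraction of an IP packet that is application data."""
    headers = IPV4_HEADER_BYTES + TCP_HEADER_BYTES
    return payload_bytes / (payload_bytes + headers)


for payload in (1, 100, 1460):
    print(f"{payload:5d} payload bytes: {tcp_ip_efficiency(payload):.3f}")
```

Large segments spread the 40 bytes of headers over more data, so bulk transfers lose only a few percent here, while small interactive packets are dominated by header overhead.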