Border Gateway Protocol


Border Gateway Protocol (BGP) is a standardized exterior gateway protocol designed to exchange routing and reachability information among autonomous systems (AS) on the Internet. BGP is classified as a path-vector routing protocol, and it makes routing decisions based on paths, network policies, or rule-sets configured by a network administrator.
BGP used for routing within an autonomous system is called Interior Border Gateway Protocol (iBGP). In contrast, the Internet application of the protocol is called Exterior Border Gateway Protocol (eBGP).

History

Half a year after its initial publication, the protocol definition was revised in 1990. In October 1991, BGP version 3 was defined, obsoleting the two earlier versions. In 1994, the current version, BGP4, was published. Its definition was replaced in March 1995. In January 2006, RFC 4271 was published, which is currently the latest definition of BGP4.
RFC 4271 corrected errors, clarified ambiguities and updated the specification with common industry practices. The major enhancement of BGP4 was the support for Classless Inter-Domain Routing and use of route aggregation to decrease the size of routing tables.
In its native form, the BGP4 protocol can only work with IPv4 addresses. Since the publication of the Multiprotocol Extensions in 1998, routing information about a wide range of "address families" can be carried. The Multiprotocol Extensions were updated in 2000 and again in 2007. With these extensions, the protocol is also referred to as Multiprotocol BGP.

Operation

BGP neighbors, called peers, are established by manual configuration among routers to create a TCP session on port 179. To maintain the connection, a BGP speaker sends 19-byte Keepalive messages every 30 seconds. Among routing protocols, BGP is unique in using TCP as its transport protocol.
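The 19-byte size of a Keepalive message follows from the fixed BGP message header: a 16-byte marker, a 2-byte length and a 1-byte type, with no body. The following is a minimal sketch of building and checking such a message; the function names are illustrative and not taken from any particular BGP implementation.

```python
import struct
from typing import Tuple

BGP_PORT = 179          # TCP port on which BGP sessions are established
MARKER = b"\xff" * 16   # 16-byte marker field, all bits set
HEADER_LEN = 19         # marker (16) + length (2) + type (1)
TYPE_KEEPALIVE = 4      # message type code for Keepalive


def build_keepalive() -> bytes:
    """Return a Keepalive message: a bare header with no body."""
    return MARKER + struct.pack("!HB", HEADER_LEN, TYPE_KEEPALIVE)


def parse_header(data: bytes) -> Tuple[int, int]:
    """Return (length, type) read from the first 19 bytes of a message."""
    if len(data) < HEADER_LEN or data[:16] != MARKER:
        raise ValueError("not a valid BGP message header")
    length, msg_type = struct.unpack("!HB", data[16:HEADER_LEN])
    return length, msg_type


keepalive = build_keepalive()
```

A real speaker would write these bytes to the established TCP session at each keepalive interval; the sketch only covers the encoding.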
When BGP runs between two peers in the same autonomous system, it is referred to as Internal BGP (iBGP). When it runs between different autonomous systems, it is called External BGP (eBGP). Routers on the boundary of one AS that exchange information with another AS are called border routers, edge routers, or simply eBGP peers. They are typically connected directly, while iBGP peers can be interconnected through other intermediate routers. Other deployment topologies are also possible, such as running eBGP peering inside a VPN tunnel, allowing two remote sites to exchange routing information in a secure and isolated manner.
The main difference between iBGP and eBGP peering lies in how routes received from one peer are propagated, by default, to other peers:
  • New routes learned from an eBGP peer are re-advertised to all iBGP and eBGP peers.
  • New routes learned from an iBGP peer are re-advertised to all eBGP peers only.
These route-propagation rules effectively require that all iBGP peers inside an AS are interconnected in a full mesh with iBGP sessions.
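The two default propagation rules above can be sketched as a small function. This is an illustration only: the Peer type and the function name are hypothetical, and the sketch ignores policy, route reflectors and confederations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peer:
    name: str
    kind: str  # "ebgp" or "ibgp"


def readvertise_targets(learned_from: Peer, peers: List[Peer]) -> List[Peer]:
    """Return the peers a newly learned route is re-advertised to by default."""
    others = [p for p in peers if p != learned_from]
    if learned_from.kind == "ebgp":
        # Learned from eBGP: pass on to all iBGP and eBGP peers.
        return others
    # Learned from iBGP: pass on to eBGP peers only.
    return [p for p in others if p.kind == "ebgp"]


peers = [Peer("upstream", "ebgp"), Peer("core1", "ibgp"), Peer("core2", "ibgp")]
```

Because a route learned from core1 is never passed on to core2, core2 only hears about it if it has its own iBGP session with the router that first learned the route, which is why a full mesh is needed.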
How routes are propagated can be controlled in detail via the route-maps mechanism. This mechanism consists of a set of rules. Each rule describes, for routes matching some given criteria, what action should be taken. The action could be to drop the route, or it could be to modify some attributes of the route before inserting it in the routing table.
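As a rough sketch of this rule-based mechanism, a route map can be modelled as an ordered list of rules, each pairing a match criterion with an action. Everything here is hypothetical: the Route fields, the example prefixes and community value, and in particular the choice to let unmatched routes pass unchanged, which real implementations may handle differently.

```python
from dataclasses import dataclass, replace
from ipaddress import ip_network
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int = 100                     # illustrative attribute
    communities: FrozenSet[str] = frozenset()


@dataclass
class Rule:
    match: Callable[[Route], bool]
    action: Callable[[Route], Optional[Route]]  # returning None drops the route


def apply_route_map(rules: List[Rule], route: Route) -> Optional[Route]:
    """Apply the first matching rule; None means the route was dropped."""
    for rule in rules:
        if rule.match(route):
            return rule.action(route)
    return route  # assumption: routes matching no rule pass unchanged


rules = [
    # Drop anything inside 10.0.0.0/8.
    Rule(match=lambda r: ip_network(r.prefix).subnet_of(ip_network("10.0.0.0/8")),
         action=lambda r: None),
    # Modify an attribute of routes tagged with a given community.
    Rule(match=lambda r: "65000:100" in r.communities,
         action=lambda r: replace(r, local_pref=200)),
]
```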

Extensions negotiation

During the peering handshake, when OPEN messages are exchanged, BGP speakers can negotiate optional capabilities of the session, including multiprotocol extensions and various recovery modes. If the multiprotocol extensions to BGP are negotiated when the session is created, the BGP speaker can prefix the Network Layer Reachability Information (NLRI) it advertises with an address family prefix. These families include IPv4, IPv6, IPv4/IPv6 Virtual Private Networks and multicast BGP. Increasingly, BGP is used as a generalized signaling protocol to carry information about routes that may not be part of the global Internet, such as VPNs.
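To illustrate, the multiprotocol capability carried in an OPEN message identifies one address family as an AFI/SAFI pair. The sketch below encodes that capability (code 1, a 4-byte value holding the AFI, a reserved byte and the SAFI) and treats the negotiated set as the families both sides announced. The helper names are illustrative, and the enclosing optional-parameter framing of the OPEN message is omitted.

```python
import struct
from typing import Iterable, Set, Tuple

AFI_IPV4, AFI_IPV6 = 1, 2             # address family identifiers
SAFI_UNICAST, SAFI_MULTICAST = 1, 2   # subsequent address family identifiers
CAP_MULTIPROTOCOL = 1                 # capability code for multiprotocol extensions

Family = Tuple[int, int]              # (AFI, SAFI)


def encode_mp_capability(afi: int, safi: int) -> bytes:
    """Encode one multiprotocol capability as code, length, value."""
    value = struct.pack("!HBB", afi, 0, safi)  # AFI, reserved byte, SAFI
    return struct.pack("!BB", CAP_MULTIPROTOCOL, len(value)) + value


def negotiated_families(local: Iterable[Family], remote: Iterable[Family]) -> Set[Family]:
    """Families usable on the session: those announced by both speakers."""
    return set(local) & set(remote)


local = [(AFI_IPV4, SAFI_UNICAST), (AFI_IPV6, SAFI_UNICAST)]
remote = [(AFI_IPV4, SAFI_UNICAST), (AFI_IPV4, SAFI_MULTICAST)]
```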
To make decisions in its operations with peers, a BGP peer uses a simple finite-state machine (FSM) that consists of six states: Idle, Connect, Active, OpenSent, OpenConfirm, and Established. For each peer-to-peer session, a BGP implementation maintains a state variable that tracks which of these six states the session is in. BGP defines the messages that each peer should exchange in order to change the session from one state to another.
The first state is the Idle state. In the Idle state, BGP initializes all resources, refuses all inbound BGP connection attempts and initiates a TCP connection to the peer. The second state is Connect. In the Connect state, the router waits for the TCP connection to complete and transitions to the OpenSent state if successful. If unsuccessful, it starts the ConnectRetry timer and transitions to the Active state upon expiration. In the Active state, the router resets the ConnectRetry timer to zero and returns to the Connect state. In the OpenSent state, the router sends an Open message and waits for one in return in order to transition to the OpenConfirm state. In the OpenConfirm state, Keepalive messages are exchanged and, upon successful receipt, the router is placed into the Established state. In the Established state, the router can send and receive Keepalive, Update, and Notification messages to and from its peer.
  • Idle State:
      • Refuses all incoming BGP connections.
      • Starts the initialization of event triggers.
      • Initiates a TCP connection with its configured BGP peer.
      • Listens for a TCP connection from its peer.
      • Changes its state to Connect.
      • If an error occurs in any state of the FSM process, the BGP session is terminated immediately and returned to the Idle state. Some of the reasons why a router does not progress from the Idle state are:
          • TCP port 179 is not open.
          • A random TCP port over 1023 is not open.
          • Peer address configured incorrectly on either router.
          • AS number configured incorrectly on either router.
  • Connect State:
      • Waits for successful TCP negotiation with the peer.
      • BGP does not spend much time in this state if the TCP session has been successfully established.
      • Sends an Open message to the peer and changes state to OpenSent.
      • If an error occurs, BGP moves to the Active state. Some reasons for the error are:
          • TCP port 179 is not open.
          • A random TCP port over 1023 is not open.
          • Peer address configured incorrectly on either router.
          • AS number configured incorrectly on either router.
  • Active State:
      • If the router was unable to establish a successful TCP session, it ends up in the Active state.
      • The BGP FSM tries to start another TCP session with the peer and, if successful, sends an Open message to the peer.
      • If it is unsuccessful again, the FSM is reset to the Idle state.
      • Repeated failures may result in a router cycling between the Idle and Active states. Some of the reasons for this include:
          • TCP port 179 is not open.
          • A random TCP port over 1023 is not open.
          • BGP configuration error.
          • Network congestion.
          • Flapping network interface.
  • OpenSent State:
      • The BGP FSM listens for an Open message from its peer.
      • Once the message has been received, the router checks the validity of the Open message.
      • An error here means that one of the fields in the Open message does not match between the peers, e.g., a BGP version mismatch or the peering router expecting a different My AS. The router then sends a Notification message to the peer indicating why the error occurred.
      • If there is no error, a Keepalive message is sent, various timers are set and the state is changed to OpenConfirm.
  • OpenConfirm State:
      • The router listens for a Keepalive message from its peer.
      • If a Keepalive message is received and no timer has expired before its reception, BGP transitions to the Established state.
      • If a timer expires before a Keepalive message is received, or if an error condition occurs, the router transitions back to the Idle state.
  • Established State:
      • In this state, the peers send Update messages to exchange information about each route being advertised to the BGP peer.
      • If there is any error in the Update message, a Notification message is sent to the peer and BGP transitions back to the Idle state.
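The state list above can be condensed into a transition table. The following is a simplified sketch, not a complete implementation of the FSM: the event names are invented for illustration, timers are not modelled, and any event not listed for a state is treated as an error that returns the session to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,   # Open message is sent
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,    # Open message is sent
    (State.ACTIVE, "tcp_failed"): State.IDLE,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Return the next state; unlisted events count as errors and reset to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events) -> State:
    """Feed a sequence of events to a session that starts in Idle."""
    state = State.IDLE
    for event in events:
        state = next_state(state, event)
    return state
```

For example, a session whose first TCP attempt fails and whose second succeeds passes through Connect, Active, OpenSent and OpenConfirm before reaching Established.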

Router connectivity and learning routes

In the simplest arrangement, all routers within a single AS and participating in BGP routing must be configured in a full mesh: each router must be configured as a peer to every other router. This causes scaling problems, since the number of required connections grows quadratically with the number of routers involved. To alleviate the problem, BGP implements two options: route reflectors and BGP confederations. The following discussion of basic update processing assumes a full iBGP mesh.
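The quadratic growth can be made concrete: a full mesh of n routers needs one session per pair of routers, that is n(n-1)/2 sessions. The short sketch below computes this and, for comparison, the session count under the simplifying assumption of a single route reflector with every other router as its client; real deployments typically use more than one reflector.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions when every router peers with every other."""
    return routers * (routers - 1) // 2


def single_reflector_sessions(routers: int) -> int:
    """Sessions if one router acts as route reflector for all the others.

    Assumption for illustration: exactly one reflector, no redundancy.
    """
    return max(routers - 1, 0)


for n in (5, 10, 100):
    print(n, full_mesh_sessions(n), single_reflector_sessions(n))
```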
A given BGP router may accept network-layer reachability information (NLRI) updates from multiple neighbors and advertise NLRI to the same, or a different set, of neighbors. The BGP process maintains several routing information bases:
  • RIB: the router's main routing information base table.
  • Loc-RIB: the local routing information base. BGP maintains its own master routing table, separate from the main routing table of the router.
  • Adj-RIB-In: for each neighbor, the BGP process maintains a conceptual adjacent routing information base, incoming, containing the NLRI received from the neighbor.
  • Adj-RIB-Out: for each neighbor, the BGP process maintains a conceptual adjacent routing information base, outgoing, containing the NLRI sent to the neighbor.
The physical storage and structure of these conceptual tables are decided by the implementer of the BGP code. Their structure is not visible to other BGP routers, although they usually can be interrogated with management commands on the local router. It is quite common, for example, to store the Adj-RIB-In, Adj-RIB-Out and the Loc-RIB together in the same data structure, with additional information attached to the RIB entries. The additional information tells the BGP process such things as whether individual entries belong in the Adj-RIBs for specific neighbors, whether the per-neighbor route selection process made received routes eligible for the Loc-RIB, and whether Loc-RIB entries are eligible to be submitted to the local router's routing table management process.
BGP submits the routes that it considers best to the main routing table process. Depending on the implementation of that process, the BGP route is not necessarily selected. For example, a directly connected prefix, learned from the router's own hardware, is usually most preferred. As long as that directly connected route's interface is active, the BGP route to the destination will not be put into the routing table. Once the interface goes down, and there are no more preferred routes, the Loc-RIB route would be installed in the main routing table.
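The relationship between the Adj-RIB-In, the Loc-RIB and the main routing table can be sketched as follows. This is a deliberately simplified model: picking the shortest AS path is only a stand-in for the real best-path selection, the preference numbers are illustrative rather than taken from any vendor, and all class and function names are hypothetical.

```python
from typing import Dict, List

# Illustrative preference per route source; the lower value wins.
SOURCE_PREFERENCE = {"connected": 0, "bgp": 20}


class BgpTables:
    def __init__(self) -> None:
        # Adj-RIB-In: neighbor -> prefix -> AS path received from that neighbor.
        self.adj_rib_in: Dict[str, Dict[str, List[int]]] = {}
        # Loc-RIB: prefix -> AS path of the route BGP currently considers best.
        self.loc_rib: Dict[str, List[int]] = {}

    def receive(self, neighbor: str, prefix: str, as_path: List[int]) -> None:
        """Store an update in the Adj-RIB-In and refresh the Loc-RIB entry."""
        self.adj_rib_in.setdefault(neighbor, {})[prefix] = as_path
        candidates = [rib[prefix] for rib in self.adj_rib_in.values() if prefix in rib]
        # Stand-in selection rule: shortest AS path.
        self.loc_rib[prefix] = min(candidates, key=len)


def install(main_rib: Dict[str, str], prefix: str, source: str) -> bool:
    """Offer a route to the main table; return True if it was installed."""
    current = main_rib.get(prefix)
    if current is None or SOURCE_PREFERENCE[source] < SOURCE_PREFERENCE[current]:
        main_rib[prefix] = source
        return True
    return False


tables = BgpTables()
tables.receive("peer-a", "203.0.113.0/24", [65001, 65002])
tables.receive("peer-b", "203.0.113.0/24", [65003])
```

With a directly connected route for the same prefix already in the main table, offering the BGP route is refused; once the connected route is withdrawn, the same offer succeeds.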
BGP carries the information with which rules inside BGP-speaking routers can make policy decisions. Information carried explicitly for use in policy decisions includes:
  • communities
  • multi-exit discriminators (MED)
  • autonomous systems (AS)