An Introduction to IGRP Charles L. Hedrick RUTGERS The State University of New Jersey Center for Computers and Information Services Laboratory for Computer Science Research 25 October 1989 Acknowledgements This paper really should show Len Bosack of cisco Systems as co-author, and possibly should also list an unidentified lawyer at Townsend and Townsend. Most of the ideas behind IGRP are Len's. However since he has't even seen this paper, it would seem misleading to list him as an author. This paper is based on the patent application for IGRP, and on an examination of the code implementing IGRP in the cisco gateways. Unlike previous drafts, this paper follows the code more closely than the patent application. The exceptions are a couple of features which are not yet implemented, but which are planned to be. The background and evaluative comments are my own. However when it came to giving equations and detailed algorithmic descriptions, it seemed to make more sense simply to lift them from the patent application, to the extent that they were consistent with the way the code actually works. I have tried to translate them from legalese into wording more appropriate for technical discussions, but inevitably there are going to be similarities in both organization and wording between this paper and the patent application. Unfortunately patent applications do not follow the usual conventions of scholarly attribution. But it is quite clear that most of the ideas in it came from Len Bosack, with some wording contributed by Townsend and Townsend. But of course I am responsible for the final form, and should bear the blame for any errors that may appear. 1 This paper is an introduction to IGRP, cisco System's Inter-Gateway Routing Protocol. It has two purposes. One is to form an introduction to the IGRP technology, for those who are interested in using, evaluating, and possibly implementing it. The other is to give wider exposure to some interesting ideas that are embodied in IGRP. I believe even those who are not interested in using IGRP per se may find some of these ideas useful. This draft is intended for discussion in the open distance-vector group within the IETF. Thus some attempt has been made to give protocol details. It is not clear whether there are enough details to do a complete implementation or not. However that is the intention. This document has been checked carefully against the source code for cisco's Gateway Server, release 8.0(3). Where possible, the description is consistent with that code. However no attempt has been made to describe all of the internal data structures or algorithms, since this is a protocol description, not an implementation design. In particular, here are some areas in which this description can be expected to differ from cisco's implementation: - cisco does not yet implement multiple types of service, nor tests that depend upon the hop count. However they do maintain and propagate the necessary information. The description here indicates how I believe that information will be used in future releases. - cisco has a number of administrative controls, allowing filters and modifications of various sorts on routing information. Since these apply to all routing protocols implemented by cisco, I have chosen to view these controls as features of the implementation, and not part of the protocol. They are not reflected in any of the data exchanged between gateways. Thus there is nothing to prevent other implementors from choosing different sorts of controls. In previous releases of the cisco software, "administrative distance" appeared to be part of IGRP. However in 8.0, administrative distance is not propagated by IGRP, but is purely local to the gateway. I do not consider it be part of the protocol, and do not describe it here. - cisco supplies several ways to define default routes. Again, I have chosen to regard this as a feature of the implementation rather than the protocol. However the concept of "exterior routes" is reflected in the protocol, and will be described here. The technology described here is the subject of a patent application by cisco. This paper is being circulated with cisco's permission. However that permission should not be construed as giving any sort of license. The legal implications of implementing the ideas described in this paper are between you, your lawyer, and cisco. 1. Goals for IGRP IGRP is a protocol that allows a number of gateways to coordinate their routing. Its goals are - stable routing even in very large or complex networks. No routing loops should occur, even as transients. - fast response to changes in network topology - low overhead. That is, IGRP itself should not use more bandwidth than 2 what is actually needed for its task. - splitting traffic among several parallel routes when they are of roughly equal desirability. - taking into account error rates and level of traffic on different paths - the ability to handle multiple "types of service" with a single set of information. The current implemention of IGRP handles routing for TCP/IP. However the basic design is intended to be able to handle a variety of protocols. During the last few years, routing has suddenly become a more difficult problem than it used to be. A few years ago, protocols such as RIP were sufficient to handle most real networks. However growth in the Internet, and decentralization of control of its structure, have now resulted in a system of networks that is nearly beyond our capabilities to manage. Similar situations are occuring in large corporate networks as well. IGRP is one tool intended to help attack this problem. No one tool is going to solve all routing problems. Conventionally the routing problem is broken into several pieces. Protocols such as IGRP are called "internal gateway protocols" (IGP's). They are intended for use within a single set of networks, either under a single management or closely coordinated managements. Such sets of networks are connected by "external gateway protocols" (EGP's). An IGP is designed to keep track of a good deal of detail about network topology. Priority in designing an IGP is placed on producing optimal routes and responding quickly to changes. An EGP is intended to protect one system of networks against errors or intentional misrepresentation by other systems. Priority in designing an EGP is on stability and administrative controls. Often it is sufficient for an EGP to produce a reasonable route, rather than the optimal route. In fact, there are features in cisco's implementation that allow IGRP to be used as an EGP in some circumstances. However the emphasis in its design is on use as an IGP. IGRP has some similarities to older protocols such as Xerox's Routing Information Protocol, Berkeley's RIP, and Dave Mills' Hello. It differs from these protocols primilarly in being designed for larger and more complex networks. Section 4 gives a more detailed comparison with RIP, which is the most widely used of the older generation of protocols. Like these older protocols, IGRP is a distance vector protocol. In such a protocol, gateways exchange routing information only with adjacent gateways. This routing information contains a summary of information about the rest of the network. It can be shown mathematically that all of the gateways taken together are solving an optimization problem by what amounts to a distributed algorithm. Each gateway only needs to solve part of the problem, and it only has to receive a portion of the total data. The major alterative is a class of algorithms referred to as SPF (shortest-path first). These are based on a flooding technique, where every gateway is kept up to date about the status of every interface on every other gateway. Each gateway independently solves the optimization problem from its point of view using data for the entire network. There are advantages to each approach. In some circumstances SPF may be able to respond to changes more quickly. In order to prevent routing loops, IGRP has to ignore new data for a few minutes after certain kinds of changes. Because SPF has information directly from each gateway, it is able to avoid these routing loops. Thus it can act on new information immediately. However SPF has to deal with substantially more data than IGRP, both in internal data structures and in messages between gateways. 3 Thus SPF implementations can be expected to have higher overhead than IGRP implementations, other things being equal. 2. The Routing Problem IGRP is intended for use in gateways connecting several networks. We assume that the networks use packet-based technology. In effect the gateways act as packet switches. When a system connected to one network wants to send a packet to a system on a different network, it addresses the packet to a gateway. If the destination is on one of the networks connected to the gateway, the gateway will forward the packet to the destination. If the destination is more distant, the gateway will forward the packet to another gateway that is closer to the destination. Gateways use routing tables to help them decide what to do with packets. Here is a simple example routing table. (Addresses used in the examples are IP addresses taken from Rutgers University. Note that the basic routing problem is similar for other protocols as well, but this description will assume that IGRP is being used for routing IP.) network gateway interface 128.6.4 none ethernet 0 128.6.5 none ethernet 1 128.6.21 128.6.4.1 ethernet 0 128.121 128.6.5.4 ethernet 1 10 128.6.5.4 ethernet 1 (Actual IGRP routing tables have additional information for each gateway, as we will see.) This gateway is connected to two Ethernets, called 0 and 1. They have been given IP network numbers (actually subnet numbers) 128.6.4 and 128.6.5. Thus packets addressed for these specific networks can be sent directly to the destination, simply by using the appropriate Ethernet interface. There are two nearby gateways, 128.6.4.1 and 128.6.5.4. Packets for networks other than 128.6.4 and 128.6.5 will be forwarded to one or the other of those gateways. The routing table indicates which gateway should be used for which network. For example, packets addressed to a host on network 10 should be forwarded to gateway 128.6.5.4. One hopes that this gateway is closer to network 10, i.e. that the best path to network 10 goes through this gateway. The primary purpose of IGRP is allow the gateways to build and maintain routing tables like this. 3. Summary of IGRP As mentioned above, IGRP is a protocol that allows gateways to build up their routing table by exchanging information with other gateways. A gateway starts out with entries for all of the networks that are directly connected to it. It gets information about other networks by exchanging routing updates with adjacent gateways. In the simplest case, the gateway will find one path that represents the best way to get to each network. A path is characterized by the next gateway to which packets should be sent, the network interface that should be used, and metric information. Metric information is a set of numbers that characterize how good the path is. This allows the gateway to compare paths that it has heard from various gateways and decide which one to use. There are often cases where it makes sense to split traffic between two or more paths. IGRP will do this whenever two or more paths are equally good. The user can also configure it to split traffic when paths are almost equally good. In this 4 case more traffic will be sent along the path with the better metric. The intent is that traffic can be split between a 9600 bps line and a 19200 bps line, and the 19200 line will get roughly twice as much traffic as the 9600 bps line. The metric used by IGRP includes - the topological delay time - the bandwidth of the narrowest bandwidth segment of the path - the channel occupancy of the path - the reliability of the path. Topological delay time is the amount of time it would take to get to the destination along that path, assuming an unloaded network. Of course there is additional delay when the network is loaded. However load is accounted for by using the channel occupany figure, not by attempting to measure actual delays. The path bandwidth is simply the bandwidth in bits per second of the slowest link in the path. Channel occupancy indicates how much of that bandwidth is currently in use. It is measured, and will change with load. Reliability indicates the current error rate. It is the fraction of packets that arrive at the destination undamaged. It is measured. Although they are not used as part of the metric, two addition pieces of information are passed with it: hop count and MTU. The hop count is simply the number of gateways that a packet will have to go through to get to the destination. MTU is the maximum packet size that can be sent along the entire path without fragmentation. (That is, it is the minimum of the MTU's of all the networks involved in the path.) Based on the metric information, a single "composite metric" is calculated for the path. The composite metric combines the effect of the various metric components into a single number representing the "goodness" of that path. It is the composite metric that is actually used to decide on the best path. Periodically each gateway broadcasts its entire routing table (with some censoring because of the split horizon rule) to all adjacent gateways. When a gateway gets this broadcast from another gateway, it compares the table with its existing table. Any new destinations and paths are added to the gateway's routing table. Paths in the broadcast are compared with existing paths. If a new path is better, it may replace the existing one. Information in the broadcast is also used to update channel occupancy and other information about existing paths. This general procedure is similar to that used by all distance vector protocols. It is referred to in the mathematical literature as the Bellman-Ford algorithm. For a detailed development of the basic procedure, see RFC 1058, which describes RIP, an older distance vector protocol. In IGRP, the general Bellman-Ford algorithm is modified in three critical aspects. First, instead of a simple metric, a vector of metrics is used to characterize paths. Second, instead of picking a single path with the smallest metric, traffic is split among several paths, whose metrics fall into a specified range. Third, several features are introduced to provide stability in situations where the topology is changing. The best path is selected based on a composite metric [(K1 / Be) + (K2 * Dc)] r Where: 5 K1, K2 = constants; Be = unloaded path bandwidth x (1 - channel occupancy) Dc = topological delay; and r = reliability. The path having the smallest composite metric will be the best path. Where there are multiple paths to the same destination, the gateway can route the packets over more than one path. This is done in accordance with the composite metric for each data path. For instance, if one path has a composite metric of 1 and another path has a composite metric of 3, three times as many packets will be sent over the data path having the composite metric of 1. However, only paths whose composite metrics are with a certain range of the smallest composite metric will be used. K1 and K2 indicate the weight to be assigned to bandwidth and delay. These will depend upon the "type of service". For example, interactive traffic would normally place a higher weight on delay, and file transfer on bandwidth. There are two advantages to using a vector of metric information. The first is that it provides the ability to support multiple types of service from the same set of data. The second advantage is improved accuracy. When a single metric is used, it is normally treated as if it were a delay. Each link in the path is added to the total metric. If there is a link with a low bandwidth, it is normally represented by a large delay. However bandwidth limitations don't really cumulate the way delays do. By treating bandwidth as a separate component, it can be handled correctly. Similarly, load can be handled by a separate channel occupancy number. IGRP provides a system for interconnecting computer networks which can stably handle a general graph topology including loops. The system maintains full path metric information, i.e., it knows the path parameters to all other networks to which any gateway is connected. Traffic can be distributed over parallel paths and multiple path parameters can be simultaneously computed over the entire network. 4. Comparison with RIP This section compares IGRP with RIP. This comparison is useful because RIP is used widely for purposes similar to IGRP. However doing this is not entirely fair. RIP was not intended to meet all of the same goals as IGRP. RIP was intended for use in small networks with reasonably uniform technology. In such applications it is generally adequate. The most basic difference between IGRP and RIP is the structure of their metrics. Unfortunately this is not a change that can simply be retrofitted to RIP. It requires the new algorithms and data structures present in IGRP. RIP uses a simple "hop count" metric to describe the network. Unlike IGRP, where every path is described by a delay, bandwidth, etc., in RIP it is described by a number from 1 to 15. Normally this number is used to represent how many gateways the path goes through before getting to the destination. This means that no distinction is made between a slow serial line and an Ethernet. In some implementations of RIP, it is possible for the system administrator to specify that a given hop should be counted more than once. Slow networks can be represented by a large hop count. But since the maximum is 15, this can't be done very much. E.g. if an Ethernet is represented by 1 6 and a 56Kb line by 3, there can be at most 5 56Kb lines in a path, or the maximum of 15 is exceeded. In order to represent the full range of available network speeds, and allow for a large network, studies done by cisco suggest that a 24-bit metric is needed. If the maximum metric is too small, the system administrator is presented with an unpleasant choice: either he can't distinguish between fast and slow routes, or he can't fit his whole network into the limit. In fact a number of national networks are now large enough that RIP can't handle them even if every hop is counted only once. RIP simply can't be used for such networks. The obvious response would be to modify RIP to allow a larger metric. Unfortunately, this won't work. Like all distance vector protocols, RIP has the problem of "counting to infinity". This is described in more detail in RFC 1058. When topology changes, spurious routes will be introduced. The metrics associated with these spurious routes slowly increase until they reach 15, at which point the routes are removed. 15 is a small enough maximum that this process will converge fairly quickly, assuming that triggered updates are used. If RIP were modified to allow a 24-bit metric, loops would persist long enough for the metric to be counted up to 2**24. This is not tolerable. IGRP has features designed to prevent spurious routes from being introduced. These are discussed below in section 5.2. It is not practical to handle complex networks without introducing such features or changing to a protocol such as SPF. IGRP does a bit more than simply increase the range of allowable metrics. It restructures the metric to describe delay, bandwidth, reliability, and load. It is possible to represent such considerations in a single metric such as RIP's. However the approach taken by IGRP is potentially more accurate. For example, with a single metric, several successive fast links will appear to be equivalent to a single slow one. This may be the case for interactive traffic, where delay is the primary concern. However for bulk data transfer, the primary concern is bandwidth, and adding metrics together is not the right approach there. IGRP handles delay and bandwidth separately, cumulating delays, but taking the minimum of the bandwidths. It is not easy to see how to incorporate the effects of reliability and load into a single-component metric. In my opinion, one of the big advantages of IGRP is ease of configuration. It can directly represent quantities that have physical meaning. This means that it can be set up automatically, based on interface type, line speed, etc. With a single-component metric, the metric is more likely to have to be "cooked" to incorporate effects of several different things. Other innovations are more a matter of algorithms and data structures than of the routing protocol. For example, IGRP specifies algorithms and data structures that support splitting traffic among several routes. It is certainly possible to design an implementation of RIP that does this. However once routing is being reimplemented, there is no reason to stick with RIP. So far I have described "generic IGRP", a technology which could support routing for any network protocol. However in this section it is worth mentioning a bit more about the specific TCP/IP implementation. That is the implementation that is going to be compared with RIP. RIP update messages simply contain snapshots of the routing table. That is, they have a number of destinations and metric values, and little else. The IP implementation of IGRP has additional structure. First, the update message is identified by an "autonomous system number." This terminology comes out of the Arpanet tradition, and has specific meaning there. However for most networks what it means is that you can run several different routing systems on the same network. This is useful for places where networks from several organizations converge. Each organization can maintain its own routing. Because each update is labelled, gateways can be configured to pay attention only to the right one. 7 Certain gateways are configured to receive updates from several autonomous systems. They pass information between the systems in a controlled manner. Note that this is not a complete solution to problems of routing security. Any gateway can be configured to listen to updates from any autonomous system. However it is still a very useful tool in implementing routing policies where is a reasonable degree of trust between the network administrators. The second structural feature about IGRP update messages affects the way default routes are handled by IGRP. Most routing protocols have a concept of default route. It is often not practical for routing updates to list every network in the world. Typically a set of gateways need detailed routing information for networks within their organization. All traffic for destinations outside their organization can be sent to one of a few boundary gateways. Those boundary gateways may have more complete information. The route to the best boundary gateway is a "default route". It's a default in the sense that it is used to get to any destination that is not listed specifically in the internal routing updates. RIP, and some other routing protocols, circulate information about the default route as if it were a real network. IGRP takes a different approach. Rather than a single fake entry for the default route, IGRP allows real networks to be flagged as candidates for being a default. This is implemented by placing information about those networks in a special exterior section of the update message. However it might as well be thought of as turning on a bit associated with those networks. Periodically IGRP scans all the candidate default routes and chooses the one with the lowest metric as the actual default route. Potentially this approach to defaults is somewhat more flexible than the approach taken by most RIP implementations. Most typically RIP gateways can be set to generate a default route with a certain specified metric. The intention is that this would be done at boundary gateways. However there are situations where this isn't really good enough. Suppose that Rutgers University is directly connected to two regional networks, say NYSERnet at 56kbps and JvNCnet at T1. Normally the link to JvNCnet would be the best default. Using RIP we would configure our gateway to JvNCnet to advertise a default route with a lower metric than our gateway to NYSERnet. However suppose something happens within JvNCnet so that its access to the national networks is either cut off or becomes much worse (e.g. it might end up passing traffic through a backup link to some other regional). In that case it might be better to use NYSERnet. The typical RIP implementations would not adapt to that change, because the boundary gateway is generating a default with a fixed metric. With IGRP we would configure our gateways so that the NSFnet national backbone was flagged as a candidate default route. In that case, we would choose the default route at each gateway based on metric information that takes into account the entire path back to the national backbone. So the default would change to take into account events within our regional networks, assuming that they pass full IGRP metric information to us. 5. Detailed Description 5.1 Overall Description When a gateway is first turned on, its routing table is initialized. This may be done by an operator from a console terminal, or by reading information from configuration files. A description of each network connected to the gateway is provided, including the topological delay along the link (i.e., how long it takes a single bit to transverse the link) and the bandwidth of the link. 8 network 1 network 2 network 3 network 4 128.6.5 128.6.4 128.6.21 128.121 ===== ======================= ========== ================ | | | | | | | | __|____|__ _____|____ __|____|__ __|____|____ ___|________ 128.6.4.2 128.6.4.3 128.6.4.1 128.6.21.1 128.121.50.2 128.6.5.1 128.6.21.2 128.121.50.1 __________ __________ __________ ____________ ____________ gateway R computer A gateway S gateway T computer B Fig 1. A simple example network For instance, in Fig. 1, gateway S would be told that it is connected to networks 2 and 3 via the corresponding interfaces. Thus, initially, gateway 2 only knows that it can reach any destination computer in networks 2 and 3. All the gateways are programmed to periodically transmit to their neighboring gateways the information that they have been initialized with, as well as information gathered from other gateways. Thus, gateway S would receive updates from gateways R and T and learn that it can reach computers in network 1 through gateway R and computers in network 4 through gateway T. Since gateway S sends its entire routing table, in the next cycle gateway T will learn that it can get to network 1 through gateway S. It is easy to see that information about every network in the system will eventually reach every gateway in the system, providing only that the network is fully connected. ________ network 1 | gw A --nw2-- gw C | / | | / | nw3 nw4 nw5 | / | | / | gw B gw D ____|____________|____ network 6 Fig 2. Example of alternate paths Each gateway computes a composite metric to determine the desirability of the data paths to destination computers. For instance, in Fig 2, for a destination in network 6, gateway A would compute metric functions for two paths, via gateways B and C. Note that paths are defined simply by the next hop. There are actually three possible routes from A to network 6: - direct to B - to C and then to B - to C and then to D However gateway A need not choose between the two routes involving C. The routing table in A has a single entry representing the path to C. Its metric represents the best way of getting from C to the final destination. If A sends a packet to C, it is up to C to decide whether to use B or D. The composite metric function computed for each data path is as follows: [(K1 / Be) + (K2 * Dc)] r Eq. 1 Where: 9 r = fractional reliability (% of transmissions that are sucessfully received at the next hop.) Dc = composite delay; Be = effective bandwidth: unloaded bandwidth x (1 - channel occupancy) K1, K2 = constants. In principle the composite delay, Dc, could be determined as follows: Dc = Ds + Dcir + Dt Eq. 2 Where: Ds = switching delay; Dcir = circuit delay (propagation delay of 1 bit); and Dt = transmission delay (no-load delay for a 1500 bit message). However in practice a standard delay figure is used for each type of network technology. E.g there will be a standard delay figure for Ethernet, and for serial lines at any particular bit rate. Here is an example of how gateway A's routing table might look in the case of Fig 2. (Note that individual components of the metric vector are not shown, for simplicity.) interface next gateway metric network 1 nw 1 none directly connected network 2 nw 2 none directly connected network 3 nw 3 none directly connected network 4 nw 2 C 1270 nw 3 B 1180 network 5 nw 2 C 1270 nw 3 B 2130 network 6 nw 2 C 2040 nw 3 B 1180 Fig 3. An example routing table The basic process of building up a routing table by exchanging information with neighbors is described by the Bellman-Ford algorithm. The algorithm has been used in earlier protocols such as RIP (RFC 1058). In order to deal with more complex networks, IGRP adds three features to the basic Bellman-Ford algorithm: 1. Instead of a simple metric, a vector of metrics is used to characterize paths. A single composite metric can be computed from this vector according to equation 1. Use of a vector allows the gateway to accomodate different types of service, by using several different coefficients in Eq. 1. It also allows a more accurate representation of the characteristics of the network than a single metric. 2. Instead of picking a single path with the smallest metric, traffic is split among several paths with metrics falling into a specified range. This allows several routes to be used in parallel, providing a greater effective bandwidth than any single route. A variance V is specified by 10 the network administrator. All paths with minimal composite metric M are kept. In addition, all paths whose metric is less than V x M are kept. Traffic is distributed among multiple paths in inverse proportion to the composite metrics. 3. Several features are introduced to provide stability in situations where the topology is changing. These features are intended to prevent routing loops and "counting to infinity," which have characterized previous attempts to use Ford-type algorithms for this type of application. The primary stability features are "holddowns", "triggered updates", "split horizon," and "poisoning". These will be discussed in more detail below. Traffic splitting (point 2) raises a rather subtle danger. The variance V is designed to allow gateways to use parallel paths of different speeds. For example, there might be a 9600 bps line running in parallel with a 19200 bps line, for redundancy. If the variance V is 1, only the best path will be used. So the 9600 bps line will not be used if the 19200 bps line has a reasonable reliability. By raising the variance, we can allow traffic to be split between the best route and other routes that are nearly as good. With a large enough variance, traffic will be split between the two lines. The danger is that with a large enough variance, paths become allowed that aren't just slower, but are actually "in the wrong direction". Thus there should be an additional rule to prevent traffic from being sent "upstream": No traffic is sent along paths whose remote composite metric (the composite metric calculated at the next hop) is greater than the composite metric calculated at the gateway. In general system administrators are encouraged not to set the variance above 1 except in specific situations where parallel paths need to be used. In this case, the variance is carefully set to provide the "right" results. IGRP is intended to handle multiple "types of service," and multiple protocols. Type of service is a specification in a data packet that modifies the way paths are to be evaluted. For example, the TCP/IP protocol allows the packet to specify the relative importance of high bandwidth, low delay, or high reliability. Generally, interactive applications will specify low delay, whereas bulk transfer applications will specify high bandwidth. These requirements determine the relative values of K1 and K2 that are appropriate for use in Eq. 1. Each combination of specifications in the packet that is to be supported is referred to as a "type of service". For each type of service, a set of parameters K1 and K2 must be chosen. A routing table is kept for each type of service. This is done because paths are selected and ordered according to the composite metric defined by Eq. 1. This is different for each type of service. Information from all of these routing tables is combined to produce the routing update messages exchanged by the gateways, as described in Fig. 7. 5.2 Stability Features This section describes holddowns, triggered updates, split horizon, and poisoning. These features are designed to prevent gateways from picking up erroneous routes. As described in RFC 1058, this can happen when a route becomes unusable, due to failure of a gateway or a network. In principle, the adjacent gateways detect failures. They then send routing updates that show the old route as unusable. However it is possible for updates not to reach some parts of the network at all, or to be delayed in reaching certain gateways. A gateway that still believes the old route is good can continue spreading that information, thus reentering the failed route into the system. Eventually this information will propagate through the network and come back to the gateway that reinjected it. The result is a circular route. In fact there is some redundancy among11the countermeasures. In principle, holddowns and triggered updates should be sufficient to prevent erroneous routes in the first place. However in practice, communications failures of various kinds can cause them to be be insufficient. Split horizon and route poisoning are intended to prevent routing loops in any case. Normally, new routing tables are sent to neighboring gateways on a regular basis (every 90 seconds by default, although this can be adjusted by the system administrator). A triggered update is a new routing table that is sent immediately, in response to some change. The most important change is removal of a route. This can happen because a timeout has expired (probably a neighboring gateway or line has gone down), or because an update message from the next gateway in the path shows that the path is no longer usable. When a gateway G detects that a route is no longer usable, it triggers an update immediately. This update will show that route as unusable. Consider what happens when this update reaches the neighboring gateways. If the neighbor's route pointed back to G, the neighbor must remove the route. This causes the neighbor to trigger an update, etc. Thus a failure will trigger a wave of update messages. This wave will propagate throughout that portion of the network in which routes went through the failed gateway or network. Triggered updates would be sufficent if we could guarantee that the wave of updates reached every appropriate gateway immediately. However there are two problems. First, packets containing the update message can be dropped or corrupted by some link in the network. Second, the triggered updates don't happen instantaneously. It is possible that a gateway that has not yet gotten the triggered update will issue a regular update at just the wrong time, causing the bad route to be reinserted in a neighbor that had already gotten the triggered update. Holddowns are designed to get around these problems. The holddown rule says that when a route is removed, no new route will be accepted for the same destination for some period of time. This gives the triggered updates time to get to all other gateways, so that we can be sure any new routes we get aren't just some gateway reinserting the old one. The holddown period must be long enough to allow for the wave of triggered updates to go throughout the network. In addition, it should include a couple of regular broadcast cycles, to handle dropped packets. Consider what happens if one of the triggered updates is dropped or corrupted. The gateway that issued that update will issue another update at the next regular update. This will restart the wave of triggered updates at neighbors that missed the initial wave. The combination of triggered updates and holddowns should be sufficient to get rid of expired routes and prevent them from being reinserted. However some additional precautions are worth doing anyway. They allow for very lossy networks, and networks that have become partitioned. The additional precautions called for by IGRP are split horizon and route poisoning. Split horizon arises from the observation that it never makes sense to send a route back in the direction from which it came. Consider the following situation: network 1 network 2 -------------X-----------------X gateway A gateway B Gateway A will tell B that it has a route to network 1. When B sends updates to A, there is never any reason for it to mention network 1. Since A is closer to 1, there is no reason for it to consider going via B. The split horizon rule says a separate update message should be generated for each neighbor (actually each neighboring network). The update for a given neighbor should omit routes that point to that neighbor. This rule prevents loops between adjacent gateways. E.g. suppose A's interface to network 1 fails. Without the split horizon rule, B would be telling A that it can get to 1. Since it no longer has a real route, A might pick up that route. In this case, A and B would both 12 have routes to 1. But A would point to B and B would point to A. Of course triggered updates and holddowns should prevent this from happening. But since there's no reason to send information back to the place it came from, split horizon is worth doing anyway. In addition to its role in preventing loops, split horizon keeps down the size of update messages. Split horizon should prevent loops between adjacent gateways. Route poisoning is intended to break larger loops. The rule is that when an update shows the metric for an existing route to have increased sufficiently, there is a loop. The route should be removed and put into holddown. There are actually two versions of the rule. The cleanest is based on the hop count. If the hop count for a path increases, it would seem to indicate that something further on in the path has failed, so the route now must go through more gateways. However if this were really the case, a triggered update would have been issued and the route would be in holddown. So the actual sequence would be that the old route would be removed, holddown would occur, and after the holddown the new, longer path would be established. In fact if the hop count for an existing route increases, it indicates that this route is part of a loop. The actual cisco implementation has just begun maintaining a hop count. Since the hop count cannot yet be relied on, an alternative test is needed in the interim. So currently a route is removed if the composite metric increases more than a factor of 1.1. It is not safe for just any increase in composite metric to trigger removal of the route, since small metric changes can occur due to changes in channel occupancy or reliability. So the factor of 1.1 is just a heuristic. The exact value isn't critical. We expect this rule only to be needed to break very large loops, since small ones will be prevented by triggered updates and holddowns. 5.3 Details of the Update Process The set of processes described in Fig. 4-8 are intended to handle a single network protocol, e.g. TCP/IP, DECnet, or the ISO/OSI protocol. However protocol details will be given only for TCP/IP. A single gateway may process data that follows more than one protocol. Because each protocol has different addressing structures and packet formats, the computer code used to implement Figs. 4-8 will generally be different for each protocol. The process described in Fig. 4 will vary the most, as described in the detailed notes for Fig. 4. The processes described in Fig. 5-8 will have the same general structure. The primary difference from protocol to protocol will be the format of the routing update packet, which must be designed to be compatible with a specific protocol. Note that the definition of a destination may vary from protocol to protocol. The method described here can be used for routing to individual hosts, to networks, or for more complex hierachical address schemes. Which type of routing is used will depend upon the addressing structure of the protocol. The current TCP/IP implementation supports only routing to IP networks. Thus "destination" really means IP network or subnet number. Subnet information is only kept for connected networks. Figures 4-7 show pseudo-code for various pieces of the routing process used by the gateways. At the start of the program, acceptable protocols and parameters describing each interface are entered. The gateway will only handle certain protocols which are listed. Any communication from a system using a protocol not on the list will be ignored. The data inputs are 13 - the networks to which the gateway is connected - the unloaded bandwidth of each networks - topological delay of each network - reliability of each network - channel occupancy of each network - MTU of each network The metric function for each data path is then computed according to Eq. 1. Note that the first three items are reasonably permanent. They are a function of the underlying network technology, and do not depend upon load. They could be set from a configuration file or by direct operator input. Note that IGRP does not use measured delay. Both theory and experience suggest that it is very difficult for protocols that use measured delay to maintain stable routing. There are two measured parameters: reliability and channel occupancy. Reliability is based on error rates reported by the network interface hardware or firmware. In addition these inputs, the routing algorithm requires a value for the variance V. This would normally be specified by a configuration file or operator input. Once initial information is entered, operations in the gateway are triggered by events - either the arrival of a data packet at one of the network interfaces, or expiration of a timer. The processes described in Figs. 4-7 are triggered as follows: When a packet arrives, it is processed according to Fig 4. This results in the packet being sent out another interface, discarded, or accepted for further processing. When a packet is accepted by the gateway for further processing, it is analyzed in a protocol-specific fashion not described in this specification. If the packet is a routing update, it is processed according to Fig. 5. Figure 6 shows events triggered by a timer. The timer is set to generate an interrupt once per second. When the interrupt occurs, the process shown in Fig. 6 is executed. Figure 7 shows a routing update subroutine. Calls to this subroutine are shown in Figs 5 and 6. In addition, Fig. 8 shows details of metric computations, referred to in Fig. 5 and 7. There are 4 critical time constants that control route propagation and expiration. These time constants may be set by the system administrator. However there are default values. These time constants are - broadcast time - updates are broadcast by all gateways on all connected interfaces this often. The default is once every 90 sec. - invalid time - if no update has been received for a given path within this amount of time, it is considered to have timed out. It should be several times the broadcast time, in order to allow for the possibility that packets containing an update could be dropped by the network. The default 14 is 3 times the broadcast time. - hold time - when a destination has become unreachable (or the metric has increased enough to cause poisoning), the destination goes into "holddown". During this state, no new path will be accepted for the same destination for this amount of time. The hold time indicates how long this state should last. It should be several times the broadcast time. The default value is 3 times the broadcast time plus 10 sec. - flush time - if no update has been received for a given destination within this amount of time, the entry for it is removed from the routing table. Note the difference between invalid time and flush time: After the invalid time a path is timed out and removed. If there are no remaining paths to a destination, the destination is now unreachable. However the database entry for the destination remains. It has to remain in order to enforce the holddown. After the flush time the database entry is removed from the table. It should be somewhat longer than the invalid time plus the holddown time. The default is 7 times the broadcast time. These figures presuppose the following major data structures. A separate set of these data structures is kept for each protocol supported by the gateway. Within each protocol, a separate set of data structures is kept for each type of service to be supported. For each destination known to the system, there is a (possibly null) list of paths to the destination, a holddown expiration time, and a last update time. The last update time indicates the last time any path for this destination was included in an update from another gateway. Note that there are also update times kept for each path. Holddowns are discussed in section 5.2. When the last path to a destination is removed, the destination is put into holddown. The holddown expiration time indicates the time at which the holddown expires. The fact that it is non-zero indicates that the destination is in holddown. In order to save calculation time, it is also a good idea to keep a "best metric" for each destination. This is simply the minimum of the composite metrics for all the paths to the destination. For each path to a destination, there is the address of the next hop in the path, the interface to be used, a vector of metrics characterizing the path, including topological delay, bandwidth, reliability, and channel occupancy. Other information is also associated with each path, including hop count, MTU, source of information, the remote composite metric, and a composite metric calculated from these numbers according to equation 1. There is also a last update time. The source of information indicates where the most recent update for that path came from. In practice this is the same as the address of the next hop. The last update time is simply the time at which the most recent update arrived for this path. It is used to expire timed-out paths. Note that an IGRP update message has three portions: interior, system (meaning "this autonomous system" but not interior), and exterior. The interior section is for routes to subnets. Not all subnet information is included. Only subnets of one network are included. This is the network associated with the address to which the update is being sent. Normally updates are broadcast on each interface, so this is simply the network on which the broadcast is being sent. (Other cases arise for responses to an IGRP request and point to point IGRP.) Major networks (i.e. non-subnets) are put into the system portion of the update message unless they are specifically flagged as exterior. 15 A network will be flagged as exterior if it was learned from another gateway and the information arrived in the exterior portion of the update message. cisco's implementation also allows the system administrator to declare specific networks as exterior. Exterior routes are also referred to as "candidate default". They are routes that go to or through gateways that are considered to be appropriate as defaults, to be used when there is no explicit route to a destination. For example at Rutgers we configure the gateway that connects Rutgers to our regional network so that it flags the route to the NSFnet backbone as exterior. cisco's implementation chooses a default route by picking that exterior route with the smallest metric. The following sections are intended to clarify certain portions of Fig. 4-8. 5.4 Packet Routing Fig 4 describes overall processing of input packets. This is used simply to clarify terminology. Obviously this is not a complete description of what an IP gateway does. This process uses the list of supported protocols and the information about the interfaces entered when the gateway is initialized. Details of the packet processing depend upon the protocol used by the packet. This is determined in Step A. Step A is the only portion of Fig. 4 which is shared by all protocols. Once the protocol type is known, the implementation of Fig. 4 appropriate to the protocol type is used. Details of the packet contents are described by the specifications of the protocol. The specifications of a protocol include a procedure for determining the destination of a packet, a procedure for comparing the destination with the gateway's own addresses to determine whether the gateway itself is the destination, a procedure for determining whether a packet is a broadcast, and a procedure for determining whether the destination is part of a specified network. These procedures are used in steps B and C of Fig. 4. The test in step D requires a search of the destinations listed in the routing table. The test is satisfied if there is an entry in the routing table for the destination, and that destination has associated with it at least one usable path. Note that the destination and path data used in this and the next step are maintained separately for each type of service supported. Thus this step begins by determining the type of service specified by the packet, and selecting the corresponding set of data structures to use for this and the next step. A path is usable for the purposes of steps D and E if its remote composite metric is less than its composite metric. A path whose remote composite metric is greater than its composite metric is a path whose next hop is "farther away" from the destination, as measured by the metric. This is referred to as an "upstream path." Normally one would expect that the use of metrics would prevent upstream paths from being chosen. It is easy to see that an upstream path can never be the best one. However if a large variance is allowed, paths other than the best one can be used. Some of those could be upstream. Step E computes the path to use. Paths whose remote composite metric is not less than their composite metrics are not considered. If more than one path is acceptable, such paths are used in a weighted form of round-robin alternation. The frequency with which a path is used is inversely proportional to its composite metric. 16 5.5 Reception Of Routing Updates Fig 5 describes the processing of a routing update received from a neighboring gateway. Such updates consist of a list of entries, each of which gives information for a single destination. More than one entry for the same destination can occur in a single routing update, to accomodate multiple types of service. Each of these entries is processed individually, as described in Fig. 5. If an entry is in the exterior section of the update, the exterior flag will be set for the destination if it is added as a result of this process. The entire process described in figure 5 must be repeated once for each type of service supported by the gateway, using the set of destination / path information associated with that type of service. This is shown in the outermost loop in Fig. 5. The entire routing update must be processed once for each type of service. (Note that the current implementation of IGRP does not support multiple types of service. So the outermost loop is not actually implemented.) In step A, basic acceptability tests are done on the path. This should include reasonableness tests for the destination. Impossible ("Martian") network numbers should be rejected. (See RFC 1009 and RFC 1122 for details.) Updates are also rejected if the destination they refer to is in holddown, i.e. the holddown expiration time is non-zero and later than the current time. In steps B the routing table is searched to see whether this entry describes a path that is already known. A path in the routing table is defined by the destination with which it is associated, the next hop listed as part of the path, the output interface to be used for the path, and the information source (the address from which the update came -- in practice normally the same as the next hop). The entry from the update packet describes a path whose destination is listed in the entry, whose output interface is the interface that the update came in, and whose next hop and information source are the address of the gateway that sent the update (the "source" S). In steps H and T, the update process described in Fig. 7 is scheduled. This process will actually run after the entire process described in Fig. 5 has been finished. That is, the update process described in Fig. 7 will only happen once, even if it is triggered several times during the processing described in Fig. 5. Furthermore, precautions must be taken to keep updates from being issued too often, if the network is changing rapidly. Steps K is done if the destination described by the current entry in the update packet already exists in the routing table. K compares the new composite metric computed from data in the update packet with the best composite metric for the destination. Note that the best composite metric is not recomputed at this time. So if the path being considered is already in the routing table, this test may compare new and old metrics for the same path. Step L is performed for the paths that are worse than the existing best composite metric. This includes both new paths that are worse than existing ones and existing paths whose composite metric has increased. Step L tests whether the new path is acceptable. Note that this test implements both the test for whether a new path is good enough to keep, and route poisoning. In order to be acceptable, the delay value must not be the special value that indicates an unreachable destination (for the current IP implementation, all ones in a 24 bit field), and the composite metric (calculated as specified in Fig. 8) must be acceptable. To determine whether the composite metric is acceptable, compare it with the composite metrics of all other paths to the destination. Let M be the minimum of these. The new path is acceptable if it 17 is < V x M, where V is the variance set when the gateway was initialized. If V = 1, then a metric any worse than the existing one is not acceptable. There is one exception to this: If the path already exists and is the only path to the destination, the path will be retained if the metric has not increased by more than 10%. Test O compares the hop count calculated from the data in the update with the hop count recorded in routing table as part of the path P. If the newly calculated one is greater than the one in routing table by more than 2, this test succeeds. (This test is not done in the current implementation of IGRP. The current implementation does not yet use the hop count. When this test is implemented, it will no longer be necessary to remove existing paths when their metric has increased by more than 10%.) Step V is done when the new information for a path indicates that the composite metric will be decreased. The composite metrics of all paths to destination D are compared. In this comparison, the new composite metric for P is used, rather than the one appearing in routing table. The minimum composite metric M is calculated. Then all paths to D are examined again. If the composite metric for any path > M x V, that path is removed. V is the variance, entered when the gateway was initialized. 5.6 Periodic Processing The process described in Figure 6 is triggered once a second. It examines various timers in the routing table, to see if any has expired. These timers are described above. In step U, the process described in Fig. 7 is activated. Steps R and S are necessary because the composite metrics stored in the routing table depend upon the channel occupancy, which changes over time, based on measurements. Periodically the channel occupancy is recalculated, using a moving average of measured traffic through the interface. If the newly-calculated value differs from the existing one, all composite metrics involving that interface must be adjusted. Every path shown in the routing table is examined. Any path whose next hop uses interface "I" has its composite metric recalculated. This is done in accordance with eq. 1, using as the channel occupancy the maximum of the value stored in routing table as part of the path's metric, and the newly calculated channel occupancy of the interface. 5.7 Generating Update Messages Figure 7 describes how the gateway generates update messages to be sent to other gateways. A separate message is generated for each network interface attached to the gateway. That message is then sent to all other gateways that are reachable through the interface (step J). Generally this is done by sending the message as a broadcast. However, if the network technology or protocol does not allow broadcasts, it may be necessary to send the message individually to each gateway. In general, the message is built up by adding an entry for each destination in the routing table, in step G. Note that the destination / path data associated with each type of service must be used. In the worst case, a new entry is 18 added to the update for each destination for each type of service. However, before adding an entry to the update message in step G, the entries already added are scanned. If the new entry is already present in the update message, it is not added again. A new entry duplicates an existing one when the destinations and next hop gateways are the same. For the sake of simplicity, the pseudo-code omits one thing: IGRP update messages have three parts: interior, system, and exterior. Thus there are actually three loops over destinations. The first includes only subnets of the network to which the update is being sent. The second includes all major networks (i.e. non-subnets) that are not flagged as exterior. The third includes all major networks that are flagged as exterior. Step E implements the split horizon test. In the normal case, this test fails for routes whose best path goes out the same interface that the update is being sent out. However if the update is being sent to a specific destination (e.g. in response to an IGRP request from another gateway, or as part of "point to point IGRP"), split horizon fails only if the best path originally came from that destination (its "information source" is the same as the destination) and its output interface is the same as the one the request came in from. 5.8 Computing Metric Information Fig. 8 describes how the metric information is processed from update messages received by the gateway, and how it is generated for update messages being sent by the gateway. Note that the entry is based on one particular path to the destination. If there is more than one path to the destination, a path whose composite metric is minimum is chosen. If more than one path has the minimum composite metric, an arbitrary tie-breaking rule is used. (For most protocols, this will be based on the address of the next hop gateway.). 19 Fig 4. Processing incoming packets Data packet arrives using interface I A Determine protocol used by packet If protocol is not supported then discard packet B If destination address matches any of gateway's addresses or the broadcast address then process packet in protocol-specific way C If destination is on a directly-connected network then send packet direct to the destination, using the encapsulation appropriate to the protocol and link type D If there are no paths to the destination in the routing table, or all paths are upstream then send protocol-specific error message and discard the packet E Choose the next path to use. If there are more than one, alternate round-robin with frequency proportional to inverse of composite metric. Get next hop from path chosen in previous step. Send packet to next hop, using encapsulation appropriate to protocol and data link type. 20 Fig 5. Processing incoming routing updates Routing update arrives from source S For each type of service supported by gateway Use routing data associated with this type of service For each destination D shown in update A If D is unacceptable or in holddown then ignore this entry and continue loop with next destination D B Compute metrics for path P to D via S (see Fig 8) If destination D is not already in the routing table then Begin Add path P to the routing table, setting last update times for P and D to current time. H Trigger an update Set composite metric for D and P to new composite metric computed in step B. End 21 Else begin (dest. D is already in routing table) K Compare the new composite metric for P with best existing metric for D. New > old: L If D is shown as unreachable in the update, or the new composite metric is greater then the exist metric for D by more than the variance O or P has a new hop count > old hop count + 2 then Begin Remove P from routing table if present If P was the last route to D then Set holddown time for D to current time + holddown time T and Trigger an update End else Begin Compute new best composite metric for D Put the new metric information into the entry for P in the routing table Add path P to the routing table if it was not present. Set last update times for P and D to current time. End New <= old: V Set composite metric for D and P to new composite metric computed in step B. If any other paths to D are now outside the variance, remove them. Put the new metric information into the entry for P in the routing table Set last update times for P and D to current time. End End of for End of for 22 Fig 6. Periodic processing Process is activated by regular clock, e.g. once per second For each path P in the routing table (except directly connected interfaces) If current time < P's last update time + invalid time then continue with the next path P Remove P from routing table If P was the last route to D then Set metric for D to inaccessible Start holddown timer for D and Trigger an update else Recompute the best metric for D End of for For each destination D in the routing table If D's metric is inaccessible then Begin Clear all paths to D If current time >= D's last update time + flush time then Remove entry for D End End of for For each network interface I attached to the gateway R Recompute channel occupancy and error rate S If channel occupancy or error rate has changed, then recompute metrics End of for At intevals of broadcast time U Trigger update 23 Fig 7. Generate update Process is caused by "trigger update" For each network interface I attached to the gateway Create empty update message For each type of service S supported Use path/destination data for S For each destination D E If any paths to D have a next hop reached through I then continue with the next destination If any paths to D with minimal composite metric are already in the update message then continue with the next destination G Create an entry for D in the update message, using metric information from a path with minimal composite metric (see Fig. 8) End of for End of for J If there are any entries in the update message then send it out interface I End of for 24 Fig 8. Details of metric computations This section describes the procedure for computing metrics and hop counts from an arriving routing update. The input to this function is the entry for a specific destination in a routing update packet. The output is a vector of metrics which can be used to compute the composite metric, and a hop count. If this path is added to the routing table, the entire vector of metrics is entered in the table. The interface parameters used in the following definitions are those set when the gateway was initialized, for the interface on which the routing update arrived, except that the channel occupancy and reliability are based on a moving average of measured traffic through the interface. delay = delay from packet + interface topological delay bandwidth = min (bandwidth from packet, interface bandwidth) reliability = min (reliability from packet, interface reliability) channel occupancy = max (channel occupancy from packet, interface channel occupancy) Note that the original channel occupancy from the packet must be saved, since it will be needed to recompute the effective channel occupancy whenever the interface channel occupancy changes. The following are not part of the metric vector, but are also kept in the routing table as characteristics of the path: hop count = hop count from packet MTU = min (MTU from packet, interface MTU) remote composite metric - calculated from Eq. 1 using the metric values from the packet. That is, the metric components are those from the packet, and are not updated as shown above. Obviously this must be calculated before the adjustments shown above are done. composite metric- calculated from Eq. 1 using the metric values calculated as described in this section 25 This section describes the procedure for computing metrics and hop count for routing updates to be sent: This function determines the metric information and hop count to be put into an outgoing update packet. It is based on a specific path to a destination, if there are any usable paths. If there are no paths, or the paths are all upstream, the destination is called "inaccesible". If destination is inaccessible, this is indicated by using a specific value in the delay field. This value is chosen to be larger than the largest valid delay. For the IP implementation this is all ones in a 24-bit field. If destination is directly reachable through one of the interfaces, use the delay, bandwidth, reliability, and channel occupancy of the interface. Set hop count to 0. Otherwise, use the vector of metrics associated with the path in the routing table. Add one to the hop count from the path in the routing table. 6. Details of the IP implementation This section gives a brief description of the packet formats used by cisco's IGRP. IGRP is sent using IP datagrams with IP protocol 9 (IGP). The packet begins with a header. It starts immediately after the IP header. unsigned version: 4; /* protocol version number */ unsigned opcode: 4; /* opcode */ uchar edition; /* edition number */ ushort asystem; /* autonomous system number */ ushort ninterior; /* number of subnets in local net */ ushort nsystem; /* number of networks in AS */ ushort nexterior; /* number of networks outside AS */ ushort checksum; /* checksum of IGRP header and data */ For update messages, routing information follows immediately after the header. The version number is currently 1. Packets having other version numbers are ignored. The opcode is either 1 update 2 request This indicates the type of message. The format of the two message types will be given below. Edition is a serial number which is incremented whenever there is a change in the routing table. (This is done in those conditions in which the pseudo-code above says to trigger a routing update.) The edition number allows gateways to avoid processing updates containing information that they have already seen. (This is not currently implemented. That is, the edition number is generated correctly, but it is ignored on input. Because it is possible for packets to be dropped, it is not clear that the edition number is sufficient to avoid duplicate processing. It would be necessary to make sure that all of the 26 packets associated with the edition had been processed.) Asystem is the autonomous system number. In cisco's implementation a gateway can participate in more than one autonomous system. Each such system runs its own IGRP protocol. Conceptually, there are completely separate routing tables for each autonomous system. Routes that arrive via IGRP from one autonomous system are sent only in updates for that AS. This field allows the gateway to select which set of routing tables to use for processing this message. If the gateway receives an IGRP message for an AS that it is not configured for, it is ignored. In fact cisco's implementation allows information to be "leaked" from one AS to another. However I regard that as an administrative tool, and not part of the protocol. Ninterior, nsystem, and nexterior indicate the number of entries in each of the three sections of update messages. These sections have been described above. There is no other demarcation between the sections. The first ninterior entries are taken to be interior, the next nsystem entries as being system, and the final nexterior as exterior. Checksum is an IP checksum, computed using the same checksum algorithm as a UDP checksum. The checksum is computed on the IGRP header and any routing information that follows it. The checksum field is set to zero when computing the checksum. The checksum does not include the IP header, nor is there any virtual header as in UDP and TCP. 6.1 Requests An IGRP request asks the recipient to send its routing table. The request message has only a header. Only the version, opcode, and asystem fields are used. All other fields are zero. The recipient is expected to send a normal IGRP update message to the requester. 6.2 Updates An IGRP update message contains a header, followed immediately by routing entries. As many routing entries are included as will fit into a 1500-byte datagram (including IP header). With current structure declarations, this allows up to 104 entries. If more entries are needed, several update messages are sent. Since update messages are simply processed entry by entry, there is no advantage to using a single fragmented message rather than several independent ones. Here is the structure of a routing entry: uchar number[3]; /* 3 significant octets of IP address */ uchar delay[3]; /* delay, in tens of microseconds */ uchar bandwidth[3]; /* bandwidth, in units of 1 Kbit/sec */ uchar mtu[2]; /* MTU, in octets */ uchar reliability; /* percent packets successfully tx/rx */ uchar load; /* percent of channel occupied */ uchar hopcount; /* hop count */ The fields defined uchar[2] and uchar[3] are simply 16 and 24 bit binary integers, in normal IP network order. 27 Number defines the destination being described. It is an IP address. To save space, only the first 3 bytes of the IP address are given, except in the interior section. In the interior section, the last 3 bytes are given. For system and exterior routes, no subnets are possible, so the low-order byte is always zero. Interior routes are always subnets of a known network, so the first byte of that network number is supplied. Delay is in units of 10 microseconds. This gives a range of 10 microseconds to 168 seconds, which seems sufficient. A delay of all ones indicates that the network is unreachable. Bandwidth is inverse bandwidth in bits per sec scaled by a factor of 10e10. The range is from a 1200 bps line to 10 Gbps. MTU is in bytes. Reliability is given as a fraction of 255. That is, 255 is 100%. Load is given as a fraction of 255. Hopcount is a simple count. Because of the somewhat wierd units used for bandwidth and delay, some examples seem in order. These are the default values used for several common media. Delay Bandwidth Satellite 200,000 (2 sec) 20 (500 Mbit) Ethernet 100 (1 ms) 1,000 1.544 Mbit 2000 (20 ms) 6,476 64 Kbit 2000 156,250 56 Kbit 2000 178,571 10 Kbit 2000 1,000,000 1 Kbit 2000 10,000,000 6.3 Metric computations Here is a description of the way the composite metric is actually computed in cisco version 8.0(3). metric = [K1*bandwidth + (K2*bandwidth)/(256 - load) + K3*delay] * [(reliability + K4)/K5] If K5 == 0, then there is no reliability term. The default version of IGRP has K1 == K3 == 1, K2 == K3 == K5 == 0 28