Hot Potato vs Cold Potato Routing

Hot Potato and Cold Potato are two practices of exchanging traffic between BGP Peers. The difference in these two methods is in the approaches to how to carry traffic across the network.

Hot Potato vs Cold Potato discussions are only relevant in the scenarios where multiple traffic exchange (peering) points exist between two networks.

In our example, we will use the following diagrams depicting two networks spanning across North America and Europe.

We are interested in the traffic flow that is originated by Customer-NA connected to ISP-BLUE and is destined to Customer-EU connected to ISP-GREEN.

BGP Hot Potato vs Cold Potato Routing
BGP Network Interconnects

Continue reading “Hot Potato vs Cold Potato Routing”

Juniper High Availability Customer Site using AS-Prepend

Configuring Dual-CE BGP High Availability Site. This article provides Juniper Configuration Example that uses BGP AS-Prepend to identify primary and secondary paths.
Continue reading “Juniper High Availability Customer Site using AS-Prepend”

IPv4 Bogons

Introduction

BGP Best Practice Recommendation documented in RFC 7454 and discussed in “BGP Best Practices or Dissecting RFC 7454” article mandates the use of inbound prefix-list filtering to discard bogus route-advertisements to and from BGP peers. It is strongly recommended that you implement aforementioned filtering if you accept the full or partial BGP view from your peers.

You do not need to maintain inbound bogus route filtering if the only route you are planning to accept from your service providers is the default 0.0.0.0/0 prefix. In this scenario, you should configure an explicit prefix-filter permitting 0.0.0.0/0 route and rejecting all other advertisements.

Bogons, Martians, Bogus Advertisements

Over the years, networking professions have used various terms to refer to the same thing. These “bad” advertisements might be referred to as Bogons, Martian Lists, Bogus Advertisements, etc.

The current list is comprised of IP Blocks that are used for some kind of special use, such as RFC1918 space, Loopback block, etc. Sometime ago this list also included valid IPv4 prefixes that have not been allocated by The Internet Assigned Numbers Authority (IANA). IPv4 Space Exhaustion put stop to this. For the majority of ISPs and Enterprises, it is no longer feasible to include remaining unallocated blocks to the Bogons least, as this IPv4 space is small and constantly changing. The situation is very different when it comes to IPv6 space, and it will be discussed in IPv6 Martians article.

Importance of Bogons

The main reason for filtering-out Bogon advertisements is the Internet security. Bad things might begin to happen if you allow Bogon blocks to be accepted into your BGP domain. Let’s consider a few scenarios where hackers were able to advertise RFC1918 block to your network.

  • Firewall filters might consider RFC1918 blocks “trusted” space and permit dataflows that otherwise would be rejected
  • Spammers might send out email messages from servers in RFC1918 space, making it nearly impossible to track them back
  • Similar to Spam, DDoS Attacks from RFC1918 space are impossible to track
  • Your network might attract large volume of bogus traffic destined to RFC1918 space, such as portscans, vulnerability scans, etc

Current IPv4 Bogons List

Source: http://www.radb.net/query/?keywords=fltr-martian

  • 0.0.0/8
  • 10.0.0.0/8
  • 100.64.0.0/10
  • 127.0.0.0/8
  • 169.254.0.0/16
  • 172.16.0.0/12
  • 192.0.0.0/24
  • 192.0.2.0/24
  • 192.168.0.0/16
  • 198.18.0.0/15
  • 198.51.100.0/24
  • 203.0.113.0/24
  • 224.0.0.0/3

Cisco Configuration

ip prefix-list martians seq 10 deny 0.0.0.0/8 le 32

ip prefix-list martians seq 20 deny 10.0.0.0/8 le 32

ip prefix-list martians seq 30 deny 100.64.0.0/10 le 32

ip prefix-list martians seq 40 deny 127.0.0.0/8 le 32

ip prefix-list martians seq 50 deny 169.254.0.0/16 le 32

ip prefix-list martians seq 60 deny 172.16.0.0/12 le 32

ip prefix-list martians seq 70 deny 192.0.0.0/24 le 32

ip prefix-list martians seq 80 deny 192.0.2.0/24 le 32

ip prefix-list martians seq 90 deny 192.168.0.0/16 le 32

ip prefix-list martians seq 100 deny 198.18.0.0/15 le 32

ip prefix-list martians seq 110 deny 198.51.100.0/24 le 32

ip prefix-list martians seq 120 deny 203.0.113.0/24 le 32

ip prefix-list martians seq 130 deny 224.0.0.0/3 le 32

ip prefix-list martians seq 9999 permit 0.0.0.0/0 le 32




router bgp 111100

  ...

  neighbor 120.0.4.17 prefix-list martians in

 

Juniper Configuration

Set Format:

set policy-options policy-statement martians-ipv4  from route-filter 0.0.0.0/8 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 10.0.0.0/8 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 100.64.0.0/10 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 127.0.0.0/8 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 169.254.0.0/16 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 172.16.0.0/12 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 192.0.0.0/24 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 192.0.2.0/24 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 192.168.0.0/16 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 198.18.0.0/15 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 198.51.100.0/24 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 203.0.113.0/24 orlonger reject

set policy-options policy-statement martians-ipv4  from route-filter 224.0.0.0/3 orlonger reject

set policy-options policy-statement martians-ipv4  then accept




set protocols bgp group ebgp import martians-ipv4

Curly Braces format:

policy-statement martians-ipv4 {
 from {
  route-filter 0.0.0.0/8 orlonger reject;
  route-filter 10.0.0.0/8 orlonger reject;
  route-filter 100.64.0.0/10 orlonger reject;
  route-filter 127.0.0.0/8 orlonger reject;
  route-filter 169.254.0.0/16 orlonger reject;
  route-filter 172.16.0.0/12 orlonger reject;
  route-filter 192.0.0.0/24 orlonger reject;
  route-filter 192.0.2.0/24 orlonger reject;
  route-filter 192.168.0.0/16 orlonger reject;
  route-filter 198.18.0.0/15 orlonger reject;
  route-filter 198.51.100.0/24 orlonger reject;
  route-filter 203.0.113.0/24 orlonger reject;
  route-filter 224.0.0.0/3 orlonger reject;
  }
 then accept;
}

protocols {
 bgp {
   group ebgp {
    import martians-ipv4;
    ...
   } 
 }
}

 

BGP Best Practices or Dissecting RFC 7454

In this article, we will focus on the RFC 7547. This RFC covers BGP Operations and Security best current practices and needs to be understood and implemented by any organization running BGP in production.

Introduction

RFC 7547 recommendations can be split into the following categories:

  • BGP Session Protection
  • Prefix Filtering Recommendations
  • AS-Path Filtering Recommendations
  • Next-Hop Filtering
  • Optional BGP Community Scrubbing
  • Traffic Filtering Recommendations

In this article, we will use Roman Numerals (I, II, etc) to identify BGP protection mechanisms, Arabic Numerals (1,2, etc) to identify Traffic Filtering, Uppercase Letters (A, B, etc) to identify Prefix Filtering, and Lowercase Letter (a,b, etc) to identify AS-Path filtering and Greek  Letters (α,   β)  to identify BGP scrubbing.

Figure below shows depicts peering routers connected to upstream, private, IXP and downstream peers.

RFC7454 Peering Router
RFC7454 Peering Router

As most of the modern routers do, our sample router has a dedicated forwarding engine responsible for forwarding packets and a dedicated routing engine responsible for participating in routing protocols, building Routing Information Base (RIB) and Forwarding Information Base (FIB) tables. While actual vendors’ implementations will vary between routers’ models, best practices discussed in this article are generic enough to be applicable to the majority of vendors.

BGP Protection

Group of BGP Protection mechanisms is responsible for maintaining stability of BGP sessions, as well as providing anti-spoofing and bogus route-injection protection mechanisms. We will also add “maximum-prefix” protection mechanism to this category, as it helps to protect against operators’ mistakes.

RFC7454 BGP Protection
RFC7454 BGP Protection

I. GTSM (TTL Security)

GTSM – Generalized TTL Security Mechanisms, also known as TTL security, defined in RFC 5082. GTSM (TTL Security) is a mechanism that checks TTL value of incoming IP Packets in order to make sure they have not been spoofed. Directly connected BGP peers will set IP TTL value to 255, making it impossible to deliver spoofed IP with TTL=255 packets via non-directly connected interfaces. As per section 5.2 of RFC 7454 GTSM should be implemented.

Configuration Examples:

II. TCP-AO (TCP Authentication Option)

TCP-AO – TCP Authentication Option is a stronger protection mechanism than traditionally used MD5, it is described in RFC 5925. At some point, it is expected to replace MD5 for session protection. It has not been widely adopted due to the lack of implementation from equipment vendors.

Section 5.1 of RFC 7454 recommends, although does not require, leveraging either MD5 or TCP-AO for session protection.

No configuration examples due to lack of vendors’ implementation. 

III. MD5

MD5 – Protection of the TCP session header, described in RFC 2385. MD5 is a TCP session protection mechanism that has been available for many years and is supported by the vast majority of equipment manufacturers. It has become the de-facto standard for BGP session protection. Although it has been made obsolete by TCP-AO protection, it is still used for the majority of BGP peering sessions.

Configuration Examples (Simple Key and Key-Chains):

IV. Max-Prefix

Maximum-Prefix Limit is one of the commonly used safety mechanisms that will bring down BGP session if the number of routes advertised by the peer exceeds pre-configured limit. Section 8 of RFC 7454 provides the following recommendations:

  • From public and private peers, it is recommended to have the limit set to either a lower than the number of routes on the Internet, or to a specific number for each peer based on the advertised number of routes plus some headroom. From the author’s experience, setting the number to below the number of routes on the Internet is too risky and should be avoided. There have been situations where public and private peers would make an error and leak the entire BGP table to their peering partners, causing major network instability. Author prefers setting session reset limit to 2x the number of routes normally advertised by the peer and session warning limit to 1.5x number of routes. Your NOC should monitor logs for warning threshold violations and adjust limits accordingly.
  • From upstream, the number of routes should be set higher than the number of routes on the Internet, but not higher that the capabilities of your routers. For example, if FIB tables of your devices can support up to 1 Million IPv4 routes, you can set the limit to be 950,000 routes. While resetting BGP sessions with your upstream providers is never a good thing, damage caused by reset is much lower than that caused by FIB exhaustion. For more information, please refer to our article on BGP Table Size analysis (http://www.bgphelp.com/2017/01/01/bgpsize/).

MD5, TCP-AO and GTSM have to be configured on both sides of the BGP session. Max-Prefix can be configured on one side only.

Prefix Filtering

Prefix-filtering policies are responsible for discarding bogus route-advertisements to and from BGP peers. Examples of these bogus advertisements are prefixes from RFC1918 address space, to specific routes (>24), unallocated prefixes.

RFC7454 Prefix Filtering
RFC7454 Prefix Filtering

Route-filtering should be implemented on each BGP session maintained by the service provider:

  • A. Private/Public/Transit Inbound Prefix Filtering
  • B. Private/Public/Transit Outbound Prefix Filtering
  • C. Downstream Inbound Prefix Filtering
  • D. Downstream Outbound Prefix Filtering

A. Inbound Prefix Filtering from Private/Public/Transit Peers

RFC 7475 provides similar recommendations for route filtering from Upstream Providers (section 6.2.3) and route-filtering from private and public peers (section 6.2.1). Because of this, there is very little difference in filtering policies, allowing us to combine them in one recommendation.

As per Section 6.2.1.1.1 of RFC 7475, the following prefixes should not be accepted from peers

  • Special-Purpose Prefixes (RFC 7475 Section 6.1.1)
  • Unallocated Prefixes (RFC 7475 Section 6.1.2)
  • Prefixes that are too specific (RFC 7475 Section 6.1.3)
  • Prefixes belonging to the local AS (RFC 7475 Section 6.1.4)
  • IXP LAN Prefixes (RFC 7475 Section 6.1.5), other than authorized ASes (RFC 7475 Section 6.1.5)
  • The Default Route (RFC 7475 Section 6.1.6)

Section 6.2.1.1.2 of RFC 7475 also provides recommendations for “Strict” inbound filtering option, which we consider to be too risky and will not cover in this document.

B. Outbound Prefix Filtering towards Private/Public/Transit Peers

As per Section 6.2.1.2 of RFC 7475, the following prefixes should not be accepted from peers

  • Special-Purpose Prefixes (RFC 7475 Section 6.1.1)
  • Prefixes that are too specific (RFC 7475 Section 6.1.3)
  • IXP LAN Prefixes (RFC 7475 Section 6.1.5)
  • The Default Route (RFC 7475 Section 6.1.6)

You also need to make sure that only authorized prefixes (those advertised by your AS and downstream customers) are being sent.

C. Inbound Prefix Filtering from Customers

General recommendations provided in Section 6.2.2.1 of RFC 7475 state that “only customer prefixes SHOULD be accepted, all others SHOULD be discarded.” The list of allowed prefixes should be manually built by the network provisioner after validating that customer prefixes are indeed allocated to the client by IP address management authorities.

In some cases, if customer advertises too many prefixes or has BGP clients of their own, customer-specific filters can be replaced with generic filters previously described in “Inbound Filtering from Private/Public/Transit Peers” section of the paper.

D. Outbound Prefix Filtering towards Customers

Depending on the customer preferences, they might want to receive

  • The default route only
  • Full Internet routing table
  • Subset of the Full Internet table (e.g. only the routes received via public and private peers, but not the transit routes)
  • The default route in addition to the Full or Partial Internet view

Generic recommendation described in Section 6.2.2.2 of RFC 7454 states that the following prefixes should not be sent to the customer:

  • Special-Purpose Prefixes (RFC 7475 Section 6.1.1)
  • Prefixes that are too specific (RFC 7475 Section 6.1.3)
  • The Default Route (RFC 7475 Section 6.1.6), for those customers not willing to receive it

AS-Path Filtering

Section 9 of RFC 7454 provides a number of AS-Path Filtering recommendations that should be implemented on upstream/private/public peering sessions and customer sessions.

RFC7454 AS Path Filtering
RFC7454 AS Path Filtering

Similar to how we analyzed Prefix Filtering recommendations in the previous chapter, we will review AS-Path Filtering recommendations below.

a. Inbound AS-Path Filtering from Private/Public/Transit Peers

Section 9 of RFC 7454 recommends the following:

  • Private AS numbers should not be accepted, unless used for special purposes such as black-hole origination
  • AS Paths with the first AS number not the one of the peer should not be accepted, unless originated by IXP’s router server
  • Do not accept your own AS number in the AS path

b. Outbound AS-Path Filtering from Private/Public/Transit Peers

Section 9 of RFC 7454 recommends the following:

  • Do not originate prefixes with nonempty AS Paths, unless you intend to provide transit for these prefixes
  • Do not originate prefixes with upstream AS numbers in the AS Path, unless you intend to provide transit to these prefixes
  • Do not advertise Private AS Paths, unless there is a special “private” arrangement with your peers

c. Inbound AS-Path Filtering from Downstream Customers

Section 9 of RFC 7454 recommends the following:

  • Only accept 2-byte and 4-byte AS paths containing ASNs belonging to the customer.
  • If this is not possible, accept only path lengths relevant to the type of the customer, while discourage excessive prepending
  • Do not accept your own AS number in the AS path

d. Outbound AS-Path Filtering from Downstream Customers

  • Do not advertise Private AS Paths, unless there is a special “private” arrangement with your customers

Next-Hop Filtering

BGP can advertise prefixes with a third-party next hop, thus directing packets not to the peer announcing the prefix but somewhere else. This mechanism is commonly used at Internet Exchange Points, where prefixes will be announced by IXP’s route-server.

RFC7454 Next Hop Filtering
RFC7454 Next Hop Filtering

Section 10 of RFC 7545 recommends the following policies at IXP locations:

  • For direct peering (without router-server), apply inbound BGP policy that would set next-hop for the accepted prefix to BGP peer IP address
  • For indirect peering (with IXP’s route-server), accept next-hop attribute advertised by the route-server

BGP Community Scrubbing

Section 11 of RFC 7454 provides the following optional community scrubbing recommendations.

RFC7454 BGP Community Scrubbing
RFC7454 BGP Community Scrubbing
  • Ingress BGP peering policy applied to transit/public/private and downstream peers should remove all inbound communities with SP’s number in the high-order bits, except for the ones used for signaling (e.g. setting BGP Local Preference).
  • Ingress BGP Policy should not remove other communities, as those communities can be used to communicate with upstream providers.

Traffic Filtering

Section 4 of RFC 7454 provides basic recommendations when it comes to traffic filtering and BGP.

RFC7454 Traffic Filtering
RFC7454 Traffic Filtering

 

All packets destined to TCP Port 179 and not originated from addresses of configured BGP peers should be discarded. If supported, Control Plane ACL (point 3 on the diagram) should be used. If not supported, ACL applied to each peer-facing port (point 1) should be used.

If supported, BGP Rate-Limiting (point 4) should also be implemented, to make sure that the number of BGP packets per second does not exceed platform’s capability.

Generic Control Plane protection recommendations are out of RFC 7454 scope and are covered in RFC 6192.

BGP Types of Peering

Introduction

In this article, we will discuss various types of Internet peering. You need to have basic knowledge of BGP protocol to better understand this paper, so if you are not familiar with BGP, we suggest that you start with the following Wikipedia article: https://en.wikipedia.org/wiki/Border_Gateway_Protocol

As a peering administrator, you are responsible for selecting the best peering strategy for your company. In order to determine what’s best for your organization, you need to identify your peering goals. Very frequently, these goals might be at odds. Let’s start with reviewing possible peering objectives and then continue with a discussion on why it is difficult to satisfy all of these requirements at the same time.

Typical Service Provider would have the following peering objectives:

  • Achieve High Availability – no matter what happens, your network should be able to reach any Internet destination
  • Maintain Low Latency and Low Packet Loss – you should always try to pick the path with the lowest possible latency and minimum packet loss
  • Minimize Traffic Cost – achieve the best connectivity possible at the minimal cost possible
  • Maximize Revenue – this often means that you want to attract more customers’ traffic than your competitors

By going through the objectives list, it is clear that the low-cost goal is at odds with other stated objectives. To achieve the best connectivity and high availability, you’d need to peer with as many companies as possible, but peering costs money. At the same time, improved peering might lead to increased revenue, as your network will attract more traffic.

The reality of the situation is that you will need to find a compromise by determining the number and types of peering that is right for your company.

Let’s list the types of peering sessions and then reveal technical details associated with each of them:

  • Upstream, also known as Transit Peering
  • Private Peering
  • Public Peering
  • Downstream, typically Customer Peering

Figure below shows an ISP (labeled as “Your Company”) connected to different types of peering partners.

Types of Peering
Types of Peering

Upstream Connectivity / Transit Providers

Unless you work for the elite group of Tier 1 providers (https://en.wikipedia.org/wiki/Tier_1_network) you will always need to buy Internet Transit Service from one or more service providers. This Transit Connectivity is sold by Upstream providers, who will feed you the full Internet BGP view table and, at least in theory, will deliver your packets to any device on the Internet either over their own network, or via their partners and clients. Selecting the right upstream provider or group of upstream providers is one the most important decisions you’ll need to make while building your network. Reliability, Connectivity types, cost per Mb are just some of the factors that will influence this decision. We’ll talk about selecting the right Upstream later in this article.

By accepting the full BGP table from Transit provider, your routers’ routing tables will get populated with the information about each and every IPv4 (and possibly IPv6) prefix currently present on the Internet.
In return, you will advertise your locally-originated routes, as well as routes received from your BGP customers.

Upstream / Transit Peering
Upstream / Transit Peering

Most organizations will employ direct transport links with their Transit providers, although it is possible (but typically not cost-effective) to leverage physical transport provided by an Internet Exchange Point (IXP) for upstream connectivity.

Private Peering

Private peering is the type of peering where two parties establish BGP connectivity over direct transport link and exchange information about routes originated in their own and their customers’ networks. While most of private peering arrangements are settlement-free, meaning that companies do not pay each over to exchange traffic over private links, there are also cases where an ISP might refuse to establish settlement-free relationships with your company, but is willing to sell access to their customer base at a discount, as compared to buying full transit connectivity from that provider.

It is also important to remember that while the traffic exchange might be free, there will be a cost associated with the physical transport (e.g. 10GE link over DWDM), as well as the cost of a port on your router where this link will be terminated.

In some cases, it might be difficult to predict how much traffic you will exchange with specific peer before establishing direct peering relationships. Although various traffic analysis tools such as Arbor SP might provide you with an estimate, we find that these predictions are not always reliable.

When possible, you should start with establishing Public peering relationships with a prospective peer and, assuming the amount of traffic justifies this, later convert to the Private peering relationships.

Figure below depicts private peering relationships with “Your Company”. Depending on the size of the peer, you might receive from them anywhere from a few routes to tens of thousands of routes. Large number of routes does not necessary mean high volume of traffic. Big CDN provider with just a few prefixes can deliver much more traffic to your network, than an ISP with thousands of prefixes in some remote geography.

Private Peering
Private Peering

Public Peering

Public peering is a type of relationship where two companies exchange IP traffic via one of public Internet Exchange Peering Points (IXP). The main advantage of peering at IXP is the ability to establish sessions with a large number (often hundreds) of partners, without the need to build individual transport links with all these peers. While most of peering relationships at IXP’s are settlement-free, there is often an initial connectivity cost, as well as a monthly recurring cost charged for IXP connectivity. In addition to that, there is a cost associated with a transport link between your peering router and IXP port. In fact, IXP charge and the transport cost when added together, might exceed the cost of buying IP transport from one of the Transit Providers.

Public Peering
Public Peering

With this being said, it is always good to be aware of the peering options in your geography, as not being connected to large IXPs might put you at a competitive disadvantage.

List of Internet exchange points by size can be found here:

https://en.wikipedia.org/wiki/List_of_Internet_exchange_points_by_size

It is also important to note, that presence at an IXP does not automatically mean that you will be able to peer with all Exchange members.  While some IXP participants have open peering policy, meaning they will exchange traffic with any other IXP member, other organizations are more restrictive and you will need to negotiate peering relationships with them on a case-by-case basis.

Downstream (Customer) Peering

BGP peering with your customers, also known as Downstream peering, is the type of a relationship where your company performs the function of a Transit Provider. IP Prefixes received from downstream peers should be re-advertised to all your peers, including Public, Private, Transit, as well as your other BGP-speaking customers.

Now that you’ve been introduced to various types of peering, let us review a few use cases.

Case Study – Small ISP Startup

You were asked to recommend a peering and transit policy for a small regional Internet provider called “ISP Startup.”  This company operates in the country where two large incumbent providers control nearly 80% of the country’s user base. These incumbents buy transit connectivity from various Tier 1 ISPs. Incumbents peer with each other, but will not join settlement-free peering relationships with small local ISPs.

There is an Internet Exchange point in the country. Some Global Content Delivery Networks (CDN), small local ISPs and Enterprises are connected to this IXP.

At the moment, “ISP Startup” does not have any BGP clients, but plans to acquire them in the future. The current goal is to minimize the Internet transit cost, while providing the best possible service to end users.

Based on the information provided, our “ISP Startup” has the following connectivity options to consider:

  • Buy transit from “Incumbent A”
  • Buy transit from “Incumbent B”
  • Buy transit from Global Tier 1 providers used by one or both Incumbent ISPs
  • Buy transit from Global Tier 1 providers not used by Incumbent ISPs
  • Buy transit from Global Tier 2 / Tier 3 providers operating in the country
  • Connect to Internet Exchange Point and try to establish settlement-free sessions

Figure below depicts connectivity alternatives for the new ISP.

Small ISP Peering
Small ISP Peering

This use case will not be complete without some assumptions about transit costs.

Let’s use the following model:

Option Price per Gb/month Remarks
Incumbent A $200
Incumbent B $250
Tier 1 – A $180
Tier 1 – B $220 Used by Incumbents
Tier 1 – C $300
Tier 2 – A $140
Tier 2 – B $160
IXP $50 Will not provide transit

IXP is the cheapest option by far, but it is not a substitute for Transit Internet connectivity. It might be relatively inexpensive to connect to an IXP, but our “ISP Startup” may be disappointed by the amount of traffic exchanged over IXP links. While there are many contributing factors (a type of ISP’s own customer base, number and type of IXP participants), you should not expect to offload more than 30% of your traffic to IXP. In fact, this number might be significantly lower than that. Your next decision is to select one or more upstream providers. If you base your decision on cost, “Tier 2 – A” ISP is the winner. You would establish at least two redundant links to “Tier 2 – A”, and might build a non-redundant link to the IXP as shown below.

Small ISP - Single Upstream
Small ISP – Single Upstream

Various traffic flow scenarios under normal conditions are depicted below:

Small ISP - Single Upstream Traffic Flow
Small ISP – Single Upstream Traffic Flow

While this design allows you to keep the cost low, it has a few major shortcomings:

Small ISP - Multiple Upstreams
Small ISP – Multiple Upstreams
  • There is no upstream redundancy – failure of “Tier 2 – A” ISP would take your company off the air.
  • You customers might experience high latency while communicating to Incumbent’s clients, as they’d need to cross multiple networks
  • If your ISP Startup grows and you acquire BGP customers of your own, it will be difficult to attract transit traffic, as your network will be a few AS hops from the majority of Internet destinations.

Let’s look into an alternative where ISP Startup connects to “Incumbent A”, “Tier 1 – A” and IXP.

Link to “Incumbent A” will provide you with direct access to “Incumbent A’s” customer base, as well as with a short path to “Incumbent B’s” clients. IXP connection will help you to reach the remaining local ISPs and provide access to CDN networks. Direct “Tier 1” connection will give you access to the rest of the Internet.

If Tier 1 link were to fail, you would reroute your traffic to the Internet via Incumbent A. If links to “Incumbent A” or IXP were to fail, you would reroute via “Tier 1” ISP. In addition to that, it will be much easier to attract transit internet traffic to your AS, if you peer directly with one of the global Tier 1 providers.

Let us compare monthly costs, based on the assumption that your network needs 100Gb pipe, of which 10% can be offloaded to IXP, 20% is destined to Incumbent providers and the rest needs to go to the Internet.

Option 1:

IXP: 10Gb @ $50 = $500

Tier 2 – A: 90 Gb @ $140 = $12,600

Total: $13,100 per month

Option 2:

IXP: 10Gb @ $50 = $500

Incumbent A: 20Gb @ $200 = $4,000

Tier 1 – A: 70Gb @ @180 = $12,600

Total: $17,100 per month

As you can see, Option B is ~30% more expensive. You will need to decide, if increased redundancy and improved latency warrants this premium.

Case Study – Medium-Size ISP Operating in 3 Regions

In this case study, we will analyze the scenario of an ISP operating in 3 different geographical regions using one common AS Number. We’ll call these regions West – North – East, although in the real life they can represent three cities, countries or even continents.

Medium Size ISP
Medium Size ISP

Similar to the previous example, this Medium-Size ISP needs to decide on the best connectivity options, while delivering exceptional service to its customers at the lowest possible price points.

Let’s review Transit, Public and Private peering options.

Transit Peering

Because of the size of the company and its desire to attract BGP clients, our ISP is inclined to buy transit from Tier 1 ISPs only. It believes that sometime in the future it will be in the position to negotiate settlement-free peering with Global Tier 2 providers, making it not feasible to buy transit from one of them today.

When it comes to choosing an ISP, the first possible approach is to select three different transit ISPs, one per region.

Medium Size ISP - Upstream Option 1
Medium Size ISP – Upstream Option 1

The clear advantage of this approach is the resiliency of Global Internet connectivity. If one, or even two links to Tier 1 ISPs were to fail, traffic could always be rerouted via the remaining connections.

It is also believed, that direct connectivity to multiple Tier 1 ISPs would help you to attract Internet traffic from your own BGP clients, making your company more profitable.

Unfortunately, while this design might look very appealing at first, there are some major drawbacks you need to consider:

  • You might not be able to negotiate an attractive per Mb transit rate, as your per-Tier 1 ISP traffic commitment in each of the regions will be relatively low.
  • Sub-optimal routing and possible high latency that you are likely to experience.  Let’s explain technical reasons to why this might happen.

Let us consider a scenario where ISP-B’s client residing in the same geography as the “East” section of your network wants to communicate with you client. You’d achieve the lowest latency, if traffic from ISP-B would pass via ISP-C and enter your network as shown below:

Medium Size ISP - Suboptimal Upstream Peering
Medium Size ISP – Suboptimal Upstream Peering

Unfortunately, this is unlikely to happen. For redundancy reasons, you should be advertising your “East” routes to “ISP-B” via the “West” peering point. And because the shortest AS-Path wins, default traffic flow will be as shown below:

Medium Size ISP - Suboptimal Upstream Peering
Medium Size ISP – Suboptimal Upstream Peering

While this type of traffic flow might be acceptable, if your North / West / East regions are just a few miles away, it may pose a problem if there is a significant distance between them. Due to the speed of light limitations, distance always translates into packet latency.

You can try to manipulate your BGP advertisements towards upstream providers, setting AS-Prepend or sending BGP communities in the attempt to prevent this sub-optimal traffic flow from happening, but you are unlikely to find an acceptable remedy for this scenario. BGP protocol likes shortest AS Paths and ISPs prefer to send traffic to their directly connected clients instead of passing through a third party.

Instead of connecting to three different Service Providers across three geographical regions, you might be better off by picking just two transit providers, but connecting to both of these providers in all three geographies.

Medium Size ISP - Upstream Peering
Medium Size ISP – Upstream Peering

Under normal conditions, you’d see the optimal traffic flow between the Global Internet and any of your regions. If one of the links were to fail, traffic to that ISP would reroute via two remaining links. This will increase end-to-end latency for some destinations, but this tradeoff should be acceptable.

Public Peering

As described in the “Pubic Peering” section of this paper, IXP locations are great places to establish direct connectivity to a large number of ISPs, Enterprises and CDN providers. As such, it is encouraged to be present at the public exchange points within ISP’s operating geography, and if cost permits, outside of operational boundaries. For example, service provider operating in Portugal, Spain and France should consider connecting to the largest European Peering points in Germany (DE-CIX), Amsterdam (AMX-IX) and London (LINX).

When establishing peering relationships, ISP should consider its own geography as well as peer’s geographical presence.

Figure below depicts potential peering scenario, where peering relationships could be established with “Public Peer – A” and “Public Peer – B”.

“Public Peer – A” operates in West and North regions, as well as some other geographies, not covered by you.

“Public Peer – B” operates in North and East regions, and also some other non-overlapping regions.

It should be no brainer to establish peering with “Public Peer – B” via “IXP-North” and “IXP-East”, as you would achieve optimal traffic flow between you two companies. Traffic originated from the West region will leverage IXP-North / IXP-East exchange points. This is acceptable as “Public Peer-B” is not present in the West.

Medium Size ISP - Upstream Peering
Medium Size ISP – Upstream Peering

Decision to peer with “Public Peer – A” is more difficult. You can only peer at “IXP-West”, as “Public Peer – A” is not present at other exchange points. This will lead to sub-optimal traffic flow between your “North” customers and “Public Peer-A” customers located in the North region. You are almost guaranteed to achieve better performance by sending the North traffic via one of upstream providers. Recommended solution to this problem is to advertise a subset of your routes to “Public Peer – A”. Instead of sending all the routes originated by your company and your BGP downstream customers, only advertise the routes originated in the “West” region. The same should apply to the routes advertised by “Public Peer – A”. Request your partner to limit their advertisement to their Western routes. Use your transit provider to exchange traffic between “Peer-A” Northern region and your North and East areas.

Private Peering

Most of the service providers start their peering relationships at IXP and upon achieving certain traffic volume might later switch to a private peering arrangement. By switching to private links and bypassing IXP, they can both improve network availability and decrease traffic cost. Peering recommendations covered in Transit and Public sections of this document are also applicable to private peering arrangements. If companies operate in the same geographical regions, they should establish peering sessions in as many points as possible in order to minimize end-to-end latency.

It is not uncommon to see a connectivity scenario, where two companies leverage private connectivity arrangements in some areas, while relying on public peering in other areas. Even after building direct links to a peering partner, you can still maintain BGP sessions at public peering points, diversifying your connectivity. Obviously, you’ll need to manipulate BGP attributes to make sure that private links are preferred over public exchanges. Next diagram depicts such hybrid scenario.

Medium Size ISP - Private Peering
Medium Size ISP – Private Peering

Private peering links were established in the East and West regions. In the West region, companies decided to preserve existing public peering relationships to maintain direct connectivity in case of the private link failure. Direct peering in the North was considered unfeasible due to low traffic volume. As such, two companies rely on “IXP-North” for local traffic exchange.

One final word of caution: when it comes to private connectivity – make sure you properly size your links. It is not uncommon to run into situations where direct private peering might become harmful. Let’s illustrate this with an example:

Our Medium-size ISP has two 100GE links per region to two transit providers. There are also 10GE links to IXP-West, IXP-North, IXP-West. While peering in these locations, Company XYZ was identified as a candidate for private peering connectivity. Netflow data shows that during peak hours, 300Mb/sec of traffic is being exchanged between the two companies. As a result, it is decided to build direct 1GE links in all three geographic regions. Everything works great until Company XYZ releases a new version of their software, and many customers on the Internet decide to download it at the same time. This causes a major congestion on private 1GE links. If companies were not to switch to the private peering and leveraged 10GE IXP connections instead, they would have easily coped with this sudden traffic increase.