On February 24, 2008, at 18:47 UTC, Pakistan Telecom (AS17557) started announcing a more specific route to YouTube’s IP prefix: 208.65.153.0/24. Within minutes, YouTube traffic from around the world was being redirected to Pakistan. The Pakistan government had ordered the ISP to block YouTube domestically, but a configuration error caused the route to propagate globally through PCCW Global (AS3491), their upstream provider.

YouTube engineers responded approximately 80 minutes later by announcing even more specific routes (/25 prefixes) to reclaim their traffic. By 21:01 UTC, the hijack was over. But for more than two hours, a single misconfiguration in one country had effectively stolen one of the world’s most popular websites.

This wasn’t an isolated incident. On October 4, 2021, a configuration error on Facebook’s backbone routers caused them to withdraw their BGP routes entirely. Facebook, Instagram, and WhatsApp disappeared from the Internet for six hours. Their own engineers couldn’t access the network to fix it because the internal systems relied on the same infrastructure.

The common thread: Border Gateway Protocol (BGP), the routing protocol that holds the Internet together, operates on a foundation of trust with no built-in mechanism to verify that trust.

The Protocol That Routes Everything

BGP is a path-vector routing protocol, a variant of distance-vector algorithms designed for networks that span organizational boundaries. Unlike interior gateway protocols such as OSPF or IS-IS that operate within a single organization, BGP handles routing between autonomous systems (AS)—independent networks with their own routing policies.

The current version, BGP-4, is specified in RFC 4271 (2006), though BGP-4 itself has been running the Internet since 1994. As of January 2025, the global BGP routing table contains approximately 996,000 IPv4 prefixes and 210,000 IPv6 prefixes, representing over 76,000 autonomous systems.

Each autonomous system is assigned a unique Autonomous System Number (ASN). Cloudflare operates AS13335, Amazon uses AS16509, and Google runs AS15169. These numbers serve as identifiers in BGP, allowing networks to construct a map of Internet topology.

BGP speakers (routers running BGP) exchange routing information via TCP connections on port 179. Four message types govern these exchanges:

  • OPEN: Initiates a BGP session, exchanging version, ASN, hold time, and capabilities
  • UPDATE: Advertises or withdraws routes, including path attributes
  • KEEPALIVE: Maintains the session when no updates are sent
  • NOTIFICATION: Reports errors and terminates sessions
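
Every one of these messages starts with the same fixed header, which makes the framing easy to illustrate. The following is a minimal sketch, assuming the 19-byte header layout and type codes from RFC 4271; the helper names `parse_header` and `build_keepalive` are invented for this example and do not come from any real BGP library.

```python
import struct

# Message type codes from RFC 4271, section 4.1.
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
HEADER_LEN = 19          # 16-byte marker + 2-byte length + 1-byte type
MAX_MESSAGE_LEN = 4096   # upper bound for a classic BGP-4 message
MARKER = b"\xff" * 16    # the marker field is all ones in BGP-4


def parse_header(data: bytes) -> tuple[str, int]:
    """Return (message type name, total message length) for one BGP header."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 bytes for a BGP header")
    # Network byte order: 16 raw bytes, unsigned 16-bit length, unsigned 8-bit type.
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != MARKER:
        raise ValueError("marker is not all ones")
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError(f"implausible message length {length}")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type {msg_type}")
    return MESSAGE_TYPES[msg_type], length


def build_keepalive() -> bytes:
    """A KEEPALIVE carries no body: it is just the header with type 4."""
    return struct.pack("!16sHB", MARKER, HEADER_LEN, 4)


print(parse_header(build_keepalive()))
```

A real speaker would read the length field first and then keep reading from the TCP stream until the whole message has arrived; this sketch only covers the header itself.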

When a BGP speaker learns multiple paths to the same destination, it must choose one. This decision process—the best path selection algorithm—is where BGP’s power and complexity converge.

The Twelve-Step Decision

BGP doesn’t use a simple metric like “shortest path.” Instead, it applies a sequence of comparison rules, moving to the next step only when the current step produces a tie. The algorithm (for Cisco routers, though implementations vary slightly) proceeds as follows:

  1. Weight (Cisco-specific): Locally significant, higher wins
  2. Local Preference: Higher wins, propagated within an AS
  3. Locally Originated: Prefer routes originated locally
  4. AS Path Length: Shorter wins
  5. Origin Type: IGP is preferred over EGP, and EGP over Incomplete
  6. Multi-Exit Discriminator (MED): Lower wins
  7. eBGP over iBGP: Prefer external routes
  8. Lowest IGP Metric to Next Hop: Prefer closest egress
  9. Oldest Route: Among external paths, prefer the one received first (the most stable)
  10. Router ID: Lowest wins (tiebreaker)
  11. Cluster List Length: Shorter wins
  12. Lowest Neighbor Address: Final tiebreaker

The order matters enormously. LOCAL_PREF dominates AS_PATH. You can manipulate traffic by setting a higher LOCAL_PREF, and no amount of AS path prepending will overcome it.
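
One way to picture the “next step only on a tie” behaviour is as a lexicographic comparison of tuples. The sketch below is a deliberate simplification rather than any vendor’s implementation: it covers only a subset of the steps (no weight, locally originated, route age, or cluster list), compares MED across all candidates, and uses a hypothetical `Route` record with made-up defaults.

```python
from dataclasses import dataclass

# Lower rank is better: IGP beats EGP beats Incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    """Hypothetical, trimmed-down view of one candidate path."""
    name: str
    local_pref: int = 100
    as_path: tuple[int, ...] = ()
    origin: str = "igp"
    med: int = 0
    is_ebgp: bool = True
    igp_metric: int = 0
    router_id: int = 0


def preference_key(route: Route) -> tuple:
    """Smaller tuple means more preferred.

    Python compares tuples element by element, so a later field is only
    consulted when every earlier field ties, mirroring the step order.
    """
    return (
        -route.local_pref,           # higher LOCAL_PREF wins
        len(route.as_path),          # shorter AS_PATH wins
        ORIGIN_RANK[route.origin],   # IGP < EGP < Incomplete
        route.med,                   # lower MED wins (simplified: always compared)
        0 if route.is_ebgp else 1,   # eBGP over iBGP
        route.igp_metric,            # closest egress
        route.router_id,             # lowest router ID as tiebreaker
    )


def best_path(routes: list[Route]) -> Route:
    return min(routes, key=preference_key)


short = Route("short-path", local_pref=100, as_path=(174, 13335))
prepended = Route("preferred-but-long", local_pref=200,
                  as_path=(3356, 3356, 3356, 174, 13335))
print(best_path([short, prepended]).name)  # preferred-but-long
```

The route with the higher LOCAL_PREF wins even though its AS path is more than twice as long, because path length is never consulted once LOCAL_PREF has broken the tie.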

The AS_PATH attribute records every autonomous system a route announcement traverses. Each AS prepends its own ASN when it passes the route to an external neighbor: if AS13335 announces a route to AS174 (Cogent), which then announces it to AS3356 (Level3), AS3356 receives the AS_PATH 174 13335, and any network AS3356 passes the route on to sees 3356 174 13335. This attribute serves two purposes: preventing routing loops (if a router sees its own ASN in the path, it rejects the route) and influencing path selection (shorter paths are preferred).
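
Both roles of AS_PATH fit in a few lines. This is a sketch under simplifying assumptions: paths are plain tuples of ASNs, AS sets and confederation segments are ignored, and the function names `advertise_ebgp` and `accept` are invented for illustration.

```python
def advertise_ebgp(as_path: tuple[int, ...], local_asn: int,
                   prepend: int = 0) -> tuple[int, ...]:
    """Path as sent to an external neighbor.

    The sender puts its own ASN on the front once; `prepend` adds extra
    copies, which is how AS path prepending makes a path look longer.
    """
    return (local_asn,) * (1 + prepend) + tuple(as_path)


def accept(as_path: tuple[int, ...], local_asn: int) -> bool:
    """Loop prevention: reject any path that already contains our own ASN."""
    return local_asn not in as_path


# AS13335 originates the route, AS174 passes it on, then AS3356 does.
path = advertise_ebgp((), 13335)
path = advertise_ebgp(path, 174)
print(path)                          # (174, 13335)
path = advertise_ebgp(path, 3356)
print(path)                          # (3356, 174, 13335)
print(accept(path, 174))             # False: AS174 would see a loop
```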

The Longest Prefix Match Override

Path attributes only break ties between routes to the same prefix. Announcements with different prefix lengths are separate routes, and forwarding follows the longest prefix match rule: a router holding routes for both 208.65.152.0/22 and 208.65.153.0/24 will send traffic destined for 208.65.153.x to the more specific /24 route, regardless of any BGP attributes.

This is why YouTube’s countermeasure worked. By announcing /25 prefixes (208.65.153.0/25 and 208.65.153.128/25), they created routes more specific than Pakistan Telecom’s /24, overriding the hijacked route on every router that accepted the /25s.

It’s also why prefix hijacking is effective: announce a more specific route than the legitimate owner, and you capture their traffic.
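
The rule is easy to reproduce with Python’s standard `ipaddress` module. The sketch below models a forwarding table as a plain dictionary, a deliberate simplification of what a router does in hardware; the next-hop labels are just descriptive strings, and `longest_prefix_match` is a name chosen for this example.

```python
import ipaddress


def longest_prefix_match(table: dict[str, str], destination: str):
    """Return (prefix, next_hop) for the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network, next_hop))
    if not matches:
        return None
    # The route with the longest prefix length is the most specific one.
    network, next_hop = max(matches, key=lambda m: m[0].prefixlen)
    return str(network), next_hop


table = {
    "208.65.152.0/22": "YouTube",
    "208.65.153.0/24": "Pakistan Telecom",
}
print(longest_prefix_match(table, "208.65.153.238"))
# ('208.65.153.0/24', 'Pakistan Telecom')

# YouTube's countermeasure: two /25s that together cover the hijacked /24.
table["208.65.153.0/25"] = "YouTube"
table["208.65.153.128/25"] = "YouTube"
print(longest_prefix_match(table, "208.65.153.238"))
# ('208.65.153.128/25', 'YouTube')
```

Addresses in the rest of the /22 were never affected: a lookup for 208.65.152.10 matches only the /22 in both versions of the table.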

The Economics of Peering and Transit

BGP doesn’t just route packets—it encodes business relationships. When two autonomous systems connect, they establish one of three relationship types:

Transit: One network pays another to carry traffic to the rest of the Internet. A small ISP might pay a larger ISP for transit service. The transit provider announces the customer’s routes to the entire Internet and announces full routing tables to the customer.

Peering: Two networks interconnect to exchange traffic between their respective customers, typically without payment. Both parties benefit from reduced transit costs and improved performance. Peering can be public (at Internet Exchange Points) or private (direct interconnection).

Settlement-Free Peering: A specific type of peering where neither party pays the other, typically between networks of similar size and traffic ratios. The “tier 1” networks—approximately 15 globally including Lumen, Cogent, and Verizon—peer with each other settlement-free and can reach the entire Internet without paying transit.

These relationships influence route announcements through BGP communities and filtering policies. A network might tag routes learned from a transit provider with a community that prevents announcement to other transit providers, so that it never carries traffic between its providers for free.

The Valley-Free Routing Constraint

Internet routing follows an informal rule: traffic flows “up” the hierarchy (from customer to provider), then across at the top (between peers), then “down” (from provider to customer). A path that goes up, then down, then up again forms a “valley”—an indication that someone is providing free transit.

Route leaks occur when this constraint is violated. In 2015, Telekom Malaysia accidentally announced 170,000 routes learned from one provider to another provider, briefly rerouting traffic through Malaysia. Similar incidents happen regularly: some intentional, most accidental.
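
The valley-free rule can be checked mechanically once each hop is labelled with its business relationship. The following sketch assumes those labels are already known (in practice they have to be inferred, since BGP does not carry them); the labels `"up"`, `"peer"`, and `"down"` and the function name are conventions invented for this example.

```python
def is_valley_free(steps: list[str]) -> bool:
    """Check a path described hop by hop.

    "up"   = customer hands the route to its provider
    "peer" = crossing a peering link
    "down" = provider hands the route to its customer

    A valid path climbs zero or more times, crosses at most one peering
    link, then only descends.
    """
    phase = 0  # 0 = still climbing, 1 = crossed a peering link, 2 = descending
    for step in steps:
        if step == "up":
            if phase != 0:
                return False   # climbing again after the top: a valley
        elif step == "peer":
            if phase != 0:
                return False   # a second peering link, or peering after descent
            phase = 1
        elif step == "down":
            phase = 2
        else:
            raise ValueError(f"unknown step {step!r}")
    return True


print(is_valley_free(["up", "up", "peer", "down"]))  # True
print(is_valley_free(["down", "up"]))                # False: a route leak
```

The second case is the shape of a typical leak: a customer learns a route from one provider and passes it up to another, offering transit it is not being paid for.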

Route Hijacking: When Trust Fails

BGP’s fundamental vulnerability stems from a design choice made in 1989: trust. When a BGP speaker announces a route, the receiver has no mechanism to verify that the announcer is authorized to do so. The system assumes that networks will announce only their own address space and will filter what they receive.

This assumption fails regularly:

1997 (AS7007): A misconfigured router in Florida announced 72,000 routes, redirecting a significant portion of Internet traffic through a single T1 line.

2008 (Pakistan Telecom): The YouTube incident described earlier.

2010 (China Telecom): For about 18 minutes, China Telecom announced approximately 37,000 routes, diverting traffic destined for U.S. government and military sites through China.

2018 (Amazon Route 53 Hijack): An attacker announced the IP prefix for Amazon’s DNS service, redirecting traffic for MyEtherWallet.com and stealing approximately $150,000 in cryptocurrency.

2021 (Facebook Outage): A configuration error withdrew Facebook’s routes entirely, making their services unreachable for six hours.

Each incident exploited the same weakness: BGP has no built-in authentication. A route announcement is accepted simply because it was received.

The Anatomy of a Hijack

A prefix hijack involves three elements:

  1. Announcement: The attacker announces a route to a prefix they don’t own
  2. Propagation: The route spreads through the Internet
  3. Selection: Other networks select the hijacked route over the legitimate one

More specific announcements almost always win. If an attacker announces a /24 carved out of a legitimate /16, they capture traffic to that /24. This variant is called a “subprefix hijack.”

The attacker can then:

  • Blackhole traffic: Drop packets, causing a denial of service
  • Intercept traffic: Forward traffic to the legitimate destination after inspection
  • Impersonate services: Host fake versions of websites

Detection is possible by monitoring BGP announcements for unexpected origins. Services like RIPE NCC’s Routing Information Service and BGPmon track announcements globally. But detection is reactive: by the time an alert fires, the traffic has already been diverted.
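
The core of such monitoring is a comparison between what is announced and what is expected. Here is a minimal sketch, assuming the announcements have already been collected from some feed; it does not use the API of any real monitoring service, the `EXPECTED_ORIGINS` table is a hand-written assumption, and AS36561 (YouTube’s ASN at the time of the 2008 incident) does not appear in the text above.

```python
import ipaddress

# Assumed watch list: prefix -> set of ASNs allowed to originate it.
EXPECTED_ORIGINS = {"208.65.152.0/22": {36561}}


def check_announcement(prefix: str, origin_asn: int,
                       expected: dict[str, set[int]] = EXPECTED_ORIGINS) -> str:
    """Classify one observed announcement against the watch list."""
    announced = ipaddress.ip_network(prefix)
    for watched, origins in expected.items():
        watched_net = ipaddress.ip_network(watched)
        if announced.version != watched_net.version:
            continue  # never compare IPv4 with IPv6
        if announced.subnet_of(watched_net):
            if origin_asn in origins:
                return "ok"
            kind = "exact-prefix" if announced == watched_net else "subprefix"
            return f"ALERT: {kind} announcement of {announced} by AS{origin_asn}"
    return "unmonitored"


print(check_announcement("208.65.152.0/22", 36561))
# ok
print(check_announcement("208.65.153.0/24", 17557))
# ALERT: subprefix announcement of 208.65.153.0/24 by AS17557
```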

RPKI: Cryptographic Attestation for Routes

Resource Public Key Infrastructure (RPKI), defined in RFC 6480, provides a cryptographic mechanism to verify route origin authorization. The system works through signed objects called Route Origin Authorizations (ROAs).

A ROA cryptographically binds an IP prefix to an originating ASN. If Cloudflare (AS13335) wants to announce 104.16.0.0/13, they create a ROA stating “AS13335 is authorized to announce 104.16.0.0/13.” This ROA is signed using a certificate derived from their Regional Internet Registry (RIR).

When a BGP router receives a route announcement, it can query an RPKI validator to check whether a valid ROA exists. The validator returns one of three states:

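  • Valid: A ROA covers the prefix, and both the originating AS and the prefix length (up to the ROA’s maximum length) match
  • Invalid: The prefix is covered by at least one ROA, but none matches the originating AS and prefix length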
  • NotFound: No ROA covers this prefix

This doesn’t prevent hijacking entirely. RPKI only validates route origin, not path. An attacker can still mount a “forged-origin” hijack by announcing a path that ends in the legitimate origin AS, and a prefix that lacks a ROA can be originated by anyone without ever being marked Invalid.
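
The three validation states and the origin-only limitation can both be seen in a short sketch. It follows the general shape of the origin validation procedure in RFC 6811 but is not a validator: no signatures are checked, the `ROA` record is a simplified stand-in, and the maximum length of 13 for Cloudflare’s prefix is a hypothetical value chosen for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ROA:
    """Simplified stand-in for a validated ROA payload."""
    prefix: str
    max_length: int
    asn: int


# Hypothetical ROA set; the max_length value is an assumption.
ROAS = [ROA("104.16.0.0/13", 13, 13335)]


def validate(prefix: str, origin_asn: int, roas: list[ROA] = ROAS) -> str:
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True  # at least one ROA covers this prefix
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "Valid"
    return "Invalid" if covered else "NotFound"


def validate_path(prefix: str, as_path: tuple[int, ...]) -> str:
    """Only the last ASN in the path (the claimed origin) is examined."""
    return validate(prefix, as_path[-1])


print(validate("104.16.0.0/13", 13335))   # Valid
print(validate("104.16.0.0/13", 64512))   # Invalid: wrong origin
print(validate("104.16.8.0/24", 13335))   # Invalid: longer than max_length
print(validate("192.0.2.0/24", 64512))    # NotFound
# A forged-origin path from AS64512 still validates, because the path ends in AS13335.
print(validate_path("104.16.0.0/13", (64512, 13335)))  # Valid
```

The last call is the gap that path validation is meant to close: nothing in the origin check notices that AS64512 has inserted itself in front of the legitimate origin.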

As of 2025, approximately 50% of the IPv4 routing table has RPKI coverage. The percentage continues to grow as more networks adopt the technology and RIRs provide tooling for ROA management.

The Deployment Challenge

RPKI adoption faces practical hurdles:

Legal concerns: In the ARIN region, the RIR requires indemnification before signing resources. Some organizations are reluctant to cede control over their routing to a third party.

Operational risk: Incorrect ROAs can cause legitimate routes to be marked invalid. Networks often prefer “fail open” policies that reject only Invalid routes and continue to accept NotFound ones.

Coverage gaps: Approximately half of announced prefixes lack ROAs. Routes in NotFound state cannot be validated.

Path validation missing: RPKI doesn’t verify that the AS path is accurate, only that the origin AS is authorized.

Despite these challenges, RPKI represents the most practical improvement to BGP security available today. Networks including Cloudflare, Google, and Amazon have deployed RPKI-based filtering, and the percentage of filtered routes grows annually.

Convergence and the Speed of Trust

When a route changes, BGP must propagate that change globally. This process, called convergence, can take minutes—far slower than interior routing protocols.

Several factors contribute to slow convergence:

MinRouteAdvertisementInterval (MRAI): BGP implementations typically wait 30 seconds (for eBGP) between sending updates to the same peer. This rate-limiting prevents update storms but delays propagation.

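Path exploration: When a route is withdrawn, routers do not give up on the prefix at once. Each falls back to its next-best alternate path, which is often stale and gets withdrawn in turn, so the network works through successively longer paths before settling. Each step triggers more processing and more updates.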

Implementation variations: Different router vendors implement BGP differently, with varying timers and processing priorities.

Research by Labovitz et al. in 2000 found that Internet-wide convergence after a route withdrawal typically takes 2-3 minutes. More recent measurements suggest some improvement, but convergence times exceeding 30 seconds remain common.

This latency matters for applications. During convergence, traffic may be dropped (if no valid path exists) or routed through suboptimal paths. For voice over IP or real-time applications, even seconds of disruption can be noticeable.

Route Flap Damping: A Controversial Mitigation

Route flap damping suppresses unstable routes—those that repeatedly announce and withdraw—to reduce BGP processing load. The mechanism assigns a penalty each time a route flaps; when the penalty exceeds a threshold, the route is suppressed for a period.

However, route flap damping can cause legitimate routes to be suppressed during legitimate topology changes. When a fiber cut causes a route to flap, damping might suppress that route for hours after connectivity is restored. Modern best practices recommend disabling or carefully tuning flap damping.
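
The penalty mechanism is an exponential decay with two thresholds, which a few lines of arithmetic can illustrate. The parameter values below (1000 per flap, suppress at 2000, reuse at 750, 15-minute half-life) are commonly cited vendor defaults, treated here purely as assumptions; real deployments tune them, and this sketch ignores the maximum suppress time.

```python
import math

# Assumed parameters; actual values are configurable and vary by platform.
PENALTY_PER_FLAP = 1000.0
SUPPRESS_LIMIT = 2000.0
REUSE_LIMIT = 750.0
HALF_LIFE_MIN = 15.0


def decayed(penalty: float, minutes: float) -> float:
    """The penalty halves every HALF_LIFE_MIN minutes."""
    return penalty * 0.5 ** (minutes / HALF_LIFE_MIN)


def penalty_after(flap_times_min: list[float]) -> float:
    """Penalty right after the last flap in a sorted list of flap times."""
    penalty, last = 0.0, None
    for t in flap_times_min:
        if last is not None:
            penalty = decayed(penalty, t - last)
        penalty += PENALTY_PER_FLAP
        last = t
    return penalty


def minutes_until_reuse(penalty: float) -> float:
    """Time for the penalty to decay to the reuse threshold."""
    if penalty <= REUSE_LIMIT:
        return 0.0
    return HALF_LIFE_MIN * math.log2(penalty / REUSE_LIMIT)


p = penalty_after([0, 1, 2])            # three flaps, one minute apart
print(p > SUPPRESS_LIMIT)               # True: the route is suppressed
print(round(minutes_until_reuse(p)))    # minutes before it is usable again
```

Under these assumed parameters, three flaps in two minutes keep the route suppressed for roughly half an hour after the link has stabilised, which is the behaviour that makes damping controversial.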

Route Reflectors and the iBGP Scaling Problem

Within an autonomous system, BGP routers running interior BGP (iBGP) must be fully meshed—each router must peer with every other router. This creates a scalability problem: a network with n iBGP routers requires $\frac{n(n-1)}{2}$ sessions. At 100 routers, that’s 4,950 sessions; at 1,000 routers, nearly 500,000.

Route reflectors solve this by creating a hub-and-spoke topology. A route reflector (RR) receives routes from its clients and reflects them to other clients. Only the RR requires sessions with all routers; the routers themselves need only connect to the RR.

The tradeoff is a single point of failure. Most networks deploy multiple RRs for redundancy, typically in a cluster configuration where all RRs receive the same routes and clients connect to all RRs in the cluster.
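
The session counts are simple enough to compute directly. In this sketch the route reflector formula assumes a single flat cluster in which every client peers with every reflector and the reflectors are fully meshed with each other; both function names are invented for the example.

```python
def full_mesh_sessions(n: int) -> int:
    """Sessions needed when every iBGP router peers with every other one."""
    return n * (n - 1) // 2


def route_reflector_sessions(clients: int, reflectors: int = 1) -> int:
    """Each client peers with each reflector; reflectors mesh among themselves."""
    return clients * reflectors + full_mesh_sessions(reflectors)


print(full_mesh_sessions(100))            # 4950
print(full_mesh_sessions(1000))           # 499500
print(route_reflector_sessions(98, 2))    # 197
```

With two redundant reflectors, the same 100 routers need 197 sessions instead of 4,950, and the count grows linearly with the number of clients rather than quadratically.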

Hierarchical route reflection—RRs that peer with other RRs—enables even larger networks. A two-tier hierarchy can scale to tens of thousands of iBGP speakers.

Traffic Engineering with BGP

BGP’s path selection algorithm provides multiple levers for traffic engineering:

AS Path Prepending: Repeating your ASN in the AS_PATH makes the path appear longer, causing other networks to prefer alternate paths. Used to de-preference a backup link.

Local Preference: The most powerful attribute within an AS. Setting a higher LOCAL_PREF on a route causes all routers in the AS to prefer that path for egress traffic.

Multi-Exit Discriminator (MED): Hints to neighboring ASes about preferred entry points. Lower MED is preferred. By default, MED is compared only between routes learned from the same neighboring AS.

BGP Communities: Optional transitive attributes that tag routes with additional information. Common uses include:

  • NO_EXPORT: Don’t announce outside this AS (or confederation)
  • NO_ADVERTISE: Don’t announce to any peer
  • LOCAL_AS (NO_EXPORT_SUBCONFED): Don’t announce outside the local member AS, even to other members of the same confederation

Large networks define custom communities for specific purposes—tagging routes by geographic region, customer type, or service level.
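
On the wire, a standard community is a 32-bit value, conventionally written as two 16-bit halves (ASN:value). The sketch below uses the well-known values reserved in RFC 1997; the community `13335:100` is a made-up example with no real meaning, and the export check is simplified to a single question about an ordinary external peer.

```python
# Well-known communities reserved by RFC 1997.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03   # shown as "local-AS" on some platforms

WELL_KNOWN = {
    NO_EXPORT: "NO_EXPORT",
    NO_ADVERTISE: "NO_ADVERTISE",
    NO_EXPORT_SUBCONFED: "NO_EXPORT_SUBCONFED",
}


def encode(asn: int, value: int) -> int:
    """Pack ASN:value into the 32-bit form carried in an UPDATE."""
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("standard communities hold two 16-bit fields")
    return (asn << 16) | value


def decode(community: int) -> str:
    """Human-readable form: a well-known name or ASN:value."""
    if community in WELL_KNOWN:
        return WELL_KNOWN[community]
    return f"{community >> 16}:{community & 0xFFFF}"


def may_send_to_external_peer(communities: list[int]) -> bool:
    """Simplified export check for an ordinary eBGP neighbor."""
    return not set(communities) & set(WELL_KNOWN)


tagged = [encode(13335, 100), NO_EXPORT]   # 13335:100 is hypothetical
print([decode(c) for c in tagged])         # ['13335:100', 'NO_EXPORT']
print(may_send_to_external_peer(tagged))   # False
```

The 16-bit ASN field is also why standard communities fit poorly with 4-byte ASNs, a limitation that large communities were introduced to address.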

The Tragedy of the Routing Commons

The global routing table grows by approximately 50,000 entries annually. Much of this growth comes from traffic engineering—announcing more specific routes to influence traffic patterns. Each network’s decision to announce additional routes imposes costs on everyone else: larger routing tables require more memory, more processing power, and faster TCAM (ternary content-addressable memory) in routers.

This is a classic tragedy of the commons. Individual networks benefit from announcing more specifics while the costs are distributed across all network operators.

Efforts to constrain routing table growth—including RPKI, which encourages aggregation by making it harder to announce arbitrary prefixes—have had limited success. The IPv4 routing table will likely reach 1 million entries by 2026, requiring router upgrades for networks with limited TCAM capacity.

What the Internet Looks Like From the Edge

For most organizations, BGP remains invisible until something breaks. Your ISP handles the complexity. But understanding BGP helps explain why Internet outages occur, why some routes are faster than others, and why trust remains the Internet’s fundamental currency.

The protocol that routes trillions of packets daily operates on assumptions designed for a research network of a few hundred participants. When Pakistan Telecom accidentally hijacked YouTube, the problem wasn’t a bug—it was BGP working exactly as designed, faithfully propagating an announcement that happened to be wrong.

Securing BGP requires changing that assumption. RPKI provides a foundation. Path validation (BGPsec) offers future possibilities. But the Internet’s routing infrastructure changes slowly, weighted by billions of dollars in deployed hardware and decades of operational practice.

Until then, every route announcement remains an assertion of trust, and every outage remains a reminder that the Internet’s reliability depends on the competence of tens of thousands of independent network operators.


References

  1. Rekhter, Y., Li, T., & Hares, S. (2006). RFC 4271: A Border Gateway Protocol 4 (BGP-4). IETF.

  2. RIPE NCC. (2008). YouTube Hijacking: A RIPE NCC RIS Case Study. https://www.ripe.net/about-us/news/youtube-hijacking-a-ripe-ncc-ris-case-study/

  3. Cloudflare. (2021). Understanding how Facebook disappeared from the Internet. https://blog.cloudflare.com/october-2021-facebook-outage/

  4. Huston, G. (2025). BGP in 2024. APNIC Blog. https://blog.apnic.net/2025/01/06/bgp-in-2024/

  5. Lepinski, M. & Kent, S. (2012). RFC 6480: An Infrastructure to Support Secure Internet Routing. IETF.

  6. Lepinski, M., Kent, S., & Kong, D. (2012). RFC 6482: A Profile for Route Origin Authorizations (ROAs). IETF.

  7. Labovitz, C., et al. (2000). Delayed Internet Routing Convergence. ACM SIGCOMM.

  8. Gao, L. & Rexford, J. (2001). Stable Internet Routing Without Global Coordination. IEEE/ACM Transactions on Networking.

  9. Internet Society. MANRS - Mutually Agreed Norms for Routing Security. https://www.manrs.org/

  10. Huston, G. (2019). The BGP Large Communities Attribute. APNIC Blog.

  11. Bates, T., et al. (2000). RFC 2796: BGP Route Reflection. IETF.

  12. Madory, D. (2023). A Brief History of the Internet’s Biggest BGP Incidents. NANOG. https://nanog.org/stories/articles/a-brief-history-of-the-internets-biggest-bgp-incidents/