As we’ve all seen firsthand this year, our universal dependence on a secure and reliable internet has never been greater. The global response to the pandemic highlights all of the ways we rely on the internet to ensure that families stay connected with one another, patients receive care, students continue to learn, and businesses continue to operate.
In recent months, we’ve written about how Google protects our network from DDoS attacks, and the design and capacity planning processes that enable our network to remain resilient and performant—and support Google Cloud customers—in the face of challenges. As part of our work to continually improve the security and availability of our network infrastructure, we want to share details about steps we’ve proactively taken to protect Google’s network against vulnerabilities in the internet routing system, how we’re moving forward, and the importance of collaborating with the wider community to make faster progress to secure routing in the internet overall.
Vulnerabilities in the internet routing system
Since the inception of the modern internet, interconnection between different carrier networks, ISPs, and content providers has relied on the Border Gateway Protocol (BGP) to determine routes for traffic between networks across the globe. BGP was designed, and continues to operate, with a trust model that assumes that the information exchanged is always valid. As a result, it is very easy to redirect, intercept, or drop traffic altogether simply by mis-announcing routes in BGP (commonly referred to as route hijacks or leaks). These frequent disruptions may be caused by configuration errors or deliberate attacks, and sometimes result in outages or other damage.
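To make the risk concrete, here is a minimal sketch of why a mis-announced route can pull traffic away from its legitimate origin. It assumes a router that simply trusts every announcement and forwards on the longest matching prefix. The function name `best_route`, the documentation prefix `2001:db8::/32`, and the private-use AS numbers are all hypothetical, and real BGP best-path selection involves many more attributes than prefix length.

```python
import ipaddress

def best_route(routes, destination):
    """Return (prefix, origin_asn) for the longest prefix covering destination, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, origin_asn in routes:
        net = ipaddress.ip_network(prefix)
        # Longest-prefix match: a more-specific route always wins.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, origin_asn)
    return None if best is None else (str(best[0]), best[1])

# Hypothetical announcements: AS64500 is the legitimate origin.
legitimate = [("2001:db8::/32", 64500)]
# AS64511 mis-announces a more-specific prefix it does not own.
hijack = [("2001:db8:1::/48", 64511)]

print(best_route(legitimate, "2001:db8:1::1"))           # ('2001:db8::/32', 64500)
print(best_route(legitimate + hijack, "2001:db8:1::1"))  # ('2001:db8:1::/48', 64511)
```

Because nothing in this model checks whether AS64511 is entitled to originate the more-specific prefix, the second lookup sends traffic to the wrong network.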
Despite more than 15 years of work in the internet community to develop methods to secure routing, limited and slow adoption within the industry means that this problem remains one of the top vulnerabilities in the global internet infrastructure. It is a hard problem, with no silver bullet that prevents all types of BGP hijacks. Moreover, most of the protection measures require effort by the majority of network operators to be effective.
Increasing focus on routing security
This week, we participated in an announcement with the MANRS initiative, creating a new task force to define and publish an updated set of actions, along with corresponding metrics to measure progress, to bring increased consistency and urgency to improving routing security. Formed in 2015 and sponsored by the non-profit Internet Society, MANRS aims to create best practices for routing security for all network operators. As a MANRS member, we worked with a number of major service providers last year to help create a specific program focused on cloud and CDN providers.
Developing and deploying routing security protections
Google delivers content and services to users and customers by connecting to thousands of peer networks around the world. These peering relationships provide an opportunity to work with peer networks to improve routing security for our own services and also for the internet overall. In support of the major focus areas in the MANRS task force, we’ve already undertaken several measures to protect our network infrastructure from hijacks, such as filtering and coordinating with peer networks, which will also make it easier to extend these protections to other networks in the internet. The MANRS Observatory tracks routing security readiness for all member networks and, as shown in the figure below, Google scores highly across all of the key metrics.
Let’s take a deeper look at how we achieve these metrics, and steps we are taking to improve internet routing security overall.
RPKI (Resource Public Key Infrastructure)
The RPKI is a distributed public database of cryptographically signed records that allows operators to securely register routing information about their networks. Other networks can download the records and use them to validate that the BGP announcements they receive are correctly originated (RPKI origin validation). RPKI adoption has increased significantly in the last two years, with a number of large ISPs, including AT&T, NTT, Telia, and Cogent, announcing they are performing origin validation. Ultimately, RPKI can only protect against hijacks of routes that have been registered, so it requires all networks to register their routes. As of November 2020, Google has registered more than 99% of its routes in the RPKI (as seen by the MANRS Observatory). Further, we plan to deploy origin validation in 2021 to ensure that invalid routes are rejected, thus preventing disruptions due to hijacks for Google Cloud customers and end users.
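As an illustration of origin validation, the sketch below classifies an announcement as valid, invalid, or not-found, following the general procedure described in RFC 6811. It is a simplified model and not a description of Google's deployment: the `Roa` type and `validate_origin` function are hypothetical names, the records are assumed to have been cryptographically verified already, and the prefixes and AS numbers are reserved documentation values.

```python
import ipaddress
from typing import NamedTuple

class Roa(NamedTuple):
    """A validated RPKI record (hypothetical type for this sketch)."""
    prefix: str      # registered prefix
    max_length: int  # longest prefix length the origin may announce
    asn: int         # AS number authorized to originate the prefix

def validate_origin(prefix, origin_asn, roas):
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip records that do not cover the announced prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A covering record matches if the origin and the length are both allowed.
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered but unmatched means invalid; uncovered means no record exists.
    return "invalid" if covered else "not-found"

roas = [Roa("2001:db8::/32", 48, 64500)]
print(validate_origin("2001:db8:1::/48", 64500, roas))  # valid
print(validate_origin("2001:db8:1::/48", 64511, roas))  # invalid
print(validate_origin("192.0.2.0/24", 64500, roas))     # not-found
```

The not-found result shows why registration matters: a network that rejects only invalid routes cannot protect a prefix for which no record has been published.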
Consistent route filtering
Many networks continue to publish information about route ownership and relationships with other networks in public IRRs (Internet Routing Registries). While IRR information is not as secure as the RPKI, its wide use and coverage make it a valuable source of data to build filtering rules that ensure only valid routes are accepted from neighboring networks. To protect our infrastructure, Google is currently deploying IRR-based route filtering that ensures valid routes receive higher preference, and we also maintain up-to-date routing information in the IRRs for our own routes (we use RADb). Through work with MANRS members, we are defining a consistent filtering approach that any cloud provider can follow to clarify and simplify the work required by peer networks to maintain their data in the IRRs.
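One way to picture IRR-based filtering that prefers valid routes is sketched below. It assumes the route objects for a peer have already been fetched from a registry and reduced to (prefix, origin AS) pairs. The function names and the two preference values are hypothetical and chosen only for illustration; they are not Google's actual policy.

```python
import ipaddress

# Hypothetical preference values: a higher number means more preferred.
REGISTERED_PREF = 200
UNREGISTERED_PREF = 50

def irr_index(route_objects):
    """Build a lookup set from (prefix, origin_asn) pairs taken from IRR route objects."""
    return {(ipaddress.ip_network(prefix), asn) for prefix, asn in route_objects}

def local_preference(prefix, origin_asn, index):
    """Prefer announcements that exactly match a registered route object."""
    key = (ipaddress.ip_network(prefix), origin_asn)
    return REGISTERED_PREF if key in index else UNREGISTERED_PREF

# Route objects a hypothetical peer has registered.
index = irr_index([("2001:db8::/32", 64500), ("192.0.2.0/24", 64500)])

print(local_preference("192.0.2.0/24", 64500, index))  # 200
print(local_preference("192.0.2.0/24", 64511, index))  # 50
```

Lowering preference instead of dropping the route keeps destinations reachable while registry data is incomplete. A stricter policy could reject unregistered routes outright.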
Working with our peers
Peer networks are our partners in this effort, and we rely on them to maintain records in public routing information sources like the RPKI and IRRs to enable route validation by all networks, not just Google. Through our peering portal, we provide customized information to every peer, showing the IRR status of every route they announce to Google—and by early next year, this will also include RPKI validity information. Since the beginning of 2020, we’ve been proactively contacting our peers to alert them to routing information that appears invalid. This information helps our peers quickly identify which routes may need updated or corrected records, along with guidance on how to make the fixes. In parallel, we are working with other cloud service providers to make data requirements consistent for all peers, simplifying the peering process regardless of the cloud to which they are connected.
Expanding collaboration with Tier-1 networks
Tier-1 networks and large transit providers play a critical role in routing security, since they act as the primary hubs of the internet through which other provider and customer networks connect. Many of these networks have already taken the initiative to deploy various forms of filtering, including RPKI origin validation. Google has established path-based filtering (sometimes called “peer-locks”) with most of our Tier-1 network partners. These filters help ensure that traffic for Google services only follows valid paths. Stopping wide-scale propagation of invalid routes at large hub networks helps minimize the impact of route leaks and hijacks, reducing exposure for our customers and users.
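The general idea behind a peer-lock can be sketched as an AS-path check: a network accepts a route whose path contains a protected AS only if it arrives from a neighbor that the protected network has authorized. The code below is a minimal model under that assumption. The function name `accept_path`, the `protected` mapping, and the AS numbers are hypothetical, and production filters are expressed in router policy languages and not in Python.

```python
def accept_path(neighbor_asn, as_path, protected):
    """Return True if a route received from neighbor_asn passes the peer-lock check.

    protected maps each protected AS number to the set of neighbor AS numbers
    allowed to send routes whose AS path contains it.
    """
    for asn in as_path:
        allowed = protected.get(asn)
        # A protected AS in the path is acceptable only via an authorized neighbor.
        if allowed is not None and neighbor_asn not in allowed:
            return False
    return True

# AS64500 is protected: accept it only directly or via its authorized upstream AS64501.
protected = {64500: {64500, 64501}}

print(accept_path(64500, [64500], protected))         # True, direct session
print(accept_path(64501, [64501, 64500], protected))  # True, authorized upstream
print(accept_path(64510, [64510, 64500], protected))  # False, likely leak or hijack
```

A filter of this kind stops a leaked route at the hub network even when the origin AS in the path is correct, which origin validation alone would not catch.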
Working together to secure internet routing
Improving internet routing security requires adoption and deployment of multiple mechanisms, and participation by the entire internet community. As we work on securing Google’s services from routing-related threats, we are also actively engaged with the broader community to advance routing security for everyone. This starts with defining clear actions with the new MANRS task force and aligning both our technical approach and how we engage peer networks with other major cloud service providers. Only by working collectively in the same direction can we make meaningful and rapid progress to secure internet routing.
Source: Google Cloud Blog