How Google is speeding up the Internet
Engineers at Google have developed a new algorithm to speed up TCP, the main transport protocol for traffic on the internet, by optimizing the speed at which traffic is sent so it doesn’t clog up the available routes.
They say their acceleration method, called BBR (bottleneck bandwidth and roundtrip propagation time), measures the fastest way to send data across different routes and is able to more efficiently handle traffic when data routes become congested. Google is already using BBR to speed up its YouTube traffic, and last month the company made BBR available in its Google Cloud Platform. Google says implementing BBR sped up the already highly optimized YouTube traffic by 4% on average, and by as much as 14% in some countries.
TCP acceleration efforts
TCP was developed in the 1970s as part of the TCP/IP protocol suite to format data into packets for transmission across the internet. Researchers at the Internet Engineering Task Force (IETF) estimate that more than 90% of IP traffic is transmitted via TCP.
Over the past few decades there have been multiple efforts to speed up TCP/IP, many of them focused on how TCP handles congestion. TCP was designed to slow down how fast it sends traffic when it senses congestion, which it determines by monitoring the number of packets lost in transport.
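To make the loss-based approach concrete, here is a minimal Python sketch of the additive-increase, multiplicative-decrease pattern that classic loss-based TCP congestion control follows. It is an illustration only: the function names and the default values (add one packet per round trip, halve the window on loss) are assumptions chosen for the example, not code from any real TCP stack.

```python
def aimd_step(cwnd, loss_detected, increase=1.0, decrease=0.5, min_cwnd=1.0):
    """Return the next congestion window (in packets) after one round trip.

    Hypothetical helper: grow the window slowly while no loss is seen,
    and cut it sharply as soon as a loss is reported.
    """
    if loss_detected:
        # Multiplicative decrease: loss is treated as the congestion signal.
        return max(min_cwnd, cwnd * decrease)
    # Additive increase: probe for more capacity one step at a time.
    return cwnd + increase


def simulate(loss_rounds, rounds, start=10.0):
    """Run aimd_step for `rounds` round trips; losses occur in `loss_rounds`."""
    cwnd = start
    history = []
    for r in range(rounds):
        cwnd = aimd_step(cwnd, r in loss_rounds)
        history.append(cwnd)
    return history
```

Calling simulate({3}, 5) yields [11.0, 12.0, 13.0, 6.5, 7.5]: the sender keeps speeding up until a packet is dropped, then halves its window. The sender learns about congestion only after the damage is done.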
“This worked well for many years because internet switches’ and routers’ small buffers were well-matched to the low bandwidth of internet links,” Google explains in a blog post announcing BBR. But so-called “loss-based” congestion control doesn’t work as well in today’s environments.
Van Jacobson, one of the original authors of TCP and one of the lead engineers who developed BBR, says if TCP only slows down traffic when it detects packet loss, then it’s too little too late.
“(BBR) is not waiting for a problem to occur, like a loss,” Jacobson says. “It’s modeling the pipe as if it has a length and diameter to determine how much data can fit in it.”
BBR constantly estimates throughput and roundtrip time across multiple routes, so it knows how long data will take to traverse the network when it is sent at a given rate. This lets BBR send traffic at a speed the network can handle instead of waiting for packet loss to signal trouble, which is what makes it more effective than the original TCP congestion controls.
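Jacobson's pipe analogy can be sketched in a few lines of Python. The toy class below tracks the two quantities in BBR's name: the highest recent delivery rate (the pipe's "diameter") and the lowest recent roundtrip time (its "length"). Their product is the amount of data that fits in the pipe. The class name, method names and the sample-count window are all assumptions made for this illustration; Google's implementation is considerably more involved.

```python
from collections import deque


class PathModel:
    """Toy estimate of bottleneck bandwidth and roundtrip propagation time."""

    def __init__(self, window=10):
        # Keep only the most recent `window` samples (an assumed simplification).
        self.rates = deque(maxlen=window)  # delivery-rate samples, bytes/second
        self.rtts = deque(maxlen=window)   # roundtrip-time samples, seconds

    def on_ack(self, delivered_bytes, interval_s, rtt_s):
        """Record one measurement taken when an acknowledgment arrives."""
        self.rates.append(delivered_bytes / interval_s)
        self.rtts.append(rtt_s)

    def bottleneck_bw(self):
        # The fastest rate recently observed approximates the narrowest link.
        return max(self.rates)

    def min_rtt(self):
        # The shortest RTT recently observed approximates a queue-free path.
        return min(self.rtts)

    def bdp_bytes(self):
        # Bandwidth-delay product: how much data the pipe holds when full.
        return self.bottleneck_bw() * self.min_rtt()
```

For example, after on_ack(500000, 0.5, 0.25) and on_ack(250000, 0.5, 0.125), the model reports a bottleneck bandwidth of 1,000,000 bytes per second and a minimum roundtrip time of 0.125 seconds, so roughly 125,000 bytes can be in flight without building a queue. None of this depends on a packet being lost.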
BBR is also compatible with an alternative transport protocol, quick UDP internet connections (QUIC), devised by Google and being considered for standardization by the IETF.
BBR is not the first effort to speed up TCP. Researchers at North Carolina State University are credited with developing one of the most popular loss-based congestion control algorithms used in TCP today, named binary increase congestion control (BIC), and its successor, CUBIC. At a high level, these also record measurements to estimate the optimal speed at which to send data when congestion is detected. Another congestion control algorithm that has become popular is named Reno.
These all use packet loss to detect congestion. Jacobson says that, to his knowledge, BBR is the only TCP algorithm that actually estimates the speed of traffic to determine the best way to send it, regardless of whether packets have been lost.
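For readers curious about what CUBIC records, the Python sketch below shows the cubic growth curve that gives the algorithm its name. After a loss, the sender remembers the window size at which the loss happened (w_max) and grows back toward it along a cubic curve. The function names are hypothetical, and the constants (C = 0.4, beta = 0.7) are the defaults given in the published CUBIC specification; treat this as an illustration of the curve, not as a working implementation.

```python
def cubic_k(w_max, c=0.4, beta=0.7):
    """Seconds after a loss at which the window climbs back to w_max."""
    return (w_max * (1.0 - beta) / c) ** (1.0 / 3.0)


def cubic_window(t, w_max, c=0.4, beta=0.7):
    """Congestion window (in packets) t seconds after the last loss.

    w_max is the window size at which that loss occurred.
    """
    k = cubic_k(w_max, c, beta)
    # Concave growth up to w_max at t == k, then convex probing beyond it.
    return c * (t - k) ** 3 + w_max
```

With w_max set to 100 packets, the window restarts at about 70 packets right after the loss, flattens out as it nears 100, and only then starts probing for more. The trigger for resetting the curve is still a lost packet.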
Reaction to BBR
Mirja Kühlewind is a senior researcher at the Networked Systems Group in Zurich and the IETF’s Transport Area Director, where she works on TCP maintenance and improvement. She says creating standards in transport and congestion control takes a long time. Of the dozens of attempts to improve TCP, only one has been standardized, and that was before the development of BIC and BBR.
“In general standardizing congestion control schemes is not an easy topic,” she says. If any company could push a standard, it could be Google, given the scale at which they operate, she says.
Jacobson says the company’s goal is for BBR to become a standard.
Kühlewind says BBR shows promise. “Both Reno and CUBIC work based on the same principle and react to packet loss as a sign for congestion and subsequently reduce their sending rate if loss was detected. BBR however utilizes packet timing information to figure out if the link is congested.”
Some Google customers are already realizing benefits of BBR. WordPress hosting company WP Engine hosts a half-million sites in the Google Cloud, and its founder and CTO, Jason Cohen, cited Google research showing that BBR provided a 2,700x throughput improvement compared with loss-based congestion controls. Queuing delays were 25x lower, he says.
Users of Google’s Cloud Platform will automatically get the benefits of BBR when using certain GCP cloud services, including Cloud Spanner, BigTable, Storage, CDN and Load Balancing at no additional cost.
This story, "How Google is speeding up the Internet," was originally published by Network World.