CoNEXT 2013: Bullet Trains, Express Lanes... and Aspen Trees
San Diego and Santa Barbara, Dec. 11, 2013 -- Computer scientists from the University of California, San Diego didn’t have far to travel to attend the 9th ACM International Conference on Emerging Networking Experiments and Technologies (CoNEXT) in Santa Barbara, Calif. So why the big focus on “bullet trains” and “express lanes”?
CNS research scientist George Porter co-authored two papers with CSE colleagues, both presented on Dec. 11 during a CoNEXT session called “Trains, Lanes and Autobalancing.”
According to the first presented paper, “Bullet Trains: A Study of NIC Burst Behavior at Microsecond Timescales,” a lot is known at a macro level about the behavior of traffic in data center networks. This includes the ‘burstiness’ – the tendency of data traffic to transmit in short, uneven spurts – of TCP (the protocol that governs most data transmission over the Internet), variability based on destination, and the overall size of network flows. However, according to Porter and his co-authors, CSE grad student Rishi Kapoor and CSE professors Geoff Voelker and Alex Snoeren, “little information is available on the behavior of data center traffic at packet-level timescales,” i.e., at timescales below 100 microseconds.
Some 30 years ago, an MIT study compared packets of data to train cars – sent from a source to the same destination back-to-back, like cars pulled by a locomotive. In the context of data centers, however, the UC San Diego researchers concluded that those trains are more aptly termed “bullet trains” when viewed at microsecond timescales. Porter and his colleagues examined the various sources of traffic bursts, measured the traffic generated at different layers of the network stack, and characterized the burstiness of different data-center workloads, including bandwidth-intensive applications such as data sorting (MapReduce) and distributed file systems (NFS and Hadoop).
The researchers focused primarily on the network interface controller (NIC) layer, because the controller is directly implicated in the burst behavior that most affects the speed of computer networking. The assumption has been that packets transmitted within a single flow would be uniformly paced, but real life turns out to be more complex. This is primarily because packets are batched differently across the network stack in order to achieve link rates of 10Gbps or higher.
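A back-of-the-envelope sketch (not a calculation from the paper) illustrates why batching in the stack produces bursts at exactly these timescales: a single large segment handed to the NIC by a mechanism such as TCP Segmentation Offload emerges as dozens of back-to-back packets. The segment size, payload size and wire overheads below are typical assumed values, not measurements.

```python
# Illustrative sketch: how NIC-level batching (e.g., TCP Segmentation
# Offload) turns one large send into a back-to-back packet burst on a
# 10 Gbps link. All constants are typical assumed values.

MTU_PAYLOAD = 1448        # bytes of TCP payload per MTU-sized packet
TSO_SEGMENT = 64 * 1024   # a common maximum offloaded segment size, bytes
LINK_RATE = 10e9          # 10 Gbps link, bits per second
WIRE_BYTES = 1538         # 1518-byte frame + preamble and inter-frame gap

packets_per_burst = -(-TSO_SEGMENT // MTU_PAYLOAD)  # ceiling division
burst_bits = packets_per_burst * WIRE_BYTES * 8
burst_duration_us = burst_bits / LINK_RATE * 1e6

print(f"{packets_per_burst} packets back-to-back, "
      f"~{burst_duration_us:.1f} microseconds on the wire")
# -> 46 packets back-to-back, ~56.6 microseconds on the wire
```

The resulting burst of tens of packets lasting tens of microseconds lands squarely in the 10- to 100-microsecond window the study examined.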
For their paper, Porter and his co-authors studied the burst behavior of traffic emanating from a 10Gbps end-host across a variety of data center applications. “We found that at 10- to 100-microsecond timescales, the traffic exhibits large bursts, tens of packets in length,” said Porter. “We also found that this level of burstiness was largely outside of application control, and independent of the high-level behavior of applications.”
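To see what "bursts, tens of packets in length" means operationally, here is a minimal sketch (not the authors' measurement tool) of how bursts can be identified in a timestamped packet trace: consecutive packets whose inter-arrival gap falls below a threshold are grouped into one burst. The 10-microsecond threshold and the trace are assumptions for illustration.

```python
# Minimal burst detection over packet timestamps (in microseconds).
# Packets separated by gaps below BURST_GAP_US belong to the same burst.

BURST_GAP_US = 10.0  # assumed threshold, chosen for illustration

def find_bursts(timestamps_us):
    """Group packet timestamps into bursts; return each burst's length."""
    if not timestamps_us:
        return []
    bursts = []
    current = 1
    for prev, curr in zip(timestamps_us, timestamps_us[1:]):
        if curr - prev < BURST_GAP_US:
            current += 1          # still inside the same burst
        else:
            bursts.append(current)  # gap too large: burst ended
            current = 1
    bursts.append(current)
    return bursts

# Synthetic trace: two tight bursts separated by a long idle gap.
trace = [0.0, 1.2, 2.4, 3.6, 500.0, 501.2, 502.4]
print(find_bursts(trace))  # -> [4, 3]
```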
The second paper, “FasTrak: Enabling Express Lanes in Multi-Tenant Data Centers,” tackles network isolation in the cloud. No customer wants their data or service to leak into those of other customers, so cloud operators typically rely on virtual machines (VMs) as well as network-level rules and policies enforced on every packet going in and out of the host. Those rules, however, carry innate costs in the form of latency (delays) and the added expense of processing packets in software known as the hypervisor – costs that affect both the provider and the tenant.
The researchers came up with a solution called FasTrak, which keeps the functionality but curbs the cost of processing rules by offloading some of the virtualization functionality from the hypervisor software to the network switch hardware through so-called “express lanes.” There is limited space on a switch – not enough for all the rules a server requires – so for FasTrak, the researchers determined the subset of data flows that would benefit most from offloading via express lanes to hardware. The result: roughly a factor-of-two reduction in latency (i.e., delays cut in half), combined with a 21 percent drop in server processing load. According to the study’s conclusion, FasTrak’s actual benefits are workload dependent, but “services that should benefit the most are those with substantial communication requirements and some communication locality.”
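The core idea of picking which flows get an express lane can be sketched very simply (the real FasTrak system is considerably more involved): with room for only a handful of rules in switch hardware, offload the flows responsible for the most packets and leave the long tail in the hypervisor. The flow names, rates and table capacity below are made up for illustration.

```python
# Simplified sketch of FasTrak's flow-selection idea: switch hardware
# holds a limited number of flow rules, so offload the heaviest flows.
# All flows, rates (packets/sec) and the capacity are hypothetical.

SWITCH_RULE_CAPACITY = 3  # assumed: hardware table holds 3 flow rules

def pick_offload_flows(flow_rates, capacity):
    """Greedily choose the highest-packet-rate flows for hardware."""
    ranked = sorted(flow_rates, key=flow_rates.get, reverse=True)
    return ranked[:capacity]

flows = {"web": 120_000, "backup": 80_000, "mapreduce": 900_000,
         "ssh": 200, "storage": 450_000}
offloaded = pick_offload_flows(flows, SWITCH_RULE_CAPACITY)
print(offloaded)  # -> ['mapreduce', 'storage', 'web']

hw_fraction = sum(flows[f] for f in offloaded) / sum(flows.values())
print(f"{hw_fraction:.0%} of packets bypass the hypervisor")
# -> 95% of packets bypass the hypervisor
```

Even this toy version shows why the approach pays off: a small number of heavy flows typically accounts for most of the packets, so a small hardware table can divert most of the per-packet processing away from the hypervisor.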
CNS research scientist Porter also moderated a CoNEXT session on Dec. 10 about…. During the session, recent CSE alumna Meg Walraed-Sullivan (Ph.D. ’12), now at Microsoft Research, presented a paper on “Aspen Trees: Balancing Data Center Fault Tolerance, Scalability and Cost.” The paper was co-authored by her Ph.D. advisors, CSE professors Amin Vahdat (on leave at Google) and Keith Marzullo (recently on leave at NSF). It grows out of Walraed-Sullivan’s dissertation at UC San Diego, which introduced a new class of network topologies called ‘Aspen trees,’ named for aspen trees, which in nature share a common root system.
Large-scale data center infrastructures typically use a multi-rooted ‘fat tree’ topology, which provides diverse yet short paths between end hosts. A drawback of this topology is that a single link failure can disconnect a portion of the network’s hosts for a substantial period of time, while updated routing information propagates to every switch in the tree. According to the CoNEXT paper, this shortcoming makes the fat tree less suited to data centers that require the highest levels of availability. Aspen tree topologies, by contrast, can provide the high throughput and path multiplicity of current data center network topologies while allowing a network operator to select a particular point on the spectrum of scalability, network size and fault tolerance – giving data center operators the ability to react to failures locally.
Bullet Trains: A Study of NIC Burst Behavior at Microsecond Timescales
FasTrak: Enabling Express Lanes in Multi-Tenant Data Centers
Aspen Trees: Balancing Data Center Fault Tolerance, Scalability and Cost
UCSD Computer Science and Engineering
Jacobs School of Engineering
Calit2’s Qualcomm Institute
Doug Ramsey, 858-822-5825, firstname.lastname@example.org