Internet Scale Overlay Hosting

Currently under reconstruction - check back later

Network overlays have become a popular tool for implementing Internet applications. While content-delivery networks provide the most prominent example of the commercial application of overlays, systems researchers have developed a variety of experimental overlay applications, demonstrating that the overlay approach can be an effective method for deploying a broad range of innovative systems. An overlay hosting service (OHS) is a shared infrastructure that supports multiple overlay networks. It can play an important role in enabling wider use of overlays, because it lets small organizations deploy new overlay services on a global scale without the burden of acquiring and managing their own physical infrastructure. Currently, PlanetLab is the canonical example of an overlay hosting service, and it has proven to be an effective vehicle for supporting research in distributed systems and applications. This project seeks to make PlanetLab and systems like it better able to support large-scale deployments of overlay services, through more capable platforms and the control mechanisms needed to provision resources for different applications.

Overlay Hosting Service

As shown at right, an OHS can be implemented using distributed data centers, each comprising servers and a communications substrate that includes both L2 switching for local communication and L3 routers for communication with end users and other data centers. End users communicate with a data center over the public Internet, while data centers can communicate with each other using either the public Internet or dedicated backbone links, allowing the OHS to support provisioned overlay links. This in turn allows overlay network providers to deliver services that require consistent performance from the networking substrate.

We have developed an experimental prototype of a system for implementing overlay hosting services. We have selected PlanetLab as our target implementation context and have dubbed the system the Supercharged PlanetLab Platform (SPP). The SPP has a scalable architecture that accommodates multiple types of processing resources, including conventional server blades and Network Processor (NP) blades based on the Intel IXP 2850. We are working to deploy five SPP nodes in Internet2 as part of a larger prototyping effort associated with the National Science Foundation's GENI initiative. The subsequent sections describe the SPP and our plans for GENI.

Supercharged PlanetLab Platform

The SPP is designed as a high-performance substitute for a conventional PlanetLab node. The typical PlanetLab node is a conventional PC running a customized version of Linux that supports multiple virtual machines using the Linux vServer mechanism. This allows different applications to share the node's computing and network resources while being logically isolated from the other applications running on the node. PlanetLab's virtualization of the platform is imperfect, because it requires different vServers to share a common IP address and provides limited support for performance isolation. Nevertheless, PlanetLab has been very successful as an experimental platform for distributed applications.
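To illustrate the shared-address model concretely, the following minimal Python sketch (the port number is hypothetical) shows what an application running in one slice typically does: it claims its own transport-layer port on the node's shared IP address, and the bind fails if another slice already holds that port.

# Minimal sketch: all slices on a PlanetLab node share the node's IP
# address, so each slice must claim its own transport-layer ports.
import socket

SLICE_PORT = 55000  # hypothetical port assigned to this slice

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    sock.bind(("0.0.0.0", SLICE_PORT))   # shared node IP, slice-specific port
except OSError as err:
    # Another slice (or process) already owns this port on the shared address.
    raise SystemExit(f"port {SLICE_PORT} unavailable on shared IP: {err}")

print(f"slice listening on shared IP, port {SLICE_PORT}")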

The objective of the SPP is to boost the performance of PlanetLab sufficiently to allow it to serve as an effective platform for service delivery, not just for experimentation. There are several elements to this. First, the SPP is designed as a scalable system that incorporates multiple servers, while appearing to users and application developers like a conventional PlanetLab node. Second, the SPP makes use of Network Processor blades, in addition to conventional server blades, allowing developers to take advantage of the higher performance offered by NPs. Third, the SPP provides better control over both computing and networking resources, enabling developers to deliver more consistent performance to users.

PlanetLab developers can use the SPP just like a conventional PlanetLab node, but to obtain the greatest performance benefits, they must structure their applications to take advantage of the NP resources provided by the SPP. We have tried to make this relatively painless by supporting a simple fastpath/slowpath application structure, in which the NP implements the most performance-critical parts of an application, while the more complex aspects are handled by a general-purpose server that provides a conventional software execution environment. In this section, we provide an overview of the hardware and software components that collectively implement the SPP, and describe how they can be used to implement high-performance PlanetLab applications.
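Before describing the components, here is a minimal Python sketch of the slowpath side of the fastpath/slowpath structure just described, under the illustrative assumption that the fastpath punts packets it cannot handle to the GPE over a local UDP socket; the port number and function names are hypothetical and are not part of the SPP interface.

# A minimal sketch of the slowpath side of the fastpath/slowpath split,
# assuming (hypothetically) that the fastpath punts packets it cannot
# handle to the GPE over a local UDP socket.
import socket

PUNT_PORT = 56000            # hypothetical port on which punted packets arrive

def handle_exception_packet(pkt: bytes) -> None:
    """Full, general-purpose processing for packets the fastpath punted.

    A real application would run its complete protocol logic here and,
    when appropriate, install new fastpath state so that subsequent
    packets of the same flow are handled entirely on the NPE.
    """
    print(f"slowpath handling {len(pkt)}-byte exception packet")

def slowpath_loop() -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", PUNT_PORT))   # fastpath punts to this socket
    while True:
        pkt, _addr = sock.recvfrom(65535)
        handle_exception_packet(pkt)

if __name__ == "__main__":
    slowpath_loop()

The fastpath itself runs on the IXP 2850's packet-processing cores and is not shown here.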

Hardware Components

Supercharged PlanetLab Platform Hardware Components

The hardware components of our prototype SPP are shown at right. The system consists of a number of processing components connected by an Ethernet switching layer. From a developer's perspective, the most important components of the system are the Processing Engines (PEs) that host the applications. The SPP includes two types of PEs. The General-Purpose Processing Engines (GPE) are conventional server blades running the standard PlanetLab operating system. The current GPEs are dual Xeons with a clock rate of xx, xx GB of DRAM and xx GB of on-board disk. The Network Processing Engines (NPE) are NP blades that include two IXP 2850s, each with 16 cores for processing packets and an XScale management processor. Each IXP has 750 MB of RDRAM plus four independent SRAM banks, and the two share an 18 Mb TCAM. The NPE also has a 10 GbE network connection and is capable of forwarding packets at 10 Gb/s.

All input and output passes through a Line Card (LC) that has ten GbE interfaces. In a typical deployment, some of these interfaces will have public IP addresses and be accessible through the Internet, while others will be used for direct connections to other SPP nodes. The LC is implemented with a Network Processor blade and handles the routing of traffic between the external interfaces and the GPEs and NPEs. This is done by configuring filters and queues within the LC. The system is managed by a Control Processor (CP) that configures application slices based on slice descriptions obtained from PlanetLab Central, the centralized database used to manage the global PlanetLab infrastructure. The CP also hosts a NetFPGA, allowing application developers to implement processing in configurable hardware as well as in software.
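The following conceptual sketch (in Python) illustrates the kind of filter table the LC consults to steer arriving packets to the appropriate PE and queue; the field names and the match-on-UDP-port rule are illustrative assumptions, not the LC's actual configuration interface.

# Conceptual sketch of LC traffic steering: filters map arriving packets
# to an internal PE and a per-filter queue. Field names are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Filter:
    ext_interface: int            # external GbE interface the packet arrived on
    dst_udp_port: Optional[int]   # match this UDP port (None = wildcard)
    dest_pe: str                  # internal PE to deliver to, e.g. "GPE1" or "NPE0"
    queue: int                    # per-filter queue, used for isolation

# Example table: one slice's control traffic goes to a GPE, while its
# provisioned fastpath traffic goes to an NPE (ports are hypothetical).
FILTERS = [
    Filter(ext_interface=0, dst_udp_port=55000, dest_pe="GPE1", queue=3),
    Filter(ext_interface=0, dst_udp_port=56000, dest_pe="NPE0", queue=7),
]

def classify(ext_interface: int, dst_udp_port: int) -> Optional[Filter]:
    """Return the first filter matching an arriving packet, if any."""
    for f in FILTERS:
        if f.ext_interface == ext_interface and f.dst_udp_port in (None, dst_udp_port):
            return f
    return None

if __name__ == "__main__":
    print(classify(0, 56000))   # -> the NPE0 filter

In the real system, this state is installed by the control software on behalf of each slice; the sketch only illustrates the classification step.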

Our prototype system is shown in the photograph. We are using board-level components that are compatible with the Advanced Telecommunications Computing Architecture (ATCA) standards. ATCA components include the server blades, the NP blades and the chassis switch, which includes both a 10 GbE data switch and a 1 GbE control switch. The ATCA components are augmented with an external 1 GbE switch and a conventional rack-mount server that implements the CP.

Network Processor Software

NP blades are used for both the NPE and LC.

Control Software

Planned SPP Deployment

map of planned node locations

details of a typical site with connections to router and other sites

Using SPPs

Discuss how to define slices in myPLC, login to SPP nodes and do configuration. Keep the main flow at a high level, but add a page that gives a tutorial on how the GEC 4 demo is done.
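As a starting point, here is a minimal sketch of defining a slice through a myPLC instance's XML-RPC interface (PLCAPI) from Python and adding SPP nodes to it; the myPLC URL, credentials, slice name, and node hostnames are placeholders, and AddSlice and AddSliceToNodes are standard PLCAPI calls.

# Minimal sketch of slice creation via a myPLC instance's PLCAPI.
# URL, credentials, slice name, and node hostnames are placeholders.
import xmlrpc.client

PLC_URL = "https://myplc.example.org/PLCAPI/"      # placeholder myPLC instance
auth = {
    "AuthMethod": "password",
    "Username": "user@example.org",                # placeholder account
    "AuthString": "secret",                        # placeholder password
}

plc = xmlrpc.client.ServerProxy(PLC_URL, allow_none=True)

# Create the slice and bind it to the SPP nodes (placeholder hostnames).
slice_name = "example_sppdemo"
plc.AddSlice(auth, {"name": slice_name,
                    "url": "http://example.org",
                    "description": "demo slice on SPP nodes"})
plc.AddSliceToNodes(auth, slice_name, ["spp1.example.org", "spp2.example.org"])

# Once the slice is instantiated, log in to a sliver with ordinary ssh, e.g.:
#   ssh -i ~/.ssh/id_rsa example_sppdemo@spp1.example.org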