
[under construction]

Network overlays have become a popular tool for implementing Internet applications. While content-delivery networks provide the most prominent example of the commercial application of overlays, systems researchers have developed a variety of experimental overlay applications, demonstrating that the overlay approach can be an effective method for deploying a broad range of innovative systems. Rising traffic volumes in overlay networks make the performance of overlay platforms an issue of growing importance. Currently, overlay platforms are constructed using general purpose servers, often organized into a cluster with a load-balancing switch acting as a front end. This project explores more integrated and scalable architectures suitable for supporting large-scale applications with thousands to many millions of end users. In addition, we are studying various network level issues relating to the control and management of large-scale overlay hosting services.

== Overview ==

An ''Overlay Hosting Service'' (OHS) is a shared infrastructure that supports multiple overlay networks. There are two major physical components to an OHS: the ''Overlay Hosting Platforms'' (OHP) and the links joining OHPs to one another. These links are expected to have provisioned bandwidth, allowing the OHS to deliver provisioned capacity to the overlay networks that operate within it. Access to the OHS infrastructure uses the Internet for the final hop. This traffic can be carried over UDP tunnels or virtual links (e.g. MPLS or VLAN), depending on the deployment context.
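
The access path can be pictured as simple UDP encapsulation. The sketch below is illustrative only: the overlay-ID header, endpoint name and port number are invented for the example and are not part of the OHS design.

<pre>
import socket
import struct

# hypothetical tunnel endpoint on the nearest OHP
OHP_ADDR = ("ohp.example.net", 5000)
OVERLAY_ID = 42   # hypothetical identifier assigned to this overlay

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def send_over_tunnel(overlay_packet):
    # prepend a 4-byte overlay identifier so the OHP substrate can
    # demultiplex the datagram to the right overlay node
    header = struct.pack("!I", OVERLAY_ID)
    sock.sendto(header + overlay_packet, OHP_ADDR)
</pre>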

<center>[[Image:ohs.gif]]</center>

The OHPs contain flexible processing resources that can be allocated to different overlays. The resources allocated to an overlay node can range from a small fraction of a general-purpose server, up to hundreds of high performance multi-core processor subsystems. To enable a wide range of overlay network architectures and service models, it's important for OHSs to be as flexible as possible. Overlay network providers should be free to define their own protocols, packet formats and service models, within the framework provided by the OHS.

The use of provisioned links between OHPs allows OHS providers to deliver provisioned overlay links to the overlay providers, making it possible for overlay networks to provide consistent performance to their users. For access over UDP tunnels, true QoS may be difficult to achieve, but even here there is the potential for significantly better performance than can be achieved over end-to-end connections in the public Internet.

To clarify the role of various elements of an OHS, it's helpful to introduce some terminology. The infrastructure and core services provided by the OHS are referred to as the ''substrate''. We also use substrate to refer to the core services provided by the OHPs. We use ''overlay'' to refer to each overlay network hosted by an OHS, and we use the term ''overlay node'' to refer to the individual nodes within the overlay network. For brevity, we sometimes use ''platform'' in place of overlay hosting platform, and ''node'' in place of overlay node.

== Scalable Overlay Hosting Platforms ==

The purpose of an ''Overlay Hosting Platform'' (OHP) is to provide resources for use by individual overlays. We seek to provide the greatest possible flexibility to overlays, while maintaining appropriate separation among different overlays. This of course includes separation of their use of memory and mass storage, but also their use of network bandwidth and processing capacity.

We start by articulating some high level objectives for an effective OHP design.

• ''Scalable performance''. If overlay networks are to deliver Internet scale with router-like performance, OHPs must scale up to terabit IO capacities and be able to execute hundreds of basic operations per byte of data forwarded (see the worked estimate after this list). This requires the use of scalable interconnection networks and flexible processing subsystems capable of very high throughputs. At the same time, we expect that a significant fraction of overlay services may be quite small, so the system should be able to support a mixture of large and small overlays, with the largest being several orders of magnitude larger than the smallest.

• ''Stability and reliability''. An OHP providing Internet-scale service delivery must be highly stable and reliable. In particular, failures in one overlay network should not be able to affect the operation of others on the same platform. Moreover, the core components should exhibit a high level of intrinsic reliability, and the system should be capable of rapidly replacing failed subsystems by switching in equivalent spares.

• ''Ease of use''. To enable small organizations to take advantage of overlay hosting services, it's important that the time and effort required to create a new overlay be relatively modest.

• ''Technology diversity and adaptability''. Overlay hosting services represent a new class of technology platform, and it is likely that the design of OHPs will evolve over time, as experience develops. It's important that an OHP system architecture be open to the incorporation of new technology components as they become available.

• ''Flexible allocation of link bandwidth''. Link bandwidth is a key resource in any network or distributed system. OHPs should support flexible allocation of bandwidth to different slices, including both reserved and shared bandwidth models (see the token-bucket sketch after this list).

• ''Isolation of slices''. The OHP must allow different slices to co-exist without interference. Ideally, each slice should have the illusion that it is operating within a dedicated environment. This means that resources like memory and mass storage must be free from modification by other slices, and that each slice must be able to reserve dedicated processing capacity and link bandwidth.

• ''Minimal constraints on slices''. The OHP should place as few constraints as possible on the slices it hosts. In particular, it should not place any constraints on data formats, limit the ways in which slices provide various capabilities, or constrain the way in which they use their assigned processing resources.
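
To make the scalable-performance objective concrete, a rough back-of-the-envelope calculation (our own arithmetic, using the terabit IO and hundred-operations-per-byte figures above) shows the processing throughput implied for a fully loaded OHP:

<math>\frac{10^{12}\,\mathrm{bits/s}}{8\,\mathrm{bits/byte}} \times 100\,\mathrm{ops/byte} = 1.25 \times 10^{13}\,\mathrm{ops/s}</math>

On the order of ten trillion basic operations per second is far beyond any single processor, which is why scalable interconnection networks and parallel processing subsystems are essential.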
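
The reserved-plus-shared bandwidth model can be illustrated with a per-slice token bucket. This is a minimal sketch, not the OHP's actual mechanism; the class, rates and slice names are invented for the example.

<pre>
import time

class TokenBucket:
    # rate: reserved bytes/second; burst: maximum bucket depth in bytes
    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = time.monotonic()

    def allow(self, nbytes):
        # refill at the reserved rate, capped at the burst depth
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True    # within the slice's reservation
        return False       # eligible only for shared, best-effort capacity

# one bucket per slice: 10 Mb/s reserved, 100 KB burst (example numbers)
slice_buckets = {"sliceA": TokenBucket(rate=10e6 / 8, burst=100_000)}
</pre>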

=== Architectural Options ===

This section discusses two high level system architecture options for scalable overlay hosting platforms and the issues arising from these options.

==== Virtualized line card architecture ====

Consideration of a conventional router or switch leads naturally to an architecture in which line cards are replaced by ''virtualized line cards'' that include a substrate portion and generic processing resources that can be assigned to different virtual line cards (Figure 2). The substrate supports configuration of the generic processing resources so that different virtual line cards can co-exist without interference. On receiving data from the physical link, the substrate first determines which virtual line card it should be sent to and delivers it. Virtual line cards pass data back to the substrate, in order to forward it through the shared switch fabric, on input, or to the outgoing link, on output.
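
The substrate's demultiplexing step can be sketched as a simple table lookup. The table layout and field names below are assumptions for illustration, not the actual design.

<pre>
# substrate demux table on a virtualized line card: a link-level
# identifier (e.g. a VLAN tag or tunnel overlay ID) selects the
# virtual line card that owns the traffic
vlc_table = {100: "vlc_overlayA", 200: "vlc_overlayB"}

def substrate_demux(link_id, packet):
    vlc = vlc_table.get(link_id)
    if vlc is None:
        return None           # no virtual line card claims this traffic
    return (vlc, packet)      # deliver to the selected virtual line card
</pre>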

One issue with this architecture concerns how to provide generic processing resources at a line card, in a way that allows the resources to be shared by different overlays. Conventional line cards are often implemented using Network Processors (NP), programmable devices that include high performance IO and multiple processor cores to enable high throughput processing. It seems natural to take such a device and divide its internal processing resources among multiple overlays. For example, an NP with 16 processor cores could be used by up to 16 different overlays, by simply assigning processor cores. Unfortunately, current NPs are not designed to be shared. All processing cores have unprotected access to the same physical memory, making it difficult to ensure that different overlays don’t interfere with one another.

Also, each processor core has a fairly small program store. This is not a serious constraint in conventional applications, since processing can be pipelined across the different cores, allowing each to store only the program it needs for its part of the processing. However, a core implementing all the processing for one overlay must store the programs to implement all the processing steps for that overlay. The underlying issue raised by this discussion is that efficient implementation of an architecture based on virtualized line cards requires components that support fine-grained virtualization, and conventional NPs do not.

The virtualized line card approach is also problematic in other respects. Because it associates processing resources with physical links, it lacks the flexibility to support overlays with a wide range of processing needs. Some overlays may require more processing per unit IO bandwidth than NPs provide, and this is difficult to deliver when processing resources are tied to individual line cards. The approach also does not easily accommodate alternate implementation technologies for overlay nodes (such as configurable logic).

==== Processing pool architecture ====

The processing pool architecture separates the processing resources used by overlays from the physical link interfaces. This allows a more flexible allocation of processing resources and reduces the need for fine-grained virtualization. This architecture, illustrated in Figure 3, provides a pool of Processing Engines (PEs) that are accessed through the switch fabric. The line cards that terminate the physical links forward packets to PEs through the switch fabric, but do no processing that is specific to a particular overlay. There may be different types of PEs, including some implemented using network processors, others implemented using conventional microprocessors and still others implemented using FPGAs. The NP and FPGA based PEs are most appropriate for high throughput packet processing, while the conventional processors are best suited to control functions that require more complex software, or to overlays with a high ratio of processing to IO. An overlay node may be implemented using a single PE or multiple PEs. In the case of a single PE, data will pass through the physical switch fabric twice, once on input, once on output. In a node that uses multiple PEs to obtain higher performance, packets may have to pass through the switch fabric a third time.
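
The allocation model can be sketched as follows, under the assumption (stated above) that whole PEs are dedicated to an overlay node, while overlays too small to justify one fall back to a shared general-purpose PE; all names here are hypothetical.

<pre>
class ProcessingPool:
    # assigns whole PEs to overlay nodes; tiny overlays share a GP PE
    def __init__(self, pe_ids):
        self.free = list(pe_ids)   # unassigned processing engines
        self.assigned = {}         # overlay node -> dedicated PEs

    def allocate(self, node, n_pes):
        if n_pes == 0:
            # too small to justify a full PE: run in a virtual machine
            # on a shared general-purpose processor instead
            return ["gp_shared"]
        if len(self.free) < n_pes:
            raise RuntimeError("insufficient free PEs")
        pes = [self.free.pop() for _ in range(n_pes)]
        self.assigned[node] = pes
        return pes

pool = ProcessingPool(["npe0", "npe1", "fpga0", "gp0"])
pool.allocate("overlayA", 2)   # dedicated PEs; a single-PE node crosses the fabric twice
</pre>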

The primary drawback of the processing pool architecture is that it requires multiple passes through the switch fabric, increasing delay and increasing the switch capacity needed to support a given total IO bandwidth. The increase in delay is not a serious concern in wide area network contexts, since switching delays are typically 10 <math>\mu</math>s or less. The increase in capacity does add to system cost, but since a well-designed switch fabric represents a relatively small part of the cost of a conventional router (typically 10-20%), we can double, or even triple, the capacity without a proportionally large increase in the overall system cost. In the GENI context, the switch fabric bandwidth implications of the processing pool architecture are significantly reduced, since we expect the overlay nodes implemented within a GENI Backbone Platform (GBP) to have a relatively high ratio of processing capacity to IO bandwidth, compared to conventional routers.

The principal advantage of the processing pool architecture is that it greatly reduces the need for fine-grained virtualization within NP and FPGA-based subsystems, for which such virtualization is difficult. Because the processing pool architecture brings together the traffic for each individual overlay node, there is much less need for PEs to be shared among multiple overlay nodes. The one exception to this is overlay nodes with such limited processing needs that they cannot justify the use of even one complete PE. Such nodes can still be accommodated by implementing them on a general purpose processor, running a conventional operating system that supports a virtual machine environment. We discuss below one approach that allows such nodes to share an NP for fast path forwarding, while relying on a virtual machine running within a general purpose processor to handle exception cases.

Another advantage of the processing pool architecture is that it simplifies sharing of the switch fabric. The switch fabric must maintain traffic isolation among the different overlay nodes. One way to ensure this is to constrain the traffic flows entering the switch fabric so as to eliminate the possibility of internal congestion. This is difficult to do in all cases. In particular, overlay nodes consisting of multiple PEs should be allowed to use their “share” of the switch fabric capacity in a flexible fashion, without having to constrain the pair-wise traffic flows among the PEs. However, allowing this flexibility makes it possible for several PEs in a given overlay node to forward traffic to another PE at a rate that exceeds the bandwidth of the interface between the switch fabric and the destination PE.

There is a straightforward solution to this problem in the processing pool architecture. To simplify the discussion, we separate the handling of traffic between line cards and PEs from the traffic among PEs in a common overlay node. In the first case, we can treat the traffic as a set of point-to-point streams that are rate-limited when they enter the fabric. Rate-limiting these flows follows naturally from the fact that they are logical extensions of traffic flows on the external links. Because the external link flows must be rate-limited to provide traffic isolation on the external links, the internal flows within the switch fabric can be configured to eliminate the possibility of congestion.

For PE-to-PE traffic, we cannot simply limit the traffic entering the switch, since it's important to let PEs communicate freely with other PEs in the same overlay node, without constraint. However, because entire PEs are allocated to overlay nodes in the processing pool architecture, it's possible to obtain good traffic isolation in a straightforward way for this case as well. In general, we need two properties from the switch fabric. First, it must support constrained routing, so that traffic from one overlay node cannot be sent to PEs belonging to another. Second, we need to ensure that congestion within one overlay node does not affect traffic within another. The emergence of Ethernet as a backplane switching technology provides the first property: such switches support VLAN-based routing that can be used to separate the traffic of different overlay nodes. The second property is satisfied by any switching fabric that is nonblocking at the port level. While some switch fabrics fail to fully achieve the objective of nonblocking performance, this is the standard figure of merit for switching fabrics, and most come reasonably close to achieving it.
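
The VLAN-based separation described above can be pictured with a short sketch; the configuration call and VLAN numbering are invented for the example.

<pre>
def configure_port(port, vlan):
    # stand-in for the real fabric-configuration interface
    print("port %d -> VLAN %d" % (port, vlan))

fabric_vlans = {}    # overlay node -> private VLAN id
_next_vlan = [100]

def isolate_node(node, pe_ports):
    # place all of a node's PE ports in one private VLAN, so the
    # Ethernet fabric's VLAN-based routing confines its traffic
    vlan = _next_vlan[0]
    _next_vlan[0] += 1
    fabric_vlans[node] = vlan
    for port in pe_ports:
        configure_port(port, vlan)
    return vlan
</pre>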

=== Abstraction vs. Transparency ===

There are two very different approaches that one can take to providing a flexible overlay hosting capability. The ''raw resource'' approach seeks to make resources available without imposing any specific usage model on those resources. Such systems can provide a variety of different resources, so long as they can be used safely within a fairly generic system framework. This makes it easy to incorporate new types of processing resources into the overall system architecture as they become available.

The ''abstract programming interface'' approach attempts to shield overlay network developers from the characteristics of the underlying hardware components by providing a system-level abstraction through which users can implement new functionality. This has some obvious appeal, since it can allow developers to work at a higher level, and to readily port their overlay network functionality to take advantage of new hardware subsystems, without a major new development effort.

Both approaches have their merits and limitations. The raw resource approach gives developers complete control over their allocated resources, allowing them to make the most effective use of the underlying resources and maximize system performance. The abstract interface approach has the potential to greatly reduce development effort, but may make it difficult to achieve performance objectives.

=== Scaling Up ===

=== Scaling Down ===

=== Implementation Options ===




specifics for GENI and SPP

== Control of Overlay Hosting Services ==

General control architecture, including design of a control overlay network.

== Internet Scale Overlay Applications ==

Network games work.

Scalable audio.

== Mapping Overlays onto an OHS Infrastructure ==

Jing's work.

== Issues for Multi-domain Overlay Hosting ==

Control issues and multi-domain resource mapping.
