Overview Of The SPP Architecture

This section gives an overview of the SPP architecture. It describes the key hardware and software features that make it possible to support the main abstractions provided to an SPP slice/user:

  • Slice
  • Fastpath
  • Meta-Interface
  • Packet queue and scheduling
  • Filter

Coupled with these abstractions are the following system features:

  • Resource virtualization
  • Traffic isolation
  • High performance
  • Protocol extensibility

These features allow the SPP to support the concurrent operation of multiple high-speed virtual routers and allow users to add support for new protocols. For example, one PlanetLab user could be forwarding IPv4 traffic while a second one could be forwarding I3 traffic. Meanwhile, a third user could be programming the SPP to support MPLS.

We begin with a very simple example of an IPv4 router to illustrate the SPP concepts. Then, we describe the architectural features in three parts. The first two parts emphasize the virtualization feature of the SPP while the third part emphasizes the extensibility of the SPP. Part I describes how packets travel through the SPP assuming that it has already been configured with a fastpath for an IPv4 router. Part II describes what happens when we create and configure the SPP abstractions (e.g., create a meta-interface and bind it to a queue) for the router in Part I. Part III sketches how the example would be different if the router handled a simple virtual circuit protocol instead of IPv4.

IPv4 Example

[...[ FIGURE: Two Slices Sharing One SPP (File:Example-two-slices-one-spp.png) ]...]

We begin with a simple example of two slices/users (A and B) concurrently using the same SPP as an IPv4 transit router (R3) between the same two routers (R1 and R2) that are attached to ports 1 and 2 of the SPP (see figure to the right). Furthermore, both slices need 100 Mb/s bandwidth in each direction (R1 to R2 and R2 to R1) and no special treatment of traffic. We have purposely elected to make the logical views of the two slices as similar as possible to show how the SPP substrate can host this virtualization.


[...[ FIGURE: Logical Configuration of Each Slice (File:Example-two-slices-one-spp-logical.png) ]...]

From a logical point of view, each user of R3 needs a configuration (right) which includes one fastpath consisting of three meta-interfaces (m0-m2), four queues (q0-q3), and six filters (f0-f5). Meta-interface m0 goes to R3 itself; m1 to R1; and m2 to R2.



Slice A:

  MI   Socket (IP, Port)    BW (Mb/s)   Queues
  FP   n/a                  202         -
  m0   (10.1.16.3, 22000)   2           q2, q3
  m1   (10.1.32.2, 22000)   100         q0
  m2   (10.1.17.1, 22000)   100         q1

Slice B:

  MI   Socket (IP, Port)    BW (Mb/s)   Queues
  FP   n/a                  202         -
  m0   (10.1.16.3, 33000)   2           q2, q3
  m1   (10.1.32.2, 33000)   100         q0
  m2   (10.1.17.1, 33000)   100         q1

The configuration of R3 for both slices is identical except for the UDP port numbers of their meta-interfaces. Both slices A and B will have the logical views shown in the tables (right). Note the following (a small configuration sketch follows this list):

  • The total bandwidth of the meta-interfaces (202 Mb/s) cannot exceed the bandwidth of the fastpath (FP).
  • There should be at least one queue bound to each meta-interface (MI).
  • The highest-numbered queues are associated with meta-interface 0, which is used for local delivery and exception traffic.
  • The only difference between the two tables is that the UDP port numbers of the MI sockets are 22000 for slice A and 33000 for slice B.
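
To make the first two constraints concrete, here is a minimal sketch (in Python, not SPP code) that represents slice A's configuration as a plain dictionary and checks them; the dictionary layout and field names are invented for illustration.

# Minimal sketch (not SPP code): one slice's fastpath configuration, checked for
# the constraints above. The dictionary layout and field names are invented.
SLICE_A = {
    "fastpath_bw_mbps": 202,
    "meta_interfaces": {
        "m0": {"socket": ("10.1.16.3", 22000), "bw_mbps": 2,   "queues": ["q2", "q3"]},
        "m1": {"socket": ("10.1.32.2", 22000), "bw_mbps": 100, "queues": ["q0"]},
        "m2": {"socket": ("10.1.17.1", 22000), "bw_mbps": 100, "queues": ["q1"]},
    },
}

def check_config(cfg: dict) -> None:
    # Total MI bandwidth must not exceed the fastpath bandwidth.
    total = sum(mi["bw_mbps"] for mi in cfg["meta_interfaces"].values())
    assert total <= cfg["fastpath_bw_mbps"], "MI bandwidth exceeds fastpath bandwidth"
    # Every meta-interface must have at least one queue bound to it.
    for name, mi in cfg["meta_interfaces"].items():
        assert mi["queues"], f"{name} has no queue bound to it"

check_config(SLICE_A)   # passes: 2 + 100 + 100 = 202 <= 202, and every MI has a queue

Slice B's configuration would pass the same check, differing only in the UDP port numbers.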



  MIin \ MIout   m0    m1    m2
  m0             -     f0    f1
  m1             f2    -     f3
  m2             f4    f5    -

There are six filters. Each meta-interface has two filters, one for each possible meta-interface destination. For example, traffic from m1 can go to m0 or m2.


The question now is how the SPP makes it appear to both slices that they each have two dedicated 100 Mb/s paths through R3 even when traffic from both slices is coming in at the same time.

Part I: IPv4 Packet Forwarding

Let's focus on slice A's traffic transiting from MI 1 to MI 2. Because PlanetLab traffic travels over UDP tunnels, A's packets coming to MI 1 will have PlanetLab (outer) IP+UDP headers that reflect the tunneling. These outer headers encapsulate the application packet that has its own IP+UDP headers followed by the application data. The SPP processes packets based on the information from the PlanetLab IP+UDP headers.

[...[ FIGURE ]...]

If we examine a packet that transits R3 from MI 1 to MI 2, the destination IP address and port number in the PlanetLab headers of the incoming packet must be (10.1.32.2, 22000); i.e., the socket corresponding to slice A's MI 1 on R3. Since the transit packet will go out MI 2, the PlanetLab IP+UDP headers of the outgoing packet must reflect A's tunnel from R3 to R2. If slice A's destination MI on R2 were (10.1.17.2, 44000), that tunnel (source socket, destination socket) would be ((10.1.17.1, 22000), (10.1.17.2, 44000)). The PlanetLab header of the outgoing packet should therefore have source and destination addresses of 10.1.17.1 and 10.1.17.2 and source and destination port numbers of 22000 and 44000.

Suppose that slice B also has a packet transiting from MI 1 to MI 2 at the same time. Then, its incoming packets would have PlanetLab headers indicating a destination socket of (10.1.32.2, 33000), and its outgoing packet would have PlanetLab headers indicating the tunnel ((10.1.17.1, 33000), (10.1.17.2, X)), where X is the UDP port number assigned to slice B on R2's MI.
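
To make the header handling concrete, the following is a minimal Python sketch (not SPP code) of the MI 1 to MI 2 transit path, expressed as a lookup keyed on the destination socket of the incoming PlanetLab headers. The incoming source socket used in the example and slice B's port on R2 (X) are not specified above, so they are left as placeholders.

# Minimal sketch (not SPP code): distinguish slice A's and slice B's transit
# packets by the destination socket of the incoming PlanetLab (outer) headers,
# and rewrite those headers for the outgoing tunnel toward R2.
from dataclasses import dataclass

@dataclass
class OuterHeaders:            # PlanetLab (outer) IP + UDP headers
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

# Destination socket at R3's MI 1 -> outgoing tunnel (source socket, destination socket).
# Slice B's UDP port on R2 is the unspecified "X" from the text, left as None here.
TRANSIT = {
    ("10.1.32.2", 22000): (("10.1.17.1", 22000), ("10.1.17.2", 44000)),  # slice A
    ("10.1.32.2", 33000): (("10.1.17.1", 33000), ("10.1.17.2", None)),   # slice B, port X
}

def rewrite(incoming: OuterHeaders) -> OuterHeaders:
    """Replace the incoming PlanetLab headers with those of the outgoing tunnel."""
    (src_ip, src_port), (dst_ip, dst_port) = TRANSIT[(incoming.dst_ip, incoming.dst_port)]
    return OuterHeaders(src_ip, src_port, dst_ip, dst_port)

# Slice A packet arriving on MI 1 (the incoming source socket is a made-up placeholder):
print(rewrite(OuterHeaders("10.1.32.1", 20000, "10.1.32.2", 22000)))
# OuterHeaders(src_ip='10.1.17.1', src_port=22000, dst_ip='10.1.17.2', dst_port=44000)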

Supercharged PlanetLab Platform Hardware Components

For a developer, the most important hardware components of an SPP are the Processing Engines (right). There are two types of Processing Engines (PEs): 1) General-Purpose Processing Engines (GPEs), and 2) Network Processor Engines (NPEs). A GPE is a conventional server blade running the standard PlanetLab operating system. A user can log into a GPE (using ssh) and can run processes that handle packets. An NPE includes two IXP 2850 network processors, each with 16 cores for processing packets and an XScale management processor. The NPE has a 10 GbE network connection and is capable of forwarding packets at 10 Gb/s.

All input and output passes through a Line Card (LC) which is an NPE that has been customized to route traffic between the external interfaces and the GPEs and NPEs. The LC has ten GbE interfaces, some of which will have public IP addresses while others will be used for direct connection to other SPP nodes. The Control Processor (CP) configures application slices based on slice descriptions obtained from PlanetLab Central, a centralized database that is used to manage the global PlanetLab infrastructure. The CP also hosts a netFPGA, allowing application developers to implement processing in configurable hardware, as well as software.

In our example, the SPP would process an incoming transit packet in the following way:

  • LC(in): Determine the slice context from the PlanetLab headers and forward the packet to the designated NPE
  • NPE: Create new PlanetLab headers, enqueue the packet for the outgoing MI, and forward the packet to the LC
  • LC(out): Add the Ethernet header and transmit the frame

Before describing how these steps are carried out in the SPP, it is worth noting that most of the abstractions presented to a user (e.g., queues, filters) are implemented with corresponding actual SPP components. A user's logical view (e.g., MI 3) is supported by mapping the user's logical resources onto the SPP's actual resources. A user's queues, filters and meta-interfaces are mapped to the SPP's actual queues, filters and meta-interfaces that have been assigned to the user as part of its context. Both the user's abstract components and the actual components have non-negative identifiers; e.g., a queue with QID 3. So, in our example, even though both users have a queue with queue ID 3, these are mapped to different actual queues with their own unique queue IDs. During packet processing, the user's logical identifiers are mapped to the SPP's actual (or internal) identifiers.
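
As a minimal illustration of this mapping (the internal identifier values below are invented), the per-slice logical queue IDs can be thought of as being resolved through a table keyed on the slice context:

# Minimal sketch (not the SPP implementation): both slices use logical queue ID 3,
# but each (slice, logical QID) pair resolves to a different actual queue.
# The internal queue ID values are invented for illustration.
ACTUAL_QID = {
    ("sliceA", 3): 1041,
    ("sliceB", 3): 2087,
}

def to_actual_qid(slice_name: str, logical_qid: int) -> int:
    return ACTUAL_QID[(slice_name, logical_qid)]

assert to_actual_qid("sliceA", 3) != to_actual_qid("sliceB", 3)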

These mappings are done by the TCAM in each NPE. The TCAM is a database of key-result pairs. It selects an entry whose key matches selected packet information and outputs the corresponding result.

The TCAM in the LC determines the slice context and related internal and external identifiers based on the PlanetLab headers of an incoming packet. These identifiers include:

  • FPid: Fastpath ID
  • Copt: Code option
  • rxMIid: Incoming Meta-Interface ID
  • VLANtag: VLAN tag to be used to identify fastpath
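
As a rough sketch of how such a lookup behaves (this is not the actual TCAM entry format; the entry values, masks and VLAN tags below are invented), a ternary match can be modeled as an ordered list of (key, mask, result) entries:

# Minimal sketch (not the SPP's TCAM format): model a TCAM as an ordered list of
# (key, mask, result) entries. A packet matches an entry when it agrees on every
# bit selected by the mask; the first match wins. Values below are invented.
import ipaddress

def make_key(dst_ip: str, dst_port: int) -> int:
    """Pack the PlanetLab destination IP and UDP port into one integer key."""
    return (int(ipaddress.IPv4Address(dst_ip)) << 16) | dst_port

LC_TCAM = [
    # (key, mask, result) -- result carries FPid, Copt, rxMIid, VLANtag
    (make_key("10.1.32.2", 22000), (1 << 48) - 1,
     {"FPid": 1, "Copt": 0, "rxMIid": 1, "VLANtag": 101}),   # slice A, MI 1
    (make_key("10.1.32.2", 33000), (1 << 48) - 1,
     {"FPid": 2, "Copt": 0, "rxMIid": 1, "VLANtag": 102}),   # slice B, MI 1
]

def lc_lookup(dst_ip: str, dst_port: int):
    key = make_key(dst_ip, dst_port)
    for entry_key, mask, result in LC_TCAM:
        if (key & mask) == (entry_key & mask):   # ternary match on the masked bits
            return result
    return None                                  # no slice context found

print(lc_lookup("10.1.32.2", 33000))   # -> slice B's context (FPid 2, VLANtag 102)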

[...[ IPv4 TCAM FILTER FIGURE ]...]

The figure (right) shows the format of the NPE's TCAM key and result fields. The key contains the key fields from the user's filters (e.g., destination IP address), and the result primarily contains the forwarding information from the user's filter (e.g., --txdaddr).
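
In the same spirit as the sketch above, an NPE lookup for our IPv4 example might key on the destination address of the user's (inner) packet within slice A's fastpath context and return the forwarding information a filter supplies. Apart from --txdaddr, the field names, prefixes and values below are invented for illustration.

# Minimal sketch (not the NPE's actual key/result layout): within slice A's
# fastpath, match the user packet's destination IP prefix and return the
# forwarding information a filter would supply. Prefixes and values are invented.
import ipaddress

NPE_FILTERS = [
    # (fastpath id, destination prefix, result resembling a filter's output fields)
    (1, ipaddress.ip_network("10.1.17.0/24"),
     {"txMIid": 2, "txdaddr": "10.1.17.2", "txdport": 44000, "qid": 1}),
    (1, ipaddress.ip_network("10.1.32.0/24"),
     {"txMIid": 1, "txdaddr": "10.1.32.1", "txdport": 22000, "qid": 0}),
]

def npe_lookup(fpid: int, inner_dst_ip: str):
    addr = ipaddress.ip_address(inner_dst_ip)
    for entry_fpid, prefix, result in NPE_FILTERS:
        if entry_fpid == fpid and addr in prefix:   # entries assumed in priority order
            return result
    return None

print(npe_lookup(1, "10.1.17.9"))
# -> {'txMIid': 2, 'txdaddr': '10.1.17.2', 'txdport': 44000, 'qid': 1}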


XXX


Part II: Configuring the SPP

xxx

Part III: A Virtual Circuit Router

xxx