OVS Conntrack Guide

OVS can be used together with the Linux connection tracking system (conntrack), so that OpenFlow flows can match on the state of TCP, UDP, ICMP, and other connections. (The connection tracking system supports tracking of both stateful and stateless protocols.)

This tutorial demonstrates how OVS can use the connection tracking system to match TCP segments from connection setup to connection teardown. It uses OVS with the Linux kernel module as the datapath. (That is, the datapath in which the openvswitch kernel module does the packet processing inside the Linux kernel.)

This demo was tested in the "master" branch of Open vSwitch.

Definitions

conntrack : the connection tracking module, used for stateful packet inspection.

Pipeline : the packet processing pipeline. It is the path a packet takes as it traverses the tables, where the packet is matched against the match fields of a flow in a table and the actions of the matched flow are performed.

Network namespace : a way to create multiple virtual routing domains within a single Linux kernel. Each network namespace has its own instance of the network tables (ARP, routing) and its own set of attached interfaces.

Flow : in this tutorial, an OpenFlow flow, which can be programmed with an OpenFlow controller or with the OVS command-line tools such as ovs-ofctl, which is used here. A flow has match fields and actions.

Conntrack-related fields

Match fields

OVS supports the following conntrack-related match fields:

  1. ct_state:

The state of the connection that the packet belongs to. Possible values are:

- *new*
- *est*
- *rel*
- *rpl*
- *inv*
- *trk*
- *snat*
- *dnat*

Each of these flags is prefixed with either a "+", meaning the flag must be set, or a "-", meaning the flag must be unset. Multiple flags can be specified together, e.g. ct_state=+trk+new. The usage of some of these flags is shown later in this tutorial. For a detailed description, see the OVS fields documentation (ovs-fields(7)).
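As an illustration of the "+" / "-" notation, the following Python sketch parses a ct_state specification and checks it against the flags a packet carries. It is a toy model for reading flow rules, not the parser that OVS itself uses, and the names parse_ct_state and ct_state_matches are made up for this example.

```python
import re

# Flag names listed above for the ct_state match field.
CT_STATE_FLAGS = {"new", "est", "rel", "rpl", "inv", "trk", "snat", "dnat"}


def parse_ct_state(spec):
    """Split a spec such as '+trk+new' into (must_set, must_unset) sets."""
    compact = spec.replace(" ", "")
    tokens = re.findall(r"([+-])([a-z]+)", compact)
    # Reject anything the regex skipped over, e.g. a flag without a sign.
    if "".join(sign + name for sign, name in tokens) != compact:
        raise ValueError(f"malformed ct_state spec: {spec!r}")
    must_set, must_unset = set(), set()
    for sign, name in tokens:
        if name not in CT_STATE_FLAGS:
            raise ValueError(f"unknown ct_state flag: {name!r}")
        (must_set if sign == "+" else must_unset).add(name)
    return must_set, must_unset


def ct_state_matches(spec, packet_flags):
    """Return True if a packet carrying packet_flags satisfies spec."""
    must_set, must_unset = parse_ct_state(spec)
    flags = set(packet_flags)
    # Flags the spec does not mention may be either set or unset.
    return must_set <= flags and not (must_unset & flags)
```

With this model, "+trk+est" is satisfied by a packet carrying trk, est and rpl, because flags that the specification does not mention are ignored, while "-trk" is satisfied only by a packet that has not yet been through the connection tracker.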

  2. ct_zone : a zone is an independent connection tracking context, which can be set by a ct action.

The 16-bit ct_zone value set by the most recent ct action (by an OpenFlow flow on a conntrack entry) can be used as a match field in another flow entry.

  3. ct_mark :
    the 32-bit metadata committed, by an action within the exec parameter of the ct action, to the connection to which the current packet belongs.

  4. ct_label :
    the 128-bit label committed, by an action within the exec parameter of the ct action, to the connection to which the current packet belongs.

  5. ct_nw_src / ct_ipv6_src :
    matches the IPv4 / IPv6 source address of the conntrack original-direction tuple.

  6. ct_nw_dst / ct_ipv6_dst :
    matches the IPv4 / IPv6 destination address of the conntrack original-direction tuple.

  7. ct_nw_proto :
    matches the IP protocol type of the conntrack original-direction tuple.

  8. ct_tp_src :
    matches the transport-layer source port of the conntrack original-direction tuple.

  9. ct_tp_dst :
    matches the transport-layer destination port of the conntrack original-direction tuple.

Actions

OVS supports the "ct" action for connection tracking.

*ct([argument][,argument...])*

The ct action sends the packet through the connection tracker.

It supports the following arguments:

  1. commit :
    commits the connection to the connection tracking module, which stores it beyond the lifetime of the packet in the pipeline.

  2. force :
    may be used in addition to the commit flag to effectively terminate the existing connection and start a new one in the current direction.

  3. table=number :
    forks pipeline processing in two. The original instance of the packet continues processing the current action list as an untracked packet. An additional instance of the packet is sent to the connection tracker and then re-injected into the OpenFlow pipeline to resume processing in table number, with ct_state and the other ct match fields set.

  4. zone=value or zone=src[start..end] :
    a 16-bit context ID that can be used to isolate connections into separate domains, allowing overlapping network addresses in different zones. If no zone is given, the default zone 0 is used.

  5. exec([action][,action...]) :
    performs a restricted set of actions within the context of connection tracking. Only actions that modify the ct_mark or ct_label fields are accepted within exec.

  6. alg=<ftp/tftp> :
    specifies an ALG (application layer gateway) to track specific connection types.

  7. nat :
    specifies the address and port translation for the connection being tracked.
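To make the argument syntax concrete, here is a small Python sketch that assembles a ct(...) action string from the arguments listed above. The helper name build_ct_action is hypothetical, the nat argument and the zone=src[start..end] form are left out for brevity, and the checks only reflect what this section states (force goes with commit, zone is 16 bits, alg is ftp or tftp).

```python
def build_ct_action(commit=False, force=False, table=None, zone=None,
                    exec_actions=(), alg=None):
    """Assemble a ct(...) action string (hypothetical helper)."""
    if force and not commit:
        raise ValueError("force is used in addition to commit")
    if zone is not None and not 0 <= zone <= 0xFFFF:
        raise ValueError("zone must fit in 16 bits")
    if alg is not None and alg not in ("ftp", "tftp"):
        raise ValueError("alg must be 'ftp' or 'tftp'")

    args = []
    if commit:
        args.append("commit")
    if force:
        args.append("force")
    if table is not None:
        args.append(f"table={table}")
    if zone is not None:
        args.append(f"zone={zone}")
    if exec_actions:
        # Only actions that modify ct_mark or ct_label belong here.
        args.append("exec(" + ",".join(exec_actions) + ")")
    if alg is not None:
        args.append(f"alg={alg}")
    return "ct(" + ",".join(args) + ")"


if __name__ == "__main__":
    print(build_ct_action(table=0))
    print(build_ct_action(commit=True))
    print(build_ct_action(commit=True, zone=5,
                          exec_actions=["set_field:1->ct_mark"]))
```

The first two calls produce the two forms used later in this tutorial, ct(table=0) and ct(commit). The third shows how a zone and an exec list would be combined.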

Example topology

This tutorial uses the following topology to carry out the tests.

         +                                                       +
         |                                                       |
         |                       +-----+                         |
         |                       |     |                         |
         |                       |     |                         |
         |     +----------+      | OVS |     +----------+        |
         |     |   left   |      |     |     |  right   |        |
         |     | namespace|      |     |     |namespace |        |
         +-----+        A +------+     +-----+ B        +--------+
         |     |          |    A'|     | B'  |          |        |
         |     |          |      |     |     |          |        |
         |     +----------+      |     |     +----------+        |
         |                       |     |                         |
         |                       |     |                         |
         |                       |     |                         |
         |                       +-----+                         |
         |                                                       |
         |                                                       |
         +                                                       +
     192.168.0.X n/w                                          10.0.0.X n/w

     A  = veth_l1
     A' = veth_l0
     B  = veth_r1
     B' = veth_r0

The steps for creating the topology above are as follows.

Create the "left" network namespace:

$ ip netns add left

Create a "right" namespace:

$ ip netns add right

Create the first pair of veth interfaces:

$ ip link add veth_l0 type veth peer name veth_l1

Add veth_l1 to the "left" network namespace:

$ ip link set veth_l1 netns left

Create a second pair of veth interfaces:

$ ip link add veth_r0 type veth peer name veth_r1

Add veth_r1 to the "right" network namespace:

$ ip link set veth_r1 netns right

Create a bridge br0:

$ ovs-vsctl add-br br0

Add the veth_l0 and veth_r0 interfaces to bridge br0:

$ ovs-vsctl add-port br0 veth_l0
$ ovs-vsctl add-port br0 veth_r0
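The same commands can be collected in a short Python script, which is convenient when the topology has to be rebuilt repeatedly. This is only a sketch: the run_setup helper is hypothetical, it defaults to a dry run that just prints each command, and running it for real requires root privileges and an installed Open vSwitch.

```python
import shlex
import subprocess

# The topology commands from this section, in the same order.
SETUP_COMMANDS = [
    "ip netns add left",
    "ip netns add right",
    "ip link add veth_l0 type veth peer name veth_l1",
    "ip link set veth_l1 netns left",
    "ip link add veth_r0 type veth peer name veth_r1",
    "ip link set veth_r1 netns right",
    "ovs-vsctl add-br br0",
    "ovs-vsctl add-port br0 veth_l0",
    "ovs-vsctl add-port br0 veth_r0",
]


def run_setup(dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    handled = []
    for command in SETUP_COMMANDS:
        argv = shlex.split(command)
        if dry_run:
            print("would run:", command)
        else:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)
        handled.append(argv)
    return handled


if __name__ == "__main__":
    run_setup(dry_run=True)
```

Note that veth interfaces are created in the down state, so they may need to be brought up (ip link set <dev> up, inside the namespace where applicable) before packets flow. That step is not part of the commands listed above.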

Packets generated in the "left" network namespace with source / destination IP addresses 192.168.0.X / 10.0.0.X, and packets generated in the "right" network namespace in the opposite direction, appear to the OVS switch as if hosts in two networks (192.168.0.X and 10.0.0.X) were communicating with each other.

This is essentially a simulation of two networks or subnets whose hosts communicate through an OVS in the middle.

Note :
a single pair of veth interfaces would be enough for the two network namespaces to communicate; the setup here is for demonstration only.

TCP packet generation tool

Scapy is used to generate the TCP packets. The steps in this tutorial were run with scapy on Ubuntu 16.04. (Installing scapy is beyond the scope of this document.)

Keep two scapy sessions active, one in each namespace:

 $ sudo ip netns exec left sudo `which scapy`

 $ sudo ip netns exec right sudo `which scapy`

Note: If you encounter the following error:

ifreq = ioctl(s, SIOCGIFADDR,struct.pack("16s16x",LOOPBACK_NAME))

IOError: [Errno 99] Cannot assign requested address

Execute this command:

$ sudo ip netns exec <namespace> sudo ip link set lo up

Matching TCP packets

TCP connection setup

Two simple flows can be added to OVS to forward packets from the "left" namespace to the "right" namespace and from "right" to "left":

 $ ovs-ofctl add-flow br0 \
          "table=0, priority=10, in_port=veth_l0, actions=veth_r0"

 $ ovs-ofctl add-flow br0 \
          "table=0, priority=10, in_port=veth_r0, actions=veth_l0"

Beyond these two simple flows, we will also add flows that match on the connection state of TCP segments.

We will send the TCP connection setup segments, namely syn, syn-ack and ack, between host 192.168.0.2 in the "left" network namespace and host 10.0.0.2 in the "right" network namespace.
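The scapy commands in this tutorial pass the TCP flags as a numeric bit mask (0x02, 0x12, 0x10 and 0x11). As a reading aid, the following Python sketch decodes such a mask into flag names using the standard TCP header bit positions. The helper name decode_tcp_flags is made up for this example.

```python
# Standard TCP flag bits, lowest bit first.
TCP_FLAG_BITS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
]


def decode_tcp_flags(value):
    """Return the names of the flags set in a TCP flags bit mask."""
    return [name for bit, name in TCP_FLAG_BITS if value & bit]


if __name__ == "__main__":
    # The four masks used by the scapy commands in this tutorial.
    for mask in (0x02, 0x12, 0x10, 0x11):
        print(hex(mask), decode_tcp_flags(mask))
```

So 0x02 is a plain SYN, 0x12 is SYN plus ACK, 0x10 is a plain ACK, and 0x11 is FIN plus ACK.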

First, we add a flow that starts "tracking" packets received at OVS.

How do we start tracking a packet?

To start tracking a packet, it first has to match a flow whose action is "ct". This action sends the packet through the connection tracker. To identify a packet as "untracked", the ct_state match field of the flow must be set to "-trk", which means the packet is not yet tracked. Only after the packet has been sent to the connection tracker do we know its conntrack state (i.e. whether the packet represents the start of a new connection, belongs to an existing connection, is malformed, and so on).

We add the following flow:

 (flow #1)
 $ ovs-ofctl add-flow br0 \
    "table=0, priority=50, ct_state=-trk, tcp, in_port=veth_l0, actions=ct(table=0)"

A TCP SYN packet sent from the "left" namespace matches flow #1, because the packet enters OVS through the veth_l0 port and is not yet tracked. (The packet has only just entered OVS, and every packet entering OVS for the first time is "untracked".)

Because of the "ct" action, the flow sends the packet to the connection tracker. The "table=0" argument of the "ct" action forks the pipeline processing in two. The original instance of the packet continues processing the current action list as an "untracked" packet. (Since there are no further actions, the original packet is dropped.)

The forked instance of the packet is sent to the connection tracker and then re-injected into the OpenFlow pipeline to resume processing in the specified table, with ct_state and the other ct match fields set. In this case, the packet comes back to table 0 with ct_state and the other ct match fields set.

Next, we add a flow to match the packet coming back from conntrack:

(flow #2)
$ ovs-ofctl add-flow br0 \
    "table=0, priority=50, ct_state=+trk+new, tcp, in_port=veth_l0, actions=ct(commit),veth_r0"

Since the packet is coming back from conntrack, its ct_state has the "trk" flag set.

In addition, if this is the first packet of the TCP connection, its ct_state has the "new" flag set. (That is the case here, because no TCP connection exists yet between 192.168.0.2 and 10.0.0.2.) The ct argument "commit" commits the connection to the connection tracking module. The significance of this is that the information about the connection is stored in the connection tracking module beyond the lifetime of the packet in the pipeline.

Send the TCP SYN segment using scapy (in the "left" namespace scapy session) (flags=0x02 is SYN):

$ >>> sendp(Ether()/IP(src="192.168.0.2", dst="10.0.0.2")/TCP(sport=1024, dport=2048, flags=0x02, seq=100), iface="veth_l1")

This packet matches flow #1 and flow #2.

The conntrack module now has an entry for this connection:

$ ovs-appctl dpctl/dump-conntrack | grep "192.168.0.2"
tcp,orig=(src=192.168.0.2,dst=10.0.0.2,sport=1024,dport=2048),reply=(src=10.0.0.2,dst=192.168.0.2,sport=2048,dport=1024),protoinfo=(state=SYN_SENT)
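When checking entries repeatedly, it can help to turn a line of this output into a dictionary. The Python sketch below does that for lines shaped like the one above. The function name parse_conntrack_entry is made up, and the parser only understands the name=(key=value,...) groups seen in this tutorial, so real output that carries additional fields may need more handling.

```python
import re


def parse_conntrack_entry(line):
    """Parse one dump-conntrack line into nested dictionaries."""
    # The protocol name is the first comma-separated token.
    entry = {"proto": line.split(",", 1)[0].strip()}
    # Each group looks like name=(key=value,key=value,...).
    for name, body in re.findall(r"(\w+)=\(([^()]+)\)", line):
        entry[name] = dict(item.split("=", 1) for item in body.split(","))
    return entry


if __name__ == "__main__":
    sample = (
        "tcp,orig=(src=192.168.0.2,dst=10.0.0.2,sport=1024,dport=2048),"
        "reply=(src=10.0.0.2,dst=192.168.0.2,sport=2048,dport=1024),"
        "protoinfo=(state=SYN_SENT)"
    )
    entry = parse_conntrack_entry(sample)
    print(entry["proto"], entry["orig"]["src"], entry["protoinfo"]["state"])
```

All values are kept as strings, so ports would need an explicit int() conversion before being compared numerically.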

Note: At this stage, if the TCP SYN packet is re-transmitted, it again matches flow #1 (since a newly arrived packet is always untracked) and it also matches flow #2 again. It matches flow #2 because, although conntrack has information about this connection, the connection is not yet in the "ESTABLISHED" state, so the packet matches the "new" state again.

Next, for the TCP SYN+ACK packet coming from the opposite (server) direction, we need the following flows in OVS:

(flow #3)
$ ovs-ofctl add-flow br0 \
    "table=0, priority=50, ct_state=-trk, tcp, in_port=veth_r0, actions=ct(table=0)"
(flow #4)
$ ovs-ofctl add-flow br0 \
    "table=0, priority=50, ct_state=+trk+est, tcp, in_port=veth_r0, actions=veth_l0"

Flow #3 matches the untracked packet coming back from the server (10.0.0.2) and sends it to conntrack. (Alternatively, flow #1 and flow #3 could be combined into a single flow by removing the "in_port" match field.)

After conntrack has processed the TCP SYN+ACK packet, its ct_state has the "est" flag set.

Note: conntrack sets ct_state to "est" as soon as it has seen traffic in both directions. Until it has also seen the third packet (the ACK from the client), however, the conntrack entry is kept with a short cleanup timer.

Send the TCP SYN+ACK segment using scapy (in the "right" namespace scapy session) (flags=0x12 is ACK and SYN):

$ >>> sendp(Ether()/IP(src="10.0.0.2", dst="192.168.0.2")/TCP(sport=2048, dport=1024, flags=0x12, seq=200, ack=101), iface="veth_r1")

This packet matches flow #3 and flow #4.

conntrack entry:

 $ ovs-appctl dpctl/dump-conntrack | grep "192.168.0.2"

 tcp,orig=(src=192.168.0.2,dst=10.0.0.2,sport=1024,dport=2048),reply=(src=10.0.0.2,dst=192.168.0.2,sport=2048,dport=1024),protoinfo=(state=ESTABLISHED)

The conntrack state becomes "ESTABLISHED" on receiving just the SYN and SYN+ACK packets. At this point, however, if the third packet (the ACK from the client) is not received, the connection is soon removed from conntrack.

Next, for the TCP ACK packet from the client direction, we add the following flow to match it:

(flow #5)
$ ovs-ofctl add-flow br0 \
    "table=0, priority=50, ct_state=+trk+est, tcp, in_port=veth_l0, actions=veth_r0"

Send the third packet, the TCP ACK segment, using scapy (in the "left" namespace scapy session) (flags=0x10 is ACK):

$ >>> sendp(Ether()/IP(src="192.168.0.2", dst="10.0.0.2")/TCP(sport=1024, dport=2048, flags=0x10, seq=101, ack=201), iface="veth_l1")

This packet matches flow #1 and flow #5.

conntrack entry:

$ ovs-appctl dpctl/dump-conntrack | grep "192.168.0.2"

 tcp,orig=(src=192.168.0.2,dst=10.0.0.2,sport=1024,dport=2048), \
     reply=(src=10.0.0.2,dst=192.168.0.2,sport=2048,dport=1024), \
                                     protoinfo=(state=ESTABLISHED)

The conntrack state stays "ESTABLISHED", but now that conntrack has received the ACK from the client, the entry remains in this state for a long time, even if no data is received on this connection.

TCP data

When a TCP segment carrying one byte of payload is sent from 192.168.0.2 to 10.0.0.2, it matches flow #1 and then flow #5.

Send a TCP segment with one byte of data using scapy (in the "left" namespace scapy session) (flags=0x10 is ACK):

$ >>> sendp(Ether()/IP(src="192.168.0.2", dst="10.0.0.2")/TCP(sport=1024, dport=2048, flags=0x10, seq=101, ack=201)/"X", iface="veth_l1")

Send the TCP ACK for the above segment using scapy (in the "right" namespace scapy session) (flags=0x10 is ACK):

$ >>> sendp(Ether()/IP(src="10.0.0.2", dst="192.168.0.2")/TCP(sport=2048, dport=1024, flags=0X10, seq=201, ack=102), iface="veth_r1")

The ACK for the data matches flow #3 and flow #4.

TCP connection teardown

There are different ways to tear down a TCP connection. Here the client sends "FIN", the server replies with "FIN+ACK", and the client then sends the final "ACK" to complete the teardown.

All packets from the client to the server hit flow #1 and flow #5. All packets from the server to the client hit flow #3 and flow #4. It is worth noting that even while the TCP connection is being torn down, all the packets (which are actually tearing down the connection) still match the "+est" state. A packet whose conntrack entry is, or was, in the "ESTABLISHED" state continues to match the "+est" ct_state flag in OVS.

Note: In fact, even when the conntrack entry is in the "TIME_WAIT" state (after all the FIN and ACK packets of the teardown have been exchanged), a re-transmitted data packet (from 192.168.0.2 -> 10.0.0.2) still hits flows #1 and #5.

Send the TCP FIN segment using scapy (in the "left" namespace scapy session) (flags=0x11 is ACK and FIN):

$ >>> sendp(Ether()/IP(src="192.168.0.2", dst="10.0.0.2")/TCP(sport=1024, dport=2048, flags=0x11, seq=102, ack=201), iface="veth_l1")

This packet hits flow #1 and flow #5.

conntrack entry:

$ sudo ovs-appctl dpctl/dump-conntrack | grep "192.168.0.2"

  tcp,orig=(src=192.168.0.2,dst=10.0.0.2,sport=1024,dport=2048),reply=(src=10.0.0.2,dst=192.168.0.2,sport=2048,dport=1024),protoinfo=(state=FIN_WAIT_1)

Send the TCP FIN+ACK segment using scapy (in the "right" namespace scapy session) (flags=0x11 is ACK and FIN):

$ >>> sendp(Ether()/IP(src="10.0.0.2", dst="192.168.0.2")/TCP(sport=2048, dport=1024, flags=0X11, seq=201, ack=103), iface="veth_r1")

This packet hits flow #3 and flow #4.

conntrack entry:

$ sudo ovs-appctl dpctl/dump-conntrack | grep "192.168.0.2"

  tcp,orig=(src=192.168.0.2,dst=10.0.0.2,sport=1024,dport=2048),reply=(src=10.0.0.2,dst=192.168.0.2,sport=2048,dport=1024),protoinfo=(state=LAST_ACK)

Send the TCP ACK segment using scapy (in the "left" namespace scapy session) (flags=0x10 is ACK):

$ >>> sendp(Ether()/IP(src="192.168.0.2", dst="10.0.0.2")/TCP(sport=1024, dport=2048, flags=0x10, seq=103, ack=202), iface="veth_l1")

This packet hits flow #1 and flow #5.

conntrack entry:

$ sudo ovs-appctl dpctl/dump-conntrack | grep "192.168.0.2"

  tcp,orig=(src=192.168.0.2,dst=10.0.0.2,sport=1024,dport=2048),reply=(src=10.0.0.2,dst=192.168.0.2,sport=2048,dport=1024),protoinfo=(state=TIME_WAIT)
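The seq and ack values in the scapy commands above follow the usual TCP rule that SYN and FIN each consume one sequence number and every payload byte consumes one more. The following Python sketch replays the whole exchange and derives the same numbers. The Endpoint class and the send helper are hypothetical names for this illustration, and the model ignores everything else TCP does (windows, retransmission, options).

```python
from dataclasses import dataclass

FIN, SYN, ACK = 0x01, 0x02, 0x10


@dataclass
class Endpoint:
    snd_nxt: int      # next sequence number this side will send
    rcv_nxt: int = 0  # next sequence number expected from the peer


def send(sender, receiver, flags, payload=b""):
    """Return (seq, ack) for one segment and advance both endpoints."""
    seq, ack = sender.snd_nxt, sender.rcv_nxt
    # SYN and FIN each consume one sequence number, payload one per byte.
    consumed = len(payload) + bool(flags & SYN) + bool(flags & FIN)
    sender.snd_nxt += consumed
    receiver.rcv_nxt = sender.snd_nxt
    return seq, ack


if __name__ == "__main__":
    client, server = Endpoint(snd_nxt=100), Endpoint(snd_nxt=200)
    exchange = [
        ("SYN", client, server, SYN, b""),
        ("SYN+ACK", server, client, SYN | ACK, b""),
        ("ACK", client, server, ACK, b""),
        ("data 'X'", client, server, ACK, b"X"),
        ("ACK of data", server, client, ACK, b""),
        ("FIN+ACK (client)", client, server, FIN | ACK, b""),
        ("FIN+ACK (server)", server, client, FIN | ACK, b""),
        ("final ACK", client, server, ACK, b""),
    ]
    for label, src, dst, flags, payload in exchange:
        print(label, send(src, dst, flags, payload))
```

Starting from the initial sequence numbers 100 (client) and 200 (server) used in this tutorial, the replay yields seq=102, ack=201 for the client's FIN and seq=103, ack=202 for the final ACK, matching the commands above.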

Summary

The following table summarizes the TCP segments exchanged and the flow match fields they hit:

  +-------------------------------------------------------+-------------------+
  |                     TCP Segment                       |ct_state(flow#)    |
  +=======================================================+===================+
  |                     **Connection Setup**              |                   |
  +-------------------------------------------------------+-------------------+
  |192.168.0.2 → 10.0.0.2 [SYN] Seq=0                     | -trk(#1) then     |
  |                                                       | +trk+new(#2)      |
  +-------------------------------------------------------+-------------------+
  |10.0.0.2 → 192.168.0.2 [SYN, ACK] Seq=0 Ack=1          | -trk(#3) then     |
  |                                                       | +trk+est(#4)      |
  +-------------------------------------------------------+-------------------+
  |192.168.0.2 → 10.0.0.2 [ACK] Seq=1 Ack=1               | -trk(#1) then     |
  |                                                       | +trk+est(#5)      |
  +-------------------------------------------------------+-------------------+
  |                     **Data Transfer**                 |                   |
  +-------------------------------------------------------+-------------------+
  |192.168.0.2 → 10.0.0.2 [ACK] Seq=1 Ack=1               | -trk(#1) then     |
  |                                                       | +trk+est(#5)      |
  +-------------------------------------------------------+-------------------+
  |10.0.0.2 → 192.168.0.2 [ACK] Seq=1 Ack=2               | -trk(#3) then     |
  |                                                       | +trk+est(#4)      |
  +-------------------------------------------------------+-------------------+
  |                     **Connection Teardown**           |                   |
  +-------------------------------------------------------+-------------------+
  |192.168.0.2 → 10.0.0.2 [FIN, ACK] Seq=2 Ack=1          | -trk(#1) then     |
  |                                                       | +trk+est(#5)      |
  +-------------------------------------------------------+-------------------+
  |10.0.0.2 → 192.168.0.2 [FIN, ACK] Seq=1 Ack=3          | -trk(#3) then     |
  |                                                       | +trk+est(#4)      |
  +-------------------------------------------------------+-------------------+
  |192.168.0.2 → 10.0.0.2 [ACK] Seq=3 Ack=2               | -trk(#1) then     |
  |                                                       | +trk+est(#5)      |
  +-------------------------------------------------------+-------------------+

Note: The sequence and acknowledgment numbers shown are relative, as displayed in a tshark capture.
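The flow hits in the table can be reproduced with a toy model of the pipeline. The Python sketch below is a deliberate simplification and not how OVS or conntrack is implemented: ToyConntrack and flows_hit are made-up names, a connection counts as "est" as soon as a packet has been seen in the reply direction, and TCP states and timeouts are not modeled at all.

```python
# The five flows of this tutorial:
# (name, in_port, flags that must be set, flags that must be unset, actions)
FLOWS = [
    ("#1", "veth_l0", set(), {"trk"}, ["ct(table=0)"]),
    ("#2", "veth_l0", {"trk", "new"}, set(), ["ct(commit)", "veth_r0"]),
    ("#3", "veth_r0", set(), {"trk"}, ["ct(table=0)"]),
    ("#4", "veth_r0", {"trk", "est"}, set(), ["veth_l0"]),
    ("#5", "veth_l0", {"trk", "est"}, set(), ["veth_r0"]),
]


class ToyConntrack:
    """Very rough stand-in for conntrack: one entry per committed tuple."""

    def __init__(self):
        self.reply_seen = {}  # original-direction tuple -> reply seen yet?

    def lookup(self, pkt):
        src, dst, sport, dport = pkt
        reverse = (dst, src, dport, sport)
        if reverse in self.reply_seen:   # packet travels in reply direction
            self.reply_seen[reverse] = True
            return {"trk", "est", "rpl"}
        if self.reply_seen.get(pkt):     # original direction, reply was seen
            return {"trk", "est"}
        return {"trk", "new"}            # unknown or still one-directional

    def commit(self, pkt):
        self.reply_seen.setdefault(pkt, False)


def flows_hit(ct, in_port, pkt):
    """Return the names of the flows a packet hits, in order."""
    hits, flags = [], set()              # a fresh packet is untracked
    while True:
        flow = next((f for f in FLOWS
                     if f[1] == in_port and f[2] <= flags
                     and not (f[3] & flags)), None)
        if flow is None:
            return hits
        hits.append(flow[0])
        if "ct(commit)" in flow[4]:
            ct.commit(pkt)
        if "ct(table=0)" not in flow[4]:
            return hits
        flags = ct.lookup(pkt)           # recirculate with ct_state set


if __name__ == "__main__":
    ct = ToyConntrack()
    c2s = ("192.168.0.2", "10.0.0.2", 1024, 2048)
    s2c = ("10.0.0.2", "192.168.0.2", 2048, 1024)
    print("SYN    ", flows_hit(ct, "veth_l0", c2s))
    print("SYN+ACK", flows_hit(ct, "veth_r0", s2c))
    print("ACK    ", flows_hit(ct, "veth_l0", c2s))
```

In this model the SYN hits #1 then #2, the SYN+ACK hits #3 then #4, and every later client packet hits #1 then #5, which is the pattern in the table. A SYN re-transmitted before any reply is seen hits #1 and #2 again, as described in the note in the connection setup section.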

Flow table

    (flow #1)
    $ ovs-ofctl add-flow br0 \
        "table=0, priority=50, ct_state=-trk, tcp, in_port=veth_l0, actions=ct(table=0)"

    (flow #2)
    $ ovs-ofctl add-flow br0 \
        "table=0, priority=50, ct_state=+trk+new, tcp, in_port=veth_l0, actions=ct(commit),veth_r0"

    (flow #3)
    $ ovs-ofctl add-flow br0 \
        "table=0, priority=50, ct_state=-trk, tcp, in_port=veth_r0, actions=ct(table=0)"

    (flow #4)
    $ ovs-ofctl add-flow br0 \
        "table=0, priority=50, ct_state=+trk+est, tcp, in_port=veth_r0, actions=veth_l0"

    (flow #5)
    $ ovs-ofctl add-flow br0 \
        "table=0, priority=50, ct_state=+trk+est, tcp, in_port=veth_l0, actions=veth_r0"

Origin blog.csdn.net/sinat_20184565/article/details/94482558