Open vSwitch and OVN 2021 Fall Conference
The Open vSwitch and OVN 2021 Fall Conference was held online Dec. 7 and 8.
Speaker: Eelco Chaudron (Red Hat, Inc.)
This talk goes into the benefits of having statically defined user- and kernel-space tracepoints. The goal is to show how some simple static tracepoints can help you troubleshoot issues that would normally require a custom OVS build or a complex perf script. We will also show how existing tools, like bpftrace and the bcc tool suite, can help you quickly write custom tracepoint-based debug scripts.
Speaker: Yi Yang (Inspur, Inc.)
OVS DPDK is basically unusable if you enable DVR (distributed virtual routing), and its performance is very poor even in centralized mode. We have dramatically improved OVS DPDK performance for VM-to-VM communication within the same subnet by enabling VXLAN TSO, GRO, GSO, etc. For traffic across subnets and floating IP traffic, however, performance is still unacceptable even with TSO and batch processing enabled for tap interfaces. The only way forward is to not use tap interfaces for OVS DPDK at all, and this is indeed possible. We proposed https://review.opendev.org/c/openstack/neutron-specs/+/796746, which has been approved; the implementation is available at https://review.opendev.org/c/openstack/neutron/+/805930, and https://review.opendev.org/c/openstack/os-ken/+/795963 has been merged. We have implemented OpenFlow-based DVR L3 with no tap interfaces involved. The L3 performance improvement is remarkable and, importantly, it also benefits the OVS kernel datapath and improves smart NIC offload. Making OVS DPDK a first-class citizen in the OpenStack community is now close to reality. In this full talk, the speaker will cover more technical details and demonstrate the performance improvement with an apples-to-apples comparison.
Speaker: Simon Horman (Corigine, Inc.)
QoS is a critical component of production systems to ensure network resources are utilised within provisioned parameters, and several QoS features have been added to OVS over time. This presentation will look at recent developments to enhance hardware offload support of two QoS facilities: packet-per-second ingress policing, and metering. The focus will be on hardware offload using TC in conjunction with the kernel datapath.
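To make the idea of packet-per-second policing concrete, here is a minimal sketch of a generic token-bucket policer that counts packets rather than bytes. It is an illustration only, not the OVS, TC, or hardware implementation discussed in the talk; the class name `PpsPolicer` and its parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PpsPolicer:
    """Generic token-bucket policer that counts packets, not bytes.

    Hypothetical illustration; not the OVS or TC implementation.
    """
    rate_pps: float        # sustained rate, in packets per second
    burst_pkts: float      # bucket depth, in packets
    tokens: float = 0.0    # current fill level of the bucket
    last_ts: float = 0.0   # timestamp of the previous packet, in seconds

    def __post_init__(self) -> None:
        # Start with a full bucket so an initial burst is admitted.
        self.tokens = self.burst_pkts

    def allow(self, now: float) -> bool:
        """Return True if a packet arriving at time `now` conforms."""
        elapsed = max(0.0, now - self.last_ts)
        # Refill in proportion to elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst_pkts,
                          self.tokens + elapsed * self.rate_pps)
        self.last_ts = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0   # each packet costs one token
            return True
        return False             # non-conforming packet: drop it
```

With `PpsPolicer(rate_pps=8, burst_pkts=4)`, a burst of six packets at the same instant admits the first four and drops the rest; a quarter of a second later the bucket has refilled by two packets.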
Speaker: Gaetan Rivet (NVIDIA Corporation)
As hardware throughput capability increases, software network functions rely on hardware offloads to cope with the amount of traffic processed. Offloads are constantly improved in Open vSwitch, but they are only useful to the extent that they are synchronized with the software switch. In the userland datapath, offload processing is deferred to a separate thread to reduce the impact on the data plane. The asymmetry between a single thread serializing offload requests and the multiple initiators creates a bottleneck that can disrupt hardware network functions. This talk describes the work done to accelerate the offload control plane. Relevant metrics are identified, motivating a re-design of the userland offload layer and a multi-thread architecture is proposed to allow the control plane to cope with the workload. An implementation is proposed and the improvement is measured along the defined metrics. This talk presents work that has been proposed for integration to the community and is currently in review: https://patchwork.ozlabs.org/project/openvswitch/list/?series=261424&state=*
Speaker: Han Zhou (NVIDIA Corporation)
This talk will first introduce the OVN control plane scalability improvements we have achieved recently at NVIDIA, especially for ovn-kubernetes and network policy (ACL) related use cases. It will then cover the next scale challenges and some general principles and ideas for further improvements.
Speaker: Nick Bouliane (DigitalOcean, Inc)
At DigitalOcean, just like most public clouds, we provide a metadata service that our customers can access via HTTP. In this session we will see how we leverage Open vSwitch to create the datapath required to access the metadata service. We will also learn what the purpose of a metadata service is and see how we use go-openvswitch to programmatically install the flows in Open vSwitch.
Speakers: Pradipta Sahoo (Red Hat, Inc.), Haresh Khandelwal (Red Hat, Inc.)
GENEVE overlay encapsulation is ready to serve overlay traffic for NFV use cases. Validating the throughput of a GENEVE overlay network against the network line rate, however, remains a challenge. So far, overlay network performance measurement has relied on a TEP gateway that performs the encap/decap operations while throughput is measured; this consumes processing cycles per packet and does not reach line-rate performance. To avoid this performance drop, we explored the open source traffic generator TRex to compose packets with a GENEVE encapsulation header and reach line rate without a tunnel endpoint gateway. We would like to showcase the following at this conference:
- An operator's outlook on Geneve hw-offload overlay network performance with the ml2/OVN core service provided by OpenStack
- TRex traffic generator configuration steps with the enhancement
- The problems encountered while working on OVN-HW-Offload with the GENEVE protocol, and the resolutions applied to them
- Fine-tuning of the required parameters to achieve near line-rate throughput
- The numbers observed and analysed under the best possible use case
- Sharing our observations and findings with the wider audience
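As background for the line-rate discussion above, the following sketch computes the per-packet GENEVE encapsulation overhead and the theoretical Ethernet packet rate for a given frame size. It assumes an outer IPv4 header without options and no VLAN tag; the function names and the 10 Gb/s link speed in the usage note are illustrative assumptions, not figures from the talk.

```python
ETH_HDR = 14         # outer Ethernet header, no VLAN tag (assumption)
IPV4_HDR = 20        # outer IPv4 header without options (assumption)
UDP_HDR = 8          # outer UDP header
GENEVE_BASE_HDR = 8  # fixed Geneve header (RFC 8926), options excluded
ETH_FCS = 4          # frame check sequence of the outer frame
WIRE_OVERHEAD = 20   # preamble + SFD (8 bytes) and inter-frame gap (12 bytes)


def geneve_overhead(option_bytes: int = 0) -> int:
    """Bytes added in front of the inner Ethernet frame by encapsulation."""
    if option_bytes < 0 or option_bytes % 4:
        raise ValueError("Geneve options are a non-negative multiple of 4 bytes")
    return ETH_HDR + IPV4_HDR + UDP_HDR + GENEVE_BASE_HDR + option_bytes


def outer_frame_bytes(inner_frame_no_fcs: int, option_bytes: int = 0) -> int:
    """Size of the encapsulated frame on the underlay, including its FCS."""
    return inner_frame_no_fcs + geneve_overhead(option_bytes) + ETH_FCS


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Theoretical maximum packets per second for frames of a given size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD) * 8)
```

For example, `geneve_overhead()` is 50 bytes, so a 60-byte inner frame becomes a 114-byte outer frame, and `line_rate_pps(10e9, 64)` gives roughly 14.88 million packets per second for minimum-size frames on a hypothetical 10 Gb/s link.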
Speakers: Adrián Moreno (Red Hat, Inc.), Flavio Fernandes (Red Hat, Inc.)
With OVSDB at the core of the OVS and OVN control planes, there's never too much OVSDB monitoring. The ability to print the content of a table in a user-friendly manner or quickly visualize what changes occur on the database while they happen greatly helps in understanding the logic behind OVN and OVS applications. What started as a toy project to test libovsdb, the golang OVSDB IDL library, has now become a useful OVSDB monitoring tool. In this short talk we'd like to show the community how ovsdb-mon works, how easy it is to deploy on an OVN cluster and how to use it to quickly peek at a running ovsdb-server.
Speakers: Adrián Moreno (Red Hat, Inc.), Salvatore Daniele (Red Hat, Inc.)
Troubleshooting Open vSwitch and OVN issues is a path full of thorns. It typically involves staring deeply at a huge dump of flows (OpenFlow or datapath) while trying to get context of what OVN logical entity created them. Apart from the eye strain that is inevitably associated with this task, it's sometimes difficult to extract the datapath logic out of the plain list of flows. This becomes even more difficult when you don't have access to the live system and you only have a set of logs, because you cannot make use of our beloved ovs/ovn utilities. We have tried to make these troubleshooting tasks a bit less painful by creating a set of tools that facilitate the visualization of OpenFlow and datapath flows without leaving your console, as well as automate the creation of an offline debugging environment that mimics the real one. In this talk, we'd like to present these new tools and demo them to the community. Apart from helping everyone debug OVS and OVN issues, we hope to get feedback from the community that can inspire our next steps in this effort to improve OVS and OVN debuggability.
Speaker: Kevin Traynor (Red Hat, Inc.)
It is often seen that when PMD cores are a bottleneck in OVS-DPDK for throughput, performance can be scaled by adding more PMD cores. However, this can only be achieved when the datapath workload is distributed and balanced across those additional PMD cores. In this presentation we will look at some of the new functionality added to the PMD auto load balance feature in recent OVS releases. We will examine how they can be used to better balance the datapath workload across PMD cores and give the user more control of when a rebalance should occur.
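The rebalance decision described above can be sketched as a simple check: rebalance only when some PMD core is overloaded and a proposed reassignment would reduce the variance of the per-core load by a sufficient margin. This is a simplified illustration under assumed names and threshold values, not the OVS implementation; `should_rebalance` and its parameters are hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def should_rebalance(current_loads: Sequence[float],
                     estimated_loads: Sequence[float],
                     load_threshold: float = 95.0,
                     improvement_threshold: float = 25.0) -> bool:
    """Decide whether a proposed PMD rebalance is worthwhile.

    current_loads:   per-PMD load in percent with the present assignment
    estimated_loads: per-PMD load in percent predicted after reassignment
    Threshold defaults are illustrative assumptions.
    """
    # Only consider rebalancing when at least one PMD core is overloaded.
    if not any(load >= load_threshold for load in current_loads):
        return False

    current_var = pvariance(current_loads)
    if current_var == 0:
        return False  # load is already perfectly balanced

    estimated_var = pvariance(estimated_loads)
    # Percentage reduction in load variance from the reassignment.
    improvement = (current_var - estimated_var) * 100.0 / current_var
    return improvement >= improvement_threshold
```

For instance, moving from loads of `[98, 20]` to an estimated `[60, 58]` removes almost all of the variance and would trigger a rebalance, while an estimated `[96, 22]` improves variance by only about 10% and would not.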
Speaker: Dan Winship (Red Hat, Inc.)
Some Smart NICs have supported offloading OpenFlow processing for a while now. Now there are cards (such as the Mellanox BlueField-2) with an entire ARM system on the NIC, allowing "offload" of arbitrary software. We will show how we are modifying OpenShift to allow offloading the entire ovn-kubernetes stack onto the NIC, accelerating the network, freeing up more host CPU, and (eventually) providing increased network security by adding more separation between the control plane and the data plane.
Speakers: Sunil Pai G (Intel Corporation), Cian Ferriter (Intel Corporation), Harry Van Haaren (Intel Corporation)
Today, in many OVS deployments, a large portion of CPU time is being spent on just copying the data to and from the VM. These copy operations could be asynchronously offloaded to a DMA device to save CPU cycles and provide better performance. There are many potential solutions to enable such an offload in OVS. Currently, DPDK provides asynchronous APIs for vhost-user devices for this exact purpose. In this presentation, we explore a few of these approaches with their benefits and drawbacks.
Speaker: Abhiram Sangana (Nutanix, Inc.)
By default, OVN supports only one distributed gateway port (DGP) per logical router. This prevents connecting a logical router to multiple external networks. In this talk we will explain some common scenarios that require a logical router to be connected to different external networks, and how routing, NAT and load balancing will work with multiple DGPs.
Speaker: Numan Siddique (Red Hat, Inc.)
kind (https://github.com/kubernetes-sigs/kind) is a tool for running local Kubernetes clusters using Docker container "nodes" for simple testing and development purposes. This talk will show, with a demo:
- how to extend this kind k8s cluster across multiple machines, adding multiple worker (fake container) nodes to test your scale needs.
- how OVS and OVN are used to provide the networking for these container "nodes".
Speaker: Ben Pfaff (VMware, Inc.)
Network control planes built from an SDN perspective, including OVN, normally require centralized controller infrastructure. These centralized controllers are commonly built in conventional programming languages, such as C or Java. Writing code in these languages that is “obviously correct” and maintainable is challenging when it also must achieve high performance at scale. On the other hand, domain-specific languages such as DDlog promise high performance for incremental changes but they require developers to learn a new language and the associated paradigm. They also introduce new dependencies and change the way that one debugs and troubleshoots a system. I will present a case study of these alternatives for OVN. OVN’s traditional ovn-northd central controller, written in C, benefits from the many developers who are already well positioned to optimize and maintain it. On the other hand, the ovn-northd-ddlog controller, whose processing core is the DDlog domain-specific language, performs better at scale in many cases when frequent changes to the input yield small changes to the output. In OVN’s case, the traditional approach won. This talk will examine the trade-offs in detail, with attention to raw performance plus factors beyond it.
Speaker: William Tu (VMware, Inc.)
Currently OVS supports multiple datapath implementations: Linux kernel, Windows kernel, userspace with OVS-DPDK on Linux, and HW offload. Adding any new feature to the OVS datapath requires OS-specific expertise and usually ends up with feature mismatch (e.g., the Linux kernel supports feature A, but Windows does not) and high maintenance cost. It would be great if OVS used a single datapath across different platforms, and if that datapath were portable, performant, and easy to maintain. The natural choice is dpif-netdev, the userspace netdev datapath currently used by OVS-DPDK and AF_XDP, which currently only works on the Linux platform. So the last piece is to make OVS-DPDK run on Windows. With this work, the OVS userspace datapath becomes the one that runs on Windows/Linux/FreeBSD, and any new datapath feature should naturally work across different OSes. Performance of the userspace datapath should be equal to or better than that of the kernel datapath, due to the kernel bypass design of DPDK/AF_XDP and optimizations in the OVS userspace datapath. Moving forward, OVS will have better cross-platform support, better performance, and easier maintenance. So far I haven't seen any virtual switch capable of doing all of the above. RFC patch: https://mail.openvswitch.org/pipermail/ovs-dev/2021-October/388338.html  https://dl.acm.org/doi/10.1145/3452296.3472914
Speakers: Ariel Levkovich (NVIDIA Corporation), Marcelo Leitner (Red Hat, Inc.)
Cloud native and modern data center applications, such as AI, demand high-speed networking. 25 Gb/s and 100 Gb/s are the standard today, with 200 Gb/s and 400 Gb/s coming soon. Such network speeds make the use of SDN very challenging, mostly due to the heavy CPU investment needed for SDN enforcement. In this talk we will present the progress we have made so far in accelerating OVN, OVN Kubernetes and OVN ML2 (OpenStack) via TC and NVIDIA ConnectX SmartNICs and BlueField DPUs, including the support for Conntrack hardware acceleration. We will also present the biggest challenges we have faced in this endeavor and the improvements that have resulted from it so far. Some of these improvements helped not only with software datapath correctness but also with performance, even when using other OVS datapaths, like the OVS kernel datapath. We will also show performance numbers for the main use cases and future plans.
A new solution to support connection tracking in OVS to get higher performance in the software datapath in OVS.
Speaker: Linsi Yuan (Jaguar Microsystems)
Abstract: Security groups are a must-have feature for most usage scenarios in cloud environments, and they rely heavily on the architecture and design of the connection tracking feature. Currently we see a performance degradation after adding the ct action, and it is hard to offload the ct action to the hardware NIC with the current design. In this talk, we will propose a new solution, developed together with a cloud service provider, to support the connection tracking feature. This solution has already been deployed in an online production environment, and we see a nearly 300% performance improvement in the OVS software datapath. At the same time, the new architecture will benefit hw-offload support, making it much easier to offload traffic to the NIC.
Speakers: Ian Stokes (Intel Corporation), Michael Phelan (Intel Corporation), Aaron Conole (Red Hat, Inc.)
To date, patches for OVS have been validated by a variety of means such as GitHub, AppVeyor, and Travis. However, with the introduction of certain features such as support for the AVX512 ISA in OVS DPDK, validation of unit tests requires specific platform capabilities. This talk looks to outline the ongoing joint effort between Intel and Red Hat to introduce a CI system working in tandem with the existing 0-day robot to make validation of such areas automated and publicly viewable for any given patch submitted to the mailing list. This system would also be extensible, allowing for future validation of other aspects of OVS DPDK such as performance, HWOL and basic DPDK library functionality.
Speaker: Usman Ansari (VMware, Inc.)
OVS on Windows is getting popular! VMware is dedicating resources to add new features (I plan to contribute to OVS on the Windows platform in the coming days). Also, there is renewed interest in the industry in using OVS on Windows. People are discussing topics like Kubernetes and Windows using OVS & OVN on Azure (https://www.youtube.com/watch?v=gvbv9ImM3B4), Kubernetes and Windows using OVS & OVN on AWS (https://www.youtube.com/watch?v=lc6uu-mvs1w), and more! I plan to discuss how OVS on Windows is similar to and different from the Linux port, and the design and architecture of the OVS datapath on Windows.
Speakers: Kuralamudhan Ramakrishnan (Intel Corporation), Todd Malsbary (Intel Corporation)
Everyone is talking about Kubernetes networking and how to add new features in it to solve the network issues in the cloud native space. You probably have heard about CNI, Network Service Mesh, and the Kubernetes Network Plumbing WG. Solving networking issues and introducing new features in Kubernetes networking is performed by CNI, a simple binary that is only invoked during pod creation and deletion. Even those already using these network components still must maintain the network as a separate entity. How can we manage the network needs in edge locations? How can we enable network features such as Service Function Chaining for 5G workloads? What if we bring OVN networking as an entity in Kubernetes using custom resource definitions and Software-Defined Networking features and concepts, without meddling much with CNI or pod annotations? This talk introduces the Nodus (Latin word for "a knot") project that answers these questions. It helps the audience understand the current landscape of OVN and OVS in Kubernetes networking and shows how we can use Kubernetes labels to fulfill the network features in Kubernetes using OVN and SR-IOV VF representors with OVS. It also speaks of the need to create a de facto Kubernetes standard for Software-Defined Networking and Service Function Chaining where the implementation can vary for different projects.
Speaker: Ilya Maximets (Red Hat, Inc.)
The OVS database server is heavily used in OVN deployments to serve the OVN_Northbound and OVN_Southbound databases. In terms of workload, the two databases are very different. One has a small number of clients and a small amount of data, coupled with a relatively high transaction rate. The other has to serve a large number of monitoring clients and handle a significant amount of stored data, while having only occasional transactions to process. This talk is about performance optimizations and new database server features made over the past year that have significantly improved the performance and scalability of OVSDB for both usage patterns in OVN deployments.
Speakers: Frode Nordahl (Canonical Ltd.), Dmitrii Shcherbakov (Canonical Ltd.)
With the advent of NICs connected to multiple distinct CPUs we can have a topology where an instance runs on one host and Open vSwitch and OVN run on a different host, the SmartNIC control plane CPU. There are many applications and use cases for SmartNICs with control plane CPUs. One of them is security, where you treat the host the SmartNIC happens to share a PCI complex with as untrusted, and subsequently do not allow direct communication between the two. What challenges do we meet in such a topology with the above stated requirements, and how do we solve them?
Speakers: Hemal Shah (Broadcom Corporation), Sriharsha Basavapatna (Broadcom Corporation)
Dynamic rebalancing addresses temporary resource constraints due to offload resource capacity. It was introduced in OvS 2.11 to enable efficient utilization of flow offload resources. Today, dynamic rebalancing applies only to OvS-TC flower offloaded flows. In this talk, we will describe a proposal to enable dynamic rebalancing of offloaded flows in the OvS-DPDK environment for the flows offloaded using rte_flow.
Speaker: Mark Michelson (Red Hat, Inc.)
OVN is starting to become the default SDN implementation for a number of large CMSes, such as OpenStack and OpenShift. While previous development efforts towards this goal have included developing new features for OVN, for OVN 21.09, the community effort focused much more on scalability and performance. This talk will focus on what metrics were used for control plane testing, examples of optimizations in OVN, and the overall results of our work. This talk will also hint at future optimizations, some of which are already written, and others which are coming.
Speaker: Frode Nordahl (Canonical Ltd.)
More than a decade has passed since I first heard about the prospect of OpenFlow. At the time, the mere thought of having physical switch hardware forwarding table decisions dependent on an externally located OpenFlow controller was terrifying. Revisiting those thoughts today, this is still the most prevalent problem. Open "white box" switches have come a long way since then, but the focus still appears to be on compatibility with the traditional networking world, and distributed OpenFlow controllers for physical switches are hard to come by. The pair of Open vSwitch and OVN makes a very good distributed OpenFlow controller for virtual networking today. A lot of work is going into making OVS program switch ASICs through various standard interfaces, and both projects are written in highly portable and efficient C code, suitable for running in constrained environments. Is the time nigh for OVS and OVN to start thinking about entering the physical world?
Speakers: Hemal Shah (Broadcom Corporation), Sriharsha Basavapatna (Broadcom Corporation)
OvS-DPDK full vhost offload with SR-IOV can be supported by leveraging VF representor, virtio forwarder, and rte_flow infrastructures. The recent enhancements in virtio forwarder enable simple match processing for forwarding of Ethernet packets. In this talk, we will describe how virtio forwarder enhancements can be leveraged to improve the performance of OvS-DPDK full vhost offload with SR-IOV.
Speaker: Vasu Dasari (Hewlett Packard Enterprise)
Currently Open vSwitch implements various packet encapsulation technologies, such as VXLAN and GRE, for network virtualization. All these technologies transport user packets over tunnels, which are represented as encapsulation interfaces. On egress, they add an encapsulation header on top of the user payload, including the L2 and L3 parameters that identify the path to take on the underlay network, and push the packet out of an underlay interface. Similarly, on ingress they decapsulate tunnel packets and present the user payloads to access ports based on OpenFlow rules. A tunnel interface is created by the user via OVSDB, by specifying all L3 parameters, or via OpenFlow rules. The L2 information needed at encapsulation time, which includes eth-src, eth-dst, and the physical port to use for the outgoing packet, is fetched from the underlying TCP/IP stack. The inability to specify L2 information prevents the SDN controller from setting up the switch remotely in its entirety, because some IP configuration is not done via OpenFlow or OVSDB; an administrator has to perform that configuration on the switch. This talk presents the design and implementation of a way to achieve this in the OVS userspace domain. The proposal is to provide an option to specify L2 information without requiring it. If the user does not specify the L2 information, the switch continues to follow the same logic it does today.
Speaker: Vasu Dasari (Hewlett Packard Enterprise)
As part of OVS feature development, developers generally write tests to verify the feature and, by making them part of the testing framework, to make sure future commits do not break it. When a test fails, the developer relies on logs to see what has gone wrong. The developer might want to recreate the bug by running the test case and leaving the setup in the state in which the bug occurred, so that they can poke at the switch further. To be able to do that, test case execution has to be paused when the problem occurs. A new environment variable, OVS_PAUSE_TEST, is introduced in OVS. When the variable is set, test case execution pauses and leaves the setup in the bad state; when done looking around, the developer can unfreeze the test case so the testing framework can clean up the environment and exit cleanly. A short demo of how this is done is presented.
Speakers: Hemal Shah (Broadcom Corporation), Sriharsha Basavapatna (Broadcom Corporation)
OvS-DPDK offloading of VXLAN encap/decap functionality has finally been accepted and released in OvS 2.16. OvS-DPDK VXLAN encap/decap offload is a fairly complex offload model where the decap offload happens in two stages. In this talk, we will provide an overview of the challenges with multi-stage offloading and how they were met by the OvS-DPDK full VXLAN offload design. We will also provide an implementer’s view of all possible offload sequences that a given implementation has to consider.
Speakers: Sergey Madaminov (Stony Brook University), William Tu (VMware, Inc.)
As of now, OVS uses the autotools suite as its build system. Even though it is a powerful set of tools, it has a few disadvantages. Autotools have a relatively steep learning curve, which can impede contributions. Furthermore, the generated Makefile is enormously large (over 7000 lines), which hampers the ability to debug the build process whenever something goes wrong. Although it works well on Linux, the autotools suite has no native support for Windows and requires running in an emulated shell such as MSYS2. Doing so introduces one more source of issues and slows down the build process. Last year, DPDK switched to the meson build system, replacing the previously used make. Following in its steps, this year I started porting OVS to meson and submitted an RFC for this work. Currently, OVS can be built on both Linux and Windows using meson and the Clang compiler. In this talk, I'm going to discuss the advantages of using this newer build system and present the current progress of the work, as well as touch on the next steps needed to fully replace the OVS build system.
High-performance packet parsing in OVS: Using AVX512 for Miniflow Extract and Protocol Specific Hashing
Speakers: Harry Van Haaren (Intel Corporation), Kumar Amber (Intel Corporation)
This talk details how the packet parsing in OVS 2.16 has been optimized with AVX512. The function known as "miniflow extract" extracts metadata from each packet into a "miniflow" datastructure. This talk explains how the AVX512 SIMD implementation improves performance significantly, while being easy to test, and scales when adding new protocols. Further optimizations around protocol specific hashing are presented too, for consideration into the OVS 2.17 release.
Speakers: Harry Van Haaren (Intel Corporation), Emma Finn (Intel Corporation)
OVS is a flexible vswitch, allowing high flexibility in modification of packets as they pass through the switch. This flexibility allows OVS to be deployed in many different scenarios and use cases, but what is the performance hit of having this flexibility? Can we improve the performance of "actions" in OVS by leveraging the latest software optimization techniques? This talk shows how OVS Actions can be improved to reduce overhead, allow easy testing, and make use of CPU features such as AVX512 to gain higher performance.
Speakers: Harry Van Haaren (Intel Corporation), Kumar Amber (Intel Corporation), Cian Ferriter (Intel Corporation)
This talk informs the audience of next steps in improving the OVS datapath with software optimizations. Last year the DPIF component was optimized for "outer" packets only. This talk continues the theme, informing on the changes and benefits of optimizing both DPIF and Miniflow Extract (MFEX) for handling "inner" (or "recirculated") packets. An AVX512 optimized implementation of DPIF and MFEX for inner packet handling will be used for showing performance benefits.
Speaker: Rony Efraim (NVIDIA Corporation)
Mellanox started OVS HW offload a long time ago; we first showed it at the Open vSwitch conference in 2016. Since then we have been improving HW capabilities and functionality, such as GENEVE, CT, NAT, meters, and DPDK support. Today we offload the entire OVN to the DPU, reaching 100 Gb/s for the full OVN-K8s pipeline with HW acceleration. The OVN offload is used for virtualization and for bare-metal infrastructure, to enhance security and achieve zero trust of the host.