Why does ROS2 use DDS as a communication middleware?

Hello everyone, I am Xiaoyu. Today I am sharing a translation of the article "ROS on DDS" to show you why ROS 2 chose DDS as its middleware.

Overview

This article presents the case for using DDS as the middleware for ROS, outlines the advantages and disadvantages of this approach, and considers the impact of DDS on the user experience and code API. It also summarizes the results of the "ros_dds" prototype, which was built to explore related issues.

related terms

Why Consider DDS

When exploring options for the next-generation communication system of ROS, the initial options were to either improve the ROS 1 transport or build a new middleware from component libraries such as ZeroMQ, Protocol Buffers, and zeroconf (Bonjour/Avahi).

However, in addition to these options (both of which involve building a middleware from parts or from scratch), other end-to-end middlewares were also considered.

During our research, one middleware that stood out was DDS.

End-to-end middleware

The benefit of using an end-to-end middleware such as DDS is that there is much less code to maintain, and the behavior and exact specification of the middleware has been distilled into the documentation.

In addition to system-level documentation, DDS provides recommended use cases and a software API. With this concrete specification, third parties can review, audit, and implement the middleware with varying degrees of interoperability. This is something ROS has never had, apart from a few basic descriptions in a wiki and a reference implementation. Also, if you build a new middleware from existing libraries, you need to create this type of specification anyway.

The disadvantage of using end-to-end middleware is that ROS must work within the existing design. If the design is not tailored to the relevant use case or is inflexible, you may need to work around the design. In a way, adopting end-to-end middleware includes adopting the philosophy and culture of that middleware, and it's not that simple.

What is DDS

DDS provides a publish-subscribe transport that is very similar to ROS's publish-subscribe transport. DDS uses the "Interface Description Language (IDL)" defined by the Object Management Group (OMG) for message definition and serialization. DDS also has a request-response style transport, comparable to ROS's service system, which was in beta 2 as of June 2016 (called DDS-RPC).

The default discovery system provided by DDS, which is needed in order to use DDS's publish-subscribe transport, is a distributed discovery system. This allows any two DDS programs to communicate without a tool like the ROS Master, which makes the system more fault tolerant and flexible.
However, using a dynamic discovery mechanism is not mandatory, as several DDS vendors offer static discovery options.

Where does DDS come from

DDS got its start as a group of companies with similar middleware frameworks and became a standard when their common customers wanted better interoperability between vendors. The DDS standard was created by the Object Management Group, the same people who brought us UML, CORBA, SysML, and other common software-related standards.

Depending on your perspective, this may be a positive or a negative endorsement. On the one hand, you have a long-standing standards committee that has obviously had a huge impact on the software engineering community. On the other hand, you have a slow-moving body that is slow to adapt to change and therefore arguably does not always keep up with the latest trends in software engineering.

DDS started out as several similar middlewares, eventually they became so close that it made sense to write a standard to unify them. So in this way, even though the DDS specification was written by a committee, it has evolved into its current form in response to user needs. This organic evolution of the specification before it is approved helps alleviate concerns that systems are designed in a vacuum and perform poorly in real-world environments.

There are committees that come up with well-intentioned and well-described specifications that nobody wants to use or that do not meet the needs of the communities they serve, but that does not seem to be the case with DDS.

There is also a concern that DDS is a static specification that was defined and is used in "legacy" systems but has not been kept up to date.
This stereotype comes from horror stories about things like UML and CORBA, which are also products of the OMG.
On the contrary, DDS seems to have an active and organic specification, which in the recent past has added, or is in the process of adding, specifications for things such as websockets, security over SSL, extensible types, request and response transport, and a new, more modern C++11-style API for the core API to replace the existing C++ interface.

This kind of evolution in the DDS standards body is encouraging. Even though the body is relatively slow compared to technology trends in software engineering, it is evolving to meet the needs of its users.

Technical credibility

DDS has an extensive list of varied installations, many of which are mission critical.
DDS has been used in:

  • Battleships
  • Large utility installations such as dams
  • Financial systems
  • Space systems
  • Flight systems
  • Train switchboard systems

and many other equally important and varied scenarios. These successful use cases lend credibility to the claim that the design of DDS is both reliable and flexible.

Not only does DDS meet the needs of these use cases, but the DDS users we talked to (in this case government and NASA employees who are also ROS users) all praised its reliability and flexibility. Those same users noted that the flexibility of DDS comes at the cost of complexity. The complexity of the API and of configuring DDS is something that ROS would need to address.

The DDS wire specification (DDSI-RTPS) is very flexible and can be used for reliable, high-level system integration as well as for real-time applications on embedded devices. Some DDS vendors provide special DDS implementations for embedded systems whose library size and memory footprint are on the order of tens or hundreds of kilobytes.

Since DDS is implemented over UDP by default, it does not rely on a reliable transport or hardware for communication. This meant that DDS had to reinvent the reliability wheel (basically TCP plus or minus some features), but in exchange DDS gained portability and control over behavior.

The parameters that control reliability, which DDS calls Quality of Service (QoS), give maximum flexibility in controlling communication behavior.
For example, if you are concerned about latency, as in soft real-time systems, you can basically tune DDS to behave like plain UDP.
In another scenario, you might need something that behaves like TCP but is more tolerant of long dropouts. With DDS, all of this can be controlled by changing the QoS parameters.
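As a rough illustration of how QoS settings move behavior between "plain UDP" and "TCP-like", the following Python sketch simulates a lossy link. The names `QoSProfile`, `reliable`, and `max_retries` are hypothetical and do not correspond to any DDS vendor API; real DDS QoS policies are considerably richer.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class QoSProfile:
    """Hypothetical QoS profile for this sketch; not a real DDS API."""
    reliable: bool = False   # False: send once and forget, like plain UDP
    max_retries: int = 0     # resend budget used only when reliable is True


def send_all(samples, qos, loss_rate, rng):
    """Deliver samples over a simulated lossy link and return what arrived."""
    delivered = []
    for sample in samples:
        attempts = 1 + (qos.max_retries if qos.reliable else 0)
        for _ in range(attempts):
            if rng.random() >= loss_rate:   # this attempt got through
                delivered.append(sample)
                break
    return delivered


BEST_EFFORT = QoSProfile(reliable=False)
RELIABLE = QoSProfile(reliable=True, max_retries=50)

samples = list(range(100))
low_latency = send_all(samples, BEST_EFFORT, loss_rate=0.3, rng=random.Random(0))
tcp_like = send_all(samples, RELIABLE, loss_rate=0.3, rng=random.Random(0))
```

The application code is identical in both cases; only the profile changes. The best-effort run drops samples but never waits, while the reliable run trades extra sends for complete delivery.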

While the default implementation of DDS is over UDP and requires only that level of functionality for transport, the OMG also added support for DDS over TCP in version 1.2 of its specification.
At a quick glance, two of the vendors (RTI and ADLINK Technologies) both support DDS over TCP.

From RTI's website (http://community.rti.com/kb/xml-qos-example-using-rti-connext-dds-tcp-transport):

By default, RTI Connext DDS uses UDPv4 and shared memory transport to communicate with other DDS applications.
In some cases, discovery and data exchange may require the TCP protocol.
For more information on the RTI TCP transport, see the section titled "RTI TCP Transport" in the RTI Core Libraries and Utilities User Manual.

From ADLINK Tech's website, they support TCP since OpenSplice v6.4:
https://www.adlinktech.com/en/data-distribution-service.aspx

Suppliers and Licensing

The OMG defined the DDS specification together with several companies that are now the main DDS vendors.
Popular DDS vendors include:

  • RTI
  • ADLINK Technology
  • TwinOaks Software

Translator's note: as of this translation (February 2022), more DDS vendors have appeared.

Among these vendors, there is a range of reference implementations with different policies and licenses.

The OMG maintains an active list of DDS vendors.

In addition to vendors that provide implementations of the DDS specification API, there are software vendors that provide more direct access to implementations of the DDS wire protocol, RTPS.
For example:

  • eProsima

These RTPS-centric implementations are also interesting because they can be smaller in scope and still provide the functionality needed to implement the necessary ROS functionality on top.

RTI's Connext DDS is available under a custom "community infrastructure" license that is compatible with the needs of the ROS community, but requires further discussions with the community to determine its viability as a default DDS provider for ROS.

By "compatible with the needs of the ROS community" we mean that, although it is not an OSI-approved license, research has shown that it is sufficient to allow ROS to keep a BSD-style license and to let anyone in the ROS community redistribute it in source or binary form.

RTI also seems willing to negotiate licenses to meet the needs of the ROS community, but it will take some iteration between the ROS community and RTI to ensure this works.
Like other vendors, the license is available for the core feature set, basically the basic DDS API, while other parts of their product (like development and introspection tools) are proprietary. RTI appears to have the largest online presence and installed base.

ADLINK Technologies' DDS implementation, OpenSplice, is licensed under the LGPL, the same license used by many popular open source libraries such as glibc, ZeroMQ, and Qt. It is available on GitHub:

https://github.com/ADLINK-IST/opensplice

ADLINK Technology's implementation comes with a basic, functional build system and is easy to package. OpenSplice appears to be the second most widely used DDS implementation, but that is hard to pin down. TwinOaks' CoreDX DDS implementation is proprietary only, but the company apparently focuses on minimal implementations that can run on embedded devices or even bare metal.

The FastRTPS implementation of eProsima is available on GitHub under the LGPL license:
https://github.com/eProsima/Fast-RTPS

eProsima Fast RTPS is a relatively new, lightweight, open source RTPS implementation. It allows direct access to RTPS protocol settings and functions, which is not always possible in other DDS implementations. eProsima's implementation also includes a minimal DDS API, IDL support, and automatic code generation, and they are willing to work with the ROS community to meet their needs.

Given the relatively strong LGPL options and the encouraging but bespoke license from RTI, it seems that relying on or even distributing DDS as a dependency should be straightforward. One of the goals of this proposal is to make ROS 2 DDS vendor-agnostic.
So, for example, if the default implementation is Connext, but someone wants to use one of the LGPL options like OpenSplice or FastRTPS, they can use their implementation choice by simply recompiling the ROS source and changing some parameters.

This is possible because DDS defines an API in its specification. Research has shown that writing vendor-neutral code is possible, if a bit painful: the APIs are pretty much the same across vendors, but there are subtle differences in things like return types (raw pointers versus shared_ptr-like objects) and header file organization.
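One way to picture vendor-neutral code is a thin interface with one adapter per vendor, selected by a single configuration value. The sketch below is only an illustration in Python: `Middleware`, `VendorA`, `VendorB`, and `create_middleware` are hypothetical names, and the adapters are stand-ins rather than bindings to any real DDS product.

```python
from abc import ABC, abstractmethod


class Middleware(ABC):
    """Vendor-neutral interface that application code is written against."""

    @abstractmethod
    def publish(self, topic: str, payload: bytes) -> str:
        ...


class VendorA(Middleware):
    # Stand-in adapter; a real one would absorb this vendor's API quirks.
    def publish(self, topic, payload):
        return f"vendor_a:{topic}:{len(payload)}"


class VendorB(Middleware):
    # Stand-in adapter for a second vendor with slightly different internals.
    def publish(self, topic, payload):
        return f"vendor_b:{topic}:{len(payload)}"


_VENDORS = {"vendor_a": VendorA, "vendor_b": VendorB}


def create_middleware(name: str = "vendor_a") -> Middleware:
    """Select the implementation from a single configuration value."""
    try:
        return _VENDORS[name]()
    except KeyError:
        raise ValueError(f"unknown middleware vendor: {name!r}") from None


def talker(mw: Middleware) -> str:
    # Application code never names a vendor.
    return mw.publish("chatter", b"hello")
```

Switching vendors then means changing the argument passed to `create_middleware`; `talker` stays untouched, which is the property the proposal is after.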

Spirit and Community

DDS comes from a group of decades-old companies, was laid out by the OMG, which is an old-school software engineering organization, and is used mostly by government and military users. So it is no surprise that the DDS community looks very different from the ROS community and from similar modern software projects like ZeroMQ.

Although RTI has a respectable online presence, questions from community members are almost always answered by RTI employees. And despite being technically open source, neither RTI nor OpenSplice has taken the time to provide packages for Ubuntu, Homebrew, or any other modern package manager. They do not have extensive user-contributed wikis or an active GitHub repository.

This apparent difference in spirit between the communities is one of the most worrying issues when relying on DDS. Unlike options such as keeping TCPROS or using ZeroMQ, with DDS there does not seem to be a large community to lean on. That said, the DDS vendors were very responsive to our inquiries during our research, although it is hard to say whether this will continue when it is the ROS community asking the questions.

While this should be taken into account when deciding to use DDS, it should not disproportionately outweigh the technical pros and cons of the DDS proposal.

ROS is built on top of DDS

The goal is to make DDS an implementation detail of ROS 2. This means that all DDS specific API and message definitions need to be hidden.
DDS provides discovery, message definition, message serialization, and publish-subscribe transport. So DDS will provide ROS with discovery, publish-subscribe transport, and at least low-level message serialization.

ROS 2 will provide a ROS 1-like interface on top of DDS, which hides most of the complexity of DDS from most ROS users, while providing separate access to the underlying DDS implementation for users with extreme use cases or who need to integrate with other existing DDS systems.

DDS and ROS API layout

Accessing the DDS implementation will require depending on an additional package that is not normally used. This way, you can tell whether a package is tied to a specific DDS vendor by looking at its dependencies. The goal of the ROS API on top of DDS should be to meet all common needs of the ROS community, because once users tap into the underlying DDS system, they lose portability between DDS vendors.

Portability between DDS vendors is not intended to encourage people to frequently choose different vendors, but to enable advanced users to choose a DDS implementation that meets their specific requirements, as well as future-proof ROS for changes in DDS vendor options.

There will be a recommended and most supported default DDS implementation for ROS.

Discovery

DDS will completely replace the ROS Master based discovery system. ROS needs to utilize the DDS API to get a list of all nodes, a list of all topics and how they are connected. Access to this information will be hidden behind ROS-defined APIs, preventing users from having to call DDS directly.

The advantage of the DDS discovery system is that, by default, it is fully distributed, so no central point of failure is required for parts of the system to communicate with each other. DDS also allows user-defined metadata in its discovery system, which will enable ROS to piggyback higher-level concepts onto publish-subscribe.
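The idea of masterless discovery can be sketched in a few lines of Python. This is a toy model, not the DDS discovery protocol: the `DiscoveryBus` object merely stands in for the network (for example multicast), and all class and method names are assumptions made for illustration. The point is that every participant keeps its own table and no participant plays the role of a master.

```python
class DiscoveryBus:
    """Stand-in for the network: each announcement reaches every participant."""

    def __init__(self):
        self.participants = []

    def join(self, participant):
        self.participants.append(participant)

    def broadcast(self, announcement):
        for participant in self.participants:
            participant.on_announcement(announcement)


class Participant:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.known = {}        # local view: topic -> set of publisher names
        self.metadata = {}     # local view: (publisher, topic) -> user metadata
        self.advertised = []   # (topic, metadata) pairs this participant offers
        bus.join(self)

    def advertise(self, topic, metadata=None):
        self.advertised.append((topic, metadata or {}))
        self.bus.broadcast((self.name, topic, metadata or {}))

    def announce_all(self):
        # Periodic re-announcement lets late joiners catch up.
        for topic, metadata in self.advertised:
            self.bus.broadcast((self.name, topic, metadata))

    def on_announcement(self, announcement):
        name, topic, metadata = announcement
        self.known.setdefault(topic, set()).add(name)
        self.metadata[(name, topic)] = metadata


bus = DiscoveryBus()
camera = Participant("camera", bus)
camera.advertise("image", {"ros_type": "sensor_msgs/Image"})
viewer = Participant("viewer", bus)   # joins after the first announcement
camera.announce_all()
```

After the re-announcement, `viewer.known` lists `camera` as a publisher of `image`, and the attached metadata shows how a ROS-level concept such as the message type name could ride along with discovery data.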

Publish-subscribe transport

The DDSI-RTPS (DDS-Interoperable Real-Time Publish-Subscribe) protocol will replace ROS's TCPROS and UDPROS wire protocols for publish/subscribe.

The DDS API adds a few more actors to the typical publish-subscribe pattern of ROS 1. The ROS concept of a node most closely parallels a graph participant in DDS.

A graph participant can have zero or more topics, which are very similar to the concept of topics in ROS but are represented in DDS as separate code objects that are neither subscribers nor publishers. From a DDS topic, DDS subscribers and publishers can then be created, but these again only represent the subscriber and publisher concepts in DDS and are not used to read data from or write data to the topic directly.

In addition to topics, subscribers, and publishers, DDS also has the concept of DataReaders and DataWriters, which are created from a subscriber or publisher, specialized for a specific message type, and then used to read and write data for a topic. These additional layers of abstraction allow advanced configuration of DDS, because QoS settings can be applied at each level of the publish-subscribe stack, providing the highest possible configuration granularity.

Most of these abstraction levels are not necessary to meet the current requirements of ROS. Therefore, wrapping the common workflows in a simpler ROS-like interface (nodes, publishers, and subscribers) would be one way for ROS 2 to hide the complexity of DDS while exposing some of its functionality.
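The layering described above can be made concrete with a small in-memory model. The class names below mirror the DDS concepts (participant, topic, publisher, subscriber, DataWriter, DataReader), but the code is a simplified sketch and not the DDS API; the `Domain` object standing in for the network and the `Node` facade are assumptions introduced for illustration.

```python
from collections import defaultdict


class Domain:
    """Stand-in for the network: maps topic names to attached DataReaders."""
    def __init__(self):
        self.readers = defaultdict(list)


class Topic:
    """A topic is its own object, neither a publisher nor a subscriber."""
    def __init__(self, name, msg_type):
        self.name, self.msg_type = name, msg_type


class DataWriter:
    def __init__(self, domain, topic):
        self._domain, self._topic = domain, topic

    def write(self, msg):
        if not isinstance(msg, self._topic.msg_type):
            raise TypeError("message does not match the topic type")
        for reader in self._domain.readers[self._topic.name]:
            reader.queue.append(msg)


class DataReader:
    def __init__(self, domain, topic):
        self.queue = []
        domain.readers[topic.name].append(self)

    def take(self):
        taken, self.queue = self.queue, []
        return taken


class Publisher:
    def __init__(self, domain):
        self._domain = domain

    def create_datawriter(self, topic):
        return DataWriter(self._domain, topic)


class Subscriber:
    def __init__(self, domain):
        self._domain = domain

    def create_datareader(self, topic):
        return DataReader(self._domain, topic)


class DomainParticipant:
    def __init__(self, domain):
        self._domain = domain

    def create_topic(self, name, msg_type):
        return Topic(name, msg_type)

    def create_publisher(self):
        return Publisher(self._domain)

    def create_subscriber(self):
        return Subscriber(self._domain)


class Node:
    """ROS-like facade: one call hides topic, publisher or subscriber, and writer or reader."""
    def __init__(self, domain):
        self._participant = DomainParticipant(domain)

    def create_publisher(self, topic_name, msg_type):
        topic = self._participant.create_topic(topic_name, msg_type)
        return self._participant.create_publisher().create_datawriter(topic)

    def create_subscription(self, topic_name, msg_type):
        topic = self._participant.create_topic(topic_name, msg_type)
        return self._participant.create_subscriber().create_datareader(topic)


domain = Domain()
talker, listener = Node(domain), Node(domain)
reader = listener.create_subscription("chatter", str)
writer = talker.create_publisher("chatter", str)
writer.write("hello")
```

A user of `Node` makes one call per endpoint, while each hidden layer remains a place where a QoS setting could be attached if advanced configuration were exposed later.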

Efficient transmission method

In ROS 1, there was never a standard shared memory transport because it was negligibly faster than a localhost TCP loopback connection.

Non-trivial performance improvements can be obtained by carefully doing zero-copy style shared memory between processes, but whenever a task needed to be faster than localhost TCP in ROS 1, nodelets were used.

Nodelets allow publishers and subscribers to share data by passing a boost::shared_ptr to the message.
This in-process communication is almost certainly faster than any inter-process communication option, and is orthogonal to the discussion of network publish-subscribe implementations.

In the context of DDS, most vendors will use shared memory to transparently optimize message traffic (even between processes), using the wire protocol and UDP sockets only when leaving localhost. This provides a considerable performance boost for DDS, whereas it did not for ROS 1, because the localhost networking optimization happens at the call to send.

For ROS 1, the process is: serialize the message into one large buffer, then call TCP's send once on that buffer. For DDS, the process is more like: serialize the message, break it into potentially many UDP packets, then call UDP's send many times. Sending many UDP datagrams this way does not benefit from the same speed-up as one large TCP send.
Therefore, many DDS vendors short-circuit this process for localhost messages and use a blackboard-style shared memory mechanism for efficient communication between processes.
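To see why the UDP path involves many send calls, here is a small Python sketch of splitting one serialized message into datagram-sized fragments and putting it back together. The fragment layout `(index, total, chunk)` and the 1024-byte payload size are assumptions chosen for illustration; they are not the RTPS fragmentation format.

```python
MAX_DATAGRAM_PAYLOAD = 1024  # assumed payload size, for illustration only


def fragment(payload: bytes, size: int = MAX_DATAGRAM_PAYLOAD):
    """Split one serialized message into (index, total, chunk) fragments."""
    total = max(1, -(-len(payload) // size))   # ceiling division, at least 1
    return [(i, total, payload[i * size:(i + 1) * size]) for i in range(total)]


def reassemble(fragments):
    """Rebuild the message; fragments may arrive in any order."""
    ordered = sorted(fragments, key=lambda frag: frag[0])
    total = ordered[0][1]
    if [frag[0] for frag in ordered] != list(range(total)):
        raise ValueError("missing or duplicate fragment")
    return b"".join(chunk for _, _, chunk in ordered)


message = bytes(range(256)) * 10          # a 2560-byte serialized message
datagrams = fragment(message)             # one send call per entry
restored = reassemble(reversed(datagrams))
```

Each entry in `datagrams` corresponds to a separate send call, whereas the TCP path hands the whole buffer over in one call. A shared memory short-circuit on localhost avoids this splitting altogether.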

However, not all DDS vendors are the same in this regard, so ROS would not rely on this "smart" behavior for efficient in-process communication.

Also, if the ROS message format (discussed in the next section) is preserved, it is not possible to avoid a conversion to the DDS message type for in-process topics. Therefore, a custom in-process communication system needs to be developed for ROS that never serializes or converts messages, but instead uses DDS topics to pass pointers (to shared in-process memory) between publishers and subscribers.
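The core of such an in-process mechanism is that only a reference to the message changes hands. The Python sketch below shows that idea with a plain dictionary of callbacks; it deliberately leaves out the part where DDS topics carry the pointers, and `IntraProcessBus` and `Image` are hypothetical names.

```python
class IntraProcessBus:
    """Routes message references between publishers and subscribers that
    live in the same process; nothing is serialized or converted."""

    def __init__(self):
        self._callbacks = {}   # topic name -> list of subscriber callbacks

    def subscribe(self, topic, callback):
        self._callbacks.setdefault(topic, []).append(callback)

    def publish(self, topic, msg):
        # Only the reference is handed over; subscribers see the same object.
        for callback in self._callbacks.get(topic, []):
            callback(msg)


class Image:
    def __init__(self, data):
        self.data = data


bus = IntraProcessBus()
received = []
bus.subscribe("camera/image", received.append)

frame = Image(bytearray(1024 * 1024))   # a 1 MiB payload that is never copied
bus.publish("camera/image", frame)
```

The subscriber ends up holding the very object the publisher created, so the cost of delivery does not depend on the size of the message. In C++ the shared reference would typically be a smart pointer, as with nodelets in ROS 1.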

For example, custom middleware built on ZeroMQ will require this same in-process communication mechanism.

One thing to point out here is that efficient in-process communication will be addressed regardless of the network/inter-process implementation of the middleware.

Messages

The current ROS message definitions have a lot of value. The format is simple, and the messages themselves have evolved over years of use by the robotics community. Much of the semantic content of current ROS code is driven by the structure and content of these messages, so preserving the format and in-memory representation of messages is of great value. To achieve this, and to make DDS an implementation detail, ROS 2 should preserve the ROS 1-like message definitions and in-memory representation.

Therefore, ROS 1 .msg files will continue to be used, and the .msg files will be converted to .idl files so that they can be used with the DDS transport. Language-specific files will be generated for both the .msg files and the .idl files, as well as conversion functions for converting between ROS and DDS in-memory instances.

The ROS 2 API will exclusively work with ".msg" style message objects in memory and convert them to ".idl" objects before publishing.
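As a minimal sketch of the .msg to .idl step, the following Python function translates a flat message definition into an IDL struct. The type mapping table and the module/struct layout are assumptions made for illustration and are not taken from the prototype's generator; arrays, constants, and nested message types are not handled.

```python
# Assumed mapping from ROS .msg primitive types to IDL types (illustrative).
MSG_TO_IDL = {
    "bool": "boolean",
    "int32": "long",
    "uint32": "unsigned long",
    "int64": "long long",
    "float32": "float",
    "float64": "double",
    "string": "string",
}


def msg_to_idl(package: str, name: str, msg_text: str) -> str:
    """Translate the body of a simple .msg file into an IDL struct."""
    fields = []
    for raw in msg_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        ros_type, field_name = line.split()
        fields.append(f"    {MSG_TO_IDL[ros_type]} {field_name};")
    body = "\n".join(fields)
    return f"module {package} {{\n  struct {name} {{\n{body}\n  }};\n}};\n"


POINT_MSG = """\
# A point in free space
float64 x
float64 y
float64 z
"""

point_idl = msg_to_idl("geometry_msgs", "Point", POINT_MSG)
```

For the three-field point above, `point_idl` contains a `struct Point` with three `double` members wrapped in a `geometry_msgs` module, which a vendor's IDL compiler could then turn into language-specific types.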

message capture

At first, the idea of converting a message field by field into another object type on every call to publish seemed like a huge performance problem, but experiments have shown that the cost of this copy is insignificant compared to the cost of serialization.

Serialization cost at least an order of magnitude more than the type conversion, and this held for every serialization library we tried except [Cap'n Proto](http://kentonv.github.io/capnproto/), which has no serialization step. So if a field-by-field copy does not work for your use case, neither will serializing and transporting over the network, at which point you will have to use in-process or zero-copy inter-process communication.

In-process communication in ROS does not use the DDS in-memory representation, so this field-by-field copy is not performed unless the data is going to the network. Since the conversion is only invoked together with the more expensive serialization step, the field-by-field copy seems to be a reasonable trade-off for the portability and abstraction provided by preserving the ROS .msg files and in-memory representation.
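A field-by-field conversion of the kind discussed here might look like the following Python sketch. `RosPoint` and `DdsPoint` are hypothetical stand-ins for the types generated from a .msg file and from its .idl counterpart; real generated code would be emitted per message type rather than written generically.

```python
from dataclasses import dataclass, fields


@dataclass
class RosPoint:          # stands in for the ".msg" style in-memory type
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


@dataclass
class DdsPoint:          # stands in for the type generated from the ".idl"
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0


def ros_to_dds(msg: RosPoint) -> DdsPoint:
    """Copy every field into a new DDS-side object before publishing."""
    return DdsPoint(**{f.name: getattr(msg, f.name) for f in fields(msg)})


def dds_to_ros(sample: DdsPoint) -> RosPoint:
    """Inverse copy, used on the subscribing side after deserialization."""
    return RosPoint(**{f.name: getattr(sample, f.name) for f in fields(sample)})


outgoing = ros_to_dds(RosPoint(1.0, 2.0, 3.0))
incoming = dds_to_ros(outgoing)
```

The copy touches each field once and allocates one object, which is why it tends to be small next to a serialization pass that also has to encode every field into a byte buffer.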

This does not preclude options to improve the ".msg" file format with default values, optional fields, and so on. But that is a different trade-off that can be decided later.

Services and Actions

DDS currently has no approved or implemented request-response RPC standard that can be used to implement the service concept in ROS. Currently the OMG DDS working group is considering approval of an RPC specification, and some DDS vendors have draft implementations of the RPC API.

However, it is unclear whether this standard will work for actions, but it could at least support a non-preemptible version of ROS services. ROS 2 could either implement services and actions on top of publish-subscribe (which is more feasible in DDS because of its reliable publish-subscribe QoS settings), or it could use the DDS RPC specification for services once it is finished and then build actions on top of that, as in ROS 1.
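The first option, services layered on publish-subscribe, usually comes down to a request topic, a reply topic, and an identifier that ties a reply to its request. The Python sketch below illustrates this with a synchronous in-memory bus; `Bus`, `Service`, `Client`, and the `<name>/request` and `<name>/reply` topic naming are all assumptions for illustration, not the ROS 2 or DDS-RPC design.

```python
import itertools


class Bus:
    """Tiny synchronous publish-subscribe bus used only for this sketch."""

    def __init__(self):
        self._subs = {}

    def subscribe(self, topic, callback):
        self._subs.setdefault(topic, []).append(callback)

    def publish(self, topic, msg):
        for callback in list(self._subs.get(topic, [])):
            callback(msg)


class Service:
    def __init__(self, bus, name, handler):
        self._bus, self._name, self._handler = bus, name, handler
        bus.subscribe(f"{name}/request", self._on_request)

    def _on_request(self, msg):
        request_id, payload = msg
        # Echo the id so the client can match the reply to its request.
        self._bus.publish(f"{self._name}/reply",
                          (request_id, self._handler(payload)))


class Client:
    _ids = itertools.count(1)   # shared counter keeps ids unique across clients

    def __init__(self, bus, name):
        self._bus, self._name = bus, name
        self._pending = set()
        self._replies = {}
        bus.subscribe(f"{name}/reply", self._on_reply)

    def _on_reply(self, msg):
        request_id, result = msg
        if request_id in self._pending:   # ignore replies meant for other clients
            self._replies[request_id] = result

    def call(self, payload):
        request_id = next(self._ids)
        self._pending.add(request_id)
        # The bus is synchronous, so the reply has arrived when publish returns.
        self._bus.publish(f"{self._name}/request", (request_id, payload))
        self._pending.discard(request_id)
        return self._replies.pop(request_id)


bus = Bus()
server = Service(bus, "add_two_ints", lambda pair: pair[0] + pair[1])
first, second = Client(bus, "add_two_ints"), Client(bus, "add_two_ints")
total = first.call((1, 2))
```

On a real network the reply would arrive asynchronously and delivery would depend on reliable QoS, so `call` would have to wait on a future or a timeout instead of returning immediately.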

Either way, Actions will be first-class citizens in the ROS 2 API, and services may just be degenerate cases of actions.

Language support

DDS vendors typically provide at least C, C++, and Java implementations, since the APIs for these languages are explicitly defined by the DDS specification.
Research has not found any complete version of DDS for Python. Therefore, one of the goals of the ROS 2 system is to provide a first-class, fully functional C API.

This will make bindings for other languages easier and make behavior more consistent between client libraries, since they will share the same implementation. Languages such as Python, Ruby, and Lisp can wrap the C API in a thin, language-idiomatic implementation.

The actual implementation of ROS can either be written in C, using the C DDS API, or in C++, using the DDS C++ API and then wrapping the C++ implementation in a C API for other languages.

Implementing in C++ and wrapping in C is a common pattern, as ZeroMQ does for example.
However, the author of ZeroMQ did not do this in his new library nanomsg , citing the added complexity and the C++ stdlib as a dependency.

Since the C implementation of DDS is usually pure C, it would be possible to have a pure C implementation of the ROS C API all the way down through the DDS implementation.
However, writing the entire system in C may not be the primary goal, and in order for a minimum viable product to work, the implementation may be started in C++ and wrapped in C, which can later be replaced with C if it seems necessary.

DDS as a dependency

One of the goals of ROS 2 is to reuse as much code as possible ("don't reinvent the wheel") while minimizing the number of dependencies to improve portability and keep the build dependency list lean.

These two goals are sometimes at odds, as it is often a choice between implementing something internally or relying on an external source (dependency) to achieve it.

This is where the DDS implementation shines, as two of the three DDS vendors being evaluated build on Linux, OS X, Windows, and other more exotic systems, with no external dependencies.

The C implementation only depends on the system library, the C++ implementation only depends on the C++03 compiler, and the Java implementation only needs the JVM and the Java standard library.
Bundled as a binary (during prototyping) on Ubuntu and OS X, the C, C++, Java, and C# implementations of OpenSplice (LGPL) are less than 3 MB in size and have no other dependencies.

In terms of dependencies, this makes DDS very attractive because it significantly simplifies the build and run dependencies of ROS.
Also, since the goal is to make DDS an implementation detail, it may be removed as a transitive runtime dependency, meaning it doesn't even need to be installed on the deployed system.

ROS on DDS Prototype

After investigating the feasibility of ROS on DDS, several questions remain, including but not limited to:

  • Can ROS 1 APIs and behaviors be implemented on top of DDS?
  • Is it possible to generate IDL messages from ROS MSG messages and use them with DDS?
  • How hard is it to package (as a dependency) a DDS implementation?
  • Does the DDS API specification really make DDS vendor portability a reality?
  • How hard is it to configure DDS?

To answer some of these questions, a prototype and several experiments were created in this repository:

https://github.com/osrf/ros_dds

More questions and some of the results were captured as issues:

https://github.com/osrf/ros_dds/issues?labels=task&page=1&state=closed

The main work in this repository is in the prototype folder and is a ROS 1-like implementation of the node, publisher, and subscriber APIs using DDS:

https://github.com/osrf/ros_dds/tree/master/prototype

Specifically, this prototype includes these packages:

This is a quick prototype built to answer questions, so it does not represent a final product or release in any way. Work on some features stopped once the key questions had been answered. The examples in the rclcpp_example package show that it is possible to implement the basic ROS-like API on top of DDS and get familiar behavior.
This is by no means a complete implementation, nor does it cover all features, but is for educational purposes and addresses most of the doubts about using DDS.

The generation of the IDL file proved to have some sticking points, but it was solved in the end, and implementing something as basic as a service proved to be a tractable problem.
In addition to the basic parts above, a pull request was drafted that successfully hides the DDS symbols completely from all publicly installed headers of rclcpp and std_msgs:
https://github.com/osrf/ros_dds/pull/17

This pull request was ultimately not merged because it was a major refactoring of the code structure and other progress had been made in the meantime.
However, it served its purpose by showing that the DDS implementation can be hidden, although there is room for discussion as to how this should actually be achieved.

Conclusion

After working with DDS, and with a healthy amount of skepticism about its spirit, community, and licensing, it is hard to come up with any real technical criticisms. While it is true that the community around DDS is very different from the ROS community or the ZeroMQ community, DDS appears to be simply a solid technology that ROS can safely rely on.

There are still many questions about how exactly ROS leverages DDS, but at this point, they all seem to be engineering exercises rather than potential deal breakers for ROS.

Article source: William Woodall Written on 2014-06

Origin blog.csdn.net/qq_27865227/article/details/123178792