Alternative transport protocols in UAVCAN

Hey there… just wanted to give you a quick update since I’ve been quiet for a bit.

Basically, I’ve written about 3500 lines of code and have a very pre-alpha library which implements UAVCAN/Serial over USB and WiFi TCP/IP for the ESP chips using the Arduino SDK. It handles multiple concurrent connections, and implements most of the current alpha spec including the datatype hash ID’s. (which I actually like in their current form, btw.) It cuts corners all over the place, but those corners will get filled in once there’s enough working to act as a proper testing framework. Most of the missing corners are in the transport state machine, eg: I’m de-duplicating frames on a short timer (as we discussed before) but I’m not really enforcing the strict sequential order yet, not sending redundant packets, no multiple frame reassembly, that kind of thing.

The repo is here: https://github.com/JediJeremy/libuavesp

I’ve got it working in “loopback” mode (Heartbeat and NodeInfo) and one of my next tasks is getting pyuavcan to connect to the device over TCP/IP and validate that the two libraries can talk.

Alas I’ve been having some trouble with that (Python’s not my first or favorite language) so I’m wondering if there’s an example somewhere? I’ve tried writing one based on the code snippets in the documentation, but can’t seem to get it to make a connection. (as in, it doesn’t even seem to open a TCP/IP connection to my device)

The other big question I have is if there are any WireShark extensions/extras that make debugging UAVCAN packets easier? I’ve only just installed WireShark so I don’t know much about it, but it seems like something that might have been done?

I’m currently working on the UDP transport but I’m having some issues with that… won’t bore you with the details but let’s just say the ESP’s networking API isn’t that well documented and everything works fine so long as I don’t actually try to send the packet. :unamused: Assembling the packet, opening and closing the port is all fine. sigh Been banging my head on that one for a couple of weeks now, digging ever deeper into the APIs.

What I’m finding is that there is a big overhead in the “simple” APIs which assumes I’m going to create an object (with packet buffers) for each UDP port I intend to send OR receive on. (which would be dozens to hundreds of ports for a complex UAVCAN node) And trying to use the “low level” API to bypass that overhead is causing hard crashes. I’ll get it soon, one way or the other.

The idea of using different UDP ports for each service/subject makes sense if you have lots of memory and hardware-level packet filtering (which the ESP does not) but it also means constructing a parallel set of objects, callbacks, and ports inside the networking API that mirrors what the library keeps for the other transports. I am seriously thinking of putting the ESP into “promiscuous” mode and filtering/decoding each WiFi packet myself, since there may actually be a hard limit on how many UDP ports I’m allowed to have open - possibly only a few dozen. Still working that out.

Once I can make my crash problems go away, I’ll finally have multiple independent devices on the WiFi network exchanging messages without a central point of failure… I mean, central server. :grin: (if you don’t count the WiFi AP)

Once that is functioning, I’ll get the transport state machine fully up to spec, while also looking at creating an entirely new transport using the WiFi P2P/Mesh networking features of the ESP, which means I can even do without the WiFi access point. That’s the Holy Grail for me.

I’m also developing an even deeper hatred of C++ than I already had, but I suspect you know all about that.

You should be aware that recently a design flaw was identified in the UAVCAN/serial packet framing (credits to @VadimZ); see the thread UAVCAN/serial: issues with DMA friendliness and bandwidth overhead. We are going to be replacing the current byte stuffing logic with COBS. That will allow us to gain a near-constant overhead of ~0.4% instead of the variable 0~100%. Vadim doesn’t seem to be available at the moment to replace the framing logic of the UAVCAN/serial implementation in PyUAVCAN, so we could use help here.
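For readers unfamiliar with COBS (Consistent Overhead Byte Stuffing), here is a minimal Python sketch of the basic algorithm. It only illustrates where the near-constant overhead comes from (one extra byte per block of up to 254 non-zero bytes, roughly 1/254 ≈ 0.4%). It is not the framing code that ends up in the specification or in PyUAVCAN, and the function names `cobs_encode` and `cobs_decode` are made up for this example.

```python
def cobs_encode(data: bytes) -> bytes:
    """Encode data so that the result contains no zero bytes (basic COBS)."""
    out = bytearray()
    block = bytearray()  # non-zero bytes collected since the last code byte
    for b in data:
        if b == 0:
            # A code below 0xFF means "len(block) data bytes, then an implicit zero".
            out.append(len(block) + 1)
            out += block
            block.clear()
        else:
            block.append(b)
            if len(block) == 254:
                # A full block: code 0xFF means "254 data bytes, no implicit zero".
                out.append(0xFF)
                out += block
                block.clear()
    # Final (possibly empty) block.
    out.append(len(block) + 1)
    out += block
    return bytes(out)


def cobs_decode(encoded: bytes) -> bytes:
    """Reverse of cobs_encode(); raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        if code == 0:
            raise ValueError("zero byte inside a COBS-encoded block")
        block = encoded[i + 1:i + code]
        if len(block) != code - 1:
            raise ValueError("truncated COBS block")
        out += block
        i += code
        # Every block except a full one (0xFF) and the last one implies a zero.
        if code < 0xFF and i < len(encoded):
            out.append(0)
    return bytes(out)
```

A frame delimiter (a zero byte) would then be placed around the encoded data; that part is omitted here.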

Well, that’s a whole hour of writing incremental encoder/decoder state-machines down the drain. :stuck_out_tongue_winking_eye:

I’m fine with the COBS proposal in general although I’ve got a comment or two that I might append to that discussion, which hopefully doesn’t blow up the consensus. See you over in that thread…

Stabilizing uavcan.node.port.List.0.1, introspection, and switched networks

In the interest of advancing the DS-015 effort, I was working on the set of optional UAVCAN application-level capabilities that are to be made mandatory by DS-015. That forced me to return to the long-postponed issue of stabilizing the port discovery services under uavcan.node.port (which, as of today, bear the version number 0.1), particularly the listing service uavcan.node.port.List. Currently, the service is implemented using the pull model, where the inspected node is to respond with a list of the subject- or service-identifiers when requested by the caller.

My recent work on the Interface Design Guidelines brought it to my attention that the design of this particular service (as in architecture, not UAVCAN-service) is not in perfect alignment with our guidelines. Specifically, the pull model does not provide means of notifying inspectors of a change in the pub/sub configuration on the inspected node, and introduces a certain degree of statefulness into the service that is easily avoidable. Additionally, leaving the decision on when to invoke the service whose handling may be time-consuming to external agents complicates the implementation of robust and predictable scheduling on the local node. These observations prompted me to consider replacing the port list service with a subject (still optional, of course) that is to be published periodically and on-change to announce the current subscription configuration of the local node. The publisher configuration can be subjected to the same treatment but it is not immediately required by the application-level objectives at hand, and as such, this part of the problem can be postponed indefinitely – after all, the publication set is always trivially observable on the network by means of mere packet monitoring, which is not the case for the subscription configuration.

The introspection I am speaking about here is vital for the facilitation of the advanced diagnostic tools such as Yukon with its Canvas (shown below), which would be unable to display any meaningful interconnection information without being able to detect not only outputs but also inputs.

[Image: yukon-canvas-early-demo]

Beyond introspection, this capability is important in switched network transports for automatic configuration of the switching logic. As I briefly touched upon in the original post, specialized AFDX switches implement routing and network policy enforcement that are to be configured statically at the system definition time (practical installations of AFDX may employ rigid time schedules generated with the help of automatic theorem provers); acting as the traffic policy enforcer, an AFDX switch is able to confine the fault propagation should one of its ports be affected by a non-conformant emitter (the so-called babbling idiot failure). The output ports for incoming frames are selected based on the static configuration of virtual links; to avoid delving into the details of that technology, this can be thought of as a static switching table.

In the UAVCAN/UDP broadcast model discussed in the OP post, the output port contention and the resulting latency issues are proposed to be managed by statically configuring L2 filters at the switch such that the data that is not relevant at the given port is to be dropped by the switch, thus reducing the latency bound of the relevant data. The following synthetic example illustrates the point – suppose that the camera node generates significant traffic that is not desired beyond the left switch, and the perception node generates low traffic destined towards the mission computer node where the latency is critical:

If the output ports were to be left unfiltered, the traffic from the camera would have been propagated towards the right-side subnet, increasing the output port contention and the latency envelope throughout.

The static L2-filter configuration is efficient at managing the port contention issues if we consider its technical merits only, but it creates issues for quickly evolving applications with lower DAL/ASIL levels where the need to reconfigure the switch (or the excessive rebroadcasting that would result if no filtering is configured) may create undue adoption obstacles.

Drawing upon the theory explained in Safety and Certification Approaches for Ethernet-Based Aviation Databuses [Yann-Hang Lee et al, 2005], section 5.2 Deterministic message transmission in switched network, it is apparently trivial to define a parallel output-queued (POQ) specialized switch that is able to derive the output port filtering policy automatically by merely observing the subscription information arriving from said port regardless of the number of hops and the branching beyond the port. I will leave this as a note to my future self to expand upon this idea later because these technicalities are not needed in this discussion.

The described ideas rely on a compact representation of the entirety of the subscription state of a given node in one message. Let the type be named uavcan.node.port.Subscription. There are a few obvious approaches here.

  1. Bitmask-based. Given the subject-ID space of 2^{15} elements, the required memory footprint is exactly 4096 bytes.
# Node subscription information.
# This message announces the interest of the publishing node in a particular set of subjects.
# The objective of this message is to facilitate automatic filtering in switched networks and network introspection.
# Nodes should publish this message periodically at the recommended rate of ~1 Hz at the priority level ~SLOW.
# Additionally, nodes are recommended to publish this message whenever the subscription set is modified.

uint8 SUBJECT_ID_BIT_LENGTH = 15

bool[2 ** SUBJECT_ID_BIT_LENGTH] subject_id_mask
# The bit at index X is set if the node is interested in receiving messages of subject-ID X.
# Otherwise, the bit is cleared.
  2. List-based with inversion. The worst-case size is substantially higher – over 32 KiB. The advantage of this approach is that low-complexity nodes will not be burdened unnecessarily with managing large outgoing transfers since their subscription lists are likely to be very short. The disadvantage is the comparatively high network throughput (albeit at a low priority) being utilized for mere ancillary functions.
# Node subscription information.
# This message announces the interest of the publishing node in a particular set of subjects.
# The objective of this message is to facilitate automatic filtering in switched networks and network introspection.
# Nodes should publish this message periodically at the recommended rate of ~1 Hz at the priority level ~SLOW.
# Additionally, nodes are recommended to publish this message whenever the subscription set is modified.

@union

uint16 CAPACITY = 2 ** 15 / 2

uavcan.node.port.SubjectID.1.0[CAPACITY] subscribed_subject_ids
# If this option is chosen, the message contains the actual subject-IDs the node is interested in.

uavcan.node.port.SubjectID.1.0[CAPACITY] not_subscribed_subject_ids
# If this option is chosen, the message contains the inverted list: subject-IDs that the node is NOT interested in.

The service ports can be efficiently represented like #1 in the same message because of their limited ID space, should that be shown to be necessary:

bool[512] server_service_id_mask
bool[512] client_service_id_mask
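To get a feel for the trade-off between the two subject representations above, the payload sizes can be estimated with a few lines of Python. This is only back-of-the-envelope arithmetic: it assumes each listed subject-ID occupies 2 bytes and ignores the union tag and the array length prefix, and the helper names are hypothetical.

```python
SUBJECT_ID_BIT_LENGTH = 15
SUBJECT_ID_BYTES = 2  # Assumption: one subject-ID is serialized as a 16-bit integer.


def bitmask_size_bytes(id_bits: int = SUBJECT_ID_BIT_LENGTH) -> int:
    """Size of the bitmask form: one bit per possible subject-ID."""
    return 2 ** id_bits // 8


def list_size_bytes(num_subscribed: int, id_bits: int = SUBJECT_ID_BIT_LENGTH) -> int:
    """Size of the list-with-inversion form for a given number of subscriptions.

    The node lists whichever set is shorter: the subscribed subject-IDs
    or the subject-IDs it is NOT subscribed to.
    """
    total = 2 ** id_bits
    if not 0 <= num_subscribed <= total:
        raise ValueError("subscription count out of range")
    listed = min(num_subscribed, total - num_subscribed)
    return listed * SUBJECT_ID_BYTES


if __name__ == "__main__":
    print("bitmask:", bitmask_size_bytes(), "bytes")
    for n in (3, 100, 2048, 16384):
        print(f"list with {n} subscriptions:", list_size_bytes(n), "bytes")
```

Under these assumptions the list form is smaller than the bitmask whenever fewer than 2048 subject-IDs have to be listed, and it peaks at 32 KiB when exactly half of the subject-ID space is subscribed.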

Is it practical to consider the further restriction of the subject-ID space, assuming that it is to be done without affecting the compatibility of any existing UAVCAN/CAN v1 systems out there?

As usual, a well researched article here Pavel; thanks.

I don’t think constraining the subject-ID space is wise. I wouldn’t support that.

There are two refinements I can imagine for your improved subscription reporting scheme:

  1. Use compression to allow the transmission of bitfields where small lists only take up a byte or two and the worst case of transmitting a full 4096 bytes only occurs for a node that actually subscribes to 2^15 subjects*. One can have much fun researching and implementing compressed bitfields for datasets that are normally sparse; however, some kind of RLE would seem adequate, if boring.

  2. Use a “sonar” protocol rather than a periodic publication. This is different than a service call (i.e. our current “pull” model) since the request is a broadcast and the responses are not strongly correlated (I’ll otherwise let the sonar analogy describe the implementation). This has the benefit of avoiding the additional bus load for systems that do not consume the subscription data and avoids publication of the data at rates a subscriber cannot consume. Of course this protocol is easy to abuse since it’s the perfect vector to DoS a bus and does qualify as control coupling where a system must utilize CPU resources to respond to the ping. Mitigations could include ping-response rate limiting and tolerating mute nodes. This also obviates the need for a “notify on changed” message since any listener to these messages can simply maintain a table of values and detect changes.

* do we need a special “subscribes to all subjects” indicator for nodes like bus loggers?
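As a rough illustration of the RLE idea in item 1, the sketch below turns a set of subject-IDs into alternating run lengths of cleared and set bits. The encoding format is purely hypothetical (nothing in this thread defines one), and the run lengths would still have to be serialized somehow, e.g. as variable-length integers, before any size comparison is meaningful.

```python
from typing import Iterable, List, Set


def rle_encode(subject_ids: Iterable[int], capacity: int = 2 ** 15) -> List[int]:
    """Return alternating run lengths: cleared bits, set bits, cleared bits, ...

    The first run always counts cleared bits and may therefore be zero.
    """
    ids = sorted(set(subject_ids))
    if ids and not (0 <= ids[0] and ids[-1] < capacity):
        raise ValueError("subject-ID out of range")
    runs: List[int] = []
    cursor = 0  # index of the first bit not yet described by a run
    i = 0
    while i < len(ids):
        start = ids[i]
        end = start + 1
        i += 1
        # Extend the run while the following IDs are consecutive.
        while i < len(ids) and ids[i] == end:
            end += 1
            i += 1
        runs.append(start - cursor)  # cleared bits before this run
        runs.append(end - start)     # set bits in this run
        cursor = end
    if cursor < capacity:
        runs.append(capacity - cursor)  # trailing cleared bits
    return runs


def rle_decode(runs: Iterable[int]) -> Set[int]:
    """Reverse of rle_encode(): recover the set of subscribed subject-IDs."""
    ids: Set[int] = set()
    cursor = 0
    for index, length in enumerate(runs):
        if index % 2 == 1:  # odd positions hold runs of set bits
            ids.update(range(cursor, cursor + length))
        cursor += length
    return ids
```

With this scheme a node subscribed to every subject (the bus logger case in the footnote) collapses to just two run lengths, `[0, 32768]`.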

Let me be more specific about my idea to make sure we are on the same page. If we removed the two most-significant bits of the subject-ID, thereby limiting the range from [0, 32768) to [0, 8192), the size of the bitmask message would be 1024 bytes, which is guaranteed to fit into one Ethernet frame (or CAN XL frame) assuming a typical MTU. Only now I realize that we neglected to define the required size of the subject-ID space, choosing it based on the available means of implementation (the CAN ID layout) rather than objective application requirements.

Should we approach this problem now?

A sensible combination of the above two methods plus your special case is possible:

@union

uint16 CAPACITY = 2 ** 15  # Or 2 ** 13, see above

bool[CAPACITY]                                mask
uavcan.node.port.SubjectID.1.0[<CAPACITY/8/2] list  # Same maximum size as "mask"
uavcan.primitive.Empty.1.0                    total
# Option "mask" can be used always unconditionally.
# In the interest of bandwidth optimization, option "list" may be used if the number
# of subjects is less than (CAPACITY/16).
# Option "total" is equivalent to "mask" with all bits set (useful for loggers/analyzers).

@assert _offset_.max / 8 == 1 + CAPACITY / 8
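The selection rule described in the comments of this definition could look roughly like the following on the publishing node. This is a sketch only: the function name `choose_option` is invented here, and the returned strings merely name the union options rather than produce a serialized message.

```python
from typing import Iterable


def choose_option(subject_ids: Iterable[int], capacity: int = 2 ** 15) -> str:
    """Pick the union option ("mask", "list" or "total") for a subscription set."""
    ids = set(subject_ids)
    if any(not 0 <= subject_id < capacity for subject_id in ids):
        raise ValueError("subject-ID out of range")
    if len(ids) == capacity:
        # Equivalent to a mask with all bits set (loggers, analyzers).
        return "total"
    if len(ids) < capacity // 16:
        # The list is no larger than the mask, so it saves bandwidth.
        return "list"
    # Always valid, regardless of how many subjects are subscribed.
    return "mask"


if __name__ == "__main__":
    print(choose_option([1234, 4321, 555]))  # a typical small node
    print(choose_option(range(2 ** 15)))     # a bus logger
```

Because "mask" is always admissible, a very constrained node could skip this logic entirely and publish the mask unconditionally.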

Compression is indeed possible but are we not concerned about the resulting complex relationship between the subject-ID distribution and the message size with the variable computational workload necessary for its generation and parsing?

The control coupling (scheduling) problem is part of the reason why I would like to replace services with a subject; the sonar method does not seem helpful here. Another problem here is that in a switched network, a switch is assumed to be a purely reactive device that does not initiate traffic on its own, otherwise, the latency analysis gets complicated. In an operational network, the network switching hardware may be the only consumer that is interested in the subscription information, and if nobody is driving the sonar protocol, the switches will be unable to auto-configure themselves.

The traffic we create with the subscription information can be pushed down to the SLOW or even OPTIONAL priority level because in a statically configured system the loss of any of the messages is tolerable.

I missed that you were hoping to configure switches with this. That would require specialized hardware would it not?

Yes, as is the case with AFDX. But it does not affect compatibility with conventional networking hardware, obviously.

We have defined it. [0, 32768). Perhaps Derrida would be happier with our reasons than Plato but it does seem the decision was made. I think what you mean is that we have new information you want to inject into this decision that might modify it. If so we would need more concrete inputs. A general idea that such a reduction in subject ID scope “might make it easier to build a switch for UAVCAN” isn’t an adequate input.

I think I need to stew on this more given that I was not reading it properly. We seem to be solving an immediate problem while guessing at possible hardware requirements in the future?

I think I am making that mistake again where I just spill out some ideas without explaining the larger context first. The missing bit here is that having released the v1.0-alpha spec half a year ago, we are on track to release v1.0-beta very soon once the last remaining little issue is taken care of. Once the beta is out and the first users from the PX4 community have adopted it (which will happen fast and I am regularly being reminded about the urgency of this epic), we will be stuck with our design decisions for a long time, and the ability to change anything that is not a fatal design flaw and that has the chance of affecting the compatibility with the fielded systems will be essentially gone for a long time.

Now, with that in mind, I am somewhat paranoidly looking for any loose screws we may have left in the foundations and the chain of conclusions that led us to where we are. While doing so, I stumble upon the [0, 32768) hanging in free air supported by nothing, and panic.

I am going to tattoo this line on my face as the best description of developing long-term technical specifications.

I hear you but I’m specifically dubious because of the speculation around hardware support. We have, to date, defined a specification that has no (known) requirements outside of what commonly available hardware can support. Now we are proposing a constraint that materially limits the protocol’s immediate capabilities based on speculations about non-existent hardware. If we are going to build an ASIC switch that supports UAVCAN, for example, then why would a compressed bitfield present difficulty? The decompression algorithm would be in hardware. Yes, the nodes would need to write software compression but we could easily prototype and characterize the complexity of such an algorithm to evaluate the weight that should place on our design.

Perhaps you are thinking that some COTS part designed for a similar protocol might be configurable to support our use case? If so then, again, we need a prototype to evaluate this.

We can safely omit the switching implications from this discussion now and focus on the more immediate objectives, which are two:

  1. The introspection service is a required feature that is demanded by the upcoming DS-015 standard so the question is how we should address it.

  2. Do we leverage the last chance to review the range or do we keep it as is? This choice has implications on the above.

Agreed. I’m concerned about the large amount of periodic data the simple periodic broadcast could add to a bus. While we can send it at a low priority it is available bandwidth we are consuming that is not available to the system for other purposes. Furthermore, the subscription data is only consumed by tooling for CAN systems. Perhaps we still need a service that enables/disables these broadcasts?

I vote for keeping it as it is. Perhaps a reclassification of the ranges would help? Something like the first 8192 belong to the standard space of subject identifiers and the remaining belong to an “extended” set?

Having considered the implications, I am inclined to revise the subject-ID range by dropping the two most significant bits. My motivation is based on the following observations:

  • The state-space of a safety-critical system should be minimized in the interest of simplifying the analysis and verification. This applies to the subject-ID space because even though at a first glance it has no direct manifestation anywhere outside of the message routing, it indirectly affects other entities such as the introspection interface we are discussing. As a specific example, consider the subject list compression idea you introduced earlier that can be obviated completely by constraining the range.

  • I suspect that few realistic systems may require more than 8192 subjects per network. It then follows from the above that the excessive variability offered by [0, 32768) may be harmful.

  • Should the need arise to review the range in the future after the beta is out, it will only be possible to do so towards the expansion of the range rather than its contraction.

We can discuss it further at the dev call today. If this proposal is accepted, we will have exactly two remaining items to correct before the beta is published.

But let’s explicitly reserve the right to increase the subject-ID range in a future revision of the protocol.

UAVCAN/UDP multicasting & IGMP

Last weekend I was working on my experimental UAVCAN-based Yukon backend PoC. While doing that, I ventured to review the idea of employing IP multicasting for the experimental UAVCAN/UDP transport instead of broadcasting (it was partly inspired by the realization that with the broadcast-based approach, there may be at most one network operating on the loopback interface). The original proposal (in the OP post here) closely resembles AFDX, where the switching logic is intended to be configured statically. Later, in “Stabilizing uavcan.node.port.List.0.1, introspection, and switched networks”, I briefly outlined the idea of an IGMP-like auto-configuration logic where the switch is to automatically construct the distribution tree by snooping on the port list messages published by network participants. Further inquiry into this subject revealed that the IGMP protocol itself does not appear to be inherently incompatible with real-time or functional safety requirements, and it is possible to replace its inherently dynamic behaviors with static pre-configuration if such is preferred in a highly deterministic setting.

It is worth pointing out that not all networking equipment provides full support for IGMP. It is common for simpler switching hardware to handle multicast traffic by simply rebroadcasting multicast packets into every output port, which is still technically compatible with the formal specification of IP multicast. This serves to demonstrate that a multicast-based solution can be considered as a special case of the original broadcast-based one and that in highly deterministic settings it is possible to intentionally disable the dynamic behaviors introduced by IGMP in favor of static configuration on the switch, similar to AFDX.

Therefore, the advantage of this updated multicast proposal is that it is capable of efficient utilization of the network bandwidth using standard COTS hardware without the need for any UAVCAN-specific traffic handling policies, while still being compatible with fully static configurations should that be preferred. It should also be noted that the question of reliable delivery that frequently arises in relation to multicasting is addressed by (1) the basic assumption that the underlying transport network provides a guaranteed service level, and (2) the deterministic data loss mitigation method, which can be applied to manage spurious data losses caused by external factors like interference.

This proposal is focused on IPv4 because I expect that the advantages of IPv6 are less relevant in an intravehicular setting. Nevertheless, due to the similarities between v4 and v6, it is trivial to port this proposal to IPv6 should the need arise. The differences will be confined to the specific definitions of IP addresses and some related numbers, while the general principles will remain unchanged.

The changes introduced by this proposal affect only nodes that subscribe to UAVCAN subjects. There is no effect on request/response interactions (because they are inherently unicast), and there is virtually no effect on publications because per RFC 1112, in order to emit a multicast packet, a limited level-1 implementation without the full support of IGMP and multicast-specific packet handling policies is sufficient.

The scheme proposed here for mapping a UAVCAN data-specifier to a multicast IP address (and the reverse) is static and unsophisticated. This is possible because the UAVCAN transport layer model is in itself very simple and it utilizes only compact numerical entity identifiers (instead of, say, textual topic names).

Without the need to rely on explicit broadcasting, the netmask becomes irrelevant for node configuration (per the original proposal it was necessary for deducing the subnet broadcast address). In the interest of simplification, the new approach is to treat the 16 least significant bits of the IP address as the node-ID, which does not necessarily imply that all 65536 addresses are admissible (earlier it was proposed to limit the range of node-IDs to [0, 4095]; that still holds). This is both machine- and human-friendly, especially with IPv6, where the least significant hextet simply becomes the node-ID.

The following 7 bits of the IPv4 address are used to differentiate independent UAVCAN/UDP transport networks sharing the same IP network (e.g., multiple UAVCAN/UDP networks running on localhost or on some physical network). This is similar to the domain identifier in DDS. For clarity, this 7-bit value will be referred to as the subnet-ID; it is not used anywhere else in the protocol other than in the construction of the multicast group address, as will be shown below. The remaining 9 bits of the IPv4 address are not used. The reason why the width is chosen to be specifically 7 and 9 bits will be provided below.

Schematically, the IPv4 address of a node is structured as follows:

xxxxxxxx.xddddddd.nnnnnnnn.nnnnnnnn
\________/\_____/ \_______________/
 (9 bits) (7 bits)     (16 bits)
 ignored  UAVCAN/UDP   UAVCAN/UDP
          subnet-ID     node-ID
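The address layout above can be unpacked with plain bit arithmetic. The following Python sketch uses only the standard `ipaddress` module; the helper name `parse_node_address` is hypothetical and exists only for this illustration.

```python
import ipaddress
from typing import Tuple


def parse_node_address(ip: str) -> Tuple[int, int]:
    """Split a node's IPv4 address into (subnet_id, node_id) per the layout above."""
    value = int(ipaddress.IPv4Address(ip))
    node_id = value & 0xFFFF            # 16 least significant bits
    subnet_id = (value >> 16) & 0x7F    # next 7 bits
    # The 9 most significant bits are ignored.
    return subnet_id, node_id


if __name__ == "__main__":
    for address in ("127.2.0.8", "192.168.0.1"):
        subnet_id, node_id = parse_node_address(address)
        print(f"{address}: subnet-ID {subnet_id}, node-ID {node_id}")
```

Note that two addresses differing only in the ignored bits, such as 10.2.0.8 and 127.2.0.8, yield the same pair.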

Then, in order to provide means for publishers and subscribers to find each other’s endpoints statically (as per UAVCAN core design goals, dynamic discovery is inadmissible) while not conflicting with other UAVCAN/UDP subnets, the multicast group address for a given subnet and subject-ID is constructed as follows:

   fixed in this
   Specification  reserved
     (5 bits)     (3 bits)
       ____          _
      /    \        / \
  11101111.0ddddddd.000sssss.ssssssss
  \__/      \_____/    \____________/
(4 bits)    (7 bits)      (13 bits)
  IPv4      UAVCAN/UDP    UAVCAN/UDP
multicast   subnet-ID     subject-ID
 prefix
            \_______________________/
                    (23 bits)
             collision-free multicast
                addressing limit of
               Ethernet MAC for IPv4

From the most significant bit to the least significant bit, the components are as follows:

  • IPv4 multicast prefix is defined by RFC 1112.

  • The following 5 bits are set to 0b11110 by this Specification. The motivation is as follows:

    • Setting the four least significant bits of the most significant byte to 0b1111 moves the address range into the administratively-scoped range (239.0.0.0/8, RFC 2365), which ensures that there may be no conflicts with well-known multicast groups.

    • Setting the most significant bit of the second octet to zero ensures that there may be no conflict with reserved sub-ranges within the administratively-scoped range. The resulting range 239.0.0.0/9 is entirely ad-hoc defined.

    • Fixing the 5+4=9 most significant bits of the multicast group address ensures that the variability is confined to the 23 least significant bits of the address only, which is desirable because the IPv4 Ethernet MAC layer does not differentiate beyond the 23 least significant bits of the multicast group address (i.e., addresses that differ in the 9 MSb collide at the MAC layer, which is unacceptable in a real-time system) (RFC 1112 section 6.4). Without this limitation, an engineer deploying a network might inadvertently create a configuration that causes MAC-layer collisions which may be difficult to detect.

  • The following 7 bits (the least significant bits of the second octet) are used to differentiate independent UAVCAN/UDP networks sharing the same physical IP network. Since the 9 most significant bits of the node IP address are not represented in the multicast group address, nodes whose IP addresses differ only by the 9 MSb are not distinguished by UAVCAN/UDP. This limitation does not appear to be significant, though, because such configurations are easy to avoid. It follows that there may be up to 128 independent UAVCAN/UDP networks sharing the same IP subnet.

  • The following 16 bits define the data specifier:

    • 3 bits reserved for future use.
    • 13 bits represent the subject-ID as-is.

Publishers should use the TTL value of 16 by default, which is chosen as a sensible default suitable for any intravehicular network. Per RFC 1112, the default TTL is 1, which is unacceptable.

Examples

Node IP address:    01111111 00000010 00000000 00001000
                         127        2        0        8

Subject-ID:                              00010 00101010

Multicast group:    11101111 00000010 00000010 00101010
                         239        2        2       42

Node IP address:    11000000 10101000 00000000 00000001
                         192      168        0        1

Subject-ID:                              00010 00101010

Multicast group:    11101111 00101000 00000010 00101010
                         239       40        2       42
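The two examples above can be reproduced with a short Python function. It is a sketch of the mapping as described in this post, not code taken from PyUAVCAN; the name `multicast_group` and the constant are hypothetical.

```python
import ipaddress

SUBJECT_ID_MAX = 2 ** 13 - 1  # 13-bit subject-ID, as in the diagram above


def multicast_group(node_ip: str, subject_id: int) -> str:
    """Build the multicast group address for a subject on the node's subnet."""
    if not 0 <= subject_id <= SUBJECT_ID_MAX:
        raise ValueError("subject-ID out of range")
    # The subnet-ID is the 7 least significant bits of the second octet.
    subnet_id = (int(ipaddress.IPv4Address(node_ip)) >> 16) & 0x7F
    # 0xEF is 0b11101111: the multicast prefix plus the administratively-scoped bits.
    # The MSb of the second octet and the 3 reserved bits stay zero.
    value = (0xEF << 24) | (subnet_id << 16) | subject_id
    return str(ipaddress.IPv4Address(value))


if __name__ == "__main__":
    subject_id = 0b00010_00101010  # the subject-ID used in both examples
    print(multicast_group("127.2.0.8", subject_id))
    print(multicast_group("192.168.0.1", subject_id))
```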

Additional background

Socket API demo

Here is a trivial send-receive demo used to test the Berkeley socket API:

#!/usr/bin/env python3

import socket

MULTICAST_GROUP = '239.1.1.1'
#IFACE = '127.1.23.123'
IFACE = '192.168.1.200'

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
# This is necessary on GNU/Linux, see https://habr.com/ru/post/141021/
sock.bind((MULTICAST_GROUP, 16384))
# Note that using INADDR_ANY in IP_ADD_MEMBERSHIP doesn't actually mean "any",
# it means "choose one automatically": https://tldp.org/HOWTO/Multicast-HOWTO-6.html
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                socket.inet_aton(MULTICAST_GROUP) + socket.inet_aton(IFACE))

while True:
    print(sock.recvfrom(1024))

And the matching sender:

#!/usr/bin/env python3

import socket

#IFACE = '127.42.0.200'
IFACE = '192.168.1.200'
# If not specified, the TTL defaults to 1 per RFC 1112.
TTL = 16

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, TTL)
# https://tldp.org/HOWTO/Multicast-HOWTO-6.html
# https://stackoverflow.com/a/26988214/1007777
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(IFACE))
sock.bind((IFACE, 0))

sock.sendto(b'Hello world! 1', ('239.1.1.1', 16384))
sock.sendto(b'Hello world! 2', ('239.1.1.2', 16384))

With multicasting in place, it is no longer necessary to segregate subjects by UDP ports; instead, a constant port number can be used. Suppose it could be 16383 (a value as good as any other, as long as it belongs to the ephemeral range and does not conflict with any popular well-known ports).

Service ports can continue using the original arrangement, except that instead of growing downward from a pre-defined base port, they should extend upward for simplicity.

The resulting arrangement is as follows:

  • Subject port: 16383
  • Service request port: (16384 + service_id * 2)
  • Service response port: (16384 + service_id * 2 + 1)
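The arrangement above can be sketched in a few lines of Python. The function and constant names here are made up for illustration and are not taken from PyUAVCAN; only the arithmetic comes from the list above.

```python
# Hypothetical helper names; only the port arithmetic is from the post.
SUBJECT_PORT = 16383       # one constant port shared by all subjects
SERVICE_BASE_PORT = 16384  # service ports extend upward from this base


def service_request_port(service_id: int) -> int:
    """Port on which a server receives requests for the given service-ID."""
    return SERVICE_BASE_PORT + service_id * 2


def service_response_port(service_id: int) -> int:
    """Port on which a client receives responses for the given service-ID."""
    return SERVICE_BASE_PORT + service_id * 2 + 1


print(SUBJECT_PORT)                # 16383
print(service_request_port(430))   # 17244
print(service_response_port(430))  # 17245
```

Each service-ID gets an even/odd pair of adjacent ports, so requests and responses of the same service never share a socket.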

Here is a brief example I came across while testing my PoC implementation (currently on a branch named multicasting). Here we have a command (copy-pasted from the integration test suite):

$ pyuavcan -v pub 4321.uavcan.diagnostic.Record.1.1 '{severity: {value: 6}, timestamp: {microsecond: 123456}, text: "Hello world!"}' 1234.uavcan.diagnostic.Record.1.1 '{text: "Goodbye world."}' 555.uavcan.si.sample.temperature.Scalar.1.0 '{kelvin: 123.456}' --count=3 --period=0.1 --priority=slow --heartbeat-fields='{vendor_specific_status_code: 54}' --tr='UDP("127.0.0.51")' 
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands._subsystems.transport: Configuring the transport from command line arguments; environment variable PYUAVCAN_CLI_TRANSPORT is ignored
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands._subsystems.transport: Resulting transport configuration: [UDPTransport(<udp anonymous="false" srv_mult="1" mtu="1200">127.0.0.51</udp>, ProtocolParameters(transfer_id_modulo=18446744073709551616, max_nodes=65535, mtu=1200), local_node_id=51)]
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Publication set: [Publication(uavcan.diagnostic.Record.1.1(timestamp=uavcan.time.SynchronizedTimestamp.1.0(microsecond=123456), severity=uavcan.diagnostic.Severity.1.0(value=6), text='Hello world!'), Publisher(dtype=uavcan.diagnostic.Record.1.1, transport_session=UDPOutputSession(OutputSessionSpecifier(data_specifier=MessageDataSpecifier(subject_id=4321), remote_node_id=None), PayloadMetadata(extent_bytes=300)))), Publication(uavcan.diagnostic.Record.1.1(timestamp=uavcan.time.SynchronizedTimestamp.1.0(microsecond=0), severity=uavcan.diagnostic.Severity.1.0(value=0), text='Goodbye world.'), Publisher(dtype=uavcan.diagnostic.Record.1.1, transport_session=UDPOutputSession(OutputSessionSpecifier(data_specifier=MessageDataSpecifier(subject_id=1234), remote_node_id=None), PayloadMetadata(extent_bytes=300)))), Publication(uavcan.si.sample.temperature.Scalar.1.0(timestamp=uavcan.time.SynchronizedTimestamp.1.0(microsecond=0), kelvin=123.456), Publisher(dtype=uavcan.si.sample.temperature.Scalar.1.0, transport_session=UDPOutputSession(OutputSessionSpecifier(data_specifier=MessageDataSpecifier(subject_id=555), remote_node_id=None), PayloadMetadata(extent_bytes=11))))]
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Publication cycle 1 of 3 completed; sleeping for 0.099 seconds
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Publication cycle 2 of 3 completed; sleeping for 0.098 seconds
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Publication cycle 3 of 3 completed; sleeping for 0.098 seconds
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: UDPTransportStatistics(received_datagrams={MessageDataSpecifier(subject_id=7509): SocketReaderStatistics(accepted_datagrams={}, dropped_datagrams={51: 3}), ServiceDataSpecifier(service_id=430, role=<Role.REQUEST: 1>): SocketReaderStatistics(accepted_datagrams={}, dropped_datagrams={})})
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Subject 7509: SessionStatistics(transfers=3, frames=3, payload_bytes=21, errors=0, drops=0)
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Subject 4321: SessionStatistics(transfers=3, frames=3, payload_bytes=63, errors=0, drops=0)
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Subject 1234: SessionStatistics(transfers=3, frames=3, payload_bytes=69, errors=0, drops=0)
2020-12-07 04:55:13 518515 INFO     pyuavcan._cli.commands.publish: Subject 555: SessionStatistics(transfers=3, frames=3, payload_bytes=33, errors=0, drops=0)

Shortly after launch the node subscribed to the heartbeat subject (for reasons of node-ID collision detection). Its subject-ID is 7509 = (29 << 8) | 85, which translates into 239.0.29.85. The command takes 300 ms to execute (3 message sets, interval 0.1 seconds). All transfers were single-frame transfers.
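To make the subject-ID to multicast group translation concrete, here is a minimal sketch. It assumes that the subject-ID occupies the two least significant octets and that the second octet is zero, as in the 239.0.29.85 example above; the function name is hypothetical and this is not the PyUAVCAN implementation.

```python
import ipaddress


def subject_multicast_group(subject_id: int) -> ipaddress.IPv4Address:
    """Map a subject-ID onto 239.0.x.y (assumption: second octet is zero)."""
    if not 0 <= subject_id <= 0xFFFF:
        raise ValueError(f"subject-ID does not fit into two octets: {subject_id}")
    # 239 goes into the most significant octet, the subject-ID into the low two.
    return ipaddress.IPv4Address((239 << 24) | subject_id)


group = subject_multicast_group(7509)  # the heartbeat subject from the log
print(group)               # 239.0.29.85
print(group.is_multicast)  # True
```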

How about a simple practical example of running UAVCAN/UDP on a physical hybrid Ethernet + wireless network? I just ran this locally while testing the new multicast implementation I made in PyUAVCAN. Naturally, the implementation is reasonably covered by integration tests (test coverage ~80%) that execute both on Windows and GNU/Linux, but they use the loopback interface instead of a physical network (the latter is harder to set up in the CI environment, but we might get there one day). I wanted to see how my regular office-grade networking equipment handles local multicast traffic.

On the first computer, generate the DSDL definitions and launch the PyUAVCAN demo application. Don’t forget to change the hard-coded local IP address in the demo to something appropriate for your local network configuration (in my case it’s 192.168.1.21; this computer is connected over Wi-Fi):

uvc dsdl-gen-pkg ~/uavcan/pyuavcan/tests/dsdl/namespaces/sirius_cyber_corp
uvc dsdl-gen-pkg ~/uavcan/pyuavcan/tests/public_regulated_data_types/uavcan
basic_usage.py

You may want to also launch Wireshark on either computer to see what’s happening on the network. Although, on the other hand, being stateless and simple, UAVCAN does not generate much interesting traffic (aside from, perhaps, IGMP and deterministic packet loss mitigation).

On the other computer, generate the same DSDL definitions as well, and configure the transport for the CLI tool via the environment variable (it is also possible to use the command line arguments, but I find it to be less convenient):

uvc dsdl-gen-pkg ~/uavcan/pyuavcan/tests/dsdl/namespaces/sirius_cyber_corp
uvc dsdl-gen-pkg ~/uavcan/pyuavcan/tests/public_regulated_data_types/uavcan
export PYUAVCAN_CLI_TRANSPORT='UDP("192.168.1.200",anonymous=True)'

Make sure we can see the heartbeats from the other node (option -M is to see transfer metadata):

$ uvc sub uavcan.node.Heartbeat.1.0 -M
---
7509:
  _metadata_:
    timestamp:
      system: 1607803261.129515
      monotonic: 687025.357389
    priority: nominal
    transfer_id: 962
    source_node_id: 277
  uptime: 962
  health:
    value: 0
  mode:
    value: 0
  vendor_specific_status_code: 74

...

Yup. Let’s stop the heartbeat subscriber and listen to the log messages instead (add -M if you need to see which node each message is coming from):

uvc sub uavcan/diagnostic/Record_1_1

(notice that the CLI tool understands different type name notations for better UX, so it is possible to rely on file system autocompletion)

Keeping that running, in a different terminal, configure the transport in the non-anonymous mode and make a service call to the remote node:

$ export PYUAVCAN_CLI_TRANSPORT='UDP("192.168.1.200")'
$ uvc call 277 123.sirius_cyber_corp.PerformLinearLeastSquaresFit_1_0 '{points: [{x: 1, y: 1}, {x: 10, y: 30}]}'
---
123:
  slope: 3.2222222222222223
  y_intercept: -2.2222222222222214

(where 277 comes from 192.168.1.21 as (1 << 8) + 21)
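The same derivation as a small sketch, assuming the node-ID is simply the 16 least significant bits of the local IPv4 address (which is what the arithmetic above amounts to). The function name is made up for illustration.

```python
import ipaddress


def node_id_from_ip(address: str) -> int:
    """Take the two least significant octets of an IPv4 address as the node-ID."""
    return int(ipaddress.IPv4Address(address)) & 0xFFFF


print(node_id_from_ip("192.168.1.21"))   # 277, the demo node called above
print(node_id_from_ip("192.168.1.200"))  # 456, the node running the CLI tool
print(node_id_from_ip("127.0.0.51"))     # 51
```

The last value agrees with local_node_id=51 reported for UDP("127.0.0.51") in the publisher log earlier.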

So we have the response. Also, the remote node emitted log messages that we can see in the first terminal:

$ uvc sub uavcan.diagnostic.Record_1_1
---
8184:
  timestamp:
    microsecond: 0
  severity:
    value: 1
  text: Least squares request from 456 time=1607803038.091053493 tid=0 prio=4

---
8184:
  timestamp:
    microsecond: 0
  severity:
    value: 2
  text: 'Solution for (1.0,1.0),(10.0,30.0): 3.2222222222222223, -2.2222222222222214'
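As a sanity check of the numbers in the response, here is the textbook ordinary least squares formula applied to the same two points. This is not the code of the demo application, just an independent recomputation; the function name is hypothetical.

```python
from typing import Sequence, Tuple


def linear_least_squares(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (slope, y_intercept) of the ordinary least squares line."""
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xx = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)
    denominator = n * sum_xx - sum_x * sum_x
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = (n * sum_xy - sum_x * sum_y) / denominator
    y_intercept = (sum_y - slope * sum_x) / n
    return slope, y_intercept


slope, y_intercept = linear_least_squares([(1, 1), (10, 30)])
print(f"{slope:.4f} {y_intercept:.4f}")  # 3.2222 -2.2222
```

With only two points the fit is just the line through them, so the slope is 29/9 and the intercept is -20/9, matching the service response.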