Architectural guidance around UAVCAN Gateway

pavel.kirienko · February 27, 2020, 10:43am

The following is a cross-post from Libuavcan issue #299 posted by Janaka S.:

Hi All,
We are in the process of trying to utilise UAVCAN on a medium size research instrument. Our scope of work does not allow us to change the Hardware or the System architecture of the instrument. We want to implement something like the following:

[Image: diagram of the proposed system architecture]

Needs and Notes:

Gateway service needs to forward all CAN packets destined for Service1 and 2, to a MessageBroker over a TCP link. It also needs to carry messages from Service1 and 2 back down to the embedded Boards over CAN.

We want to use the DSDL and UAVCAN to encode and decode packets at the MessageBroker

Service3 and each of the Boards need to be unique nodes on the CAN network

What is the best way to achieve the Gatewaying using UAVCAN?

I have moved the conversation here because the original post is out of place.

pavel.kirienko · February 27, 2020, 10:44am

Gateway service needs to forward all CAN packets destined for Service1 and 2, to a MessageBroker over a TCP link.

It sounds like what you mean is not actually about CAN frames but rather about UAVCAN messages (or services), am I right? Because if the task was to ferry CAN frames around, you wouldn’t need UAVCAN.

How keen are you on using TCP between the Message broker and the Gateway? Have you considered using UAVCAN/UDP instead? It might provide you with seamless integration between CAN and UDP since you would be using the same protocol across the system. Find the details here:

Implementation with usage examples in pyuavcan.transport.udp.
Background, motivation, theory: Alternative transport protocols

Would it be possible for you to provide a higher-level description of the system? What problem is it going to be solving? That might help us to provide better advice.

Janaka · February 28, 2020, 12:07am

You are absolutely correct. I meant to say UAVCAN messages.
I am not fussed if it is TCP or UDP as long as I get a reliable transport with correct packet ordering (which from my reading UAVCAN/UDP provides).
Are you suggesting that I have two UAVCAN stacks in the Gateway; downstream one with CAN and upstream one with UDP? Or do you have a different solution?
In either solution do you foresee any complication with a unified DSDL, Node Addressing or timing?

Unfortunately I can’t provide much more information about the instrument domain due to market sensitivity and confidentiality clauses. However, I can discuss the generic technical communications architecture openly.

pavel.kirienko · February 28, 2020, 3:40pm

Okay, thanks for the clarifications.

For a more constructive discussion we are going to need to know what the transport reliability requirements are and what the failure modes are. What could cause an Ethernet frame to be lost in your setting? For example, is the cable routed through a factory floor with welding equipment generating nanosecond disturbances, etc. Is your bandwidth budget tight? Can you please read this and see if the approach is applicable in your case: Idempotent interfaces and deterministic data loss mitigation

Are you suggesting that I have two UAVCAN stacks in the Gateway; downstream one with CAN and upstream one with UDP?

Yes, this is exactly what I am suggesting. This can be implemented trivially using PyUAVCAN. Actually, I am probably going to draft up a simple demo tomorrow or the day after for demonstrational purposes; I expect it’s going to be a hundred lines of Python or so.

do you foresee any complication with a unified DSDL

No.

Node Addressing

There are two obvious solutions: a transparent bridge and a facade. The former is assumed to expose the CAN nodes below the Gateway as if they were on the same logical network with Message Broker. The latter renders Gateway a dedicated node which hides the topology of the underlying network from the Message Broker; e.g., if Board1 publishes a UAVCAN message, the Gateway forwards it over to the Message Broker, and the latter sees it as if the message was published by Gateway itself. Both Bridge- and Facade-based approaches are possible; the former is very low level (requires dealing with the network at the transfer level), the latter is high level (operating on messages and service calls). The optimal approach depends on the requirements of your application.
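A minimal sketch of the addressing difference between the two approaches, in C. The `ExampleTransfer` type and both functions are hypothetical names made up for this illustration; they are not part of Libcanard, Libuavcan, or PyUAVCAN, and a real bridge would operate on the library's own transfer objects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a received transfer (not a library type). */
typedef struct
{
    uint16_t       subject_id;      /* What the message is about. */
    uint8_t        source_node_id;  /* Who published it. */
    const uint8_t* payload;
    size_t         payload_size;
} ExampleTransfer;

/* Transparent bridge: the transfer is re-emitted unchanged, so the upstream
 * side still sees the original board as the publisher. */
static ExampleTransfer exampleBridgeForward(const ExampleTransfer* const in)
{
    assert(in != NULL);
    return *in;
}

/* Facade: the gateway re-publishes the same payload under its own node-ID,
 * which hides the downstream topology from the upstream side. */
static ExampleTransfer exampleFacadeForward(const ExampleTransfer* const in, const uint8_t gateway_node_id)
{
    assert(in != NULL);
    ExampleTransfer out = *in;
    out.source_node_id  = gateway_node_id;
    return out;
}
```

With a transfer published by node 42 and a gateway at node 1, the bridge output keeps source 42, while the facade output carries source 1. If the receiver needs to know the original sender under the facade approach, that identity would have to be carried inside the message itself.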

timing?

There is not enough data to provide a meaningful response. To answer broadly, UAVCAN is simple enough that one person can hold the entire stack in their head, so the temporal properties at the protocol level are evident. Translating a message from one network into another amounts to receiving frames at one end (e.g., CAN frames) and emitting equivalent frames at the other end (e.g., UDP packets).

Janaka · March 2, 2020, 12:38am

Thank you for the detailed explanation and links.

What could cause an Ethernet frame to be lost in your setting? For example, is the cable routed through a factory floor with welding equipment generating nanosecond disturbances, etc…

Yes. The Ethernet cabling is routed through an area that contains some grunty motors and pumps, but nothing as bad as welders. Furthermore, the Ethernet link is not bandwidth-bound, nor particularly timing-sensitive (200 ms delays are acceptable). I did read through the idempotent interfaces link you sent. Unfortunately, making the interfaces idempotent would require major rework at higher layers, which is not currently possible. Also, doubling up the traffic to mitigate loss is currently undesirable. For this particular instrument, I would prefer a TCP transport layer for UAVCAN if there is one (or re-sending data only on loss or corruption).

There are two obvious solutions: a transparent bridge and a facade…

A few questions about these two solutions:
With the Facade approach, if you use the same DSDL messages on all sides, wouldn’t the Service or Board information of the sender be lost during the routing, because the visible sender is the Gateway when two UAVCAN stacks are used?
What sort of customisation will be required for the Bridge approach?

pavel.kirienko · March 3, 2020, 8:25pm

Noted about the doubling up but I think you might have misunderstood the part about idempotency. Deterministic data loss mitigation (DDLM) is not manifested above the transport layer at all, so your application-level interfaces are not affected. The only practical effect of the DDLM is that the number of lost transfers will be reduced and the bus traffic will be increased.
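To illustrate why the application does not see the repetition, here is a minimal sketch of receiver-side deduplication, assuming that DDLM repeats each transfer with the same transfer-ID and that the receiving transport drops the repeats. `ExampleDedupState` and `exampleAcceptTransfer` are hypothetical names, not a library API. The sketch keeps one state per session (one sender on one subject) and ignores transfer-ID wraparound and timeouts, which a real implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-session state: one instance per (source node, subject) pair. */
typedef struct
{
    bool     seen_any;          /* False until the first transfer arrives. */
    uint64_t last_transfer_id;  /* Transfer-ID of the last delivered transfer. */
} ExampleDedupState;

/* Returns true if the transfer should be delivered to the application,
 * false if it repeats the previously delivered transfer. */
static bool exampleAcceptTransfer(ExampleDedupState* const state, const uint64_t transfer_id)
{
    assert(state != NULL);
    if (state->seen_any && (transfer_id == state->last_transfer_id))
    {
        return false;  /* Duplicate copy: drop it inside the transport layer. */
    }
    state->seen_any         = true;
    state->last_transfer_id = transfer_id;
    return true;
}
```

If a transfer is sent twice and both copies arrive, the application receives it once. If one copy is lost, the other still gets through, which is where the reduction in lost transfers comes from.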

Regardless, TCP is also possible. PyUAVCAN supports the Serial transport which is designed for stream-oriented point-to-point links such as the serial port or USB CDC ACM, and it is also usable with TCP since it’s structurally similar. For that, you will need a simple TCP broker that would interconnect multiple nodes together, as explained in the documentation (section “TCP/IP tunneling”).

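With the Facade approach, if you use the same DSDL messages on all sides, wouldn’t the Service or Board information of the sender be lost during the routing due to visible sender being the Gateway if two UAVCAN stacks are used?

Yes. I perceive that it is undesirable in your case.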

What sort of customisation will be required for the Bridge approach?

So, uhm, PyUAVCAN is not exactly designed to support this case so it gets somewhat convoluted. I poked at this a bit but I am going to need at least a few hours to come up with a working demo, and I couldn’t find the time yet.

If you are interested in pursuing this option, it might be much easier to just do everything at the low level. For the CAN side, take Libcanard, and for the TCP side just re-implement the serial transport as documented in PyUAVCAN; it’s very simple. I might be able to come up with the PyUAVCAN demo later; right now I am stuck with other things, sorry.

Janaka · March 4, 2020, 2:55am

Thank you for your thoughtful insights.

Yes. I perceive that it is undesirable in your case.

Yes. Higher-level services depend on the transparency of the senders and receivers. Hence a bridge/proxy approach is desirable. Anyway, I should be able to get a TCP mechanism working as per your suggestion.
I do currently have multiple Embedded Boards running example libuavcan demos on STM32s. The only area of concern I have is that DSDL-generated code adds a bit to ROM/Flash space (but not RAM). Stats I have:

Publisher = ~1KB
Subscriber = ~3.5KB
Service server = ~5KB
Service client = ~11KB
I am looking at ways to use the DSDL in a more optimal way.
I will have a look at Libcanard as well.

pavel.kirienko · March 5, 2020, 5:26pm

Hi Janaka,

I thought about it some more. I notice that there are methodological similarities between the task of transfer bridging, as we are discussing here, and the monitoring mode discussed at:

I do not see a clear way of implementing transfer-level bridging using the means provided by the public API of the existing implementations. Although the task may seem similar to conventional node-to-node communication, bridging is actually different because it requires the bridge to operate in the promiscuous mode and operate on transfers that neither originate nor terminate at the local node. If you review the dataflow inside a bus monitor and a bridge, you see that they are, in essence, equivalent.

For now, I recommend you to implement bridging by modifying the transport reception logic in Libcanard v1 so that incoming frames are not filtered out based on the address mismatch and that new subscriptions are created automatically ad-hoc upon first reception of a matching frame.
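A minimal sketch of the acceptance logic described above, contrasted with that of a regular node. All names here (`ExampleSubscriptionTable`, `exampleNodeAccepts`, `exampleBridgeAccepts`) are hypothetical and do not exist in Libcanard. The real change would be made inside the library's reception path, so this only models the decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_SUBSCRIPTIONS 16U

/* Hypothetical table of ports the bridge is currently listening on. */
typedef struct
{
    uint16_t port_ids[EXAMPLE_MAX_SUBSCRIPTIONS];
    size_t   count;
} ExampleSubscriptionTable;

/* Regular node: a frame is accepted only if it is a broadcast or is addressed to the local node. */
static bool exampleNodeAccepts(const uint8_t local_node_id, const uint8_t destination_node_id, const bool is_broadcast)
{
    return is_broadcast || (destination_node_id == local_node_id);
}

/* Bridge (promiscuous): the destination is not checked at all. A subscription is
 * created ad hoc the first time a port is seen. Returns false only if the table is full. */
static bool exampleBridgeAccepts(ExampleSubscriptionTable* const table, const uint16_t port_id)
{
    assert(table != NULL);
    for (size_t i = 0; i < table->count; i++)
    {
        if (table->port_ids[i] == port_id)
        {
            return true;  /* Already subscribed. */
        }
    }
    if (table->count >= EXAMPLE_MAX_SUBSCRIPTIONS)
    {
        return false;  /* Out of subscription slots. */
    }
    table->port_ids[table->count] = port_id;  /* First frame on this port: subscribe now. */
    table->count++;
    return true;
}
```

A frame addressed to node 5 is rejected by a regular node with ID 1, whereas the bridge accepts it and remembers the port for subsequent frames.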

By switching to Libcanard v1 you will also resolve your ROM footprint issues. You will be able to rely on direct manual field manipulation rather than auto-generating code. It’s tedious but ROM-efficient:

uint8_t  mode   = canardDSDLGetU8(heartbeat_transfer->payload,  heartbeat_transfer->payload_size, 34,  3);
uint32_t uptime = canardDSDLGetU32(heartbeat_transfer->payload, heartbeat_transfer->payload_size,  0, 32);
uint32_t vssc   = canardDSDLGetU32(heartbeat_transfer->payload, heartbeat_transfer->payload_size, 37, 19);
uint8_t  health = canardDSDLGetU8(heartbeat_transfer->payload,  heartbeat_transfer->payload_size, 32,  2);

uint8_t buffer[7];
//              destination offset   value bit-length
canardDSDLSetUxx(&buffer[0], 34,          2,  3);   // mode
canardDSDLSetUxx(&buffer[0],  0, 0xDEADBEEF, 32);   // uptime
canardDSDLSetUxx(&buffer[0], 37,    0x7FFFF, 19);   // vssc
canardDSDLSetUxx(&buffer[0], 32,          2,  2);   // health
// Now it can be transmitted:
my_transfer->payload      = &buffer[0];
my_transfer->payload_size = sizeof(buffer);
result = canardTxPush(&ins, my_transfer);  // my_transfer is already a pointer
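For readers who want to see what this kind of manual field manipulation amounts to, here is a self-contained sketch. The helpers `exampleSetBits` and `exampleGetBits` are hypothetical and are not the Libcanard functions. They assume that bit 0 is the least significant bit of byte 0 and that multi-bit fields are little-endian, and the real library may differ in details such as bounds handling. The bit offsets and lengths are the ones used in the heartbeat snippet above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the lowest length_bit bits of value into buf, starting at offset_bit. */
static void exampleSetBits(uint8_t* const buf, const size_t buf_size, const size_t offset_bit,
                           const uint32_t value, const uint8_t length_bit)
{
    assert(buf != NULL);
    assert(length_bit <= 32U);
    assert((offset_bit + length_bit) <= (buf_size * 8U));
    for (uint8_t i = 0; i < length_bit; i++)
    {
        const size_t  bit  = offset_bit + i;
        const uint8_t mask = (uint8_t) (1U << (bit % 8U));
        if (((value >> i) & 1U) != 0U)
        {
            buf[bit / 8U] |= mask;
        }
        else
        {
            buf[bit / 8U] &= (uint8_t) ~mask;
        }
    }
}

/* Reads length_bit bits starting at offset_bit. Bits past the end of the buffer read as zero. */
static uint32_t exampleGetBits(const uint8_t* const buf, const size_t buf_size, const size_t offset_bit,
                               const uint8_t length_bit)
{
    assert(buf != NULL);
    assert(length_bit <= 32U);
    uint32_t out = 0;
    for (uint8_t i = 0; i < length_bit; i++)
    {
        const size_t bit = offset_bit + i;
        if ((bit / 8U) >= buf_size)
        {
            break;
        }
        if (((buf[bit / 8U] >> (bit % 8U)) & 1U) != 0U)
        {
            out |= ((uint32_t) 1U) << i;
        }
    }
    return out;
}
```

Under these assumptions, the four fields (32 + 2 + 3 + 19 bits) fill exactly 56 bits, which is why a 7-byte buffer is sufficient.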

Janaka · March 9, 2020, 3:22am

Hi Pavel,
Thanks for the links and information.

For now, I recommend you to implement bridging by modifying the transport reception logic …

I think, for now, I have managed to solve the architectural bridging problem by:

Removing the requirement around needing to address individual services on the x86 PC. I.e.: now only the MessageBroker will be addressable from Embedded or Linux nodes/services. e.g.: Address <1>.
Making Gateway address the same as MessageBroker. e.g.: <1>. Hence anything targeted at the PC will have address <1> on the CAN bus.
Sharing the SocketCAN interface on Linux between the Gateway and other services on the ARM Linux board. Effectively this means that ‘Service3’, for example, gets its unique CAN address on the bus, and any communication to it from the PC goes via the Gateway as with other CAN nodes. Performance-wise, the packets will have to traverse a few more layers in the kernel before popping out on the CAN bus and then being swallowed back up.

Although the above solution is not fully optimal, I will be able to use the two-stack approach previously discussed with minimal customisation.

By switching to Libcanard v1 you will also resolve your ROM footprint issues.

Further analysis of the number of interfaces and messages used within the system has shown that in the worst case the ROM hit will be around 200KB per STM board, which will not be an issue in the current system. I will have a look at Libcanard if all else fails.

pavel.kirienko · March 31, 2021, 4:39pm

This is a delayed reply but I would like to inform future readers that this case is now natively supported in PyUAVCAN (which can be deployed on an embedded GNU/Linux computer if necessary). The required logic is accessible through the capture/tracing/spoofing API.

A simple example is available in the docs. The idea is to use capture+tracing on the downlink interface to receive all transfers from that segment of the network, and then use spoofing to emit received transfers on the uplink interface.