UAVCAN v0: frames of a transfer are received in reversed order

In our setup we observed the following two frames:

data: 21 50 21 8d f8 05 10
data: 4f f0 03 02 14 bf 44

The frames of the transfer arrive in reversed order, and decoding the data then fails.
How can I handle this case?
I think the protocol stack needs to take the data order into account.

Of course the protocol is aware of basic things like frame transmission ordering. Your case looks like a typical priority inversion problem. Please tell us how you manage the transmission queue in your CAN driver. Can you share the code here?
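For readers wondering how the receiver notices misordered frames: in UAVCAN v0 the last payload byte of every frame is a tail byte carrying the start-of-transfer, end-of-transfer and toggle bits plus a 5-bit transfer ID. Below is a minimal sketch of a check built on that layout. The names parseTailByte and isWellOrderedTransfer are made up for illustration and are not part of libuavcan, and the toggle bit alone only catches swaps of neighbouring frames.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// UAVCAN v0 tail byte (last payload byte of each frame):
// bit 7 = start of transfer, bit 6 = end of transfer,
// bit 5 = toggle, bits 4..0 = transfer ID.
struct TailByte
{
    bool start_of_transfer;
    bool end_of_transfer;
    bool toggle;
    std::uint8_t transfer_id;
};

inline TailByte parseTailByte(std::uint8_t b)
{
    return TailByte{(b & 0x80U) != 0, (b & 0x40U) != 0, (b & 0x20U) != 0,
                    static_cast<std::uint8_t>(b & 0x1FU)};
}

// Hypothetical helper: returns true if the tail bytes, taken in order of
// reception, look like one correctly ordered transfer.
inline bool isWellOrderedTransfer(const std::vector<std::uint8_t>& tail_bytes)
{
    if (tail_bytes.empty())
    {
        return false;
    }
    const std::uint8_t transfer_id = parseTailByte(tail_bytes.front()).transfer_id;
    bool expected_toggle = false;  // in v0 the toggle bit is 0 in the first frame
    for (std::size_t i = 0; i < tail_bytes.size(); ++i)
    {
        const TailByte t = parseTailByte(tail_bytes[i]);
        const bool is_first = (i == 0);
        const bool is_last = (i + 1 == tail_bytes.size());
        if (t.start_of_transfer != is_first || t.end_of_transfer != is_last ||
            t.toggle != expected_toggle || t.transfer_id != transfer_id)
        {
            return false;
        }
        expected_toggle = !expected_toggle;  // the toggle alternates frame by frame
    }
    return true;
}
```

With tail bytes 0x85, 0x25, 0x45 (a three-frame transfer with transfer ID 5) the check passes; with the first two swapped it fails, which is the kind of situation that ends in a decode error on the receiving side.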

Related: UAVCAN/CAN: tx buffer management in CAN (FD) controllers

We use it like this:

      clock_(std::make_shared<uavcan_linux::SystemClock>()),
      can_driver_(std::make_shared<uavcan_linux::SocketCanDriver>(*clock_)),

What CAN controller are you using? Perhaps the inversion is happening within it. Also try reducing the number of frames enqueued inside the socket from the default of 2 to 1 (see max_frames_in_socket_tx_queue).

mttcan:

https://docs.nvidia.com/drive/drive_os_5.1.6.1L/nvvib_docs/index.html#page/DRIVE_OS_Linux_SDK_Development_Guide/System%20Programming/sys_components_tegra_can.html

I doubt there are issues in the M_CAN driver but then again, you never know.

You are dealing with an inner priority inversion problem. It is not really specific to UAVCAN. There are two things to try:

  • Reduce the number of enqueued frames to 1 as suggested above. The default should actually be updated in the sources; if you could submit a pull request that would be appreciated.

  • Inspect the sources of the M_CAN driver (which are rather massive) to see if the TX queue management is implemented correctly. Maybe reach out to the maintainers.
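To show why a limit of 1 helps, here is a toy model of the reordering. It is not the mttcan or SocketCAN code. It assumes a controller that picks the pending TX buffer with the lowest CAN ID and, for equal IDs, the lowest buffer index, and a driver that always writes the next frame into the first free buffer. Frame, Slot and simulateTx are names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

struct Frame
{
    std::uint32_t can_id;
    int sequence;  // position of the frame within its transfer, for illustration
};

struct Slot
{
    bool pending = false;
    Frame frame{};
};

// Toy model (assumption, see above). Returns the sequence numbers in the
// order in which the frames reach the bus.
std::vector<int> simulateTx(std::deque<Frame> software_queue, std::size_t max_in_flight)
{
    assert(max_in_flight >= 1);
    std::vector<Slot> slots(max_in_flight);
    std::vector<int> bus_order;
    for (;;)
    {
        // Driver side: hand frames to the controller, first free slot first.
        for (Slot& slot : slots)
        {
            if (!slot.pending && !software_queue.empty())
            {
                slot.frame = software_queue.front();
                slot.pending = true;
                software_queue.pop_front();
            }
        }
        // Controller side: lowest CAN ID wins, ties go to the lowest slot index.
        int winner = -1;
        for (std::size_t i = 0; i < slots.size(); ++i)
        {
            if (slots[i].pending &&
                (winner < 0 || slots[i].frame.can_id < slots[winner].frame.can_id))
            {
                winner = static_cast<int>(i);
            }
        }
        if (winner < 0)
        {
            break;  // nothing left to transmit
        }
        bus_order.push_back(slots[winner].frame.sequence);
        slots[winner].pending = false;
    }
    return bus_order;
}
```

For three frames with the same CAN ID, two slots give the bus order 0, 2, 1: frame 2 lands in the slot freed by frame 0 and overtakes frame 1. With one slot the order stays 0, 1, 2. Whether the real controller behaves like this depends on how its TX buffers are configured, hence the suggestion to read the driver sources.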

Does that mean making this change

SocketCanIface(const SystemClock& clock, int socket_fd, int max_frames_in_socket_tx_queue = 1)

as a PR?

Yes.

Here:

Not sure if there’s any remedy here, since this driver is deprecated, but at the very least I thought I’d leave this comment for someone going down the same rabbit hole as me.
In the original V2 drivers this max variable was set to “2”. It was changed to “10” in February of 2020 because of this bug, which showed that the buffer was not large enough for SocketCAN requirements.

After that, in March 2020, the platform_specific_software repo was moved to OpenCyphal, and for whatever reason that move did not include the change to size 10. Then this thread was posted and the above PR was put up in Oct 21 to set it to “1”, but seemingly that’s only needed for the mttcan controller mentioned above?

Would be nice if this was broken out into something settable by the API if different values are needed for different systems, but given that it’s legacy code it’s probably not worth it :person_shrugging:

A merge request adding a setter method would be accepted.
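For anyone who wants to pick this up, a rough sketch of what such a setter could look like. TxQueueLimit is a hypothetical stand-in class written for this post, not the real SocketCanIface, and the method names are only a suggestion. The default of 1 follows the value discussed above.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the part of the interface that limits how many
// frames may sit in the socket TX queue at once.
class TxQueueLimit
{
    int max_frames_in_socket_tx_queue_;
    int frames_in_socket_tx_queue_ = 0;

public:
    explicit TxQueueLimit(int max_frames_in_socket_tx_queue = 1)
        : max_frames_in_socket_tx_queue_(max_frames_in_socket_tx_queue)
    {
        if (max_frames_in_socket_tx_queue < 1)
        {
            throw std::invalid_argument("TX queue limit must be at least 1");
        }
    }

    // The proposed setter: lets each system choose its own limit at runtime.
    void setMaxFramesInSocketTxQueue(int value)
    {
        if (value < 1)
        {
            throw std::invalid_argument("TX queue limit must be at least 1");
        }
        max_frames_in_socket_tx_queue_ = value;
    }

    int getMaxFramesInSocketTxQueue() const { return max_frames_in_socket_tx_queue_; }

    // True if another frame may be written to the socket right now.
    bool canSubmit() const { return frames_in_socket_tx_queue_ < max_frames_in_socket_tx_queue_; }

    // Call after a frame has been written to the socket.
    void onFrameSubmitted() { ++frames_in_socket_tx_queue_; }

    // Call when the transmission of a frame has been confirmed.
    void onFrameConfirmed()
    {
        assert(frames_in_socket_tx_queue_ > 0);
        --frames_in_socket_tx_queue_;
    }
};
```

A system that needs strict ordering would leave the limit at 1; one that hits the buffer-size problem described above could raise it, accepting the reordering risk.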

@Cherish-forever Did changing the default to SocketCanIface(const SystemClock& clock, int socket_fd, int max_frames_in_socket_tx_queue = 1) solve your issue?