Problems with DS-015

tridge · May 2, 2021, 10:28pm

I’ve been having an in-depth discussion with Pavel about the future of v1 and DS-015 in ArduPilot. While the discussion has been useful, I think opening it to a wider technical audience is worthwhile.
First the good news. From my point of view as systems lead for ArduPilot I see two big advantages of UAVCAN v1:

supporting FDCAN, so that in the future we can get more bandwidth (higher bitrates and larger frames)
the ability to extend messages as requirements change to add new fields without duplicating a message (so hopefully we don’t get things like the mess we have now with “Fix”, “Fix2”, “Auxiliary”, “Status”, all sent by current GNSS v0 nodes).

Along with that good news there is some bad:

as far as I know, nobody has demonstrated correct operation of a mixed network of v0 and v1 on the same CAN bus with a rich set of v0 messages. This must be demonstrated for v1 to have any chance of wide adoption. I’d appreciate it if anyone can point me to such a demonstration if it has actually been done.
the message set in DS-015 is in my opinion not fit for purpose. It is in fact a complete mess.

The main focus of the discussions between Pavel and myself has been around DS-015, and it is the philosophy behind the design of DS-015 that is the core problem. This description from Pavel sums up the philosophy:

You would notice that providing accurate modeling of each sensor kind is,
in fact, an anti-goal, which is why there is no dedicated message, say, for a
GNSS fix, magnetometer, or lidar. DS-015 takes a higher-level approach,
modeling physical processes and entire subsystems instead of particular
kinds of hardware. Under this model, the autopilot is not intended to be
the central piece of the onboard intelligence, but rather yet another agent of the
distributed computing system. The objective of this design is to enable
composable systems where new, complex behaviors can emerge from a
combination of simple and robust components, instead of aggregating the
entire intelligence in a single complex unit (being, in this case, the ArduPilot).

To me, the idea of new complex behaviors emerging from the design makes me want to run away from this as fast as I can. I want predictable behavior from the network, not emergent behavior. This is not some academic exercise where we are trying to get a paper in a journal. ArduPilot is depended on by a huge community of users, from hobbyists to companies flying multi-million dollar aircraft carrying payloads of a ton or more.
Beyond the high level philosophy is a much deeper problem where this design concept has driven the message design in ways that make it completely unworkable. Some of this philosophy driven poor design was present in v0 as well, but in v1 it has grown much more pervasive.
To explain the issues a simple example is useful. Pavel and I hit upon the representation of airspeed in DS-015 as a simple example of our key difference of opinion, so I’ll use that here.
This is the DS-015 airspeed message:
https://github.com/UAVCAN/public_regulated_data_types/blob/master/reg/drone/service/air_data_computer/_.0.1.uavcan#L1
Airspeed is also used in the top level README.md as an example of the design philosophy:
https://github.com/UAVCAN/public_regulated_data_types/blob/master/reg/drone/README.md#L25
In UAVCAN v0 airspeed is represented with the RawAirData message. I’ve never liked that message, as it has a whole lot of useless fluff in it. The only useful fields in RawAirData are the differential_pressure and the static_air_temperature. All the others are just wasted bandwidth.
I was absolutely stunned to see that in v1 the design philosophy has resulted in the removal of the differential pressure. No amount of philosophy or “emergent behaviour” can justify that.
As explained in the v1 README, the reason for this is v1 wants the “air data computer” to be smart and to know how to turn the differential pressure into a CAS (calibrated airspeed). The sensor itself is supposed to know about all the effects that go into creating a useful CAS. That would be a disaster.
The key problems are:

we absolutely need to know the differential pressure to analyse flight logs properly. It will be rare for an airspeed sensor to have logging, and even if it did have logging the burden of aggregating logs from a dozen different UAVCAN sensors into a single usage log set would be prohibitive.
we use a small EKF to automatically calibrate airspeed sensors in ArduPilot. We certainly don’t want to duplicate that EKF code on the airspeed sensor node, and we want the detailed logging of the operation of that EKF to be embedded in the main flight log. The most common UAVCAN airspeed sensors for ArduPilot are based on a STM32F1 and F3, which have neither the processing power nor the flash space to run a suitable EKF, quite apart from the development, testing and update nightmare of getting this deployed.
users commonly don’t even know which port on the pitot is static and which is dynamic pressure. They see two identical tubes and connect them. ArduPilot lets them do that and we sort it out inside the autopilot. It cannot be robustly sorted out inside the sensor itself.

I spend a lot of time analyzing flight logs, and this is a common type of graph (off my NanoTalon from a flight yesterday)

here I am checking the calibration of the airspeed sensor by scaling the GPS speed by EAS/TAS ratio and comparing to a simplified form of the airspeed equation from differential pressure. I can’t do that type of analysis if I don’t get the differential pressure in the log.
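As a rough sketch of the kind of check described above (not ArduPilot’s actual code): the simplified airspeed equation turns a logged differential pressure into an equivalent airspeed, and the GPS ground speed is scaled by the EAS/TAS ratio for comparison. The function names, the sea-level density constant and the still-air assumption (ground speed taken as TAS) are illustrative assumptions.

```python
import math

RHO_SEA_LEVEL = 1.225  # kg/m^3, standard sea-level air density (assumed constant)


def eas_from_differential_pressure(dp_pa: float, ratio: float = 1.0) -> float:
    """Simplified pitot equation: EAS = sqrt(2 * dp * ratio / rho0).

    `ratio` is a hypothetical per-aircraft calibration factor.
    Negative pressures (e.g. noise around zero) are clamped to zero.
    """
    return math.sqrt(2.0 * max(dp_pa, 0.0) * ratio / RHO_SEA_LEVEL)


def gps_speed_as_eas(ground_speed_ms: float, eas_over_tas: float) -> float:
    """Scale GPS ground speed by the EAS/TAS ratio (assumes no wind)."""
    return ground_speed_ms * eas_over_tas


# One logged sample: 245 Pa differential pressure, 21 m/s ground speed.
residual = eas_from_differential_pressure(245.0) - gps_speed_as_eas(21.0, 0.95)
print(f"airspeed residual: {residual:.2f} m/s")
```

Without the raw differential pressure in the log, the first function has nothing to work on, which is the point being made here.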
Those of you who know about airspeed sensors may be thinking at this point that some airspeed sensors don’t work via a differential pressure, so sending a differential pressure is silly. I am well aware of that; for example, the SDP3X does not have a differential pressure to send, and instead is a type of “flow” sensor. In that case we should send the data that it does have as a separate message designed for this type of sensor. The SDP3X is actually an interesting case, as you can’t calculate an airspeed without knowing the current pressure altitude (or more specifically outside air pressure rho). Under the DS-015 design philosophy the SDP3X based sensor would listen on the network for someone else to give it the air pressure. That completely misses the fact that it is common that there are multiple barometers on the aircraft, and the SDP3X based sensor has no way of knowing which one to use. On my little NanoTalon I have 3 barometers. Some aircraft have more than that. The job of choosing which one is best needs to be done by the main autopilot, which has all the other sensors available to it and can see which barometer is most consistent with the full set of sensors.
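To make the selection problem concrete, here is a deliberately simplified sketch of picking the barometer that agrees best with the others. It is an assumption for illustration only: a real autopilot would weigh the barometers against the full sensor set, not just against each other, and `pick_barometer` is a hypothetical name.

```python
from statistics import median
from typing import Dict


def pick_barometer(readings_pa: Dict[str, float]) -> str:
    """Return the name of the barometer closest to the median reading.

    A stand-in for "most consistent with the other sensors"; a single
    sensor node on the bus has no equivalent view of the vehicle.
    """
    mid = median(readings_pa.values())
    return min(readings_pa, key=lambda name: abs(readings_pa[name] - mid))


# Three barometers, one of which has drifted badly.
readings = {"baro0": 101300.0, "baro1": 101310.0, "baro2": 99000.0}
print(pick_barometer(readings))
```

Even this trivial vote needs all the readings in one place, which an SDP3X based node listening for “the” air pressure does not have.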
It is also absolutely critical that deploying UAVCAN be as easy as we can possibly make it. We’ve been working very hard to make UAVCAN viable for as many people in our community as we can. That involves ensuring that you can buy a UAVCAN airspeed sensor (like this one), plug it in and immediately be able to use it, just like if you plugged in an I2C sensor, but with all of the nice bus properties of CAN. Under Pavel’s model you would need to do a pile of configuration of the sensor before you could use it, and even after doing that you’d end up with a less useful sensor than you’d get with I2C. Adopting this design approach will undo much of the hard-won gains we’ve made in UAVCAN adoption.
Now you may be thinking this is just about a silly approach to airspeed in DS-015, but it isn’t. The same poor design choices for airspeed are right throughout the DS-015 message set. Everything in DS-015 is made subservient to the design philosophy, rendering the entire message set not fit for purpose.
So what to do? We have to drop DS-015. Instead we should create a message set that contains the information we actually need for robust UAS.
For the simple example of an airspeed sensor based around differential pressure it should contain differential pressure and temperature. That is it. It should not have a covariance matrix (the use of piles of covariance matrices in v0 and v1 is one of the worst aspects of the design). It should not try to provide a calibrated airspeed reading.
If you have a true air data computer like is common in manned aviation then that should be a separate message. That is likely to remain very rare in small UAS, but as ArduPilot is used on more complex aircraft then it will happen. In that case we should have a separate CalibratedAirspeed message. We should not try to fake up this message if the sensor has no reasonable way to provide the data in a robust manner.
The UAVCAN message set needs to be a robust transport of accurate data from the sensors which take measurements to the nodes that need those measurements. It should not make stuff up in order to fit into a broken philosophical mold.
The ArduPilot dev team is happy to create this sane message set. I hope that everyone will understand why it is necessary, and why adopting DS-015 would be a great disservice to our community.
Cheers, Tridge

coder_kalyan · May 3, 2021, 2:16am

Hello, while I don’t have any experience with your specific example (airspeed sensors) I wanted to respond to some of your concerns regarding DS-015. Also, please note that I am a user of UAVCANv1, not a core designer. However, I am (generally speaking) in favor of the principles it strives for.

as far as I know, nobody has demonstrated correct operation of a mixed network of v0 and v1 on the same CAN bus with a rich set of v0 messages. This must be demonstrated for v1 to have any chance of wide adoption.

I agree that this is a limitation that should be addressed better than it is currently.

To me, the idea of new complex behaviors emerging from the design makes me want to run away from this as fast as I can. I want predictable behavior from the network, not emergent behavior.

The two are not mutually exclusive. “Emergent behavior” does not mean that the network will start behaving in unpredictable ways as it gets more complex. It just means that the network should be flexible enough to incorporate new functionality that was not considered by the original designer of the system, and that a set of well-defined services communicating in a network can come together to perform all required vehicular tasks.

As explained in the v1 README, the reason for this is v1 wants the “air data computer” to be smart and to know how to turn the differential pressure into a CAS (calibrated airspeed). The sensor itself is supposed to know about all the effects that go into creating a useful CAS. That would be a disaster.

I understand where some of this thinking comes from, as I am also guilty of thinking this way. (Again, please note that I talk not about airspeed sensors in particular, but about the underlying philosophy behind smart nodes). However, I think it helps to turn to the non-robotics/embedded industry. In the rest of the software world, the following design principles are being applied more and more in software development:

separation of concerns
modular/distributed architecture
flexible/future proof designs
no single point of failure

There is no reason why (at least IMO) this shouldn’t be applied to robotics. The software features performed by modern autonomous vehicles stretch what can/should be performed on a single central component like a flight controller. Not only does this increase the code complexity of an FC, but there’s enough literature to tell you the dangers of a master/slave architecture (primarily the creation of a single point of failure). What you speak of regarding calibrated airspeed vs. indicated airspeed seems to me as separation of concerns, which is primarily to avoid leaking unnecessary context between nodes.

It is also absolutely critical that deploying UAVCAN be as easy as we can possibly make it.

Maybe I’m missing something but I don’t see how DS-015 makes this difficult.

Under Pavel’s model you would need to do a pile of configuration of the sensor before you could use it

If you’re talking about the fact that a lot of things need to be hooked up/registers/ids/ports set, yes this is annoying, but I don’t think it’s a big problem in the long run as plug-and-play services are part of the DS-015 standard and can/will be implemented on the autopilot.

We have to drop DS-015. Instead we should create a message set that contains the information we actually need for robust UAS.

You are welcome to do so, but I would urge you to address specific issues with DS-015 instead if you can, as we can all benefit from further standardization in the community (even when people disagree on design philosophies) instead of the constant forking and re-forking that is common in these projects.

likely to remain very rare in small UAS,

Why? I know it may not be super common right now, but that’s not the point - in general it’s a good idea from an architectural point of view.

The most common UAVCAN airspeed sensors for ArduPilot are based on a STM32F1 and F3, which have neither the processing power nor the flash space to run a suitable EKF

This is something I’ve faced as well - low end microcontrollers don’t have the specs to run the required “smart” algorithms, so the obvious solution is to make the node a “dumb” one and keep the autopilot “smart”. However, as much as I like affordable and small MCUs on my cannodes, the small increase in price/complexity to increase the processing power of the cannode is IMO worth it in the big picture.

Hope my two cents was of help.

tridge · May 3, 2021, 5:40am

This should have been priority number one. It should have happened before v0 was deprecated. We had a stable UAVCAN (what is now called v0). It had a rich, stable API and was widely adopted. If you’re going to deprecate it then you should not even think about doing that until the new protocol is demonstrated to be compatible on the bus with the existing protocol.

The existing v0 protocol (and just about any network protocol with the ability to add new messages) meets that bar.

nope, it actually reduces complexity of the flight controller a lot. It is not as though we are going to drop support for v0 sensors or serial sensors or i2c sensors. So we must have the code in the flight controller to directly handle the differential pressure.

It is not just CPU load - the airspeed sensor node does not have the information available to it to provide the information that the DS-015 message set requires. Pavel suggested to me it should go and get that information on the network. Yet where from? Which of the many nodes on the network broadcasting different versions of this information should it use? It has no way to evaluate which sensor is the right one to use and the recipient of the final CAS has no way to reverse engineer what the airspeed node did, which makes it impossible to do any decent diagnostics.
The airspeed sensor has a simple differential pressure and temperature available to it. In the DS-015 model we need to now have an EKF, sensor selection, logging and diagnostic information, all because the philosophical model wants to avoid sending that simple floating point number.
It is a broken design, with more points of failure than we have with the vastly simpler design of just sending that float on the network.
I’m well aware of the principles underlying distributed computing and network protocol design, and I’ve been implementing distributed systems for more than 30 years. I’m pointing out that this is not good distributed system design or network protocol design. This is an inflexible ideology driving UAVCAN for UAS down a path that makes it unsuitable for anything outside an academic context.

you can’t “plug and play” the airspeed calibration data. You need information about the aerodynamics of the aircraft that is gathered by actually flying it.
There is far too much of this hand-waving dismissal of the real problems of DS-015. Hand waving at plug-and-play solving things doesn’t work without thinking about what it would have to do and whether it has the information needed to do it.
Similarly I see hand waving about bandwidth, dismissing bandwidth concerns with “it’s all going to be FDCAN at high bitrates anyway”. The tempo of hardware and software releases means we are going to be dealing with deployments at 1 Mbit/s for many years to come. As long as a single node on the network is not capable of FDCAN then all nodes are stuck at the slow data rate.
I’m pushing hard to get vendors to use MCUs that are FDCAN capable, because we’re already under bandwidth pressure with complex setups and it is clearly going to get worse. But I know that real adoption will be slow, so we do need to care about bandwidth efficiency in v1 or we’ll just be ensuring that v0 lives a lot longer.
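For a sense of scale, the following back-of-envelope sketch estimates how many classic CAN frames fit on a 1 Mbit/s bus. It assumes 29-bit identifiers, counts the standard frame fields plus the 3-bit intermission, and uses the usual worst-case bit-stuffing bound; the function names are illustrative, and real buses cannot run at 100% load.

```python
def classic_can_frame_bits(data_bytes: int, worst_case_stuffing: bool = True) -> int:
    """Bits on the wire for one classic CAN data frame with a 29-bit ID."""
    if not 0 <= data_bytes <= 8:
        raise ValueError("classic CAN carries 0..8 data bytes")
    # SOF(1) + base ID(11) + SRR(1) + IDE(1) + ext ID(18) + RTR(1) + r1,r0(2) + DLC(4)
    header_bits = 39
    # Bit stuffing applies from SOF through the end of the 15-bit CRC.
    stuffable = header_bits + 8 * data_bytes + 15
    # CRC delimiter(1) + ACK slot and delimiter(2) + EOF(7) + intermission(3)
    fixed = 13
    stuff = (stuffable - 1) // 4 if worst_case_stuffing else 0
    return stuffable + fixed + stuff


def max_frames_per_second(bitrate: int, data_bytes: int = 8) -> int:
    """Upper bound on frames per second at 100% bus utilisation."""
    return bitrate // classic_can_frame_bits(data_bytes)


frames = max_frames_per_second(1_000_000)
# One byte of each frame is the UAVCAN tail byte, leaving 7 for payload.
print(frames, "frames/s,", frames * 7, "payload bytes/s at most")
```

Under these assumptions the whole bus carries a few thousand full frames per second, shared between every node, which is why every extra field in a high-rate message matters.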
Cheers, Tridge

proficnc · May 3, 2021, 7:22am

Adding vendor’s perspective to @tridge’s comments.

I agree with @tridge, especially on the point about adding smarts to the nodes. As mentioned, with smarts come configuration and added hardware cost. Another added cost is user experience and support. Configuration needs a UI and additional education for consumers on how to do this and that. And all of it will be distributed across different docs, unlike now, where the user just needs to follow the procedure described on one or two forums.

It is my opinion as well that it should be entirely optional to have smart nodes. I don’t see a clear advantage of the said architecture in practical terms anywhere, just tropes and ideologies.

Mallikarjun_SE · May 3, 2021, 8:19am

Hello everyone,
I’m not an expert in this particular discussion. But what I can say is that AP_PERIPH with v0 has been very friendly to users. Everything is just plug and play, and there is the right amount of resources and documentation available for any new user to adapt. We have built many devices, including GPS, airspeed sensor and industrial magnetometer, using AP_PERIPH.
I have seen my software team easily get these devices working in very little time. Before using AP_PERIPH, we had planned for v1. The team really struggled to adapt to the changes made from v0 to v1.
Thank you!

auturgy · May 3, 2021, 8:47am

As it both summarises the core issue and proposes a way forward, I’ll copy in my last email from the preceding correspondence with Pavel:

I’ll chime in here, as I’d really like to find a way to move forward.

Pavel, I’m sure it’s not lost on you that the concerns raised by Tridge are largely repeats of concerns raised during other implementation efforts. It must be frustrating and tiresome for you to keep having to address them, but the fact that they are consistently raised by independent parties should indicate that there is in fact a problem.
In my view the core issue is basically a conflict between “what does an ideal uavcan v1 system look like” and “how can I implement uavcan v1 in an existing autonomous vehicle”. I hope that when viewed from the latter perspective that the validity of our concerns is apparent, even if it is a little at odds with a “pure” implementation of uavcan v1. We are not starting from a clean sheet of paper, and simply have no option but to consider existing users and systems.
Moving on from the philosophical to the technical, having watched the evolution of DS015, most efforts at service implementation (ie battery, gnss) have required changes to the definitions to achieve a workable outcome, and at least in the case of gnss not yet resulting in a complete solution.
I expect that this pattern will continue as adopters attempt to implement DS015, and ArduPilot simply isn’t going to go through the process of fixing it.
From our perspective it will be more efficient to start again, rather than hacking away piecemeal as is currently happening.

If ArduPilot commits to implementing the v1 transport, would you consider using the funding you’ve offered from UAVCAN to develop more workable services for autonomous vehicles?

I would ask that governance for such an effort be independent of both ArduPilot and DroneCode, to sidestep the history and conflict that exists and unfortunately can’t be ignored. It might be that uavcan can find a way to provide some neutral ground.

Regards,

James

For the sake of the wider autonomous systems industry, I’d like to avoid a “hard fork”: there are many good things about v1, and also many good things about standardisation and collaboration. Unfortunately DS-015 isn’t workable, but perhaps raising and accepting that provides an opportunity to reset and get this done right.

dagar · May 3, 2021, 3:22pm

I think I understand the goal of having data types that map to the physical process rather than dedicated to a specific type of sensor, but I don’t see why that necessarily means the only option is to force a dumb differential pressure sensor to become an intelligent air data computer.

I propose we add services for handling these simple sensor types, and leave things like “air_data_computer” as aspirational examples. The nice thing about UAVCANv1 is you don’t need the service to start or do something custom, you can directly publish a dynamic pressure (reg.drone.physics.thermodynamics.PressureTempVarTs) on your own subject. What’s missing from my perspective is how we can keep these common simple cases easy to use, hard to misuse. This seems solvable.

It would also be nice to make the covariances optional in these messages. https://github.com/UAVCAN/public_regulated_data_types/blob/1337b1c86fee5bd3f3c3c0f1027bcf19e5c08aae/reg/drone/physics/thermodynamics/PressureTempVarTs.0.1.uavcan#L9

As an aside, the example of forcing a differential pressure sensor to become a full fledged intelligent air_data_computer is a bit absurd, but a more realistic example to work through later might be something like an INS (eg VectorNav, Inertial Sense, or ultimately a triplex system of our own making). Here we’ll need a range of high level (state estimates), low level (raw or minimally processed sensor data), and diagnostic information.

dagar · May 3, 2021, 6:11pm

For reference, here’s what an earlier version of DS-015 had for differential pressure.

alexwien7 · May 3, 2021, 6:13pm

Just some short feedback from an ArduPilot partner’s view:

My background is more than 30 years of software development of algorithms for embedded and backend solutions.
From that background I have to side with @Tridge; in many cases a smart sensor is the less suitable solution:

A technical solution that costs more in hardware, due to the extra processing power in an intelligent sensor, is unlikely to be accepted by vendors that mass-produce devices, e.g. automotive companies.

I have done a lot of development on GNSS/GPS-based algorithms, so let me give you an example. Although a GPS sensor is an intelligent one, it also provides the necessary sensor data:

Some of you might remember the SiRF Star GPS chips used long ago. A later generation (SiRF III) was more accurate, but that led to so-called static navigation when a vehicle was standing still (= jumping positions and driving direction). This is still a problem in today’s GPS chips.
To prevent existing car navigation systems from showing crazy jumps of the vehicle position, SiRF implemented an option to enable a “static navigation filter”. That was a simple solution that solved the problem for navigation systems.
So here an intelligent sensor provided the correct speed.
However, we used GPS for a truck tolling algorithm, and our (= my map matching) algorithm works differently from a navigation system. In that application the algorithm works better with specific filtering that differs from the standard one. So the preferred solution was to disable that “intelligent behavior”, get the unfiltered speed, and apply our own filters.
It was absolutely necessary to get the raw sensor data (in the form of unfiltered speed) to get the best possible result.

You can also pick some buzzwords related to software architecture principles as posted above, so I’ll take one of the ones I follow:

Testability

It’s great that ArduPilot has one flash log file; I assume that this improves testability a lot. I know a companion system that has 7 different log files. That’s not a good idea. If you have to write a logging adapter for each intelligent sensor, you end up writing 90% glue code.

If an airspeed sensor is not really intelligent, then I would not artificially treat it as such.

tridge · May 3, 2021, 8:27pm

yes, we should have simple ones that just reflect the real data. I don’t think we should add the “air_data_computer” unless there is going to be real hardware that will really be used that needs it.

we should just stop using covariance matrices. It just encourages developers to make up meaningless numbers. We shouldn’t be wasting bandwidth on stuff that is just made up.
A few guiding principles in message design:

don’t add fields unless there is a real need for them
don’t add fields that force the developer to make stuff up that they don’t really know

closer, but should remove a bunch of fields.

I don’t think the timestamp really has value on this message. Timestamps have enormous value on messages like GNSS position and velocity, but on differential pressure used for airspeed I don’t think it is useful. The time of arrival is fine.
remove both filter_delay and the filtered differential pressure. It would only make sense if we were greatly reducing the sample rate on the bus and I don’t think we are likely to be doing that.
get rid of the variance, as the sensor is unlikely to really have a good measure of that
only have one temperature

We should aim to get it down to a single CAN frame if possible.
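As an illustration of the single-frame target, the sketch below packs a differential pressure and one temperature into 6 bytes, which fits the 7 payload bytes left in a classic CAN frame after the UAVCAN tail byte. The field layout (float32 pressure, float16 temperature, little-endian) and the function names are assumptions for this example, not an existing DSDL definition.

```python
import struct

# 8 data bytes per classic CAN frame, minus the 1-byte UAVCAN tail byte.
SINGLE_FRAME_PAYLOAD_BYTES = 7

# Hypothetical layout: float32 differential pressure [Pa], float16 temperature [K].
_LAYOUT = struct.Struct("<fe")


def pack_pitot_sample(differential_pressure_pa: float, temperature_k: float) -> bytes:
    """Serialise one raw pitot sample with no timestamp and no variance."""
    return _LAYOUT.pack(differential_pressure_pa, temperature_k)


def unpack_pitot_sample(payload: bytes) -> tuple:
    """Inverse of pack_pitot_sample; returns (pressure_pa, temperature_k)."""
    return _LAYOUT.unpack(payload)


payload = pack_pitot_sample(245.0, 288.15)
print(len(payload), "bytes; fits in one frame:", len(payload) <= SINGLE_FRAME_PAYLOAD_BYTES)
```

A float16 temperature is quantised to 0.25 K around room temperature, so whether that precision is acceptable would have to be decided as part of the message design.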

but that doesn’t tell you what this reading actually is. It just says it is a difference between two pressures and a temperature (plus a pointless timestamp and covariance). We need it to be broadcast in a form that says “this is from a pitot tube, if you want to get a pitot based airspeed you can use this”.

tridge · May 3, 2021, 10:17pm

A related topic is if we want to identify the specific sensor type. For example, when analysing logs I really want to know if the differential pressure sensor is a DLVR or a MS4525, as we know the temperature drift on the DLVR is much lower, so when looking for the reason for a stall knowing if you can trust the readings matters.
The way we cope with that now in the ArduPilot world is AP_Periph sends vendor/product info in the NodeInfo, and also sends LogMessage strings with sensors it finds (eg. for GNSS the type of GPS detected, and for u-blox it sends the fw version of the u-blox, which really matters a lot).
A more organized approach to this would be nice. Otherwise we’ll just keep using the strings.
The thing to avoid like the plague is general “this is a pressure” broadcasts. It matters if it’s an engine pressure, differential pressure from a pitot, leak detector in a submarine, barometer, etc. The whole “it’s just a physics thing” approach is not good.
Those of you who have dealt with vehicles with both a MS5525 airspeed sensor and a MS5611 baro, both on I2C, will know just how much people like me curse Measurement Specialties for not making it possible on I2C to determine which type of sensor it is. Let’s not make the same mistake. Just because the units match does not mean the data is equivalent.
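The same idea can be sketched in code: two readings with identical units but different meanings get different types, so a consumer cannot mix them up by accident. The class and field names here are hypothetical and only illustrate the principle; they do not correspond to any existing message definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitotDifferentialPressure:
    """Differential pressure from a pitot tube, in pascal."""
    pascal: float
    sensor_model: str  # e.g. "DLVR" or "MS4525", useful when reading logs


@dataclass(frozen=True)
class BarometricPressure:
    """Static pressure from a barometer, also in pascal."""
    pascal: float
    sensor_model: str


def airspeed_input(sample: object) -> float:
    """Accept only pitot data, even though a barometer shares the unit."""
    if not isinstance(sample, PitotDifferentialPressure):
        raise TypeError(f"not a pitot reading: {type(sample).__name__}")
    return sample.pascal


print(airspeed_input(PitotDifferentialPressure(245.0, "DLVR")))
```

Passing a `BarometricPressure` to `airspeed_input` raises a `TypeError` instead of silently producing a nonsense airspeed.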

coder_kalyan · May 4, 2021, 12:29am

we should just stop using covariance matrices.

It’s helpful to have them for when they are actually used but I agree that in the majority of scenarios they are not worth the bandwidth (for instance I just send 0s most of the time).

We need it to be broadcast in a form that says “this is from a pitot tube, if you want to get a pitot based airspeed you can use this”.

Can’t this be done with subject name semantics? I don’t think a specific message is necessary.

tridge · May 4, 2021, 12:50am

strings aren’t great for separating semantics, unless you have well defined strings to ensure everyone always uses the same string.
Personally I’d prefer a separate msg

Madman · May 4, 2021, 3:55am

Dropping my two cents into the conversation. I, like others, agree with @tridge and what he has said. But a hard fork is going somewhat backwards IMO.

tridge · May 4, 2021, 3:59am

I wouldn’t be posting here if I didn’t want to avoid a fork. I’d much rather find a solution that meets the technical needs while working across the broader UAVCAN community.
I just can’t ignore the technical requirements, so if we have to split into APCAN then we will, but at this stage I’m hopeful it will not be necessary.

tridge · May 4, 2021, 4:01am

I alluded to some concerns with mixed networks in the above discussion. I’ve now started a separate topic for that:

pavel.kirienko · May 4, 2021, 3:44pm

@tridge I apologize for being a bit late for this discussion — it took me a bit of time to dig through the extensive background and understand where ArduPilot is coming from. The worst thing I could do in this situation is to give a quick knee-jerk response. Let me know if you think I missed anything critical from the preceding conversation.

Common ground

Everyone involved in this conversation or who would like to get involved should first understand what DS-015 is and what it is not. Please, do not post anything regarding DS-015 until you have thoroughly familiarized yourself with the following sources (I know that @tridge, @dagar, and @coder_kalyan have already done so):

https://uavcan.org (the text on the main page explains what to expect from UAVCAN v1)
https://uavcan.org/specification (reading the first two chapters will suffice)
The Cyphal Guide - Applications & Usage - OpenCyphal Forum
Choosing Message and Service IDs (more on this below)
https://github.com/Dronecode/SIG-CAN-Drone/blob/main/DS-015%20UAVCAN%20Drone%20Standard%20v1.0.pdf

I want to set the foundation by listing certain points that I believe we can easily agree on.

First, this conversation is not about UAVCAN v1. I think both @tridge and I agree that ArduPilot, and the entire ecosystem around it, will definitely benefit from supporting v1 because it does fix many issues that used to be present in v0. For instance, v1 allows one to modify data types without breaking wire compatibility (which has already been successfully demonstrated); also, it is transport-agnostic, which will become important in the longer term. This conversation, however, is about the application-layer communication standard built on top of UAVCAN v1. Unlike v0, the stable version does not address any application-layer objectives; instead, it delegates this task to higher-level standards. DS-015 is one such standard. There may be others. One such standard may be crafted by the ArduPilot team independently from anyone else (although I would gladly offer my best advice if they welcome it); we will be referring to this hypothetical entity as “APCAN”.

Second, DS-015 is not the best choice for building sensor networks, just as a microscope is a poor choice of tool for hammering in nails. The criticism provided by Tridge is entirely correct: if you take a thing designed to do X and apply it for task Y you should expect suboptimal results. It saddens me that the critique is so off the mark; I take it that I should have done a better job at explaining why DS-015 is designed this way. I will try to correct this, but in return, I would like to ask you to do your part by honestly zooming out without holding onto your existing preconceptions.

Third, we should strive to unify our requirements instead of building APCAN next to DS-015, as the fork will be beneficial to neither party. Find my proposal at the end of this post.

What DS-015 is not?

It is not a replacement for I2C, SPI, UART, or UAVCAN v0. Much of the frustration in @tridge’s post comes from incorrectly set expectations.

Let’s say, you have been using a hammer for a long while. It wasn’t a great hammer, but it was just good enough to do its job. Then I came along, took your hammer away, and gave you a new screwdriver as a replacement. You looked at me in bewilderment, asking if I have completely gone insane, because how are you supposed to hammer in nails using that.

You can’t use idiomatic DS-015 for ferrying sensor measurements or actuator commands between the flight controller and its peripherals.

UAVCAN v0 integrates with your flight controller at the driver layer. DS-015 integrates with your flight controller at the layer of its business logic because the other network participants become part of the high-level control processes that used to be concentrated exclusively in the autopilot. This is what you can build using v0, but not idiomatic DS-015:

UAVCAN v0 is built for data exchange. DS-015 is built on the modern theory of distributed computing. @coder_kalyan has done a decent job summarizing the basics — partly repeating the UAVCAN Guide — so I will omit the details.

No more data type identifiers

We need to correct a critical error of interpretation that I spot in these posts (note the added emphasis):

These passages make me realize that I have probably done a poor job writing section Semantic segregation of the Interface Design Guidelines, introduction to the DS-015 standard, and the chapter “Basic concepts” of the UAVCAN Specification because all of these were supposed to address or prevent this misunderstanding. Let me quote the relevant bit from The Guide:

Instantiating a service necessarily involves assigning its subjects and UAVCAN-services certain specific port-identifiers at the discretion of the implementer (for example, configuring an air data computer of an aircraft to publish its estimates as defined by the air data service contract over specific subject-IDs chosen by the integrator).

Excepting special use cases, the port-ID assignment is necessarily removed from the scope of service specification because its inclusion would render the service inherently less composable and less reusable, and additionally forcing the service designer to decide in advance that there should be at most one instance of the said service per network. While it is possible to embed the means of instance identification into the service contract itself (for example, by extending the data types with a numerical instance identifier like it is sometimes done in DDS keyed topics), this practice is ill-advised because it constitutes a leaky abstraction that couples the service instance identification with its domain objects.

So what does it mean practically? Here is an example. Suppose you want to connect a differential pressure sensor to a legacy UAVCAN v0 network. You would make a data type that might look as follows:

uint8 sensor_id              # Which specific sensor is it?
float32 pressure_difference  # [pascal]

You would assign it a data type ID, let’s say, 20001. Then, when integrating a new sensor into the network, you configure it, defining which value of sensor_id it should publish. Alternatively, you could use the node-ID of the sensor to differentiate its data from other sensors of the same kind.

None of these methods work in UAVCAN v1: there are no data type identifiers. Further, the Guide also explains why the node-ID should not be used at the application layer (excepting special scenarios that are not related to this discussion).

In UAVCAN v1, your data type would look as follows:

float32 pressure_difference  # [pascal]

Or, since it is just a physics thing, you could just use uavcan.si.unit.pressure.Scalar.1.0 to the same effect.

Okay, but if there is no data type identifier, then how does your sensor know how to publish the data, and how does your subscriber (e.g., the autopilot) know where to look for this data? The nodes are configured at the time of their integration into the system. UAVCAN v1 fixes the so-called syntax-semantic entanglement problem that was, by far, the worst offender among the design deficiencies present in v0.

UAVCAN v1 offers port-identifiers as a replacement, but they cannot be set statically at the data type definition level, as explained in the linked resources. Exceptions are given for some standard data type definitions, but only that — a vendor cannot define a data type with a fixed identifier. Section “Basic concepts” of the Specification explains why.

This means that the user of your differential pressure sensor cannot possibly integrate the unit into the system without first configuring the subject-ID at which its data should be published; said configuration is also performed via UAVCAN using the Register Interface (no need to craft additional user interfaces). Did you miss this point?
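To make the integration-time assignment concrete, here is a minimal sketch in plain Python. It does not use any real UAVCAN library: the `Bus` class, the sensor class, and the subject-ID values 2000 and 2001 are hypothetical, and the register name `uavcan.pub.differential_pressure.id` is assumed to follow the standard register naming pattern. The point it illustrates is that two identical sensors are told apart only by the subject-ID the integrator writes into each node's registers, not by a field in the data type.

```python
from collections import defaultdict


class Bus:
    """Toy in-process stand-in for a network (hypothetical, not a real UAVCAN API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, subject_id, callback):
        self._subscribers[subject_id].append(callback)

    def publish(self, subject_id, value):
        for callback in self._subscribers[subject_id]:
            callback(value)


class DifferentialPressureSensor:
    """The firmware is identical for every unit; only the register value differs."""

    REGISTER = "uavcan.pub.differential_pressure.id"  # assumed register name

    def __init__(self, bus, registers):
        self._bus = bus
        # The subject-ID is read from configuration, never hard-coded in the type.
        self._subject_id = registers[self.REGISTER]

    def publish(self, pascal):
        # The payload carries only the physical quantity, no sensor_id field.
        self._bus.publish(self._subject_id, pascal)


bus = Bus()
# Integrator's choice: arbitrary example subject-IDs.
left = DifferentialPressureSensor(bus, {DifferentialPressureSensor.REGISTER: 2000})
right = DifferentialPressureSensor(bus, {DifferentialPressureSensor.REGISTER: 2001})

received = defaultdict(list)
bus.subscribe(2000, lambda v: received["left"].append(v))
bus.subscribe(2001, lambda v: received["right"].append(v))

left.publish(12.5)
right.publish(-3.0)
print(dict(received))
```

Swapping the two register values swaps the roles of the sensors without touching the data type or the firmware, which is the decoupling described above.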

It is also possible to automate the port-ID assignment in certain scenarios, although this is expected to be of limited utility in general. There is an unfinished proposal that can be resurrected if there is interest.

If you are surprised by this, please stop now and read this discussion before continuing, because it is absolutely instrumental for us to have a sensible conversation:

To have a hands-on experience with the computation graph offered by UAVCAN v1, please run this Python demo on your local computer (works best with GNU/Linux): https://pyuavcan.readthedocs.io/en/stable/pages/demo.html. Then also run the DS-015 servo demo to see the embedded side of the same problem.

I hope the questions that I quoted at the beginning of this section are now answered.

No more sensor nodes

UAVCAN v1 is built to enable (hard) real-time distributed computing. Distribution enables one to construct more complex systems using less complex individual parts. This point, in essence, mirrors the old debate about the differences between monolithic and microkernels in OS design, which I presume most are well-familiar with. If you want a more in-depth discussion of this aspect, refer to the Guide.

DS-015 takes advantage of the capabilities offered by v1, bringing the specifics of one particular domain (drones) together with UAVCAN such that the tasks that are pervasive in this domain are addressed consistently in a way that is easy to standardize around.

While plug-and-play is, generally, in the scope of DS-015, one should not expect it to be as extensive as in v0. Since we are talking about a distributed system, expecting it to be entirely plug-and-play is akin to expecting the autopilot firmware to write itself.

Similar principles of distribution and compartmentalization stand behind ARINC 653, which also enables one to construct independently certifiable components, simplifying the maintenance and upgrade of the system. The benefits of such architectures do not necessarily need to be confined to safety-certifiable systems only, of course.

DS-015 assumes that every component is an independent agent that works in collaboration with its peers, such that the role of the underlying UAVCAN v1 is similar to IPC. An existing piece of COTS UAVCAN v0 drone hardware can be made DS-015 compliant by merely swapping its software, without the need to alter the hardware (excepting, perhaps, some marginal scenarios I am not aware of). However, the software does become more complex, which is acknowledged. I presume that vendors of low-cost drone hardware who don’t have access to adequate software development expertise won’t be able to pull it off unless we provide them with extensive support, perhaps in a fashion similar to AP_Periph. The UAVCAN Consortium is well-equipped for this type of collaboration.

The following quote from an adjacent topic reveals critical misunderstanding:

There are several issues:

UAVCAN v1 does not really say anything about sensor nodes, because it is below the layer of abstraction where this distinction makes sense. In UAVCAN v1, there are just nodes. You may call them sensor nodes if you want, this is fine.
DS-015 models physical processes and subsystems instead of sensors. Calling a DS-015 node a “sensor node” is like calling a public Java method a “subroutine”: it is either wrong, or it is evidence of poor design. Idiomatic DS-015 assumes that you hide your sensor behind a higher-layer abstraction. The airspeed estimation with the IAS/CAS debate is a good example of this distinction.

We are equipped now to address the specific problems listed in the OP post:

Do you also need to analyze the raw current measurements made by the ESCs? Unless you have any highly non-trivial requirements I am not aware of, I don’t expect the need to analyze the raw data of every sensor to persist once you adopt the distributed mindset.

The logging aspect is handled by UAVCAN itself, which assumes publishing diagnostic data (such as internal states or informational messages) at a low priority level. We will talk about this more in the thread about the transition from v0 to v1.

This is not really a deficiency of DS-015, but rather a case of incompatible requirements. More on this in the next section.

If this is manageable for the autopilot, then it is manageable for the air data computer node as well.

Your requirements are incompatible with idiomatic DS-015

I understand that what you are looking for at this moment has little to do with idiomatic DS-015. While I am confident that sooner or later you will acknowledge the benefits of distributed architecture outside of the most trivial scenarios, at the moment you need a simpler solution. To this end, @dagar has suggested a middle-ground solution that is mostly valid and can be implemented without causing undue fragmentation of the ecosystem.

A basic airspeed sensor node (sic!) can be easily implemented using the physics data types provided by DS-015 and the standard uavcan.si namespace. This does not agree with idiomatic DS-015 but it will be built on the same basic foundations and opens a solid path for an eventual transition to DS-015 for adopters who find value in it.

Taking our airspeed sensor node (that is, taking the low-level approach as opposed to the alternative offered by DS-015) as an example, we could conceivably make it publish on the following subjects:

| Subject | Type |
|---|---|
| `differential_pressure` | `uavcan.si.unit.pressure.Scalar.1.0` |
| `outside_air_temperature` | `uavcan.si.unit.temperature.Scalar.1.0` |
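As a rough illustration of what the low-level approach leaves to the subscriber, the sketch below describes the two subjects as plain data and derives an indicated airspeed on the receiving side. Everything here is an assumption for illustration: the `Subject` record, the subject-IDs, and the use of the incompressible-flow approximation v = sqrt(2·Δp/ρ₀) with the ISA sea-level density. It is not taken from DS-015 or from any autopilot code.

```python
import math
from dataclasses import dataclass

SEA_LEVEL_AIR_DENSITY = 1.225  # [kg/m^3], ISA standard sea-level value


@dataclass(frozen=True)
class Subject:
    """Hypothetical description of one published subject."""
    name: str
    dsdl_type: str
    subject_id: int  # assigned by the integrator, arbitrary here


AIRSPEED_NODE_SUBJECTS = [
    Subject("differential_pressure", "uavcan.si.unit.pressure.Scalar.1.0", 2000),
    Subject("outside_air_temperature", "uavcan.si.unit.temperature.Scalar.1.0", 2001),
]


def indicated_airspeed(differential_pressure_pa: float) -> float:
    """Low-speed approximation: v = sqrt(2 * dp / rho0), in metres per second.

    Negative readings (sensor noise around zero) are clamped to zero.
    """
    dp = max(differential_pressure_pa, 0.0)
    return math.sqrt(2.0 * dp / SEA_LEVEL_AIR_DENSITY)


# A differential pressure of 61.25 Pa corresponds to roughly 10 m/s.
ias = indicated_airspeed(61.25)
print(f"{ias:.2f} m/s")
```

In the idiomatic DS-015 arrangement this calculation would live inside the air data computer node; in the low-level arrangement shown here it stays in the subscriber.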

You could also take the more complex types from the physics namespace. We could also alter them or define new ones. We are entirely open to extending the reg.drone.physics namespace to suit your requirements.

There is another alternative that is not mutually exclusive with the above. Your usage of UAVCAN does not really allow you to leverage the architectural advantages it offers — you are treating it as a point-to-point, star topology network (I am talking about the application layer topology here) to mimic I2C/SPI. In this case, we could also consider defining an additional profile next to reg.drone specifically for tunneling I2C, SPI, and possibly other protocols over UAVCAN. This way you could plug UAVCAN v1 as a backend for one of your I2C/SPI sensor drivers, which I suppose should be relatively easy to do.

Here is a call to action for you: please define a list of data objects that the airspeed sensor node has to publish and subscribe to. Then we will work together to make this design aligned with the DS-015 type system. It won’t be idiomatic DS-015, but at least it will rest on the same foundation using the same type library, opening the path for future convergence. I would like you to postpone forming premature opinions about the results of this experiment until it is concluded.

Other applications can benefit from idiomatic DS-015

As I wrote in my last email, the UAVCAN Consortium is inclined to fund work on implementing the support for idiomatic DS-015 on the ArduPilot side. The first step might be focused on supporting DS-015 actuators. I would like to gauge your opinion on this and whether you would be open to accepting high-quality contributions to this end. I do not expect this work to burden the core dev team beyond reviewing pull requests.

Meta: about this discussion

I would like to invite everyone to keep the conversation constructive and strictly on-point. Please desist from posting anecdotes and “+1”-style responses that do not add new information. Also, I would kindly ask everyone to avoid sharing opinions about DS-015 or UAVCAN v1 until you have at least a basic understanding of what these are.

We aspire to somewhat higher standards of discourse than some of the newly registered users might be used to. If this seems new, consider reading the FAQ.

coder_kalyan · May 4, 2021, 8:55pm

Hello @pavel.kirienko, thanks for the lengthy and clear explanation. I’d like to summarize a few things that we might want to prioritize going forward in the community:

UAVCANv0 versus v1

There is a lot of software (and more importantly, hardware, as it’s easier to scrap software than hardware) that exists that was designed to run UAVCANv0. A lot of this does not translate to idiomatic UAVCAN v1/DS-015, but I don’t think we should respond by deprecating all of this hardware and telling people to completely rethink their network architecture overnight.

There is another alternative that is not mutually exclusive with the above. Your usage of UAVCAN does not really allow you to leverage the architectural advantages it offers — you are treating it as a point-to-point, star topology network (I am talking about the application layer topology here) to mimic I2C/SPI. In this case, we could also consider defining an additional profile next to reg.drone specifically for tunneling I2C, SPI, and possibly other protocols over UAVCAN. This way you could plug UAVCAN v1 as a backend for one of your I2C/SPI sensor drivers, which I suppose should be relatively easy to do.

Perhaps one layer higher than a raw protocol tunnel would be preferable as it would probably reduce bandwidth waste and allow the node to clean up some of the useless hardware-specific implementation details, but I agree that having a lower-level-than-DS-015 solution alongside DS-015 will help especially with the migration process from UAVCAN v0 idioms that I mentioned in the previous point.

Another consideration is that despite UAVCANv1 being designed as a facilitator for application-level, abstract distributed computing, there are many scenarios where people are simply looking to use the physical CAN transport to solve physical layer deficiencies of I2C/SPI/UART. I personally think that the transport layer UAVCAN protocol is flexible enough to be applied to at least a subset of these use cases, so I think developing an application level spec or at least idioms that enable this is worthwhile. Part of this confusion comes from the legacy name of “CAN” inside “UAVCAN,” but that’s something we have to live with. There are also specific situations where the costs (one possible example being an air_data_computer) of implementing distributed computing outweigh the benefits, which I think also presents the need for a low level spec.

APCAN

I understand that what you are looking for at this moment has little to do with idiomatic DS-015. While I am confident that sooner or later you will acknowledge the benefits of distributed architecture outside of the most trivial scenarios, at the moment you need a simpler solution. To this end, @dagar has suggested a middle-ground solution that is mostly valid and can be implemented without causing undue fragmentation of the ecosystem.

I agree with @dagar’s points; DS-015 could do with some improvements. While I agree with the architectural goals of DS-015, the standard neglects some of the realities of the current state of vehicular systems, and not all of these can be simply brushed under the rug in the name of “better architecture.” A compromise can be made, however. For instance, a lower-level message set could accompany the higher-level one, or the covariance types could be made optional.

One such standard may be crafted by the ArduPilot team independently from anyone else (although I would gladly offer my best advice if they welcome it); we will be referring to this hypothetical entity as “APCAN”.

I don’t believe that developing another high-level spec is beneficial to the community. It encourages the “fork when you disagree instead of improving” mindset, leads to software and hardware fragmentation, and further widens the architectural gaps between the Ardupilot community and the PX4/Dronecode community, which ultimately hurts users of both platforms.

PnP

While plug-and-play is, generally, in the scope of DS-015, one should not expect it to be as extensive as in v0. Since we are talking about a distributed system, expecting it to be entirely plug-and-play is akin to expecting the autopilot firmware to write itself.

I think this is one point on which I disagree with @pavel.kirienko. I agree that it is very ambitious and perhaps impossible to make a 100% plug and play system while upholding the current design principles of UAVCANv1. However, I don’t think enough emphasis is being placed on pnp. As it currently stands, UAVCANv1 is only really situated to be leveraged by an experienced vehicular system/network designer. I think that UAVCAN has a scope bigger than that, and it would significantly stunt adoption to continue this mindset. Users of both PX4 and Ardupilot have enjoyed a very plug-and-play friendly architecture for many years now (being able to grab both supported and community-maintained flight controller boards, wire up sensors and peripherals to the ports, and get most equipment to “just work” with relatively little manual configuration), which is one of the reasons that someone can get involved with building autonomous vehicles without knowing the intricacies of how everything works. If, suddenly, every actuator, battery management system, external smart ADC/GNSS/etc required hours of reading its documentation in order to get a system up and running, there would be much less incentive to use it.

I think the spec and implementation of pnp should stay independent of both DS-015 and “APCAN” if that comes to be, and I’m interested in pursuing this further.

Usage semantics

I would appreciate it if @pavel.kirienko could improve the documentation regarding how message data definitions (DSDL) differ from the semantics of how the subject is being used. I hope that at least some of this can be automated with pnp in the future, but using message data type names to represent the use of a subject, while perhaps the most obvious solution, is not very flexible.

P.S:

Then also run the DS-015 servo demo to see the embedded side of the same problem.

@pavel.kirienko This link seems to be broken?

pavel.kirienko · May 5, 2021, 10:24am

A post was merged into an existing topic: Handling mixed v0/v1 and Classic CAN / CAN FD networks

pavel.kirienko · May 5, 2021, 11:17am

Sure, but observe that many hardware products can support DS-015 by means of a mere software update. This should be the preferred way forward. Where this is impractical, non-idiomatic DS-015 (as suggested by Daniel) remains available as the alternative.

This is what the non-idiomatic DS-015 option is about though — if you must carry raw measurements, you can use the data types defined by DS-015 (or even UAVCAN v1 itself, see uavcan.si) directly. I hope we can construct an example airspeed sensor node with Andrew to demonstrate that this approach is viable.

I think you are overestimating the amount of manual configuration required to get a piece of basic hardware to work. If you open the servo demo (the link is valid btw, you just need to accept the invitation I just shared), you will see that connecting it to another system (which could be an autopilot or your laptop) only takes assigning a few numbers. This is comparable to configuring a slightly unconventional GNSS unit.

Regardless, we can resurrect the existing limited auto-configuration proposal that I mentioned, I just don’t think it should be a priority right now because we don’t have anything to apply it to.

Section “Semantic segregation” of the Guide is dedicated to this problem. I think it should help one understand the core idea by viewing subjects as objects (class instances) in an OOP program, and DSDL types as classes.

You define a class to model a concept, then you apply it to a concrete problem by creating an instance. The same goes for DSDL data types and subjects: you have a data type that models the kinematic state of a body. Then you create a subject of this type to model the kinematic state of a particular mechanism on your robot.

“Using message data type names to represent the use of a subject” is valid in a very limited set of scenarios. In OOP these are called singletons.
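The analogy can be written out as a small Python sketch. The names below (`KinematicState`, `SubjectInstance`, the subject-IDs, and the aileron example) are hypothetical and chosen only to mirror the class/instance relationship; they are not DS-015 definitions.

```python
from dataclasses import dataclass


@dataclass
class KinematicState:
    """Plays the role of a DSDL data type: it models a concept, not a use."""
    position: float  # [metre] or [radian], depending on the mechanism
    velocity: float  # [metre/second] or [radian/second]


@dataclass(frozen=True)
class SubjectInstance:
    """Plays the role of a subject: one concrete use of a data type."""
    subject_id: int  # assigned at integration time, arbitrary here
    purpose: str
    data_type: type


# One data type, several subjects: like one class with several instances.
subjects = [
    SubjectInstance(3000, "left aileron", KinematicState),
    SubjectInstance(3001, "right aileron", KinematicState),
]

for s in subjects:
    print(s.subject_id, s.purpose, s.data_type.__name__)
```

Inferring the purpose from the type name alone would only work if each type had exactly one subject, which is the singleton case mentioned above.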