Data type compatibility assurance

pavel.kirienko · October 16, 2018, 6:32pm

While working on the updated specification I encountered the challenge of updating the principles of data type compatibility. An update is needed because we must take into account the new concept of Port ID and the new CAN ID layout, which lacks space for the major data type version number.

With the new approach to communication, nodes are no longer equipped with reliable methods of detecting data type compatibility problems. The following diagram illustrates the problem both for services and messages.

+-------------+                       +-------------+
|             |      Request vA.0     |             |
|             |---------------------->|             |
| Service     |                       |     Service |
| version A.0 |                       | version B.0 |
|             |      Response vB.0    |Service ID X |
|             |<----------------------|             |
+-------------+                       +-------------+

+-------------+                       +-------------+
|             |                       |             |
| Publisher   |      Message vA.0     |  Subscriber |
| version A.0 |---------------------->| version B.0 |
|Subject ID Y |                       |Subject ID Y |
+-------------+                       +-------------+

Generally, messages and services with dynamically assigned IDs are not affected by this problem. Because a pre-configuration step is required (either manual or, eventually, perhaps in v1.1, automatic), it is up to the configuring entity (a human or a high-level controller) to ensure that the computation graph is conflict-free. Should there be conflicting data types, the configuring entity should be able to map them to different subjects/services (the umbrella term is “port”, pending approval). If a conflict-free configuration turns out to be unattainable, that can always be detected very early, where the cost of error is negligibly low (e.g., while the aircraft is being serviced as opposed to while it’s in the air).

Data types with fixed IDs, however, cannot be remapped, because their IDs are fixed by design. It used to be possible to work around that by choosing the right handler dynamically, since the version information was attached directly to the transfer. That is no longer possible.

A simple solution that came to mind is to add a new version management rule: when the major version number of a data type definition is incremented (i.e., when either bit compatibility or semantic compatibility or both are broken), the static ID of the data type must be changed. When an old major version is removed from the data type set (after a lengthy deprecation period, of course), its static ID can be re-used for a new purpose – either for a newer major version of the same data type, or for a completely different data type.
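
To make the proposed rule concrete, here is a minimal bookkeeping sketch in Python. It is only an illustration: `StaticIdRegistry`, its methods, and the type names and ID values used with it are hypothetical and not part of any UAVCAN tooling. It assumes that a static ID is owned by exactly one (data type, major version) pair at a time and becomes free again once that major version is removed.

```python
from typing import Dict, Tuple

TypeKey = Tuple[str, int]  # (full data type name, major version)


class StaticIdRegistry:
    """Hypothetical helper that tracks which major version owns which static ID."""

    def __init__(self) -> None:
        self._owner_by_id: Dict[int, TypeKey] = {}

    def assign(self, static_id: int, type_name: str, major: int) -> None:
        key = (type_name, major)
        owner = self._owner_by_id.get(static_id)
        # A static ID must not be shared by incompatible definitions.
        if owner is not None and owner != key:
            raise ValueError(
                f"static ID {static_id} is still owned by {owner[0]} v{owner[1]}")
        # All revisions under one major version must use the same static ID.
        for other_id, other_key in self._owner_by_id.items():
            if other_key == key and other_id != static_id:
                raise ValueError(
                    f"{type_name} v{major} already uses static ID {other_id}")
        self._owner_by_id[static_id] = key

    def retire(self, static_id: int) -> None:
        """Free the ID once the old major version has been removed."""
        self._owner_by_id.pop(static_id, None)
```

With this sketch, assigning the ID of major version 1 to major version 2 raises an error, while assigning it to anything else after `retire()` succeeds, which mirrors the re-use clause above.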

Consider sirius_cyber_corp.golgafrincham_b_ark.CryopodStatus as an example:

First we came up with 62000.CryopodStatus.1.0.uavcan. The static ID was chosen and released.
Newer minor versions are released. As minor versions are by definition compatible with each other, nodes can communicate with each other successfully while using definitions with different minor versions, e.g., 62000.CryopodStatus.1.6.uavcan.
A breaking change is required. Some nodes now have to support both the new version and the latest version released under the previous major version number. The new version has to use a different static ID in order to rule out runtime type compatibility conflicts. Both versions can coexist on the same bus conflict-free, and nodes are not burdened with additional runtime checks and extra logic, because the absence of conflicts is ensured statically:
- 62000.CryopodStatus.1.6.uavcan
- 62001.CryopodStatus.2.0.uavcan
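
The check implied by this example can be sketched as a small script over definition file names. The name pattern below is inferred purely from the examples in this post (an optional static ID, a short name, a major and a minor version, and the `.uavcan` extension); the function names are hypothetical, and real tooling may parse definitions differently.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple

# Assumed pattern: [<static ID>.]<ShortName>.<major>.<minor>.uavcan
_DEFINITION_NAME = re.compile(
    r"^(?:(\d+)\.)?([A-Za-z_][A-Za-z0-9_]*)\.(\d+)\.(\d+)\.uavcan$")


class Definition(NamedTuple):
    static_id: Optional[int]
    name: str
    major: int
    minor: int


def parse_definition_name(file_name: str) -> Definition:
    match = _DEFINITION_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unrecognized definition name: {file_name!r}")
    static_id, name, major, minor = match.groups()
    return Definition(
        int(static_id) if static_id is not None else None,
        name, int(major), int(minor))


def find_static_id_conflicts(
        file_names: Iterable[str]) -> Dict[int, List[Tuple[str, int]]]:
    """Return static IDs claimed by more than one (name, major version) pair."""
    owners: Dict[int, set] = {}
    for file_name in file_names:
        d = parse_definition_name(file_name)
        if d.static_id is not None:
            owners.setdefault(d.static_id, set()).add((d.name, d.major))
    return {sid: sorted(pairs) for sid, pairs in owners.items() if len(pairs) > 1}
```

Running `find_static_id_conflicts` on the two file names above reports no conflicts; it would flag ID 62000 if version 2.0 had kept it.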

To summarize:

The presence of a static ID can’t be changed under the same major version number. That is, if a major version was first released with a static ID, the ID can’t be removed until a new major version is released. Likewise, if a major version was released without a static ID, one can’t be added until a new major version is released.
All revisions of a data type that share the same major version number have to use the same static ID. The implication is that the same static ID can’t be used for incompatible definitions.

kjetilkjeka · October 17, 2018, 9:22am

I like the guarantees given for the default subject and the static ID range. But, we can still open a subject in the dynamic DSDL range with a data type that has a default ID in the static data range, right? I don’t see much reason to disallow this.

pavel.kirienko · October 17, 2018, 4:10pm

But, we can still open a subject in the dynamic DSDL range with a data type that has a default ID in the static data range, right? I don’t see much reason to disallow this.

While my arguments for its disallowal are weak, the arguments for the opposite seem even weaker. When not sure, always lean on the side of simplicity (which in this case means disallowal). I wrote about this here: On Subject/Service ID ranges

This means that once you’ve decided to attach a static ID (N.B.: “static”, not “default”) to a data type (either service or message), it will be available only under that ID. Any user of that data type will immediately see that and be relieved of the extra work that dynamic IDs require. If that turns out to be a mistake, the designer of the data type can always fix it by releasing a new major version without the static ID.

kjetilkjeka · October 17, 2018, 4:29pm

While my arguments for its disallowal are weak, the arguments for the opposite seem even weaker. When not sure, always lean on the side of simplicity (which in this case means disallowal). I wrote about this here: On Subject/Service ID ranges

I disagree; we shouldn’t disallow this just for the heck of it. Leaning on the side of simplicity means allowing these data types to work like any other data type, even though they have been assigned a static ID.

pavel.kirienko · October 17, 2018, 4:50pm

Your objection makes me suspect that you have a good use case in mind that wouldn’t be possible if we chose the static way. My reasoning is based mostly on my assessment that the vast majority of sensible data type definitions will fall into one of the following two categories:

Where the ID cannot be predicted statically. For example, sensor data.
Any kind of feature that is always in a one-to-one relationship with its node. For example, node status, node configuration parameters, some kind of über-extended vendor-specific node information. In this case it makes sense to relieve the application from the need to manage these dynamically. It also guarantees compatibility: the user is always certain that the ID is static and that the data types available through it are always of a particular compatible version. Making it changeable removes that guarantee, promoting uncertainty where it’s avoidable (this is why I believe it’s simpler).

Those odd data types that could be considered somewhere in the middle would have to be shoveled into the first category.

kjetilkjeka · October 18, 2018, 9:33am

Your objection makes me suspect that you have a good use case in mind that wouldn’t be possible if we chose the static way.

Not really, I just think this is the worse choice. I think it complicates updating data types, complicates the implementation, and reduces usage flexibility.

In this case, it makes sense to relieve the application from the need to manage these dynamically

Even though some applications might want to open an additional log subject for some reason, other applications don’t need to listen on this subject. The reference libraries already need a way to open an arbitrary type on an arbitrary port, so this adds no extra overhead.

I’m not arguing for making it changeable. It should always be served at the static ID. But it should be allowed to serve it under several dynamic ports additionally. I’m also not arguing that anyone should create nodes that use these alternative ports instead; that would be counter-productive. It should also not be allowed to serve it under static ports other than the port assigned to it; that would be crazy. Additionally, using the default ID should always be (and is) easier than assigning a dynamic ID, and this will be enough to avoid misuse.
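
The constraint described in the paragraph above can be sketched as a simple check. This is only an illustration: `check_served_ports` is a hypothetical name, and because the actual boundaries of the static and dynamic port ranges are not given here, the caller supplies them through the `is_static_port` predicate.

```python
from typing import Callable, Iterable, List


def check_served_ports(static_id: int,
                       served_ports: Iterable[int],
                       is_static_port: Callable[[int], bool]) -> List[str]:
    """Check the ports under which a type that has a static ID is served.

    is_static_port is an assumption of this sketch: it tells whether a port
    number lies in the static range, whose bounds are not defined here.
    """
    ports = set(served_ports)
    problems: List[str] = []
    # The type must always be available at its own static ID.
    if static_id not in ports:
        problems.append(f"type is not served at its static ID {static_id}")
    # Extra ports are fine only if they are dynamic.
    for port in sorted(ports):
        if port != static_id and is_static_port(port):
            problems.append(f"port {port} is a static port not assigned to this type")
    return problems
```

An empty result means the configuration follows the proposal: the static port is always present, and any additional ports are dynamic.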

These are concretely my problems with the disallow approach:

Adding a static ID to a datatype will become a breaking change (needing a major version increment). It previously could have been served under any dynamic ID, now it cannot. We should have the ability to stabilize a datatype without a static port and add the static port later. This might be especially important for vendor-specific data types and vendor static IDs.
An unnecessary separation is created between the two sets of data types. A static type is no longer just like any other type, except that it is served by default under a static ID. This means that we have to create a mechanism in the libraries to hinder a use that’s not dangerous, just arbitrarily disallowed.
Even though we might not see the perfect use case, someone else might. Not seeing a good use case is not a good reason for disallowing it.

pavel.kirienko · October 18, 2018, 1:35pm

Okay, now it makes sense. What I was against is moving static types away from their static ports; using them elsewhere in addition to their static ports shouldn’t be prohibited.

Unless I missed something important, that brings us to the following rules:

A static ID can’t be removed under the same major version number. That is, if a major version was first released with a static ID, the ID can’t be removed until a new major version is released. On the other hand, if a major version was released without a static ID, one can be added later under the same major version.
All revisions of a data type that share the same major version number have to use the same static ID. The implication is that the same static ID can’t be used for incompatible definitions.
Applications are allowed to use types that have a static ID with arbitrary ports, as long as the port defined through the static ID is always available as well. The port ranges have to be respected though.
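
The first two rules lend themselves to a static check over the set of definitions of one data type that are currently available. The sketch below is a hypothetical illustration rather than an existing tool: each definition is represented as a `(major, minor, static ID or None)` tuple, and `check_release_history` is an invented name.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# (major, minor, static ID or None): hypothetical representation of one definition.
Release = Tuple[int, int, Optional[int]]


def check_release_history(releases: Sequence[Release]) -> List[str]:
    problems: List[str] = []
    established: Dict[int, int] = {}  # major version -> static ID, once one is set
    for major, minor, static_id in sorted(releases, key=lambda r: (r[0], r[1])):
        previous = established.get(major)
        if previous is None:
            if static_id is not None:
                # Adding a static ID later under the same major version is allowed.
                established[major] = static_id
        elif static_id is None:
            problems.append(
                f"{major}.{minor}: static ID {previous} removed within major version {major}")
        elif static_id != previous:
            problems.append(
                f"{major}.{minor}: static ID changed from {previous} to {static_id}")
    # The same static ID must not be used by incompatible (different major) definitions.
    majors_by_id: Dict[int, List[int]] = {}
    for major, static_id in established.items():
        majors_by_id.setdefault(static_id, []).append(major)
    for static_id, majors in sorted(majors_by_id.items()):
        if len(majors) > 1:
            problems.append(
                f"static ID {static_id} shared by major versions {sorted(majors)}")
    return problems
```

For example, a history where 1.0 has no static ID, 1.1 and 1.2 use 62000, and 2.0 uses 62001 passes, while dropping the ID in a later 1.x revision or keeping 62000 for 2.0 is reported.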