SI namespace design

As the SI namespace is expected to be the key part of the new standard data type set, we should approach its design especially carefully.

I would like everyone who has any relevant ideas, especially @scottdixon and @kjetilkjeka, to propose their own vision of the uavcan.si namespace either directly here or via pull requests on GitHub (against the staging branch uavcan-v1.0).

My own vision has already been published in the earlier commits here: https://github.com/UAVCAN/dsdl/pull/47

Are you thinking that a sensor emitting samples on multiple ports (i.e. with timestamps) must guarantee the order of each port and/or of each sample?

I think the former is not necessary but the latter is essential to simplifying implementations.

I’m trying to understand how complicated it will be to implement your proposed si scheme where a given node is emitting samples across several ports. We’ve agreed that the scheme is not bus efficient but that we don’t care (bus efficiency is for vendor-specific types). That aside, how efficient is the logic required to assemble the samples from multiple messages across several ports? One thing to consider is the logic for detecting sample “edges”. Below is a quick statechart I drew that imagines what this logic looks like if we assert that samples cannot be interleaved:

I’ll restate the logic in the diagram for taking the “new sample detected” branch in pseudo-code:

if rx.timestamp != sample_time then
	sample_time = rx.timestamp
	raise on_new_sample event

Note that if we allow samples to be interleaved and applications care to assemble old samples then the assembly logic becomes much more complex and would require more memory to buffer multiple samples. This post ignores this possibility.
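The edge detection above can also be written as a small C++ sketch. The class name TimestampEdgeDetector, its method names, and the 64-bit integer timestamp are assumptions made for illustration; none of this is an existing UAVCAN API. As in the post, interleaved samples are not handled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a UAVCAN API): detects sample "edges" by watching
// for a change in the timestamp carried by received messages.
class TimestampEdgeDetector
{
public:
    // Feed the timestamp of every received message.
    // Returns true when the message starts a new sample (the on_new_sample event).
    bool update(std::uint64_t rx_timestamp)
    {
        if (has_sample_time_ && rx_timestamp == sample_time_)
        {
            return false;  // Same sample as before, no edge.
        }
        sample_time_ = rx_timestamp;
        has_sample_time_ = true;
        return true;
    }

    bool hasSampleTime() const { return has_sample_time_; }

    std::uint64_t getSampleTime() const
    {
        assert(has_sample_time_);  // Only meaningful after the first message.
        return sample_time_;
    }

private:
    bool has_sample_time_ = false;
    std::uint64_t sample_time_ = 0;
};
```

Note that the very first message is treated as the start of a sample, which matches the pseudo-code if sample_time starts out invalid.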

This statechart only describes the detection of start/restart conditions based on timestamps. It assumes that a higher layer knows how many ports are required to assemble a full sample and can tolerate dropping partial samples if new samples arrive before older ones or if the older sample is missing port messages. For this logic to be generic we need a way to have a “superclass” that is “timed-si-message”. As we can all agree that a type with a timestamp and then a giant union of all SI types is undesirable, I think the next best thing we can provide is that all timed SI messages have the timestamp first. This helps, but we still need a port router to push the right ports through the right state machines needed to assemble a sample. This component (which I don’t think is part of UAVCAN but should be acknowledged as a possible strategy for dealing with si ports) must be configured by an application to route ports through the appropriate sample collection logic.

I realize now that I’m confused about our new terminology. Do Ports carry Messages or are Ports synonymous with Messages?

image

The multi-port sampler would contain the above timestamp state machine and also logic like this:

struct MySample 
    timestamp = INVALID
    field0 = INVALID
    field1 = INVALID

# storage
current_sample : MySample

do forever

    if timestamp_statemachine.is_raised_on_new_sample() then
        # we have a new sample
        if current_sample.timestamp is not INVALID then
            # Some applications might still want to consume
            # partial data
            deliver_partial_sample_to_application(current_sample)

        current_sample.timestamp = timestamp_statemachine.get_sample_time()
        current_sample.field0 = INVALID
        current_sample.field1 = INVALID

    if transport_receiver.has_message() then
        # we have a new message
        message = transport_receiver.get_message()

        if message is field0_port then
            current_sample.field0 = message
        if message is field1_port then
            current_sample.field1 = message

    if current_sample.timestamp is valid and 
          current_sample.field0 is valid and
          current_sample.field1 is valid then

        # We have a complete sample
        deliver_complete_sample_to_application(current_sample)
        current_sample.timestamp = INVALID

So far I’m not offering opinions. I’m just illustrating what we expect application logic to look like when using the generic SI layer we’re discussing here. Remember that the above “port router” would require the discussed logic for all known node types and so would be quite complex when the application was a master node like a flight-controller. For responses to this post I’m just looking for “yeah, this is basically right” or “no you’re missing something important”.

Per the response in the weekly dev call: Pavel says to see http://docs.ros.org/api/message_filters/html/c++/classmessage__filters_1_1TimeSynchronizer.html for an example of something ROS has that is basically what my “multi-port sampler” box was doing.

Yeah, this is basically what I had in mind.

As you said, I don’t think the message synchronization logic should be a part of the specification; rather, this is a piece of application-specific logic (although we should, perhaps, at least provide some generic implementation recommendations in the specification). We should, however, provide our libraries with decent implementations of multisubject synchronizers. Luckily, the synchronization logic is invariant to the particular data types used, so it should be trivial to implement using available means of metaprogramming (barring inherently limited languages like C), without the need to impose any specific requirements on the synchronized messages (I don’t think it’s necessary to put the timestamp first, because the ordering of the fields concerns only the serialization layer).

As I see it, such a multisubject synchronizer would be instantiated by the user with a set of subject IDs and their corresponding data types. The synchronizer then would expect that the timestamp information is provided in a field bearing a particular name and type (e.g. uavcan.time.Point timestamp); alternatively, we could allow some variance here by letting the user override the default name of the timestamp field per data type, if necessary. The synchronizer then would subscribe to each subject and collect messages; once a full set of messages under the same timestamp is collected, the full set is delivered at once (synchronously) to the application.

It would be desirable to implement additional user-selectable error handling policies (as you mentioned), such as:

  • Allowing the synchronizer to maintain several sets concurrently (the maximum number of concurrent sets is to be configurable), so that it can correctly reassemble message sets even if some of them are interleaved with adjacent sets. For example, let 1 be a message belonging to the first set, 2 be a message belonging to the next set, and so on (assuming 4 subjects per set here):
1112123322334...
    ^    ^ ^
    |    | |
    |    | third set completed
    |    second set completed
    first set completed
  • Allowing the synchronizer to report incomplete sets rather than trying to complete delayed sets retroactively:
1112123322334...
   ^^ ^ ^^ ^
   || | || |
   || | || third set completed properly
   || | both ignored
   || second set reported while incomplete
   |late message ignored
   first set reported while incomplete
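To check that the two policies above behave as drawn, here is a minimal C++ sketch that replays the same message sequence. Everything in it is an assumption made for illustration: the names SetTracker, Outcome and replay are hypothetical, the timestamp is a plain integer, payload storage is left out, and the subject index of each message is simply its arrival order within its set. With several concurrent sets it reproduces the first diagram; with a single set it reproduces the second.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical sketch (not a UAVCAN API). It tracks only *which* subjects have
// reported under each timestamp; storing the payloads is omitted for brevity.
class SetTracker
{
public:
    enum class Outcome { Pending, Completed, EvictedOlder, Ignored };

    SetTracker(std::size_t subjects_per_set, std::size_t max_concurrent_sets)
        : full_mask_((std::uint32_t(1) << subjects_per_set) - 1U),
          max_sets_(max_concurrent_sets)
    {
        assert(subjects_per_set > 0 && subjects_per_set < 32);
        assert(max_concurrent_sets > 0);
    }

    Outcome accept(std::uint64_t timestamp, std::size_t subject_index)
    {
        assert(subject_index < 32);
        bool evicted = false;
        auto it = open_.find(timestamp);
        if (it == open_.end())
        {
            const bool stale = any_closed_ && (timestamp <= newest_closed_);
            const bool full = open_.size() >= max_sets_;
            if (stale || (full && timestamp < open_.begin()->first))
            {
                return Outcome::Ignored;  // Late message for a set we gave up on.
            }
            if (full)
            {
                close(open_.begin());  // The oldest set is reported while incomplete.
                evicted = true;
            }
            it = open_.emplace(timestamp, 0U).first;
        }
        it->second |= std::uint32_t(1) << subject_index;
        if (it->second == full_mask_)
        {
            close(it);
            return Outcome::Completed;
        }
        return evicted ? Outcome::EvictedOlder : Outcome::Pending;
    }

private:
    using Sets = std::map<std::uint64_t, std::uint32_t>;  // timestamp -> bitmask of subjects seen

    void close(Sets::iterator it)
    {
        if (!any_closed_ || it->first > newest_closed_) { newest_closed_ = it->first; }
        any_closed_ = true;
        open_.erase(it);
    }

    Sets open_;
    const std::uint32_t full_mask_;
    const std::size_t max_sets_;
    bool any_closed_ = false;
    std::uint64_t newest_closed_ = 0;
};

// Replays a sequence such as "1112123322334", where each character is the
// timestamp of one message, and returns one outcome character per message:
// '.' pending, 'C' set completed, 'E' older set evicted incomplete, 'I' ignored.
inline std::string replay(const std::string& sequence, std::size_t subjects_per_set, std::size_t max_sets)
{
    SetTracker tracker(subjects_per_set, max_sets);
    std::map<char, std::size_t> seen;  // Messages seen so far per set, used as the subject index.
    std::string out;
    for (const char c : sequence)
    {
        const std::size_t subject_index = seen[c]++ % subjects_per_set;
        switch (tracker.accept(static_cast<std::uint64_t>(c), subject_index))
        {
        case SetTracker::Outcome::Completed:    out += 'C'; break;
        case SetTracker::Outcome::EvictedOlder: out += 'E'; break;
        case SetTracker::Outcome::Ignored:      out += 'I'; break;
        default:                                out += '.'; break;
        }
    }
    return out;
}
```

For the sequence above with 4 subjects per set, replay(sequence, 4, 3) marks completions at the same positions as the first diagram, and replay(sequence, 4, 1) marks the incomplete reports, ignored messages, and the one proper completion at the same positions as the second.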

I imagine the following C++ interface to be an adequate representation of this logic, and I don’t foresee any implementation difficulties here, even if we had to be stuck with C++98 (although things would get a bit ugly):

void callback(const MessageTypeA& foo,
              const MessageTypeB& bar,
              const MessageTypeB& baz,  // (not a typo)
              const MessageTypeC& zoo)
{
    // ...
}

constexpr auto NumConcurrentSets = 3;

auto synchronizer = MultisubjectSynchronizer<NumConcurrentSets>(
    std::make_tuple(MessageTypeA, subject_id_foo),
    std::make_tuple(MessageTypeB, subject_id_bar),
    std::make_tuple(MessageTypeB, subject_id_baz),  // (not a typo)
    std::make_tuple(MessageTypeC, subject_id_zoo)
);
synchronizer.setCallback(callback);

Although the metaprogramming facilities of Rust seem to be quite limited, I don’t anticipate much difficulty there either.

I realize now that I’m confused about our new terminology. Do Ports carry Messages or are Ports synonymous with Messages?

The set of Ports contains both Subjects and Services. A Subject ID is also a Port ID, a Service ID is also a Port ID. As we’re talking only about messages here, we don’t have to use Port at all, for clarity and specificity.

So this was your latest proposal @pavel.kirienko -> https://github.com/UAVCAN/dsdl/tree/f9f1d783191535b844bba9d37c9ebce7351d5c6a/uavcan/si

While I continue to be nervous about bus efficiency, we did discuss this at length in Stockholm and we all agreed to ignore bus efficiency for the SI namespace, so I’ll say nothing more about it.

Your proposal seems to aspire to a system where any si quantity could be described. This is highly desirable as we can easily add any types missed without affecting the existing namespaces. But your proposal isn’t quite formalized enough and mixes some concepts. I think we should agree on a schema and then we only have to propose what values we concretely define in 1.0.

Here’s my proposed si namespace schema:

si/[quantity]/[Scalar|Vector].X.x.dsdl -> [type] [unit]

For example:

si/rotational_velocity/Scalar.1.0.dsdl -> float32 omega
si/rotational_velocity/Vector3.1.0.dsdl -> float32[3] omega

Time Stamps

I propose that we make timestamped versions a separate namespace with the same schema:

si_ts/[quantity]/[Scalar|Vector].X.x.dsdl -> [type] [unit]

and that timestamps always appear first so all si_ts types can be handled generically by looking at the first location in memory for any given si_ts message.
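As a rough illustration of what “looking at the first location in memory” could mean, here is a C++ sketch that reads a leading timestamp from a serialized message without knowing the rest of its layout. The 56-bit width, the little-endian byte order, and the function name readLeadingTimestamp are all assumptions for the sake of the example, not something fixed by this proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption for illustration: a 56-bit little-endian timestamp at offset 0.
constexpr std::size_t TimestampSizeBytes = 7;

// Hypothetical helper (not a UAVCAN API). Reads the leading timestamp of any
// serialized timestamped message, whatever fields follow it.
// Returns false if the buffer is too short to contain a timestamp.
inline bool readLeadingTimestamp(const std::uint8_t* payload, std::size_t size, std::uint64_t& out)
{
    assert(payload != nullptr);
    if (size < TimestampSizeBytes)
    {
        return false;
    }
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < TimestampSizeBytes; ++i)
    {
        // Least significant byte first.
        value |= static_cast<std::uint64_t>(payload[i]) << (8U * i);
    }
    out = value;
    return true;
}
```

The same function would work for a Scalar and a Vector3 alike, which is the point of fixing the timestamp position.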

Would you like to hear a radical brand new idea?
What if we made timestamps mandatory?

SI types are expected to offer a somewhat lower bus utilization efficiency (what an understatement). Suppose we’re carrying a 32-bit float over CAN; on top of that, there are at least 29-bit CAN ID + 8-bit tail byte + some bits of the CAN protocol overhead, which make the payload insignificantly small compared to the overhead, so adding a timestamp won’t affect the bus traffic significantly. Additionally, it is expected that SI types would rarely contain enough useful information by themselves, one would have to rely on message synchronization as discussed above, which is inherently timestamp-based. Further, I just don’t like the idea of having multiple options of the same thing (timestamped or not) when one might be enough.

32-bit values with 56-bit timestamps work well with CAN FD: 32-bit payload + 56-bit timestamp + 8-bit tail byte = 96 bits = 12 bytes, which corresponds exactly to the DLC code 1001₂. Same is true for 3-element vectors: (56 + 32 * 3 + 8) / 8 = 20 bytes, DLC = 1011₂, no padding bytes necessary. Although I’m talking about efficiency again here, and we’ve agreed not to do that anymore…
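The arithmetic can be double-checked with a few lines of C++. The table is the standard CAN FD mapping from DLC to data length; the function names are made up for this sketch, and it only considers single-frame transfers with one tail byte, as in the numbers above.

```cpp
#include <cassert>
#include <cstddef>

// Data lengths selectable by the 4-bit DLC field in CAN FD, indexed by DLC.
constexpr std::size_t CanFdLengthByDlc[16] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 16, 20, 24, 32, 48, 64};

// Frame payload in bytes: timestamp + N values + one 8-bit tail byte.
constexpr std::size_t framePayloadBytes(std::size_t timestamp_bits,
                                        std::size_t value_bits,
                                        std::size_t num_values)
{
    return (timestamp_bits + value_bits * num_values + 8 + 7) / 8;
}

// Smallest DLC whose data length fits the payload (single-frame transfers only).
inline unsigned smallestDlc(std::size_t payload_bytes)
{
    assert(payload_bytes <= 64);
    unsigned dlc = 0;
    while (dlc < 15 && CanFdLengthByDlc[dlc] < payload_bytes)
    {
        ++dlc;
    }
    return dlc;
}

// Bytes wasted because the payload does not match a valid CAN FD data length.
inline std::size_t paddingBytes(std::size_t payload_bytes)
{
    return CanFdLengthByDlc[smallestDlc(payload_bytes)] - payload_bytes;
}
```

A scalar gives framePayloadBytes(56, 32, 1) = 12, which maps to DLC 9 (binary 1001), and a 3-element vector gives 20 bytes, which maps to DLC 11 (binary 1011); paddingBytes is zero in both cases.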

I like the schema proposed by Scott, which upon addition of timestamps would look like:

si/<quantity>/<Scalar|Vector<N>>.X.x.uavcan
    uavcan.time.Point timestamp
    <type> value

However, having just “Scalar” and “Vector” is clearly not enough, is it? We have to introduce extensions for Euler angles, spherical coordinates (latitude/longitude), higher-precision vectors (e.g. for ECEF coordinates), and time (the order of magnitude of time is not going to change anytime soon, so it makes sense to use fixed point (e.g. nanoseconds) for the sake of (sigh) bus utilization efficiency). How do we handle that?

  1. I agree that we should just make timestamps mandatory.
  2. We need to express units in the types somewhere. I had it where you put “value”. Are you opposed to this:
si/<quantity>/<Scalar|Vector<N>>.X.x.uavcan
    uavcan.time.Point timestamp
    <type> value_with_units_as_name

(I’m not sure how to express this in our little DSL we’ve invented)

Alternatively we could require that the type be aliased to a set of si units (derived or otherwise) so we would have something like:

si/rotational_velocity/Scalar.1.0.dsdl -> uavcan.si.units.omega value
si/rotational_velocity/Vector3.1.0.dsdl -> uavcan.si.units.omega[3] value
  3. I need to give more thought to the rest of your questions.

Perhaps it is? For example:

si/angular_position/Vector2.1.0.dsdl
    uavcan.time.Point timestamp
    uavcan.si.units.radians.nano latitude
    uavcan.si.units.radians.nano longitude

We’ve already established that the above lat/long values are binary compatible with a uavcan.si.units.radians.nano[2] value so the above is just “named vector elements”. No?

Where this breaks down is if we have heterogeneous types that are components of a quantity. Does such a thing exist or does such a thing imply a thing too complex to be part of this namespace?

We need to express units in the types somewhere. I had it where you put “value”. Are you opposed to this

I am not opposed to that, even though it seems a bit redundant, since it is already stated that all units are according to the SI. So if a type is defined under uavcan.si.energy.Scalar, it is clear that we mean joules. Being redundant here is not necessarily a very bad thing though.

Should the unit of measurement be used in its plural form or singular? float32 joule or float32 joules?

Alternatively we could require that the type be aliased to a set of si units (derived or otherwise) so we would have something like

Seems too complicated. Let’s go with the first suggestion.

Where this breaks down is if we have heterogeneous types that are components of a quantity. Does such a thing exist or does such a thing imply a thing too complex to be part of this namespace?

I see now that by adding spherical coordinates and such, we are trying to fit too much semantics into the SI namespace. Perhaps, indeed, it is best to leave higher level concepts out of it, in which case simple Scalar and VectorN would be enough, except that we’ll also need higher-precision options for those. For example, how do we represent a vector of float64? WideVectorN? Something else?

Higher-level heterogeneous quantities are definitely needed; the question is, though, whether they are needed in UAVCAN v1.0, or can they wait until UAVCAN v1.1. I am currently mildly inclined towards the latter option. Those would include global positioning data (“global” as in “pertaining to a nearly-spherical celestial body”: longitude, latitude, elevation, etc.), spatial location (position, orientation, velocities), electrical circuit status (voltage and current), inertial readings (acceleration and angular velocities, also integrated), and so on. I don’t feel confident enough in my understanding of the topic to commence any sensible work on it yet.

I think singular is more correct.

I’ve tried to read this a couple of times now. To be honest, the design doesn’t feel great, but I’m struggling to pin down exactly what feels wrong. Let me attempt to put some words on it.

It feels like we’re over-engineering. We’re creating loads of structure to, in the end, barely verify units. And even that in a way that feels clumsy.

What we really want is simplicity, modularity and robustness, but it feels like we’re shooting for modularity + robustness but failing at robustness.

I think the following doesn’t completely feel right either, but would probably be better than what we currently have:

Values

# 
# The type used for dynamic configuration of information flow.
# Check the datasheet of the unit for an explanation of what the field indexes mean.
#
# A unit (non controller) should allow configuration of a subject where this message will be transmitted and received.
# - The transmitted messages are readings (or commands if this unit includes control functionality)
# - The received 
#
# To ensure compatibility, some design rules must be followed when it comes to units.
# Only SI units are allowed. Some frequent questions are answered first, followed by a list of SI units.
# 
# Frequent faults:
# - For rotational velocity `rad/s` is to be used. RPM is not an SI unit and must be avoided.
#
# SI units:
# - meter [m] (length)
# - kilogram [kg] (mass)
# - second [s] (time)
# - ampere [A] (electric current)
# - kelvin [K] (thermodynamic temperature)
# - mole [mol] (amount of substance)
# - candela [cd] (luminous intensity)
#
# Derived units
# - TODO: Fill in relevant

uint26 timestamp_us # ~1 minute overflow

float32[<=20] values

Physical dimensions must be reflected in the type system. Meaning that there must be a dedicated message (or nested) type for kilogram, radian per second, and so on. Putting this all into a union or a large array is hardly sensible because it basically brings us back to the point where it’s no better than just using a raw array of dimensionless numbers. Also, on timestamping, see this: On timestamping

I think what we defined here so far may be hugely inefficient, but it’s effective. Exposure of clearly separated physical quantities allows us to build very modular and robust interfaces. Perhaps a few simple case studies would help us here; these may not be perfect because I made them up on the spot, but they should clarify the objectives.

  • Imagine that you have a robotic arm, and there is a joint motor controlled by a UAVCAN controller. The joint motor controller needs to know the position and the angular velocity of the joint so that it can close the position control loop. You connect the motor controller and a joint position sensor to the bus, and you see that the joint motor controller has two inputs: joint angle, of type uavcan.si.angular_position.Scalar, and joint velocity, of type uavcan.si.angular_velocity.Scalar. The joint position sensor has two outputs of the same types. You assign them proper matching subject identifiers, and you’re done. Network configuration tools can check that there aren’t any type conflicts (some nodes can choose to check that for themselves automatically, too).
  • Imagine that you have an aircraft with a camera gimbal. You want the gimbal to point the camera towards a particular geo coordinate. You supply the coordinate to the controller; the controller picks up the orientation, angular velocities, and the geo location of the craft from the bus using uavcan.si.angular_position.RollPitchYaw, uavcan.si.angular_velocity.RollPitchYaw, and uavcan.si.angular_position.Vector2 (longitude, latitude; this choice of name is questionable). The gimbal computes the required angle relative to the body from the supplied information (difference in geo position and the orientation of the craft). Again, network management tools or nodes themselves can ensure correct type matching.
  • Suppose there is a generic air data computer provided with the inputs from a total air temperature sensor, a static pressure sensor, and a pitot probe. Like in the examples above, correct setup can be verified through the type system.
  • Have a look at the ESC example in the Stockholm summit recap post.

32-bit values with 56-bit timestamps don’t work well with CAN 2.0. They don’t work well at all. Because we don’t fit into one CAN 2.0 frame, we’d have to spill over into another frame, adding two more bytes for CRC. So the overhead would be: 7 bytes timestamp, 2 tail bytes, 4 CRC bytes; the total overhead amounts to 13 bytes for just a single 4-byte float. Uh oh.

I realize that we aren’t supposed to care about efficiency but I’m not sure we’re supposed to not care to such a wild extent.

Kjetil pointed this out at today’s call, so we ended up adding SynchronizedAmbiguousTimestamp, 24 bits wide, with an overflow interval of 16 seconds. This is bad, because overflowing timestamps are ⚠ dangerous, especially those that overflow every 16 seconds.

So, let’s consider for a second dropping the resolution of ambiguous timestamps down to, say, 10 microseconds. This is bad because of this:

But a 16-second overflow period is also bad, so we have to decide which is worse. Given these new developments, I am inclined to accept timestamps with 10 us per LSB.
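For reference, the overflow periods being compared can be computed directly. This is only a back-of-the-envelope C++ sketch; the function name overflowPeriodUs is made up, and it assumes the 16-second figure corresponds to a 24-bit counter at 1 us per LSB.

```cpp
#include <cassert>
#include <cstdint>

// Number of microseconds before a timestamp counter of the given width wraps
// around, for a given resolution in microseconds per least significant bit.
constexpr std::uint64_t overflowPeriodUs(unsigned width_bits, std::uint64_t us_per_lsb)
{
    return (std::uint64_t(1) << width_bits) * us_per_lsb;
}

// 24 bits at 1 us per LSB wrap after about 16.8 seconds.
static_assert(overflowPeriodUs(24, 1) == 16777216ULL, "2^24 us");

// 24 bits at 10 us per LSB wrap after about 167.8 seconds.
static_assert(overflowPeriodUs(24, 10) == 167772160ULL, "10 * 2^24 us");
```

So dropping the resolution to 10 us per LSB stretches the overflow period from roughly 17 seconds to a bit under 3 minutes.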

Another review and the ongoing work on supporting other protocols have revealed issues with the roll-over timestamp type. It is now proposed to switch the SI types back to the large monotonic (non-roll-over) timestamp type and remove the roll-over type from the standard set. Details here: