On multi-agent services and design guidelines

The UAVCAN Guide says this:

A service may be designed in such a way where it is provided by a group of identical collaborating agents with strong logical cohesion. In this case, we admit that the means of differentiating said identical agents from each other belong to the business domain, and as such, the service specification should contain the necessary provisions for that. We call such designs multi-agent services. […] As a practical example of a multi-agent service, consider the propulsion system of a quad-plane VTOL: the propulsors creating lift in the hover mode belong to one multi-agent propulsion service, each propulsor being modeled as a separate service agent; the propulsors creating thrust in the forward flight belong to the second multi-agent service. In this example, the service consumers are unaware of which UAVCAN nodes implement which propulsor agent – each agent could be a dedicated ESC/FADEC, or they all could be managed by a single hardware unit, or something in between.

The idea is simple. What is less clear is whether there are any well-defined criteria that justify the use of a multi-agent service architecture over the traditional service design.

To put the question into practical terms, consider the already introduced example of a group of motor drives (ESC) that are controlled synchronously using a shared setpoint message and publish some feedback/status messages. The multi-agent approach prescribes that the feedback messages from all ESCs are to be published on a shared subject, each publisher differentiated by its index in the group. The traditional architecture would designate a dedicated subject for each participant and remove the index from the definition of the message.

In both cases, the subject-ID of the feedback subject may be defined as some function of the setpoint subject and the index to reduce the state space of the configuration options; for example, if the setpoint is published over subject X, and the index of the given ESC is N, in the multi-agent case the feedback would be published over subject X+1; and in the traditional case it would be X+1+N. It follows that the complexity of configuration is a weak differentiator.
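The two mappings can be put side by side in a minimal Python sketch. The function names and the setpoint subject-ID of 1000 are made up for illustration; only the X+1 and X+1+N rules come from the text above.

```python
def multi_agent_feedback_subject(setpoint_subject: int) -> int:
    """Multi-agent case: every group member publishes feedback on X + 1."""
    return setpoint_subject + 1


def traditional_feedback_subject(setpoint_subject: int, index: int) -> int:
    """Traditional case: member N publishes feedback on its own subject X + 1 + N."""
    if index < 0:
        raise ValueError("index must be non-negative")
    return setpoint_subject + 1 + index


X = 1000  # hypothetical setpoint subject-ID
shared = multi_agent_feedback_subject(X)
dedicated = [traditional_feedback_subject(X, n) for n in range(4)]
```

Either way a node needs only X and N to derive everything else, which is why the configuration effort does not separate the two designs.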

It is not obvious if either approach is architecturally beneficial compared to the other. Even if it may be so in the case of ESC specifically, how does one make a sensible choice between the two in other scenarios?

Let me ramble/think aloud a little more here.

One related consideration is that a multi-agent service cannot be easily composed with a non-multi-agent one. This limitation in composability needs to be kept in mind because the larger and more complex the distributed system is, the more likely the limitation is to become a critical growth bottleneck.

Suppose a multi-agent group member publishes over subject S a message M that includes its index and the observation itself. Suppose that the objective is to set up another networked service that would be consuming the observation. In this case, seeing as M includes the index and multiple agents share S for their individual observations, it would be impossible to directly compose the services.

In a traditional architecture this would not be a problem because if S is not shared, the consumer can subscribe to it directly.
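One way to picture the difference is the rough Python sketch below. All type and function names here are hypothetical and there is no real transport involved; the lists merely stand in for the traffic seen on a subject.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """Generic, index-free observation (what the traditional design publishes)."""
    value: float


@dataclass(frozen=True)
class IndexedObservation:
    """Multi-agent message M: the group index is baked into the type."""
    index: int
    observation: Observation


def consume(obs: Observation) -> float:
    """An unrelated consumer service that only knows the generic type."""
    return obs.value * 2.0


def demultiplex(msg: IndexedObservation, wanted: int) -> Optional[Observation]:
    """Adapter the multi-agent design forces between shared subject S and the consumer."""
    return msg.observation if msg.index == wanted else None


# Traditional design: the dedicated subject carries exactly what the consumer expects.
dedicated_subject = [Observation(1.5), Observation(2.5)]
direct = [consume(o) for o in dedicated_subject]

# Multi-agent design: S interleaves messages from all agents, so an adapter is needed.
shared_subject = [IndexedObservation(0, Observation(1.5)),
                  IndexedObservation(1, Observation(9.0)),
                  IndexedObservation(0, Observation(2.5))]
adapted = [consume(o) for m in shared_subject
           if (o := demultiplex(m, wanted=0)) is not None]
```

The adapter is trivial here, but it has to know about the group and its indexing, which is exactly the coupling the consumer was supposed to be free of.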

Ultimately, it may be sensible to rely on both approaches:

  • Elements that are inherently coupled with the specifics of the service and are not composable by their nature may use the multi-agent architecture freely because if they are not composable there is no downside.

  • Elements that are generic enough for composability to matter should use the traditional architecture.

When I speak about composability, I mean that if a particular part of a service models a common entity or process, it should be presented in a way that allows its reuse in a different service that is also involved with the same entity or process. One might see how this bears on the choice between the multi-agent and the traditional design.

As a trivial example to illustrate the point, observe that both an electric drive and a turbogenerator are involved with power conversion between electrical and mechanical systems. Suppose we modeled electrical power as ElectricalPower.1.0:

float32 current
float32 voltage

And mechanical RotationalPower.1.0:

float32 angular_velocity
float32 torque

Then combined both to model the state of a power conversion process:

ElectricalPower.1.0 electric
RotationalPower.1.0 rotation
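To show the reuse in action, here is a small Python mirror of the three definitions above. The field names follow the DSDL; the class name PowerConversion, the helper methods, and the numbers are assumptions made for this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElectricalPower:
    current: float  # ampere
    voltage: float  # volt

    def watts(self) -> float:
        return self.current * self.voltage


@dataclass(frozen=True)
class RotationalPower:
    angular_velocity: float  # radian/second
    torque: float            # newton*meter

    def watts(self) -> float:
        return self.angular_velocity * self.torque


@dataclass(frozen=True)
class PowerConversion:
    """Hypothetical name for the combined type; the post leaves it unnamed."""
    electric: ElectricalPower
    rotation: RotationalPower


# The same composite describes an electric drive (electrical -> mechanical)...
drive = PowerConversion(ElectricalPower(10.0, 24.0), RotationalPower(100.0, 2.0))
drive_efficiency = drive.rotation.watts() / drive.electric.watts()

# ...and a turbogenerator (mechanical -> electrical), with no new types needed.
generator = PowerConversion(ElectricalPower(5.0, 48.0), RotationalPower(300.0, 1.0))
generator_efficiency = generator.electric.watts() / generator.rotation.watts()
```

Neither application needed a type of its own; only the direction in which the numbers are read differs.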

A more shortsighted approach would be to engineer highly specialized types for each specific application without regard for their commonalities. For example, one might do this (a completely made-up example):

uint7 esc_index
bool esc_engaged
float32 angular_velocity
float32 current
float32 voltage

While this seems benign here and now, in the future it has the potential to blow up the ecosystem with message compatibility hell.

stream of consciousness goes on

Observe that the regulated data types are to be the cornerstone and the foundation of the ecosystem because that’s what future applications and hardware will be built upon. We can certainly quickly throw some things together like I did for UAVCAN v0, but that would be just unwise.

We need to apply best practices as outlined in the IDG. By looking at the draft proposal prepared by the DS-015 SIG before I joined the effort, I see that the main issue, when you zoom out enough to see things abstractly, is the lack of orthogonality.

A reusable and scalable design would keep semantically unrelated things away from each other. Let’s view it this way: in a vehicle there is equipment that manages physical processes. One can conveniently view a physical process as the end objective that is facilitated (implemented) by a piece of equipment. Swapping the equipment should not affect the nature of the process, and vice versa.

Reusability and composability require that entities belonging to equipment are kept separate from entities modeling physical processes. Likewise, physical processes that are inherently different or not semantically coupled should be modeled separately.

This is why one cannot simply put RPM (which is actually not permitted; the unit should be radian/second) and, say, an ESC index into the same message. That would run the whole architecture into the ground. For the same reason one would not define a dedicated message type for carrying the torque/position of a servo.
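For equipment that natively reports RPM, the conversion to radian/second is a one-liner. The helper name below is made up for illustration.

```python
import math


def rpm_to_rad_per_s(rpm: float) -> float:
    """One revolution is 2*pi radians and one minute is 60 seconds."""
    return rpm * 2.0 * math.pi / 60.0


angular_velocity = rpm_to_rad_per_s(6000.0)  # 200*pi rad/s, about 628.3
```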

It follows that we have essentially two loosely coupled sub-efforts here:

  • modeling the logical states of equipment.
  • modeling the physical processes.

For the same reason I am envisioning that there will be no message type for GNSS units, with latitude and longitude. Instead, there will be a dedicated position-velocity-covariance type in the physics namespace that will be published by GNSS receivers (and other positioning units). Time is to be published in a separate message (as long as we constrain our models to Newtonian physics, it is to be considered unrelated).

Now, obviously, one might object here that a GNSS receiver emits a ton of additional metadata that has no place in such a purist approach. That’s where the equipment namespace is useful – information that is specific to a particular sensor kind is to be kept there. So we end up with (this is a rough approximation):

  • a generic type modeling the kinematic states (position, velocity, covariance)
  • time (whether the GNSS receiver acts as a time sync master is irrelevant here; either way the time message may be useful on its own)
  • GNSS metadata: satellites, status flags, augmentation, corrections, etc.

(Nuno might object here that at this point nobody cares about GNSS, people need BMS. That’s true but we have to keep a holistic view of things to come up with a sensible design.)

An attempt to provide range/size-optimized generic messages by offering float16 & float32 variants was unsuccessful because the resulting type library turns ugly and redundant very quickly. Most importantly, the 16/32 differentiation renders such optimized interfaces bit-incompatible. Therefore, in the interest of providing a clean architecture, we will be focused on float32 in the first place.
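The bit-incompatibility is easy to see with Python's standard struct module, used here only as a stand-in for a real DSDL serializer. The value 12.5 is arbitrary; it is chosen because it is exactly representable in both formats.

```python
import struct

value = 12.5

as_f16 = struct.pack("<e", value)  # IEEE 754 binary16, little-endian
as_f32 = struct.pack("<f", value)  # IEEE 754 binary32, little-endian

# Same number, different wire images: the sizes do not even match.
sizes = (len(as_f16), len(as_f32))

# A float16 consumer fed the first two bytes of a float32 field decodes garbage.
misread = struct.unpack("<e", as_f32[:2])[0]
roundtrip = struct.unpack("<e", as_f16)[0]
```

A subscriber built against the float16 variant therefore cannot consume the float32 variant, even though both carry the same quantity.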

Now, in v0 multi-frame transfers have been known to cause major performance degradation, but in v1 this is not expected to be the case because our new libraries are equipped with highly efficient O(1)/O(n) data handling pipelines.

We have conflicting requirements:

  • Ensure adequate functionality over Classic CAN (MTU 8 bytes)
  • Ensure future-proof, clean, and scalable architecture.

The main goal right now is to establish the required messaging rates for Classic CAN deployments, which will be limited in their functionality, and based on that derive the optimal trade-off between Classic-CAN-friendliness and the purity of the architecture.
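A back-of-the-envelope estimate of the frame cost can be sketched as follows. The overhead figures are assumptions based on my reading of the UAVCAN/CAN transport (one tail byte in every frame, plus a 2-byte transfer CRC for multi-frame transfers) and are not stated in this post; the function name and the 100 Hz rate are likewise made up.

```python
import math

CLASSIC_CAN_MTU = 8  # bytes per frame, from the requirement above
TAIL_BYTE = 1        # assumed per-frame transport overhead
TRANSFER_CRC = 2     # assumed overhead added to multi-frame transfers only


def classic_can_frames(payload_bytes: int) -> int:
    """Estimate the number of Classic CAN frames needed for one transfer."""
    if payload_bytes < 0:
        raise ValueError("payload size must be non-negative")
    per_frame = CLASSIC_CAN_MTU - TAIL_BYTE  # 7 usable bytes per frame
    if payload_bytes <= per_frame:
        return 1  # single-frame transfer, no CRC
    return math.ceil((payload_bytes + TRANSFER_CRC) / per_frame)


# Two float32 fields (e.g. ElectricalPower) take 8 bytes; the combined type takes 16.
frames_electrical = classic_can_frames(8)
frames_combined = classic_can_frames(16)
frames_per_second = frames_combined * 100  # hypothetical 100 Hz publication rate
```

Under these assumptions even a pair of float32 fields spills into a second frame, which is the kind of cost that has to be weighed against the required messaging rates.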

CAN FD hardware is already there and the existing legacy should not delay the transition significantly.

The value of DS-015 or any other industry-specific standard is in enabling compatible, composable, and extensible complex systems. A design that does not put these capabilities above the resource utilization concerns would defeat the purpose of the standard.

This is not to say that we don’t care about the resource utilization, of course, and we are not blind to the obvious fact that no matter how well architected the standard is, it will not be needed if it cannot be used within the constraints of real-world systems. The challenge is to find the optimal balance.

So the current design seems to be compatible with Classic CAN: