Big-endian vs. little-endian in the context of bit-level encoding

A minute of weird questions and low-profile self-doubt.

I would like everyone who cares about bit layouts and also @scottdixon and @kjetilkjeka (even if they don’t) to squint at the following diagrams and see if one of them seems more sensible than the other.

This diagram is titled little endian:

This one is big endian:

One can see here that the big-endian format preserves bit-ordering continuity across the byte boundary, making the serialization and deserialization logic somewhat simpler in the general case. The little-endian format makes the general case more complex (notice how the nibble 1110₂ is thrown all the way into the next byte even though it is supposed to be the most significant one), but it is natively supported by most modern microarchitectures (those that support both byte orders are still likely to run an OS or other software compiled for little-endian, so the flexibility doesn’t really help), so byte-aligned primitives can be serialized with a trivial memcpy, no bit twiddling needed.

Preserving the bit-ordering continuity allows one to encode/decode serialized representations by shifting values into/out of a single large arbitrary-precision integer, without the intermediate per-value byte swapping that we are currently forced to do in pyuavcan, for example:

https://github.com/UAVCAN/pyuavcan/blob/master/uavcan/transport.py#L94-L109
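For illustration, here is roughly what the difference amounts to (a sketch with hypothetical helpers, not pyuavcan’s actual API; acc is an arbitrary-precision integer holding nbits serialized bits, most recent bits at the bottom):

def put_be(acc: int, nbits: int, value: int, width: int):
    # Big-endian bit order: any field, aligned or not, is shifted straight in.
    return (acc << width) | value, nbits + width

def put_le(acc: int, nbits: int, value: int, width: int):
    # v0 little-endian byte order: a multi-byte value must first be split
    # into bytes and reordered least-significant-byte-first before its bits
    # enter the stream -- the intermediate step the linked code has to perform.
    while width >= 8:
        acc = (acc << 8) | (value & 0xFF)
        value >>= 8
        nbits, width = nbits + 8, width - 8
    return (acc << width) | value, nbits + width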

Little endian is what we currently specify in v0, so unless we have a really strong argument for changing it, we should stay with little endian.


The fact that the byte order is little-endian confused me while implementing a JavaScript decoder for UAVCAN v0. It would indeed be convenient if the bit order were continuous. But I think I value compatibility with existing firmware and tools more.


A colleague of mine continues to be vexed by the v0 dynamic array layout, which would remain the same in v1.

According to my colleague, given an example type:

uint16[<=16] data

and a payload with a data array length of 14:

73 2B FE 14 0F 73 FE 73 FE 83 FA 9C 06 4B F8

The first octet, 0x73, contains the 5-bit array length in bits 4 - 8 with bits 1 - 3 as the lowest 3 bits in the first index of the array.

The second octet, 0x2B, contains the highest 5 bits of the first index in bits 1 - 5 and the lowest 3 bits of the second index of the array in bits 6 - 8.

This pattern continues until the fourteenth index’s (byte 15) highest 5 bits are taken from bits 1 - 5 of the fifteenth octet, 0xF8 (bits 6 - 8 of byte 15 is padding and is ignored).

In C++, the array length is recovered as follows:

// 0x73 = 0111 0011 b
// 0x73 & 0xF8 = 0111 0000 b
std::uint8_t array_length = (0x73 & 0xF8) >> 3; // 14

The problem is that the length value itself violates the little-endian encoding of integers. Being the first value in the data, it should appear in the LSB of the first byte rather than the MSB. This defect is more evident when considering array lengths that require > 8 bits to encode.

So instead the story should be:

The first octet contains the 5-bit array length in bits 1 - 5 with bits 6 - 8 as the lowest 3 bits in the first index of the array.

The second octet contains the highest 5 bits of the first index in bits 1 - 5 and the lowest 3 bits of the second index of the array in bits 6 - 8.

This pattern continues until the fourteenth index’s (byte 15) highest 5 bits are taken from bits 1 - 5 of the fifteenth octet (bits 6 - 8 of byte 15 is padding and is ignored).

Is this something we should fix in v1?

Regardless, we should update the v0 specification text:

Dynamic array encoding rules are sophisticated; hence it is recommended to review the existing implementations for a deeper understanding.

to actually specify how this has been implemented.

Notice that the payload in the example is about half the length it should be, because each element is two bytes long.

The bits 1-3 contain the highest 3 bits of the value. This holds for the following bytes as well.

Suppose we define a data type per your description, named dsdl.Array.1.0, containing just the one statement uint16[<=16] data; then:

$ uc -v dsdl-gen-pkg dsdl
<...snip...>
pyuavcan._cli.commands.dsdl_generate_packages: Generated package 'dsdl' with 1 data types at '/home/pavel/.uavcan/pyuavcan/v0.2.0/dsdl-generated-packages/dsdl'
$ PYTHONPATH=/home/pavel/.uavcan/pyuavcan/v0.2.0/dsdl-generated-packages python3
>>> import pyuavcan, dsdl
>>> dsdl.Array_1_0()                                                                         
dsdl.Array.1.0(data=[])                                                                      
>>> list(map(lambda x: bytes(x).hex(), pyuavcan.dsdl.serialize(dsdl.Array_1_0(list(range(14))[::-1]))))
['7068006000580050004800400038003000280020001800100008000000']

Hex bytes:

70 68 00 60 00 58 00 50 00 48 00 40 00 38 00 30 00 28 00 20 00 18 00 10 00 08 00 00 00

Binary:

01110000 01101 000 00000000 01100 000 00000000 01011 000 00000000 ...

Binary, split at value boundary:

01110                # 14 elements in the array
000 01101 000 00000  # 13 = 0b1101
000 01100 000 00000  # 12 = 0b1100
000 01011 000 00000  # 11 = 0b1011
000 ...
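To make the layout concrete, here is how the length and the first element can be recovered from the leading bytes by hand (a sketch following the current big-endian bit order; the variable names are mine):

b = bytes.fromhex('7068006000580050004800400038003000280020001800100008000000')

length = (b[0] & 0xF8) >> 3                # top 5 bits of byte 0 -> 14

# The first uint16 element follows: its low byte occupies stream bits 5..12,
# its high byte bits 13..20 (each byte is serialized MSb-first).
low  = ((b[0] & 0x07) << 5) | (b[1] >> 3)  # 0b00001101 = 13
high = ((b[1] & 0x07) << 5) | (b[2] >> 3)  # 0
print(length, low | (high << 8))           # 14 13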

I’d like to emphasize that Scott’s colleague is vexed not by arrays but by bit-level serialization in general. Arrays are not special, they follow the same bit layout principles as any other data type.

Seeing that the interpretation at the beginning of the post is incorrect, the following does not seem to hold:

The problem is that the length value itself violates the little-endian encoding of integers. Being the first value in the data, it should appear in the LSB of the first byte rather than the MSB. This defect is more evident when considering array lengths that require > 8 bits to encode.

The length value is not special. It’s just an integer like any other.

The problem I described at the beginning of this post is not about any internal inconsistency of the bit layout policy (I don’t think there is one, but I could be blind to it); rather, the problem is that the bit order and the byte order are different, which looks odd when you write an encoded value out in binary (because in a positional numeral system the most significant digits are on the left). In v0, the rationale for this difference was that the arrangement minimizes the amount of computation needed to encode or decode a value on a conventional computer, because most of them are little-endian nowadays.

Sorry, that was my typo. I meant to make this uint8[<=16]; the data and the subsequent text were written as if that were the definition.

Sorry, that was my typo.

Okay. The core message of my post is invariant to the element type:

>>> import pyuavcan, dsdl
>>> list(map(lambda x: bytes(x).hex(), pyuavcan.dsdl.serialize(dsdl.Array_1_0(list(range(14))[::-1]))))
['706860585048403830282018100800']
70 68 60 58 50 48 40 38 30 28 20 18 10 08 00
01110      # 14 elements
000 01101  # 13
000 01100  # 12
000 01011  # 11
000 01010
000 ...

I have been perplexed by serialization in CAN for several years now, mostly coming from DBC. In the end, I concluded that thinking in terms of bit ordering within a register does not help me at all, because… there is no such thing; or at least I am not aware of a common instruction set that would let me address bits directly. To get a bit from a memory address, I need to load the data at that address into a register and apply some bit operations to clear the bits I am not interested in anyway. Drawing strings of 0 and 1 just made my head spin even more, because no matter how I did it, it wasn’t simple and/or understandable.

So what usually helps me the most to understand an encoding is example code. @pavel.kirienko, are you saying that changing the serialization would avoid the dance I see here or something else?

Anyway, my 2 cents are to not use strings of 0 and 1 in the documentation and to use some pseudo-programming language instead.

I think switching to big-endian would be a mistake after all, the main reason being that modern computers are little-endian, as I wrote here earlier. We could consider swapping the bit ordering, but I see no immediate gain there. I predict that whatever solution we come up with will be unwieldy, simply because it is in general hard to distribute unaligned fields across byte boundaries.

Nope, I don’t think it’s possible to avoid low-level bit manipulation. As you stated correctly, modern microarchitectures are ill-suited for the task, so we have to live with that.

I guess we could extend the spec with pseudocode, unless there are other proposals. A pull request would be welcome!
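For instance, the spec could describe unaligned field extraction with something like the following sketch (the function and its name are mine, not proposed normative text):

def read_unsigned(buf: bytes, offset_bits: int, width: int) -> int:
    # Extract an unsigned field of 'width' bits starting at an arbitrary bit
    # offset; bit 0 of each byte is the most significant, per the current spec.
    value = 0
    for i in range(width):
        bit_index = offset_bits + i
        bit = (buf[bit_index // 8] >> (7 - bit_index % 8)) & 1
        value = (value << 1) | bit
    return value

# For the uint8 example above: the array length, then the first element.
# read_unsigned(bytes.fromhex('706860585048403830282018100800'), 0, 5) -> 14
# read_unsigned(bytes.fromhex('706860585048403830282018100800'), 5, 8) -> 13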

(after a great pause)

Okay, I have to drive this to ground now.

So, according to my reading of section 3.7.4.2 of the v1 specification (and I believe we didn’t change this between v0 and v1), your example of serializing a dynamic array with a capacity of 16 and a current fill of 14:

list(map(lambda x: bytes(x).hex(), pyuavcan.dsdl.serialize(dsdl.Array_1_0(list(range(14))[::-1]))))
['706860585048403830282018100800']

demonstrates an error, since the first five bits of the first byte should be 01110 (14, the array length; 5 bits = ceil(log2(16+1))) and the remaining 3 bits should be 101 (the lower 3 bits of the first value, 13), giving us 0xAE. This is why, in that same section of the specification, we recommend that the message designer add padding. So your example shouldn’t be correct for a message:

uint8[<=16] a

but would be correct for:

void3
uint8[<=16] a

How do you explain the discrepancy in pyuavcan?

Correct. The logic is the same since the very first RFC. It is also implemented in Libuavcan v0.

True.

Per the intention behind the current spec, this is untrue. The value, which in this example is 13, is serialized first; its bits are then encoded into the message bit stream and split at the byte boundary. So the sequence is as follows:

  1. The original data is: ((uint5) 14, (uint8) 13, (uint8) 12, …, (uint8) 0)
  2. The resulting serialized fields are: (01110₂, 00001101₂, 00001100₂, …, 00000000₂)
  3. Concatenated: 011100000110100001100…00000000
  4. Split into bytes, the last one is zero-padded: 01110000 01101000 01100000…00000000
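The four steps map to a few lines of Python (a sketch of the algorithm, not PyUAVCAN internals):

fields = [(14, 5)] + [(v, 8) for v in range(13, -1, -1)]  # (value, bit width)
acc, nbits = 0, 0
for value, width in fields:
    acc = (acc << width) | value       # steps 2-3: serialize and concatenate
    nbits += width
acc <<= -nbits % 8                     # step 4: zero-pad the last byte
print(acc.to_bytes((nbits + (-nbits % 8)) // 8, 'big').hex())
# -> 706860585048403830282018100800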

The above was a simple case because the field type is uint8. Here is a more complex case where the array is defined as:

uint10[<=16] data

  1. The original data is the same as above: ((uint5) 14, (uint10) 13, (uint10) 12, …)

  2. The resulting serialized fields are:

  • 01110₂ = 14
  • 00 00001101₂ = 13
  • 00 00001100₂ = 12
  • 00 00001011₂ = 11

  3. After being arranged in the little-endian byte order:

  • 01110₂
  • 00001101₂ 00₂
  • 00001100₂ 00₂
  • 00001011₂ 00₂

  4. Concatenated: 01110000011010000001100000000101100…

  5. Split into bytes, the last one is zero-padded: 01110000 01101000 00011000 00000101 100…

Same thing in Python (the DSDL definition is compiled with uvc dsdl-gen-pkg dsdl_src/dsdl):

>>> import pyuavcan, dsdl
>>> list(map(lambda x: bytes(x).hex(), pyuavcan.dsdl.serialize(dsdl.Array_1_0(list(range(14))[::-1]))))
['70681805814048100380c02808018040080000']
>>> bs, = _
>>> bin(int(bs,16))[2:].zfill(len(bs)*4)
'01110000011010000001100000000101100000010100000001001000000100000000001110000000110000000010100000001000000000011000000001000000000010000000000000000000'
>>> 

From Scott

Sticking with the simple example for now:

Given dsdl/VarArray.1.0.uavcan as

uint8[<=16] data
import pyuavcan, dsdl
data = list(range(254,254-14,-1))
print('input : {}'.format(data))

input : [254, 253, 252, 251, 250, 249, 248, 247, 246, 245, 244, 243, 242, 241]

var_bits = pyuavcan.dsdl.serialize(dsdl.VarArray_1_0(data))
bit_string = next(map(lambda x: bytes(x).hex(), var_bits))
print(bit_string)
bytes_array = []
for i in range(0, len(bit_string), 2):
    bytes_array.append(int(bit_string[i:i+2], base=16))
 
# raw value (what goes on the wire in the current implementation)
print('hex: {}'.format(['{:02X}'.format(x) for x in bytes_array]))
print('dec: {}'.format(['{}'.format(x) for x in bytes_array]))
77f7efe7dfd7cfc7bfb7afa79f9788

hex: ['77', 'F7', 'EF', 'E7', 'DF', 'D7', 'CF', 'C7', 'BF', 'B7', 'AF', 'A7', '9F', '97', '88']
dec: ['119', '247', '239', '231', '223', '215', '207', '199', '191', '183', '175', '167', '159', '151', '136']

# This is what is implemented

data_reconstructed = []

# The first 5 bits are the fill
fill = (bytes_array[0] & 0xF8) >> 3
print('fill should be {} is {}'.format(len(data), fill))

# This means the fill field starts at bit 3, which violates LSB rules.
for i in range(1, 15):
    # Each element is then made out of the lowest 3 bits of the previous byte
    # OR'd with the highest 5 bits of the current byte. This means the next
    # element begins at bit 3 which, again, violates LSB rules.
    bte = (bytes_array[i - 1] & 0x7) << 5 | ((bytes_array[i] & 0xF8) >> 3)
    data_reconstructed.append(bte)
print(data_reconstructed)

fill should be 14 is 14
[254, 253, 252, 251, 250, 249, 248, 247, 246, 245, 244, 243, 242, 241]

# This is what we expect

data_reconstructed = []

# The fill begins at bit 1 per LSB format
fill = bytes_array[0] & 0x1F
print('fill should be {} is {}'.format(len(data), fill))

for i in range(0, 14):
    # Each element in the array begins at bit 5 of a byte and then continues
    # from bit 0 in the next byte.
    bte = ((bytes_array[i+1] & 0x7F) << 3) | ((bytes_array[i] & 0xE0) >> 5)
    data_reconstructed.append(bte)
print(data_reconstructed)

fill should be 14 is 23
[955, 895, 831, 767, 702, 638, 574, 510, 445, 381, 317, 253, 188, 68]

The Intel format (little-endian rules):

  1. Bytes are ordered LSB
  2. Bits are ordered LSB
  3. Parts of bytes are ordered LSB – UAVCAN breaks this rule by placing fractional-byte fields in Motorola (big-endian) format. We followed rules 1 and 2 above (so the bits inside those fractional-byte fields are LSB-ordered), but we wrongly selected which part of the byte the fractional-byte field (and the rest of the following data) occupies.

Our experience has been that this implementation is incompatible with all known CAN databases and vendor tools for processing frames.

The intention of the messages I posted above was to show that the current implementations (PyUAVCAN, Libuavcan, Libcanard) are spec-compliant. I did not mean to imply that the currently specified behavior is optimal.

You are saying that the culprit is here:

Do we want to change it? The affected sections of the specification are:

  • 3.7.1.2 Bit and byte ordering
  • 3.7.3.1 General principles
  • 3.7.5 Composite types
  • 4.2.3 Examples

I believe I may have taken on faith something that I’m not able to personally verify. Specifically, my colleague asserted this part of my last post:

I’m not able to find that LSB dictates any bit order. This seems to be left to the realm of network transports, and the only things I’ve found for CAN are:

  1. Vector tools consider MSb (most-significant-bit) numbering to be “inverted” but support this format without difficulty (according to http://vi-firmware.openxcplatform.com/en/master/config/bit-numbering.html)
  2. Most CAN peripherals integrated with LSB architectures expect MSb bit-ordering (based on my personal experience)

Given these two facts, it would seem that changing this part of the specification is a bad idea. What we need to run to ground (and perhaps this is something I need to dive deeper into with the tool vendors that have reported difficulties with our unaligned data) is why generic tools have trouble comprehending UAVCAN data payloads only when they are unaligned. I wonder if Kvaser could shed some light on this?

Also, diagram 3.7.1.2 is very misleading. Being someone who reads diagrams first and text second, I assumed that you were specifying LSb, since bit number 0 is almost always shown as the LSb graphically (see any binary calculator, ever). I believe you should invert the bit numbering in this diagram and the associated text such that “the most significant bit is considered first (‘8th bit’, ‘bit 7’, or ‘the first bit received’)”, and relabel the diagram to start at 7 with the label being “bit numbering”. We should additionally show a wire-level diagram demonstrating that “bit 7” is the first received for a byte (i.e. most-significant bit first), and perhaps point out that the CAN specification itself requires MSb-first in the data field.

I think that as long as we stay with least-significant byte first (which is important for best compatibility with modern CPUs), everything else is a matter of convention so it can be changed freely. I will be standing by for the results of your analysis of bit-level compatibility with other solutions out there.

The bit numbering in the diagram 3.6 is intended to demonstrate how to arrange subsequent fields within a byte. We say that the most significant bit is the first one, index 0. What it means is that if we have a structure like this:

uint2 a
uint3 b
uint1 c

It goes into a byte like this (one symbol – one bit here; x – padding):

aabbbcxx

Instead of like this:

xxcbbbaa
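In code, the two candidate conventions for this three-field example look as follows (a sketch; the field values are arbitrary):

a, b, c = 0b10, 0b011, 0b1

msb_first = (a << 6) | (b << 3) | (c << 2)   # aabbbcxx -- current convention
lsb_first = a | (b << 2) | (c << 5)          # xxcbbbaa -- the alternative

print(f'{msb_first:08b}')  # 10011100
print(f'{lsb_first:08b}')  # 00101110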

If you think it should be shown on the diagram in a more clear way, could you please open a PR in the spec repo?

FWIW, GCC and Clang on AMD64 use little-endian bit order:

#include <stdio.h>

struct A
{
    unsigned a : 2;
    unsigned b : 3;
    unsigned c : 1;
    unsigned d : 16;
};

union U
{
    unsigned bits;
    struct A a;
};

int main()
{
    union U u = {0};
    u.a.a = 3;
    u.a.c = 1;
    u.a.d = 0b1111110011001100;
    printf("%x\n", (unsigned)u.bits);
    return 0;
}

This outputs 3f3323, which is 0b1111110011001100100011; that is, d=1111110011001100, c=1, b=000, a=11.

Although, in Python, numpy.packbits() (and its counterpart numpy.unpackbits()) defaults to the big-endian bit order.
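For instance (the bitorder parameter appeared in NumPy 1.17):

import numpy as np

bits = [0, 0, 0, 0, 0, 1, 1, 1]
print(np.packbits(bits))                     # [7]:   first element is the MSb
print(np.packbits(bits, bitorder='little'))  # [224]: first element is the LSb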

CiA 301 – CANopen application layer and communication profile (3.4 MB)

CANopen is bit-little-endian:

Comparison:

# dsdl/Test.1.0.uavcan
uint10 value
$ uvc dsdl-gen-pkg dsdl_src/dsdl/
$ python3
>>> import pyuavcan, dsdl
>>> [bytes(x) for x in pyuavcan.dsdl.serialize(dsdl.Test_1_0(0x21C))]
[b'\x1c\x80']  # CANopen would have yielded b'\x1c\x02'
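The difference boils down to where the two high-order bits of the uint10 value end up (a sketch, not library code):

value = 0x21C  # 0b10_0001_1100

# UAVCAN: bytes least-significant-first, bits within the stream MSb-first,
# so the two high-order bits land at the TOP of the second byte:
print(bytes([value & 0xFF, (value >> 8) << 6]).hex())  # 1c80

# CANopen: a plain little-endian integer; the two high-order bits stay at
# the BOTTOM of the second byte:
print(value.to_bytes(2, 'little').hex())               # 1c02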

One issue with the little-endian bit order is that it is poorly compatible with the implicit prefix alignment. By “implicit prefix alignment” I mean the scheme described in the topic “Proposed change to variable-length arrays”; note that it is applicable not only to arrays but to unions as well.

Suppose we have a definition containing a variable-length array like this:

void5
uint8[<8] a

Or a definition containing a union of 8 fields:

void5
MyUnion.1.0 a

Denote the void bits v, the implicit prefix bits i, and the value bits a. The current, big-endian rule results in the following layout:

vvvvviii aaaaaaaa ...

Whereas the proposed little-endian rule results in the following layout:

iiivvvvv aaaaaaaa ...

Observe that only the former (i.e., current format) is equivalent to:

iiiiiiii aaaaaaaa ...

Only the current format allows implementations to substitute a pattern like voidX + [u]intY with [u]int(X+Y).
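A sketch of that substitution argument (the values are hypothetical):

PREFIX = 0b101  # a 3-bit implicit array-length/union-tag prefix

# Current (big-endian) rule: void5 + prefix packs as vvvvviii, so the whole
# byte read as a uint8 equals the prefix itself (the void bits are zero):
current_byte = (0b00000 << 3) | PREFIX
assert current_byte == PREFIX          # voidX + uintY acts like uint(X+Y)

# Proposed (little-endian) rule: the prefix lands in the top bits, iiivvvvv,
# so the byte read as a uint8 is the prefix shifted, not the prefix:
proposed_byte = (PREFIX << 5) | 0b00000
assert proposed_byte == PREFIX << 5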

This is not a major advantage, obviously, but the gains of the little-endian format are not high either.

As a reference, here is an implementation of the little-endian format in PyUAVCAN: https://github.com/UAVCAN/pyuavcan/pull/99

But in that section (7.1.3.2) it states:

Octet 1 is transmitted first and octet k is transmitted last. Hence the bit sequence is transferred as follows across the network:

b7, b6, …, b0, b15, …, b8, …

So we do see that CANopen is little-endian for bytes, but if bit 7 is first in the first octet and bit 15 is first in the second octet, how does it follow that CANopen is little-endian for bits?

In the example b = b0 .. b9, the least-significant byte (1st octet: 0x1C, i.e. bits b0..b7 = 0011 1000 listed LSb first) would be transferred as follows:

+------------+---+---+---+---+---+---+---+---+
| bit number | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| value      | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 |
+------------+---+---+---+---+---+---+---+---+

The most-significant-bit was transferred first here.

Let’s pause and first be sure we’re on the same page here to reduce ongoing confusion.