Endian Safe Bitfields

Somewhere on the internet you might have stumbled upon this nifty approach to handling endianness in C++: https://twitter.com/delroth_/status/117316970273177602

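#include <arpa/inet.h>  // htonl, ntohl
#include <stdint.h>

struct uint32_be {
  uint32_be() {}
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  // Little endian host: swap on the way in and on the way out.
  uint32_be(uint32_t value) {
    *this = value;
  }
  uint32_be &operator=(uint32_t value) {
    raw = htonl(value);
    return *this;
  }
  operator uint32_t() const { return ntohl(raw); }
#else
  // Big endian host: the in-memory layout already matches.
  uint32_be(uint32_t value) : raw(value) {}
  uint32_be &operator=(uint32_t value) {
    raw = value;
    return *this;
  }
  operator uint32_t() const { return raw; }
#endif

  uint32_t raw;  // always big endian in memory
};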

This is handy because it stores the value in big endian format in memory regardless of the host byte order. You can apply templates and whatnot to implement all your favorite base types. The constructor, assignment operator and conversion operator allow you to use the type (almost) identically to a built-in integer type.

uint32_be foo = 2;
printf("%u\n", foo + 1);
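To make the "apply templates" remark a bit more concrete, here is a rough sketch of what a generic wrapper could look like. BigEndian is a made-up name, and this version swaps htonl for a different technique: it keeps the bytes in an array, most significant first, which avoids the #if at the cost of a small loop. It assumes T is an unsigned integer type.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical generic wrapper (not from the original tweet).
// Assumes T is an unsigned integer type.
template <typename T>
struct BigEndian {
  BigEndian() {}
  BigEndian(T value) { *this = value; }

  BigEndian &operator=(T value) {
    // Write the most significant byte first, whatever the host byte order.
    for (std::size_t i = 0; i < sizeof(T); ++i) {
      bytes[i] = static_cast<unsigned char>(value >> (8 * (sizeof(T) - 1 - i)));
    }
    return *this;
  }

  operator T() const {
    // Rebuild the value from the stored bytes, most significant first.
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
      value = static_cast<T>((value << 8) | bytes[i]);
    }
    return value;
  }

  unsigned char bytes[sizeof(T)];
};

// These typedefs would take the place of the hand-written struct.
typedef BigEndian<uint16_t> uint16_be;
typedef BigEndian<uint32_t> uint32_be;
typedef BigEndian<uint64_t> uint64_be;
```

Because the storage is a plain byte array, the type also has an alignment of 1, which is convenient when it ends up inside packed structures.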

This is really handy when creating structures for accessing encoded data. For example, let's say you wanted to read a UDP header.

struct UdpHeader {
  uint16_be source_port;
  uint16_be destination_port;
  uint16_be length;
  uint16_be checksum;
  uint8_t data[];
} __attribute__((packed));

Awesome! Now you can just reinterpret_cast a buffer and access it through normal member access syntax.
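Here is a small sketch of that cast in action. The uint16_be below is a minimal read-only stand-in stored as two bytes (an assumption made to keep the example self-contained), AsUdpHeader is a hypothetical helper, and the trailing data[] member is left out so the snippet sticks to standard C++.

```cpp
#include <cstdint>

// Minimal read-only stand-in for uint16_be: two bytes, most significant first.
struct uint16_be {
  operator uint16_t() const {
    return static_cast<uint16_t>((bytes[0] << 8) | bytes[1]);
  }
  unsigned char bytes[2];
};

// Same fields as the UdpHeader above, minus the flexible data[] member.
// All members are byte arrays, so there is no padding even without packed.
struct UdpHeader {
  uint16_be source_port;
  uint16_be destination_port;
  uint16_be length;
  uint16_be checksum;
};

// Hypothetical helper: view the first 8 bytes of a packet as a UDP header.
// The caller is responsible for checking that the buffer is long enough.
inline const UdpHeader *AsUdpHeader(const unsigned char *packet) {
  return reinterpret_cast<const UdpHeader *>(packet);
}
```

With a buffer that starts with the bytes 00 35, AsUdpHeader(buffer)->source_port reads as 53 on any host. If the cast makes you nervous about aliasing rules, copying the bytes into a UdpHeader with memcpy is the more conservative route.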

What happens, though, if you have a field that isn't a whole number of bytes wide? You could use C bitfield syntax, but unfortunately bitfield positioning is implementation-defined and endian-dependent. That's a pain!
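To see the problem, take the first byte of an IP header, which packs a 4 bit version and a 4 bit header length. The sketch below puts a native bitfield next to the shift-and-mask equivalent; FirstByteBits, Version and Ihl are names invented for this illustration.

```cpp
#include <cstdint>

// Native bitfields: whether version lands in the high or the low nibble
// is up to the compiler and target, so this cannot describe a wire format.
struct FirstByteBits {
  uint8_t version : 4;
  uint8_t ihl : 4;
};

// Shifts and masks operate on the value, so they give the same answer
// on every host.
inline unsigned Version(uint8_t first_byte) { return first_byte >> 4; }
inline unsigned Ihl(uint8_t first_byte) { return first_byte & 0x0F; }
```

For a first byte of 0x45 the two functions give version 4 and a header length of 5 words everywhere, while the result of overlaying FirstByteBits on that byte depends on the toolchain.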

What we can do is combine the same trick above with anonymous unions and a little template goodness.

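template <int bitpos, int bitsize>
struct BigBitfield32 {
 public:
  BigBitfield32 &operator=(const uint32_t &other) {
    // Build the mask and the shifted value in host order. Converting them to
    // uint32_be puts their bytes in the same (big endian) order as raw.
    const uint32_be mask = ((1u << bitsize) - 1) << bitpos;
    const uint32_be bits = other << bitpos;
    raw = (raw & ~(*reinterpret_cast<const uint32_t *>(&mask))) |
          (*reinterpret_cast<const uint32_t *>(&bits) &
           *reinterpret_cast<const uint32_t *>(&mask));
    return *this;
  }
  operator uint32_t() const {
    // Reinterpret raw as a uint32_be so reading it yields a host order value.
    uint32_be bits;
    *reinterpret_cast<uint32_t *>(&bits) = raw;
    return (bits >> bitpos) & ((1u << bitsize) - 1);
  }

  uint32_t raw;
};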

Note the absence of a default and a copy constructor. C++ is finicky about types with user-defined constructors overlaid inside a union (before C++11 they were not allowed at all). Luckily, the constructors aren't that useful here, because you can't really construct multiple overlaid values in a union anyway! You can add more template goodness to expand it to different word sizes (and little endianness). Here's how you can use it. An IP header has some bitfields.

struct IpHeader {
  union {
    BigBitfield8<4, 4> version_field;
    BigBitfield8<0, 4> ihl_field;
  } __attribute__((packed));
  union {
    BigBitfield8<2, 6> dscp_field;
    BigBitfield8<0, 2> ecn_field;
  } __attribute__((packed));
  uint16_be total_length;
  uint16_be identification;
  union {
    BigBitfield16<13, 3> flags_field;
    BigBitfield16<0, 13> fragment_offset_field;
  } __attribute__((packed));
  uint8_t time_to_live;
  uint8_t protocol;
  uint16_be header_checksum;
  uint8_t source_address[4];
  uint8_t destination_address[4];
  uint32_be options[];  // if ihl > 5.
} __attribute__((packed));
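The BigBitfield8 and BigBitfield16 types used above aren't spelled out, so here is one way they might be written, followed by a cut-down header to try them on. Treat it as a sketch under a few assumptions: BigBitfield, IpHeaderStart and OptionBytes are invented names, the word is stored as a byte array (most significant byte first) rather than swapped with htonl, the aliases need C++11, and a full-width 16 bit field stands in for uint16_be so the snippet is self-contained.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical generalisation of BigBitfield32. T is the unsigned word that
// holds the field. There are no constructors, so it can live in a union.
template <typename T, int bitpos, int bitsize>
struct BigBitfield {
  static_assert(bitpos >= 0 && bitsize > 0 &&
                    bitpos + bitsize <= 8 * static_cast<int>(sizeof(T)),
                "field must fit inside the word");

  BigBitfield &operator=(T value) {
    const uint64_t shifted = static_cast<uint64_t>(value) << bitpos;
    // Keep the bits outside the field, replace the bits inside it.
    Store((Load() & ~Mask()) | (shifted & Mask()));
    return *this;
  }
  operator T() const { return static_cast<T>((Load() & Mask()) >> bitpos); }

  // bitsize one-bits, starting at bitpos.
  static uint64_t Mask() { return (~uint64_t(0) >> (64 - bitsize)) << bitpos; }
  // Read the whole word from the bytes, most significant byte first.
  uint64_t Load() const {
    uint64_t word = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) word = (word << 8) | bytes[i];
    return word;
  }
  // Write the whole word back, most significant byte first.
  void Store(uint64_t word) {
    for (std::size_t i = 0; i < sizeof(T); ++i)
      bytes[i] = static_cast<unsigned char>(word >> (8 * (sizeof(T) - 1 - i)));
  }

  unsigned char bytes[sizeof(T)];
};

template <int bitpos, int bitsize>
using BigBitfield8 = BigBitfield<uint8_t, bitpos, bitsize>;
template <int bitpos, int bitsize>
using BigBitfield16 = BigBitfield<uint16_t, bitpos, bitsize>;

// First 8 bytes of the IP header only.
struct IpHeaderStart {
  union {
    BigBitfield8<4, 4> version_field;
    BigBitfield8<0, 4> ihl_field;
  };
  union {
    BigBitfield8<2, 6> dscp_field;
    BigBitfield8<0, 2> ecn_field;
  };
  BigBitfield16<0, 16> total_length;    // stand-in for uint16_be
  BigBitfield16<0, 16> identification;  // stand-in for uint16_be
  union {
    BigBitfield16<13, 3> flags_field;
    BigBitfield16<0, 13> fragment_offset_field;
  };
};

// IHL counts 32 bit words, so options are present when it exceeds 5.
inline std::size_t OptionBytes(const IpHeaderStart &header) {
  const unsigned ihl = header.ihl_field;
  return ihl > 5 ? 4 * (ihl - 5) : 0;
}
```

Reading works just like the UDP case: cast the packet bytes to the header type and use the fields as integers. Writing goes through operator=, so setting version_field to 4 and then ihl_field to 5 on a zeroed header leaves 0x45 in the first byte, and neither assignment disturbs the other field's bits.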