How-To (de-)serialize integers

Serialization of integer values to/from binary buffers is a common task when dealing with network or socket I/O, so much so that we have a range of standard functions such as ntohs() and ntohl() for converting from network byte order (big endian) to host byte order (big or little endian).

Unfortunately, such functions have two main drawbacks:

  1. They do not exist for all integer types.

  2. They convert C++ values, and do not distinguish between the value as stored in a typed variable and the value as encoded in a binary buffer. This means that forgetting to call one of those functions yields nonsensical values, and the resulting bug is difficult to find.

We take a different approach here. We treat C++ variables as always being in host byte order, so that such confusion cannot occur. We then make serialization into and deserialization from a byte buffer explicit; this is the point at which values are converted to/from network byte order.
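To illustrate the idea, here is a minimal sketch of explicit big endian conversion at the buffer boundary. It is not the liberate implementation; the names put_be() and get_be() are hypothetical and exist only for this example. The sketch assumes 8-bit bytes and two's complement integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper (not part of liberate): write `value` into `buf` in
// network (big endian) byte order. Returns the number of bytes written, or 0
// if the buffer is too small.
template <typename T>
std::size_t put_be(std::uint8_t * buf, std::size_t size, T value)
{
  static_assert(std::is_integral<T>::value && !std::is_same<T, bool>::value,
      "integer types only");
  using U = typename std::make_unsigned<T>::type;
  assert(buf != nullptr);
  if (size < sizeof(T)) {
    return 0;
  }
  U u = static_cast<U>(value);
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    // Most significant byte first, independent of host byte order.
    buf[i] = static_cast<std::uint8_t>(u >> (8 * (sizeof(T) - 1 - i)));
  }
  return sizeof(T);
}

// Hypothetical helper (not part of liberate): read a big endian value from
// `buf` into `value`. Returns the number of bytes consumed, or 0 if the
// buffer is too small.
template <typename T>
std::size_t get_be(T & value, std::uint8_t const * buf, std::size_t size)
{
  static_assert(std::is_integral<T>::value && !std::is_same<T, bool>::value,
      "integer types only");
  using U = typename std::make_unsigned<T>::type;
  assert(buf != nullptr);
  if (size < sizeof(T)) {
    return 0;
  }
  U u = 0;
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    u = static_cast<U>((u << 8) | buf[i]);
  }
  value = static_cast<T>(u); // assumes two's complement for signed T
  return sizeof(T);
}
```

Because the shifts operate on the value rather than on its memory representation, the same code produces big endian output on both little and big endian hosts, and the typed variable never holds a byte-swapped value.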

Note

There is a current trend to encode data on the network in little endian format because a lot of systems either are little endian, or support both modes of operation.

For a number of reasons too long to discuss, this is not the approach taken here. One reason is that historical networking protocols use big endian as the network byte order, and we have to work with those.

Serialization

Given any integer variable, we can serialize it with the liberate::serialization::serialize_int() function.

Serialize an integer
#include <cassert>

#include <liberate/serialization/integer.h>

using namespace liberate::serialization;

char buf[1024];

size_t value = 42; // Any integer type will do
auto used = serialize_int(buf, sizeof(buf), value);
assert(used == sizeof(value)); // For all integer types

Deserialization

Deserialization is the inverse operation, using the liberate::serialization::deserialize_int() function.

Deserialize an integer
size_t result = 0;
auto used = deserialize_int(result, buf, sizeof(buf));
assert(used == sizeof(value));
assert(used == sizeof(result));

Non-Integer Types

Why focus solely on integer types? Largely, this is because for floating point values, there is no universal definition for how to encode them on the wire. Encoding and decoding them is significantly more difficult.

It also turns out that, more often than not, such numbers are not required. Often enough they are the result of, e.g., dividing one integer value by another, which suggests that it may be better to encode those integers rather than the result.
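As a rough sketch of that suggestion, the following encodes a quotient as its two integer operands instead of a floating point result. Everything here is hypothetical and independent of liberate: the ratio struct, the 8-byte wire layout (numerator first, then denominator, both big endian), and the helper names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers: store/load a 32-bit value in big endian byte order.
inline void store_be32(std::uint8_t * buf, std::uint32_t v)
{
  buf[0] = static_cast<std::uint8_t>(v >> 24);
  buf[1] = static_cast<std::uint8_t>(v >> 16);
  buf[2] = static_cast<std::uint8_t>(v >> 8);
  buf[3] = static_cast<std::uint8_t>(v);
}

inline std::uint32_t load_be32(std::uint8_t const * buf)
{
  return (static_cast<std::uint32_t>(buf[0]) << 24)
       | (static_cast<std::uint32_t>(buf[1]) << 16)
       | (static_cast<std::uint32_t>(buf[2]) << 8)
       |  static_cast<std::uint32_t>(buf[3]);
}

// Hypothetical type: the two integers whose quotient we care about.
struct ratio
{
  std::int32_t  numerator;
  std::uint32_t denominator;
};

// Encode both integers; returns bytes written, or 0 if the buffer is too small.
inline std::size_t encode_ratio(std::uint8_t * buf, std::size_t size, ratio const & r)
{
  if (size < 8) {
    return 0;
  }
  store_be32(buf, static_cast<std::uint32_t>(r.numerator));
  store_be32(buf + 4, r.denominator);
  return 8;
}

// Decode both integers; returns bytes consumed, or 0 if the buffer is too small.
inline std::size_t decode_ratio(ratio & r, std::uint8_t const * buf, std::size_t size)
{
  if (size < 8) {
    return 0;
  }
  r.numerator = static_cast<std::int32_t>(load_be32(buf)); // assumes two's complement
  r.denominator = load_be32(buf + 4);
  return 8;
}

// The receiver performs the division locally, in its own floating point format.
inline double to_double(ratio const & r)
{
  assert(r.denominator != 0);
  return static_cast<double>(r.numerator) / static_cast<double>(r.denominator);
}
```

The wire format then contains only integers, and each side is free to compute the quotient with whatever floating point representation it uses.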