NotEuclid

Data serialization is a very common task, and JSON and XML are probably the most popular formats for it. Both can store essentially arbitrary data. However, that doesn’t mean they are interchangeable.

For example, consider this XML document:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
	<book category="fiction">
		<title>Dragon Flames</title>
		<author>John Doe</author>
		<year>2012</year>
		<price>35</price>
	</book>
	<book category="non-fiction">
		<title>Localisation &amp; Globalisation</title>
		<author>Smith White</author>
		<year>2020</year>
		<price>120</price>
	</book>
</bookstore>

The same data can be encoded in JSON much more succinctly:

{
	"books": [
		{
			"category": "fiction",
			"title": "Dragon Flames",
			"author": "John Doe",
			"year": 2012,
			"price": 35
		}, {
			"category": "non-fiction",
			"title": "Localisation & Globalisation",
			"author": "Smith White",
			"year": 2020,
			"price": 120
		}
	]
}

This looks much better. However, it’s not actually the same data. It may be equivalent for the application, but note the differences:

  • year and price are numbers in JSON. XML doesn’t have numbers: everything is a string.
  • category is an attribute in XML. In JSON it’s a field like everything else.
  • In XML each entry is labelled as a book separately. In JSON only the array is labelled (with a field name).
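
The first point is visible in code: an XML parser hands the application text, and any numeric interpretation happens afterwards. A minimal sketch (using Rust’s standard str::parse; the variable names are illustrative):

```rust
fn main() {
	// What an XML parser yields for <year>2012</year>: a string.
	let year_text = "2012";
	// The application, not the format, decides it’s a number.
	let year: i32 = year_text.parse().unwrap();
	assert_eq!(year, 2012);
}
```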

For a specific application that probably doesn’t matter (though it would be a big pain for a generic converter). But consider a slightly modified example:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
	<book category="fiction">
		<title>Dragon Flames</title>
		<author>John Doe</author>
		<year>2012</year>
		<price>35</price>
	</book>
	<magazine category="non-fiction">
		<title>Localisation &amp; Globalisation</title>
		<editor>Smith White</editor>
		<year>2020</year>
		<price>120</price>
	</magazine>
</bookstore>

The change is minuscule, but how do we reflect it in JSON? One way is to use separate arrays for books and magazines, but if the relative order matters, the only way is to add a special key field, like:

{
	"items": [
		{
			"$type": "book",
			"category": "fiction",
			"title": "Dragon Flames",
			"author": "John Doe",
			"year": 2012,
			"price": 35
		}, {
			"$type": "magazine",
			"category": "non-fiction",
			"title": "Localisation & Globalisation",
			"editor": "Smith White",
			"year": 2020,
			"price": 120
		}
	]
}

This works, but it’s not all that clean anymore, especially given that the field order can be arbitrary, so $type may end up somewhere in the middle.
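
The $type discriminator maps naturally onto a tagged union. A sketch of what an application might decode the JSON above into (the Item type and type_tag helper are hypothetical, not part of any real schema):

```rust
// Hypothetical types mirroring the JSON above: one ordered list,
// with the "$type" field deciding which variant each entry is.
enum Item {
	Book { title: String, author: String },
	Magazine { title: String, editor: String },
}

// The tag a serializer would have to emit alongside the other fields.
fn type_tag(item: &Item) -> &'static str {
	match item {
		Item::Book { .. } => "book",
		Item::Magazine { .. } => "magazine",
	}
}

fn main() {
	let item = Item::Book {
		title: "Dragon Flames".into(),
		author: "John Doe".into(),
	};
	assert_eq!(type_tag(&item), "book");
}
```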

There is more, however. XML isn’t just for data serialization; it can also handle text markup. In fact, that’s its original purpose. Consider:

<review user="Willy Smith" rating="5">
	<p>I’m <em>so</em> glad I’ve read it! It’s great!</p>
	<p>Totally recommend!</p>
</review>

It is certainly possible to store anything in JSON but...

{
	"user": "Willy Smith",
	"rating": 5,
	"text": [
		{
			"type": "paragraph",
			"content": [
				"I’m ",
				{ "type": "emphasis", "content": "so" },
				" glad I’ve read it! It’s great!"
			]
		},
		{
			"type": "paragraph",
			"content": "Totally recommend!"
		}
	]
}

...is it really desirable?

You might think the example is exaggerated, and maybe it is a bit, but the real issue is mixing text and markup. JSON just isn’t designed to handle this.

There is a reason, however, why XML is often considered “legacy”. While it is actually much more flexible than JSON, real-world data rarely needs that flexibility, and it gets in the way instead. Just look at the first example again: if all you need is a bunch of items with specific fields, JSON is perfectly good for it! Only when your data has to be a mixed bag of everything, like an AST or “just” marked-up text, does XML start to gain an advantage, and it may end up being the better choice.

For quite some time I wondered why text formats are so popular while binary formats often struggle. Even for graphics there is PPM, and JSON and HTTP¹ are now ubiquitous. There are very popular binary formats like ELF and JPEG, but they are also very specialized, while less specialized formats like MsgPack and ProtoBuf are far less common.

It has been argued that this is rooted in human capabilities: our pattern recognition works well with text, including text-based formats, yet is totally unsuitable for binary data. While I do agree with that, the core problem is subtler in my opinion.

One part is the human-machine interface (i.e. tooling). Imagine having half of your text files encoded in EBCDIC and some others in word-swapped UTF-32. Suddenly your pattern recognition is of no use, as your text editor just can’t open them, leaving you staring at 41 e2 c3 c9 c9 in a hex editor...

That’s not enough to be a problem though; after all, if we have good tooling for text, why can’t we have the same for binary? The real problem is one detail that is so obvious it is often omitted:

Text is binary.

Text is not an alternative to binary; it’s a binary format of its own. Or rather, a text encoding is: there are many of them, but what they all encode is the very same notion of text, just like there are many image formats but what they encode is the same notion of pixels².
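
This is easy to see in code: a string is just bytes under a particular encoding. A tiny illustration in Rust, whose strings are UTF-8:

```rust
fn main() {
	// "Text" is one particular family of binary encodings.
	// In UTF-8, 'A' is the single byte 0x41 and 'é' is the two bytes 0xC3 0xA9.
	let s = "Aé";
	assert_eq!(s.as_bytes(), &[0x41u8, 0xC3, 0xA9][..]);
}
```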

Now, that may or may not be obvious, but either way it doesn’t look like a problem. The more interesting part is a corollary:

Binary formats (like MsgPack) aren’t siblings of text formats (like JSON); they are siblings of the text format itself, of the venerable ASCII!

This puts things into perspective. Competing with XML and JSON is one thing, but competing with ASCII is another level entirely. No wonder there is no accepted universal binary format: there actually is one, and it’s called “text”.

I think this is why EBCDIC died. It is also probably a major reason Unicode succeeded despite all the pain: while for human-targeted data it may be beneficial to have formats tailored to different needs (scripts), for machine communication the benefits of a single universal format are so enormous that UTF-8 was accepted nevertheless.

¹ Okay, HTTP/2 is not text-based anymore. Except it is, in a sense: it still uses textual key-value pairs; only the wire-level encoding is a bit more complicated.

² For raster image formats only, obviously.

Wikipedia describes a monad in some Haskell syntax. Oh well. But actually, one doesn’t need to know Haskell to understand what a monad is and how it can be used.

Basics

Suppose you have a Future<String>, but that string is actually a decimal integer, so you want to parse it. There are two ways:

  1. Imperative: wait for the Future to resolve, then parse into an i32.
  2. Monadic: apply the parser “right now” getting a Future<i32> as a result.

Both ways can also be used when, instead of parsing, you need something that is itself asynchronous, i.e. returns Future<i32> and not i32; in either case you’ll get a Future<i32> as the result.

While imperative code is generally easier to follow, the monadic version is often leaner on resources. That’s why the modern async/await syntax blurs the distinction a bit: it looks like 1 but works more like 2, as evidenced by the fact that the result (of an async block or function) is a Future and not the computed value itself.

Generalization

So far I was only talking about Future specifically. But the approach can be generalized, and that generalization is what a monad is.

In more concrete terms, a monad M is a generic type M<T> which allows applying functions T -> U and T -> M<U> to itself. As there may not be a T inside at the time, the result can’t be a plain U, so it is M<U> in either case.
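
Both shapes of application can be sketched with Rust’s Option, via its real map and and_then methods (parse_decimal is an illustrative helper, not a standard function):

```rust
// T -> M<U>: a fallible parser returning Option<i32>
fn parse_decimal(s: &str) -> Option<i32> {
	s.parse().ok()
}

fn main() {
	let x: Option<&str> = Some("42");
	// Applying T -> M<U> via and_then (a.k.a. bind)
	let n: Option<i32> = x.and_then(parse_decimal);
	// Applying T -> U via map
	let doubled: Option<i32> = n.map(|v| v * 2);
	assert_eq!(doubled, Some(84));
}
```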

There are many examples of monads. The textbook one is Option (aka Maybe). But while tremendously useful on its own, its monadic nature is somewhat shadowed: sure, one can do map or and_then on it, but one can do if let Some(x) = opt { y += x; } just as well, and that’s not at all monadic. In Haskell, where the very notion of monad originated, one can’t: while one can pattern-match, one has to return something regardless of the match, and there is no imperative stuff like += in Haskell whatsoever, so just going the monadic way may very well be easier.

Composition

Another way to look at monads is in the parlance of monadic composition.

Suppose you have two functions, f(A) -> B and g(B) -> C. You can compose them into a new function f_then_g(A) -> C. However, if what you have is f(A) -> M<B> and g(B) -> M<C>, regular composition doesn’t work as the types don’t match. But monadic composition does!¹ The result is a function f_then_g(A) -> M<C> which effectively applies g to f’s result even though it can’t do that directly.
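
A minimal sketch with M = Option (the functions f and g here are made-up stand-ins: a parser and a guarded division):

```rust
// f(A) -> M<B>: parse a string into a number, or fail
fn f(a: &str) -> Option<i32> {
	a.parse().ok()
}

// g(B) -> M<C>: divide into 100, refusing zero
fn g(b: i32) -> Option<i32> {
	if b == 0 { None } else { Some(100 / b) }
}

// Monadic composition: f_then_g(A) -> M<C>
fn f_then_g(a: &str) -> Option<i32> {
	f(a).and_then(g)
}

fn main() {
	assert_eq!(f_then_g("4"), Some(25));
	assert_eq!(f_then_g("0"), None);    // g rejects zero
	assert_eq!(f_then_g("oops"), None); // f fails to parse
}
```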

Outside of Haskell, though, composition often isn’t spelled out. But it is still there. Consider:

fn download(url: &str) -> Vec<u8> {
	let data = fetch(url);
	let data = decrypt(data);
	let data = decompress(data);
	data
}

It’s actually just a composition of three functions: fetch, decrypt, and decompress. But in many cases that’s not possible as-is; e.g. fetch is likely asynchronous. So the code would be more like this:

async fn download(url: &str) -> Vec<u8> {
	let data = fetch(url).await;
	let data = decrypt(data).await; // maybe it’s hardware accelerated?
	let data = decompress(data).await;
	data
}

And this is actually a monadic composition of the three functions. Indeed, fetch, being async, doesn’t return a Vec<u8> anymore but a Future<Vec<u8>> instead, while decrypt still takes a Vec<u8> as an argument; yet they are essentially composed together.

In the last example, await is the key: it’s what facilitates the composition, and it is also specific to Future. But Future isn’t the only monad out there. The first example was overly simplistic: real functions would return a Result, and one would need to handle that as well, like:

fn download(url: &str) -> io::Result<Vec<u8>> {
	let data = fetch(url)?;
	let data = decrypt(data)?;
	let data = decompress(data)?;
	Ok(data)
}

And this is, again, a monadic composition of the three functions. The monad is io::Result this time. It has its own special syntax (shared by Result and Option), which is so short because it tends to be everywhere in Rust.
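
The io::Result version can be made runnable with stubs standing in for the real work (fetch, decrypt, and decompress below are placeholder implementations, not actual networking or crypto):

```rust
use std::io;

// Placeholder stubs: each could fail with an io::Error in real code.
fn fetch(url: &str) -> io::Result<Vec<u8>> {
	Ok(url.as_bytes().to_vec())
}
fn decrypt(data: Vec<u8>) -> io::Result<Vec<u8>> {
	Ok(data)
}
fn decompress(data: Vec<u8>) -> io::Result<Vec<u8>> {
	Ok(data)
}

fn download(url: &str) -> io::Result<Vec<u8>> {
	// `?` plays the role `.await` played for Future:
	// it unwraps the Ok value or returns the Err early.
	let data = fetch(url)?;
	let data = decrypt(data)?;
	let data = decompress(data)?;
	Ok(data)
}

fn main() {
	assert_eq!(download("hi").unwrap(), b"hi".to_vec());
}
```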

¹ Provided that M is a monad of course.

How this works

There is no direct support for composition in the definition of a monad. But composition is actually pretty easy: if you have f(A) -> M<B> and g(B) -> M<C>, you can just call f and apply g to the result; applying a T -> M<U> function to an M<T> is exactly what a monad allows.

It actually works the other way as well: if you have an x: M<A> and need to apply f(A) -> M<B> to it using composition alone, you can wrap x as h(()) -> M<A> (returning x unconditionally), monadically compose h with f, and call the resulting (()) -> M<B> function to obtain the M<B>.
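
This trick can be spelled out with M = Option (apply_via_composition and parse are illustrative names; and_then again does the monadic composing):

```rust
fn parse(s: &str) -> Option<i32> {
	s.parse().ok()
}

// Applying f(A) -> M<B> to x: M<A> using composition alone:
// wrap x as h(()) -> M<A>, compose h with f, call the result with ().
fn apply_via_composition(x: Option<String>, f: fn(&str) -> Option<i32>) -> Option<i32> {
	let h = move |_: ()| x;                               // h(()) -> Option<String>
	let h_then_f = move |u: ()| h(u).and_then(|s| f(&s)); // (()) -> Option<i32>
	h_then_f(())
}

fn main() {
	assert_eq!(apply_via_composition(Some("42".into()), parse), Some(42));
	assert_eq!(apply_via_composition(None, parse), None);
}
```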

Conclusion

While directly working with monads may be a hassle, they are really useful as building blocks, allowing one to unify such distinct concepts as error handling, asynchronous programming, and more.

Rust has trait-based generics, which are safer and much more convenient than C++ templates. It also has rule-based macros, which are safer and much more powerful than those of C/C++. And as if that wasn’t enough, it also has procedural macros! Those are comparable to C++ templates in power (quite literally: both are Turing-complete), but this time it’s not some pattern matching but a full-fledged program that can directly do arbitrary transformations. Well, not exactly arbitrary: safety checks are still there, so it won’t end up with a missing closing bracket or such. And for those with special needs, pretty much all of that machinery is also available “offline”, essentially as a code generation framework.
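
A tiny taste of the rule-based kind (a minimal sketch; the max2 macro is made up for illustration):

```rust
// A rule-based (macro_rules!) macro: each arm pattern-matches
// on syntax fragments, not on runtime values.
macro_rules! max2 {
	($a:expr, $b:expr) => {
		if $a > $b { $a } else { $b }
	};
}

fn main() {
	assert_eq!(max2!(3, 7), 7);
	assert_eq!(max2!(2.5, 1.0), 2.5); // works for any comparable type
}
```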

Now, of course there are rough edges. Traits, while incredibly flexible, have their limits. And macros only operate syntactically, not semantically; there appears to be no good way to do complex transformations at the semantic level. While trait specialization could help, not only is it not a solution for everyone, it’s also a huge can of worms of its very own. Specialization is actually one of the reasons C++ templates are so hard to work with: anything can be specialized, anywhere. Rust does put a stop to the latter part: as I understand it, specializations are still governed by the orphan rule, so there are only so many places to look for each. The former part is also tamed: as proposed, specialization is strictly opt-in. But I’m still somewhat uneasy.