Merde is not Serde

Amos Wenger: Today we're talking about: merde. It's pronounced 'mer- day.' Okay, this is a pretty fun joke. The title of today's presentation is "merde is not serde." The subtitle is "another take on (de)serialization in Rust," and I have been forced against my will to talk about it now, even though it's not perfect yet.

It's gonna be perfect soon, but it's not yet. Because James did his own take on another serde, another possible serde, and so I had to present mine, which is actually in version 8.1.2. I've been doing some major iterating on this thing, and people have asked me a lot, but "Are you aware that 'merde' in French means poop?"

And yes, yes, I am. This is the logo for the library. It's drawn by Misia. There's going to be a link to her website in the show notes on sdr-podcast.com/episodes, where you can find the slides as well.

And, um... it's what I want serde to be, which is very different from what you want serde to be, James, because I want something that builds fast. I think actually we're aligned on that part, but then I also want a lot of functionality, and I mostly want to deserialize a bunch of JSON, because I have a website.

So I deserialize a bunch of stuff, and I hate the big Codegen, I hate the long compile times, I hate the proc macros. So I did like a quick hack for myself and then things got out of control as they tend to do. So now I'm maintaining a whole ecosystem. It does support deserializing JSON, YAML and MessagePack, just because that's what I was using on my website.

And it does support serializing JSON and not any of the other ones. There's no reason why it doesn't just- I haven't gotten around to -

James Munns: I haven't needed it yet...

Amos Wenger: Someone said, "Hey, can I contribute KDL support?" I was like, "Knock yourself out." I mean, things are still majorly shifting in the crate, as you can tell by the major version number increasing rapidly.

But yeah, it is in production in my website. So if you can pwn my server somehow, cause I forgot about something in merde, then... you know, more power to you.

James Munns: I'm getting my denial-of-service vectors ready.

Built-in value type

Amos Wenger: One big thing that's been bothering me in the serde ecosystem is that everyone is using the serde JSON value type. When you deserialize to something, you don't know the shape of it, you're not sure, it's kind of the Any type of the deserialization world.

You use the serde JSON value, which looks like this. It has null, it has bool, it has a number, string, array, and object. And you don't really get to choose much. This is pretty much set in stone. I don't think they can change it at this point. Like, adding a variant would be breaking, changing the type of variant would be breaking. We're pretty much stuck with this.

Even for formats that are not JSON, you can take a binary format and deserialize it to that. But then, if you deserialize, if there's like a byte slice in there, it's gonna be an array of U8.
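For readers without the slides, here is a rough sketch of the shape being described. It is a stand-in written for this transcript, not serde_json's actual definition: the name JsonLikeValue is made up, Number is simplified to f64 (the real Number type is opaque), and a BTreeMap stands in for the Map type. It also shows the byte-slice point: with no bytes variant, each byte ends up as its own number in an array.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON-shaped "any" value (not serde_json's type).
#[derive(Debug, Clone, PartialEq)]
enum JsonLikeValue {
    Null,
    Bool(bool),
    Number(f64), // simplification: the real Number type hides its representation
    String(String),
    Array(Vec<JsonLikeValue>),
    Object(BTreeMap<String, JsonLikeValue>),
}

/// With no dedicated bytes variant, a byte slice can only become an array of numbers.
fn bytes_to_value(bytes: &[u8]) -> JsonLikeValue {
    JsonLikeValue::Array(
        bytes
            .iter()
            .map(|b| JsonLikeValue::Number(f64::from(*b)))
            .collect(),
    )
}
```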

James Munns: Yeah, I actually have another crate, postcard-dyn, which transcodes postcard data into JSON, and I actually just used serde_json::Value for this, because when you don't know the shape of it, it's an easy, if you have a heap, you can stack items here, and it just works, but... I guess that's the trappings of success, is once it's popular, it becomes very difficult to change.

Amos Wenger: Yeah, I want to make it very clear that part of the reason why I'm able to experiment with merde is that Rust has changed and I don't have any compatibility guarantees. I'm sure that they would like to change some things in serde if they could, but serde v2 is a hard sell in the Rust community. The v1 is a big selling point, so changing it has to have some significant upsides. And this is not what I'm trying to do. I'm just exploring the space, in a very different part of the space than you were, James, so that's why it was funny to me.

One thing that I noticed in that enum is that object has a map type? And I was like: why not just use HashMap? Clearly they are using the base types; the string is just an owned String in Rust. Number, I don't know what Number is hiding, but the thing that is hidden behind Map is whether insertion order is preserved or not, so it's an alias, or like... It's a struct with a hidden impl in there that forwards all the implementations to the underlying thing.

And it's either a BTreeMap, which is ordered, but not by insertion order, or it's an index map, which is like a regular map, but it also keeps the insertion order. And so it iterates by the order by which objects were inserted into it. I'm sure you know what I mean.

James Munns: Yeah, I do think BTreeMap preserves insertion order.

Amos Wenger: No it doesn't! It orders keys, so keys must be comparable, like they must implement Ord, and then when you iterate, it iterates from smallest to largest. But it doesn't preserve insertion order. That's why I was confused. I was like, "Oh, BTreeMap, if it's ordered- no, wait." You can look it up. Now we have time...

James Munns: I'll trust you. I definitely know what you mean with index map, cause we also have that on embedded.

Amos Wenger: Because if BTreeMap preserved insertion order, why would you need index map? Right?

James Munns: But I thought that's why I use BTreeMap instead of HashMap in a lot of places, but... maybe it's just a consistent iteration or- well, yeah. Okay. I'm going to look it up, cause we're talking about it.

Amos Wenger: We are going to look it up. Yeah. that does make sense.

James Munns: Tap, tap, tap, tap, tap. Mm-Hmm.

Actually, we should go to the top of std collections cause they talk about this. "Use BTreeMap when you want a map sorted by its keys." So yeah, I guess you're right- it has a consistent iteration order, but it's going to be sorted by its keys and not in the insertion order.
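The point they just looked up can be checked with a few lines of standard-library code. This is a small sketch; the helper name btree_iteration_order is made up for illustration. Whatever order the keys go in, they come out sorted.

```rust
use std::collections::BTreeMap;

/// Inserts the keys in the order given, then returns them in iteration order.
fn btree_iteration_order(keys: &[&str]) -> Vec<String> {
    let mut map = BTreeMap::new();
    for (position, key) in keys.iter().enumerate() {
        // The value records the insertion position, which BTreeMap ignores for ordering.
        map.insert(key.to_string(), position);
    }
    map.keys().cloned().collect()
}
```

Calling it with "zebra", "apple", "mango" yields the keys in alphabetical order, which is the "sorted by its keys" behavior from the std docs rather than insertion order.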

Amos Wenger: Okay. Yeah, I was confused too 'cause I remember looking at the code base for cargo-dist a lot and it has type aliases for like ordered map versus unordered map. It's good for comparing things because it's going to have a predictable order, as opposed to HashMap, which could resize in the middle, or use a different random seed from one run to the next to avoid denial of service.

In typical SDR fashion, I have 32 slides and we're spending 10 minutes on slide 4. So anyway, it's just using basic Rust types, but it's serde JSON. People kind of standardize on that. You can serialize to a serde JSON value, you can deserialize from a serde JSON value instead of like some input in some markup format.

I decided to make the value type a first class citizen in merde and add things like bytes and add things like copy-on-write.

You can see CowStr, you can see CowBytes and different array and map types. You can see I64 and U64. Because, well, U64 can have larger values than I64, so not everything fits in there.

You can see float is an ordered float, so the whole type is ordered. That's a compromise. But basically, F64 does not implement Ord because you're not supposed to be able to order NaNs, I think?

James Munns: Yeah, exactly.

Amos Wenger: NaN is smaller than NaN is always false, but NaN is bigger than NaN is always false as well, and there's like 4 million NaNs? It's a lot of NaNs.
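The NaN behavior is easy to confirm with the standard library alone. The sketch below uses made-up helper names (compare, sort_floats); it shows that every comparison involving NaN is false, and that f64::total_cmp is one way to get a total order when you need to sort floats. It is not how merde's ordered float is implemented.

```rust
/// Returns (a < b, a > b, a == b) using the ordinary float comparison operators.
fn compare(a: f64, b: f64) -> (bool, bool, bool) {
    (a < b, a > b, a == b)
}

/// Sorts floats with `total_cmp`, which defines a total order over all f64 values.
fn sort_floats(mut values: Vec<f64>) -> Vec<f64> {
    values.sort_by(|a, b| a.total_cmp(b));
    values
}
```

Because partial_cmp returns None for NaN, f64 only implements PartialOrd, and a plain sort() on a Vec of f64 does not compile.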

James Munns: In F64 probably.

CoW all the things

Amos Wenger: A big thing I wanted to do, cause I was like: I'm gonna compromise on a bunch of things. I'm gonna compromise on monomorphization, cause I want better build speeds. I'm gonna compromise on- I don't know, doing dynamic dispatch, which is the same kind of compromise.

One thing we can have for free is just borrow from the input instead of copying to the heap whenever we can, so we have the CowStr and CowBytes types. Every Rust project has their versions of CowStr, because there's a Cow type in the standard library, and you can pass it a reference type... like str, and then if that type implements ToOwned, then you have your pair of types. You have the borrowed type and the owned type, and in this case the borrowed type would be a string slice, so ampersand str, and then the owned type would be the, um, standard library type String with a capital S, which is the owned version of a string.
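As a sketch of the borrow-when-you-can idea, here is the standard library's Cow with str and String as the borrowed/owned pair. This is not merde's CowStr (which, as discussed next, uses a different owned type), and the unescape function is a made-up example.

```rust
use std::borrow::Cow;

/// Returns the input unchanged (borrowed) when there is nothing to unescape,
/// and only allocates a new String when an escaped newline has to be replaced.
fn unescape(input: &str) -> Cow<'_, str> {
    if input.contains("\\n") {
        Cow::Owned(input.replace("\\n", "\n"))
    } else {
        Cow::Borrowed(input)
    }
}
```

The caller can treat both cases as a str, and only the inputs that actually need rewriting pay for a heap allocation.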

But I don't want String to be the owned type I want compact string to be the owned type because if you have a lot of short strings it's kind of silly to do a tiny allocation on the heap and put like 8 bytes there if you have a lot of first names or something.

In the space that it would take to point somewhere in the heap, you can just store the data inline directly. There's a slew of small string crates.

James Munns: Yeah. I was going to say, which one are you using? Because I know there's like eight and they all go back and forth on which one -

Amos Wenger: I'm using the best one obviously, which is compact_str. There's going to be again links in the show notes. And the same thing for bytes, you can do the same trick for bytes. You just need a different crate. It's a crate that was actually based on compact_str. I did a review of all the small string crates a few years ago, and it's out of date. And now the best one is compact_str. So I should update it.

James Munns: Pro tip.

Amos Wenger: As I mentioned, a priority for merde is build speed, because I like to iterate on my website a lot. I interact with a lot of APIs. I interact with the Reddit API, the Patreon API, the GitHub Sponsors API, my own APIs internally. And so I don't want to be spending my whole time compiling serde generated code, which we've already brought up a lot in this podcast. We know that it's an issue.

One of the things that make compiling projects with a lot of serde derived types... it's serde_derive just to be clear. It's not serde itself. It's serde_derive specifically, which I think most people are using. I think very few people are doing manual implementations of serde's Serialize and Deserialize traits if they can help it.

James Munns: And even if you aren't using it personally, if you have dependencies on types that have a serde feature, you're still going to be pulling it in in your dependencies and paying that cost, really.

Amos Wenger: Yes.

No proc macros

Amos Wenger: And proc macros are really hard to cache properly. There have been some experiments to enable caching for proc macro output. If all the inputs are the same, then there's no point in even compiling the proc macro code and running it. It's a long story. We could have a whole episode on that, but there's gains, but it's hard to determine what the actual inputs of the proc macros are.

It could be arbitrary code. And even if you sandbox all the things. You still have to compute cache keys and all that. And at the end of the day, it's really hard to come up with a solution that works for everyone, that actually speeds up build rather than slowing them down. Caching is hard.

James Munns: Especially when you have non-sandboxed items and the fact that proc macros can be side-effectful and are allowed to do things like write or read from disk or make network connections, like Diesel does.

Amos Wenger: But I have used my superpower. I have complained online about it, which let me know the people were already discussing, "What can we do?" And yeah, constraining what proc macros can do, giving them a way to make up their own cache key, and if they get it wrong, it's their fault. There's a lot of things to do. Again, different episode idea, for later.

But in merde, no proc macros, only declarative macros. So this is what serde would look like: you use the Serialize and Deserialize traits and derive macros; both symbols are named the same. Rust namespacing rules are fun. And then you have this attribute on top of your struct, so if you have a struct Point with two fields x and y of type i32, on top of that you just slap pound, hash, octothorpe, whatever, little sharp symbol, and then square brackets, derive, Serialize, Deserialize, Debug.

So, nice thing about that, you can derive serde's traits the same way you can derive debug, which is a built in derive macro.

But in merde, you don't do that because that would be slow, so instead you just have a normal declarative macro.

So first you declare your struct, just as usual, derive Debug, struct Point, two fields x and y of type i32, and then separately, you call merde, colon, colon, derive, or you can import derive into your namespace, and then in there you have this kind of weird DSL, a domain specific language, and you'd say which traits of merde you want to implement.

Maybe you only need to deserialize, you don't need to serialize; just like with serde, you can implement one or the other. So impl, what looks like a tuple, but it's just a list of traits. So impl, open parenthesis, Deserialize, comma, Serialize, close parenthesis, and then for, struct Point, and then a list of fields, so struct Point, pointy brackets, x, y.

You just have to list the fields again. Because it's a declarative macro, not a proc macro, it doesn't see the body of the struct declaration, so you have to list the fields. I'm very happy that you don't have to also repeat the field types, but you do have to give it the field names.
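To make the "list the fields again" idea concrete, here is a minimal sketch of a declarative macro that only receives the struct name and field names, never the field types. It is not merde's macro or its syntax: the ToPairs trait and the impl_to_pairs! macro are invented for this illustration.

```rust
/// Hypothetical trait standing in for a serialization trait.
trait ToPairs {
    fn to_pairs(&self) -> Vec<(&'static str, String)>;
}

/// Hypothetical macro: it sees only the struct name and the field names,
/// so the field types are never repeated in the invocation.
macro_rules! impl_to_pairs {
    (for struct $name:ident { $($field:ident),* $(,)? }) => {
        impl ToPairs for $name {
            fn to_pairs(&self) -> Vec<(&'static str, String)> {
                // One (name, debug-formatted value) pair per listed field.
                vec![$((stringify!($field), format!("{:?}", self.$field))),*]
            }
        }
    };
}

// The struct is declared as usual...
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// ...and the fields are listed a second time for the macro.
impl_to_pairs! { for struct Point { x, y } }
```

In this sketch, listing a field that does not exist on Point fails to compile with the usual "no field" error pointing at the invocation, since the expansion is ordinary field access.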

James Munns: I'm very interested to what happens if these fall out of sync, and I'm sure there are errors that go on.

Amos Wenger: It's actually not that bad. A thing that's great about declarative macros is, sure, it's not DRY. Don't repeat yourself, dry in the acronym, but the error reporting is actually pretty solid. Rust-Analyzer is able to see through the invocation and everything. It's not as awkward to use as I thought it might be.

And, uh, yeah, if those get out of sync, you do get errors. If you specify a field that doesn't exist, it's going to be like, "Well, point doesn't have a field called ' blah'." And if it's missing one, it's going to be missing field. Like when you do a struct literal and you're missing a field, it's just going to say, you're missing field.

It's actually not that bad in practice. I just don't like that you need to repeat yourself. So that's why I was thinking about something more like Codegen. That's actually my last slide, but I was thinking about code generation because now that I have those declarative macros, I could just have like a separate schema files that would generate both the struct definition and also the macro invocation or even just the trait implementations.

James Munns: Yeah, I've looked at that for postcard as well. I think that's sort of the ultimate aim, or not ultimate aim, but like... it's the last step you have to hit, and it's one of those things where you end up with something like protoc from protobufs, where you just have a schema file and you just do Codegen from them.

And either for flexibility reasons or for one of the things for postcard is to be able to support other languages... 'cause serde is very nice, but it's very Rust, which means if you want to generate decoding and encoding libraries for another language, the proc macro is probably not going to help you very specifically.

Amos Wenger: Yeah. There's a crate called schemars which supports the same annotations that serde does, and you can generate JSON schema definition files so that you get autocomplete in editors and whatnot, and that's great, but it's all a hack. It's all like: serde was the first big thing that took off. It happened after rustc_serialize, which you can still see some traces of. They're like, "Don't use that. That was early on. We deprecated it. We removed it from the standard library. Don't- don't look at it."

But yeah, serde is the standard, everyone's adopted it. I think you don't see a lot of experimentation outside of that. There's like only the zero copy frameworks because they really don't have a choice. The zero copy serialization and deserialization cannot just use the serde traits. But apart from that, everyone else is just like kind of stuck with the serde interface, which is a blessing and a curse.

So back to merde. This- again, should be looking at slides. I'm sorry. This is going to be a slides-heavy one: go to sdr-podcast.com/episodes to look at the slides.

You can derive, deserialize and serialize for structs that are fully owned. So this struct doesn't have a lifetime parameter, but even if you do, that's kind of the thing, copy-on-write all the things. If you have some CowStr fields, if you have regular CowStr lifetime fields, if you have whatever things might borrow from the input.

There's only one lifetime allowed, as opposed to serde, which has more flexibility: you can have different lifetimes and specify which one's actually borrowing from the input. Here, you can only have zero or one. And if you have one, then in the invocation of the derive macro, you just add the lifetime parameter. So it's struct name, angle brackets, single quote s and then the list of fields.

James Munns: Do you support generics too? If that's the next slide, just go to the next slide. But I run into this with Postcard-RPC has a macro where I wanted to accept lifetimes for borrowed types. It's not so far from this, but trying to support both lifetimes and generics between the angle brackets in a macro by example is challenging because I couldn't figure out how to get them separate.

Because for what I was doing specifically I needed to have the lifetimes separate, I couldn't just have a token tree of all of the characters, because when I used them in different positions I needed to put the generics in one place and I needed to put the lifetimes in another.

So I'm wondering if you solved that, or if you just said, "Not my problem yet."

Amos Wenger: I have not. No, don't have any generic types. Like I said, it's in production on my website because I was tired of waiting for things to build. So I moved everything to merde and a lot of the iterations are like me running into the next step of: Oh, for this scenario, it doesn't work. So I need to change the design. But no, I haven't done generic type parameters yet. Only lifetime parameters and only one of them.

Built-in IntoStatic trait

Amos Wenger: So what do you do when you want to turn that into the same struct with the static lifetime, and how do you do that? Usually, I don't know what people do, they just do to_owned, they implement ToOwned manually? Well...

You can derive, kind of, it's not really a derive macro, but it implements IntoStatic for you, which has an associated type called Output, which is constrained to have the static lifetime. And it's a very weird trait because Output is supposed to be the same as Self, but static.
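A minimal sketch of what such a trait can look like, based only on the description above (an Output type constrained to be 'static). The trait body, the Name struct, and the parse helpers are assumptions made for illustration; merde's real definitions may differ, and here the impl is written by hand rather than generated.

```rust
use std::borrow::Cow;

/// Sketch of an "into static" trait: Output is meant to be Self, but 'static.
trait IntoStatic {
    type Output: 'static;
    fn into_static(self) -> Self::Output;
}

impl<'a> IntoStatic for Cow<'a, str> {
    type Output = Cow<'static, str>;
    fn into_static(self) -> Self::Output {
        // Copies borrowed data to the heap; already-owned data is moved as-is.
        Cow::Owned(self.into_owned())
    }
}

/// Hypothetical struct with a single lifetime parameter borrowing from the input.
struct Name<'s> {
    first: Cow<'s, str>,
    last: Cow<'s, str>,
}

impl<'s> IntoStatic for Name<'s> {
    type Output = Name<'static>;
    fn into_static(self) -> Name<'static> {
        Name {
            first: self.first.into_static(),
            last: self.last.into_static(),
        }
    }
}

/// Borrows both parts from the input without allocating.
fn parse_name(input: &str) -> Name<'_> {
    let (first, last) = input.split_once(' ').unwrap_or((input, ""));
    Name {
        first: Cow::Borrowed(first),
        last: Cow::Borrowed(last),
    }
}

/// The borrowed value cannot outlive `input`, so it is converted before returning.
fn parse_name_owned(input: String) -> Name<'static> {
    parse_name(&input).into_static()
}
```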

This is the first merde presentation on this podcast, and maybe not the last, because there's a whole bunch- there's a with lifetime trait so that I can define a deserialized owned trait, and like map from any lifetime to that type, but with the static lifetime.

I didn't figure it out. Someone figured it out for me on social media. I asked the question, "Hey, is that even possible in Rust?" And I got 80 percent questions like, "Why are you trying to do that? Why don't you know Rust? You clearly, you should know that it's not possible in Rust." And then there's like, some person from somewhere, they've been following me forever and they're like, "I think I found something."

And you're like, "Oh wow, that's dirty, but it works." This is a teaser for a future episode.

James Munns: Have you ever seen Manish's three part blog post series on zero-copy, yoke, and I forget what the third one is, but-

Amos Wenger: I have an open issue for yoke support, because it's fun. Yeah.

James Munns: Yeah, I was gonna say, it's the exact same thing where you're trying to support copy-on-write types or really... instead of just taking the slice from a cow input, kind of taking the cows, or doing a clone of the cow and things like that. It's a tricky lifetime problem because all of this is geared towards like what the input is, but if the inputs a borrow of the cow...

Amos Wenger: Yeah, yeah...

James Munns: Sub lifetimes and stuff like that.

Amos Wenger: Okay, yoke is super interesting because it's a middle ground. On one side, we're borrowing from an input, so we can only exist as long as the input exists: you can call deeper, into as many sub-functions as you like, but you can never return anything tied to that input or... it's complicated.

Or you copy everything to the heap, into a compact type like compact string. And yoke is like: no, if you move the source along with everything that borrows from it, then that's fine.

But the Rust type system doesn't really let you encode that. So we need to use a bunch of unsafe code and like expose a crate that lets you do crimes, but in a sort of controlled environment. It's like rubicon. It's, it's really-

James Munns: Bounded crimes. Yeah.

Amos Wenger: I want to bring yoke support into merde but it's not done yet. The hazard here, I guess, is imagine deserializing a two gigabyte document, and you're borrowing a tiny string from it. Yeah, you're lugging around the entire source document.

JavaScript has the same problem. In browsers, strings are transparently borrows or copies of things, and sometimes they can retain, like, act as garbage collector roots for very large datasets. And that's what memory usage is caused by sometimes.

James Munns: Yeah. The Bytes crate in tokio as well has this problem where it tries to do that kind of like copy-on-write sort of behavior. But if you have a whole one megabyte buffer and you're borrowing three characters from it: surprise, you get to keep the whole buffer live for as long as that little borrow is alive for.

Amos Wenger: And that's really hard to find out because when you're designing, when you're writing the code, you don't know what the input's going to look like necessarily. So it could be that the design was sound, but then later on the shape of the input changed and now suddenly it's using a lot of memory. So it's a good reminder that our instincts are usually wrong, and it's better to just go and measure things with the right tooling.

Deserialize trait

Amos Wenger: Just like serde, merde has a Deserialize trait. It looks a little bit funny, but James, don't say anything.

It takes a lifetime parameter called s for source. It is sized for unclear reasons, I forget why. It takes a mutable reference to a deserializer because what the deserializer does is simply yield a bunch of events. So that's the big difference from serde:

serde has methods per data types in the visitor interface or something, in the deserializer interface, I don't know. It has one method per type, which is great because it's static dispatch. The compiler knows exactly the path of the code it's going to take, it can inline everything, it's very, very fast.

I just like enums for some reason, so we have a big event enum that has one lifetime parameter. It kind of mirrors the value enum, but it also has map start and map end, array start and array end. Some formats are self descriptive, and so you get a hint as to the size, the number of elements in the map, or the number of elements in the array, and that's your cue. Some are also self descriptive? I always think JSON is technically self descriptive, but you don't know how long an array is going to be. You just know when it starts and when it ends, and you have the difference between I64, U64, F64.

I haven't shown an implementation of deserialize, but basically, yeah, you ask for the next event and you look at what it is. And if it's not what you expected, then you return early. Which is another thing that's not really covered in the slides: what do you do if you get the next event and it turns out you shouldn't have? There's no such thing as peek. There is only next. So you can't put an event back in there, but what happens instead is that there's a "deserialize starting with", and you can pass the first event.

So you can kind of inject an event back into the stream, which is needed in some scenarios. It's kind of dirty. I don't really like it, but it works.

James Munns: If you're taking a stream of events, how do you handle out of order struct fields? So if Struct says that it's A, B, and C, but because JavaScript or JSON or whatever has reordered the fields, they might send CAB on the wire.

Does that still work within the map start? Is it just like collecting all of the items and you go, "Ah, that's still in the list. It's fine."

Amos Wenger: Basically what it does is it just asks for events and then you look at the key names and it has in scope a bunch of bindings that are option the type of the field. And that's an interesting part that I didn't really show, but there's a way to- without knowing the type of a field. If you just have the struct name and the field name, you can declare a local of type option the type of the field. And that's what the declarative macro relies on.

So you just have a bunch of fields and you just assign them. And then at the end, if they're none, then you're like, "Oh, I guess we missed that field." And you also know if there's duplicate fields, you can decide what to do about that. There's a bunch of things you can do.
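Here is a rough sketch of that event-pulling loop with one Option slot per field. The Event enum is a cut-down stand-in (merde's has more variants), a plain iterator plays the role of the deserializer, and the slots are written by hand instead of being generated by the macro. All names are made up for illustration.

```rust
use std::borrow::Cow;

/// Cut-down, hypothetical event type: just enough for a map of integers.
#[derive(Debug, Clone, PartialEq)]
enum Event<'s> {
    I64(i64),
    Str(Cow<'s, str>),
    MapStart,
    MapEnd,
}

#[derive(Debug, PartialEq)]
struct Point {
    x: i64,
    y: i64,
}

fn deserialize_point<'s>(
    events: &mut impl Iterator<Item = Event<'s>>,
) -> Result<Point, String> {
    match events.next() {
        Some(Event::MapStart) => {}
        other => return Err(format!("expected map start, got {other:?}")),
    }
    // One Option slot per field; keys may arrive in any order.
    let mut x: Option<i64> = None;
    let mut y: Option<i64> = None;
    loop {
        let key = match events.next() {
            Some(Event::MapEnd) => break,
            Some(Event::Str(key)) => key,
            other => return Err(format!("expected key or map end, got {other:?}")),
        };
        let value = match events.next() {
            Some(Event::I64(value)) => value,
            other => return Err(format!("expected integer, got {other:?}")),
        };
        let slot = match &*key {
            "x" => &mut x,
            "y" => &mut y,
            unknown => return Err(format!("unknown field {unknown}")),
        };
        // `replace` hands back the previous content, which reveals duplicates.
        if slot.replace(value).is_some() {
            return Err(format!("duplicate field {key}"));
        }
    }
    // Any slot still empty at the end means the field never showed up.
    Ok(Point {
        x: x.ok_or("missing field x")?,
        y: y.ok_or("missing field y")?,
    })
}
```

This sketch chooses to reject unknown and duplicate fields; as noted above, those are policy decisions that could go the other way.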

DeserOpinions trait

Amos Wenger: This brings us to our next slide, which is: without the flexibility of field-level annotations like you have in serde, how do you specify whether you should... well, deny unknown fields, but that's a container-level annotation, or whether you should allow- like, just fall back to a default value for some field if it's absent or something like that?

James Munns: You've made up a crate on the internet, just so it can have opinions.

Amos Wenger: I truly did. There's a DeserOpinions trait that has a default implementation in merde.

And I gave the trait definition here. It has a deny unknown fields function, which returns a Boolean, pretty straightforward one. There's a map key name function. It gives you the name of the key for a map, and then you can either return it itself because it takes a CowStr and returns a CowStr, so you don't need to copy anything. Or you can map it to something else.

And I have a little example here. On my website, I tend to deploy before an article is ready, so it's in draft mode, and sometimes I want to get others to proofread it, and so it has a draft code in the front matter, the YAML front matter at the beginning of the markdown.

And it used to be called... well, actually, the Rust field is called draft underscore code in snake case. But in Markdown, I want to write it in kebab case, so it's draft-code.
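A sketch of the two methods described so far, using std's Cow in place of CowStr. The trait name Opinions, the exact signatures, and the FrontMatterOpinions type are assumptions for illustration; the real DeserOpinions trait may be shaped differently.

```rust
use std::borrow::Cow;

/// Hypothetical opinions trait with permissive defaults.
trait Opinions {
    /// Whether a key that matches no field should be an error.
    fn deny_unknown_fields(&self) -> bool {
        false
    }

    /// Maps a key from the input to a field name; by default it is passed through.
    fn map_key_name<'s>(&self, key: Cow<'s, str>) -> Cow<'s, str> {
        key
    }
}

/// Hypothetical opinions for YAML front matter written in kebab-case.
struct FrontMatterOpinions;

impl Opinions for FrontMatterOpinions {
    fn deny_unknown_fields(&self) -> bool {
        true
    }

    fn map_key_name<'s>(&self, key: Cow<'s, str>) -> Cow<'s, str> {
        if key.contains('-') {
            // Only keys that actually need translating cause an allocation.
            Cow::Owned(key.replace('-', "_"))
        } else {
            key
        }
    }
}
```

With this sketch, "draft-code" maps to "draft_code", while a key like "title" is handed back untouched and still borrowed.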

The third method currently in DeserOpinions is "default field value". That one made people angry on the internet, because it takes a key, and this time it's borrowed, so you can't mess with it. It happens after "map key name", so first you get to translate the key name to something else, and this is less efficient than serde.

It's also more flexible in a way? You could look up that key name somewhere if you wanted to. You could have a generic change all snake case things to kebab case things or to camel case things. So it's a compromise. Something costs at runtime, but it's less code also in the binary. Honestly, in the context of web application servers, it's totally fine unless you're Amazon, but I'm not. So I'm fine.

And then the default field value takes the key as a borrowed string, and then it takes a slot, and that's awkward? Because, again, we're not in the serde universe where we can generate things precisely based on the type of fields. Here, that single function has to work for every field of any type. So the field slots type cannot be generic over the type of the field. Which brings us to a problem: what happens if you pass like the address of an option I64 and someone tries to put a string there?

James Munns: What does happen?

Amos Wenger: What does happen? Well, what happens is: I have a vendored version of mini type ID, and it just says, it's not the right type. But it's not very safe.

James Munns: Okay. Okay.

Amos Wenger: But basically in Rust, there's a type ID function in the standard library. You can pass it any type. Well, as a type parameter, there's also type ID of. I forget the exact name of the function, but yeah, there's a type ID type. You can get some value that is unique to that type, and you can compare those so you can tell if two things are the same type or not, but that doesn't work if you have lifetimes.
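For reference, the standard-library pieces mentioned here are std::any::TypeId and std::any::Any. The sketch below uses made-up helper names (same_type, fill_slot) to show the comparison, and a type-erased slot that refuses a value of the wrong type. It is not merde's slot mechanism: note the 'static bounds, which are exactly the limitation being described.

```rust
use std::any::{Any, TypeId};

/// Compares two types at runtime. `TypeId::of` requires `'static` types,
/// so this cannot be called with a type that borrows for a shorter lifetime.
fn same_type<A: 'static, B: 'static>() -> bool {
    TypeId::of::<A>() == TypeId::of::<B>()
}

/// Tries to store `value` in a type-erased slot that should hold an `Option<T>`.
/// Returns false (leaving the slot untouched) if the slot has another type.
fn fill_slot<T: 'static>(slot: &mut dyn Any, value: T) -> bool {
    match slot.downcast_mut::<Option<T>>() {
        Some(option) => {
            *option = Some(value);
            true
        }
        None => false,
    }
}
```

In this sketch, putting a String into a slot that is really an Option of i64 is detected at runtime and reported, rather than being caught by the compiler.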

Which we do, because we borrow things, so that's why you need a separate thing. And that's why it doesn't use the standard library one. It's the thing I stole from somewhere, but it's in the comments where I stole it from, so I think it's okay license-wise. I don't know, don't sue me. Just Chat GPTed that shit, bro.

James Munns: Uh oh. Hate that...

Amos Wenger: Oh, and then how do you use opinions in the derive, like the DSL for merde? After impl Deserialize for struct blah, with a list of fields, you can do via, and then the name of the opinion- the type that implements DeserOpinions.

So I thought that was interesting: all of that could be done via a more complicated DSL inside the declarative macro, but the declarative macros in merde are already pretty bad. They're already at the limit of what I can read. I got LLMs to generate bits of it, and I had to step back like, "Wait, wait, I'm not sure what happens here." There are several levels of expansion. There's a lot of different syntax variants.

I figured out how to simplify them in the last release, but it's really kind of pushing the boundaries of what you should be doing with declarative macros, which is why I was excited to see you also doing the declarative macro crimes, James, with the whole um... generating schemas at compile time and like mixing const and declarative macros. That was fun.

James Munns: I don't think I picked up on this originally, but I think you mentioned it. So you have your own Deserialize and Serialize traits. These aren't using the serde Deserialize, Serialize traits. You have your own version of it.

Amos Wenger: That's correct. Merde has its own Serialize and Deserialize traits for a very good reason. We'll get to that.

Serialize trait

Amos Wenger: So let's actually look at the serialize trait. The serialize trait is kind of boring.

It takes a reference to self and it takes a mutable reference to a serializer and it returns a result.

The result is: the happy path is the empty tuple and the error path is the error from the serializer. Pretty boring. Again, there's only one thing that's kind of weird. Uh, it's async. It's an async fn in trait, which is something we've gotten since Rust 1.75. I know because I'm working on a draft of an article that I started months ago and I'm now updating for release.

James Munns: Yeah. Hell yeah.

Amos Wenger: This is the serializer trait. Just like in serde, there's serialize and serializer- one letter difference. Serializer has an error associated type, so that each serializer can have different types of errors. And then the important function, I guess there's a bunch of other ones that I've hidden from you, but the important one is "write".

And it takes a mutable reference to self and an event. So events are used both for deserializing and for serializing, which means you can pipe a deserializer straight into a serializer and it should work.
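The piping idea can be sketched with a simplified, synchronous version. In merde the write method is an async fn and the event type is richer; here the Event enum has three variants, and the JsonWriter and pipe names are invented for illustration.

```rust
/// Cut-down, hypothetical event type: integers and arrays only.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    I64(i64),
    ArrayStart,
    ArrayEnd,
}

/// Simplified serializer trait: one associated error type, one `write` method.
/// (Synchronous here for brevity; the trait discussed in the episode is async.)
trait Serializer {
    type Error;
    fn write(&mut self, event: Event) -> Result<(), Self::Error>;
}

/// Hypothetical JSON writer that builds its output in a String.
#[derive(Default)]
struct JsonWriter {
    out: String,
    /// One entry per open array: true once that array already holds an element.
    open_arrays: Vec<bool>,
}

impl JsonWriter {
    /// Emits a separating comma when the enclosing array already has an element.
    fn before_value(&mut self) {
        if let Some(has_items) = self.open_arrays.last_mut() {
            if *has_items {
                self.out.push(',');
            }
            *has_items = true;
        }
    }
}

impl Serializer for JsonWriter {
    type Error = String;

    fn write(&mut self, event: Event) -> Result<(), String> {
        match event {
            Event::I64(number) => {
                self.before_value();
                self.out.push_str(&number.to_string());
            }
            Event::ArrayStart => {
                self.before_value();
                self.out.push('[');
                self.open_arrays.push(false);
            }
            Event::ArrayEnd => {
                if self.open_arrays.pop().is_none() {
                    return Err("array end without matching start".to_string());
                }
                self.out.push(']');
            }
        }
        Ok(())
    }
}

/// Forwards every event from any source (for example a deserializer) to a serializer.
fn pipe<S: Serializer>(
    events: impl IntoIterator<Item = Event>,
    serializer: &mut S,
) -> Result<(), S::Error> {
    for event in events {
        serializer.write(event)?;
    }
    Ok(())
}
```

Because both sides speak the same event type, pipe never needs to know which format the events came from or which format they are going to.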

Now we're getting into the interesting part, which is why is that function async? And the answer is, well, sometimes...

Stack Full? Get Another!

James Munns: Oh no... that's not the reason I was expecting. That was not what I was hoping for!

Amos Wenger: Okay, so sometimes you serialize deeply nested data structures like this one. This is example code that generates 100,000 nested JSON arrays. So it's an array that contains an array that contains an array that contains an array that contains blah blah blah- 100,000 layers, an empty array. And if you deserialize that to a serde JSON value, you're going to blow the stack.

Because it's a function that calls a function that calls a function, it's recursive, so it's piling up those stack frames, and the stack is a space reserved for return addresses, arguments, some locals, and eventually you're going to run out of space because every thread has some fixed amount dedicated to the stack, which can range from, I don't know, on desktops maybe one megabyte to eight megabytes or something?

James Munns: There's a couple of megabytes.

Amos Wenger: If you have a lot of threads and you know you're not going to go over, you can make threads with less, you can make threads with more.
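For reference, the standard library exposes the per-thread stack size through `std::thread::Builder::stack_size`. The sketch below assumes a toy recursive walk (`depth_recursive`, a made-up helper that only understands `[` and `]` and panics on malformed input) and runs it on a thread with a caller-chosen stack, which is the "make threads with more" option.

```rust
use std::thread;

/// Naive recursive descent over input made only of '[' and ']'.
/// Returns (nesting depth, bytes consumed). One stack frame per level.
fn depth_recursive(bytes: &[u8]) -> (usize, usize) {
    let mut pos = 1; // skip the opening '['
    let mut deepest = 0usize;
    while bytes[pos] == b'[' {
        let (d, used) = depth_recursive(&bytes[pos..]);
        deepest = deepest.max(d);
        pos += used;
    }
    (deepest + 1, pos + 1) // +1 for the closing ']'
}

/// Runs the recursive walk on a dedicated thread with `stack_bytes` of stack.
pub fn depth_on_big_stack(payload: String, stack_bytes: usize) -> usize {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || depth_recursive(payload.as_bytes()).0)
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}
```

This only moves the limit: a larger payload still overflows a larger stack, which is why the discussion turns to rejecting or restructuring instead.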

Okay, so why is this a big problem and not just a small problem? It's a big problem because if you're accepting user input, people can send you that weird payload, which is just a hundred thousand nested arrays.

And if they can do that and crash your application, then that's a problem for you. So you have to do either of two things. You have to stop them and decide: okay, I'm not deserializing that. Clearly this is a bomb. This is like a zip bomb. This is malicious. I don't want to run out of memory, so I'm just protecting against that. Or you deserialize it properly without actually blowing the stack.
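A rough sketch of the first option, "stop them": build the hostile payload, then scan it once without recursion and refuse anything nested deeper than a limit. The function names and the limit are assumptions for illustration; this is a crude pre-check, not a JSON validator.

```rust
/// Builds `depth` nested empty JSON arrays, e.g. depth 3 gives "[[[]]]".
pub fn nested_arrays(depth: usize) -> String {
    let mut s = String::with_capacity(depth * 2);
    s.push_str(&"[".repeat(depth));
    s.push_str(&"]".repeat(depth));
    s
}

/// Single pass over the bytes, no recursion. Returns the deepest nesting
/// seen, or an error as soon as nesting exceeds `limit`.
pub fn check_depth(json: &str, limit: usize) -> Result<usize, String> {
    let (mut depth, mut deepest) = (0usize, 0usize);
    let (mut in_string, mut escaped) = (false, false);
    for b in json.bytes() {
        if in_string {
            // Brackets inside string literals do not count.
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
            }
            continue;
        }
        match b {
            b'"' => in_string = true,
            b'[' | b'{' => {
                depth += 1;
                deepest = deepest.max(depth);
                if depth > limit {
                    return Err(format!("nesting deeper than {limit}, refusing to parse"));
                }
            }
            b']' | b'}' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    Ok(deepest)
}
```

The second option, deserializing the payload anyway without deep recursion, is what the rest of this section is about.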

James Munns: So you also mentioned that you support deserializing YAML, I believe? Or just serializing?

Amos Wenger: Just deserializing.

James Munns: Because YAML has the other fun thing where it has referential support, so you can reference other things, which I think is also a pretty big denial of service vector.

Amos Wenger: I have not tried that. I'm using like an event based YAML parser and then just kind of translating it to merde types. I have not tried that. That's a good point.

James Munns: Because those are the two attacks, yeah, you get the super nested stack overflow and then you get like the zip bomb where you reference something that references 10 other things which references 10 other things which refere- and in some parsers just explodes.

Amos Wenger: So, merde doesn't actually solve that because it solves one half of the problem... I used to call it infinite stack, but that's not really true. So I'm calling it metastack now. The prior art for this is stacker. So stacker is a crate.

You can inject calls to it at several points in your program where you think you might run out of stack and what it's going to do is check how much stack you're using and if you're using too much it just grows the stack and by 'just'- this is a load bearing 'just'- I should really have looked into that before today's presentation but what I assume it does is it allocates an entirely different stack, copies everything to it and then changes the stack pointer essentially. But now that I'm saying it, it cannot work that way because there's things pointing to the stack. So, I'm not sure.

Maybe it just chains stack? Like it allocates another chunk of the stack and moves there and changes the stack pointer... but then when you return, it restores the old stack pointer. That must be how it works, right? There's no other way.

James Munns: This is one of those cursed things where I'm used to the embedded level of cursed where you can't extend the stack-

Amos Wenger: Yeah.

James Munns: The stack is statically known. And when you run out, you're out.

Amos Wenger: You barely even have a heap allocator. Yeah. Yeah.

James Munns: Yeah, yeah. So it's one of those things where I know how it works on like a very cursed bare metal kernel or operating system level, but no idea how user space handles this kind of thing or chaining stacks or resizing the stack and stuff like that.

Amos Wenger: I don't know. Stacker worries me because now that I think about it- yeah, it cannot move things that are already on the stack. I'm pretty sure, because that would invalidate a bunch of pointers.

In C# you can do things like that. It has a moving, compacting garbage collector. It's aware of all the things that point anywhere. And so it's able to update pointers, or actually references, when it moves objects in memory. But there's no such thing in the C, C++ Rust cinematic universe. If something is there, it's not going to move.

That's the whole reason we have pin and friends, another complicated topic in Rust. So I'm pretty sure this is how it works. It's like: okay, this stack is full, we'll get another and then just stack things onto there. And this is kind of what I do in merde, except instead of changing the stack pointer and things, which is scary- it's terrifying, because what if some other language unwinds? I don't know how that works. Maybe it works well because you have the return address in the stack? I don't know! I'm not sure, now I'm interested.

But basically, instead of doing all that, you can just have an async function, which is a state machine. And what you get to do with an async function is return all the way back to the runtime. You've been stacking these calls to async functions and you get to yield, you get to return poll pending, which doesn't mean you're done, but it just means you're waiting for something.

And in this case, you're waiting for more stack, which is never going to happen. It's not like you go all the way back to the runtime, I mean, I guess it could... what I did instead is that: when you're about to run out of stack, it creates a next future, it stores that in a global, and then it yields all the way to the runtime, and then the runtime is like: oh, it wasn't ready. I polled the deserialize function, the future, and it didn't return poll ready, it returned poll pending. Let me check the global. Ah, sure, okay, there's more work to be done. Let's do that again. Let's do that next work on the stack we already have. And so on and so forth. If that one returns poll pending, it's pushed onto a queue and then we'll run the next future and so on and so forth until we're done deserializing and then we pop from the queue and then we resolve back. You should see my hands go around, which are not going to be in the edit.

James Munns: So, you're heap allocating new futures cause essentially futures are just an enum of all the state of essentially each of the await points at that. So what you're doing is you're creating a whole new future. And then sort of like, daisy chaining all the futures together to get it there... okay, that's interesting.

Amos Wenger: The weird part is that initially none of this is async. We're not actually doing async I/O at this point. We're just trying to deserialize something synchronously. So on the outside, everything is synchronous. The public API for all this is synchronous. You don't have tokio going on, you have nothing.

In there at some point in the internals of merde, it sets up a dummy waker that doesn't actually allow registering for being woken up. It doesn't actually have timers. It's not an actual reactor. It's entirely made up. The only purpose is so that we have something to pass to the poll function. That's part of the future trait in the standard library. So that we can yeah, call all that. And if it returns pending, we know that we needed more stack. And we just run the next future and so on and so forth.
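The mechanism described above can be sketched roughly as follows. This is a toy reconstruction under stated assumptions, not merde's code: every name is made up, the "about to run out of stack" check is replaced by a fixed `HANDOFF_EVERY` level counter, and the "deserializer" only measures the nesting depth of a string of `[` and `]`. What it keeps from the description: a thread-local "next future" slot, a yield that returns `Pending`, a dummy waker, and a synchronous driver that keeps a list of suspended futures on the heap.

```rust
use std::cell::{Cell, RefCell};
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxFut<T> = Pin<Box<dyn Future<Output = T>>>;
const HANDOFF_EVERY: usize = 64; // arbitrary stand-in for "about to run out of stack"

thread_local! {
    // The "global": a future that asked to run on a shallower stack.
    static NEXT: RefCell<Option<BoxFut<()>>> = RefCell::new(None);
}

/// Returns `Pending` exactly once, so control unwinds back to the driver.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        Poll::Pending
    }
}

/// Parks `fut` in NEXT, yields to the driver, then collects the result.
async fn on_fresh_stack<T: 'static>(fut: BoxFut<T>) -> T {
    let slot = Rc::new(RefCell::new(None::<T>));
    let out = Rc::clone(&slot);
    let job: BoxFut<()> = Box::pin(async move {
        let value = fut.await;
        *out.borrow_mut() = Some(value);
    });
    NEXT.with(|next| *next.borrow_mut() = Some(job));
    YieldOnce(false).await;
    let value = slot.borrow_mut().take();
    value.expect("the driver finishes the parked future before resuming us")
}

/// Synchronous entry point: dummy waker, plus a Vec of futures as the "metastack".
pub fn run<T: 'static>(root: BoxFut<T>) -> T {
    struct Noop;
    impl Wake for Noop {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(Noop));
    let mut cx = Context::from_waker(&waker);

    let result = Rc::new(RefCell::new(None::<T>));
    let out = Rc::clone(&result);
    let first: BoxFut<()> = Box::pin(async move {
        let value = root.await;
        *out.borrow_mut() = Some(value);
    });
    let mut stack = vec![first];
    while let Some(top) = stack.last_mut() {
        let polled = top.as_mut().poll(&mut cx);
        match polled {
            Poll::Ready(()) => {
                stack.pop();
            }
            Poll::Pending => {
                let next = NEXT.with(|next| next.borrow_mut().take());
                stack.push(next.expect("no real I/O in this sketch"));
            }
        }
    }
    let value = result.borrow_mut().take();
    value.expect("root future ran to completion")
}

/// Recursive walk over input made only of '[' and ']'; returns the nesting depth.
fn nesting(input: Rc<[u8]>, pos: Rc<Cell<usize>>, level: usize) -> BoxFut<usize> {
    Box::pin(async move {
        pos.set(pos.get() + 1); // consume '['
        let mut deepest = 0usize;
        while input[pos.get()] == b'[' {
            let child = nesting(Rc::clone(&input), Rc::clone(&pos), level + 1);
            let d = if (level + 1) % HANDOFF_EVERY == 0 {
                on_fresh_stack(child).await
            } else {
                child.await
            };
            deepest = deepest.max(d);
        }
        pos.set(pos.get() + 1); // consume ']'
        deepest + 1
    })
}

pub fn depth_of(brackets: &str) -> usize {
    run(nesting(Rc::from(brackets.as_bytes()), Rc::new(Cell::new(0)), 0))
}
```

The recursive code in `nesting` reads like ordinary recursion, but the native call stack never gets deeper than about `HANDOFF_EVERY` nested polls; the rest of the depth lives in heap-allocated futures.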

James Munns: Interesting. So it's sort of like the gen await thing, where you're really wanting generators, you're wanting like pausable iteration, but you're wrapping the stable interface, which is async, to sort of get something generator ish out of it.

Amos Wenger: And that's also why the Deserialize trait is taking a mutable reference to a deserializer and not like taking ownership of it, because you're lending it to the next future. And then when it finishes, it completes, it's going back to the previous future. It really needs to be able to do that. Like you said, daisy chaining, I think, is a good term for that.

The result is that you can actually deserialize something like a hundred thousand nested arrays to the merde value type, the dynamic- we don't know what it is, so some of it's going to have to be heap allocated because you cannot have infinitely recursive types that would be infinitely large.

So, if you look at the value enum, you can see map and array: they are heap allocated, because you can't do otherwise. How do we explain that to someone who's never thought about infinitely recursive types? Like a type that contains itself is infinitely large.

James Munns: How do you explain it? When you have that level of indirection, like you can have a vec that contains vecs that contains vecs that contains vecs. You run into this where the compiler will actually warn you if it realizes this has happened and it'll tell you: you need some level of indirections either through references or boxing or things like that.

Amos Wenger: I guess imagine an enum with like two variants. One of them doesn't have any data associated to it, and the other one has itself. Then the size of the entire thing has to be at least one byte, because you need to have the discriminant between one variant and the other, and then it also has to have the size of the largest of the variants, which is itself, so it's itself plus one byte... plus one byte plus one byte plus one- because it keeps going. If you keep computing the size, it just recurses and you get an infinitely sized type. So that doesn't work.
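The two-variant enum from this explanation, written out. The self-containing version is left as a comment because rustc rejects it with error E0072 ("recursive type has infinite size"); putting the inner value behind a `Box` (or inside a `Vec`) moves it to the heap so the enum itself has a fixed size. `Nested`, `Value`, `wrap` and `depth` are illustrative names, not merde's types.

```rust
use std::mem::size_of;

// Rejected by rustc with E0072 ("recursive type has infinite size"):
//
//     enum Endless { Empty, Wrap(Endless) }

/// Going through a `Box` stores the inner value on the heap, so the enum
/// itself only needs room for a pointer.
pub enum Nested {
    Empty,
    Wrap(Box<Nested>),
}

/// A dynamic value in the same spirit (hypothetical, not merde's `Value`):
/// `Vec` already keeps its elements on the heap.
pub enum Value {
    Null,
    Int(i64),
    Array(Vec<Value>),
}

/// Wraps `Empty` in `depth` layers.
pub fn wrap(depth: usize) -> Nested {
    let mut n = Nested::Empty;
    for _ in 0..depth {
        n = Nested::Wrap(Box::new(n));
    }
    n
}

/// Counts the layers without recursion.
pub fn depth(mut n: &Nested) -> usize {
    let mut d = 0;
    while let Nested::Wrap(inner) = n {
        d += 1;
        n = &**inner;
    }
    d
}
```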

But in this case, it does work because it goes through the heap. So you can recurse, as long as you have memory. Which is why I renamed it from infinite stack to metastack, cause you don't actually have infinite memory. However! James, can you foresee any problems with this?

James Munns: You're not running out of stack, but if you're putting an infinite number of things on the heap, eventually you'll have heap exhaustion.

Amos Wenger: That is an issue. Yes, that's a problem with current merde. This is more dangerous than serde actually, because running out of stack on a desktop OS is fine. I think it's just going to restart. But running out of memory, it's going to make the whole machine start swapping and become very slow.

James Munns: Depends on how you set your ops things like that.

Amos Wenger: Yeah, you better set your quotas, right? Because-

James Munns: What your memory limit is. If you do force the whole system into swap because it's unbounded, then yeah, you'll start swapping into whatever your cheap host is.

Amos Wenger: But there's another issue with that. I'm really curious if you can guess what the next slide is because you can't see the presenter view. _I_ see what the next slide is, but you can't. Can you guess?

So you deserialize, you build that value that's like very, very deeply nested, and then at some point you don't need that value anymore.

James Munns: Oh, does drop become incredibly...

Drop bombs, async I/O, dynosaur

Amos Wenger: Yes, you just made a drop bomb.

James Munns: Yeah.

Amos Wenger: Because drop is synchronous. So you drop the outer array, which drops the array inside of it, which drops the array inside of it, and all those drop calls pile up on the stack and eventually blow up the stack. So you made a type you can never drop. You made a drop bomb.

James Munns: Oh, interesting. Yeah. I do think I've seen this in the serde JSON code before.

Amos Wenger: Other people have made drop bombs before. You have bomb disposal code. You can like go deep in there and start dropping things manually, shift things around so that the stack never gets as deep. Or I guess in my case, I would just make another trait to dispose of them safely. Like use the same metastack async function things so that you could write natural code.
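One common shape for the "bomb disposal code" mentioned here is a hand-written `Drop` that flattens the tree into a work list, so no value is ever dropped while it still has children. This is a sketch on a hypothetical two-variant `Value`, not merde's type and not the async-based approach floated above.

```rust
pub enum Value {
    Null,
    Array(Vec<Value>),
}

impl Drop for Value {
    fn drop(&mut self) {
        // Steal the children so the compiler-generated drop glue that runs
        // after this function finds nothing left to recurse into.
        let mut pending = match self {
            Value::Array(items) => std::mem::take(items),
            Value::Null => return,
        };
        while let Some(mut v) = pending.pop() {
            if let Value::Array(items) = &mut v {
                // Move the grandchildren onto the work list.
                pending.append(items);
            }
            // `v` is dropped here with no children: recursion depth stays at 1.
        }
    }
}

/// Builds `depth` nested arrays iteratively (`depth` >= 1).
pub fn nest(depth: usize) -> Value {
    let mut v = Value::Array(Vec::new());
    for _ in 1..depth {
        v = Value::Array(vec![v]);
    }
    v
}

/// Follows the first child at each level, without recursion.
pub fn depth(mut v: &Value) -> usize {
    let mut d = 0;
    while let Value::Array(items) = v {
        d += 1;
        match items.first() {
            Some(child) => v = child,
            None => break,
        }
    }
    d
}
```

Without the `Drop` impl, dropping a value this deep would recurse once per level through the generated drop glue and can overflow the stack.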

So the whole point is, it's weird machinery with a fake runtime and everything, but then you just get to write natural code. You just get to write an async function and you get to recurse as much as you want. You don't need to think about any of it. All deserializers can be protected against too much memory usage, 'cause you can measure how much memory you use. It has to go through this weird runtime thing, which decides if you get more stack or not. So actually, I haven't implemented the protection, but there's a single point to do it.

Whereas in the serde ecosystem, serde JSON specifically has protections against that, I think, but all the other deserializers would have to do their own thing. There's no standard mechanism for that because it's not baked into the Deserializer or Deserialize interfaces.

I have the upper hand because I have zero compatibility guarantees. I get to use modern Rust. I get to break things whenever I want. Some people have started porting things to merde v3 and I'm like, sorry, it's v8 now. Like, you're five breaking changes behind, so don't use it, but do come with me and experiment.

You know, the other thing is that now that you have those functions as async, you can actually do async I/O.

James Munns: Yeah. That's where I was hoping you were going with this -

Amos Wenger: Yeah!

James Munns: That is a very useful thing.

Amos Wenger: Because if you're going to change everything, you're going to change all the traits. You're not going to completely go off course, off the serde ecosystem and everything's async anyway. Initially, I assumed you would read everything and your source is always like a byte slice... but actually you can have a reader.

You can have a network socket. You can have any type of IO underneath it. And you can call all the merde methods synchronously with that fake runtime thing or asynchronously. And initially I was confused of how it was gonna work 'cause there's real async where you have tokio and you have weird fake async just for the stack thing. But actually it's pretty easy! If the next future global is set to something, you know you need more stack. If not, you're going back to tokio.

So it's passing through and saying like: okay, is it just a stack thing or is it actually waiting for a read or a timer or something? So it does work. You can do streaming deserialization from like an HTTP response coming in. You don't need to keep everything in memory.

James Munns: So you said global a couple of times, and that makes me nervous because what if you're decoding on eight threads and you have that, is that a real global?

Amos Wenger: No, it's a thread local, but now that I'm thinking that it's real async...

James Munns: If it's real async, you're going to be moving around different worker threads and you could technically have two threads. Like if you yield waiting for more data, you could end up-

Amos Wenger: No, it's fine!

It's fine, I just thought about it, you're right. Okay, first of all, it's a thread local, so as long as things were actually synchronous from the outside, it was fine. But now that we're actually doing async I/O, is it still fine? And the answer is yes, because whenever you yield, you pass through our fake runtime.

James Munns: Wait, but do you use the fake runtime for real async or do you only use the fake runtime when you use the blocking code?

Amos Wenger: Yeah, because it's the thing that calls poll. It calls poll, and then whenever anything downstream of that, whenever they return poll pending, we see it. So we intercept it.

James Munns: The interception's not in the executor, it's in the future, like your manual future impl that wraps the thing. When the actual child future returns pending to the parent future, okay. That's what I missed: it's the future catching that and not the runtime or something.

Amos Wenger: Yes. Outside of the deserialization, tokio would never get to observe that thread local being anything other than none, because it's synchronously being checked and then immediately starting work on a separate future, but that all happens synchronously. It never yields in the middle of deserialization, which might be an issue actually, but I guess it would because tokio has a budget thing, but yeah, I think the thread local thing is sound.
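A sketch of the wrapper future being discussed, under the same caveats as before: the names (`Metastack`, `hand_off`, `PendingOnce`, `NEXT`) are invented, and this is a reconstruction from the conversation rather than merde's implementation. The point it illustrates is the decision made on every `Pending`: if the thread-local slot was filled, it was a request for more stack and is handled on the spot; if not, it is a real wait and is passed up to whatever runtime is polling.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxFut = Pin<Box<dyn Future<Output = ()>>>;

thread_local! {
    // Hypothetical "next future" slot. `Some` means the Pending was a stack request.
    static NEXT: RefCell<Option<BoxFut>> = RefCell::new(None);
}

/// Returns `Pending` once, then `Ready`. Used for hand-offs and to fake an I/O wait.
pub struct PendingOnce(pub bool);

impl Future for PendingOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref(); // ask whoever drives us to poll again
        Poll::Pending
    }
}

/// Parks `fut` in the slot and yields so the wrapper can run it first.
pub async fn hand_off(fut: impl Future<Output = ()> + 'static) {
    let job: BoxFut = Box::pin(fut);
    NEXT.with(|next| *next.borrow_mut() = Some(job));
    PendingOnce(false).await;
}

/// The manual future impl that sits between the real runtime and the work.
pub struct Metastack {
    stack: Vec<BoxFut>,
}

impl Metastack {
    pub fn new(root: impl Future<Output = ()> + 'static) -> Self {
        let root: BoxFut = Box::pin(root);
        Metastack { stack: vec![root] }
    }
}

impl Future for Metastack {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        while let Some(top) = self.stack.last_mut() {
            let polled = top.as_mut().poll(cx);
            if polled.is_ready() {
                self.stack.pop();
                continue;
            }
            match NEXT.with(|next| next.borrow_mut().take()) {
                // Stack request: handled right here, the runtime never sees it.
                Some(job) => self.stack.push(job),
                // Genuine wait (socket, timer, ...): propagate to the runtime.
                None => return Poll::Pending,
            }
        }
        Poll::Ready(())
    }
}

/// Do-nothing waker, for driving the wrapper by hand.
pub fn noop_waker() -> Waker {
    struct Noop;
    impl Wake for Noop {
        fn wake(self: Arc<Self>) {}
    }
    Waker::from(Arc::new(Noop))
}
```

Since the slot is always emptied before `poll` returns, code outside the wrapper should only ever observe it as `None`, which matches the soundness argument made in the conversation.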

James Munns: The thread will yield, but the task won't. You can't force a task to yield.

Amos Wenger: I mean, you just return poll pending. Yeah, yeah, well.

James Munns: You can't force it. Yeah, it has to be... it's cooperative.

Amos Wenger: It's not preemptive, yeah, it's cooperative, exactly.

Okay, we're also almost out of slides.

One last thing, except for CodeGen, that I'm excited about, is that currently, the deserialize implementation has a deserialize function that is generic over the type of the deserializer.

But it shouldn't need to. Because it's only taking a mutable reference. So we don't need to know the size of the deserializer. We're taking a reference to a trait, so there's two ways to do that in Rust. You either do ampersand mut, and then some generic type, or impl trait. Or you take a mutable reference to a trait object, but that wording I just learned today, this morning, has been renamed.

We used to say that a trait is either object safe or not object safe, but they renamed that to dyn compatibility, which actually means what it says. It's like: can you make a dyn out of it? And in this case, because we have an async function in the trait, which is something you can do starting from Rust 1.75, it is not currently dyn compatible. But there is a crate called dynosaur, with a y: d-y-n-o-saur.

James Munns: I was really wondering whether you were going to say dinosaur or dynosaur.

Amos Wenger: Obviously dinosaur! But yeah I say dyn but yeah okay. Do you say 'dine' like 'dine' compatibility?

James Munns: No... I would say 'dyn trait'. Yeah.

Amos Wenger: It's the dining philosophers problem.

So what I want the trait to look like is to actually take a reference of a dyn deserialize, but I can't because that's an async function in trait, but I could if I use dynosaur.
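To make the limitation concrete: a trait with an `async fn` cannot be used as `dyn Trait`, but a hand-written companion trait that returns a boxed future can. The sketch below shows that manual workaround with made-up names (`Deserializer`, `DynDeserializer`, `Numbers`); it does not use dynosaur itself, and how dynosaur's generated code differs is not covered here.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Has an `async fn`, so `&mut dyn Deserializer` is rejected by the compiler.
#[allow(async_fn_in_trait)]
pub trait Deserializer {
    async fn next_number(&mut self) -> Option<i64>;
}

type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + 'a>>;

/// Dyn-compatible twin: same operation, but the future is boxed.
pub trait DynDeserializer {
    fn next_boxed(&mut self) -> BoxFuture<'_, Option<i64>>;
}

/// Every static deserializer gets the dyn-compatible version for free.
impl<D: Deserializer> DynDeserializer for D {
    fn next_boxed(&mut self) -> BoxFuture<'_, Option<i64>> {
        Box::pin(self.next_number())
    }
}

/// Compiled once, works with any deserializer through dynamic dispatch.
pub async fn sum_all(de: &mut dyn DynDeserializer) -> i64 {
    let mut total = 0;
    while let Some(n) = de.next_boxed().await {
        total += n;
    }
    total
}

/// Toy source that hands out its numbers from the back.
pub struct Numbers(pub Vec<i64>);

impl Deserializer for Numbers {
    async fn next_number(&mut self) -> Option<i64> {
        self.0.pop()
    }
}

/// Polls once with a do-nothing waker; nothing here ever waits.
pub fn run_ready<F: Future>(fut: F) -> F::Output {
    struct Noop;
    impl Wake for Noop {
        fn wake(self: Arc<Self>) {}
    }
    let waker = Waker::from(Arc::new(Noop));
    let mut cx = Context::from_waker(&waker);
    match pin!(fut).as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        Poll::Pending => panic!("this sketch never waits on anything"),
    }
}
```

The trade-off is one heap allocation and one indirect call per operation, in exchange for `sum_all` not being instantiated again for every deserializer type.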

So this is why I was forced to give the presentation now, which is good because it's already long, but this is my next step. One of the big problems with serde is that you derive serialize and deserialize, and then they get instantiated when you actually compile the thing.

So if you have a library with a lot of types, the thing that gets cached is kind of just the templates. And then when you actually use them in your application, everything gets instantiated and it takes a long time to build everything. What I want is a single implementation of deserialize and serialize, a single copy of the code that works with any deserializer.

I want to fully use dynamic dispatch. Because we don't care. Because if it's an application server anyway, we're parsing JSON, for crying out loud. We would use another format if we cared about performance. But we don't. Dynamic dispatch is fine. Heap allocations are fine. We want rapid iteration. We want to be able to deploy a change in a few seconds, like rebuild the entire website.

So I want to actually use dynamic dispatch and that's something that dynosaur is playing with for now. It's kind of like the async trait crate. It's like allowing you to do what the language will eventually permit you to do and you can experiment with the design. Just the fact that you were able to use async fn in trait. Again- that's Rust 1.75, which came out a few months ago. I don't remember when exactly. There's a six-week release train.

James Munns: What are we on, 84 or something now? So it's like 10- 60 weeks or something?

Amos Wenger: 1.83 just released this morning when recording. Yeah.

James Munns: Oh, 1.83. Okay. So just about a year then.

Amos Wenger: As I've been adding features to merde, this is where I confess: initially when it was very simple, of course, it was faster than serde. With serde, if you want a type to conditionally support serde, you have to do cfg attributes.

It's kind of annoying to do. Whereas with merde derive, if you don't enable the flags, the macro just expands to nothing. So you can have it on all the time, which is great. It's very convenient.

But yeah, over time, as I added Deserialize and Serialize impls for a lot of types, like all tuples up to size 20: it's a bunch of code and, you know, rustc churns. So I would like to have this dynamic dispatch thing and see if it fixes anything.

I also thought about, if I did all this for nothing and you can just, like, instantiate types in crates so that they're cached and reused. And I don't know if I told you about this idea, James.

James Munns: You had the aha moment when I was talking about postcard-forth, you went, "_gasp_ You could do- never mind! We'll talk about that later..."

Amos Wenger: Yeah, it's about instantiating those types. Like I said, you have those generic types in the crates. So they don't really get compiled. They're ready for being monomorphized later. But if you force the crate to monomorphize them by like having a JSON feature, then anything that depends on that could just use those. But that's depending on the compiler flag... More research is needed. That's gonna be in another episode.

Thanks for coming to my show. You can use merde right now, but you really shouldn't. I know it's version 8, but even I feel funny about it. It's usable. It's just less flexible than serde, and it's questionable whether it actually builds faster, but I've had a lot of fun with it, and I'm looking forward to like adding stuff like yoke support, doing actual dynamic dispatch, maybe doing Codegen. It's aggravating to have to re-list all the fields.

I got LLMs to automate that for me, but still, I don't know, I don't like it.

James Munns: That's good to prototype. It's good to mess around because you want to be able to figure out what is and isn't good before you really commit. It's the opposite of, well, you're testing in production, but it's testing in a more scoped production, which means you get actual feedback from it and you get runtime data. And you can see if like you have asserts on and they hit and things like that.

Amos Wenger: Another thing I put the emphasis on when I was designing this is to get good diagnostics. When deserialization fails, unless you opt into like some other crates, you get very little information. It's like missing field. That's it. Somewhere deep in your document that you don't know where.

So merde's JSON implementation has this nice syntax highlighting thing and it points exactly to the path. It costs memory to keep track of those things but to me it's worth it because in an application server you absolutely want to know exactly where it failed. Some third party server returned a funny response once in a blue moon, and you absolutely want to know what happened.

So yeah, it's fun to make different trade offs from serde. I still think most people should be using serde, but I'm excited that we get to experiment with uh... other parts of the design space.

James Munns: Hell yeah!

Amos Wenger: Of course, now I want to trash everything and replace it with bytecode. No thanks to you, James.

James Munns: You're welcome.

CodeCrafters is a service for learning programming skills by doing.

CodeCrafters offers a curated list of exercises for learning programming languages like Rust or learning skills like building an interpreter. Instead of just following a tutorial, you can instead clone a repo that contains all of the boilerplate already, and make progress by running tests and pushing commits that are checked by the server, allowing you to move on to the next step.

If you enjoy learning by doing, sign up today, or use the link in the show notes to start your free trial. If you decide to upgrade, you'll get a discount and a portion of the sale will support this podcast.
