Intrusive lists for fun and profit


(on embedded)

Sure, it's unsafe, but it's not like, UNSAFE unsafe.


Show Notes

Episode Sponsor: Depot

Transcript

Amos Wenger: You have good taste, James. I'm surprised. Okay, we're gonna have to do it again because I...

James Munns: I'm surprised?

Amos Wenger: Of course I'm surprised.

James Munns: Damn, that was a compliment until the last second stab in the back.

Amos Wenger: Negging is a sport. Just don't do it to the people you're pursuing romantically. Just do it to your friends.

Amanda Majorowicz: Unless they're into it.

Amos Wenger: We're not.

Amanda Majorowicz: Bye.

Amos Wenger: Should we just cancel the recording?

James Munns: Bye, Amanda. No, we gotta get this one done.

Amos Wenger: I know, but how do you get our collective minds out of the gutter?

James Munns: Um, um, linked lists. Surprise.

Amos Wenger: So James, what do you have for us today?

Intrusive lists for fun and profit

James Munns: Okay, starting with a different word. We're gonna go over intrusive lists for fun and profit. Mostly on embedded, not entirely on embedded. All of this kind of works everywhere, but some of the things where I say it works really well are specifically on embedded. But, uh, Amos, do you remember what intrusive linked lists are?

Amos Wenger: I do. It's when the data... No, the list, it's when the data is next to the point... No, it's no pointer. It's when the pointer is part of the data, the elements? Something like that. It's good for cache locality, okay? That's what I know.

James Munns: That is true. So we touched on this a little bit in "What are you syncing about?" Cause we talked about intrusive linked lists being used in async await executors. We talked about it being used in things like wakers, where you want to have a bunch of things waiting on one task. But we'll do the quick concept reminder. So classic linked lists, kind of the opposite of what you said. They have separate allocations for the node. So the thing that actually gets chained into the linked list. So we're talking about doubly linked lists right now. So each node points to someone in front of them and someone behind them or left or right or however you want to call it. But then each node also points to its data. So you've got this unbroken chain of nodes and each of them point to one piece of data.

Amos Wenger: Right, and if you're a visual person, you can look at the slides that we have for this episode. Cause this podcast has slides. You can find them on sdr-podcast.com/episodes or on YouTube or on Spotify.

James Munns: Yeah, I got to figure out. I don't think Apple has video podcasts. I think they might have like some fancy M4A format where we can embed stuff, but I haven't looked at that and I'm probably not going to this season. Maybe that's a season three thing, probably not.

Amos Wenger: Maybe it's only for premium podcasts.

James Munns: Ooh, that's true.

Amanda Majorowicz: I'm on it, don't worry. For in between seasons, I will look into it.

"intrusive" linked lists

James Munns: Right, whereas intrusive linked lists do something slightly different. They put a header inside of the data. So they kind of co-locate the linked list header and the data together. And so you don't have two allocations per thing. You just have one allocation of two things instead of two allocations of one thing, if that makes sense. The downside for the classic mode, like you said, for cache locality, you got to follow twice as many pointers. That means more cache misses. You've got to do two allocations. Allocators take some time to do things.

So the downside with the classic flavor is that you have to do two allocations for every node and follow twice as many pointers for every node, which is kind of a bummer. But the classic upside is that items can exist in multiple lists. So if you do have those separate, especially if you're doing something Java-y, where you might want to have like a, you know, not duplicated list of strings, and you can have multiple things pointed to them and they're ref counted or something, you can maybe do this. But in intrusive linked lists, you totally can't do that. But for a lot of cases, that's not actually a problem, especially if you care about things like tasks waiting in a list instead of, hey, I want to deduplicate all of the JSON that I've received in individual strings or something like that.
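A rough sketch of the two layouts in code, for the non-visual folks. This is just the shape, not a working list; raw pointers stand in for the real link machinery:

```rust
// Classic doubly linked list: the node is one allocation, the data is
// another, so every step is two pointer chases.
struct ClassicNode<T> {
    prev: *mut ClassicNode<T>,
    next: *mut ClassicNode<T>,
    data: *mut T, // separately allocated payload
}

// Intrusive: the link "header" lives inside the item itself, so there's
// one allocation per item (or none, if the item lives in a static or a
// task), and one pointer chase per step.
struct IntrusiveNode<T> {
    prev: *mut IntrusiveNode<T>,
    next: *mut IntrusiveNode<T>,
    data: T, // payload co-located with the links
}
```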

Amos Wenger: Right, right. So with multiple nodes pointing to the same data, in Java that's easy because it's garbage collected. So it's just references, right? And the GC takes care of it. And in Rust it would be Arcs or Weaks or Rcs or... it depends,

James Munns: Something like that. Yeah, you'd want arc or ref count or some people are trying to make GC types, but you need something to make sure that that data eventually goes away when nothing's pointing to it. And similarly, it doesn't go away until that point.

Amos Wenger: I just think they should make a Rust 2 and bring back the GC because Rust used to have garbage collection. I love this fact so much.

but linked lists are bad, right?

James Munns: Yeah, it had GC, it had actor model-y stuff. It was a squishy language for a very long time. But I've been told linked lists are bad, right? There's a whole bunch of writings in Rust that say linked lists are bad. You should use something with more direct locality, something like a vec because you go, hey, my processor has a cache and it loves running through linear address space. So running down a vec, even though it feels like it should be slower, a lot of times isn't actually much slower and it makes a lot of things much, much easier. But ha ha, we're doing embedded systems and it turns out that these factors sort of get flipped on their head for embedded systems because we care about slightly different things.

Amos Wenger: Oh, right, right, right.

James Munns: Yeah, you can't have a cache miss if you don't have any cache. If all of your RAM in your processor is basically L1 cache, which is how most microcontrollers work, they have SRAM instead of DRAM, which means there's never far to go for your memory from the processor, which means that following a linked list is actually just as fast as a linear for loop because it's just changing the pointer and dereferencing it, which takes the same number of cycles, whether you're going from here to here: okay, I'm pointing for the audience, from here to here means very close, and from here to there means very far away. Doesn't really matter, it's still a pointer load and so it doesn't matter.

Amos Wenger: So when's the last time like desktop CPUs cared more about instruction counts than they did about cache locality? It's at least 10 years, right?

James Munns: A long, oh, more than 10 years, 20 years? I mean, you'd have to go back to like--

Amos Wenger: Are we that old?

James Munns: Yeah, you'd have to go back to like early 90s, late 80s maybe, like it's a long time. This is one of those things that microcontrollers are running into today. So there's two main kinds of memory that we use. This is an aside that we don't need to go into, but I'll be quick. There's two main kinds of memory. There's SRAM or static RAM and DRAM or dynamic RAM. Static RAM uses way less power, but is physically larger on silicon. And so microcontrollers use just SRAM because it's lower power and when we have little low power devices, we want it to be there. The problem is it's physically large. So at some point, getting more SRAM closer to your processor is just physically difficult and also just takes up a lot of space.

So your microcontrollers, your processors get much bigger. So most things for bulk memory tend to use DRAM, which is what you're like DDR memory is using. The problem is that uses a ton more power, but you can fit way more in a smaller package. So most desktop processors these days have a little puddle of SRAM close to the processor and we call that L1, L2, L3 cache. And then we have a big bulk DRAM further away. And this is your main memory or your DDR3, four or five LPDDR, whatever, that holds a lot of storage, but is kind of further away. It's the storage unit at the end of the block rather than right in your pocket.

But this is why we avoid linked lists a lot of times because if you're jumping through hoops in linked lists and you end up having to go to that storage facility at the end of the street instead of what happens to be in your pocket, it slows you way down. Whereas if you were smart with linear operation in a vec your processor is gonna be smart and go, "Hey, you're running this direction in memory. Let me just go pull the whole pallet out of storage and pull the whole thing in at once." And that way, if you keep going in a linear order, we're already gonna have the thing that you need. So linear iteration is great on desktop processors. But like I said, when you're made up of nothing but pockets on microcontrollers, then it doesn't cost anything to jump around. And so we get to do this for basically for free.

Mhm, mhm.

Amos Wenger: I like this because coding like it's 1987 or something is nice, because you can beat compilers. It makes a lot of sense to write assembly, not just to take advantage of newer instruction sets and do SIMD and stuff like that. You have to be clever. Like there's a whole lot of stuff that we lost, a lot of art, because computers got fast enough. So I guess if you miss those days, you can just go work in embedded.

James Munns: For a while, but it's starting to get to the point where we have external memory because at some point, this is one of those funny things about microcontrollers today versus like 1990s desktops. The processors are about the same. If anything, microcontrollers have way faster processors these days, but microcontrollers don't have as much memory as desktop systems in the 90s had. A desktop system like an Apple in 1990, would have probably had like a 16 megahertz processor, but it probably would have had a couple megabytes of RAM.

Whereas today's microcontrollers are like single or dual core, 100 megahertz, 32 bit processors, but they still only have 500K of RAM because they don't have DRAM sitting next to them. But now we start to want it. So now ESP32s and some other ones are starting to add DRAM next to it because you wanna have an eight or a 16 megabyte buffer for audio samples or video and things like that. So in 10 years, some of this will also be a little out of date because most of the like, at least the high end microcontrollers that you'd wanna run displays or audio on will have DRAM and things like that. So you start having to care about locality again, but we'll get there.

Amos Wenger: Right. And so this is just a different set of constraints, and this is what -Os, like optimize-for-size, is for, or -Oz even. So that kind of scenario, you think?

James Munns: It depends. So optimization settings change how your actual executable code is compiled. So sometimes having smaller code makes it go faster than higher optimizations, because just having less to load means that it runs faster. But we're mostly talking about memory now. So for the linked lists, we're talking about things in memory, and compiler optimizations mostly apply to the code size and things like that.

Amos Wenger: Wait, because --

James Munns: The compiler doesn't really know about cache most of the time.

Amos Wenger: The code would be in a ROM, which is different from SRAM? Like SRAM is just working memory?

James Munns: Yes. Yes. But also on a microcontroller, code can sometimes be-- okay, I'll try and keep this self-contained because maybe we'll cut this, because it's a whole aside. So code is stored on microcontrollers. Often it's on the chip itself in flash. So this is one of the other cool things about microcontrollers is they have all of their hard drive, CPU and RAM in one chip, but this is also physically limiting because on-chip flash is also large, which means that fitting more into the same package means the package has to be bigger. And the problem is if you make it in the chip, then you have to sell all the different varieties because people want a cheaper one because they go, "I don't need that much flash. So I want to save some money. So I want a smaller one," but then you have to ship 10 kinds of processors with different hard drive sizes because it's kind of like a MacBook.

It's all soldered onto the same chip. But these days, a lot of microcontrollers, it's getting more popular to have external flash. And this has the same kind of things as like external hard drives where internal flash might only have like a slight penalty for access and you might have cache as long as you're accessing linearly, it goes fairly well. Versus when you have external memory, there's higher latency on that. So you can have bigger bulk storage, but it might have higher latency. So again, this is like, not only do we have cache access penalties for RAM, we also have cache access penalties for flash. And if you're doing something hard real time, a lot of times you'll just load your whole program into RAM. So you never have to pay for flash access penalties if you care about like deterministic CPU counts and things like that.

Amos Wenger: Yeah, that's what I was thinking of, yeah.

James Munns: Yeah, but yeah, it all matters like this. But specifically for linked lists we're talking about just the RAM side of things. So less about processor optimizations.

Amos Wenger: So in the world of embedded, because the physical size of the chip matters, you can't do what they do in the desktop world, which is manufacture the same chip but disable some of it for cheaper models, or do binning, like, "those are a little bit damaged, so you just disable that part."

James Munns: They do it too. Yeah, no, no, micro controllers do that too. But I mean, there's just some limit of even in the best bin, like the physical size or the amount of power that you need to use to keep your hard drive or keep your memory online, because it costs things, just gets too high. So even the biggest model, the best bin where everything worked, it just becomes too physically large or too, it increases your baseline power usage too high. And that becomes problematic either for heat or for battery life or something like that. So embedded is hard to generalize because different people care about stuff. Some embedded systems are plugged into a wall and don't care. Some embedded systems run on a tiny battery that need to run for five years. And so they have different optimization constraints. So it's hard to ever totally generalize for micro controllers.

Amos Wenger: If you think shipping for desktop Linux is hard, just wait until you do embedded.

James Munns: Embedded systems is fun because your application is the operating system, which means you need to know everything about your system like an operating system does. Hello, Sherlock.

(Laughing)

Amos Wenger: This is a co-co-co-host.

can't have a vec if you don't have alloc

James Munns: And this is the trade-off. People were like, "Oh, you use vec because it has linear access order." That's great, but vec requires alloc, because you have to allocate that space, or if it changes over time, you have to reallocate that space. And because we don't have a lot of memory to deal with, we actually usually don't have alloc at all in embedded systems. We just say: you should figure out what kind of memory you need ahead of time, and don't do dynamic memory allocation, because it can cause a lot of variability in time, like if you have to go hunting for memory to alloc. Or you tend to need more memory with dynamic allocation, because you can have fragmentation and things like that. So if you typically need, on average, let's say 64K of memory allocated at any time, you probably need an allocator with a pool of 128 or 192 or something like that, because if it ever gets fragmented, you don't wanna go, "Well, I have space, but not in one linear shot."

Right, right.

James Munns: Sorry. So it adds a lot of variability and inconsistency and non-determinism, which we don't love so much for embedded systems. A lot of times we just don't use alloc, which means having a vec is not really an option in embedded sys-- You can do it, but a lot of times you don't. Practically you can, pragmatically you don't, if that makes sense.

Amos Wenger: Yeah, cause it's-- Jesus fucking Christ. Sherlock!

(Speaking In Foreign Language)

Amos Wenger: But don't you have the same problem with allocations for linked lists?

James Munns: Ooh, I'm excited to get there. And before we get there, one more fun fact is the... Rust--

Amos Wenger: I was gonna talk about this article!

James Munns: Yeah, so there's an excellent book called "Learning Rust With Entirely Too Many Linked Lists."

Amos Wenger: Sorry, what I call articles, other people call books. Yeah, sorry.

James Munns: (Laughing) Your articles kind of are books. But they talk about all the places where they go, "You probably should never use linked lists in Rust because there's a lot of reasons you should use vec, you should use this. If you're coming from another language, you might reach for them." But in Rust, they happen to just not be as great as you might think because they're not abstracted away. But there's a couple caveats of like, "Hey, what are some cases where you still might want linked lists in Rust?" And one of them is mumble mumble, kernel embedded something, something intrusive. And they say, "It's niche, you're talking about a situation where you're not even using your language's runtime. Is that not a red flag that you're doing something strange? It's also wildly unsafe. But sure, build your awesome zero allocation lists on the stack." And that's what we're gonna do.

Amos Wenger: This is the best slide.

James Munns: Yeah, yeah. I think this is Gankra's writing or Aria, yeah.

Amos Wenger: In the show notes, we better have a link to that because yeah, I believe that it's real. I recognize the frustration of maybe people trying to do that kind of stuff too early in Rust's lifetime. And being like, "It's a systems language, therefore it should be able to do that." And it just wasn't quite there yet. And after the hundredth person asking about it, you're like, "Okay, you're on your own."

James Munns: It's like writing any other collection: you can totally do it in Rust. And Rust has gotten better at making some of this unsafe stuff easier. It's just hard. It's not the first project you wanna do, even if you go, "Ha ha, I am a gray beard. I should know." You gotta learn Rust and you gotta learn how to write unsafe code good. And we'll get into that. Because what I'm about to show you is wildly, wildly unsafe... well, it requires unsafe code, but is safe because of some of the guarantees we'll get into. So it can be done with unsafe code, but in a safe API sort of a setup.

Amos Wenger: I just also wanted to mention that when this was written, Miri probably wasn't as good as it is now.

James Munns: Miri may not have existed when this was written.

Amos Wenger: Yeah, I have no idea what the timeline is there.

James Munns: This might've been before the days of const because Miri didn't exist until we had const. And const didn't exist till 1.30 or something. And this might predate that. It'd be good to look up. The show notes will definitely tell you whether it does or not.

Amos Wenger: Because Amanda's the best.

so how do you have a "variable" quantity w/o vec?

James Munns: Yeah. So if we don't have vec, how do we do a variable quantity of things? Like in embedded systems, if you say we might have one, two, three, four, five of something at any given time and this might change over time, how do we deal with that fact in embedded systems if we don't have something like vec where we can push and pop? Well, we usually use upper bound collections. So if you've used something like a stack vec before or heapless is a very popular crate, what you do is you go, here's a chunk with room for up to 16 items. And we track what our current len is separately from our capacity and you can push and pop things. But if you have 16 items and you go to push, it fails. And this is generally how vec itself works, but instead of failing on that 17th item, it goes, reallocates, copies over, and then goes, surprise, I had room the whole time. But actually it's doing reallocation.
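A quick sketch of that with the heapless crate James mentions, with a capacity of 4 rather than 16 for brevity:

```rust
use heapless::Vec; // fixed-capacity vec, no allocator required

fn demo() {
    // Room for up to 4 items; `len` is tracked separately from capacity.
    let mut xs: Vec<u32, 4> = Vec::new();
    for i in 0..4 {
        xs.push(i).unwrap(); // these all fit
    }
    // The 5th push fails instead of reallocating: with no allocator,
    // there's no "surprise, I had room the whole time" step, so the
    // item is handed back to you in the Err.
    assert_eq!(xs.push(99), Err(99));
}
```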

Amos Wenger: That's what I was thinking.

James Munns: These are just special flavors of that that just never do that step.

Amos Wenger: When a vec grows, you need extra space for that to happen. Just like--

James Munns: Yeah, and usually it'll grow some reasonable amount, like double or something like that. So if you have 16, it goes, okay, I'm gonna get you 32, just so I don't have to go and reallocate too often and things like that. But again, no allocator, not an option.

Amos Wenger: This reminds me of two things. When I was working on the itch desktop app and we started having larger games, like multi-gigabyte games, we had the problem where people had, for example, 15 gigabytes free, and they wanted to install a 12 gigabyte game. And if you just download the archive and then try to extract it, that doesn't work. But I did the streaming extraction thingy where the archive is actually never stored on disk, and now that worked.

James Munns: Nice.

Amos Wenger: I think that's what Steam is doing too. And the other thing I was thinking about was when you catch an out of memory error, you might not have enough memory to deal with it.

James Munns: Yeah, the nice thing in embedded is we usually, for all these upper bound collections, usually they're gonna live in one of two places. They're either gonna live in static. So a lot of times we'll just say, look, we're never gonna handle more than 16 connections ever in this firmware. That's just a hard upper limit. And so we store that as a static, which means the linker actually has an option to say, do you have enough room for this or not? You can basically do your memory allocation at compile time by filling up your statics, and you tell the compiler how much total storage you have. And if you run out of storage, then it just says, nope, does not compile, you're out of space. That's great.

But then sometimes we also put things on the stack and in embedded stack overflows are one of the most painful things to debug for exactly like what you're saying, because if I put these on the stack, especially if I try and return it or pass it up and down the stack and it makes duplicate copies, and it's very large, that's problematic. But pragmatically, most embedded systems for big, chunky things, like if they're bigger than a couple bytes or if you have a bunch of them, we're gonna put them in a static. And that way there's just always space guaranteed for it. And the compiler can say, it's allocated right here.
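A sketch of the budget-it-in-a-static idea. The critical-section crate's Mutex is one common way to get safe shared access to a static on embedded; the names and sizes here are made up:

```rust
use core::cell::RefCell;
use critical_section::Mutex;
use heapless::Vec;

#[derive(Copy, Clone)]
struct Connection {
    peer: [u8; 6],
}

// The whole pool lives in a static, so the linker accounts for it at
// link time: if these 16 slots don't fit in RAM alongside everything
// else, the firmware fails to link instead of failing at runtime.
static CONNECTIONS: Mutex<RefCell<Vec<Connection, 16>>> =
    Mutex::new(RefCell::new(Vec::new()));

fn accept(conn: Connection) -> Result<(), Connection> {
    // Briefly take a critical section to get exclusive access.
    critical_section::with(|cs| CONNECTIONS.borrow(cs).borrow_mut().push(conn))
}
```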

Amos Wenger: About stack overflows, isn't there a way, like you have a red zone? I don't know if that's what red zones are for or something, but like--

James Munns: So what a red zone is--

Amos Wenger: -- detect when it's about to blow.

James Munns: Yeah. Yeah. So another fun aside--

Amos Wenger: This whole podcast is asides in case you hadn't noticed, James. This is kind of, this is our second season.

James Munns: That's what we do.

Amos Wenger: Yeah, that's what we do.

James Munns: So most stacks grow downwards. So when you put more stuff on there, they start at some top address and they grow downwards. Growing downwards is sort of a choice, but it's kind of like big endian little endian. Most processors have decided that growing downward is the direction that stacks go. The way that you detect stack overflows on a desktop system is that your operating system will put what's called a red zone there. And what it'll do is it'll say, if anyone touches this, it throws a fault and we halt the program and that's a problem. So it kind of puts like a no man's land at some limit on your stack. So that if you increment the stack pointer and start writing into that dead zone, then it throws an error and the operating system can either choose to do something like allocate more memory or kill the program or something like that. That requires the operating system to do so and the CPU to have something that can catch that access.

Amos Wenger: Yeah, I was gonna say--

James Munns: Usually that's an MMUs job.

Amos Wenger: That's the write barrier. Yeah, that's the MMU that would catch that, like, that you're actually writing there. Cause it's not like you have an interpreter or something that can check every memory access. Well, yeah.

James Munns: Yeah, exactly. So your operating system will just say, hey, this region of memory, if anyone ever writes into it, throw an error, the operating system catches that and then does something with it. So fun fact about microcontrollers, no operating systems, no MMUs, they usually sometimes have something called an MPU which can do that protection job that we talked about, but can't do the memory, virtual memory remapping that desktops do.

Amos Wenger: I imagine MPU for a memory protection unit?

James Munns: Yep, exactly. But not every system has that and not everyone likes setting that up. So on embedded systems, it's very classic that when your linker comes along and is deciding where to put things in memory, it starts at the bottom of memory and starts putting all of your global variables and your statics and stuff like that. And it places those upwards. And then your stack starts at the top of memory. So you have the largest possible variable amount: the amount of stack memory, your max stack usage, is just the space from the top of your memory down to where you hit the first variable.

But the linker does nothing to set up any guard in between that, which means if you stack overflow, you just start scribbling over all your globals and statics. And when you don't have multiple threads, you only have one. So it just grows from the top downwards. That's one of those horrific programming things that everyone just accepted because that was the best you could do for a very long time and now it's just become standard. Newer, like new, like coming out right now processors for microcontrollers will have a CPU level stack limit where you could basically not use an MMU or an MPU and just say, if the stack ever hits this number, it throws a processor fault. Because if you like increment the stack pointer above some limit.

Amos Wenger: Right, you could watch the register.

James Munns: Yeah, yeah, yeah. So thumbv8, which is like the new Cortex-M33 and -M23 processors, like the new Raspberry Pi Pico 2, have an instruction for stack limit setting. Older processors don't. And so there's a really cool tool in embedded. The problem is with linkers, a lot of these numbers you don't really know until you're like done compiling and it's placing things. There's a cool tool from Ferrous called flip-link which does a cool trick where it compiles once, figures out what the size of the global variables are, and instead of putting them at the bottom where the stack will grow into them, it moves them to the top and puts the stack pointer just below them, so that it grows downwards. Because on most of these processors, when you hit the bottom of RAM and start writing into the next addresses that are not valid RAM, you do get a hard fault.

Like you just get a processor fault that says like, "You tried to write to memory that doesn't exist, sorry!" And so that's a much more predictable way of making sure that your stack overflows become like hard program stops versus just, "Hey, it happens to keep going, except for you just zeroed all of your executor state tracking variables, oops." And so like anytime anyone comes in and they're like, "Hey, this is weird, it only crashes sometimes," or, "It does crash in debug mode, but it doesn't crash in release mode," we go, "You should try flip-link because you probably have a stack overflow!" Like anytime anyone just goes "spooky things are happening"... because that's really weird in Rust, we almost never run into that. You had a stack overflow and you accidentally did undefined behavior to all of your global variables.

Amos Wenger: So you're saying Rust is unsafe...

James Munns: Environments are not safe. Okay, no, no, we're moving on.

Amos Wenger: "If you file open dev mem and you write to it..."

James Munns: Yeah, exactly. These are one of those things that the programming language has certain rules that your environment has to upkeep and that at some point the language just has to say, I have to assume these facts are always true. And if you violate those as the implementer, usually that's like a problem with your operating system integration, but embedded systems, everyone is writing their own operating system basically. So if you violate systemic rules, then you have systemic problems. But yeah, to avoid this, we usually put these things in static so we can see that they are there and then track them and go, hey, it's really clear. We have a baseline memory usage of this much.

The downside is you have to sort of guess, and if you guess too high, you have potential waste. If you say, hey, I'm just gonna say I have room for 16 connections, but in practice you only ever have three, you're just wasting whatever 13/16ths of that chunk of collection is, because you just never use it. And so that memory could be used for something else, or more stack, so you don't run into stack overflows as much. You kind of have to manually size these things. Too small, and you run into errors where it says, no, we couldn't allocate another socket or something like that. Too high, and you're just burning memory, which, we've talked about, is like one of your most valuable and not abundant resources on embedded systems.

And the other side is, when you're writing a library and every user wants to use a different number of sockets or connections, we put a const generic on it. So you say this is a pool of sockets that is generic over some number, which lets you say it could be four, it could be 16, it could be 32. And generics are infectious. When you have to deal with generics on multiple things, or you are making like a bigger data structure that contains a bunch of variably sized things, you can end up with like five or 10 generics on one structure that are just like the dials for what the maximum sizes for these are. And it's tremendously tedious to write that. And it looks super ugly everywhere you use it and name that type. Aliases can help, but it does not bring joy.
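The infectiousness looks roughly like this (all names invented for illustration):

```rust
// Every upper bound becomes a const generic "dial"...
struct SocketPool<const N: usize> {
    slots: [Option<u8>; N], // stand-in for real per-socket state
}

// ...and any structure that contains bounded things has to carry all of
// its parts' dials, so they pile up at every level.
struct NetStack<const SOCKETS: usize, const BUFS: usize, const TIMERS: usize> {
    sockets: SocketPool<SOCKETS>,
    bufs: [[u8; 64]; BUFS],
    timers: [u32; TIMERS],
}

// Aliases help a little with the naming, but the dials are still there.
type DefaultStack = NetStack<16, 8, 4>;
```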

Amos Wenger: Yeah, I think one of the most generics-heavy code bases that I encountered early on was Tower and Hyper, with all the service abstraction... I wrote about it actually, back in the day. And the nice thing on desktop is that you can eventually just do what axum does and just box it if it becomes out of control.

James Munns: Like box dyn or whatever?

Amos Wenger: Yeah, which I assume is not an option here on embedded.

James Munns: It is, it's a pain. I mean, boxing is not an option. We still have like dyn trait, you can still do dyn trait. The downside is like, where do you put it? Because if you can't box it, where does it go? And you can have these kind of like stack collections or like static boxes, like one-use boxes. And we use those for a bunch of things on embedded too. It's not impossible, it's just a pain in the butt, is the answer.

Amos Wenger: I see.

maybe linked lists aren't so bad?

James Munns: After all that, maybe linked lists aren't so bad, at least for embedded systems. So cool, maybe we wanna try intrusive linked lists, but how? And how do we get like these linked lists where we talked about like where do we put stuff and how does it work and how do we make it safe and things like that? Like we talked about in "What are you syncing about?" We already do this in maitake-sync because we have intrusive linked lists for wakers and things like that. So we talked about WaitCell which is just like, it's like an option waker basically. So you have room for exactly one thing, which works really great if only one person's ever gonna wait on that.

But if you have five tasks that might wanna wait on that, or in some cases three and in some cases five, how do we deal with that without having a const generic for the upper bound? And the answer is intrusive linked lists. We put wakers inside of tasks and they are pinned inside of their tasks. So they cannot be forgotten without running their destructor. And what we end up doing is we make a little linked list of all of these wakers inside of the individual tasks. So if you have five tasks that are waiting on something, each of the wakers lives inside of that task's memory and is chained into a single list. So that for example, when the thing happens that all five are waiting on, you can go through and either wake the next one or you can wake all of them because they are all waiting on the thing that has happened.

Amos Wenger: What is inside a waker?

James Munns: It's a pointer. What is it?

Amos Wenger: It's a vtable?

James Munns: Is it a vtable? I think it's a vtable and a pointer to the thing. So basically it's something--

Amos Wenger: So you're talking about the standard type from Rust.

James Munns: Well, the container is standard. The executor is responsible for providing what the contents of a waker is. So the raw waker is something that has a bunch of vtable functions and a base pointer in it. The contract in Rust is basically in that context struct that gets passed to futures when they're polled. The executor has a way of getting a waker that is relevant for the executor you're running on, but it's kind of a blind interface. So every executor defines its own waker. So an embassy waker and a Tokio waker are not the same, but they have the same like opaque contract, if that makes sense. Yeah, and the waker is basically like a callback that takes a task and moves it into the ready state for the executor. So it tells the executor, hey, this task of yours is ready. Whenever you get a chance, go check on them.
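For reference, the shape of that opaque contract: a data pointer plus a vtable of function pointers. A minimal do-nothing waker looks like this; a real executor's vtable entries would requeue the task instead:

```rust
use core::task::{RawWaker, RawWakerVTable, Waker};

// The four vtable entries: clone, wake, wake_by_ref, drop. A real
// executor's versions would move the task back into its ready queue.
static VTABLE: RawWakerVTable = RawWakerVTable::new(
    |data| RawWaker::new(data, &VTABLE), // clone
    |_data| {},                          // wake (by value)
    |_data| {},                          // wake_by_ref
    |_data| {},                          // drop
);

fn noop_waker() -> Waker {
    // SAFETY: these vtable functions never dereference the data
    // pointer, so a null pointer is fine here.
    unsafe { Waker::from_raw(RawWaker::new(core::ptr::null(), &VTABLE)) }
}
```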

Amos Wenger: You mentioned that it can't be dropped without its destructor being run, but what about mem forget?

James Munns: That's the whole reason for pin's existence. You wrote "Pin and Suffering." So the whole point of pin is that it is impossible to forget something without running the destructor of it. So pin makes sure that something is stable and in place. And it means that you can't forget it, because you never have ownership of it again. So in most cases, you either box pin it, which only gives you a mut ref, which means you can't run forget on it, or it'll be leaked, so it will exist forever without the destructor running, and that's fine. But for stackful pinning, like when you call the pin macro, it rebinds the same name and basically hides the item from you. So you can't call mem forget, because you need to own the thing to call mem forget. You can mem forget the reference to it, but that still allows the original item, which now has a nameless existence, to still run the destructor at the end of scope.
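A tiny sketch of that rebinding trick with core::pin::pin!:

```rust
use core::mem;
use core::pin::pin;

fn demo() {
    // `pin!` moves the future into a hidden local and rebinds the name
    // as a Pin<&mut _>, so the value itself can no longer be named.
    let fut = pin!(async { 42 });

    // You can forget the *reference*...
    mem::forget(fut);

    // ...but the hidden local still runs its destructor at the end of
    // this scope, which is the guarantee intrusive nodes rely on.
}
```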

Amos Wenger: Sure, I know, I wrote about it, James. It was a while ago, so you're right to remind me. But there's no mention of forget in either my piece or the official documentation for pin. This is why "the whole point of pin" is something I hadn't heard before. So that's why I'm surprised.

James Munns: Interesting.

Amos Wenger: Yeah.

James Munns: Yeah. I mean, it's not the whole point. It is one of the major points of pin. Pin does two things: it makes sure that the item can't be moved, and it makes sure the item can't be forgotten. As I understand it, those are the two main things. I think the docs for pin even talk about intrusive stackful linked lists. I should have put a screenshot in here too, but I think the docs for pin actually have like, "What can you do with pin? You can use intrusive linked lists, because it lets you do this!" And yes, it's very spooky, but these are the kind of guarantees that you can expect with pin. So when we have our list of wakers--

Amos Wenger: Fact check, they do not talk about intrusive linked list. They talk about no std context where you don't have access to the standard library or allocation in general.

James Munns: I'd have to look, it might be at the pin module level instead of the pin type level. I think someone wrote that. I'd have to go find it.

Amos Wenger: You are correct. I was wrong. In std::pin, the module itself, it's talking about self-referential types and intrusive data structures.

James Munns: Yeah. I talked to boats about it, because, yeah, boats found it interesting, because it's exactly the kind of thing you can do with pin. Yeah. So we have all of these tasks and we go: where do these wakers live? They live in the tasks, wherever they exist, because we know that the future that the task is running is pinned. And so we can pin items inside of that task and they exist until they don't. And then if we cancel the task and the future falls out of scope, we run the destructor. And when we run the destructor, we remove the item from the list. So even though we have this list, we know that anything that's on the list is valid, because if it wasn't valid, we would have removed it. And anyone who is on the list must exist, because to create something to attach it to the list, it had to exist at that point.

And we know that if it still exists there, then it hasn't become invalid at any point. So we're safe to assume that no matter how spooky following all these pointers is, it must be valid because we can rely on pin as long as no one's done any unsafe incorrectly. In safe Rust, it's safe to assume these things. And similarly, if we drop the whole WaitQueue the job of dropping the WaitQueue means running down the list and removing everyone from the list. And this is why it has to be a doubly linked list and not a singly linked list, because every node has to be able to remove itself. So it has to be able to like, if it decides I'm gonna drop myself, I have to grab my two buddies to the left and right of me, tie them back together, and then drop myself out. Otherwise, we could do some really cool lock free programming. But since the nodes have to be able to remove themselves from the list, and we don't wanna change their order, it has to be a doubly linked list.
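The grab-my-two-buddies step as a bare sketch: no locking, no list-head fixup, just the pointer surgery that needs a prev pointer to exist:

```rust
use core::ptr;

struct Node {
    prev: *mut Node,
    next: *mut Node,
    // ...the item's data lives here too, in the intrusive layout
}

// SAFETY: the caller must hold the list lock, and `node` must be a
// valid, linked node. (The real thing also fixes up the list's head.)
unsafe fn unlink(node: *mut Node) {
    unsafe {
        let prev = (*node).prev;
        let next = (*node).next;
        // Tie the two neighbors back together...
        if !prev.is_null() {
            (*prev).next = next;
        }
        if !next.is_null() {
            (*next).prev = prev;
        }
        // ...and drop ourselves out of the chain.
        (*node).prev = ptr::null_mut();
        (*node).next = ptr::null_mut();
    }
}
```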

Amos Wenger: Well, do you need to maintain the structure of the list? Could you just not care about all the pointers? You'd just navigate it in one direction and then free everything?

James Munns: Hard maybe, you'd have to go ask Vyukov about that. Okay. So Vyukov is the person who writes a lot of those lock free data structures. The answer is maybe. There are cases where you can use it.

Amos Wenger: It wouldn't be unwind safe anymore. If you panic in the middle of that, you're in a very bad place. So maybe that's the reason.

James Munns: Yeah, yeah, yeah. Unwind safe and panic safety as a whole. Yeah, that's a topic for another chat. Another podcast, really.

(Laughs)

James Munns: Pin guarantees we can't skip the drop. If we drop, we remove ourselves, or we remove the whole list, great. Then it doesn't really matter that these things don't have lifetimes that we can super explain to the borrow checker, but as unsafe programmers, we say it exists as long as it exists. And then that's the fulfilling of the unsafe guarantees there. That's our safety comment, and we know it's load-bearing.

I made a crate for this: PinList

James Munns: But I made a crate for this called PinList. Very unimaginatively named, because it takes pinned items and puts them in a linked list. The way it works is you usually have your list in some kind of static. This becomes your anchor point. You have a global static in your project that says: this is the list where a variable list of items will go. You go, it lives right here. It's a big flag in the sand for everyone to go find it. And then every time you have something you wanna add to that list, that can live inside of a task, or even it could live inside of the stack. And you say, I'm gonna make a node right here. And it has a header with my linked list pointers and whatever data that I want. And all the data is the same kind across the list. And we're gonna chain it onto this list. So when I wanna push something onto this list, I go to the list, I chain it onto this, and I create my node item.

So actually what I do is I pin it in place first. Then after it's pinned, you can attach the pinned item onto the list. And then it exists on that list until either someone removes it or it removes itself by dropping the item. This is really, really neat because it means that tasks can lend their ephemeral data to the list. So if you say, this is a socket that holds whatever, and I need one here, depending on how many tasks you have, you don't need to go and fudge this number of what's the perfect min max number or whatever. Anywhere you create one, it exists. And you don't necessarily need to worry about them because if they exist at different stages of the tasks and things like that, you always know there's enough room because if it exists in the task, there must be storage for it. Therefore, if it exists in the list, it's good to use.

But wait, I just said I'm gonna put something in a static and then I'm gonna point to non-static data from that static. And that's like Rust's whole thing: you're really not supposed to be able to do that, because you're taking items that have a lifetime that does not live as long as static, and you're storing them in a static, and then calling that static lifetime. How do we do that? And the answer is it's unsafe. Like we use unsafe code to do that, but we can put a safe exterior on interiorly unsafe code, because that's how all abstractions in Rust work: you limit the blast radius and then you put a safe outer shell on it. And then you say, anyone who uses this in safe code never has to worry about unsafety, because I've done all the spooky stuff inside. The trick here that I didn't really mention is that you can only access the items on the list while you're holding a blocking mutex. Not an async mutex, a blocking mutex.

And what this means is that when you add stuff or remove stuff, you have to hold the mutex so you can access the list and add or remove something from it. Or if you wanna run down all the items, like I wanna iterate over all my sockets, you need to be holding a mutex for the whole time that you are running across that list because the mutex acts not only as mediating exclusive access, so like being able to do mutable things to each item on the list, it's also preventing you from adding or dropping anything to that list. So an item can't drop itself off the list if the mutex is already being held.
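The access pattern, sketched with desktop stand-ins: the names are invented, and it's backed by a std Mutex and a Vec purely so the sketch compiles on its own, where the real thing is an intrusive list behind a blocking mutex with no allocation. The point is the contract: you only ever touch items inside a closure that runs while the lock is held, so references can't escape and go stale:

```rust
use std::sync::Mutex;

struct SharedList<T> {
    items: Mutex<Vec<T>>,
}

impl<T> SharedList<T> {
    const fn new() -> Self {
        SharedList { items: Mutex::new(Vec::new()) }
    }

    fn attach(&self, item: T) {
        // Adding (like removing) requires holding the blocking mutex.
        self.items.lock().unwrap().push(item);
    }

    // Iterate the items; the &mut T only lives inside the closure, so
    // nothing can be added or dropped while you're walking the list.
    fn with_each(&self, mut f: impl FnMut(&mut T)) {
        let mut guard = self.items.lock().unwrap();
        for item in guard.iter_mut() {
            f(item);
        }
    }
}

// The static "flag in the sand" everyone can find.
static SOCKETS: SharedList<u32> = SharedList::new();

fn main() {
    SOCKETS.attach(7);
    SOCKETS.with_each(|s| *s += 1);
}
```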

Amos Wenger: Yeah, I was gonna ask, well, what's preventing you from doing that? Does it just throw, not throw, but panic? How does it check, does it block indefinitely?

James Munns: Well, it's just a blocking mutex. So if you try and get the mutex in your destructor, it's just gonna block until it's able to get that mutex. So if someone's iterating over it, you have to wait until they're done iterating, they release the mutex, then the destructor can grab the mutex, remove itself, and go on. But the code just blocks if you try and do this.

Amos Wenger: Yeah, is there any provision against deadlocks? Yep. So the way you access the mutex, like I'm looking at the API right now, and there's a with_lock method on a node handle, and it takes a closure, an FnOnce. What if you drop another node handle in there, then you have a deadlock, right?

James Munns: You would, if you're accessing it from the node itself, let me go back to the picture. If you're accessing it from the node itself, you typically in your local scope only have access to your own node because you've declared it there. The only thing you have in other scopes or your only way of seeing other nodes is through the PinList method that gives you access to the other items. I think, yes, you probably could make it deadlock in some way if you like, I will make two nodes here that I have local context to, and while I'm holding the mutex, I will try and drop the node. But actually with pinning, it's very difficult to manually drop something because you don't have access to the original item. So the only way to really make things trigger drop is to make them fall out of scope. And to make them fall out of scope, you have to return, and if you're inside a scope where you're doing this stuff, you can't really return from the scope while you're doing something else, if that makes sense. You probably could, in a very convoluted way, shoot yourself in the foot and cause it to deadlock. It's not particularly easy to do with the interfaces I've given you here.

Amos Wenger: Well, I guess the easy case is just two nested calls of with_lock, but then it's pretty visible, like that wouldn't pass code review.

James Munns: Yeah, yeah. You'd be in a blocking context and calling, you'd be calling lock to get access to the thing you already have access to. So it would, yeah, you probably could deadlock. And at that point, like a std mutex, if you try and lock it while it's locked: if you're in a different thread, it will just block; if you're in the same thread, it will just panic, because it'll detect that. On embedded systems, we don't have the operating system locks.

Amos Wenger: So when are you adding a deadlock detector? Just like, what's the--

James Munns: The answer is I kind of actually do. Cause on desktop, if you're in the same thread and you lock it again and it would deadlock, std mutex will already panic. On embedded systems, we don't have std mutex, but I have a crate called mutex, and it uses what's called a scoped blocking mutex. On embedded systems, to handle this, we're gonna take like a critical section or whatever. And if you go to lock it again, we will see that it's already locked. And so we can do the same deadlock detection where it'll just panic. So if you deadlock, it will panic, but the answer is it's exceedingly hard to deadlock unless you're trying, I would say.

Amos Wenger: All right. Well, good thing deadlock detection is super easy with a single thread.

James Munns: Yeah, yeah. And running drop means holding the mutex. So actually running the destructor here means that if you're not holding the mutex, you have to wait. And if you're holding the mutex so you can drop, you know that there are no other live views of the list, which means there's no problem because no one can have like an item spookily pulled from under them. And the interface for gaining access to this doesn't let you return a reference to an item on the list outside of the closure of holding the lock, which means you can never like return an item beyond the actual call, which means all your access of the items has to be done with the mutex locked. You can't just hold on to a reference that could get stale because someone could decide to drop it, which means you do all of your actions immediately rather than holding on to something and doing it later.

Amos Wenger: This feels weird, because there's pin, so I would expect async, but no, it's all blocking. It's all sync-ness.

James Munns: Yeah, that's correct. Yeah, pin is not exclusive to async, and we do need to do it blocking because there's no async drop right now. So once we get async drop, I could probably loosen this restriction and be able to do async drop here, where something can yield until it's able to remove the item from the list. You'd still only be able to do it with the mutex held. I do actually have another variant of this that I'll mention at the end. If you make all of your items static instead of pinned, so that they can never be dropped, you actually can use an async mutex for this and not worry about it, because then items are always live, because they are static items and they can never be dropped. And I actually use that in one of the libraries I'm gonna mention, because we do want async for some things like this, but you actually can--

Amos Wenger: But then why do you need a linked list if it's only static items? Like it's just to maintain an order?

since as long as it's on the list, it exists...

James Munns: I'll get to it in a second. Back to talking about holding non-static things in a static item: it's good enough, because we know that as long as it's on the list, it lives long enough. And really, static doesn't mean lives forever. It means: how old are you, old enough to drink, kind of thing... that's probably a joke that's not gonna make a lot of sense. So the answer is it just needs to exist as long as anyone could possibly want it. Sometimes that means forever. Sometimes that just means long enough. And so that's what we're doing here. Everything lives long enough, because it lives as long as it's on the list. And if it's not alive, then it's not on the list.

Amos Wenger: Static is sometimes code for just owned. Like I think if it is owned or self-contained or something, it's not borrowed from somewhere else. So we control this lifetime.

James Munns: I really think of it as lives long enough because if it's owned, it always lives long enough because it lives as long as you hold it. And if it's in a static and lives forever, obviously it lives enough because it lives forever and forever is always long enough. You're right, there's like in between cases between owned and like a static forever reference where it really does mean it lives as long as it could ever need to. So that's usually how I think of tick static is just it lives as long as it could possibly ever need to. And that means for as long as someone knows about it, it exists. And for owned things, only one person can know about it or as long as the borrow lasts and things like that.

So yeah, I think a good way to think of static is it lives as long as someone could possibly want it to. So the reason why you might want something like this, both in the pin case and in the everything's-a-static case where it's a variable item: it lets you do a very cool thing of separating ownership of resources from a worker of those resources. So what I mean by that is, because you have this flag in the sand where everyone can meet, either to add items to the list or to access the list, you can have your IO worker task independent from all the items that are storing things, which means you can do cool things with different libraries. So one library that I've written is called cfg-noodle. It was a silly placeholder name.

Amos Wenger: Love that.

James Munns: This is one of those things, I have a habit of adding placeholder names. This was originally written because I was just noodling around. I was trying to figure out what to do before it. So like on a guitar, when you're noodling around, it's just like improv and not really doing anything meaningful. So I named the repo cfg-noodle because I was just noodling around to figure out what I wanted to do. And then the customer never came up with a better name. And so now the library is just released as cfg-noodle. But it does things where it holds storage. Like you can have a config item that exists in all the tasks.

And then because that storage in the config exists on external storage, you have one IO worker task that just whenever any of the config nodes says, "I would like to store data." Or at the beginning at boot up, they go, "I need to be hydrated with my data." They can just raise a flag and the IO worker, which is totally separate, can just access the external memory, populate all the items on the list, or grab the items on the list and flush them back to external memory. So instead of everyone having to hold like a reference to things where the generics have to be consistent because all the items have to be of the same type or whatever, you can just say, "Okay, IO worker lives over here. And then all the nodes live over here. And they just exist in the linked list."

And by having a node that exists in the linked list, you get the benefit of someone running the IO task for you. And your IO task is totally independent where it doesn't have to have references to however many tasks you have. It just says, "I just run down the list whenever someone asked me to, and I either flush data to the disc or I load data from the disc." And it works really well. And this way you can basically have storage for keeping all the items cached wherever they are. So anywhere where you have the item and you would like to have access to it, that's where your cache exists. So you don't need to figure out exactly how much cache I need. You just have the right size of cache everywhere where you have an item and it just exists inside of the task. So it's always right sized. You don't have to like tune any values to figure out how much you need. That is really cool.

Amanda Majorowicz: Yeah, yeah, yeah.

James Munns: This is like a very classic way of doing things on embedded systems or kernels. It's just not usual that you have a safe interface to it, because the language is expressive enough to... like, it can't totally make the guarantees, you still have to write the unsafe under the hood, but it can at least express that on the exterior so you can use it safely, if that makes sense.

Amos Wenger: Yeah, you don't have to write any unsafe code to use PinList, right?

James Munns: Correct, correct. Yeah, I've written all the unsafe for you. So as long as you just use it with its public API, it'll always work and you won't have to write any of your own unsafe code.

Amos Wenger: Yeah, that reminds me of the Rust and Linux work where they made safe interfaces for a bunch of stuff. And a bunch of people were like, oh, this looks complicated. I'll do that one again.

(Laughing)

James Munns: Turns out this was all stuff that you had to do anyway, it was just implicit.

Amos Wenger: Yeah, yeah, exactly. And it's like now we're just annotating that and this can never go wrong because of that.

James Munns: Yeah, we can describe the contract in the way that the compiler can enforce. Even though we can't describe what's going on under the hood in a way that the compiler can understand because there's pointers and whatever, we can at least explain the contract in a way that the compiler can enforce, which is super useful because the person who knows the most about this is the implementer. And most of the time the people who know the least about it are the users of it. And so you wanna protect all of your users as long as one maintainer understands it and can rely on the compiler enforcing that rule.

Amos Wenger: That's one of my favorite things to explain about Rust, especially when people are frustrated about it. It's that the number of programs that the Rust compiler will accept is very small compared to the number of programs that are memory safe. But that's the compromise we make because all the programs that are accepted are safe. So you don't have to worry about it.

ergot: embedded networking library

James Munns: Exactly. And the other place that I'm using this is in Ergot, which is my networking or message communication library for embedded systems, where all of my sockets work basically the same way. The sockets are things that you hold in the tasks that you're using them from. And there's IO worker tasks where if you have an external interface and a packet comes in, it can run through that list, find the destination for that packet and drop it in the correct socket. Or it can run to the end and say, I didn't find anyone, send a negative acknowledgement back or like a NAK and things like that. And that way it doesn't matter how many sockets you have. Or if you go, oh, I only have this socket open in this mode, or I only wanna open a socket just long enough to send a request, get my response, and then immediately drop my response because I'm only waiting on one response here.

So you can have ephemeral sockets or sockets that live forever or as many as you want and things like that. And the IO worker task that's taking these incoming packets just goes, hey, look at the header, figure out the destination, is that here? Do we have a socket like this? Yes, we do. Drop it in and then move on. And because any socket can then just send directly in the outgoing queue, it means that now you don't have to tightly coordinate in all of these things. And you get the same kind of feeling interface that an operating system socket library gives you, but it's all in user space. Like it's all just linked together in a way that keeps this reasonable.

And then, as a firmware developer, you get something that feels like something an operating system would give you, but it's actually totally done in firmware. There's a little bit of code behind the library, but there's no magic behind the operating system, because there is no operating system. Like I said, it's a pretty handy pattern, and it's how I've been building a bunch of libraries. This is a streak I've been on: I kind of went back and looked at WaitQueue and went, "I wish we had that for more general stuff, because I could use that for a bunch of stuff." And then I started using it in ergot, then I started using it in cfg-noodle, and then eventually someone was talking to me about it and was like, "Hey, how would I do this?" And I went, "You should use this pattern. Here's a link to like 500 lines of unsafe code. Just do what I do and also hold it right." And I went, "Well, that's not great."

So I eventually extracted Pinlist. Actually, Pinlist only does one of the two tricks. Both cfg-noodle and ergot take this one step further: Pinlist requires that all the items in the linked list are of the same type, so when you iterate over them, you get the same type at every node. But ergot and cfg-noodle, and also async executors that do this kind of stuff, don't. Every item has a common header, but then you have basically a dyn object below that pointer. So all those sockets that I talked about, they all have differently typed deserializers for the messages they're given: one socket might be of one type and another socket might be of another type.

And same with configuration: one configuration element might be of one type and another of another type. And what you do is you actually have a vtable in all of the nodes. And so the IO worker that's coming along can go, "Is this for you?" "Yes, it is for you." "Cool, I'm gonna call your vtable method with like a raw pointer and just do it." So the second stage of this is actually really spooky type erasure. I think someone actually sort of figured out how to get dyn trait working in Pinlist; I'd have to check whether that PR still works and whether it's still sound.

But the vibe of what cfg-noodle and ergot are actually doing is basically having a linked list of dyn trait items so that you can hand them a packet and go, "Did you like that?" And then they have at least a consistent interface to say, "Yes, I did and I took that message." Or, "No, I didn't, I don't want this." And that way the IO worker, that's all the IO worker needs to know. They don't need to know what it did with it. They just wanted to know, "Did I deliver it to you or not?" So that's the spooky second half. It's a whole nother episode, but just storing items is a pretty neat technique. And then you can take that neat technique a little further at some point if you want.
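One hypothetical way to write the header-plus-vtable trick down; this is not the actual ergot or cfg-noodle code, just the shape of it:

```rust
// Illustrative only; the real crates differ in the details.
#[repr(C)]
struct Header {
    // the intrusive prev/next links would live here too
    vtable: &'static NodeVtable,
}

struct NodeVtable {
    // "did you take this packet?", called with a type-erased pointer
    try_deliver: unsafe fn(node: *mut Header, bytes: &[u8]) -> bool,
}

// Every concrete node starts with the Header, so a *mut Header and a
// *mut Node<T> point at the same address (repr(C) pins the layout).
#[repr(C)]
struct Node<T> {
    header: Header,
    state: Option<T>,
}

trait FromBytes: Sized {
    fn from_bytes(bytes: &[u8]) -> Option<Self>;
}

// One copy of this is monomorphized per node type; its address goes
// into that type's vtable.
unsafe fn try_deliver_for<T: FromBytes>(node: *mut Header, bytes: &[u8]) -> bool {
    let node = node.cast::<Node<T>>();
    match T::from_bytes(bytes) {
        Some(msg) => {
            (*node).state = Some(msg); // deserialize into the typed slot
            true // "yes, I took that message"
        }
        None => false, // "no, I don't want this"
    }
}

impl<T: FromBytes> Node<T> {
    const VTABLE: NodeVtable = NodeVtable {
        try_deliver: try_deliver_for::<T>,
    };
}
```

The list only ever sees `Header`s; the concrete `T` is recovered inside the per-type function, which is exactly the kind of cast you have to hold right.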

Amos Wenger: So with items of different sizes, you still need some sort of dynamic memory allocation.

James Munns: You do not.

Amos Wenger: You don't.

James Munns: Because they are stored inside of the task, and the task knows what type they are. That's the trade-off: the list doesn't know what type the items are, but the nodes do. So when you create the item inside the task, it knows the specific type it is, but it just has the common header at the top of the item, and that's what exists in the list. So they can be of different sizes in every single task.

Amos Wenger: Right, this can all happen on the stack is what you're saying, or in a static.

James Munns: Yep, in embedded we actually statically allocate our tasks, usually. They exist as basically reusable boxes for specific tasks, because in Rust now we can figure out the size of a future. And if we can figure out the size of a future when we're creating the task, we can make a perfectly sized static that is the size of this specific task. And then we just say we can have one of those, or five of those, or ten of those, or whatever. So it's kind of magic in that it is stored in a static, but it's the compiler figuring out how large that static needs to be. And then you don't have to think about it.

And you just know that it's bounded: okay, I can have up to four of these exist at the same time. So we are kind of doing that fixed-upper-bound kind of thing, but most of it's hidden in the executor's plumbing where you don't have to care; you are still deciding at compile time how much memory you need for all of your tasks. But the thing is, these sockets or whatever could just be one leaf future state inside of that whole task. So if part of the time I have this socket open and another part of the time I have that socket open, you don't need to reserve space for two sockets, because the compiler is smart enough to tell that these two states of the future are never live at the same time. So, kind of like an enum, I only need the max size, not the sum of all sizes: just the max of all leaf states of a future.
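A small illustration of the "max, not sum" point; whether the compiler actually overlaps the slots can vary by version, but disjoint scopes like these are exactly the case that layout optimization targets:

```rust
// Two 1 KiB buffers, never live across the same await, so the
// future's layout can reuse one slot for both, like enum variants:
// roughly 1 KiB plus bookkeeping, rather than 2 KiB.
async fn request_response() {
    {
        let req_buf = [0u8; 1024];
        send(&req_buf).await;
    } // req_buf is dead past this point
    {
        let mut resp_buf = [0u8; 1024];
        recv(&mut resp_buf).await;
    }
}

async fn send(_buf: &[u8]) {}
async fn recv(_buf: &mut [u8]) {}

fn main() {
    // The whole state machine has a compile-time size, which is what
    // lets an embedded executor give it a perfectly sized static.
    println!("{} bytes", core::mem::size_of_val(&request_response()));
}
```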

Amos Wenger: That reminds me of an episode where we argued a little bit about how easy it is to write async code. And I think in the comments someone said, "Amos, actually you're wrong. Like the tooling is good now, it's fine." So.

James Munns: It got better. There's still some stuff with rough edges, and I think both of us are on the same page of: we know the places where it sucks, and we know the places where it's actually pretty good. And we know that the set of places where it's pretty good is getting bigger over time, but there's still a non-zero set of things that kind of suck to debug. But yeah, it is getting better.

Amos Wenger: So how many of those libraries, well, those crates are you maintaining now? How big is the dependency tree that is your own under your network stack?

James Munns: The network stack... I'd have to look at the dependencies. So there's three libraries, basically where I'm using this trick. I listed all of them. There's ergot, cfg-noodle and Pinlist.

Amos Wenger: Yeah, but it's not the only pattern, the only like idea you've turned into a crate, right? You've been publishing a bunch of those.

James Munns: Of this one so far, yes. Cause I really only started doing this a couple months ago, but I'm sure I will have more by the time that probably anyone's listening to this, but the main two dependencies to make this happen are Cordyceps, which we talked about in "What are you syncing about?" It's the intrusive linked list library with the best name ever, cause it's named after a fungus that invades your brain and makes you do stuff. So the intrusive linked list library is called Cordyceps. That's from Eliza and I guess I help out with that library. So it's Eliza's library, but I have made a lot of PRs to it.

Amos Wenger: Right.

James Munns: And that's used also by maitake-sync. So WaitQueue and WaitCell are using Cordyceps under the hood as well. And then all three crates that I named are using this for the intrusive linked list. Then for that blocking mutex, there's a crate called mutex, which I also own with Eliza, and that's used in all three of those crates. Actually, no, I don't think it's used in cfg-noodle, because like I said, we made them static, so we don't need a blocking mutex. We just say that every node lives forever. So it's not using that, but those are the only two main building blocks. And I think Cordyceps has no direct dependencies, or maybe just one or two small helpers, and for the mutex library, I think it depends on what feature you use. On embedded systems, you want the critical-section crate, and then on desktop it's just using std's mutex, if I remember correctly. So it's not a deep dependency list.

Amos Wenger: In fact, Cordyceps has no dependencies. It has like three dev dependencies and a loom feature that enables loom and tracing, for testing, I'm assuming.

James Munns: That makes sense. And the mutex crate is probably just generic over "you tell me how we can make a mutex," because that way on desktop you can do one thing and on embedded you can do another thing. So it probably has a couple of dependencies depending on which feature you enable, but in practice you're probably only ever enabling one of them. So it'll be either the Cortex-M crate, or critical-section, or std, maybe actually just critical-section with the std feature enabled. So you don't really need a lot of deps to do this, because the depth of the unsafe code that you need here is pretty shallow, just for the data structures. And then what you do with it on top of that is sort of the question.
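For reference, the critical-section side of that looks roughly like this; `critical_section::with` and `Mutex` are the crate's real API, but the list head is a stand-in (and on desktop you'd enable the crate's `std` feature, which backs the critical section with a global lock):

```rust
use core::cell::RefCell;
use critical_section::Mutex;

// Stand-in for the list head; the real thing holds a raw pointer to
// the first node.
struct ListHead {
    len: usize,
}

// Shared between tasks and interrupt handlers on embedded, or between
// threads on desktop.
static LIST: Mutex<RefCell<ListHead>> = Mutex::new(RefCell::new(ListHead { len: 0 }));

fn with_list<R>(f: impl FnOnce(&mut ListHead) -> R) -> R {
    // Inside the closure, nothing else (interrupt, other core, other
    // thread) can observe the list mid-edit.
    critical_section::with(|cs| f(&mut *LIST.borrow(cs).borrow_mut()))
}
```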

So cfg-noodle then relies on the CBOR library, which is a serialization and deserialization library, which is how you implement those vtable methods for loading and storing, and then some other storage libraries for how you store stuff externally. For ergot, ergot-base is where I do all of this stuff, and then ergot is the nicer, higher-level interface over it. I'd have to look: ergot's like 10,000 lines of code at this point, and I don't know exactly what my load-bearing deps are there, but there's more, because I'm doing stuff like postcard and serde, and I'm using a bunch of crates from the postcard ecosystem in there.

So those probably have a bunch of recursive deps as well. Yeah, for just this pattern, you need like two libraries. You could totally do it yourself if you wanted, but Cordyceps is a really good library, and the mutex library is exactly what you want for this. So probably two in most cases, unless you're doing additional spooky things, but then it's just whatever.

Amos Wenger: Just looking forward to seeing the James cinematic universe... grow!

James Munns: Yeah, yeah, yeah. We'll get to ergot probably next season. I was thinking about doing ergot for this last episode, but I could do a season on ergot so.

Amos Wenger: We both decided it was too much. Yeah, yeah.

James Munns: Yeah, so maybe season three is just gonna be the ergot season at this point. Exactly. I think that's a good end.

Episode Sponsor

This episode is sponsored by Depot: the build acceleration platform that's on a mission to make all builds near instant. If you're tired of watching your builds in GitHub Actions crawl like the modern-day equivalent of paint drying, give Depot's GitHub Actions runners a try. They’re up to 10x faster, with unlimited concurrency, faster caching, support for Linux, macOS, and Windows, and they plug right into other Depot optimizations like accelerated container image builds and remote caching for Bazel, Turborepo, Gradle, and more.

Depot was built by developers who were tired of wasting time waiting on builds instead of shipping. It's made for teams that want to move faster and stay focused on what actually matters.

That’s why companies like PostHog use Depot to cut build times from over 3 hours to just 3 minutes, saving tens of thousands of build hours every week.

Start your free 7-day trial at depot.dev and let them know we sent you.