How Usable is the Rust Cranelift Backend Today?


An experience report of attempting, and MOSTLY succeeding at, using the Cranelift backend for real macOS applications

Presentation unavailable

Audio

Download as M4A

Show Notes

Episode Sponsor: fasterthanlime

Transcript

Amos Wenger: Okay, let's go. So, this week I want to talk about: I want everyone to congratulate me on being so generous. Not really. Okay. So here's what happened. The Rust compiler: it's a machine that turns your code into a metric ton of LLVM intermediate representation. And then LLVM turns that into native code, and then your linker just sleeps in a loop, probably, because it's that slow.

James Munns: Puts a big bow around it.

Amos Wenger: Magically, yeah, you have an executable file appear on your disk and it's between, let's say, two and 700 megabytes, depending on where you work and how much debug information you pull in. That's basically what the Rust compiler does. Actually... well, no, no- that's the compiler's side gig, I think that was the talk Esteban gave. So the main thing the Rust compiler does is parse and emit diagnostics. And then as a side gig, it generates native code. But, problems with LLVM: it's a big pile of C++, it's a research project that has been productionized and then remained a research project. So there's a lot of things happening. Like, I think they turned their register allocation heuristics from code into, like, a model. Like, let's just train a model, 'cause it's going to be better than whatever heuristics we can come up with. So there's things like that. There's like really strange vectorization things happening. LLVM is still research ground and-

James Munns: So these are all optimizations. Like we have some code that comes in and LLVM goes, "I see what you're doing. There's a better way to do that." And without actually knowing the source code form or with limited access to it just goes, "No, no, no. It's better like this."

Amos Wenger: Yeah, yeah, yeah. And the way it actually works is like it transforms the input into different intermediate representations. And then using those forms, it's easier to analyze like what you can optimize because you, you realize: Oh, those two things come from the same computation. So we can just merge them or whatever.

There's a lot of that going around. So LLVM is a blessing and a curse, because at the point they made the choice to use it- let's say I'm not familiar with the entire history- it was the good, obvious choice. And since then, another project has happened, for WebAssembly reasons, called Cranelift, but it does happen to generate AMD64 code, or x86-64 or whatever you want to call it, and a bunch of other architectures now as well.

I don't exactly know who had the idea at first, but they were like, "Well, Cranelift, the WebAssembly-based project- the backend is Rust. The Rust compiler is Rust. Let's arrange something." I don't have any of the dates or the timeline off the top of my head, but it went as far as, like, a few months ago on the official Rust blog, they talked about using that backend, the Cranelift codegen backend, for debug builds. And it is now distributed via rustup, for at least 64-bit Linux. And I was very interested in it, but as it happens, I switched entirely over to Macs for a very simple reason. It's that macOS is the only operating system that works on desktop in 2024. Because you have Linux, which is, you know, the year of Linux on the desktop is any year now.
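(For reference, enabling that backend on a nightly toolchain looks roughly like this- the component name and config keys below follow the instructions published alongside the rustup distribution, so treat this as a sketch rather than a guarantee:)

    # Install the preview component for the nightly toolchain:
    rustup component add rustc-codegen-cranelift-preview --toolchain nightly

    # .cargo/config.toml - opt the dev profile into the Cranelift backend:
    [unstable]
    codegen-backend = true

    [profile.dev]
    codegen-backend = "cranelift"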

And then you have Windows, which is being actively made worse with every passing month. I defended it, until I couldn't. I don't know. I just have so many workflows. I was like, "Well, this is great, but it only works on Mac." And I was like, "What if I only had Macs? Then everything is great." Essentially, except when AirDrop doesn't work.

James Munns: For me, it was just having a reasonable ARM architecture processor, because like the M-series Macs versus any other laptop- like on a desktop, it's more debatable, I feel like, but at least in terms of like efficiency and performance for a mobile device, like a laptop, which I use very often. I was looking at like Framework or M-series Macs and it was like: okay, it pretty much has to be M-series Mac.

Amos Wenger: That is definitely what pulled me in on the laptop level. But then I had like the nice laptop and then I had to go back to my Windows desktop and just interacting with Windows- just even transferring files. I'm like- I'm not going to email files for myself. I'm not going to use USB sticks. Like I don't even have USB ports. I have USB C. What do I do? I don't know.

And, I was always interested in the Cranelift codegen backend. But it didn't work on macOS- aarch64, whatever the actual name of the architecture is. And recently it started working and I was like: it's time to check it out. And I don't know if you know, but I write articles.

I'm a bit of an author myself, you may say, and my website is powered by a gigantic bunch of Rust code as well.

James Munns: I've sat down and built a static site generator where you go, "Aha, I have written code that generates HTML." You went like the one step further of like, "I have a fully dynamic running web server backend that powers my entire site that I've written while writing articles about writing a website dynamic backend in Rust."

Amos Wenger: Yes, and I want you to know that every single decision, in isolation, made sense. I still don't know how I ended up here with like 20,000 lines of code to maintain and operate on Kubernetes. But somehow we're here. And the reason I was talking about build performance and whatnot in the previous episode was because I'm trying to get it back into a shape where I can actually iterate and add features rather than just make sure it builds with the latest version of crates.

So: that mountain of code ends up having about 700 transitive dependencies, which is not all my fault. I don't know. I pulled in the S3 crates. That's a hundred dependencies right there. Probably the aws-sdk-s3 crate. I'm doing what I can, okay? I'm doing video encoding, I'm doing image encoding, I'm doing a bunch of templating. It's complicated! It's not all my fault. I'm doing compression- brotli, gzip, whatever- Patreon API, all the APIs. There's so much functionality in there. And I tried the Cranelift codegen backend, fully expecting it to vomit an error and then just immediately stop after a couple of crates. But to my surprise, only like a couple crates didn't build out of the 700.

So there's ring, because it has asserts that are supposed to run at compile time. And I'm not even exactly sure how that works. But it was very, very unhappy about that. But the nice thing about asserts is that they're like, "let's just make sure this actually looks like 64-bit ARM on macOS," and you can just comment them out, because yes, it is. It's regular macOS, trust me, it's going to be fine. So that took care of ring. And then the other thing was in the half crate, which you may know, actually, from your embedded work, but it's like it defines float16. And so it has assembly, and we actually moved away from LLVM assembly to, like, Rust assembly syntax... I want to say one edition ago? I don't know exactly how that change happened, but I know I had to update my series that had inline assembly.
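(Neither crate's actual source is reproduced here, but minimal sketches of the two patterns involved look something like this- a compile-time assertion, and inline assembly in the stabilized asm! syntax that replaced the old llvm_asm! macro:)

    // 1. A ring-style compile-time assertion: it's evaluated as a const, so the
    //    codegen backend has to be able to const-evaluate it at build time.
    const _: () = assert!(cfg!(target_pointer_width = "64"), "expected a 64-bit target");

    // 2. Inline assembly in the stabilized `asm!` syntax; the operation here is
    //    illustrative, not what the `half` crate actually does.
    #[cfg(target_arch = "aarch64")]
    fn add_one(x: u64) -> u64 {
        let out: u64;
        unsafe {
            core::arch::asm!(
                "add {out}, {x}, #1", // out = x + 1
                x = in(reg) x,
                out = out(reg) out,
            );
        }
        out
    }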

Yeah, so we had ring, we had half, and half wouldn't compile. So I just- it turns out that it was pulled in by a crate I didn't actually use. So I just dropped that dependency and uh, everything built! But didn't link! It didn't link because- and we talked about that recently, so it all tied in, this is why I was excited to talk about this- it doesn't link on Xcode 15, which is the release where they optimized the linker. I'm assuming they rewrote a large part of it. So you can downgrade to Xcode 14 and then it works. Or you can use a patch that someone wrote and it's the most hilarious patch because it's like really domain specific linker knowledge, like it's the wrong kind of relocation being generated.

James Munns: Wait, a patch to what? The compiler?

Amos Wenger: To the Cranelift codegen backend. Yeah. It's not a patch, it's not a pull request. It's like someone did 'git diff' and then posted the result in a GitHub comment saying, "Don't use this. I don't know what I'm doing. Someone who actually knows what they're doing should take care of this." But if they don't know what they're doing, who does? 'Cause I don't know what's in the patch, I don't understand it...

James Munns: There's this whole fight on Twitter, and I know you found this thread too, of like: Linkers are, like, the intersection point of so many integration schemes. It's like the integration point- it is the cursed knowledge, stare-into-the-void item that is at the intersection of like five other cursed stare-into-the-void items. I was not happy with all the responses on that thread, because some of them are just like, "Oh, it's cursed, you'll never fix it."

So I think the issue is generally that people like to avoid- like, linkers are unpleasant. I feel like this is the old way of handling abstraction: you pretend that things don't exist. Then they're allowed to grow as big and uncomfortable and terrible to deal with as they want.

Because you just go, "Just don't, don't look at it. Don't make eye contact, and you'll be fine." Linkers are like the ideal of that: you just hope that you never have to hear the word 'linker.'

Amos Wenger: And this ties into two other things I already talked about on this podcast, which are: I was wrong about splitting your code base into many different crates because it creates interfaces and interfaces by definition are boundaries. The exact same thing happens. You start to ignore what's on the other side of the boundary and you're missing out on optimizations or simplifications that could have been done.

And, let me tell you: ever since I merged everything back into one big crate, I'm going, "Oh, this thing is duplicated all over the place. I can simplify the whole thing. I can build this thing once instead of doing it on demand and like reallocating over and over." Like, the whole structure of the thing becomes so much clearer now that it's one big blob. And I'm looking forward to finding the interfaces that do make sense and actually split it up again.

But specifically for linkers, I think you're right. I think we collectively, like people around compilers tend to think that it's not our problem, right? We just emit the object files. Whatever happens after that is none of our concern.

So back to the Cranelift code generation backend: you can either downgrade to Xcode 14, which is not a future-proof solution. It's not what's favored by the maintainers of cg_clif- that's the short name for the Cranelift code generation backend. And the patch that is in the comments as an inline diff file, which I did apply, works... until it doesn't? So it did compile all the crates individually, which already involves some linking.

James Munns: Yeah. Cause typically when things get linked, you create one archive or library that you link together as an archive or like a unified thing. And then, essentially at the end the linker goes, "Merge all of those intermediate archives into one big final executable."

Amos Wenger: And so there's many different opportunities for linker errors to happen. You don't actually have to downgrade to Xcode 14, you can just use a linker flag called ld_classic. And of course there's like 26 different ways to spell it wrong? If you just get it wrong, it's just ignored. I love that, I love when programs do that.

James Munns: Linker UX is real good.

Amos Wenger: Just pass a flag it doesn't know about it's like, "Okay"

James Munns: That's compatibility!

Amos Wenger: Yeah, it is for sure, which is why when I derive deserialize, I always have like 'error on unknown fields.' I forget the exact attribute.

James Munns: Oh yeah, serde ignore?

Amos Wenger: Not ignore.

James Munns: You're right. Yeah, yeah, I know exactly what you're talking about.

Amos Wenger: I want to look at the schema change and I want to know about it. Well, I could version my configs, but I'm- I'm a one-man shop. I don't need to do that. I'd rather have production crash.
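(The attribute Amos is reaching for is most likely serde's deny_unknown_fields. A minimal sketch, with made-up field names:)

    use serde::Deserialize;

    // With `deny_unknown_fields`, deserialization fails loudly if the input
    // contains a field the struct doesn't know about, instead of silently
    // ignoring it.
    #[derive(Debug, Deserialize)]
    #[serde(deny_unknown_fields)]
    struct Config {
        bind_addr: String,
        port: u16,
    }

    fn main() {
        // The extra `debug` key makes this an error instead of being ignored.
        let result: Result<Config, _> = serde_json::from_str(
            r#"{ "bind_addr": "0.0.0.0", "port": 8080, "debug": true }"#,
        );
        assert!(result.is_err());
    }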

James Munns: Schema evolution is a conversation for another day and we should talk about schema evolution at some point.

Amos Wenger: It doesn't fit in 20 minutes. We can never talk about it here.

James Munns: No

Amos Wenger: So, there is a linker flag called ld_classic. I was trying to use it, but as you can imagine, building an alternative codegen backend for rustc is a bit involved. So there's an actual build system. So if you're building the Rust compiler, rustc, if you check it out, which I encourage everyone to do, because they have a really good contributors guide.

Everyone can just clone it. It's not that big. It can download pre-built versions of LLVM. It doesn't actually take that long. I thought it was going to take like a whole day to build for the first time. I thought it was like Chromium or whatever. No. Contributing to rustc is actually not that scary. It has this thing called x.py, which is a Python script that ends up building a bunch of Rust, which is then the actual build system for rustc. So the rustc build system is mostly written in Rust, but it has this Python wrapper, so it's cross-platform. And the Cranelift codegen backend has a similar thing.
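(For the curious, that rustc workflow looks roughly like this- commands as documented in the rustc dev guide, so the details may drift:)

    git clone https://github.com/rust-lang/rust
    cd rust
    ./x.py setup    # interactive config; can opt into downloading a pre-built LLVM
    ./x.py build    # builds the compiler itself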

The Cranelift codegen backend has y.sh- and y.ps1 and whatever for other platforms. And it also builds some Rust to end up invoking the Rust compiler, and it applies patches to the standard library so that it gets to compile with that backend and whatever. For some reason, throughout this whole process, the linker flag got lost. So not only will the linker not complain if you give it a nonsense flag- your Rust flags will also be ignored if you forget -Wl, so the flag goes to the C compiler driver instead of the linker. Just a bunch of drivers and frontends and programs just ignore flags if they don't know about them, which is lovely.

James Munns: Does the phrase linker driver mean anything to you?

Amos Wenger: To me, yes, but you should still explain.

James Munns: Okay- I was going to say, I'm probably gonna get this wrong, cause this is again, cursed knowledge, but like- a lot of the times you don't necessarily even invoke the linker as its own binary program, you call the compiler with certain arguments because hopefully the compiler knows what it's supposed to be doing and then can pass on more of that information when it invokes the linker.

A lot of those flags that you set are rustc flags, which means you basically are giving them to rustc with forwarding instructions to the linker. So you're not even talking to the linker itself. I don't know. I've built some build system type stuff where you do independently do it, but it means that you then need to remember everything that you gave to the compiler, and when Rust has, like, cargo followed by rustc, the frontend, followed by whatever codegen backend it has, followed by whatever linker you're using, there's just a lot of parts that really, really need to agree on what configuration means to them.

Amos Wenger: That is correct. The actual flag is ld_classic, but that's a flag for ld, the linker, which is invoked by cc, and the flag for cc is what we were saying: "-Wl,-ld_classic." But then we're actually passing a Rust flag. So the full chain is '-Clink-arg=-Wl,-ld_classic', which: don't ask me why I know that by heart! This is just the trauma-
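(Spelled out, that chain looks roughly like this- assuming a nightly toolchain with the Cranelift backend already set up, and with no guarantee the flag survives every layer:)

    # rustc flag -> cc flag -> linker flag:
    #   -Clink-arg=...   tells rustc to pass the rest to the linker driver (cc)
    #   -Wl,...          tells cc to pass the rest to the actual linker
    #   -ld_classic      tells Xcode 15's linker to fall back to the old ld64
    RUSTFLAGS="-Clink-arg=-Wl,-ld_classic" cargo +nightly build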

James Munns: Gonna say trauma is the reason that you have to know that.

Amos Wenger: Yes it is. Wow, this was a short story. I didn't think it was going to last that long. The problem was: I think it's safe to say- I haven't checked the GitHub- but the person that has been behind most of the effort of, like, hooking up rustc to Cranelift- he hasn't developed all of Cranelift by himself. It's a different team. It's like Wasmtime, the Wasmtime team? I don't know exactly who works on what, but like the Bytecode Alliance, Wasmtime, Cranelift, whatever-related-word-cloud.

And then rustc is a different set of people. And then between rustc and Cranelift: Bjorn! Bjorn the third, no bjorn3, on GitHub, whom I met at EuroRust. I asked, " Are you getting paid for this? Do you get any kind of support?" And he was like, "Well, the Rust Foundation is giving some things including like a VM and whatnot." And as I was messing around with the Cranelift codegen backend, I realized: it's so close. It seems so close. I don't know if it actually runs because the last linker command failed.

I asked him, "Hey, do you have access to a Mac machine? Like an M1 or an M2 or anything? Have they given you a VM?" And the answer is of course no, because it's so easy to give someone a 64-bit Linux VM, right? It costs like pennies or whatever on whatever cloud platform, but the M2s and whatnot... kind of hard to come by, kind of more awkward to start a VM yourself and everything. I had never done it before. I just heard that, like, recently it became possible to log into your iCloud account from a macOS ARM VM run on an ARM machine? I didn't even know that was "not a thing" before. I didn't even know that was a limitation that existed.

James Munns: Yeah, Mac has weird licensing rules on virtualization, which means essentially every virtualized, like, hosting provider of Mac is sort of in a gray area, and also they're required to have physical machines on hand. Like, you can only rent out the VM- not on demand, it has to be, like, in increments of 24 hours or something like that, which means they have to have physical units on the shelf that they only rent out once a day. I don't know. Maybe it's changed in recent years. Apple made the licensing of essentially OSX ridiculous, which makes virtualization of it very hard, which means in turns it becomes very expensive.

Amos Wenger: So long story short, I created a VM on my Mac Studio, which I bought for video editing reasons, and it is way overpowered for what I need to do on a daily basis. Like I'm not always compiling Rust. Sometimes I think in between compiles, believe it or not.

And so I made a VM, gave it to Bjorn, and now he is actually able to build stuff and, like, test it in real time, not wait for CI, which, as you know, is a long iteration loop. Whatever. Can't find the words.

It just makes you think- I don't know if he was ever asked that question before. 'Cause there's so many issues about 64-bit macOS ARM. But he never had the machine, so he was never able to test. And it cost me next to nothing because I already had the machine here. I just created the VM, created a VPN, and then there you go. Remote desktop, SSH. Go wild, do what you want.

James Munns: We're so surprisingly close to so many things that people think are very far away, just of getting the triple point of: the person who knows about this, to have the time and money to do it, and to have the resources to do it, like the access to the things they need. And you see this a bunch in Rust, I feel like: Someone makes some very tiny token donation, and they go, "Okay, well, yeah, then this person can buy the VM or the laptop they need."

And then, two weeks later, something you thought was years away is done. And ready. There's so much of that inefficiency of resource distribution in reality.

Amos Wenger: And this is why I'm happy to announce my new charity, which is called Mac minis for Everyone. You can donate now, go to macminisforeveryone.com, and I will send everyone a Mac mini. And then finally, we won't have to wait for things to compile on your fucking Acer laptop again. There you go.

Episode Sponsor

Thank you to fasterthanlime for sponsoring this episode.

fasterthanlime is a 34 year old person named Amos who makes articles and videos about computers for a living. They tend to go on long explorations to find out the answers to questions that have been bothering them since they were a kid. Rust and Linux are often involved, but also asynchronous IO, dynamic linking, thread locals, and other topics that only a few people get the chance to mess with on their own terms. You can support Amos's work by reading articles on their website, watching their videos on YouTube, and by sponsoring them on GitHub sponsors or Patreon.