(and some things I still don't understand)
Amos and James reminisce about how weird clipboards have always been. Or is it pasteboards? Or buffers? Oh boy.
Show Notes
Episode Sponsor: CodeCrafters
- Ubuntu, Windows for Workgroups 3.11, DOS computer, Windows 95 & 98 SE
- X Window System or X11, X.Org, BSD
- GTK, Qt
- xclip, Wayland
- camera emoji
- Apple's Continuity feature
- pbcopy, CleanShot X
- Pasteboard-Viewer by sindresorhus
- Android intents
- UTF-8, UTF-16, WTF-8, Universal Coded Character Set UCS
- Betamax
- Mastodon client Phanpy, universally unique identifier UUID
- Tag image file format TIFF, web archive file, Rich Text Format Directory RTFD
- Slides Code Highlighter by Roman Nurik
- CSS Orphans and CSS Widows
- Pygments, Pandoc
- caniuse.com and caniuse.rs
- A clipboard for Rust arboard
- Amos' PR
Transcript
Amos Wenger: I like that I'm on black tea and like downers and uh, James is on Red Bull? Can everyone see my slides?
James Munns: Looks lovely.
Amos Wenger: I got tired of making presentations about Rust. I want to talk about something else for a change because I feel like I've been giving the same talk two or three times on SDR.
Things you might not have known about clipboards
Amos Wenger: Today's presentation is about clipboards. It's called "Things you might not have known about clipboards and some things I still do not understand."
The reason I care about clipboards is because I make a lot of slide decks nowadays. I make them for SDR. We have slides every time, you can see them on the website, you can see them on the video if you watch them on YouTube. And sometimes we have code and when we have code, I don't want to have a picture of a code editor, I actually want to have syntax highlighted code. I want actual texts that will render, crisp at any resolution and have the color scheme that I chose and not some random dude who published a weird Google Slides plugin five years ago chose because it's going to be bad.
And I did a bunch of weird stuff to make that happen. It's linked to my website because I usually write things like as articles and also as videos, so I need to paste them in different places. And that led me to play a bunch with clipboards. And I just want to have an SDR episode where we go together over how weird and cursed and fantastic clipboards are.
I guess a good introduction is, as a kid, I got an Ubuntu CD, and that was the beginning of the end. I installed it on my parents' computer, I think? Before I got my own computers and it got me into the whole like Linux thing.
It was over for me. Like it was, it was a gateway to becoming an open source maintainer, burning out and all that stuff. But one thing I learned is that on Linux- at least back then, I don't even know if that's still a thing today, it probably is- if you select text, it immediately goes into some clipboard, which is not called a clipboard, we'll get to that.
But it goes somewhere, and then you can middle click somewhere else, and that's actually going to paste it. So, I had played with Windows a bit, the early Windows, I think my first Windows was Windows for Workgroups 3.11.
James Munns: Hell yeah. I had a, I had an old DOS computer too. It was an IBM something-or-other, but it was like- it would boot in DOS. And if you needed to, you could boot into Windows to play a certain game or whatever.
Amos Wenger: Exactly, you type W I N, enter, and then it would boot up into Windows, yes. And then I spent most of my time on Windows 95 and 98SE, you know. And crashes were a lot more frequent, I forgot how frequent crashes were, and how blessed we are today. Today, it just, kind of, sometimes the desktop reboots, that's it. That's what happens when you get a crash.
X has multiple selection buffers
Amos Wenger: For me, the clipboard was like Ctrl+C, Ctrl+V, right? Ctrl+C, it's in the clipboard, Ctrl+V, you paste it. Not at all the case, like this is the experience most people have of clipboards, I would imagine.
But my first exposure to how weird they were was on X. I was using it on Linux, but X, I think, you can also use on various BSDs. So X11, X.Org, I don't even know what to call it at this point. There are multiple selection buffers.
The primary selection buffer is kind of the highlighting middle click thing that I just talked about. And then the clipboard selection buffer is the Ctrl+C, Ctrl+V thing that you get in GUI applications, applications using GTK, using Qt.
And so there's a command line tool called xclip. And you can pipe things into it. So if you have like a command that generates a lot of text, and you actually want to paste that somewhere into the browser, for example, you can run that command and pipe it into xclip, but you have to specify -selection clipboard, otherwise it goes into the wrong selection buffer.
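To make the xclip point concrete, here is a minimal Python sketch that pipes text into a chosen X selection. The helper names `xclip_command` and `copy_text` are made up for this example. It assumes xclip is installed and an X session is running, and the only detail taken from the episode is the `-selection clipboard` flag.

```python
import subprocess

# Selection names accepted by xclip's -selection option.
KNOWN_SELECTIONS = {"primary", "secondary", "clipboard"}


def xclip_command(selection: str = "clipboard") -> list[str]:
    """Build the xclip argument list for the given X selection (hypothetical helper)."""
    if selection not in KNOWN_SELECTIONS:
        raise ValueError(f"unknown X selection: {selection}")
    return ["xclip", "-selection", selection]


def copy_text(text: str, selection: str = "clipboard") -> None:
    """Pipe text into xclip on stdin, like `some-command | xclip -selection clipboard`."""
    subprocess.run(xclip_command(selection), input=text.encode("utf-8"), check=True)
```

Calling `copy_text("hello")` targets the Ctrl+C/Ctrl+V buffer, while `copy_text("hello", "primary")` targets the highlight-and-middle-click one.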
Browsers are especially cursed
Amos Wenger: Browsers are especially cursed as well. Again, fond memories of my Linux on the desktop days. I have since moved to Windows and then to macOS. I remember pasting an image into a Linux browser- so like doing a screenshot, and then Ctrl+V into a browser, like on GitHub or Twitter or whatever- could freeze it? And I'm not sure that's still the case, but honestly, I wouldn't be surprised, but now X has been phased out to some extent and there's Wayland, so I imagine clipboards are now breaking in new, exciting ways that I haven't even learned about yet.
James Munns: Always something to debug. So many opportunities for learning.
Amos Wenger: Exactly.
Oh, this is another fun one: I have a slide here. And again, you're missing out if you're not actually looking at the slides, you can get the PDFs of the slide decks on sdr-podcast.com. Go to episodes, find the one we're talking about, or you just watch this on YouTube, in which case you have been looking at the slide for quite some time.
The slide is just the Twitter message composer, whatever they're calling it. And it has a single camera emoji, like DSLR, not like movie camera. And that's the result of doing copy image from another tab and then pasting it into Twitter on Safari. Instead of pasting the image, it just pasted this emoji.
James Munns: What? What? Who was like, "This is what they'll want. This is, you know, exactly what they were looking for!"
Amos Wenger: I have no idea. Honestly, I feel like everyone gets mad at browsers that aren't Chrome. So people get mad at Firefox, people get mad at Safari... when actually it's probably on the developers of websites and web apps for not testing on anything other than Chrome. And, you know, given the recent run of Twitter being Twitter, I guess... like Twitter being X, it's not surprising to me that this happens. I mean, a little, still, but... I don't know what to tell you.
Apple cross-device paste
Amos Wenger: There are some good things about clipboards. Ever since I moved everything over to the Apple ecosystem- I have like a Mac mini, I have a Mac Studio, I have an iPhone, I have a MacBook- I can copy from some device and paste into another device. So, banking apps tend to be on phones for security reasons.
You have Face ID and everything. And then I usually fill forms out on desktop because that's convenient. So you can just open the banking app, hit copy and then paste into the desktop. There's no, like, setup to do. They even do crazier Continuity things where, if you have two computers, you can just move the mouse to the edge of one of the screens and it'll appear on the other one magically. A feature that is cool when it works, and that I'm sure exactly twelve people on earth know about, because I don't think they've made any effort to tell people that it's there, so people just download third-party apps instead that are worse and that they need to pay money for.
James Munns: I don't know if you mentioned it because you mentioned that xclip works on Linux. It took me forever to realize that on macOS they also have pbcopy? I don't know if it's built in or whatever, but it's a similar sort of thing: you can pipe a bunch of stuff into it on the command line and then paste it somewhere else.
I'm just, you know, Ctrl+C Ctrl+V all over the place kind of person.
Lessons learned...
Amos Wenger: If I had to summarize everything we've seen so far, we have learned that operating systems, or like, windowing systems, it's fuzzy, okay? The nomenclature is- don't, don't sue me. Operating systems have multiple clipboards, for X-based systems, and I'm assuming Wayland also copies this. I have no idea. I don't, don't at me about Wayland. I don't care anymore.
For X, there's a primary selection buffer and a clipboard selection buffer. On macOS, there's also a different set of pasteboards, as they call them. So nothing, none of that is actually called a clipboard. Like, clipboard is just the name of one X selection buffer. But there's no clipboard on macOS. It's a pasteboard.
Hence the name pbcopy. We're talking about the pbcopy CLI tool. That's pasteboard copy. Uh, but it still has a bunch of different pasteboards. It has the general one, it has the drag one. For example, when I use one of my new favorite tools on Mac- well, it's not new, but it's new to me- called CleanShot X.
It's just a screenshotting tool that is better than the built-in macOS one. And instead of putting the image in the general pasteboard, it has it floating as a window and then you can drag the window wherever you want to. And what that does is that it doesn't override whatever you had in the clipboard.
So actually this was super useful when coming up with that slide deck because it didn't override what I had in there... because I was taking screenshots of a tool called Pasteboard Viewer.
Not only do we have multiple clipboards, each clipboard can have multiple items. This makes sense if you think of like selecting a bunch of files in the Finder and then copying them. Each of those files is one item.
And then also items can be available in several formats. And that's where, if you have access to the slides, again, you know where to find them.
What's in the pasteboard...
Amos Wenger: But you will be able to see a little app called Pasteboard-Viewer made by sindresorhus. I'm definitely butchering that name, I'm sorry. It's a very nice app, it's free, it's open source, and it lets you see what's in the macOS Pasteboard. And, uh, here's an example.
When you copy two files, like I promised, you have two items, and they both have a format, or a representation, or a content type, or a MIME, I'm not even sure what to call it, a variant, a representation, so many names you could call that, called public.file-url, which is a URL, starting with file:///.
And then, not the path of the file that I copied, it's actually .file/id=, some numbers dot some other numbers. That's the surprising part to me. Everything else makes sense so far, but this, I don't understand why it's not just the path of the file. I imagine it's because you can actually copy from a network drive, or I don't know, from inside of an archive, a mounted thing, but all of these have actual paths.
You could, like, pass them to cp on the command line. So I don't actually know why they're doing that, but I do understand that the contents of the files themselves are not actually in the clipboard. There's just a reference so that if you paste into an application that supports that, they will know where to find the files, where to get metadata about them, how to read them and everything.
James Munns: Yeah. I know intents on Android or whatever are sort of similar, in that you can declare that you can handle certain intents, and that's how things like "share to this app" happen. I'm sure they have something very similar. Like, this is either a file path or a pointer to whatever buffer or some output. How programs talk to each other, especially in a GUI world, is something totally out of my knowledge area.
Amos Wenger: Yeah, but the weird thing is that the file protocol, it's just like HTTP or HTTPS. It's an actual protocol and it actually works. You can open file URLs in your browser. The thing is that this one doesn't exist. So this is the part I cannot explain. I promised there were some things I still don't understand. This is one of them. This does not exist. If you try to stat it in the terminal, it does not exist.
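For readers who want to poke at this themselves, here is a small Python sketch of what "stat it in the terminal" amounts to for an ordinary file URL: strip the scheme, decode the path, and ask the filesystem. The function names are invented for this example, and it does not attempt to explain or resolve the Finder's `.file/id=` references, which remain a mystery here too.

```python
import os
from urllib.parse import unquote, urlparse


def file_url_to_path(url: str) -> str:
    """Turn a file:// URL into a plain POSIX path (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URL: {url}")
    # Percent-escapes such as %20 are decoded back into real characters.
    return unquote(parsed.path)


def file_url_exists(url: str) -> bool:
    """Check whether the path behind a file:// URL exists on this machine."""
    return os.path.exists(file_url_to_path(url))
```

A URL can be perfectly well-formed and still fail the `file_url_exists` check, which is the situation described above.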
The other interesting thing is that when you copy two files from the Finder, the first item in the pasteboard, the general pasteboard, if we're going to be specific about it, has a UTF-8 plain text representation, which is just the two names of the files.
So if you were to, like, copy two files and paste that into a text editor, for example, you would just have the file names, which is knowledge that I've never used, but yeah, if you just need, in a script like a list of files, you can just copy them from Finder and paste them into your text editor and you would get that list of all the file names.
They also have a UTF-16 version of that. Now, I don't know the implementation details. Is it an API that converts on the fly?
I really hope it does, because otherwise it means when you select a really, really, really long text, what it does is it takes the normal UTF-8 version of it and then also creates a UTF-16 version, which is probably twice as large, roughly, just to put in the clipboard. I know Windows is heavily UTF-16 based, or is it UCS-2?
I keep forgetting. Or is it WTF something? I don't know.
James Munns: WTF-8 is fun, but yeah, I don't... wide char, I don't know what wide char is, if that's UCS-2 or UTF-16.
Amos Wenger: Windows is weird because at first it was like, okay, we have ASCII, we have like seven bits and then one that's useless so we can do code pages so that the upper 128 characters can be accents. It's mostly going to be like French accents or whatever Latin stuff. And then they were like, "Okay, no, there's a lot more letters. So we're going to do UTF-16."
And then they were like, "Well, now the rest of the world is kind of standardized on UTF-8," so actually there's a code page that's like, instead of ASCII, it's now UTF-8, but it's a mishmash of a lot of different things. I've come to appreciate the Apple approach a lot more, where they just deprecate stuff.
They're just like, "No, this doesn't work anymore. Screw you." Whereas Windows is like, "Well, yeah, we, we kind of try to keep everything working..." and then it kind of doesn't actually, I don't know.
James Munns: There was an awkward future point and I think Java and Windows both got hit by it where they're like we need something other than ASCII and alternative code pages and UTF-8 didn't exist yet or at least wasn't popular and they committed hard to UTF-16 and oops, that was not the format that won. They chose the Betamax of uh, string formats.
Amos Wenger: That's harsh. I think Betamax was actually technologically superior or something.
James Munns: That's fair.
Amos Wenger: I don't think UTF-16 is... like, paying twice as much for the most common letters does not make for a good format.
James Munns: Fair.
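The "twice as large, roughly" estimate is easy to check. This Python sketch compares encoded sizes for the same string; the function name `encoded_sizes` is made up for the example, and it uses the little-endian UTF-16 codec so that no byte-order mark is counted.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the byte length of text in UTF-8 and in UTF-16 (without a BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


# ASCII-only text doubles in size when stored as UTF-16.
print(encoded_sizes("hello"))
# Text outside the ASCII range can go the other way.
print(encoded_sizes("日本語"))
```

For ASCII-heavy text such as source code, UTF-16 costs two bytes for every character that UTF-8 stores in one. For some scripts the comparison flips, so the doubling only describes the common-letters case discussed here.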
More pasteboard explorations - when you copy an image from Safari?
Amos Wenger: More pasteboard explorations.
When you copy an image from Safari, this is the case I had where the reason you saw the tweet composer is because I manually cross post between Bluesky, Mastodon, and Twitter. So depending on which composer I use first and then I copy to the others, I get that freaking camera emoji and I have to paste into the Preview app first and then copy from there, and then that works?
Anyway, when you copy an image from Safari, this is what you have in the pasteboard. There's so many different representations here. There's 1, 2, 3, 4, 5, 6, 7, 8, 9 of them. It's kind of amazing. Again, I hope some of them are built lazily, like they're getters that are computed on the fly, because otherwise it's a lot of computations, just when you're hitting Command+C. No wonder sometimes it just lags.
One of the representations is called public.url and it's- I think it's the reason we see that little camera emoji in the tweet composer is because the URL is not an actual URL. It's a blob URL, which is not HTTP, it's not HTTPS. It's not a file. It's not even a version of file that doesn't exist, like what happens when you copy from the Finder.
It's a blob URL. So it's only valid within a certain context. What happens is when you paste an image into Phanpy, a Mastodon client I really like, it stores it as a blob, which is just a collection of bytes. And then you get this UUID. What's the actual meaning of UUID? There's unique in there somewhere.
James Munns: Universally unique identifier?
Amos Wenger: There you go. You get this universally unique identifier, and that's the name of the blob. And then you can refer to it, but you can see it's prefixed by phanpy.social. So it's only valid for this thing. So if you actually paste that somewhere else, then it doesn't know what to do with it.
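A blob URL has the shape `blob:<origin>/<uuid>`, so the origin that owns the data can be read straight out of the string. This Python sketch pulls it out; `blob_origin` is a hypothetical helper name, and the UUID in the usage line is a placeholder rather than the value from the slide.

```python
from urllib.parse import urlparse


def blob_origin(url: str) -> str:
    """Extract the owning origin from a blob: URL (hypothetical helper)."""
    outer = urlparse(url)
    if outer.scheme != "blob":
        raise ValueError(f"not a blob URL: {url}")
    # Everything after "blob:" is itself a URL whose scheme and host form the origin.
    inner = urlparse(outer.path)
    return f"{inner.scheme}://{inner.netloc}"


# Placeholder UUID, for illustration only.
print(blob_origin("blob:https://phanpy.social/00000000-0000-4000-8000-000000000000"))
```

Nothing in the string carries the image bytes themselves, which is why another site or app receiving only this URL has nothing to load.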
It has no way of getting the original data, but the original data is actually there as a TIFF. When's the last time you saw a TIFF, James?
James Munns: I could not tell you. Intentionally, probably a decade.
Amos Wenger: And yet here it is as a TIFF. It's a 2.8 megabyte file for 900 pixels, uh, by 800. It is a screenshot I took, for sure, just in a weird format. Again, something I'm assuming Apple committed to a long time ago, and now that's just, that's just what's in the clipboard, I guess.
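The 2.8 megabyte figure is about what an uncompressed bitmap of that size would weigh. The arithmetic below is a sketch under an assumption the episode does not confirm: that the TIFF stores 4 bytes per pixel (8-bit RGBA) with no compression. The function name is made up for the example.

```python
def raw_image_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of an uncompressed bitmap, ignoring headers and metadata."""
    return width * height * bytes_per_pixel


# 900 x 800 pixels at an assumed 4 bytes per pixel.
size = raw_image_bytes(900, 800)
print(size, "bytes, or", size / 1_000_000, "MB")
```

If that assumption holds, the pasteboard is carrying the raw pixels rather than a compressed image.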
But it doesn't stop there. You also have it as a web archive. What's a web archive? I'm so glad you asked. It's pretty much what you have if you try to save a web page. Like if you hit file save in your browser and it says, well, I could save only the HTML but then you wouldn't have images, you wouldn't have CSS, you wouldn't have JavaScript.
So how about I save everything for you? It's kind of a zip but not really. Because actually it's a binary property list in that case, which is a very Apple thing. And it's shown here, I think, as XML.
James Munns: In all the joy of XML, yeah, I was gonna say.
Amos Wenger: It's even Base64 encoded. I don't think it actually is Base64 encoded in the... I think this is just the Pasteboard Viewer app showing it to us in a way that is structured.
Moving on, we have another format called com.apple.flat-rtfd. RTF does mean Rich Text Format. Your instinct is correct. And then I don't know what the D stands for, but flat means that it's an archive of rich text plus its attachments. So this is actually a rich text format document that contains the image, but this time as a PNG.
James Munns: I didn't realize that systemd got into RTF documents too.
This is my RTF daemon.
Amos Wenger: RTF is, is worth its own episode.
I think you can see a little bit of the markup in the slide right now. It's very interesting.
Code Highlighter copy-paste
Amos Wenger: Because I like the Pasteboard Viewer tool that much, I did a bunch more slides with what actually ends up in the pasteboard when you copy from different places. And I was curious about this specific node that is zoomed in but is part of the UI for a slides code highlighter, a tool by Roman Nurik, which lets you paste your own code as plain text and then colorizes it and then you can copy it back.
And it says if you're pasting into Keynote, the Apple presentation tool, you should copy from Safari, but if you're pasting into Google Slides, then you should copy from Chrome. So I was curious what the differences actually are, what actually ends up in the pasteboard if you copy from Safari or from Chrome, and so that's what I did. This is the result of copying from Safari, and there's an RTF thing in there. It's taking the HTML, it's taking the colors, it's taking the fonts, it's taking the size, and it's, again, like, converting it to rich text format.
And this, I didn't know yet, when I started researching this, but this is what Keynote uses if you paste into it. It doesn't use the HTML version at all, even though it also has the font, font face and size and color information. It uses the rich text format version, which is here rendered poorly with the wrong font.
James Munns: This is like... if it's asking you which you're copying and pasting from and to... that gets into I guess like just the practicalities of the 'Who's encoding what?'
It's like when my printer refuses to print a PDF and sometimes I open it in Chrome or whatever and then print to PDF and it re-encodes the entire PDF and then sometimes my printer figures out what that format is, but... it's interesting that it's just... I guess this is your best encoding- like sender and receiver pairs known to work well.
Amos Wenger: Exactly. But I was curious. So I looked at the HTML version of that pasteboard item. Cause again, it's one pasteboard item. We're copying one thing, but it's available in different formats. When copying from Safari, it's available as RTF, as I just mentioned, but also as UTF-8 plain text, a bunch of WebKit internal things, WebArchive again, for some reason, and then also just HTML.
The difference between what Safari writes to the pasteboard and what Chrome writes to the pasteboard in terms of HTML is actually not that big. If you're looking at the slides, this is the Safari version.
This is the Chrome version.
Chrome vs. Safari pasteboard
Amos Wenger: I even did a diff so you can see the differences are mostly in like... the surrounding HTML element that it puts around the content that you're copying.
And there's a few different defaults. orphans: 2 becomes orphans: auto. I didn't even know there was an orphans CSS property. Widows also surprised me.
James Munns: Not windows just widows.
Amos Wenger: Widows, important. text-decoration-thickness, text-decoration-style, and text-decoration-color all get collapsed from initial into text-decoration: none, which is the shorthand. So it's just like the default stuff it puts around the HTML version of whatever you're copying.
And the reason that tool exists and the reason it's useful is because it styles every span of code individually with inline CSS and that ends up in the pasteboard as well. Whereas if you had like CSS classes and then rules for those CSS classes, it would not necessarily do that. It would not pick up on the styles that are applied from the style sheets, but that are not inline on the nodes.
James Munns: I know on Linux, there's a command line tool that does syntax highlighting. I forget what it is, but it's...
Amos Wenger: There's Pygments...
James Munns: Pygments is exactly the one that I'm thinking of. And it does the exact same thing where the HTML output that it produces is super verbose because it does inline CSS for every element.
So when you have some fairly benign looking code, there might be 12 CSS chunks in one line because it's going to highlight your variable and the equal sign and the numbers all different. And that's pretty- I have looked at that one before and it looks a lot like that. And it's probably why that exists like that.
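To show why inline styling is so verbose, and why it survives a copy-paste, here is a stdlib-only Python sketch that wraps each token in its own styled span. It is not how Pygments or the Slides Code Highlighter are implemented. The function name, the token list, and the colors are all made up for the example.

```python
from html import escape


def inline_highlight(tokens: list[tuple[str, str]]) -> str:
    """Render (text, css_color) pairs as spans that carry their own inline style."""
    spans = [
        f'<span style="color: {color}">{escape(text)}</span>'
        for text, color in tokens
    ]
    return "<pre><code>" + "".join(spans) + "</code></pre>"


# One short line of code already needs a span per token.
print(inline_highlight([("x", "#9cdcfe"), (" = ", "#d4d4d4"), ("42", "#b5cea8")]))
```

Because every span carries its own `style` attribute, the colors travel with the markup into the pasteboard instead of depending on a style sheet that the pasting application never sees.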
But wait, there's more!
Amos Wenger: I am aware of such tools. And that's actually the first thing I address in my articles, like: I know there's existing solutions. But I did code highlighting the way I like it on my website. And so that's why I wanted that pipeline of like, I'm writing the article, I'm pasting the code one place already, which is inside the markdown.
And then it's highlighted exactly the way I want it. And I want that to appear as-is in the presentation. So I'm not using Pygments. I'm not using- I'm sure Pandoc or something can do something like that. There's existing tools, but... Yeah, I want to automate things as much as possible and keep them close to the way that I like them.
And to achieve that, it means that I have to be able to copy code from my website, not the production version of my website, because when I draft an article, it's not actually available on fasterthanli.me. It's only available locally. I'm running the same software locally. It's serving it on port 1111.
But I have the same interface and every code block has a little copy button that only I can see and that actually writes to the pasteboard and that means I need to use the browser APIs to do that.
Luckily, the browser APIs for clipboard actually mirror reality more or less. I don't think they expose the fact that there's different clipboards or pasteboards or selection buffers. I think it just uses the clipboard selection buffer on X. It just uses the general pasteboard on macOS and don't ask me about other platforms, I'm tired.
But it has the idea that you can write multiple items to it. So you can like download multiple files probably. It has the idea that every item can be available in different formats. So you can have, for example, an item available as HTML and text so that it could be formatted if you paste it into a capable application or just text, if you paste it into something else.
And they don't have to strip the HTML tags themselves or just be incompatible.
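The "one item, several formats" idea can be modelled in a few lines. This Python sketch is a language-neutral illustration, not the browser Clipboard API: the `ClipboardEntry` class and `pick_representation` function are invented names, and the fallback order is an assumption about how a pasting application might choose.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClipboardEntry:
    """One copied item, available as several MIME-typed representations."""
    representations: dict[str, bytes] = field(default_factory=dict)


def pick_representation(
    entry: ClipboardEntry, accepted: list[str]
) -> Optional[tuple[str, bytes]]:
    """Return the first representation the pasting app accepts, in its preference order."""
    for mime in accepted:
        if mime in entry.representations:
            return mime, entry.representations[mime]
    return None


entry = ClipboardEntry({"text/html": b"<b>hi</b>", "text/plain": b"hi"})
# A rich editor prefers HTML; a plain text field only accepts text.
print(pick_representation(entry, ["text/html", "text/plain"]))
print(pick_representation(entry, ["text/plain"]))
```

The copying side offers every format it can, and each destination takes the richest one it understands, so nobody has to strip tags on paste.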
Compatibility matrix in browsers
Amos Wenger: They even thought about use cases like... you know, maybe people have diagrams on their websites, you know, people like me and they want to copy those and paste them into presentation software that actually supports them. Now, unfortunately, for years now, you have not been able to paste those into Google Slides or even upload them.
It just goes, "Oh, your EMF file is invalid." I just uploaded an SVG. First of all, it's not even what? But then also there's a lot of workarounds or like: yeah, if you upload it to like a free online suite to convert it to a different format, and then you run a script on Google Workspaces to change the MIME type, because there's been a bug since two years ago.
Or like if you import it as a PowerPoint document into Google Drive and then copy from one tab to the other, then it keeps the shapes, but actually no, some people say it rasterizes it as PNG.
The point is you cannot do vector graphics in Google Slides. Don't even try.
James Munns: User experience is my passion.
Amos Wenger: But you can paste it in software like Keynote, which... Mostly I think by accident. I think like Apple invested in RTF, Apple invested in SVG, also internally for the OS. I'm pretty sure I didn't- I didn't fact check that. And as a result, you can just paste text with formatting as RTF and also diagrams as SVG, and it'll just do the right thing.
It has, it's all integrated in the renderer. It's very, they didn't have to add support for it. It's just something they've been doing for years.
So because SVG is supported by Safari and because web browser clipboard APIs have the ability to have items be available in different formats and tell which MIME type it is. And because SVG is a text format, so you don't have like weird things about binary or like, it's actually impossible to generate from JavaScript because binary is weird.
You'd think you would just be able to write some code that writes some SVG into the clipboard, and then you can paste it into Keynote, and then everyone's happy. But, actually, if you look at the compatibility matrix out of caniuse. What's the name of that domain again?
James Munns: I just know 'caniuse', but I know exactly what you're talking about.
Amos Wenger: Right, it's just caniuse.com. I was thinking of the caniuse.rs. There's one for Rust.
Unfortunately, Safari is one of the browsers that does not support writing items of type image slash SVG plus XML.
Status quo: what is supported in which platform
Amos Wenger: If we summarize the status quo of support between browsers and presentation software, and all I really want to do is to just copy code as rich text, I just want colors and fonts and font sizes. It's not crazy. And copying diagrams as SVG.
I have this little slide here that kind of summarizes everything. Safari doesn't let you copy SVG into the clipboard. Chrome doesn't copy colored text as RTF. Keynote doesn't read HTML, CSS and apply the styles from the clipboard and Google Slides doesn't read RTF from the clipboard and also doesn't let you paste SVG.
So, as of now, there is no combination of tools that I can actually use to do my work properly.
James Munns: So at what point do you write a utility that just listens to the clipboard and whenever it sees clipboard data just re-exports it in all three of these formats automatically? So whenever you copy, it just replaces your copy buffer with like the maximally implemented version, with the maximum number of formats.
Amos Wenger: Well.
James Munns: Oh no.
My way out? Write to clipboard from the local server
Amos Wenger: If you search for Rust crates that let you write to the clipboard, you will see an open pull request from myself adding the... what did I actually add there? I think I was only concerned with HTML at that point, but yeah, adding the ability to specify different content types on macOS, but the problem is that it supports all platforms.
So they would only accept the pull request that adds it for all platforms at once. And again, I don't know what the frick's going on with Wayland and X11 these days. But yeah, because I'm running a local server anyway, I can just, when you click the, the copy button, I can just send a message to the server, which is then going to write to the system clipboard with all the might of native code and not just the browser API going like, "No, sorry, this MIME type string is something we will take another couple of years to support because..."
Because why? So actually pretty much this whole episode is a cry for help. If you work on the Safari team, if you know someone who works on the Safari team, please tell them to just add support for image slash SVG plus XML. Please. It's for me. It's for Amos. You know me. You've seen me in videos. Just add support. It's- I'm sure it's a two line change on your end!
I'll run technology preview if I have to. Just please add it. Please?
James Munns: I think the podcast is the perfect place for calling in favors from people who happen to work at tech companies and can just make those PRs. It's like that person who... uh, is probably apocryphal... the person who joined a company just so they could fix a bug and then quit immediately after fixing the bug.
Amos Wenger: I think that actually happened...
James Munns: It wouldn't surprise me.
Amos Wenger: More times than, than, than we have on the record. Probably.
James Munns: It's too good of a story to be true, but it's too good of a story to pass up.
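The local-server workaround from this section can be sketched in Python with only the standard library. This is not Amos's actual server: the `/copy` endpoint, the function names, and the choice of shelling out to `pbcopy` are all assumptions made for illustration. A real version would presumably use a native clipboard library so that it can set several content types at once.

```python
import subprocess
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def pbcopy(data: bytes) -> None:
    """Write bytes to the general pasteboard using the macOS pbcopy tool."""
    subprocess.run(["pbcopy"], input=data, check=True)


def make_handler(write_clipboard: Callable[[bytes], None]):
    """Build a request handler that forwards POST bodies to write_clipboard."""

    class CopyHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/copy":  # hypothetical endpoint name
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", "0"))
            write_clipboard(self.rfile.read(length))
            self.send_response(204)  # success, nothing to send back
            self.end_headers()

        def log_message(self, *args):  # silence per-request logging
            pass

    return CopyHandler


def serve(write_clipboard: Callable[[bytes], None] = pbcopy, port: int = 0) -> HTTPServer:
    """Start the server on localhost in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), make_handler(write_clipboard))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The copy button in the page would then POST the code block to the local server instead of calling the browser clipboard API. The server binds to 127.0.0.1 only, since anything that can reach it can overwrite the clipboard.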
Episode Sponsor
CodeCrafters is a service for learning programming skills by doing.
CodeCrafters offers a curated list of exercises for learning programming languages like Rust or learning skills like building an interpreter. Instead of just following a tutorial, you can instead clone a repo that contains all of the boilerplate already, and make progress by running tests and pushing commits that are checked by the server, allowing you to move on to the next step.
If you enjoy learning by doing, sign up today, or use the link in the show notes to start your free trial. If you decide to upgrade, you'll get a discount and a portion of the sale will support this podcast.