WEBVTT

NOTE
This file was generated by Descript <www.descript.com>

00:00:13.375 --> 00:00:15.145
<v Amanda Majorowicz>This
is Self-Directed Research.

00:00:15.235 --> 00:00:19.105
Our hosts, James and Amos, get hyped about
different topics and take turns each week

00:00:19.135 --> 00:00:20.665
presenting their ideas to each other.

00:00:20.995 --> 00:00:23.545
You can check out the website,
YouTube or Spotify to watch this

00:00:23.545 --> 00:00:28.525
episode's presentation and visit
sdr-podcast.com/episodes for

00:00:28.525 --> 00:00:32.305
previous episodes with presentations,
videos, show notes and transcripts.

00:00:32.605 --> 00:00:34.675
New episodes are
published every Wednesday.

00:00:34.995 --> 00:00:36.885
This episode is brought
to you by CodeCrafters.

00:00:37.095 --> 00:00:39.315
Check out the link in our show
notes or listen at the end for

00:00:39.315 --> 00:00:43.755
more information after James
elaborates on "Compile Time Crimes."

00:00:43.905 --> 00:00:44.775
Hope you enjoy!

00:00:50.239 --> 00:00:51.349
<v Amos Wenger>Both of you say something?

00:00:51.464 --> 00:00:51.984
<v Amanda Majorowicz>Something?

00:00:52.094 --> 00:00:53.734
<v James Munns>Hello, I am James.

00:00:54.199 --> 00:00:55.029
<v Amos Wenger>Yeah, we have you.

00:00:55.144 --> 00:00:57.532
I can see the levels in Cleanfeed,
which makes me feel safe.

00:00:57.629 --> 00:01:00.067
I can even like hit the download
button as many times as I want.

00:01:00.077 --> 00:01:00.957
Like, anxiety?

00:01:01.007 --> 00:01:01.277
Boom.

00:01:01.327 --> 00:01:01.687
Download.

00:01:01.717 --> 00:01:03.090
Now you have all the files on your disk.

00:01:03.293 --> 00:01:05.098
<v Amanda Majorowicz>That
sounds like too many...

00:01:05.208 --> 00:01:06.038
I don't like it.

00:01:06.108 --> 00:01:07.358
I get anxiety by that.

00:01:07.508 --> 00:01:08.108
<v Amos Wenger>It's fine.

00:01:08.148 --> 00:01:09.188
I'm the one doing it.

00:01:09.188 --> 00:01:09.778
<v Amanda Majorowicz>I'm like: wait.

00:01:09.848 --> 00:01:10.498
No, no, no.

00:01:10.568 --> 00:01:11.028
No, no, no.

00:01:11.028 --> 00:01:11.738
That's too many files.

00:01:12.008 --> 00:01:13.198
<v Amos Wenger>Well, okay.

00:01:13.258 --> 00:01:14.068
I understand that.

00:01:14.138 --> 00:01:16.928
Yeah, but losing the audio
is the higher anxiety.

00:01:16.938 --> 00:01:17.208
Surely.

00:01:17.603 --> 00:01:17.953
No?

00:01:18.118 --> 00:01:19.328
<v Amanda Majorowicz>Maybe, I don't know.

00:01:19.358 --> 00:01:22.033
It's intermittent- it's
anxiety either way, I guess.

00:01:23.218 --> 00:01:24.678
<v Amos Wenger>Ok ok ok,
we'll fight about it later.

00:01:24.988 --> 00:01:27.558
<v James Munns>We can have
our own special anxieties.

00:01:27.618 --> 00:01:28.608
<v Amos Wenger>I understand.

00:01:28.643 --> 00:01:30.253
<v Amanda Majorowicz>Do you want
the little bits of anxiety or just

00:01:30.253 --> 00:01:31.993
the really big potential anxiety?

00:01:32.003 --> 00:01:32.621
One of the two.

00:01:32.671 --> 00:01:33.041
<v Amos Wenger>All right.

00:01:33.171 --> 00:01:35.181
Well, James, what do
you have for us today?

00:01:35.181 --> 00:01:35.521
<v James Munns>Okay.

00:01:35.961 --> 00:01:41.401
I've been working on Postcard RPC because
I got nerd sniped into how to fix some

00:01:41.411 --> 00:01:47.781
things that I didn't think needed to be
fixed, but that required some trickery.

00:01:47.941 --> 00:01:53.550
So I've been working on doing
tricky things at compile time.

00:01:53.643 --> 00:01:57.763
This is me documenting some of the
silly tricks that I've done because I

00:01:57.773 --> 00:01:59.513
think you might appreciate them Amos.

00:02:00.083 --> 00:02:00.633
<v Amos Wenger>I hope so.

00:02:00.633 --> 00:02:02.793
<v James Munns>One of the things
that I've been working on is

00:02:02.863 --> 00:02:04.383
preparing a list of schemas.

00:02:04.533 --> 00:02:08.153
So if I'm making a device and I want to
say: these are all the types that the

00:02:08.153 --> 00:02:13.513
device can send or receive, sort of like
an OpenAPI spec if you're familiar with

00:02:13.513 --> 00:02:16.813
those, but just a list of, "Hey, when you
talk to me, these are all the different

00:02:16.813 --> 00:02:19.273
kinds of data you can send and receive."

00:02:20.433 --> 00:02:23.063
And in the past I hadn't
been making them unique.

00:02:23.103 --> 00:02:28.783
I would just kind of do a macro expansion,
and if you had 10 endpoints that all sent

00:02:28.783 --> 00:02:33.363
the same data, then you would just get 10
copies of that schema, but schemas aren't

00:02:33.363 --> 00:02:35.523
so big, so I wasn't too bothered by it.

00:02:36.023 --> 00:02:38.823
It wasn't great, but it also
wasn't practically a problem.

00:02:39.093 --> 00:02:39.653
<v Amos Wenger>Right...

00:02:40.063 --> 00:02:41.643
do you have a unique identifier for them?

00:02:41.662 --> 00:02:42.848
Can you hash the schemas?

00:02:42.868 --> 00:02:43.748
How does that actually work?

00:02:43.748 --> 00:02:46.358
Because like, doing anything at
compile time is complicated enough.

00:02:46.508 --> 00:02:48.498
But now I'm just thinking, even
if you didn't have the compile

00:02:48.498 --> 00:02:49.948
time constraint, how do you...

00:02:49.958 --> 00:02:50.848
<v James Munns>The answer is yes.

00:02:51.008 --> 00:02:54.898
Yeah, so as a reminder, Postcard RPC- we
did an episode on it, and I can't remember

00:02:54.898 --> 00:03:00.508
how deep I went into this part of it, but-
I do have a const function that hashes the

00:03:00.508 --> 00:03:06.968
schema and the URI together, and produces
an 8 byte key at const time for that.

00:03:07.198 --> 00:03:11.598
So you will have a const schema for
a type, so there's a trait Schema

00:03:11.598 --> 00:03:14.670
that gives you a const SCHEMA, which
is a type that is like a recursive

00:03:14.670 --> 00:03:17.479
struct that describes the data.

00:03:18.019 --> 00:03:21.039
And then I have a function that
can take that type and produce

00:03:21.069 --> 00:03:23.679
an eight byte hash for that.

00:03:24.399 --> 00:03:28.449
And then that's usually what I use as
my message identifiers on the wire.

00:03:28.647 --> 00:03:30.657
And that means you also don't
need to calculate it at runtime.

00:03:30.657 --> 00:03:33.417
It just becomes this bag of
eight bytes that you append

00:03:33.737 --> 00:03:35.177
to the front of every message.

00:03:35.987 --> 00:03:39.785
So one of the things I would like
to do is deduplicating both that set

00:03:39.835 --> 00:03:45.153
of key hashes, as well as the set of
schemas, because if I am allowing you

00:03:45.153 --> 00:03:49.983
to download types, like reflection
style, of what types are used for this,

00:03:50.213 --> 00:03:52.273
I need to deduplicate both the IDs.

00:03:52.283 --> 00:03:56.223
So these are all of my endpoints by
ID, as well as these are all of the

00:03:56.233 --> 00:04:00.503
types that are used by all of those
endpoints, and ideally deduplicating

00:04:00.513 --> 00:04:04.883
them, so that if I have five endpoints
that all return the same struct, just

00:04:04.883 --> 00:04:09.692
in a different context, then I don't
repeat that definition five times.

00:04:10.082 --> 00:04:13.187
<v Amos Wenger>Question, is it a
concern    of like, your binaries

00:04:13.207 --> 00:04:15.747
need to be as tiny as possible
because you're doing this on embedded?

00:04:15.997 --> 00:04:18.507
Is it a constraint of, like
it's a bandwidth problem?

00:04:18.507 --> 00:04:20.427
You want the response to
be as small as possible?

00:04:20.517 --> 00:04:22.847
Why are you even deduplicating
them in the first place?

00:04:23.017 --> 00:04:25.547
Or just someone said, "Oh,
that's messy, you should fix it."

00:04:25.700 --> 00:04:28.690
<v James Munns>There's actually
multiple parts to this here.

00:04:28.800 --> 00:04:32.010
The main answer is for bandwidth.

00:04:32.340 --> 00:04:35.630
So right now I use USB, and even
though it's sort of the full speed,

00:04:35.630 --> 00:04:38.777
which is the lower USB 1.1 speeds.

00:04:38.877 --> 00:04:39.437
<v Amos Wenger>Right.

00:04:39.447 --> 00:04:40.987
<v James Munns>It's still fast
enough, and the schemas aren't

00:04:41.007 --> 00:04:42.317
large enough, it isn't a problem.

00:04:42.337 --> 00:04:45.107
But I've talked to some people that
want to use this over much lower

00:04:45.117 --> 00:04:49.942
bandwidth links- so either radios,
or very old serial ports- saying: Oh,

00:04:49.942 --> 00:04:54.712
well, you're just gonna transfer five
kilobytes of schemas is more problematic

00:04:54.742 --> 00:04:58.512
than crunching that all down to maybe
a couple hundred bytes, ideally,

00:04:58.782 --> 00:05:03.142
particularly if we have very repetitive
or very recursively nested types.

00:05:03.212 --> 00:05:07.592
So on one hand, I didn't know how to do
it before and I managed to figure out how.

00:05:07.669 --> 00:05:11.469
So it's not like there's a specific
pressure on this, but in this

00:05:11.469 --> 00:05:15.333
case, reducing the amount of
packets that I need to send when

00:05:15.333 --> 00:05:17.143
you say, "Tell me about yourself!"

00:05:17.563 --> 00:05:22.706
I want to describe myself as succinctly
as possible usually for bandwidth reasons.

00:05:22.926 --> 00:05:25.776
For some implementation reasons,
a lot of these end up being

00:05:25.926 --> 00:05:28.626
statics or static references.

00:05:28.806 --> 00:05:31.796
So they actually do get kind of
deduplicated by the linker already.

00:05:32.016 --> 00:05:35.786
So this is more about when I've been
asked to describe myself, how many

00:05:35.796 --> 00:05:39.646
packets I need to send to describe
myself and how large are those packets.

00:05:39.841 --> 00:05:41.591
<v Amos Wenger>For the past two
minutes, I've been picturing...

00:05:41.631 --> 00:05:46.933
there's a short by I want to say Pixar,
about a letter that goes through the air.

00:05:46.993 --> 00:05:49.723
Cause I've been thinking about
Postcard over radio frequency.

00:05:50.426 --> 00:05:51.543
I have that short in mind.

00:05:51.543 --> 00:05:53.003
I think you should make
that part of your branding.

00:05:53.933 --> 00:05:55.213
I was totally listening
to your explanation.

00:05:55.213 --> 00:05:56.283
Please, please continue James.

00:05:57.683 --> 00:06:01.353
<v James Munns>I've kind of crunched out
a very basic example of this so it's a

00:06:01.353 --> 00:06:04.943
little easier to follow if you don't know
how all the guts of Postcard RPC work.

00:06:04.983 --> 00:06:11.100
But let's say I have a list of
u32s, so just integers unsigned.

00:06:11.100 --> 00:06:14.170
The list is 1, 5, 10, 5, 20, 1.

00:06:14.580 --> 00:06:18.740
So we notice that there are some
duplicates on that list, and if I

00:06:18.740 --> 00:06:20.680
wanted to get rid of those duplicates,

00:06:21.085 --> 00:06:25.145
I want to end up with an
array that is 1, 5, 10, or 20.

00:06:25.145 --> 00:06:28.315
I don't particularly care about
the order, but I don't want there

00:06:28.315 --> 00:06:30.695
to be any duplicates in this list.

00:06:30.705 --> 00:06:35.209
So I've removed one of the
fives and one of the ones.

00:06:35.309 --> 00:06:38.059
<v Amos Wenger>Luckily,
HashSet is const, right?

00:06:38.209 --> 00:06:39.639
Please tell me HashSet is const.

00:06:39.699 --> 00:06:41.799
Cause that's what I would do at runtime.

00:06:42.359 --> 00:06:46.619
<v James Munns>Well, there is an issue
with that and we will get to that.

00:06:47.339 --> 00:06:48.129
<v Amos Wenger>Okay, okay.

00:06:48.284 --> 00:06:51.004
<v James Munns>So if I wanted to
do this with a procedural or a

00:06:51.004 --> 00:06:54.809
declarative macro, the issue is
that this only works with tokens.

00:06:54.849 --> 00:06:58.919
So procedural and declarative macros can
do some really cool things and if you

00:06:58.919 --> 00:07:03.109
throw syn at the problem, you can write
a proc macro that probably could do all

00:07:03.109 --> 00:07:06.439
the things you just described and take
all those tokens in, convert them into

00:07:06.449 --> 00:07:12.679
integers, do the hash set of them, and
produce a new token tree or whatever of

00:07:12.939 --> 00:07:14.779
the unique integers, and that would work.

00:07:15.319 --> 00:07:20.309
And it would work if I gave you the
input of this array 1, 5, 10, 5, 20, 1.

00:07:20.554 --> 00:07:20.944
<v Amos Wenger>Yes.

00:07:21.093 --> 00:07:23.995
<v James Munns>But if I give you another
const or a static or something like

00:07:23.995 --> 00:07:26.175
that, these macros only work with tokens.

00:07:26.185 --> 00:07:30.715
They can't see the value behind them,
which means that it won't work if I just

00:07:30.715 --> 00:07:35.665
say I have a `const LIST_WITH_DUPES`
because they can't understand that.

00:07:35.770 --> 00:07:37.760
<v Amos Wenger>Yeah, all they
would see is the identifier.

00:07:37.760 --> 00:07:39.597
They wouldn't actually see
the value that's behind that.

00:07:39.630 --> 00:07:42.600
And I don't think you can even
fake your way around that.

00:07:42.660 --> 00:07:47.387
Cause syn, S Y N is a parser for Rust.

00:07:47.387 --> 00:07:50.707
It's an entirely separate
parser from rustc's parser which

00:07:50.732 --> 00:07:52.077
have different functionality.

00:07:52.117 --> 00:07:53.467
Can we say bitched on the podcast?

00:07:53.497 --> 00:07:54.837
I bitched about it on X.

00:07:54.837 --> 00:07:56.007
You can bleep that if you want.

00:07:56.007 --> 00:07:56.517
I don't care.

00:07:56.517 --> 00:07:57.517
Anyway, uh,

00:07:59.747 --> 00:08:03.847
I've bitched about that before: that like,
oh, rustc has its own parser and then syn

00:08:03.867 --> 00:08:06.797
is also a parser, and it's messy my head.

00:08:06.797 --> 00:08:09.938
In my perfect view of the world, there's
only one parser that can do it all.

00:08:10.078 --> 00:08:14.128
But I was explained that actually, the
rustc parser is better at recovering from

00:08:14.128 --> 00:08:18.120
parse errors and emitting very detailed
diagnostics, error messages, yeah.

00:08:18.769 --> 00:08:24.289
Whereas syn is better at keeping the
exact structure of the AST exactly the

00:08:24.289 --> 00:08:28.319
way written, even when it wouldn't matter
for rustc, like new lines and the exact

00:08:28.319 --> 00:08:30.139
formatting of comments and everything.

00:08:30.339 --> 00:08:33.929
So it actually has a function, but then
even if you have that instead of having

00:08:33.969 --> 00:08:37.979
a stream of tokens, like you said, like
identifiers, numbers, then you still

00:08:37.989 --> 00:08:41.829
have just this tree of AST and it doesn't
do any of the name lookup for you.

00:08:41.909 --> 00:08:44.042
<v James Munns>Yeah, and that's the
problem, is that that is a token.

00:08:44.081 --> 00:08:48.327
That is an identifier, whatever you want
to qualify it as, and in proc macro or

00:08:48.367 --> 00:08:52.362
declarative macro world, they work with
tokens, they do not work with source code.

00:08:52.362 --> 00:08:56.367
They don't understand that it
is an array, or that it has some

00:08:56.367 --> 00:08:57.487
length or anything like that.

00:08:57.707 --> 00:09:00.787
It is just an identifier that
it has parsed as an identifier.

00:09:00.807 --> 00:09:04.338
And like I said, it can't do
anything with this because macro

00:09:04.338 --> 00:09:07.438
world doesn't think about Rust code.

00:09:07.448 --> 00:09:10.258
It thinks about Rust
source, if that makes sense.

00:09:10.308 --> 00:09:11.688
It's not tied to the semantics.

00:09:11.968 --> 00:09:15.558
The other tool that we have here- and
like I mentioned with that is `const

00:09:15.558 --> 00:09:20.575
fn`, which is Rust code, does think
about Rust code, does have access to Rust

00:09:20.575 --> 00:09:23.195
code, but it has some restrictions to it.

00:09:23.205 --> 00:09:25.325
So you mentioned: oh, I would
just throw it in a HashSet.

00:09:25.535 --> 00:09:30.846
Well, at least currently, you can't
use heap allocations in const.

00:09:30.896 --> 00:09:35.063
And even if you could, there's no
way to then turn it into a static

00:09:35.073 --> 00:09:39.096
that would work on my no standard
embedded system or anything like that.

00:09:39.116 --> 00:09:42.926
And I think there is some interest,
and I've talked to the `const fn` folks

00:09:42.926 --> 00:09:46.556
like Oli, and there is interest in the
future of being able to make like a

00:09:46.556 --> 00:09:52.696
vec in a `const fn` and then turn it
into a slice or a fixed size array, and

00:09:52.696 --> 00:09:54.086
then use that as just a plain const.

00:09:54.086 --> 00:09:57.156
I think, some other languages, like
maybe Zig, have the ability to do that.

00:09:57.346 --> 00:09:58.356
<v Amos Wenger>I was gonna
make that parallel.

00:09:58.356 --> 00:10:01.586
Zig has comptime- evaluation,
compile time- and I think you

00:10:01.586 --> 00:10:06.196
can run any arbitrary Zig code
in there because they don't care.

00:10:06.516 --> 00:10:10.066
But in Rust, const is
actually run through Miri?

00:10:10.601 --> 00:10:11.761
Is that, yeah?

00:10:12.041 --> 00:10:14.968
And so, you have very
limited capabilities.

00:10:14.968 --> 00:10:20.228
And I think like over time we've unlocked
new abilities in const piecemeal.

00:10:20.248 --> 00:10:22.168
It's been very slow and very
frustrating for people who are

00:10:22.168 --> 00:10:25.358
used to Zig where you can just like
fucking run whatever code you want.

00:10:26.243 --> 00:10:26.495
<v James Munns>Yeah.

00:10:26.545 --> 00:10:28.538
I think it is something that
will get better, but I'm

00:10:28.538 --> 00:10:30.088
working with stable Rust today.

00:10:30.148 --> 00:10:34.628
I'm sort of limited because we
either start with a fixed size

00:10:34.628 --> 00:10:40.538
array, so like a [u32; 6] or a
slice of u32 of variable size.

00:10:40.548 --> 00:10:41.478
So we start with that.

00:10:42.213 --> 00:10:44.873
And based on my answer, I said
we want to end up with either a

00:10:44.873 --> 00:10:50.463
[u32; 4], so shrunk to just the
uniques, or another slice of u32.

00:10:51.133 --> 00:10:53.883
And there's this problem is that
there's no alloc and const, which

00:10:53.883 --> 00:10:57.463
means I can't just put it in a vec,
collect them all, and then dump it out.

00:10:57.723 --> 00:11:02.154
I need to figure out what type they are,
because when you declare a const fn,

00:11:02.474 --> 00:11:05.694
like any other function in Rust, you
have to say the signature of the inputs

00:11:05.934 --> 00:11:09.114
and the signature of the outputs when
you define the function, which means

00:11:09.205 --> 00:11:15.035
different instances of this sort of need
to take and return different pieces, but

00:11:15.035 --> 00:11:17.905
I found sort of a way to navigate this.

00:11:18.435 --> 00:11:19.445
<v Amos Wenger>I'm very curious about that.

00:11:19.495 --> 00:11:21.275
I have no idea how you
got out of this one.

00:11:21.285 --> 00:11:24.515
Usually it's just like, "Well, const
can't do it yet, so fuck it, too bad."

00:11:24.565 --> 00:11:26.045
<v James Munns>Yeah, yeah.

00:11:26.585 --> 00:11:29.285
So, step one: we take our [u32; 6].

00:11:29.285 --> 00:11:33.535
It is a const value that we've, let's
say, expanded in a macro, where I

00:11:33.535 --> 00:11:36.625
say: okay, you gave me the six things
that you care about, I've done some

00:11:36.625 --> 00:11:40.995
other things in const world, and I
end up with this list of six integers.

00:11:40.995 --> 00:11:43.535
And I know that it is an
array of six integers.

00:11:44.055 --> 00:11:49.505
So my first step is: we can get the
length of a constant array in const.

00:11:50.493 --> 00:11:54.513
.len() is a const fn, and so if I
take a const value and call .len()

00:11:54.513 --> 00:11:56.193
on it, I can get the size of it.

00:11:56.703 --> 00:12:00.893
My first step is going to be
to convert this [u32; 6] into

00:12:00.893 --> 00:12:04.160
an option [u32; 6] and a usize.

00:12:04.975 --> 00:12:07.245
I have this const fn, which
I'm pretty sure works.

00:12:07.245 --> 00:12:08.915
I kind of wrote it in pseudocode.

00:12:08.925 --> 00:12:12.085
So if it doesn't perfectly compile,
I'll point you towards the real thing.

00:12:12.193 --> 00:12:16.759
We can say: we have a function called
uniques, which takes a generic argument

00:12:16.839 --> 00:12:20.219
N, which is a usize, and a slice of u32s.

00:12:20.914 --> 00:12:21.174
<v Amos Wenger>Okay.

00:12:21.174 --> 00:12:21.434
Okay.

00:12:21.564 --> 00:12:22.434
I have to stop you, James.

00:12:22.474 --> 00:12:23.274
You know I have to.

00:12:23.354 --> 00:12:25.494
First of all, are you using a bitmap font?

00:12:25.955 --> 00:12:26.486
<v James Munns>Yes!

00:12:26.491 --> 00:12:28.032
Departure Mono, it's beautiful.

00:12:28.081 --> 00:12:30.232
<v Amos Wenger>Okay, I'm
glad we clarified that.

00:12:30.462 --> 00:12:33.182
Second all:  I think we need to
tell the people, in Rust, they

00:12:33.182 --> 00:12:36.342
can make functions generic, so
they can accept type parameters.

00:12:36.422 --> 00:12:41.072
A compare function can take two
things that implement Ord, then

00:12:41.122 --> 00:12:43.552
you can compare them, they can
be ints or strings or whatever.

00:12:43.692 --> 00:12:48.842
So you have this idea of type parameters,
and like in Java, like in other languages

00:12:48.842 --> 00:12:53.754
with that kind of influence, it takes
type parameters between chevrons?

00:12:53.781 --> 00:12:54.971
Do you have another name for them?

00:12:54.998 --> 00:12:56.316
<v James Munns>I'd say
angle brackets, but...

00:12:56.359 --> 00:12:57.839
<v Amos Wenger>Angle brackets, exactly.

00:12:57.903 --> 00:13:02.152
But then in const code, you can also
accept not just types, but also values.

00:13:02.212 --> 00:13:05.852
And so, if you haven't written const code
before, and you're looking at the same

00:13:05.852 --> 00:13:10.532
slide I am, it will at first look a little
weird, because between the angle brackets,

00:13:10.532 --> 00:13:13.402
there's something `const N: usize`.

00:13:13.827 --> 00:13:20.383
like, the function can be parameterized
by any of the 4- no, that's 32 bits...

00:13:20.518 --> 00:13:20.698
<v James Munns>Yeah.

00:13:20.748 --> 00:13:21.878
These are const generics.

00:13:21.904 --> 00:13:23.588
This is just another flavor of generics.

00:13:23.608 --> 00:13:26.538
You can use them on `const fn`s,
you can use them on non `const fn`s.

00:13:26.718 --> 00:13:30.666
You can have types, like a struct
that is generic over const n usize.

00:13:30.906 --> 00:13:34.636
For example, if you wanted a struct that
was generic over the size of a buffer,

00:13:34.896 --> 00:13:41.836
you could have that buffer be u8; N, where
N is the const N usize thing like that.

00:13:41.836 --> 00:13:45.666
We use this a lot in embedded because we
have to right size our buffers because

00:13:45.666 --> 00:13:47.316
we don't have an allocator very often.

00:13:47.316 --> 00:13:49.526
So yeah, this is const generics.

00:13:49.576 --> 00:13:53.646
We're parameterizing over
a value rather than a type.

00:13:53.696 --> 00:13:55.106
We'll just take that as a given right now.

00:13:55.106 --> 00:13:57.691
So we have this N, which is a length.

00:13:57.911 --> 00:14:00.651
It's actually the length, that
number six that we had before.

00:14:00.951 --> 00:14:05.999
And we take in a slice of u32s
and we return an array of option

00:14:05.999 --> 00:14:08.819
u32 of length N and a usize.

00:14:09.409 --> 00:14:12.769
So the first thing we check is
that the generic parameter N

00:14:12.849 --> 00:14:14.549
matches the length of our slice.

00:14:14.789 --> 00:14:18.579
That's just a little robustness of making
sure that we handed the right thing in.

00:14:19.249 --> 00:14:24.937
And then we create a new output array
that is just an array of None; N.

00:14:25.188 --> 00:14:31.088
And because iterator is a trait and we
can't use trait methods in const fns,

00:14:31.088 --> 00:14:35.668
we get to hand roll our own while loops
for iteration, which means I start my

00:14:36.263 --> 00:14:40.263
output index, which is going to track
how many things we've placed in that

00:14:40.283 --> 00:14:44.963
out buffer, and i, which is where
I'm kind of scanning my input from.

00:14:45.013 --> 00:14:49.673
It's back to the lovely C style- actually
not even C style because we don't have for

00:14:49.673 --> 00:14:54.158
loops, we just have while loops at this
point of manually managing our iteration.

00:14:55.108 --> 00:14:59.231
While we're iterating over the
input slice, we're going to then

00:14:59.241 --> 00:15:03.551
iterate over anything in that
out that has turned into a Some.

00:15:03.640 --> 00:15:05.330
So basically I'm just
walking through the list.

00:15:05.330 --> 00:15:08.870
And if I find a new unique value,
I kind of push it into this out.

00:15:08.910 --> 00:15:13.550
But instead of it being like a vec, it's
an array of options and I just turn the

00:15:13.550 --> 00:15:17.580
nones into some as my push primitive here.

00:15:17.780 --> 00:15:20.330
<v Amos Wenger>I have so many
notes on this, but I'm going

00:15:20.330 --> 00:15:21.440
to- I'm going to let you finish.

00:15:21.770 --> 00:15:25.150
<v James Munns>Yeah, it's horrifically
O(N^2), but again, this is the compiler's

00:15:25.180 --> 00:15:28.720
problem and not my runtime code's problem,
so I'm not going to worry about it.

00:15:28.750 --> 00:15:32.320
I walk through the input list, pushing
items that I find that are unique.

00:15:32.360 --> 00:15:35.750
If I find that they are not unique,
I don't push them into the list

00:15:35.780 --> 00:15:36.700
because they're already there.

00:15:37.110 --> 00:15:42.310
And then at the end of this list, I'm
left with my full array of six items.

00:15:42.380 --> 00:15:47.304
I will have pushed four items into
it, and because of my outindex

00:15:47.314 --> 00:15:53.814
variable is 4, I return that whole
option list and the number 4.

00:15:54.134 --> 00:15:56.764
<v Amos Wenger>Believe it or not,
the fact that you're doing a linear

00:15:56.794 --> 00:15:59.924
lookup every time instead of hashing
was not even one of my notes.

00:16:00.284 --> 00:16:02.304
This is not even the level
I'm operating on right now.

00:16:02.344 --> 00:16:06.971
I'm looking at option u32 and I'm
thinking there's no niche for u32.

00:16:07.021 --> 00:16:09.581
So you're essentially, you're
allocating too much, but again,

00:16:09.581 --> 00:16:10.831
it's all in const, so who cares?

00:16:11.200 --> 00:16:11.910
<v James Munns>We'll get there.

00:16:12.020 --> 00:16:12.420
<v Amos Wenger>Okay.

00:16:12.510 --> 00:16:16.010
I like the trick that you don't know
the size of the output, so you just fuck

00:16:16.010 --> 00:16:17.370
it, like every element is an option.

00:16:17.421 --> 00:16:20.601
You know the maximum number of them, so
you just allocate a little too much and

00:16:20.601 --> 00:16:22.461
then you return, what, the actual size?

00:16:22.801 --> 00:16:26.391
No, I'm looking at this and going this
is maybe uninit- sorry, maybe uninit?

00:16:26.941 --> 00:16:29.421
<v James Munns>Yeah, there's some
tricks and I tried using this.

00:16:30.033 --> 00:16:36.191
The answer is that uninit arrays are
hard to do with stable const today

00:16:36.191 --> 00:16:40.351
because at the end you would want
to then turn it into an init array.

00:16:40.916 --> 00:16:44.556
So I did try that, but the answer
is: not quite enough things are

00:16:44.556 --> 00:16:49.196
stable in const with maybe uninit,
specifically, maybe uninit arrays.

00:16:49.386 --> 00:16:49.516
<v Amos Wenger>Right.

00:16:49.583 --> 00:16:50.493
<v James Munns>There
might be a way to do it.

00:16:50.493 --> 00:16:51.713
I just didn't figure it

00:16:51.713 --> 00:16:51.953
<v Amos Wenger>out.

00:16:51.953 --> 00:16:52.677
Those are weird.

00:16:52.687 --> 00:16:56.447
They don't fit in my need do an episode
about them just so that I can refer back

00:16:56.447 --> 00:16:58.357
to it and then understand them properly.

00:16:58.394 --> 00:16:58.676
<v James Munns>Cool.

00:16:58.696 --> 00:16:59.866
So this is a neat function.

00:16:59.896 --> 00:17:01.328
It takes some values.

00:17:01.388 --> 00:17:02.628
Let's look at what it might look like.

00:17:02.638 --> 00:17:05.238
So I have this array
`const LIST_WITH_DUPES`.

00:17:05.258 --> 00:17:06.538
It is a slice.

00:17:06.608 --> 00:17:07.858
It's got six items.

00:17:08.007 --> 00:17:12.577
I can make another const just called
len that is a usize that just takes

00:17:12.577 --> 00:17:16.637
that first constant and calls .len()
on it because len is a const fn.

00:17:16.637 --> 00:17:19.429
You can take the input of
another const, call a const

00:17:19.439 --> 00:17:20.789
function on it and get a value.

00:17:20.799 --> 00:17:22.629
So we can then get len.

00:17:23.289 --> 00:17:27.141
And then I can make this third const
called intermediate, which is that

00:17:27.251 --> 00:17:33.571
tuple of an array of options of len
with a usize, and then I can call my

00:17:33.571 --> 00:17:38.919
const fn, `uniques::<LEN>`, and then
pass it the slice, list with dupes.

00:17:39.229 --> 00:17:43.659
So then I end up with this array and
the length of unique items in it.

00:17:43.704 --> 00:17:46.886
<v Amos Wenger>Why can't you just do
all of that inside of a `const fn`?

00:17:46.906 --> 00:17:47.936
Is what I'm not understanding.

00:17:47.946 --> 00:17:52.366
If you pass it a u32 slice, can't you
just then call len inside of there?

00:17:52.435 --> 00:17:53.765
<v James Munns>Sure, and
what's your return type?

00:17:54.315 --> 00:17:54.655
<v Amos Wenger>That's...

00:17:54.725 --> 00:17:55.015
okay.

00:17:57.284 --> 00:17:59.694
<v James Munns>So ideally you would have
a slice, but then you have to return

00:17:59.694 --> 00:18:03.664
something, and you can't allocate the
array inside of the function and then

00:18:03.664 --> 00:18:07.479
return a slice of it, because then
you haven't bound that to anything.

00:18:07.814 --> 00:18:08.244
<v Amos Wenger>Sure.

00:18:08.274 --> 00:18:08.664
Okay.

00:18:09.184 --> 00:18:11.881
<v James Munns>So that's the answer
is: because we need to figure

00:18:11.881 --> 00:18:14.389
out the types   in each step.

00:18:14.886 --> 00:18:15.406
<v Amos Wenger>Makes sense.

00:18:15.526 --> 00:18:15.776
Gotcha.

00:18:16.226 --> 00:18:17.566
<v James Munns>That's
not quite what we want.

00:18:17.646 --> 00:18:21.096
We now have all of our uniques, but that's
not how we would want to use them where we

00:18:21.096 --> 00:18:22.556
have to like iterate through this array.

00:18:22.556 --> 00:18:26.356
And like you said, options are probably
going to double the size of a u32

00:18:26.406 --> 00:18:30.469
array because there's no niche and
with alignment, it needs to work out.

00:18:30.519 --> 00:18:33.349
We're essentially doubling our size,
actually more than that: we want to have

00:18:33.349 --> 00:18:37.876
four items in the end, and we now have
like, 12 words worth of stuff because we

00:18:37.876 --> 00:18:40.196
have an option u32 or something like that.

00:18:40.281 --> 00:18:43.426
<v Amos Wenger>Yeah, we have identified the
uniques, but we haven't shrunken anything.

00:18:43.426 --> 00:18:44.406
We're actually using more memory.

00:18:44.526 --> 00:18:46.256
Maybe we should explain what a niche is?

00:18:46.336 --> 00:18:51.506
It's like for some types, like if you
have non zero u32, for example, you know

00:18:51.506 --> 00:18:53.861
that zero is not a legal value for that.

00:18:53.891 --> 00:18:55.651
So you can use that to signify none.

00:18:55.651 --> 00:18:59.101
If you have an option non zero
u32, then it's just like a u32,

00:18:59.101 --> 00:19:01.931
but zero means none and all the
other values are some, something.

00:19:01.981 --> 00:19:02.861
Rust does that a lot.

00:19:02.871 --> 00:19:04.581
You can do that with
like non null pointers.

00:19:04.581 --> 00:19:06.331
You can do that with enums.

00:19:06.541 --> 00:19:09.228
An option of an enum is not
necessarily larger than the enum.

00:19:09.281 --> 00:19:11.671
<v James Munns>The real code for
this definitely does that because

00:19:11.691 --> 00:19:15.851
I have an option reference to the
schema when I'm deduplicating it.

00:19:16.111 --> 00:19:18.061
And because references can never be zero,

00:19:18.171 --> 00:19:19.418
<v Amos Wenger>Cannot be null  yes.

00:19:19.538 --> 00:19:20.604
<v James Munns>So they also get niche'd.

00:19:20.604 --> 00:19:24.114
So even my option reference is
still just the size of a reference.

00:19:24.248 --> 00:19:28.178
I do take advantage of that there, but not
here because this is a simplified example.

00:19:28.786 --> 00:19:29.736
That's not quite what we want.

00:19:29.744 --> 00:19:35.874
Let's write that function that extracts
the uniques from that option array.

00:19:36.224 --> 00:19:40.114
Now I have a const fn
that is `const M: usize`.

00:19:40.134 --> 00:19:42.544
So it's generic over const M usize.

00:19:43.124 --> 00:19:48.564
And we take a slice in, which is
our slice of option u32, and we

00:19:48.564 --> 00:19:52.964
return an array of u32 of length M.

00:19:53.275 --> 00:19:57.365
We start by creating our output,
which is that array of u32 M.

00:19:57.452 --> 00:20:00.912
We're going to start iterating where
we know M is the size of our output.

00:20:00.932 --> 00:20:03.562
That's how many unique
items we expect to have.

00:20:03.802 --> 00:20:08.052
So we start iterating from zero
until we hit that number, and

00:20:08.052 --> 00:20:09.562
we unwrap each of the items.

00:20:09.592 --> 00:20:12.367
And this is another one of those
cool things that const is if

00:20:12.367 --> 00:20:16.335
you unwrap and panic in a const
fn, the compile just fails.

00:20:16.385 --> 00:20:19.685
It is a compile time error
rather than a run time error.

00:20:20.005 --> 00:20:24.945
So we go through and we fill in every
value of the output array with Some.

00:20:25.405 --> 00:20:30.515
And then just as a reasonable check, a
robustness check, we then keep iterating

00:20:30.535 --> 00:20:34.075
to the end of the input array and
make sure that all of those are none.

00:20:34.325 --> 00:20:38.605
So we're kind of verifying that: yes,
we got in a slice of six items, you

00:20:38.605 --> 00:20:41.795
told me there were going to be four
unique items in there, I was able to

00:20:41.795 --> 00:20:46.865
extract all four unique items, and then
that additional space was just nothing.

00:20:48.165 --> 00:20:51.795
When we put that all together, now
we have a whole list of consts.

00:20:51.815 --> 00:20:55.115
The first one is our input array,
the second one is the length of

00:20:55.115 --> 00:20:58.535
that input array, then we do that
intermediate step where we turn it into

00:20:58.535 --> 00:21:01.065
option, array, and the unique size.

00:21:01.185 --> 00:21:07.015
We then get the deduplicated length, which
is just that usize from the intermediate.

00:21:07.033 --> 00:21:11.013
And then we can call our extract
function with the deduplicated length

00:21:11.363 --> 00:21:13.693
and the input array of options.

00:21:14.373 --> 00:21:17.073
And then finally, we can turn
that fixed sized array of four

00:21:17.073 --> 00:21:20.663
items into a slice of u32s.

00:21:20.673 --> 00:21:26.013
Because, again, in const, you can take
a reference to a const value as a const,

00:21:26.033 --> 00:21:28.663
and the compiler will sort it out for you.

00:21:29.163 --> 00:21:32.219
<v Amos Wenger>James, you know how
some people have  prank shows where

00:21:32.219 --> 00:21:33.829
they take turns pranking each other?

00:21:33.859 --> 00:21:35.469
Is that- am I on one of
those shows right now?

00:21:35.569 --> 00:21:38.019
Because I'm looking at the thing and I
don't know whether to be impressed or

00:21:38.019 --> 00:21:42.469
disgusted which if I recall correctly
is how you felt about my, "Oh we can

00:21:42.498 --> 00:21:46.373
cast that lifetime to static because
it's just for the duration of a call

00:21:46.373 --> 00:21:49.593
and it's blocking so within that
thread it stays valid," and you were

00:21:49.593 --> 00:21:51.423
like, "I don't know if that's true..."

00:21:51.604 --> 00:21:52.754
<v James Munns>Yeah, I mean,
that's what I go for.

00:21:52.804 --> 00:21:56.484
I love solutions that are just: how
could you, but I couldn't think of

00:21:56.484 --> 00:21:58.254
a different, better way to do that.

00:21:58.539 --> 00:22:00.959
<v Amos Wenger>Please tell me that
you actually have a declarative

00:22:00.974 --> 00:22:04.933
macro that actually puts all that in
module and hides it away from you.

00:22:04.983 --> 00:22:05.763
Please tell me that.

00:22:05.850 --> 00:22:07.110
<v James Munns>I'm so glad you asked.

00:22:07.460 --> 00:22:10.290
So here is that, uh, declarative
macro, because you can take the

00:22:10.300 --> 00:22:14.390
input as an ident, and this is
a really interesting technique.

00:22:14.390 --> 00:22:20.341
It makes sense, but you can kind of
smash together macros with const fns,

00:22:21.006 --> 00:22:25.706
and actually sort of bounce between
the two, because you can use macros

00:22:25.706 --> 00:22:28.966
inside of `const fn`s, and you can
use `const fn`s inside of macros.

00:22:29.236 --> 00:22:32.516
And even though it seems unreasonable,
because you think, "Oh well, what

00:22:32.516 --> 00:22:33.956
are the stages of compilation?"

00:22:34.216 --> 00:22:38.016
You can actually sort of bounce back
and forth between the two of them and

00:22:38.026 --> 00:22:42.941
have all of these intermediate consts,
which very often then get used as the

00:22:42.951 --> 00:22:46.581
input to the next stage of something
that a macro generates, or have a macro

00:22:46.581 --> 00:22:52.331
generate multiple const fn's or multiple
constant values and sort of bounce between

00:22:52.331 --> 00:22:54.001
the two until you end up with this.

00:22:54.011 --> 00:22:58.057
So this is really only sort of one
layer where I'm using a declarative

00:22:58.077 --> 00:23:02.807
macro just because you have to do this
multi step process and the user doesn't

00:23:02.817 --> 00:23:05.187
want any of those intermediate values.

00:23:05.437 --> 00:23:07.377
They just want the final value out.

00:23:07.587 --> 00:23:12.037
Now I have my `const LIST_WITH_DUPES`,
which is an array of u32s, and then

00:23:12.037 --> 00:23:16.067
I have my `const LIST_DEDUPED`, which
is also a slice of u32s, and I just

00:23:16.067 --> 00:23:21.087
call that dedupe macro, which expands
inside of a const block, which means

00:23:21.497 --> 00:23:24.377
none of those intermediate values
are going to end up in my final

00:23:24.377 --> 00:23:27.632
program, only the calculated subset.

00:23:27.872 --> 00:23:31.112
We might not even end up with the
list with dupes in our program if

00:23:31.112 --> 00:23:32.612
we don't reference it anywhere.

00:23:32.612 --> 00:23:36.292
Only the consts that we reference
are going to be used in our program.

00:23:36.562 --> 00:23:40.622
Which means we might just have all of
this intermediate garbage at compile time.

00:23:41.067 --> 00:23:47.107
But, we end up with the perfectly sized
array of unique integers at runtime

00:23:47.107 --> 00:23:51.537
that our microcontroller or whatever
can use with no allocations or anything.

00:23:52.097 --> 00:23:54.207
<v Amos Wenger>You will not remove
from my head the thought...

00:23:54.267 --> 00:23:55.617
and I know I did rubicon.

00:23:55.647 --> 00:23:57.757
I presented rubicon on this very podcast!

00:23:57.757 --> 00:23:59.137
So who am I to say anything?

00:23:59.147 --> 00:24:02.421
But: if people looked at that,
they would be like, "You know what?

00:24:02.891 --> 00:24:06.481
There's something wrong with Rust and the
Rust people and I don't want to touch it."

00:24:06.831 --> 00:24:07.801
I would understand.

00:24:07.971 --> 00:24:09.211
It would make sense to me.

00:24:09.381 --> 00:24:11.701
I couldn't blame them,
because it shouldn't exist!

00:24:11.901 --> 00:24:14.541
<v James Munns>I think there's room
for improvement, don't get me wrong.

00:24:14.678 --> 00:24:18.118
I talked to Oli about this, and
Oli's first response was, "Oh wow."

00:24:18.408 --> 00:24:21.178
Which made me feel good, because
Oli's the one that wrote a lot of the

00:24:21.178 --> 00:24:24.168
const pieces, or one of the original
architect of a lot of the const stuff.

00:24:24.408 --> 00:24:25.628
But also said: hey, you know.

00:24:25.684 --> 00:24:27.781
I'm going to have some
time in the near future.

00:24:27.781 --> 00:24:30.611
And these are exactly the pain
points we want to get rid of.

00:24:30.611 --> 00:24:32.891
Not having to have these
intermediate steps...

00:24:32.941 --> 00:24:37.561
no roadmap on this, but like I said,
being able to do allocations in

00:24:38.021 --> 00:24:41.239
const functions and then returning
the array   because that would

00:24:41.239 --> 00:24:42.769
make all of these tricks pointless.

00:24:42.769 --> 00:24:44.099
You would just have one `const fn`

00:24:44.119 --> 00:24:44.569
-
<v Amos Wenger>Of course.

00:24:44.569 --> 00:24:48.029
- <v James Munns>that has the signature we
want here, which is just slice to slice.

00:24:48.059 --> 00:24:50.384
<v Amos Wenger>Just like rubicon should
not exist because it's working around

00:24:50.384 --> 00:24:53.014
limitations of the Rust compiler
and you could, you should just

00:24:53.024 --> 00:24:55.484
really go into the Rust compiler
and add the flags that you need.

00:24:55.674 --> 00:25:00.180
It's so interesting because if you
didn't know why it is that way- really,

00:25:00.214 --> 00:25:02.808
I'm trying to approach this from the
perspective of someone who's not using

00:25:02.828 --> 00:25:06.678
Rust on a daily basis and looking all
this and going like, "Dear God, why?"

00:25:06.961 --> 00:25:11.092
But it's because Rust is doing all
this in like super hardcore difficulty

00:25:11.342 --> 00:25:14.072
because it's not just compiling
something to a binary and running it.

00:25:14.072 --> 00:25:14.902
That's proc macros.

00:25:14.932 --> 00:25:16.722
Rust does do that in some context.

00:25:16.932 --> 00:25:18.482
But for const, it's none of that.

00:25:18.502 --> 00:25:22.077
It's interpreting things and it's not
interpreting things the easy way either.

00:25:22.077 --> 00:25:26.127
It is like doing tracking memory
allocations and tagging pointers and

00:25:26.127 --> 00:25:30.597
doing this weird CHERI like thing because
Miri is actually supposed to help you

00:25:30.597 --> 00:25:32.364
detect undefined behavior and unsafe code.

00:25:32.584 --> 00:25:34.384
That's where the limitations come from.

00:25:34.709 --> 00:25:36.604
I'm actually not exactly sure...

00:25:36.654 --> 00:25:40.954
why do we need all the might of Miri
to just do `const fn`s, but I dunno.

00:25:41.054 --> 00:25:41.421
Do you?

00:25:41.421 --> 00:25:45.695
<v James Munns>Well, I mean,you're going
to need to run the code at compile time.

00:25:45.695 --> 00:25:50.345
And especially if you're cross
compiling, you need to work essentially

00:25:50.345 --> 00:25:51.932
within the Rust abstract machine.

00:25:51.962 --> 00:25:56.105
So the neat thing about Miri is that it
is  essentially a Rust abstract machine

00:25:56.115 --> 00:25:59.905
interpreter, which is why we use it as
essentially a sanitizer a lot of the

00:25:59.905 --> 00:26:03.795
time, because the VM can introspect
the world that's going on there.

00:26:04.145 --> 00:26:07.855
I don't know exactly the decisions that
got made there, but it was: well, if

00:26:07.855 --> 00:26:12.105
you have to abstractly interpret code at
compile time, because it might be for a

00:26:12.105 --> 00:26:18.335
different architecture anyway, then why
not have one really good one that can also

00:26:18.365 --> 00:26:18.615
-
<v Amos Wenger>Yeah.

00:26:18.725 --> 00:26:22.555
<v James Munns>-enforce all the invariants
of the abstract model at the same time?

00:26:23.095 --> 00:26:25.922
<v Amos Wenger>I think it's a case of like:
can do more, so it can also do less.

00:26:25.942 --> 00:26:29.285
In that context, we don't need all
the- although I guess it could detect

00:26:29.285 --> 00:26:30.945
memory unsafety in your `const fn` code.

00:26:30.965 --> 00:26:32.835
Like can you have unsafe `const fn` code?

00:26:32.975 --> 00:26:33.360
<v James Munns>It does.

00:26:33.450 --> 00:26:33.720
You can.

00:26:33.720 --> 00:26:34.730
<v Amos Wenger>That's terrible.

00:26:34.745 --> 00:26:36.815
<v James Munns>And if you ever do
something that would be undefined

00:26:36.815 --> 00:26:38.555
behavior, your code won't compile.

00:26:38.965 --> 00:26:39.655
<v Amos Wenger>I'm revolted.

00:26:39.655 --> 00:26:40.535
I'm leaving Rust.

00:26:40.695 --> 00:26:43.685
<v James Munns>So `const fn`
can have pointers and can have

00:26:43.785 --> 00:26:47.545
unsafe code and that's why maybe
uninit could be possible here.

00:26:47.565 --> 00:26:50.385
It's just that not all the
methods have been made const yet.

00:26:50.485 --> 00:26:50.935
<v Amos Wenger>Sure.

00:26:51.825 --> 00:26:52.585
<v James Munns>So...

00:26:53.215 --> 00:26:55.205
can we take this further?

00:26:55.275 --> 00:26:55.805
<v Amos Wenger>No!

00:26:55.820 --> 00:26:56.405
Enough!

00:26:56.625 --> 00:27:00.005
<v James Munns>The answer is yes,
because what if we had a list of lists?

00:27:00.305 --> 00:27:05.655
So we have a slice of slices, and
all of my slices can be different

00:27:05.655 --> 00:27:12.415
sizes, and what I'd like to end up
is with one slice that contains all

00:27:12.415 --> 00:27:14.015
of the items that are used here.

00:27:14.475 --> 00:27:19.415
So where this maps back to reality
is when you have these endpoints-

00:27:19.585 --> 00:27:20.105
<v Amos Wenger>I'm sorry.

00:27:20.375 --> 00:27:20.955
Time out.

00:27:21.165 --> 00:27:22.905
Where this maps back to reality?

00:27:23.105 --> 00:27:24.995
30 minutes into this episode?

00:27:25.005 --> 00:27:26.055
It was just a great line.

00:27:26.055 --> 00:27:30.295
<v James Munns>So where this maps back
to reality is because if you have an

00:27:30.305 --> 00:27:35.345
endpoint that contains multiple types,
so it might have the input type and

00:27:35.345 --> 00:27:38.455
the output type and the input type
might be a composite type of a- like

00:27:38.455 --> 00:27:43.498
a struct or things like that, it might
have its own list of types that are

00:27:43.498 --> 00:27:45.228
involved in the request and response.

00:27:45.468 --> 00:27:48.431
If I have this array of four
endpoints, I'm going to end up with

00:27:48.431 --> 00:27:50.589
this list of each of their types.

00:27:50.899 --> 00:27:54.667
And like I said, if I'm trying to figure
out the list of unique types in the

00:27:54.667 --> 00:27:59.997
entire system, then I want one array
of all the inputs and all the outputs,

00:27:59.997 --> 00:28:02.777
because there might be duplicates
between my different endpoints.

00:28:02.847 --> 00:28:04.697
<v Amos Wenger>So you're
also doing flattening...

00:28:05.472 --> 00:28:07.532
<v James Munns>I don't have the code
written up for this because I wrote

00:28:07.532 --> 00:28:09.762
these slides about five minutes
before we gave the talk, but I'll

00:28:09.762 --> 00:28:11.032
walk you through the process.

00:28:11.102 --> 00:28:11.582
<v Amos Wenger>James...

00:28:11.726 --> 00:28:15.846
<v James Munns>So first, we write a
`const fn` that gets the total size.

00:28:16.139 --> 00:28:20.349
It just walks through the arrays and
counts the length of each subarray.

00:28:20.619 --> 00:28:24.109
So we would say the first one's
five, the second one's one, the third

00:28:24.109 --> 00:28:25.519
one's whatever the number it was.

00:28:25.729 --> 00:28:30.609
And we just get a sum of what
is the upper bound of potential

00:28:30.609 --> 00:28:32.652
unique values that we could have.

00:28:32.652 --> 00:28:38.052
And then we turn around and use that
as our option u32 allocator basically

00:28:38.322 --> 00:28:44.292
because we go: if there was absolutely
no duplication, this list could be

00:28:44.322 --> 00:28:46.912
at most this many elements large.

00:28:47.982 --> 00:28:51.552
And then we repeat the same process
that we did for a single one.

00:28:51.762 --> 00:28:57.802
We fill all of the sublists into our
allocated chunk, and then we get the total

00:28:57.822 --> 00:29:00.192
number out of how many uniques there were.

00:29:00.772 --> 00:29:02.132
Then we're back to the same tricks.

00:29:02.132 --> 00:29:05.912
We then crunch it back down, we end
up with one slice, and we can end

00:29:05.922 --> 00:29:08.438
up with this same sort of behavior.

00:29:08.488 --> 00:29:11.636
And again, we can write a little
procedural macro that hides

00:29:11.646 --> 00:29:15.116
all of these intermediate steps
that are horrific and necessary.

00:29:15.486 --> 00:29:19.646
But we can end up from an array
of array of u32s to a single

00:29:19.666 --> 00:29:22.426
array of deduplicated u32s.

00:29:23.056 --> 00:29:25.956
And like I said, even though it
is very wasteful, I'll tell you

00:29:25.956 --> 00:29:30.986
the algorithm in Postcard RPC also
does what's called perfect hashing.

00:29:31.236 --> 00:29:34.546
When I have all of those keys, I say,
"Well, they're all eight byte keys.

00:29:34.699 --> 00:29:37.219
Could I smash them all
down to one byte keys?

00:29:37.259 --> 00:29:38.869
And would there be any collisions?

00:29:39.414 --> 00:29:43.914
If yes, could I smash them down to 2
byte keys and not have any collisions?"

00:29:44.134 --> 00:29:49.564
and if so, then I go, "Ah, I don't need
8 byte keys, I need 2 byte keys," which,

00:29:49.624 --> 00:29:53.844
when I was talking about what I send over
the wire, now I've just shrunk the header

00:29:53.844 --> 00:30:00.054
of every message from containing an 8 byte
key to a 2 byte key, and I can calculate

00:30:00.074 --> 00:30:03.679
that with no misses at compile time.

00:30:03.939 --> 00:30:06.859
Which means that you can never
screw it up because it's using all

00:30:06.859 --> 00:30:08.249
the types you're actually using.

00:30:08.669 --> 00:30:10.809
And it can always get
exactly the right number.

00:30:10.819 --> 00:30:14.919
So this technique is generally called
perfect hashing, but this is also

00:30:14.919 --> 00:30:16.419
something we can do at const time.

00:30:17.089 --> 00:30:19.109
<v Amos Wenger>So, James, I've already
been very annoying to you this episode,I

00:30:19.109 --> 00:30:22.697
want to thank  you for your patience,
but I have another nerd snipe, for

00:30:22.697 --> 00:30:25.890
the next episode, of course,  which
is you should do Huffman, because-

00:30:25.914 --> 00:30:26.764
<v James Munns>Ooh...

00:30:26.884 --> 00:30:27.134
<v Amos Wenger>Right?

00:30:27.134 --> 00:30:32.382
Maybe you have more than 256 different
message types or schemas to deduplicate.

00:30:32.752 --> 00:30:34.462
But some of them come
up more often than not.

00:30:34.483 --> 00:30:34.924
<v James Munns>That's fair.

00:30:34.967 --> 00:30:36.402
<v Amos Wenger>I think
the best explanation...

00:30:36.402 --> 00:30:39.062
Let's put it in terms
of UTF-8, they thought.

00:30:39.362 --> 00:30:42.242
Because that's what people
know, instead of Huffman.

00:30:42.332 --> 00:30:47.806
Okay in UTF-8  uh, if you send "a, b, c"
a, b, and c all each take only one byte.

00:30:48.026 --> 00:30:49.753
But then you have this seventh bit?

00:30:49.783 --> 00:30:50.263
Eighth bit?

00:30:50.378 --> 00:30:51.488
What, what am I thinking of?

00:30:51.570 --> 00:30:52.060
<v James Munns>Eighth bit.

00:30:52.320 --> 00:30:53.590
That's the extension bit.

00:30:53.660 --> 00:30:54.110
<v Amos Wenger>Eighth bit.

00:30:54.272 --> 00:30:55.162
Yeah, it's the extension bit.

00:30:55.182 --> 00:30:58.952
So you can have characters, or
whatever you want to call them.

00:30:58.982 --> 00:31:00.872
Okay, this is not the episode about UTF-8.

00:31:01.032 --> 00:31:02.492
You can have multi byte sequence.

00:31:03.152 --> 00:31:07.242
So you can have more than 256 or
more than 128 different letters.

00:31:07.272 --> 00:31:08.512
And you could do the
same with the schemas.

00:31:08.542 --> 00:31:13.172
Like maybe some schemas are used
very rarely, and you can afford

00:31:13.172 --> 00:31:16.572
to have like a two, three, four
bytes prefix for those messages.

00:31:16.872 --> 00:31:20.012
But then the ones you use the most often,
and you can actually do that because you

00:31:20.022 --> 00:31:22.042
know how often they show up in schemas.

00:31:22.084 --> 00:31:24.534
So let's see you that
in const code, James!

00:31:24.624 --> 00:31:26.993
<v James Munns>You know how often
they show up on schemas, but not how

00:31:26.993 --> 00:31:28.583
often they are sent over the wire.

00:31:28.613 --> 00:31:29.183
<v Amos Wenger>That's true.

00:31:29.213 --> 00:31:31.113
<v James Munns>You could have a
very common type that is very,

00:31:31.113 --> 00:31:32.723
very rarely actually sent.

00:31:32.723 --> 00:31:33.283
<v Amos Wenger>Exactly.

00:31:33.346 --> 00:31:35.453
So, you'd have a likely annotation?

00:31:35.453 --> 00:31:38.903
You do profile guided
optimization of schema prefixes?

00:31:39.026 --> 00:31:43.618
<v James Munns>This is more about making
sure that it can never be misunderstood.

00:31:43.798 --> 00:31:46.338
Cause that's one of the big things
about Postcard RPC is you could do

00:31:46.348 --> 00:31:50.848
all of these tricks, manually assign
message IDs, manually bump them to go,

00:31:50.888 --> 00:31:54.418
"Oh, we just rolled over 255 items.

00:31:54.418 --> 00:31:56.018
Now we need 256 items.

00:31:56.028 --> 00:31:58.488
So now all of my keys need
to bump up to two bytes."

00:31:58.958 --> 00:32:02.188
You could do that, but I've seen
those bugs in production before where

00:32:02.188 --> 00:32:05.668
you have one system that thinks it's
talking one byte keys and another

00:32:05.668 --> 00:32:07.208
system that's talking two byte keys.

00:32:07.508 --> 00:32:12.836
And this is also a reason why I chose
varints for Postcard RPC you can

00:32:12.836 --> 00:32:16.396
never misunderstand the length of an
integer because they're all varints

00:32:16.406 --> 00:32:19.516
all the time, which means there's
never any room for confusion even if

00:32:19.516 --> 00:32:21.076
you're compiling on different days.

00:32:21.366 --> 00:32:24.906
And this is the same thing is I have
a deterministic way of hashing and a

00:32:24.906 --> 00:32:30.066
deterministic way of crunching from 8 to
4 to 2 to 1, which means that also if you

00:32:30.066 --> 00:32:35.046
have a system that natively is thinking in
1 byte IDs but someone sends it an 8 byte

00:32:35.046 --> 00:32:39.281
ID, It can crunch that 8 down to 1 and go,
"Oh, I know what you're talking about."

00:32:39.561 --> 00:32:43.645
And because you know that the set
of messages that it can accept is

00:32:43.645 --> 00:32:48.715
representable in one byte, that's
always acceptable to do that crunching.

00:32:49.378 --> 00:32:53.576
And if I'm natively thinking in two bytes
and you send me one byte, I go: No, I

00:32:53.576 --> 00:32:57.974
won't crunch that down because I already
proved that I need at least two bytes

00:32:58.014 --> 00:32:58.634
<v Amos Wenger>Right, yeah.

00:32:58.694 --> 00:33:00.894
<v James Munns>I don't know if
this could collide or not.

00:33:00.894 --> 00:33:05.544
So I'll refuse to expand it or
crunch my numbers down to match it.

00:33:05.874 --> 00:33:08.364
<v Amos Wenger>Well, I'm just glad
I got to name drop Huffman, to

00:33:08.374 --> 00:33:09.594
be completely honest with you.

00:33:10.283 --> 00:33:11.233
<v James Munns>It's very cool.

00:33:11.283 --> 00:33:15.078
And I think if I wasn't on embedded,
I would just do compression because

00:33:15.138 --> 00:33:19.878
Flate or LZ4 use Huffman encoding
as part of the encoding process.

00:33:20.128 --> 00:33:24.329
And so it is absolutely the
right tool to use here, if you're

00:33:24.329 --> 00:33:25.569
willing to pay for compression.

00:33:25.864 --> 00:33:27.614
<v Amos Wenger>You know, I'm
glad that we have embedded.

00:33:27.614 --> 00:33:30.200
You have the internet of things,
and people complain about that,

00:33:30.336 --> 00:33:32.486
call it the internet of something
else, that I'm not gonna say on

00:33:32.486 --> 00:33:33.666
the podcast, even though I could...

00:33:33.796 --> 00:33:34.366
<v Amanda Majorowicz>But what?

00:33:34.587 --> 00:33:35.557
<v Amos Wenger>It's the internet of shit.

00:33:36.474 --> 00:33:39.174
<v James Munns>You'll drop "fuck"
about 10 times and not shit?

00:33:39.213 --> 00:33:40.549
<v Amos Wenger>You made me say it!

00:33:41.150 --> 00:33:41.890
<v Amanda Majorowicz>I'm just curious.

00:33:41.940 --> 00:33:45.125
<v James Munns>We're going to have to check
the explicit check box on this episode.

00:33:45.741 --> 00:33:46.321
<v Amos Wenger>Yeah.

00:33:46.720 --> 00:33:47.930
You have chips in everything.

00:33:47.940 --> 00:33:49.980
I have chips in my light bulbs at home.

00:33:50.280 --> 00:33:53.020
But I'm glad, because we have to
think about these tricks again.

00:33:53.020 --> 00:33:57.100
We have to think in terms of constrained
resources because there was a gap, like

00:33:57.110 --> 00:34:01.100
after computers became ridiculously
big and powerful and before we started

00:34:01.100 --> 00:34:04.450
putting chips in light bulbs, where we
suddenly didn't have to care anymore,

00:34:04.450 --> 00:34:05.820
we could be wasteful with everything.

00:34:05.980 --> 00:34:08.609
And now, sure, we have the Macbook M4...

00:34:08.659 --> 00:34:11.943
that's like infinitely powerful,
and nobody knows what to do with

00:34:11.953 --> 00:34:15.274
that much compute- yes they do,
it's running inference on LLMs-

00:34:15.579 --> 00:34:17.489
and then we have those tiny chips.

00:34:17.814 --> 00:34:21.131
<v James Munns>This is one of my favorite
things because we're getting to the point

00:34:21.181 --> 00:34:25.061
where microcontrollers are as powerful
as the computers from the late 80s

00:34:25.061 --> 00:34:27.201
and early 90s like desktop computers.

00:34:27.251 --> 00:34:29.781
There's still this mindset
and a lot of embedded is that

00:34:30.151 --> 00:34:31.721
microcontrollers are very tiny.

00:34:31.741 --> 00:34:33.741
You can't do anything fancy with them.

00:34:33.921 --> 00:34:34.831
They need to be bare bones.

00:34:34.831 --> 00:34:37.261
And this is where a lot of that
internet of shit stuff has come from

00:34:37.501 --> 00:34:41.211
in that people were like, "Oh, I can't
afford encryption or authentication

00:34:41.221 --> 00:34:42.561
because I'm just a little guy."

00:34:42.871 --> 00:34:45.133
And that's not great.

00:34:45.203 --> 00:34:50.423
A lot of what I've been doing with my
network protocol stuff is looking at

00:34:50.423 --> 00:34:54.263
things like AppleTalk which was done
on computers in the early 90s and

00:34:54.493 --> 00:34:59.964
going: well, if the Mac from 1990 could
afford to do it, our microcontrollers

00:34:59.994 --> 00:35:01.324
can probably afford to do that.

00:35:01.514 --> 00:35:04.164
And they had networking and
printer sharing and file

00:35:04.164 --> 00:35:05.144
systems and things like that.

00:35:05.214 --> 00:35:07.734
Sometimes you don't want all those
things because you're saving power

00:35:07.734 --> 00:35:12.129
or whatever, but if you're thinking
of "I have to be miserable because

00:35:12.129 --> 00:35:16.709
I have to shave every byte off of
everything," that's not necessarily true.

00:35:16.709 --> 00:35:21.359
And particularly when we have better
compilers like Rust and we can cheat

00:35:21.499 --> 00:35:25.309
and do a lot of those things at compile
time to leverage our ridiculously

00:35:25.319 --> 00:35:29.829
powerful MacBooks and do all the work
ahead of time so that our tiny little

00:35:29.839 --> 00:35:33.449
16 kilobyte firmware device doesn't
have to worry its pretty little head

00:35:33.449 --> 00:35:37.364
about it then: yeah, that's exactly
the kind of tricks I like doing.

00:35:37.764 --> 00:35:39.144
<v Amos Wenger>Yeah, well, you got my vote.

00:35:45.008 --> 00:35:47.288
<v James Munns>This episode is
sponsored by CodeCrafters.

00:35:47.476 --> 00:35:50.746
CodeCrafters is a service for
learning programming skills by doing.

00:35:51.416 --> 00:35:54.706
CodeCrafters offers a curated list
of exercises for learning programming

00:35:54.706 --> 00:35:57.716
languages like Rust or learning
skills like building an interpreter.

00:35:58.191 --> 00:36:01.731
Instead of just following a tutorial, you
can instead clone a repo that contains

00:36:01.741 --> 00:36:05.471
all of the boilerplate already, and make
progress by running tests and pushing

00:36:05.471 --> 00:36:08.811
commits that are checked by the server,
allowing you to move on to the next step.

00:36:09.451 --> 00:36:12.421
If you enjoy learning by doing,
sign up today using the link at

00:36:12.421 --> 00:36:16.541
sdr-podcast.com/codecrafters,
or use the link in the show

00:36:16.541 --> 00:36:17.771
notes to start your free trial.

00:36:18.191 --> 00:36:20.711
If you decide to upgrade, you'll
get a discount and a portion of

00:36:20.711 --> 00:36:22.251
the sale will support this podcast.

00:36:22.696 --> 00:36:26.386
That's sdr-podcast.com/codecrafters.

00:36:27.086 --> 00:36:29.432
Thanks to CodeCrafters for
sponsoring this episode.