//! This is a densely packed error representation which is used on targets with
//! 64-bit pointers.
//!
//! (Note that `bitpacked` vs `unpacked` here has no relationship to
//! `#[repr(packed)]`; it just refers to attempting to use any available bits in
//! a more clever manner than `rustc`'s default layout algorithm would).
//!
//! Conceptually, it stores the same data as the "unpacked" equivalent we use on
//! other targets. Specifically, you can imagine it as an optimized version of
//! the following enum (which is roughly equivalent to what's stored by
//! `repr_unpacked::Repr`, i.e. `super::ErrorData<Box<Custom>>`):
//!
//! ```ignore (exposition-only)
//! enum ErrorData {
//!     Os(i32),
//!     Simple(ErrorKind),
//!     SimpleMessage(&'static SimpleMessage),
//!     Custom(Box<Custom>),
//! }
//! ```
//!
//! However, it packs this data into a 64-bit non-zero value.
//!
//! This optimization not only allows `io::Error` to occupy a single pointer,
//! but improves `io::Result` as well, especially for situations like
//! `io::Result<()>` (which is now 64 bits) or `io::Result<u64>` (which is now
//! 128 bits), which are quite common.
//!
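//! As a quick sanity check of the sizes above, the following sketch can be run
//! on a 64-bit target. `error_sizes` is a hypothetical helper written for this
//! example only; the exact sizes are an implementation detail of `std` and not
//! a stable guarantee.
//!
//! ```rust
//! use std::io;
//! use std::mem::size_of;
//!
//! /// Returns the sizes in bytes of `io::Error`, `io::Result<()>` and
//! /// `io::Result<u64>`, in that order.
//! fn error_sizes() -> (usize, usize, usize) {
//!     (size_of::<io::Error>(), size_of::<io::Result<()>>(), size_of::<io::Result<u64>>())
//! }
//! ```
//!
//! With the bitpacked representation this is expected to report one pointer
//! for `io::Error` and `io::Result<()>`, and two pointers for
//! `io::Result<u64>`.
//!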
//! # Layout
//! Tagged values are 64 bits, with the 2 least significant bits used for the
//! tag. This means there are 4 "variants":
//!
//! - **Tag 0b00**: The first variant is equivalent to
//!   `ErrorData::SimpleMessage`, and holds a `&'static SimpleMessage` directly.
//!
//!   `SimpleMessage` has an alignment >= 4 (which is requested with
//!   `#[repr(align)]` and checked statically at the bottom of this file), which
//!   means every `&'static SimpleMessage` has both tag bits set to 0, so its
//!   tagged and untagged representations are identical.
//!
//!   This means we can skip tagging it, which is necessary as this variant can
//!   be constructed from a `const fn`, which probably cannot tag pointers (or
//!   at least it would be difficult).
//!
//! - **Tag 0b01**: The other pointer variant holds the data for
//!   `ErrorData::Custom` and the remaining 62 bits are used to store a
//!   `Box<Custom>`. `Custom` also has alignment >= 4, so the bottom two bits
//!   are free to use for the tag.
//!
//!   The only important thing to note is that `ptr::wrapping_add` and
//!   `ptr::wrapping_byte_sub` are used to tag and untag the pointer, rather
//!   than bitwise operations. This should preserve the pointer's provenance,
//!   which would otherwise be lost.
//!
//! - **Tag 0b10**: Holds the data for `ErrorData::Os(i32)`. We store the `i32`
//!   in the pointer's most significant 32 bits, and don't use the bits `2..32`
//!   for anything. Using the top 32 bits is just to let us easily recover the
//!   `i32` code with the correct sign.
//!
//! - **Tag 0b11**: Holds the data for `ErrorData::Simple(ErrorKind)`. This
//!   stores the `ErrorKind` in the top 32 bits as well, although it doesn't
//!   occupy nearly that many. Most of the bits are unused here, but it's not
//!   like we need them for anything else yet.
//!
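//! The layout of the `Os` variant can be modelled on a plain `u64`. This is
//! only a sketch of the bit layout described above: `pack_os`, `unpack_os` and
//! `tag_of` are hypothetical names, and the real code keeps the bits in a
//! `NonNull<()>` rather than in an integer.
//!
//! ```rust
//! const TAG_MASK: u64 = 0b11;
//! const TAG_OS: u64 = 0b10;
//!
//! /// Places `code` in the top 32 bits and the tag in the bottom two bits.
//! fn pack_os(code: i32) -> u64 {
//!     // `code as u64` sign-extends, but the shift discards the upper half
//!     // again, so only the 32 payload bits and the tag survive.
//!     ((code as u64) << 32) | TAG_OS
//! }
//!
//! /// Extracts the two tag bits.
//! fn tag_of(bits: u64) -> u64 {
//!     bits & TAG_MASK
//! }
//!
//! /// Recovers the code. The arithmetic shift on `i64` restores the sign.
//! fn unpack_os(bits: u64) -> i32 {
//!     ((bits as i64) >> 32) as i32
//! }
//! ```
//!
//! For example, `pack_os(-1)` is `0xFFFF_FFFF_0000_0002`, and `unpack_os`
//! turns it back into `-1`. Because the tag is non-zero, the packed value is
//! never zero, even for a code of `0`.
//!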
//! # Use of `NonNull<()>`
//!
//! Everything is stored in a `NonNull<()>`, which is odd, but actually serves a
//! purpose.
//!
//! Conceptually you might think of this more like:
//!
//! ```ignore (exposition-only)
//! union Repr {
//!     // holds integer (Simple/Os) variants, and
//!     // provides access to the tag bits.
//!     bits: NonZero<u64>,
//!     // Tag is 0, so this is stored untagged.
//!     msg: &'static SimpleMessage,
//!     // Tagged (offset) `Box<Custom>` pointer.
//!     tagged_custom: NonNull<()>,
//! }
//! ```
//!
//! But there are a few problems with this:
//!
//! 1. Union access is equivalent to a transmute, so this representation would
//!    require we transmute between integers and pointers in at least one
//!    direction, which may be UB (and even if not, it is likely harder for a
//!    compiler to reason about than explicit ptr->int operations).
//!
//! 2. Even if all fields of a union have a niche, the union itself doesn't,
//!    although this may change in the future. This would make things like
//!    `io::Result<()>` and `io::Result<usize>` larger, which defeats part of
//!    the motivation of this bitpacking.
//!
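//! The niche that point 2 is concerned with can be observed directly on
//! `NonNull<()>`. A minimal sketch, with `option_costs_nothing` as a
//! hypothetical helper name:
//!
//! ```rust
//! use std::mem::size_of;
//! use std::ptr::NonNull;
//!
//! /// Returns true if `Option<NonNull<()>>` is no larger than `NonNull<()>`,
//! /// that is, if `None` is stored in the otherwise invalid null value.
//! fn option_costs_nothing() -> bool {
//!     size_of::<Option<NonNull<()>>>() == size_of::<NonNull<()>>()
//! }
//! ```
//!
//! This holds because the null-pointer optimization is guaranteed for
//! `Option<NonNull<T>>`. `Repr` is expected to inherit the same niche, which
//! the static assertions at the bottom of this file check.
//!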
//! Storing everything in a `NonZero<usize>` (or some other integer) would be a
//! bit more traditional for pointer tagging, but it would lose provenance
//! information, couldn't be constructed from a `const fn`, and would probably
//! run into other issues as well.
//!
//! The `NonNull<()>` seems like the only alternative, even if it's fairly odd
//! to use a pointer type to store something that may hold an integer, some of
//! the time.
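//!
//! The offset-based tagging used for the `Custom` variant can be sketched with
//! an ordinary `Box<u32>`, whose alignment of 4 on common targets leaves the
//! same two low bits free. `tag_box` and `untag_box` are hypothetical helpers
//! for illustration, not functions from this module:
//!
//! ```rust
//! const TAG: usize = 0b01;
//!
//! /// Consumes the box and returns its pointer offset by `TAG` bytes.
//! fn tag_box(b: Box<u32>) -> *mut u8 {
//!     // Offsetting the pointer (instead of casting to an integer, setting a
//!     // bit, and casting back) keeps the provenance of the allocation.
//!     Box::into_raw(b).cast::<u8>().wrapping_add(TAG)
//! }
//!
//! /// Undoes `tag_box`.
//! ///
//! /// # Safety
//! /// `tagged` must come from `tag_box` and must not have been untagged before.
//! unsafe fn untag_box(tagged: *mut u8) -> Box<u32> {
//!     // SAFETY: subtracting the tag restores the pointer that
//!     // `Box::into_raw` returned; the caller guarantees it is still owned.
//!     unsafe { Box::from_raw(tagged.wrapping_sub(TAG).cast::<u32>()) }
//! }
//! ```
//!
//! A round trip such as `unsafe { untag_box(tag_box(Box::new(7))) }` yields a
//! box holding `7` again, and the tagged address has `0b01` in its low bits.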

use super::{Custom, ErrorData, ErrorKind, RawOsError, SimpleMessage};
use core::marker::PhantomData;
use core::mem::{align_of, size_of};
use core::ptr::{self, NonNull};

// The 2 least-significant bits are used as the tag.
const TAG_MASK: usize = 0b11;
const TAG_SIMPLE_MESSAGE: usize = 0b00;
const TAG_CUSTOM: usize = 0b01;
const TAG_OS: usize = 0b10;
const TAG_SIMPLE: usize = 0b11;

/// The internal representation.
///
/// See the module docs for more. The example below is just a way to hack in a
/// check that we indeed are not unwind-safe.
///
/// ```compile_fail,E0277
/// fn is_unwind_safe<T: core::panic::UnwindSafe>() {}
/// is_unwind_safe::<std::io::Error>();
/// ```
#[repr(transparent)]
pub(super) struct Repr(NonNull<()>, PhantomData<ErrorData<Box<Custom>>>);

// All the types `Repr` stores internally are Send + Sync, and so is it.
unsafe impl Send for Repr {}
unsafe impl Sync for Repr {}

impl Repr {
    pub(super) fn new(dat: ErrorData<Box<Custom>>) -> Self {
        match dat {
            ErrorData::Os(code) => Self::new_os(code),
            ErrorData::Simple(kind) => Self::new_simple(kind),
            ErrorData::SimpleMessage(simple_message) => Self::new_simple_message(simple_message),
            ErrorData::Custom(b) => Self::new_custom(b),
        }
    }

    pub(super) fn new_custom(b: Box<Custom>) -> Self {
        let p = Box::into_raw(b).cast::<u8>();
        // Should only be possible if an allocator handed out a pointer with
        // wrong alignment.
        debug_assert_eq!(p.addr() & TAG_MASK, 0);
        // Note: We know `TAG_CUSTOM <= size_of::<Custom>()` (static_assert at
        // end of file), and both the start and end of the expression must be
        // valid without address space wraparound due to `Box`'s semantics.
        //
        // This means it would be correct to implement this using `ptr::add`
        // (rather than `ptr::wrapping_add`), but it's unclear this would give
        // any benefit, so we just use `wrapping_add` instead.
        let tagged = p.wrapping_add(TAG_CUSTOM).cast::<()>();
        // Safety: `TAG_CUSTOM + p` is the same as `TAG_CUSTOM | p`,
        // because `p`'s alignment means it isn't allowed to have any of the
        // `TAG_MASK` bits set (you can verify that addition and bitwise-or are
        // the same when the operands have no bits in common using a truth
        // table).
        //
        // Then, `TAG_CUSTOM | p` is not zero, as that would require
        // `TAG_CUSTOM` and `p` both be zero, and neither is (as `p` came from a
        // box, and `TAG_CUSTOM` just... isn't zero -- it's `0b01`). Therefore,
        // `TAG_CUSTOM + p` isn't zero and so `tagged` can't be, and the
        // `new_unchecked` is safe.
        let res = Self(unsafe { NonNull::new_unchecked(tagged) }, PhantomData);
        // quickly smoke-check we encoded the right thing (This generally will
        // only run in std's tests, unless the user uses -Zbuild-std)
        debug_assert!(matches!(res.data(), ErrorData::Custom(_)), "repr(custom) encoding failed");
        res
    }

    #[inline]
    pub(super) fn new_os(code: RawOsError) -> Self {
        let utagged = ((code as usize) << 32) | TAG_OS;
        // Safety: `TAG_OS` is not zero, so the result of the `|` is not 0.
        let res = Self(
            unsafe { NonNull::new_unchecked(ptr::without_provenance_mut(utagged)) },
            PhantomData,
        );
        // quickly smoke-check we encoded the right thing (This generally will
        // only run in std's tests, unless the user uses -Zbuild-std)
        debug_assert!(
            matches!(res.data(), ErrorData::Os(c) if c == code),
            "repr(os) encoding failed for {code}"
        );
        res
    }

    #[inline]
    pub(super) fn new_simple(kind: ErrorKind) -> Self {
        let utagged = ((kind as usize) << 32) | TAG_SIMPLE;
        // Safety: `TAG_SIMPLE` is not zero, so the result of the `|` is not 0.
        let res = Self(
            unsafe { NonNull::new_unchecked(ptr::without_provenance_mut(utagged)) },
            PhantomData,
        );
        // quickly smoke-check we encoded the right thing (This generally will
        // only run in std's tests, unless the user uses -Zbuild-std)
        debug_assert!(
            matches!(res.data(), ErrorData::Simple(k) if k == kind),
            "repr(simple) encoding failed {:?}",
            kind,
        );
        res
    }

    #[inline]
    pub(super) const fn new_simple_message(m: &'static SimpleMessage) -> Self {
        // Safety: References are never null.
        Self(unsafe { NonNull::new_unchecked(m as *const _ as *mut ()) }, PhantomData)
    }

    #[inline]
    pub(super) fn data(&self) -> ErrorData<&Custom> {
        // Safety: We're a Repr, decode_repr is fine.
        unsafe { decode_repr(self.0, |c| &*c) }
    }

    #[inline]
    pub(super) fn data_mut(&mut self) -> ErrorData<&mut Custom> {
        // Safety: We're a Repr, decode_repr is fine.
        unsafe { decode_repr(self.0, |c| &mut *c) }
    }

    #[inline]
    pub(super) fn into_data(self) -> ErrorData<Box<Custom>> {
        let this = core::mem::ManuallyDrop::new(self);
        // Safety: We're a Repr, decode_repr is fine. The `Box::from_raw` is
        // safe because we prevent double-drop using `ManuallyDrop`.
        unsafe { decode_repr(this.0, |p| Box::from_raw(p)) }
    }
}

impl Drop for Repr {
    #[inline]
    fn drop(&mut self) {
        // Safety: We're a Repr, decode_repr is fine. The `Box::from_raw` is
        // safe because we're being dropped.
        unsafe {
            let _ = decode_repr(self.0, |p: *mut Custom| Box::<Custom>::from_raw(p));
        }
    }
}

// Shared helper to decode a `Repr`'s internal pointer into an ErrorData.
//
// Safety: `ptr`'s bits should be encoded as described in the module docs at
// the top of this file (it should be `some_repr.0`).
#[inline]
unsafe fn decode_repr<C, F>(ptr: NonNull<()>, make_custom: F) -> ErrorData<C>
where
    F: FnOnce(*mut Custom) -> C,
{
    let bits = ptr.as_ptr().addr();
    match bits & TAG_MASK {
        TAG_OS => {
            let code = ((bits as i64) >> 32) as RawOsError;
            ErrorData::Os(code)
        }
        TAG_SIMPLE => {
            let kind_bits = (bits >> 32) as u32;
            let kind = kind_from_prim(kind_bits).unwrap_or_else(|| {
                debug_assert!(false, "Invalid io::error::Repr bits: `Repr({:#018x})`", bits);
                // This means the `ptr` passed in was not valid, which violates
                // the unsafe contract of `decode_repr`.
                //
                // Using this rather than unwrap meaningfully improves the code
                // for callers which only care about one variant (usually
                // `Custom`)
                core::hint::unreachable_unchecked();
            });
            ErrorData::Simple(kind)
        }
        TAG_SIMPLE_MESSAGE => ErrorData::SimpleMessage(&*ptr.cast::<SimpleMessage>().as_ptr()),
        TAG_CUSTOM => {
            // It would be correct for us to use `ptr::byte_sub` here (see the
            // comment above the `wrapping_add` call in `new_custom` for why),
            // but it isn't clear that it makes a difference, so we don't.
            let custom = ptr.as_ptr().wrapping_byte_sub(TAG_CUSTOM).cast::<Custom>();
            ErrorData::Custom(make_custom(custom))
        }
        _ => {
            // Can't happen, and compiler can tell
            unreachable!();
        }
    }
}

// This compiles to the same code as the check+transmute, but doesn't require
// unsafe, or to hard-code max ErrorKind or its size in a way the compiler
// couldn't verify.
#[inline]
fn kind_from_prim(ek: u32) -> Option<ErrorKind> {
    macro_rules! from_prim {
        ($prim:expr => $Enum:ident { $($Variant:ident),* $(,)? }) => {{
            // Force a compile error if the list gets out of date.
            const _: fn(e: $Enum) = |e: $Enum| match e {
                $($Enum::$Variant => ()),*
            };
            match $prim {
                $(v if v == ($Enum::$Variant as _) => Some($Enum::$Variant),)*
                _ => None,
            }
        }}
    }
    from_prim!(ek => ErrorKind {
        NotFound,
        PermissionDenied,
        ConnectionRefused,
        ConnectionReset,
        HostUnreachable,
        NetworkUnreachable,
        ConnectionAborted,
        NotConnected,
        AddrInUse,
        AddrNotAvailable,
        NetworkDown,
        BrokenPipe,
        AlreadyExists,
        WouldBlock,
        NotADirectory,
        IsADirectory,
        DirectoryNotEmpty,
        ReadOnlyFilesystem,
        FilesystemLoop,
        StaleNetworkFileHandle,
        InvalidInput,
        InvalidData,
        TimedOut,
        WriteZero,
        StorageFull,
        NotSeekable,
        FilesystemQuotaExceeded,
        FileTooLarge,
        ResourceBusy,
        ExecutableFileBusy,
        Deadlock,
        CrossesDevices,
        TooManyLinks,
        InvalidFilename,
        ArgumentListTooLong,
        Interrupted,
        Other,
        UnexpectedEof,
        Unsupported,
        OutOfMemory,
        Uncategorized,
    })
}

// Some static checking to alert us if a change breaks any of the assumptions
// that our encoding relies on for correctness and soundness. (Some of these are
// a bit overly thorough/cautious, admittedly)
//
// If any of these are hit on a platform that std supports, we should likely
// just use `repr_unpacked.rs` there instead (unless the fix is easy).
macro_rules! static_assert {
    ($condition:expr) => {
        const _: () = assert!($condition);
    };
    (@usize_eq: $lhs:expr, $rhs:expr) => {
        const _: [(); $lhs] = [(); $rhs];
    };
}

// The bitpacking we use requires pointers be exactly 64 bits.
static_assert!(@usize_eq: size_of::<NonNull<()>>(), 8);

// We also require pointers and usize be the same size.
static_assert!(@usize_eq: size_of::<NonNull<()>>(), size_of::<usize>());

// Pointers to `Custom` and `SimpleMessage` need to be thin.
static_assert!(@usize_eq: size_of::<&'static SimpleMessage>(), 8);
static_assert!(@usize_eq: size_of::<Box<Custom>>(), 8);

static_assert!((TAG_MASK + 1).is_power_of_two());
// And the pointees must have sufficient alignment.
static_assert!(align_of::<SimpleMessage>() >= TAG_MASK + 1);
static_assert!(align_of::<Custom>() >= TAG_MASK + 1);

static_assert!(@usize_eq: TAG_MASK & TAG_SIMPLE_MESSAGE, TAG_SIMPLE_MESSAGE);
static_assert!(@usize_eq: TAG_MASK & TAG_CUSTOM, TAG_CUSTOM);
static_assert!(@usize_eq: TAG_MASK & TAG_OS, TAG_OS);
static_assert!(@usize_eq: TAG_MASK & TAG_SIMPLE, TAG_SIMPLE);

// This is obviously true (`TAG_CUSTOM` is `0b01`), but in `Repr::new_custom` we
// offset a pointer by this value, and expect it to both be within the same
// object, and to not wrap around the address space. See the comment in that
// function for further details.
//
// Actually, at the moment we use `ptr::wrapping_add`, not `ptr::add`, so this
// check isn't needed for that one, although the assertion that we don't
// actually wrap around in that wrapping_add does simplify the safety reasoning
// elsewhere considerably.
static_assert!(size_of::<Custom>() >= TAG_CUSTOM);

// These two variants store a payload which is allowed to be zero, so their
// tags must be non-zero to preserve the `NonNull`'s range invariant.
static_assert!(TAG_OS != 0);
static_assert!(TAG_SIMPLE != 0);
// We can't tag `SimpleMessage`s, the tag must be 0.
static_assert!(@usize_eq: TAG_SIMPLE_MESSAGE, 0);

// Check that the point of all of this still holds.
//
// We'd check against `io::Error`, but *technically* it's allowed to vary,
// as it's not `#[repr(transparent)]`/`#[repr(C)]`. We could add that, but
// the `#[repr()]` would show up in rustdoc, which might be seen as a stable
// commitment.
static_assert!(@usize_eq: size_of::<Repr>(), 8);
static_assert!(@usize_eq: size_of::<Option<Repr>>(), 8);
static_assert!(@usize_eq: size_of::<Result<(), Repr>>(), 8);
static_assert!(@usize_eq: size_of::<Result<usize, Repr>>(), 16);