mod.rs source code [crates/alloc/src/ffi/mod.rs]

//! Utilities related to FFI bindings.
//!
//! This module provides utilities to handle data across non-Rust
//! interfaces, like other programming languages and the underlying
//! operating system. It is mainly of use for FFI (Foreign Function
//! Interface) bindings and code that needs to exchange C-like strings
//! with other languages.
//!
//! # Overview
//!
//! Rust represents owned strings with the [`String`] type, and
//! borrowed slices of strings with the [`str`] primitive. Both are
//! always in UTF-8 encoding, and may contain nul bytes in the middle,
//! i.e., if you look at the bytes that make up the string, there may
//! be a `\0` among them. Both `String` and `str` store their length
//! explicitly; there are no nul terminators at the end of strings
//! like in C.
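//!
//! As a minimal sketch of the two points above (the helper names
//! `byte_len` and `has_nul` are purely illustrative and not part of
//! any API), the stored length and an interior nul byte can be
//! observed like this:
//!
//! ```rust
//! // Hypothetical helper: the stored length of `s`, in bytes.
//! fn byte_len(s: &str) -> usize {
//!     // `str::len` reads the stored length; it never scans for a terminator.
//!     s.len()
//! }
//!
//! // Hypothetical helper: whether `s` contains a nul byte anywhere.
//! fn has_nul(s: &str) -> bool {
//!     s.as_bytes().contains(&0)
//! }
//! ```
//!
//! With these definitions, `byte_len("ab\0cd")` is `5` and
//! `has_nul("ab\0cd")` is `true`: the nul byte counts as ordinary data
//! and does not end the string.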
//!
//! C strings are different from Rust strings:
//!
//! * **Encodings** - Rust strings are UTF-8, but C strings may use
//! other encodings. If you are using a string from C, you should
//! check its encoding explicitly, rather than just assuming that it
//! is UTF-8 like you can do in Rust.
//!
//! * **Character size** - C strings may use `char` or `wchar_t`-sized
//! characters; please **note** that C's `char` is different from Rust's.
//! The C standard leaves the actual sizes of those types open to
//! interpretation, but defines different APIs for strings made up of
//! each character type. Rust strings are always UTF-8, so different
//! Unicode characters will be encoded in a variable number of bytes
//! each. The Rust type [`char`] represents a '[Unicode scalar
//! value]', which is similar to, but not the same as, a '[Unicode
//! code point]'.
//!
//! * **Nul terminators and implicit string lengths** - Often, C
//! strings are nul-terminated, i.e., they have a `\0` character at the
//! end. The length of a string buffer is not stored, but has to be
//! calculated; to compute the length of a string, C code must
//! manually call a function like `strlen()` for `char`-based strings,
//! or `wcslen()` for `wchar_t`-based ones. Those functions return
//! the number of characters in the string excluding the nul
//! terminator, so the buffer length is really `len+1` characters.
//! Rust strings don't have a nul terminator; their length is always
//! stored and does not need to be calculated. Accessing a string's
//! length is an O(1) operation in Rust (because the length is
//! stored), while in C it is an O(n) operation because the
//! length needs to be computed by scanning the string for the nul
//! terminator.
//!
//! * **Internal nul characters** - When C strings have a nul
//! terminator character, this usually means that they cannot have nul
//! characters in the middle: a nul character would essentially
//! truncate the string. Rust strings *can* have nul characters in
//! the middle, because nul does not have to mark the end of the
//! string in Rust.
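//!
//! The differences around nul terminators and internal nul characters
//! can be sketched with [`CString`]. The helper names `c_buffer_len`
//! and `first_nul` below are hypothetical and exist only for this
//! illustration:
//!
//! ```rust
//! use std::ffi::CString;
//!
//! // Hypothetical helper: size in bytes of the C buffer needed to hold `s`,
//! // or `None` if `s` has an interior nul and so cannot be a C string.
//! fn c_buffer_len(s: &str) -> Option<usize> {
//!     // `CString::new` returns `Err(NulError)` on an interior nul byte.
//!     let c = CString::new(s).ok()?;
//!     // Content bytes plus the trailing nul terminator.
//!     Some(c.as_bytes_with_nul().len())
//! }
//!
//! // Hypothetical helper: byte position of the first interior nul, if any.
//! fn first_nul(s: &str) -> Option<usize> {
//!     CString::new(s).err().map(|e| e.nul_position())
//! }
//! ```
//!
//! Here `c_buffer_len("hello")` is `Some(6)` (five bytes plus the
//! terminator), `c_buffer_len("he\0llo")` is `None`, and
//! `first_nul("he\0llo")` is `Some(2)`.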
//!
//! # Representations of non-Rust strings
//!
//! [`CString`] and [`CStr`] are useful when you need to transfer
//! UTF-8 strings to and from languages with a C ABI, like Python.
//!
//! * **From Rust to C:** [`CString`] represents an owned, C-friendly
//! string: it is nul-terminated, and has no internal nul characters.
//! Rust code can create a [`CString`] out of a normal string (provided
//! that the string doesn't have nul characters in the middle), and
//! then use a variety of methods to obtain a raw <code>\*mut [u8]</code> that can
//! then be passed as an argument to functions which use the C
//! conventions for strings.
//!
//! * **From C to Rust:** [`CStr`] represents a borrowed C string; it
//! is what you would use to wrap a raw <code>\*const [u8]</code> that you got from
//! a C function. A [`CStr`] is guaranteed to be a nul-terminated array
//! of bytes. Once you have a [`CStr`], you can convert it to a Rust
//! <code>&[str]</code> if it's valid UTF-8, or lossily convert it by adding
//! replacement characters.
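//!
//! Both directions can be sketched in a few lines. This is an
//! illustration under stated assumptions, not a prescribed pattern:
//! the helpers `round_trip` and `lossy` are hypothetical, and no real
//! C function is called, so the raw pointer is simply handed straight
//! back to Rust.
//!
//! ```rust
//! use std::ffi::{c_char, CStr, CString};
//!
//! // Hypothetical helper: Rust to C and back again.
//! fn round_trip(s: &str) -> Option<String> {
//!     // Rust to C: fails if `s` has an interior nul.
//!     let owned = CString::new(s).ok()?;
//!     // The pointer is valid only while `owned` is alive.
//!     let ptr: *const c_char = owned.as_ptr();
//!     // C to Rust: wrap the raw pointer in a borrowed `CStr`.
//!     // SAFETY: `ptr` comes from a live `CString`, so it is non-null
//!     // and points to a nul-terminated buffer.
//!     let borrowed: &CStr = unsafe { CStr::from_ptr(ptr) };
//!     // Strict conversion: errors on invalid UTF-8.
//!     borrowed.to_str().ok().map(str::to_owned)
//! }
//!
//! // Hypothetical helper: lossy conversion of bytes that must end in
//! // exactly one nul, with no nul bytes before it.
//! fn lossy(bytes_with_nul: &[u8]) -> Option<String> {
//!     let c = CStr::from_bytes_with_nul(bytes_with_nul).ok()?;
//!     // Invalid UTF-8 sequences become U+FFFD replacement characters.
//!     Some(c.to_string_lossy().into_owned())
//! }
//! ```
//!
//! In this sketch `round_trip("hello")` yields `Some("hello")`, while
//! `lossy(b"caf\xff\0")` yields `"caf"` followed by one replacement
//! character, because `0xFF` is never valid in UTF-8.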
//!
//! [`String`]: crate::string::String
//! [`CStr`]: core::ffi::CStr

#![stable(feature = "alloc_ffi", since = "1.64.0")]

#[doc(no_inline)]
#[stable(feature = "alloc_c_string", since = "1.64.0")]
pub use self::c_str::{FromVecWithNulError, IntoStringError, NulError};

#[doc(inline)]
#[stable(feature = "alloc_c_string", since = "1.64.0")]
pub use self::c_str::CString;

#[unstable(feature = "c_str_module", issue = "112134")]
pub mod c_str;