1/*!
2This library provides heavily optimized routines for string search primitives.
3
4# Overview
5
6This section gives a brief high level overview of what this crate offers.
7
8* The top-level module provides routines for searching for 1, 2 or 3 bytes
9 in the forward or reverse direction. When searching for more than one byte,
10 positions are considered a match if the byte at that position matches any
11 of the bytes.
12* The [`memmem`] sub-module provides forward and reverse substring search
13 routines.
14
15In all such cases, routines operate on `&[u8]` without regard to encoding. This
16is exactly what you want when searching either UTF-8 or arbitrary bytes.
17
18# Example: using `memchr`
19
20This example shows how to use `memchr` to find the first occurrence of `z` in
21a haystack:
22
23```
24use memchr::memchr;
25
26let haystack = b"foo bar baz quuz";
27assert_eq!(Some(10), memchr(b'z', haystack));
28```
29
30# Example: matching one of three possible bytes
31
32This examples shows how to use `memrchr3` to find occurrences of `a`, `b` or
33`c`, starting at the end of the haystack.
34
35```
36use memchr::memchr3_iter;
37
38let haystack = b"xyzaxyzbxyzc";
39
40let mut it = memchr3_iter(b'a', b'b', b'c', haystack).rev();
41assert_eq!(Some(11), it.next());
42assert_eq!(Some(7), it.next());
43assert_eq!(Some(3), it.next());
44assert_eq!(None, it.next());
45```
46
47# Example: iterating over substring matches
48
49This example shows how to use the [`memmem`] sub-module to find occurrences of
50a substring in a haystack.
51
52```
53use memchr::memmem;
54
55let haystack = b"foo bar foo baz foo";
56
57let mut it = memmem::find_iter(haystack, "foo");
58assert_eq!(Some(0), it.next());
59assert_eq!(Some(8), it.next());
60assert_eq!(Some(16), it.next());
61assert_eq!(None, it.next());
62```
63
64# Example: repeating a search for the same needle
65
66It may be possible for the overhead of constructing a substring searcher to be
67measurable in some workloads. In cases where the same needle is used to search
68many haystacks, it is possible to do construction once and thus to avoid it for
69subsequent searches. This can be done with a [`memmem::Finder`]:
70
71```
72use memchr::memmem;
73
74let finder = memmem::Finder::new("foo");
75
76assert_eq!(Some(4), finder.find(b"baz foo quux"));
77assert_eq!(None, finder.find(b"quux baz bar"));
78```
79
80# Why use this crate?
81
82At first glance, the APIs provided by this crate might seem weird. Why provide
83a dedicated routine like `memchr` for something that could be implemented
84clearly and trivially in one line:
85
86```
87fn memchr(needle: u8, haystack: &[u8]) -> Option<usize> {
88 haystack.iter().position(|&b| b == needle)
89}
90```
91
92Or similarly, why does this crate provide substring search routines when Rust's
93core library already provides them?
94
95```
96fn search(haystack: &str, needle: &str) -> Option<usize> {
97 haystack.find(needle)
98}
99```
100
101The primary reason for both of them to exist is performance. When it comes to
102performance, at a high level at least, there are two primary ways to look at
103it:
104
105* **Throughput**: For this, think about it as, "given some very large haystack
106 and a byte that never occurs in that haystack, how long does it take to
107 search through it and determine that it, in fact, does not occur?"
108* **Latency**: For this, think about it as, "given a tiny haystack---just a
109 few bytes---how long does it take to determine if a byte is in it?"
110
111The `memchr` routine in this crate has _slightly_ worse latency than the
112solution presented above, however, its throughput can easily be over an
113order of magnitude faster. This is a good general purpose trade off to make.
114You rarely lose, but often gain big.
115
116**NOTE:** The name `memchr` comes from the corresponding routine in libc. A key
117advantage of using this library is that its performance is not tied to its
118quality of implementation in the libc you happen to be using, which can vary
119greatly from platform to platform.
120
121But what about substring search? This one is a bit more complicated. The
122primary reason for its existence is still indeed performance, but it's also
123useful because Rust's core library doesn't actually expose any substring
124search routine on arbitrary bytes. The only substring search routine that
125exists works exclusively on valid UTF-8.
126
127So if you have valid UTF-8, is there a reason to use this over the standard
128library substring search routine? Yes. This routine is faster on almost every
129metric, including latency. The natural question then, is why isn't this
130implementation in the standard library, even if only for searching on UTF-8?
131The reason is that the implementation details for using SIMD in the standard
132library haven't quite been worked out yet.
133
134**NOTE:** Currently, only `x86_64` targets have highly accelerated
135implementations of substring search. For `memchr`, all targets have
136somewhat-accelerated implementations, while only `x86_64` targets have highly
137accelerated implementations. This limitation is expected to be lifted once the
138standard library exposes a platform independent SIMD API.
139
140# Crate features
141
142* **std** - When enabled (the default), this will permit this crate to use
143 features specific to the standard library. Currently, the only thing used
144 from the standard library is runtime SIMD CPU feature detection. This means
145 that this feature must be enabled to get AVX accelerated routines. When
146 `std` is not enabled, this crate will still attempt to use SSE2 accelerated
147 routines on `x86_64`.
148* **libc** - When enabled (**not** the default), this library will use your
149 platform's libc implementation of `memchr` (and `memrchr` on Linux). This
150 can be useful on non-`x86_64` targets where the fallback implementation in
151 this crate is not as good as the one found in your libc. All other routines
152 (e.g., `memchr[23]` and substring search) unconditionally use the
153 implementation in this crate.
154*/
155
156#![deny(missing_docs)]
157#![cfg_attr(not(feature = "std"), no_std)]
158// It's not worth trying to gate all code on just miri, so turn off relevant
159// dead code warnings.
160#![cfg_attr(miri, allow(dead_code, unused_macros))]
161
162// Supporting 8-bit (or others) would be fine. If you need it, please submit a
163// bug report at https://github.com/BurntSushi/memchr
164#[cfg(not(any(
165 target_pointer_width = "16",
166 target_pointer_width = "32",
167 target_pointer_width = "64"
168)))]
169compile_error!("memchr currently not supported on non-{16,32,64}");
170
171pub use crate::memchr::{
172 memchr, memchr2, memchr2_iter, memchr3, memchr3_iter, memchr_iter,
173 memrchr, memrchr2, memrchr2_iter, memrchr3, memrchr3_iter, memrchr_iter,
174 Memchr, Memchr2, Memchr3,
175};
176
177mod cow;
178mod memchr;
179pub mod memmem;
180#[cfg(test)]
181mod tests;
182