1/*!
2Provides packed multiple substring search, principally for a small number of
3patterns.
4
5This sub-module provides vectorized routines for quickly finding
6matches of a small number of patterns. In general, users of this crate
7shouldn't need to interface with this module directly, as the primary
8[`AhoCorasick`](crate::AhoCorasick) searcher will use these routines
9automatically as a prefilter when applicable. However, in some cases, callers
10may want to bypass the Aho-Corasick machinery entirely and use this vectorized
11searcher directly.
12
13# Overview
14
15The primary types in this sub-module are:
16
17* [`Searcher`] executes the actual search algorithm to report matches in a
18haystack.
19* [`Builder`] accumulates patterns incrementally and can construct a
20`Searcher`.
21* [`Config`] permits tuning the searcher, and itself will produce a `Builder`
22(which can then be used to build a `Searcher`). Currently, the only tuneable
23knob are the match semantics, but this may be expanded in the future.
24
25# Examples
26
27This example shows how to create a searcher from an iterator of patterns.
28By default, leftmost-first match semantics are used. (See the top-level
29[`MatchKind`] type for more details about match semantics, which apply
30similarly to packed substring search.)
31
32```
33use aho_corasick::{packed::{MatchKind, Searcher}, PatternID};
34
35# fn example() -> Option<()> {
36let searcher = Searcher::new(["foobar", "foo"].iter().cloned())?;
37let matches: Vec<PatternID> = searcher
38 .find_iter("foobar")
39 .map(|mat| mat.pattern())
40 .collect();
41assert_eq!(vec![PatternID::ZERO], matches);
42# Some(()) }
43# if cfg!(all(feature = "std", any(
44# target_arch = "x86_64", target_arch = "aarch64",
45# ))) {
46# example().unwrap()
47# } else {
48# assert!(example().is_none());
49# }
50```
51
52This example shows how to use [`Config`] to change the match semantics to
53leftmost-longest:
54
55```
56use aho_corasick::{packed::{Config, MatchKind}, PatternID};
57
58# fn example() -> Option<()> {
59let searcher = Config::new()
60 .match_kind(MatchKind::LeftmostLongest)
61 .builder()
62 .add("foo")
63 .add("foobar")
64 .build()?;
65let matches: Vec<PatternID> = searcher
66 .find_iter("foobar")
67 .map(|mat| mat.pattern())
68 .collect();
69assert_eq!(vec![PatternID::must(1)], matches);
70# Some(()) }
71# if cfg!(all(feature = "std", any(
72# target_arch = "x86_64", target_arch = "aarch64",
73# ))) {
74# example().unwrap()
75# } else {
76# assert!(example().is_none());
77# }
78```
79
80# Packed substring searching
81
82Packed substring searching refers to the use of SIMD (Single Instruction,
83Multiple Data) to accelerate the detection of matches in a haystack. Unlike
84conventional algorithms, such as Aho-Corasick, SIMD algorithms for substring
85search tend to do better with a small number of patterns, where as Aho-Corasick
86generally maintains reasonably consistent performance regardless of the number
87of patterns you give it. Because of this, the vectorized searcher in this
88sub-module cannot be used as a general purpose searcher, since building the
89searcher may fail even when given a small number of patterns. However, in
90exchange, when searching for a small number of patterns, searching can be quite
91a bit faster than Aho-Corasick (sometimes by an order of magnitude).
92
93The key take away here is that constructing a searcher from a list of patterns
94is a fallible operation with no clear rules for when it will fail. While the
95precise conditions under which building a searcher can fail is specifically an
96implementation detail, here are some common reasons:
97
98* Too many patterns were given. Typically, the limit is on the order of 100 or
99 so, but this limit may fluctuate based on available CPU features.
100* The available packed algorithms require CPU features that aren't available.
101 For example, currently, this crate only provides packed algorithms for
102 `x86_64` and `aarch64`. Therefore, constructing a packed searcher on any
103 other target will always fail.
104* Zero patterns were given, or one of the patterns given was empty. Packed
105 searchers require at least one pattern and that all patterns are non-empty.
106* Something else about the nature of the patterns (typically based on
107 heuristics) suggests that a packed searcher would perform very poorly, so
108 no searcher is built.
109*/
110
111pub use crate::packed::api::{Builder, Config, FindIter, MatchKind, Searcher};
112
113mod api;
114mod ext;
115mod pattern;
116mod rabinkarp;
117mod teddy;
118#[cfg(all(feature = "std", test))]
119mod tests;
120mod vector;
121