1/*!
2Provides packed multiple substring search, principally for a small number of
3patterns.
4
5This sub-module provides vectorized routines for quickly finding
6matches of a small number of patterns. In general, users of this crate
7shouldn't need to interface with this module directly, as the primary
8[`AhoCorasick`](crate::AhoCorasick) searcher will use these routines
9automatically as a prefilter when applicable. However, in some cases, callers
10may want to bypass the Aho-Corasick machinery entirely and use this vectorized
11searcher directly.
12
13# Overview
14
15The primary types in this sub-module are:
16
17* [`Searcher`] executes the actual search algorithm to report matches in a
18haystack.
19* [`Builder`] accumulates patterns incrementally and can construct a
20`Searcher`.
21* [`Config`] permits tuning the searcher, and itself will produce a `Builder`
22(which can then be used to build a `Searcher`). Currently, the only tuneable
23knob are the match semantics, but this may be expanded in the future.
24
25# Examples
26
27This example shows how to create a searcher from an iterator of patterns.
28By default, leftmost-first match semantics are used. (See the top-level
29[`MatchKind`] type for more details about match semantics, which apply
30similarly to packed substring search.)
31
32```
33use aho_corasick::{packed::{MatchKind, Searcher}, PatternID};
34
35# fn example() -> Option<()> {
36let searcher = Searcher::new(["foobar", "foo"].iter().cloned())?;
37let matches: Vec<PatternID> = searcher
38 .find_iter("foobar")
39 .map(|mat| mat.pattern())
40 .collect();
41assert_eq!(vec![PatternID::ZERO], matches);
42# Some(()) }
43# if cfg!(all(feature = "std", target_arch = "x86_64")) {
44# example().unwrap()
45# } else {
46# assert!(example().is_none());
47# }
48```
49
50This example shows how to use [`Config`] to change the match semantics to
51leftmost-longest:
52
53```
54use aho_corasick::{packed::{Config, MatchKind}, PatternID};
55
56# fn example() -> Option<()> {
57let searcher = Config::new()
58 .match_kind(MatchKind::LeftmostLongest)
59 .builder()
60 .add("foo")
61 .add("foobar")
62 .build()?;
63let matches: Vec<PatternID> = searcher
64 .find_iter("foobar")
65 .map(|mat| mat.pattern())
66 .collect();
67assert_eq!(vec![PatternID::must(1)], matches);
68# Some(()) }
69# if cfg!(all(feature = "std", target_arch = "x86_64")) {
70# example().unwrap()
71# } else {
72# assert!(example().is_none());
73# }
74```
75
76# Packed substring searching
77
78Packed substring searching refers to the use of SIMD (Single Instruction,
79Multiple Data) to accelerate the detection of matches in a haystack. Unlike
80conventional algorithms, such as Aho-Corasick, SIMD algorithms for substring
81search tend to do better with a small number of patterns, where as Aho-Corasick
82generally maintains reasonably consistent performance regardless of the number
83of patterns you give it. Because of this, the vectorized searcher in this
84sub-module cannot be used as a general purpose searcher, since building the
85searcher may fail even when given a small number of patterns. However, in
86exchange, when searching for a small number of patterns, searching can be quite
87a bit faster than Aho-Corasick (sometimes by an order of magnitude).
88
89The key take away here is that constructing a searcher from a list of patterns
90is a fallible operation with no clear rules for when it will fail. While the
91precise conditions under which building a searcher can fail is specifically an
92implementation detail, here are some common reasons:
93
94* Too many patterns were given. Typically, the limit is on the order of 100 or
95 so, but this limit may fluctuate based on available CPU features.
96* The available packed algorithms require CPU features that aren't available.
97 For example, currently, this crate only provides packed algorithms for
98 `x86_64`. Therefore, constructing a packed searcher on any other target
99 (e.g., ARM) will always fail.
100* Zero patterns were given, or one of the patterns given was empty. Packed
101 searchers require at least one pattern and that all patterns are non-empty.
102* Something else about the nature of the patterns (typically based on
103 heuristics) suggests that a packed searcher would perform very poorly, so
104 no searcher is built.
105*/
106
107pub use crate::packed::api::{Builder, Config, FindIter, MatchKind, Searcher};
108
109mod api;
110mod pattern;
111mod rabinkarp;
112mod teddy;
113#[cfg(all(feature = "std", test))]
114mod tests;
115#[cfg(all(feature = "std", target_arch = "x86_64"))]
116mod vector;
117