primitives.rs source code [crates/regex-automata-0.4.6/src/util/primitives.rs]

1	/*!
2	Lower level primitive types that are useful in a variety of circumstances.
3
4	# Overview
5
6	This list represents the principal types in this module and briefly describes
7	when you might want to use them.
8
9	* [`PatternID`] - A type that represents the identifier of a regex pattern.
10	This is probably the most widely used type in this module (which is why it's
11	also re-exported in the crate root).
12	* [`StateID`] - A type that represents the identifier of a finite automaton
13	state. This is used for both NFAs and DFAs, with the notable exception of
14	the hybrid NFA/DFA. (The hybrid NFA/DFA uses a special purpose "lazy" state
15	identifier.)
16	* [`SmallIndex`] - The internal representation of both a `PatternID` and a
17	`StateID`. Its purpose is to serve as a type that can index memory without
18	being as big as a `usize` on 64-bit targets. The main idea behind this type
19	is that there are many things in regex engines that will, in practice, never
20	overflow a 32-bit integer. (For example, the number of patterns in a regex
21	or the number of states in an NFA.) Thus, a `SmallIndex` can be used to index
22	memory without peppering `as` casts everywhere. Moreover, it forces callers
23	to handle errors in the case where, somehow, the value would otherwise overflow
24	either a 32-bit integer or a `usize` (e.g., on 16-bit targets).
25	* [`NonMaxUsize`] - Represents a `usize` that cannot be `usize::MAX`. As a
26	result, `Option<NonMaxUsize>` has the same size in memory as a `usize`. This
27	is useful, for example, when representing the offsets of submatches since it
28	reduces memory usage by a factor of 2. It is a legal optimization since Rust
29	guarantees that slices never have a length that exceeds `isize::MAX`.
30	*/
31
32	use core::num::NonZeroUsize;
33
34	#[cfg(feature = "alloc")]
35	use alloc::vec::Vec;
36
37	use crate::util::int::{Usize, U16, U32, U64};
38
39	/// A `usize` that can never be `usize::MAX`.
40	///
41	/// This is similar to `core::num::NonZeroUsize`, but instead of not permitting
42	/// a zero value, this does not permit a max value.
43	///
44	/// This is useful in certain contexts where one wants to optimize the memory
45	/// usage of things that contain match offsets. Namely, since Rust slices
46	/// are guaranteed to never have a length exceeding `isize::MAX`, we can use
47	/// `usize::MAX` as a sentinel to indicate that no match was found. Indeed,
48	/// types like `Option<NonMaxUsize>` have exactly the same size in memory as a
49	/// `usize`.
50	///
51	/// This type is defined to be `repr(transparent)` for
52	/// `core::num::NonZeroUsize`, which is in turn defined to be
53	/// `repr(transparent)` for `usize`.
54	#[derive(Clone, Copy, Eq, Hash, PartialEq, PartialOrd, Ord)]
55	#[repr(transparent)]
56	pub struct NonMaxUsize(NonZeroUsize);
57
58	impl NonMaxUsize {
59	/// Create a new `NonMaxUsize` from the given value.
60	///
61	/// This returns `None` only when the given value is equal to `usize::MAX`.
62	#[inline]
63	pub fn new(value: usize) -> Option<NonMaxUsize> {
64	NonZeroUsize::new(value.wrapping_add(1)).map(NonMaxUsize)
65	}
66
67	/// Return the underlying `usize` value. The returned value is guaranteed
68	/// to not equal `usize::MAX`.
69	#[inline]
70	pub fn get(self) -> usize {
71	self.0.get().wrapping_sub(1)
72	}
73	}
74
75	// We provide our own Debug impl because seeing the internal repr can be quite
76	// surprising if you aren't expecting it. e.g., 'NonMaxUsize(5)' vs just '5'.
77	impl core::fmt::Debug for NonMaxUsize {
78	fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
79	write!(f, "{:?}", self.get())
80	}
81	}
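// Illustrative sketch, not part of this crate: a standalone copy of the
// `+ 1` shift used by `NonMaxUsize` above. The names `NonMaxSketch` and
// `option_sizes` are hypothetical. It shows that `usize::MAX` maps to the
// forbidden zero value and that wrapping the type in `Option` costs no
// extra space, unlike `Option<usize>` on typical targets.

```rust
use std::mem::size_of;
use std::num::NonZeroUsize;

/// Hypothetical stand-in for `NonMaxUsize`. It stores `value + 1`, so the
/// one value that wraps around to zero (`usize::MAX`) is unrepresentable.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
#[repr(transparent)]
pub struct NonMaxSketch(NonZeroUsize);

impl NonMaxSketch {
    pub fn new(value: usize) -> Option<NonMaxSketch> {
        // `usize::MAX + 1` wraps to 0, which `NonZeroUsize::new` rejects.
        NonZeroUsize::new(value.wrapping_add(1)).map(NonMaxSketch)
    }

    pub fn get(self) -> usize {
        // Undo the shift applied in `new`.
        self.0.get().wrapping_sub(1)
    }
}

/// Returns `(size of Option<NonMaxSketch>, size of Option<usize>)`.
pub fn option_sizes() -> (usize, usize) {
    (size_of::<Option<NonMaxSketch>>(), size_of::<Option<usize>>())
}
```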
82
83	/// A type that represents a "small" index.
84	///
85	/// The main idea of this type is to provide something that can index memory,
86	/// but uses less memory than `usize` on 64-bit systems. Specifically, its
87	/// representation is always a `u32` and has `repr(transparent)` enabled. (So
88	/// it is safe to transmute between a `u32` and a `SmallIndex`.)
89	///
90	/// A small index is typically useful in cases where there is no practical way
91	/// that the index will overflow a 32-bit integer. A good example of this is
92	/// an NFA state. If you could somehow build an NFA with `2^30` states, its
93	/// memory usage would be exorbitant and its runtime execution would be so
94	/// slow as to be completely worthless. Therefore, this crate generally deems
95	/// it acceptable to return an error if it would otherwise build an NFA that
96	/// requires a slice longer than what a 32-bit integer can index. In exchange,
97	/// we can use 32-bit indices instead of 64-bit indices in various places.
98	///
99	/// This type ensures this by providing a constructor that will return an error
100	/// if its argument cannot fit into the type. This makes it much easier to
101	/// handle these sorts of boundary cases that are otherwise extremely subtle.
102	///
103	/// On all targets, this type guarantees that its value will fit in a `u32`,
104	/// `i32`, `usize` and an `isize`. This means that on 16-bit targets, for
105	/// example, this type's maximum value will never overflow an `isize`,
106	/// which means it will never overflow an `i16` even though its internal
107	/// representation is still a `u32`.
108	///
109	/// The purpose for making the type fit into even signed integer types like
110	/// `isize` is to guarantee that the difference between any two small indices
111	/// is itself also a small index. This is useful in certain contexts, e.g.,
112	/// for delta encoding.
113	///
114	/// # Other types
115	///
116	/// The following types wrap `SmallIndex` to provide a more focused use case:
117	///
118	/// * [`PatternID`] is for representing the identifiers of patterns.
119	/// * [`StateID`] is for representing the identifiers of states in finite
120	/// automata. It is used for both NFAs and DFAs.
121	///
122	/// # Representation
123	///
124	/// This type is always represented internally by a `u32` and is marked as
125	/// `repr(transparent)`. Thus, this type always has the same representation as
126	/// a `u32`. It is thus safe to transmute between a `u32` and a `SmallIndex`.
127	///
128	/// # Indexing
129	///
130	/// For convenience, callers may use a `SmallIndex` to index slices.
131	///
132	/// # Safety
133	///
134	/// While a `SmallIndex` is meant to guarantee that its value fits into `usize`
135	/// without using as much space as a `usize` on all targets, callers must
136	/// not rely on this property for safety. Callers may choose to rely on this
137	/// property for correctness however. For example, creating a `SmallIndex` with
138	/// an invalid value can be done in entirely safe code. This may in turn result
139	/// in panics or silent logical errors.
140	#[derive(
141	Clone, Copy, Debug, Default, Eq, Hash, PartialEq, PartialOrd, Ord,
142	)]
143	#[repr(transparent)]
144	pub struct SmallIndex(u32);
145
146	impl SmallIndex {
147	/// The maximum index value.
148	#[cfg(any(target_pointer_width = "32", target_pointer_width = "64"))]
149	pub const MAX: SmallIndex =
150	// FIXME: Use as_usize() once const functions in traits are stable.
151	SmallIndex::new_unchecked(core::i32::MAX as usize - 1);
152
153	/// The maximum index value.
154	#[cfg(target_pointer_width = "16")]
155	pub const MAX: SmallIndex =
156	SmallIndex::new_unchecked(core::isize::MAX - 1);
157
158	/// The total number of values that can be represented as a small index.
159	pub const LIMIT: usize = SmallIndex::MAX.as_usize() + 1;
160
161	/// The zero index value.
162	pub const ZERO: SmallIndex = SmallIndex::new_unchecked(0);
163
164	/// The number of bytes that a single small index uses in memory.
165	pub const SIZE: usize = core::mem::size_of::<SmallIndex>();
166
167	/// Create a new small index.
168	///
169	/// If the given index exceeds [`SmallIndex::MAX`], then this returns
170	/// an error.
171	#[inline]
172	pub fn new(index: usize) -> Result<SmallIndex, SmallIndexError> {
173	SmallIndex::try_from(index)
174	}
175
176	/// Create a new small index without checking whether the given value
177	/// exceeds [`SmallIndex::MAX`].
178	///
179	/// Using this routine with an invalid index value will result in
180	/// unspecified behavior, but *not* undefined behavior. In particular, an
181	/// invalid index value is likely to cause panics or possibly even silent
182	/// logical errors.
183	///
184	/// Callers must never rely on a `SmallIndex` to be within a certain range
185	/// for memory safety.
186	#[inline]
187	pub const fn new_unchecked(index: usize) -> SmallIndex {
188	// FIXME: Use as_u32() once const functions in traits are stable.
189	SmallIndex(index as u32)
190	}
191
192	/// Like [`SmallIndex::new`], but panics if the given index is not valid.
193	#[inline]
194	pub fn must(index: usize) -> SmallIndex {
195	SmallIndex::new(index).expect("invalid small index")
196	}
197
198	/// Return this small index as a `usize`. This is guaranteed to never
199	/// overflow `usize`.
200	#[inline]
201	pub const fn as_usize(&self) -> usize {
202	// FIXME: Use as_usize() once const functions in traits are stable.
203	self.0 as usize
204	}
205
206	/// Return this small index as a `u64`. This is guaranteed to never
207	/// overflow.
208	#[inline]
209	pub const fn as_u64(&self) -> u64 {
210	// FIXME: Use u64::from() once const functions in traits are stable.
211	self.0 as u64
212	}
213
214	/// Return the internal `u32` of this small index. This is guaranteed to
215	/// never overflow `u32`.
216	#[inline]
217	pub const fn as_u32(&self) -> u32 {
218	self.0
219	}
220
221	/// Return the internal `u32` of this small index represented as an `i32`.
222	/// This is guaranteed to never overflow an `i32`.
223	#[inline]
224	pub const fn as_i32(&self) -> i32 {
225	// This is OK because we guarantee that our max value is <= i32::MAX.
226	self.0 as i32
227	}
228
229	/// Returns one more than this small index as a usize.
230	///
231	/// Since a small index has constraints on its maximum value, adding `1` to
232	/// it will always fit in a `usize`, `u32` and an `i32`.
233	#[inline]
234	pub fn one_more(&self) -> usize {
235	self.as_usize() + 1
236	}
237
238	/// Decode this small index from the bytes given using the native endian
239	/// byte order for the current target.
240	///
241	/// If the decoded integer is not representable as a small index for the
242	/// current target, then this returns an error.
243	#[inline]
244	pub fn from_ne_bytes(
245	bytes: [u8; 4],
246	) -> Result<SmallIndex, SmallIndexError> {
247	let id = u32::from_ne_bytes(bytes);
248	if id > SmallIndex::MAX.as_u32() {
249	return Err(SmallIndexError { attempted: u64::from(id) });
250	}
251	Ok(SmallIndex::new_unchecked(id.as_usize()))
252	}
253
254	/// Decode this small index from the bytes given using the native endian
255	/// byte order for the current target.
256	///
257	/// This is analogous to [`SmallIndex::new_unchecked`] in that it does not
258	/// check whether the decoded integer is representable as a small index.
259	#[inline]
260	pub fn from_ne_bytes_unchecked(bytes: [u8; 4]) -> SmallIndex {
261	SmallIndex::new_unchecked(u32::from_ne_bytes(bytes).as_usize())
262	}
263
264	/// Return the underlying small index integer as raw bytes in native endian
265	/// format.
266	#[inline]
267	pub fn to_ne_bytes(&self) -> [u8; 4] {
268	self.0.to_ne_bytes()
269	}
270	}
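// Illustrative sketch, not part of this crate: why capping the maximum at
// `i32::MAX - 1` makes `one_more` and signed differences overflow-free.
// The type `Idx`, its bare `u64` error and the `delta` helper are
// hypothetical, and the sketch assumes a 32-bit or 64-bit target.

```rust
/// Hypothetical stand-in for `SmallIndex`.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Idx(u32);

impl Idx {
    /// Mirrors the 32/64-bit definition above: `i32::MAX - 1`.
    pub const MAX: Idx = Idx(i32::MAX as u32 - 1);
    /// Number of representable values, i.e. `MAX + 1`.
    pub const LIMIT: usize = Idx::MAX.0 as usize + 1;

    /// Checked constructor. On failure it hands back the rejected value
    /// (a simplification of `SmallIndexError`).
    pub fn new(index: usize) -> Result<Idx, u64> {
        if index > Idx::MAX.0 as usize {
            return Err(index as u64);
        }
        Ok(Idx(index as u32))
    }

    pub fn as_i32(self) -> i32 {
        // Lossless because MAX < i32::MAX.
        self.0 as i32
    }

    /// One past this index. Cannot overflow `i32` since MAX is
    /// `i32::MAX - 1`.
    pub fn one_more(self) -> i32 {
        self.as_i32() + 1
    }

    /// Signed difference, as one might use for delta encoding. Both
    /// operands lie in `0..=MAX`, so the subtraction cannot overflow.
    pub fn delta(self, other: Idx) -> i32 {
        self.as_i32() - other.as_i32()
    }
}
```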
271
272	impl<T> core::ops::Index<SmallIndex> for [T] {
273	type Output = T;
274
275	#[inline]
276	fn index(&self, index: SmallIndex) -> &T {
277	&self[index.as_usize()]
278	}
279	}
280
281	impl<T> core::ops::IndexMut<SmallIndex> for [T] {
282	#[inline]
283	fn index_mut(&mut self, index: SmallIndex) -> &mut T {
284	&mut self[index.as_usize()]
285	}
286	}
287
288	#[cfg(feature = "alloc")]
289	impl<T> core::ops::Index<SmallIndex> for Vec<T> {
290	type Output = T;
291
292	#[inline]
293	fn index(&self, index: SmallIndex) -> &T {
294	&self[index.as_usize()]
295	}
296	}
297
298	#[cfg(feature = "alloc")]
299	impl<T> core::ops::IndexMut<SmallIndex> for Vec<T> {
300	#[inline]
301	fn index_mut(&mut self, index: SmallIndex) -> &mut T {
302	&mut self[index.as_usize()]
303	}
304	}
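// Illustrative sketch, not part of this crate: the same indexing pattern
// applied to a hypothetical local type `Idx`. Because `Idx` is defined in
// the current crate, implementing `Index<Idx>` for the foreign types `[T]`
// and `Vec<T>` is permitted, and callers can index without `as` casts.

```rust
use std::ops::{Index, IndexMut};

/// Hypothetical local index type standing in for `SmallIndex`.
#[derive(Clone, Copy, Debug)]
pub struct Idx(pub u32);

impl<T> Index<Idx> for [T] {
    type Output = T;
    fn index(&self, i: Idx) -> &T {
        &self[i.0 as usize]
    }
}

impl<T> IndexMut<Idx> for [T] {
    fn index_mut(&mut self, i: Idx) -> &mut T {
        &mut self[i.0 as usize]
    }
}

impl<T> Index<Idx> for Vec<T> {
    type Output = T;
    fn index(&self, i: Idx) -> &T {
        &self[i.0 as usize]
    }
}

impl<T> IndexMut<Idx> for Vec<T> {
    fn index_mut(&mut self, i: Idx) -> &mut T {
        &mut self[i.0 as usize]
    }
}

/// Increments the counter at `i` and returns the new count. Out-of-range
/// indices still panic, exactly as with a plain `usize` index.
pub fn bump(counts: &mut Vec<u32>, i: Idx) -> u32 {
    counts[i] += 1;
    counts[i]
}
```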
305
306	impl From<u8> for SmallIndex {
307	fn from(index: u8) -> SmallIndex {
308	SmallIndex::new_unchecked(usize::from(index))
309	}
310	}
311
312	impl TryFrom<u16> for SmallIndex {
313	type Error = SmallIndexError;
314
315	fn try_from(index: u16) -> Result<SmallIndex, SmallIndexError> {
316	if u32::from(index) > SmallIndex::MAX.as_u32() {
317	return Err(SmallIndexError { attempted: u64::from(index) });
318	}
319	Ok(SmallIndex::new_unchecked(index.as_usize()))
320	}
321	}
322
323	impl TryFrom<u32> for SmallIndex {
324	type Error = SmallIndexError;
325
326	fn try_from(index: u32) -> Result<SmallIndex, SmallIndexError> {
327	if index > SmallIndex::MAX.as_u32() {
328	return Err(SmallIndexError { attempted: u64::from(index) });
329	}
330	Ok(SmallIndex::new_unchecked(index.as_usize()))
331	}
332	}
333
334	impl TryFrom<u64> for SmallIndex {
335	type Error = SmallIndexError;
336
337	fn try_from(index: u64) -> Result<SmallIndex, SmallIndexError> {
338	if index > SmallIndex::MAX.as_u64() {
339	return Err(SmallIndexError { attempted: index });
340	}
341	Ok(SmallIndex::new_unchecked(index.as_usize()))
342	}
343	}
344
345	impl TryFrom<usize> for SmallIndex {
346	type Error = SmallIndexError;
347
348	fn try_from(index: usize) -> Result<SmallIndex, SmallIndexError> {
349	if index > SmallIndex::MAX.as_usize() {
350	return Err(SmallIndexError { attempted: index.as_u64() });
351	}
352	Ok(SmallIndex::new_unchecked(index))
353	}
354	}
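// Illustrative sketch, not part of this crate: the split between an
// infallible `From<u8>` and a fallible `TryFrom<u64>`, with an error that
// remembers the rejected value. `Idx`, `IdxError`, `MAX` and `checked` are
// hypothetical names, and `MAX` assumes a 32-bit or 64-bit target.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Assumed maximum, mirroring `SmallIndex::MAX` on 32/64-bit targets.
pub const MAX: u32 = i32::MAX as u32 - 1;

#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Idx(pub u32);

/// Hypothetical error type that records the value that did not fit.
#[derive(Clone, Debug, Eq, PartialEq)]
pub struct IdxError {
    pub attempted: u64,
}

impl fmt::Display for IdxError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(
            f,
            "failed to create index from {}, which exceeds {}",
            self.attempted, MAX,
        )
    }
}

// Every `u8` is in range, so this conversion cannot fail.
impl From<u8> for Idx {
    fn from(n: u8) -> Idx {
        Idx(u32::from(n))
    }
}

// A `u64` may exceed `MAX`, so this conversion is fallible.
impl TryFrom<u64> for Idx {
    type Error = IdxError;

    fn try_from(n: u64) -> Result<Idx, IdxError> {
        if n > u64::from(MAX) {
            return Err(IdxError { attempted: n });
        }
        // Lossless: n <= MAX < u32::MAX was checked above.
        Ok(Idx(n as u32))
    }
}

/// Convenience wrapper around the `TryFrom<u64>` impl.
pub fn checked(n: u64) -> Result<Idx, IdxError> {
    Idx::try_from(n)
}
```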
355
356	#[cfg(test)]
357	impl quickcheck::Arbitrary for SmallIndex {
358	fn arbitrary(gen: &mut quickcheck::Gen) -> SmallIndex {
359	use core::cmp::max;
360
361	let id = max(i32::MIN + 1, i32::arbitrary(gen)).abs();
362	if id > SmallIndex::MAX.as_i32() {
363	SmallIndex::MAX
364	} else {
365	SmallIndex::new(usize::try_from(id).unwrap()).unwrap()
366	}
367	}
368	}
369
370	/// This error occurs when a small index could not be constructed.
371	///
372	/// This occurs when given an integer exceeding the maximum small index value.
373	///
374	/// When the `std` feature is enabled, this implements the `Error` trait.
375	#[derive(Clone, Debug, Eq, PartialEq)]
376	pub struct SmallIndexError {
377	attempted: u64,
378	}
379
380	impl SmallIndexError {
381	/// Returns the value that could not be converted to a small index.
382	pub fn attempted(&self) -> u64 {
383	self.attempted
384	}
385	}
386
387	#[cfg(feature = "std")]
388	impl std::error::Error for SmallIndexError {}
389
390	impl core::fmt::Display for SmallIndexError {
391	fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
392	write!(
393	f,
394	"failed to create small index from {:?}, which exceeds {:?}",
395	self.attempted(),
396	SmallIndex::MAX,
397	)
398	}
399	}
400
401	#[derive(Clone, Debug)]
402	pub(crate) struct SmallIndexIter {
403	rng: core::ops::Range<usize>,
404	}
405
406	impl Iterator for SmallIndexIter {
407	type Item = SmallIndex;
408
409	fn next(&mut self) -> Option<SmallIndex> {
410	if self.rng.start >= self.rng.end {
411	return None;
412	}
413	let next_id = self.rng.start + 1;
414	let id = core::mem::replace(&mut self.rng.start, next_id);
415	// new_unchecked is OK since we asserted that the number of
416	// elements in this iterator will fit in an ID at construction.
417	Some(SmallIndex::new_unchecked(id))
418	}
419	}
420
421	macro_rules! index_type_impls {
422	($name:ident, $err:ident, $iter:ident, $withiter:ident) => {
423	impl $name {
424	/// The maximum value.
425	pub const MAX: $name = $name(SmallIndex::MAX);
426
427	/// The total number of values that can be represented.
428	pub const LIMIT: usize = SmallIndex::LIMIT;
429
430	/// The zero value.
431	pub const ZERO: $name = $name(SmallIndex::ZERO);
432
433	/// The number of bytes that a single value uses in memory.
434	pub const SIZE: usize = SmallIndex::SIZE;
435
436	/// Create a new value that is represented by a "small index."
437	///
438	/// If the given index exceeds the maximum allowed value, then this
439	/// returns an error.
440	#[inline]
441	pub fn new(value: usize) -> Result<$name, $err> {
442	SmallIndex::new(value).map($name).map_err($err)
443	}
444
445	/// Create a new value without checking whether the given argument
446	/// exceeds the maximum.
447	///
448	/// Using this routine with an invalid value will result in
449	/// unspecified behavior, but *not* undefined behavior. In
450	/// particular, an invalid ID value is likely to cause panics or
451	/// possibly even silent logical errors.
452	///
453	/// Callers must never rely on this type to be within a certain
454	/// range for memory safety.
455	#[inline]
456	pub const fn new_unchecked(value: usize) -> $name {
457	$name(SmallIndex::new_unchecked(value))
458	}
459
460	/// Like `new`, but panics if the given value is not valid.
461	#[inline]
462	pub fn must(value: usize) -> $name {
463	$name::new(value).expect(concat!(
464	"invalid ",
465	stringify!($name),
466	" value"
467	))
468	}
469
470	/// Return the internal value as a `usize`. This is guaranteed to
471	/// never overflow `usize`.
472	#[inline]
473	pub const fn as_usize(&self) -> usize {
474	self.0.as_usize()
475	}
476
477	/// Return the internal value as a `u64`. This is guaranteed to
478	/// never overflow.
479	#[inline]
480	pub const fn as_u64(&self) -> u64 {
481	self.0.as_u64()
482	}
483
484	/// Return the internal value as a `u32`. This is guaranteed to
485	/// never overflow `u32`.
486	#[inline]
487	pub const fn as_u32(&self) -> u32 {
488	self.0.as_u32()
489	}
490
491	/// Return the internal value as an `i32`. This is guaranteed to
492	/// never overflow an `i32`.
493	#[inline]
494	pub const fn as_i32(&self) -> i32 {
495	self.0.as_i32()
496	}
497
498	/// Returns one more than this value as a usize.
499	///
500	/// Since values represented by a "small index" have constraints
501	/// on their maximum value, adding `1` to it will always fit in a
502	/// `usize`, `u32` and an `i32`.
503	#[inline]
504	pub fn one_more(&self) -> usize {
505	self.0.one_more()
506	}
507
508	/// Decode this value from the bytes given using the native endian
509	/// byte order for the current target.
510	///
511	/// If the decoded integer is not representable as a small index
512	/// for the current target, then this returns an error.
513	#[inline]
514	pub fn from_ne_bytes(bytes: [u8; 4]) -> Result<$name, $err> {
515	SmallIndex::from_ne_bytes(bytes).map($name).map_err($err)
516	}
517
518	/// Decode this value from the bytes given using the native endian
519	/// byte order for the current target.
520	///
521	/// This is analogous to `new_unchecked` in that it does not check
522	/// whether the decoded integer is representable as a small index.
523	#[inline]
524	pub fn from_ne_bytes_unchecked(bytes: [u8; 4]) -> $name {
525	$name(SmallIndex::from_ne_bytes_unchecked(bytes))
526	}
527
528	/// Return the underlying integer as raw bytes in native endian
529	/// format.
530	#[inline]
531	pub fn to_ne_bytes(&self) -> [u8; 4] {
532	self.0.to_ne_bytes()
533	}
534
535	/// Returns an iterator over all values from 0 up to and not
536	/// including the given length.
537	///
538	/// If the given length exceeds this type's limit, then this
539	/// panics.
540	pub(crate) fn iter(len: usize) -> $iter {
541	$iter::new(len)
542	}
543	}
544
545	// We write our own Debug impl so that we get things like PatternID(5)
546	// instead of PatternID(SmallIndex(5)).
547	impl core::fmt::Debug for $name {
548	fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
549	f.debug_tuple(stringify!($name)).field(&self.as_u32()).finish()
550	}
551	}
552
553	impl<T> core::ops::Index<$name> for [T] {
554	type Output = T;
555
556	#[inline]
557	fn index(&self, index: $name) -> &T {
558	&self[index.as_usize()]
559	}
560	}
561
562	impl<T> core::ops::IndexMut<$name> for [T] {
563	#[inline]
564	fn index_mut(&mut self, index: $name) -> &mut T {
565	&mut self[index.as_usize()]
566	}
567	}
568
569	#[cfg(feature = "alloc")]
570	impl<T> core::ops::Index<$name> for Vec<T> {
571	type Output = T;
572
573	#[inline]
574	fn index(&self, index: $name) -> &T {
575	&self[index.as_usize()]
576	}
577	}
578
579	#[cfg(feature = "alloc")]
580	impl<T> core::ops::IndexMut<$name> for Vec<T> {
581	#[inline]
582	fn index_mut(&mut self, index: $name) -> &mut T {
583	&mut self[index.as_usize()]
584	}
585	}
586
587	impl From<u8> for $name {
588	fn from(value: u8) -> $name {
589	$name(SmallIndex::from(value))
590	}
591	}
592
593	impl TryFrom<u16> for $name {
594	type Error = $err;
595
596	fn try_from(value: u16) -> Result<$name, $err> {
597	SmallIndex::try_from(value).map($name).map_err($err)
598	}
599	}
600
601	impl TryFrom<u32> for $name {
602	type Error = $err;
603
604	fn try_from(value: u32) -> Result<$name, $err> {
605	SmallIndex::try_from(value).map($name).map_err($err)
606	}
607	}
608
609	impl TryFrom<u64> for $name {
610	type Error = $err;
611
612	fn try_from(value: u64) -> Result<$name, $err> {
613	SmallIndex::try_from(value).map($name).map_err($err)
614	}
615	}
616
617	impl TryFrom<usize> for $name {
618	type Error = $err;
619
620	fn try_from(value: usize) -> Result<$name, $err> {
621	SmallIndex::try_from(value).map($name).map_err($err)
622	}
623	}
624
625	#[cfg(test)]
626	impl quickcheck::Arbitrary for $name {
627	fn arbitrary(gen: &mut quickcheck::Gen) -> $name {
628	$name(SmallIndex::arbitrary(gen))
629	}
630	}
631
632	/// This error occurs when a value could not be constructed.
633	///
634	/// This occurs when given an integer exceeding the maximum allowed
635	/// value.
636	///
637	/// When the `std` feature is enabled, this implements the `Error`
638	/// trait.
639	#[derive(Clone, Debug, Eq, PartialEq)]
640	pub struct $err(SmallIndexError);
641
642	impl $err {
643	/// Returns the value that could not be converted to an ID.
644	pub fn attempted(&self) -> u64 {
645	self.0.attempted()
646	}
647	}
648
649	#[cfg(feature = "std")]
650	impl std::error::Error for $err {}
651
652	impl core::fmt::Display for $err {
653	fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
654	write!(
655	f,
656	"failed to create {} from {:?}, which exceeds {:?}",
657	stringify!($name),
658	self.attempted(),
659	$name::MAX,
660	)
661	}
662	}
663
664	#[derive(Clone, Debug)]
665	pub(crate) struct $iter(SmallIndexIter);
666
667	impl $iter {
668	fn new(len: usize) -> $iter {
669	assert!(
670	len <= $name::LIMIT,
671	"cannot create iterator for {} when number of \
672	elements exceed {:?}",
673	stringify!($name),
674	$name::LIMIT,
675	);
676	$iter(SmallIndexIter { rng: 0..len })
677	}
678	}
679
680	impl Iterator for $iter {
681	type Item = $name;
682
683	fn next(&mut self) -> Option<$name> {
684	self.0.next().map($name)
685	}
686	}
687
688	/// An iterator adapter that is like std::iter::Enumerate, but attaches
689	/// small index values instead. It requires `ExactSizeIterator`. At
690	/// construction, it ensures that the index of each element in the
691	/// iterator is representable in the corresponding small index type.
692	#[derive(Clone, Debug)]
693	pub(crate) struct $withiter<I> {
694	it: I,
695	ids: $iter,
696	}
697
698	impl<I: Iterator + ExactSizeIterator> $withiter<I> {
699	fn new(it: I) -> $withiter<I> {
700	let ids = $name::iter(it.len());
701	$withiter { it, ids }
702	}
703	}
704
705	impl<I: Iterator + ExactSizeIterator> Iterator for $withiter<I> {
706	type Item = ($name, I::Item);
707
708	fn next(&mut self) -> Option<($name, I::Item)> {
709	let item = self.it.next()?;
710	// Number of elements in this iterator must match, according
711	// to contract of ExactSizeIterator.
712	let id = self.ids.next().unwrap();
713	Some((id, item))
714	}
715	}
716	};
717	}
718
719	/// The identifier of a regex pattern, represented by a [`SmallIndex`].
720	///
721	/// The identifier for a pattern corresponds to its relative position among
722	/// other patterns in a single finite state machine. Namely, when building
723	/// a multi-pattern regex engine, one must supply a sequence of patterns to
724	/// match. The position (starting at 0) of each pattern in that sequence
725	/// represents its identifier. This identifier is in turn used to identify and
726	/// report matches of that pattern in various APIs.
727	///
728	/// See the [`SmallIndex`] type for more information about what it means for
729	/// a pattern ID to be a "small index."
730	///
731	/// Note that this type is defined in the
732	/// [`util::primitives`](crate::util::primitives) module, but it is also
733	/// re-exported at the crate root due to how common it is.
734	#[derive(Clone, Copy, Default, Eq, Hash, PartialEq, PartialOrd, Ord)]
735	#[repr(transparent)]
736	pub struct PatternID(SmallIndex);
737
738	/// The identifier of a finite automaton state, represented by a
739	/// [`SmallIndex`].
740	///
741	/// Most regex engines in this crate are built on top of finite automata. Each
742	/// state in a finite automaton defines transitions from its state to another.
743	/// Those transitions point to other states via their identifiers, i.e., a
744	/// `StateID`. Since finite automata tend to contain many transitions, it is
745	/// much more memory efficient to define state IDs as small indices.
746	///
747	/// See the [`SmallIndex`] type for more information about what it means for
748	/// a state ID to be a "small index."
749	#[derive(Clone, Copy, Default, Eq, Hash, PartialEq, PartialOrd, Ord)]
750	#[repr(transparent)]
751	pub struct StateID(SmallIndex);
752
753	index_type_impls!(PatternID, PatternIDError, PatternIDIter, WithPatternIDIter);
754	index_type_impls!(StateID, StateIDError, StateIDIter, WithStateIDIter);
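// Illustrative sketch, not part of this crate: a much smaller macro in the
// spirit of `index_type_impls!`, showing how a hand-written `Debug` impl
// hides the inner wrapper. The names `Idx`, `id_type!`, `PatId`, `StId`
// and `Derived` are hypothetical.

```rust
use std::fmt;

/// Hypothetical stand-in for `SmallIndex`.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Idx(pub u32);

/// Newtype with a derived `Debug`, kept only for comparison.
#[derive(Debug)]
pub struct Derived(pub Idx);

macro_rules! id_type {
    ($name:ident) => {
        #[derive(Clone, Copy, Eq, PartialEq)]
        pub struct $name(pub Idx);

        impl $name {
            pub fn as_u32(&self) -> u32 {
                (self.0).0
            }
        }

        // Print `Name(5)` rather than the derived `Name(Idx(5))`.
        impl fmt::Debug for $name {
            fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
                f.debug_tuple(stringify!($name))
                    .field(&self.as_u32())
                    .finish()
            }
        }
    };
}

// One invocation per ID type, as with `PatternID` and `StateID` above.
id_type!(PatId);
id_type!(StId);
```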
755
756	/// A utility trait that defines a couple of adapters for making it convenient
757	/// to access indices as "small index" types. We require ExactSizeIterator so
758	/// that iterator construction can do a single check to make sure the index of
759	/// each element is representable by its small index type.
760	pub(crate) trait IteratorIndexExt: Iterator {
761	fn with_pattern_ids(self) -> WithPatternIDIter<Self>
762	where
763	Self: Sized + ExactSizeIterator,
764	{
765	WithPatternIDIter::new(self)
766	}
767
768	fn with_state_ids(self) -> WithStateIDIter<Self>
769	where
770	Self: Sized + ExactSizeIterator,
771	{
772	WithStateIDIter::new(self)
773	}
774	}
775
776	impl<I: Iterator> IteratorIndexExt for I {}
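// Illustrative sketch, not part of this crate: an `Enumerate`-like adapter
// that checks the length once at construction, which is what requiring
// `ExactSizeIterator` makes possible. `Id`, `LIMIT`, `WithIds` and `label`
// are hypothetical names, and `LIMIT` assumes a 32-bit or 64-bit target.

```rust
/// Hypothetical ID type. The real code uses `PatternID` or `StateID`.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub struct Id(pub u32);

/// Assumed limit, mirroring `SmallIndex::LIMIT` on 32/64-bit targets.
pub const LIMIT: usize = i32::MAX as usize;

pub struct WithIds<I> {
    it: I,
    next: usize,
}

impl<I: ExactSizeIterator> WithIds<I> {
    pub fn new(it: I) -> WithIds<I> {
        // A single up-front check replaces a check per element.
        assert!(it.len() <= LIMIT, "too many elements for Id");
        WithIds { it, next: 0 }
    }
}

impl<I: ExactSizeIterator> Iterator for WithIds<I> {
    type Item = (Id, I::Item);

    fn next(&mut self) -> Option<(Id, I::Item)> {
        let item = self.it.next()?;
        // Cannot truncate: `next < len <= LIMIT` was asserted in `new`,
        // provided the inner iterator honours its reported length.
        let id = Id(self.next as u32);
        self.next += 1;
        Some((id, item))
    }
}

/// Pairs each name with its position as an `Id`.
pub fn label<'a>(names: &'a [&'a str]) -> Vec<(Id, &'a str)> {
    WithIds::new(names.iter().copied()).collect()
}
```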
777
