1 | /*! |
2 | Lower level primitive types that are useful in a variety of circumstances. |
3 | |
4 | # Overview |
5 | |
6 | This list represents the principle types in this module and briefly describes |
7 | when you might want to use them. |
8 | |
9 | * [`PatternID`] - A type that represents the identifier of a regex pattern. |
10 | This is probably the most widely used type in this module (which is why it's |
11 | also re-exported in the crate root). |
12 | * [`StateID`] - A type the represents the identifier of a finite automaton |
13 | state. This is used for both NFAs and DFAs, with the notable exception of |
14 | the hybrid NFA/DFA. (The hybrid NFA/DFA uses a special purpose "lazy" state |
15 | identifier.) |
16 | * [`SmallIndex`] - The internal representation of both a `PatternID` and a |
17 | `StateID`. Its purpose is to serve as a type that can index memory without |
18 | being as big as a `usize` on 64-bit targets. The main idea behind this type |
19 | is that there are many things in regex engines that will, in practice, never |
20 | overflow a 32-bit integer. (For example, like the number of patterns in a regex |
21 | or the number of states in an NFA.) Thus, a `SmallIndex` can be used to index |
22 | memory without peppering `as` casts everywhere. Moreover, it forces callers |
23 | to handle errors in the case where, somehow, the value would otherwise overflow |
24 | either a 32-bit integer or a `usize` (e.g., on 16-bit targets). |
25 | */ |
26 | |
27 | // The macro we use to define some types below adds methods that we don't |
28 | // use on some of the types. There isn't much, so we just squash the warning. |
29 | #![allow (dead_code)] |
30 | |
31 | use alloc::vec::Vec; |
32 | |
33 | use crate::util::int::{Usize, U16, U32, U64}; |
34 | |
35 | /// A type that represents a "small" index. |
36 | /// |
37 | /// The main idea of this type is to provide something that can index memory, |
38 | /// but uses less memory than `usize` on 64-bit systems. Specifically, its |
39 | /// representation is always a `u32` and has `repr(transparent)` enabled. (So |
40 | /// it is safe to transmute between a `u32` and a `SmallIndex`.) |
41 | /// |
42 | /// A small index is typically useful in cases where there is no practical way |
43 | /// that the index will overflow a 32-bit integer. A good example of this is |
44 | /// an NFA state. If you could somehow build an NFA with `2^30` states, its |
45 | /// memory usage would be exorbitant and its runtime execution would be so |
46 | /// slow as to be completely worthless. Therefore, this crate generally deems |
47 | /// it acceptable to return an error if it would otherwise build an NFA that |
48 | /// requires a slice longer than what a 32-bit integer can index. In exchange, |
49 | /// we can use 32-bit indices instead of 64-bit indices in various places. |
50 | /// |
51 | /// This type ensures this by providing a constructor that will return an error |
52 | /// if its argument cannot fit into the type. This makes it much easier to |
53 | /// handle these sorts of boundary cases that are otherwise extremely subtle. |
54 | /// |
55 | /// On all targets, this type guarantees that its value will fit in a `u32`, |
56 | /// `i32`, `usize` and an `isize`. This means that on 16-bit targets, for |
57 | /// example, this type's maximum value will never overflow an `isize`, |
58 | /// which means it will never overflow a `i16` even though its internal |
59 | /// representation is still a `u32`. |
60 | /// |
61 | /// The purpose for making the type fit into even signed integer types like |
62 | /// `isize` is to guarantee that the difference between any two small indices |
63 | /// is itself also a small index. This is useful in certain contexts, e.g., |
64 | /// for delta encoding. |
65 | /// |
66 | /// # Other types |
67 | /// |
68 | /// The following types wrap `SmallIndex` to provide a more focused use case: |
69 | /// |
70 | /// * [`PatternID`] is for representing the identifiers of patterns. |
71 | /// * [`StateID`] is for representing the identifiers of states in finite |
72 | /// automata. It is used for both NFAs and DFAs. |
73 | /// |
74 | /// # Representation |
75 | /// |
76 | /// This type is always represented internally by a `u32` and is marked as |
77 | /// `repr(transparent)`. Thus, this type always has the same representation as |
78 | /// a `u32`. It is thus safe to transmute between a `u32` and a `SmallIndex`. |
79 | /// |
80 | /// # Indexing |
81 | /// |
82 | /// For convenience, callers may use a `SmallIndex` to index slices. |
83 | /// |
84 | /// # Safety |
85 | /// |
86 | /// While a `SmallIndex` is meant to guarantee that its value fits into `usize` |
87 | /// without using as much space as a `usize` on all targets, callers must |
88 | /// not rely on this property for safety. Callers may choose to rely on this |
89 | /// property for correctness however. For example, creating a `SmallIndex` with |
90 | /// an invalid value can be done in entirely safe code. This may in turn result |
91 | /// in panics or silent logical errors. |
92 | #[derive ( |
93 | Clone, Copy, Debug, Default, Eq, Hash, PartialEq, PartialOrd, Ord, |
94 | )] |
95 | #[repr (transparent)] |
96 | pub(crate) struct SmallIndex(u32); |
97 | |
98 | impl SmallIndex { |
99 | /// The maximum index value. |
100 | #[cfg (any(target_pointer_width = "32" , target_pointer_width = "64" ))] |
101 | pub const MAX: SmallIndex = |
102 | // FIXME: Use as_usize() once const functions in traits are stable. |
103 | SmallIndex::new_unchecked(core::i32::MAX as usize - 1); |
104 | |
105 | /// The maximum index value. |
106 | #[cfg (target_pointer_width = "16" )] |
107 | pub const MAX: SmallIndex = |
108 | SmallIndex::new_unchecked(core::isize::MAX - 1); |
109 | |
110 | /// The total number of values that can be represented as a small index. |
111 | pub const LIMIT: usize = SmallIndex::MAX.as_usize() + 1; |
112 | |
113 | /// The zero index value. |
114 | pub const ZERO: SmallIndex = SmallIndex::new_unchecked(0); |
115 | |
116 | /// The number of bytes that a single small index uses in memory. |
117 | pub const SIZE: usize = core::mem::size_of::<SmallIndex>(); |
118 | |
119 | /// Create a new small index. |
120 | /// |
121 | /// If the given index exceeds [`SmallIndex::MAX`], then this returns |
122 | /// an error. |
123 | #[inline ] |
124 | pub fn new(index: usize) -> Result<SmallIndex, SmallIndexError> { |
125 | SmallIndex::try_from(index) |
126 | } |
127 | |
128 | /// Create a new small index without checking whether the given value |
129 | /// exceeds [`SmallIndex::MAX`]. |
130 | /// |
131 | /// Using this routine with an invalid index value will result in |
132 | /// unspecified behavior, but *not* undefined behavior. In particular, an |
133 | /// invalid index value is likely to cause panics or possibly even silent |
134 | /// logical errors. |
135 | /// |
136 | /// Callers must never rely on a `SmallIndex` to be within a certain range |
137 | /// for memory safety. |
138 | #[inline ] |
139 | pub const fn new_unchecked(index: usize) -> SmallIndex { |
140 | // FIXME: Use as_u32() once const functions in traits are stable. |
141 | SmallIndex::from_u32_unchecked(index as u32) |
142 | } |
143 | |
144 | /// Create a new small index from a `u32` without checking whether the |
145 | /// given value exceeds [`SmallIndex::MAX`]. |
146 | /// |
147 | /// Using this routine with an invalid index value will result in |
148 | /// unspecified behavior, but *not* undefined behavior. In particular, an |
149 | /// invalid index value is likely to cause panics or possibly even silent |
150 | /// logical errors. |
151 | /// |
152 | /// Callers must never rely on a `SmallIndex` to be within a certain range |
153 | /// for memory safety. |
154 | #[inline ] |
155 | pub const fn from_u32_unchecked(index: u32) -> SmallIndex { |
156 | SmallIndex(index) |
157 | } |
158 | |
159 | /// Like [`SmallIndex::new`], but panics if the given index is not valid. |
160 | #[inline ] |
161 | pub fn must(index: usize) -> SmallIndex { |
162 | SmallIndex::new(index).expect("invalid small index" ) |
163 | } |
164 | |
165 | /// Return this small index as a `usize`. This is guaranteed to never |
166 | /// overflow `usize`. |
167 | #[inline ] |
168 | pub const fn as_usize(&self) -> usize { |
169 | // FIXME: Use as_usize() once const functions in traits are stable. |
170 | self.0 as usize |
171 | } |
172 | |
173 | /// Return this small index as a `u64`. This is guaranteed to never |
174 | /// overflow. |
175 | #[inline ] |
176 | pub const fn as_u64(&self) -> u64 { |
177 | // FIXME: Use u64::from() once const functions in traits are stable. |
178 | self.0 as u64 |
179 | } |
180 | |
181 | /// Return the internal `u32` of this small index. This is guaranteed to |
182 | /// never overflow `u32`. |
183 | #[inline ] |
184 | pub const fn as_u32(&self) -> u32 { |
185 | self.0 |
186 | } |
187 | |
188 | /// Return the internal `u32` of this small index represented as an `i32`. |
189 | /// This is guaranteed to never overflow an `i32`. |
190 | #[inline ] |
191 | pub const fn as_i32(&self) -> i32 { |
192 | // This is OK because we guarantee that our max value is <= i32::MAX. |
193 | self.0 as i32 |
194 | } |
195 | |
196 | /// Returns one more than this small index as a usize. |
197 | /// |
198 | /// Since a small index has constraints on its maximum value, adding `1` to |
199 | /// it will always fit in a `usize`, `isize`, `u32` and a `i32`. |
200 | #[inline ] |
201 | pub fn one_more(&self) -> usize { |
202 | self.as_usize() + 1 |
203 | } |
204 | |
205 | /// Decode this small index from the bytes given using the native endian |
206 | /// byte order for the current target. |
207 | /// |
208 | /// If the decoded integer is not representable as a small index for the |
209 | /// current target, then this returns an error. |
210 | #[inline ] |
211 | pub fn from_ne_bytes( |
212 | bytes: [u8; 4], |
213 | ) -> Result<SmallIndex, SmallIndexError> { |
214 | let id = u32::from_ne_bytes(bytes); |
215 | if id > SmallIndex::MAX.as_u32() { |
216 | return Err(SmallIndexError { attempted: u64::from(id) }); |
217 | } |
218 | Ok(SmallIndex::new_unchecked(id.as_usize())) |
219 | } |
220 | |
221 | /// Decode this small index from the bytes given using the native endian |
222 | /// byte order for the current target. |
223 | /// |
224 | /// This is analogous to [`SmallIndex::new_unchecked`] in that is does not |
225 | /// check whether the decoded integer is representable as a small index. |
226 | #[inline ] |
227 | pub fn from_ne_bytes_unchecked(bytes: [u8; 4]) -> SmallIndex { |
228 | SmallIndex::new_unchecked(u32::from_ne_bytes(bytes).as_usize()) |
229 | } |
230 | |
231 | /// Return the underlying small index integer as raw bytes in native endian |
232 | /// format. |
233 | #[inline ] |
234 | pub fn to_ne_bytes(&self) -> [u8; 4] { |
235 | self.0.to_ne_bytes() |
236 | } |
237 | } |
238 | |
239 | impl<T> core::ops::Index<SmallIndex> for [T] { |
240 | type Output = T; |
241 | |
242 | #[inline ] |
243 | fn index(&self, index: SmallIndex) -> &T { |
244 | &self[index.as_usize()] |
245 | } |
246 | } |
247 | |
248 | impl<T> core::ops::IndexMut<SmallIndex> for [T] { |
249 | #[inline ] |
250 | fn index_mut(&mut self, index: SmallIndex) -> &mut T { |
251 | &mut self[index.as_usize()] |
252 | } |
253 | } |
254 | |
255 | impl<T> core::ops::Index<SmallIndex> for Vec<T> { |
256 | type Output = T; |
257 | |
258 | #[inline ] |
259 | fn index(&self, index: SmallIndex) -> &T { |
260 | &self[index.as_usize()] |
261 | } |
262 | } |
263 | |
264 | impl<T> core::ops::IndexMut<SmallIndex> for Vec<T> { |
265 | #[inline ] |
266 | fn index_mut(&mut self, index: SmallIndex) -> &mut T { |
267 | &mut self[index.as_usize()] |
268 | } |
269 | } |
270 | |
271 | impl From<StateID> for SmallIndex { |
272 | fn from(sid: StateID) -> SmallIndex { |
273 | sid.0 |
274 | } |
275 | } |
276 | |
277 | impl From<PatternID> for SmallIndex { |
278 | fn from(pid: PatternID) -> SmallIndex { |
279 | pid.0 |
280 | } |
281 | } |
282 | |
283 | impl From<u8> for SmallIndex { |
284 | fn from(index: u8) -> SmallIndex { |
285 | SmallIndex::new_unchecked(index:usize::from(index)) |
286 | } |
287 | } |
288 | |
289 | impl TryFrom<u16> for SmallIndex { |
290 | type Error = SmallIndexError; |
291 | |
292 | fn try_from(index: u16) -> Result<SmallIndex, SmallIndexError> { |
293 | if u32::from(index) > SmallIndex::MAX.as_u32() { |
294 | return Err(SmallIndexError { attempted: u64::from(index) }); |
295 | } |
296 | Ok(SmallIndex::new_unchecked(index:index.as_usize())) |
297 | } |
298 | } |
299 | |
300 | impl TryFrom<u32> for SmallIndex { |
301 | type Error = SmallIndexError; |
302 | |
303 | fn try_from(index: u32) -> Result<SmallIndex, SmallIndexError> { |
304 | if index > SmallIndex::MAX.as_u32() { |
305 | return Err(SmallIndexError { attempted: u64::from(index) }); |
306 | } |
307 | Ok(SmallIndex::new_unchecked(index:index.as_usize())) |
308 | } |
309 | } |
310 | |
311 | impl TryFrom<u64> for SmallIndex { |
312 | type Error = SmallIndexError; |
313 | |
314 | fn try_from(index: u64) -> Result<SmallIndex, SmallIndexError> { |
315 | if index > SmallIndex::MAX.as_u64() { |
316 | return Err(SmallIndexError { attempted: index }); |
317 | } |
318 | Ok(SmallIndex::new_unchecked(index:index.as_usize())) |
319 | } |
320 | } |
321 | |
322 | impl TryFrom<usize> for SmallIndex { |
323 | type Error = SmallIndexError; |
324 | |
325 | fn try_from(index: usize) -> Result<SmallIndex, SmallIndexError> { |
326 | if index > SmallIndex::MAX.as_usize() { |
327 | return Err(SmallIndexError { attempted: index.as_u64() }); |
328 | } |
329 | Ok(SmallIndex::new_unchecked(index)) |
330 | } |
331 | } |
332 | |
333 | /// This error occurs when a small index could not be constructed. |
334 | /// |
335 | /// This occurs when given an integer exceeding the maximum small index value. |
336 | /// |
337 | /// When the `std` feature is enabled, this implements the `Error` trait. |
338 | #[derive (Clone, Debug, Eq, PartialEq)] |
339 | pub struct SmallIndexError { |
340 | attempted: u64, |
341 | } |
342 | |
343 | impl SmallIndexError { |
344 | /// Returns the value that could not be converted to a small index. |
345 | pub fn attempted(&self) -> u64 { |
346 | self.attempted |
347 | } |
348 | } |
349 | |
350 | #[cfg (feature = "std" )] |
351 | impl std::error::Error for SmallIndexError {} |
352 | |
353 | impl core::fmt::Display for SmallIndexError { |
354 | fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result { |
355 | write!( |
356 | f, |
357 | "failed to create small index from {:?}, which exceeds {:?}" , |
358 | self.attempted(), |
359 | SmallIndex::MAX, |
360 | ) |
361 | } |
362 | } |
363 | |
364 | #[derive (Clone, Debug)] |
365 | pub(crate) struct SmallIndexIter { |
366 | rng: core::ops::Range<usize>, |
367 | } |
368 | |
369 | impl Iterator for SmallIndexIter { |
370 | type Item = SmallIndex; |
371 | |
372 | fn next(&mut self) -> Option<SmallIndex> { |
373 | if self.rng.start >= self.rng.end { |
374 | return None; |
375 | } |
376 | let next_id: usize = self.rng.start + 1; |
377 | let id: usize = core::mem::replace(&mut self.rng.start, src:next_id); |
378 | // new_unchecked is OK since we asserted that the number of |
379 | // elements in this iterator will fit in an ID at construction. |
380 | Some(SmallIndex::new_unchecked(index:id)) |
381 | } |
382 | } |
383 | |
384 | macro_rules! index_type_impls { |
385 | ($name:ident, $err:ident, $iter:ident, $withiter:ident) => { |
386 | impl $name { |
387 | /// The maximum value. |
388 | pub const MAX: $name = $name(SmallIndex::MAX); |
389 | |
390 | /// The total number of values that can be represented. |
391 | pub const LIMIT: usize = SmallIndex::LIMIT; |
392 | |
393 | /// The zero value. |
394 | pub const ZERO: $name = $name(SmallIndex::ZERO); |
395 | |
396 | /// The number of bytes that a single value uses in memory. |
397 | pub const SIZE: usize = SmallIndex::SIZE; |
398 | |
399 | /// Create a new value that is represented by a "small index." |
400 | /// |
401 | /// If the given index exceeds the maximum allowed value, then this |
402 | /// returns an error. |
403 | #[inline] |
404 | pub fn new(value: usize) -> Result<$name, $err> { |
405 | SmallIndex::new(value).map($name).map_err($err) |
406 | } |
407 | |
408 | /// Create a new value without checking whether the given argument |
409 | /// exceeds the maximum. |
410 | /// |
411 | /// Using this routine with an invalid value will result in |
412 | /// unspecified behavior, but *not* undefined behavior. In |
413 | /// particular, an invalid ID value is likely to cause panics or |
414 | /// possibly even silent logical errors. |
415 | /// |
416 | /// Callers must never rely on this type to be within a certain |
417 | /// range for memory safety. |
418 | #[inline] |
419 | pub const fn new_unchecked(value: usize) -> $name { |
420 | $name(SmallIndex::new_unchecked(value)) |
421 | } |
422 | |
423 | /// Create a new value from a `u32` without checking whether the |
424 | /// given value exceeds the maximum. |
425 | /// |
426 | /// Using this routine with an invalid value will result in |
427 | /// unspecified behavior, but *not* undefined behavior. In |
428 | /// particular, an invalid ID value is likely to cause panics or |
429 | /// possibly even silent logical errors. |
430 | /// |
431 | /// Callers must never rely on this type to be within a certain |
432 | /// range for memory safety. |
433 | #[inline] |
434 | pub const fn from_u32_unchecked(index: u32) -> $name { |
435 | $name(SmallIndex::from_u32_unchecked(index)) |
436 | } |
437 | |
438 | /// Like `new`, but panics if the given value is not valid. |
439 | #[inline] |
440 | pub fn must(value: usize) -> $name { |
441 | $name::new(value).expect(concat!( |
442 | "invalid " , |
443 | stringify!($name), |
444 | " value" |
445 | )) |
446 | } |
447 | |
448 | /// Return the internal value as a `usize`. This is guaranteed to |
449 | /// never overflow `usize`. |
450 | #[inline] |
451 | pub const fn as_usize(&self) -> usize { |
452 | self.0.as_usize() |
453 | } |
454 | |
455 | /// Return the internal value as a `u64`. This is guaranteed to |
456 | /// never overflow. |
457 | #[inline] |
458 | pub const fn as_u64(&self) -> u64 { |
459 | self.0.as_u64() |
460 | } |
461 | |
462 | /// Return the internal value as a `u32`. This is guaranteed to |
463 | /// never overflow `u32`. |
464 | #[inline] |
465 | pub const fn as_u32(&self) -> u32 { |
466 | self.0.as_u32() |
467 | } |
468 | |
469 | /// Return the internal value as a `i32`. This is guaranteed to |
470 | /// never overflow an `i32`. |
471 | #[inline] |
472 | pub const fn as_i32(&self) -> i32 { |
473 | self.0.as_i32() |
474 | } |
475 | |
476 | /// Returns one more than this value as a usize. |
477 | /// |
478 | /// Since values represented by a "small index" have constraints |
479 | /// on their maximum value, adding `1` to it will always fit in a |
480 | /// `usize`, `u32` and a `i32`. |
481 | #[inline] |
482 | pub fn one_more(&self) -> usize { |
483 | self.0.one_more() |
484 | } |
485 | |
486 | /// Decode this value from the bytes given using the native endian |
487 | /// byte order for the current target. |
488 | /// |
489 | /// If the decoded integer is not representable as a small index |
490 | /// for the current target, then this returns an error. |
491 | #[inline] |
492 | pub fn from_ne_bytes(bytes: [u8; 4]) -> Result<$name, $err> { |
493 | SmallIndex::from_ne_bytes(bytes).map($name).map_err($err) |
494 | } |
495 | |
496 | /// Decode this value from the bytes given using the native endian |
497 | /// byte order for the current target. |
498 | /// |
499 | /// This is analogous to `new_unchecked` in that is does not check |
500 | /// whether the decoded integer is representable as a small index. |
501 | #[inline] |
502 | pub fn from_ne_bytes_unchecked(bytes: [u8; 4]) -> $name { |
503 | $name(SmallIndex::from_ne_bytes_unchecked(bytes)) |
504 | } |
505 | |
506 | /// Return the underlying integer as raw bytes in native endian |
507 | /// format. |
508 | #[inline] |
509 | pub fn to_ne_bytes(&self) -> [u8; 4] { |
510 | self.0.to_ne_bytes() |
511 | } |
512 | |
513 | /// Returns an iterator over all values from 0 up to and not |
514 | /// including the given length. |
515 | /// |
516 | /// If the given length exceeds this type's limit, then this |
517 | /// panics. |
518 | pub(crate) fn iter(len: usize) -> $iter { |
519 | $iter::new(len) |
520 | } |
521 | } |
522 | |
523 | // We write our own Debug impl so that we get things like PatternID(5) |
524 | // instead of PatternID(SmallIndex(5)). |
525 | impl core::fmt::Debug for $name { |
526 | fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result { |
527 | f.debug_tuple(stringify!($name)).field(&self.as_u32()).finish() |
528 | } |
529 | } |
530 | |
531 | impl<T> core::ops::Index<$name> for [T] { |
532 | type Output = T; |
533 | |
534 | #[inline] |
535 | fn index(&self, index: $name) -> &T { |
536 | &self[index.as_usize()] |
537 | } |
538 | } |
539 | |
540 | impl<T> core::ops::IndexMut<$name> for [T] { |
541 | #[inline] |
542 | fn index_mut(&mut self, index: $name) -> &mut T { |
543 | &mut self[index.as_usize()] |
544 | } |
545 | } |
546 | |
547 | impl<T> core::ops::Index<$name> for Vec<T> { |
548 | type Output = T; |
549 | |
550 | #[inline] |
551 | fn index(&self, index: $name) -> &T { |
552 | &self[index.as_usize()] |
553 | } |
554 | } |
555 | |
556 | impl<T> core::ops::IndexMut<$name> for Vec<T> { |
557 | #[inline] |
558 | fn index_mut(&mut self, index: $name) -> &mut T { |
559 | &mut self[index.as_usize()] |
560 | } |
561 | } |
562 | |
563 | impl From<SmallIndex> for $name { |
564 | fn from(index: SmallIndex) -> $name { |
565 | $name(index) |
566 | } |
567 | } |
568 | |
569 | impl From<u8> for $name { |
570 | fn from(value: u8) -> $name { |
571 | $name(SmallIndex::from(value)) |
572 | } |
573 | } |
574 | |
575 | impl TryFrom<u16> for $name { |
576 | type Error = $err; |
577 | |
578 | fn try_from(value: u16) -> Result<$name, $err> { |
579 | SmallIndex::try_from(value).map($name).map_err($err) |
580 | } |
581 | } |
582 | |
583 | impl TryFrom<u32> for $name { |
584 | type Error = $err; |
585 | |
586 | fn try_from(value: u32) -> Result<$name, $err> { |
587 | SmallIndex::try_from(value).map($name).map_err($err) |
588 | } |
589 | } |
590 | |
591 | impl TryFrom<u64> for $name { |
592 | type Error = $err; |
593 | |
594 | fn try_from(value: u64) -> Result<$name, $err> { |
595 | SmallIndex::try_from(value).map($name).map_err($err) |
596 | } |
597 | } |
598 | |
599 | impl TryFrom<usize> for $name { |
600 | type Error = $err; |
601 | |
602 | fn try_from(value: usize) -> Result<$name, $err> { |
603 | SmallIndex::try_from(value).map($name).map_err($err) |
604 | } |
605 | } |
606 | |
607 | /// This error occurs when an ID could not be constructed. |
608 | /// |
609 | /// This occurs when given an integer exceeding the maximum allowed |
610 | /// value. |
611 | /// |
612 | /// When the `std` feature is enabled, this implements the `Error` |
613 | /// trait. |
614 | #[derive(Clone, Debug, Eq, PartialEq)] |
615 | pub struct $err(SmallIndexError); |
616 | |
617 | impl $err { |
618 | /// Returns the value that could not be converted to an ID. |
619 | pub fn attempted(&self) -> u64 { |
620 | self.0.attempted() |
621 | } |
622 | } |
623 | |
624 | #[cfg(feature = "std" )] |
625 | impl std::error::Error for $err {} |
626 | |
627 | impl core::fmt::Display for $err { |
628 | fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result { |
629 | write!( |
630 | f, |
631 | "failed to create {} from {:?}, which exceeds {:?}" , |
632 | stringify!($name), |
633 | self.attempted(), |
634 | $name::MAX, |
635 | ) |
636 | } |
637 | } |
638 | |
639 | #[derive(Clone, Debug)] |
640 | pub(crate) struct $iter(SmallIndexIter); |
641 | |
642 | impl $iter { |
643 | fn new(len: usize) -> $iter { |
644 | assert!( |
645 | len <= $name::LIMIT, |
646 | "cannot create iterator for {} when number of \ |
647 | elements exceed {:?}" , |
648 | stringify!($name), |
649 | $name::LIMIT, |
650 | ); |
651 | $iter(SmallIndexIter { rng: 0..len }) |
652 | } |
653 | } |
654 | |
655 | impl Iterator for $iter { |
656 | type Item = $name; |
657 | |
658 | fn next(&mut self) -> Option<$name> { |
659 | self.0.next().map($name) |
660 | } |
661 | } |
662 | |
663 | /// An iterator adapter that is like std::iter::Enumerate, but attaches |
664 | /// small index values instead. It requires `ExactSizeIterator`. At |
665 | /// construction, it ensures that the index of each element in the |
666 | /// iterator is representable in the corresponding small index type. |
667 | #[derive(Clone, Debug)] |
668 | pub(crate) struct $withiter<I> { |
669 | it: I, |
670 | ids: $iter, |
671 | } |
672 | |
673 | impl<I: Iterator + ExactSizeIterator> $withiter<I> { |
674 | fn new(it: I) -> $withiter<I> { |
675 | let ids = $name::iter(it.len()); |
676 | $withiter { it, ids } |
677 | } |
678 | } |
679 | |
680 | impl<I: Iterator + ExactSizeIterator> Iterator for $withiter<I> { |
681 | type Item = ($name, I::Item); |
682 | |
683 | fn next(&mut self) -> Option<($name, I::Item)> { |
684 | let item = self.it.next()?; |
685 | // Number of elements in this iterator must match, according |
686 | // to contract of ExactSizeIterator. |
687 | let id = self.ids.next().unwrap(); |
688 | Some((id, item)) |
689 | } |
690 | } |
691 | }; |
692 | } |
693 | |
694 | /// The identifier of a pattern in an Aho-Corasick automaton. |
695 | /// |
696 | /// It is represented by a `u32` even on 64-bit systems in order to conserve |
697 | /// space. Namely, on all targets, this type guarantees that its value will |
698 | /// fit in a `u32`, `i32`, `usize` and an `isize`. This means that on 16-bit |
699 | /// targets, for example, this type's maximum value will never overflow an |
700 | /// `isize`, which means it will never overflow a `i16` even though its |
701 | /// internal representation is still a `u32`. |
702 | /// |
703 | /// # Safety |
704 | /// |
705 | /// While a `PatternID` is meant to guarantee that its value fits into `usize` |
706 | /// without using as much space as a `usize` on all targets, callers must |
707 | /// not rely on this property for safety. Callers may choose to rely on this |
708 | /// property for correctness however. For example, creating a `StateID` with an |
709 | /// invalid value can be done in entirely safe code. This may in turn result in |
710 | /// panics or silent logical errors. |
711 | #[derive (Clone, Copy, Default, Eq, Hash, PartialEq, PartialOrd, Ord)] |
712 | #[repr (transparent)] |
713 | pub struct PatternID(SmallIndex); |
714 | |
715 | /// The identifier of a finite automaton state. |
716 | /// |
717 | /// It is represented by a `u32` even on 64-bit systems in order to conserve |
718 | /// space. Namely, on all targets, this type guarantees that its value will |
719 | /// fit in a `u32`, `i32`, `usize` and an `isize`. This means that on 16-bit |
720 | /// targets, for example, this type's maximum value will never overflow an |
721 | /// `isize`, which means it will never overflow a `i16` even though its |
722 | /// internal representation is still a `u32`. |
723 | /// |
724 | /// # Safety |
725 | /// |
726 | /// While a `StateID` is meant to guarantee that its value fits into `usize` |
727 | /// without using as much space as a `usize` on all targets, callers must |
728 | /// not rely on this property for safety. Callers may choose to rely on this |
729 | /// property for correctness however. For example, creating a `StateID` with an |
730 | /// invalid value can be done in entirely safe code. This may in turn result in |
731 | /// panics or silent logical errors. |
732 | #[derive (Clone, Copy, Default, Eq, Hash, PartialEq, PartialOrd, Ord)] |
733 | #[repr (transparent)] |
734 | pub struct StateID(SmallIndex); |
735 | |
736 | index_type_impls!(PatternID, PatternIDError, PatternIDIter, WithPatternIDIter); |
737 | index_type_impls!(StateID, StateIDError, StateIDIter, WithStateIDIter); |
738 | |
739 | /// A utility trait that defines a couple of adapters for making it convenient |
740 | /// to access indices as "small index" types. We require ExactSizeIterator so |
741 | /// that iterator construction can do a single check to make sure the index of |
742 | /// each element is representable by its small index type. |
743 | pub(crate) trait IteratorIndexExt: Iterator { |
744 | fn with_pattern_ids(self) -> WithPatternIDIter<Self> |
745 | where |
746 | Self: Sized + ExactSizeIterator, |
747 | { |
748 | WithPatternIDIter::new(self) |
749 | } |
750 | |
751 | fn with_state_ids(self) -> WithStateIDIter<Self> |
752 | where |
753 | Self: Sized + ExactSizeIterator, |
754 | { |
755 | WithStateIDIter::new(self) |
756 | } |
757 | } |
758 | |
759 | impl<I: Iterator> IteratorIndexExt for I {} |
760 | |