1 | */*!* |

2 | *Provides direct access to NFA implementations of Aho-Corasick.* |

3 | |

4 | *The principle characteristic of an NFA in this crate is that it may* |

5 | *transition through multiple states per byte of haystack. In Aho-Corasick* |

6 | *parlance, NFAs follow failure transitions during a search. In contrast,* |

7 | *a *[`DFA`](crate::dfa::DFA)* pre-computes all failure transitions during* |

8 | *compilation at the expense of a much bigger memory footprint.* |

9 | |

10 | *Currently, there are two NFA implementations provided: noncontiguous and* |

11 | *contiguous. The names reflect their internal representation, and consequently,* |

12 | *the trade offs associated with them:* |

13 | |

14 | ** A *[`noncontiguous::NFA`]* uses a separate allocation for every NFA state to* |

15 | *represent its transitions in a sparse format. This is ideal for building an* |

16 | *NFA, since it cheaply permits different states to have a different number of* |

17 | *transitions. A noncontiguous NFA is where the main Aho-Corasick construction* |

18 | *algorithm is implemented. All other Aho-Corasick implementations are built by* |

19 | *first constructing a noncontiguous NFA.* |

20 | ** A *[`contiguous::NFA`]* is uses a single allocation to represent all states,* |

21 | *while still encoding most states as sparse states but permitting states near* |

22 | *the starting state to have a dense representation. The dense representation* |

23 | *uses more memory, but permits computing transitions during a search more* |

24 | *quickly. By only making the most active states dense (the states near the* |

25 | *starting state), a contiguous NFA better balances memory usage with search* |

26 | *speed. The single contiguous allocation also uses less overhead per state and* |

27 | *enables compression tricks where most states only use 8 bytes of heap memory.* |

28 | |

29 | *When given the choice between these two, you almost always want to pick a* |

30 | *contiguous NFA. It takes only a little longer to build, but both its memory* |

31 | *usage and search speed are typically much better than a noncontiguous NFA. A* |

32 | *noncontiguous NFA is useful when prioritizing build times, or when there are* |

33 | *so many patterns that a contiguous NFA could not be built. (Currently, because* |

34 | *of both memory and search speed improvements, a contiguous NFA has a smaller* |

35 | *internal limit on the total number of NFA states it can represent. But you* |

36 | *would likely need to have hundreds of thousands or even millions of patterns* |

37 | *before you hit this limit.)* |

38 | **/* |

39 | **pub** **mod** contiguous; |

40 | **pub** **mod** noncontiguous; |

41 | |