//! # nom, eating data byte by byte
//!
//! nom is a parser combinator library with a focus on safe parsing,
//! streaming patterns, and, as much as possible, zero-copy parsing.
//!
//! ## Example
//!
//! ```rust
//! use nom::{
//!   IResult,
//!   bytes::complete::{tag, take_while_m_n},
//!   combinator::map_res,
//!   sequence::tuple};
//!
//! #[derive(Debug,PartialEq)]
//! pub struct Color {
//!   pub red: u8,
//!   pub green: u8,
//!   pub blue: u8,
//! }
//!
//! fn from_hex(input: &str) -> Result<u8, std::num::ParseIntError> {
//!   u8::from_str_radix(input, 16)
//! }
//!
//! fn is_hex_digit(c: char) -> bool {
//!   c.is_digit(16)
//! }
//!
//! fn hex_primary(input: &str) -> IResult<&str, u8> {
//!   map_res(
//!     take_while_m_n(2, 2, is_hex_digit),
//!     from_hex
//!   )(input)
//! }
//!
//! fn hex_color(input: &str) -> IResult<&str, Color> {
//!   let (input, _) = tag("#")(input)?;
//!   let (input, (red, green, blue)) = tuple((hex_primary, hex_primary, hex_primary))(input)?;
//!
//!   Ok((input, Color { red, green, blue }))
//! }
//!
//! fn main() {
//!   assert_eq!(hex_color("#2F14DF"), Ok(("", Color {
//!     red: 47,
//!     green: 20,
//!     blue: 223,
//!   })));
//! }
//! ```
//!
//! The code is available on [Github](https://github.com/Geal/nom)
//!
//! There are a few [guides](https://github.com/Geal/nom/tree/main/doc) with more details
//! about [how to write parsers](https://github.com/Geal/nom/blob/main/doc/making_a_new_parser_from_scratch.md),
//! or the [error management system](https://github.com/Geal/nom/blob/main/doc/error_management.md).
//! You can also check out the [recipes] module that contains examples of common patterns.
//!
//! **Looking for a specific combinator? Read the
//! ["choose a combinator" guide](https://github.com/Geal/nom/blob/main/doc/choosing_a_combinator.md)**
//!
//! If you are upgrading to nom 5.0, please read the
//! [migration document](https://github.com/Geal/nom/blob/main/doc/upgrading_to_nom_5.md).
//!
//! ## Parser combinators
//!
//! Parser combinators are an approach to parsers that is very different from
//! software like [lex](https://en.wikipedia.org/wiki/Lex_(software)) and
//! [yacc](https://en.wikipedia.org/wiki/Yacc). Instead of writing the grammar
//! in a separate syntax and generating the corresponding code, you use very small
//! functions with very specific purposes, like "take 5 bytes", or "recognize the
//! word 'HTTP'", and assemble them in meaningful patterns like "recognize
//! 'HTTP', then a space, then a version".
//! The resulting code is small, and looks like the grammar you would have
//! written with other parser approaches.
//!
//! This gives us a few advantages:
//!
//! - The parsers are small and easy to write
//! - The parser components are easy to reuse (if they're general enough, please add them to nom!)
//! - The parser components are easy to test separately (unit tests and property-based tests)
//! - The parser combination code looks close to the grammar you would have written
//! - You can build partial parsers, specific to the data you need at the moment, and ignore the rest
//!
//! Here is an example of one such parser, to recognize text between parentheses:
//!
//! ```rust
//! use nom::{
//!   IResult,
//!   sequence::delimited,
//!   // see the "streaming/complete" paragraph lower for an explanation of these submodules
//!   character::complete::char,
//!   bytes::complete::is_not
//! };
//!
//! fn parens(input: &str) -> IResult<&str, &str> {
//!   delimited(char('('), is_not(")"), char(')'))(input)
//! }
//! ```
//!
//! It defines a function named `parens` which will recognize a sequence of the
//! character `(`, the longest sequence of characters not containing `)`, then
//! the character `)`, and will return the slice in the middle.
//!
//! Here is another parser, written without using nom's combinators this time:
//!
//! ```rust
//! use nom::{IResult, Err, Needed};
//!
//! # fn main() {
//! fn take4(i: &[u8]) -> IResult<&[u8], &[u8]> {
//!   if i.len() < 4 {
//!     Err(Err::Incomplete(Needed::new(4)))
//!   } else {
//!     Ok((&i[4..], &i[0..4]))
//!   }
//! }
//! # }
//! ```
//!
//! This function takes a byte array as input, and tries to consume 4 bytes.
//! Writing all the parsers manually, like this, is dangerous, despite Rust's
//! safety features. There are still a lot of mistakes one can make. That's why
//! nom provides a list of functions to help in developing parsers.
//!
//! With functions, you would write it like this:
//!
//! ```rust
//! use nom::{IResult, bytes::streaming::take};
//! fn take4(input: &[u8]) -> IResult<&[u8], &[u8]> {
//!   take(4u8)(input)
//! }
//! ```
//!
//! A parser in nom is a function which, for an input type `I`, an output type `O`
//! and an optional error type `E`, will have the following signature:
//!
//! ```rust,compile_fail
//! fn parser(input: I) -> IResult<I, O, E>;
//! ```
//!
//! Or like this, if you don't want to specify a custom error type (it will be
//! `Error<I>` by default, which pairs the input position with an `ErrorKind`):
//!
//! ```rust,compile_fail
//! fn parser(input: I) -> IResult<I, O>;
//! ```
//!
//! `IResult` is an alias for the `Result` type:
//!
//! ```rust
//! use nom::{Needed, error::Error};
//!
//! type IResult<I, O, E = Error<I>> = Result<(I, O), Err<E>>;
//!
//! enum Err<E> {
//!   Incomplete(Needed),
//!   Error(E),
//!   Failure(E),
//! }
//! ```
//!
//! It can have the following values:
//!
//! - A correct result `Ok((I,O))` with the first element being the remaining input (not parsed yet), and the second the output value;
//! - An error `Err(Err::Error(c))` with `c` an error that can be built from the input position and a parser-specific error;
//! - An error `Err(Err::Incomplete(Needed))` indicating that more input is necessary. `Needed` can indicate how much data is needed;
//! - An error `Err(Err::Failure(c))`. It works like the `Error` case, except it indicates an unrecoverable error: we cannot backtrack and test another parser.
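//!
//! As a minimal sketch (not an API in itself, just ordinary pattern matching), a caller
//! can branch on these variants with a `match`; the `take4` parser here is a hypothetical
//! example built from `bytes::streaming::take`:
//!
//! ```rust
//! use nom::{Err, IResult, bytes::streaming::take};
//!
//! fn take4(input: &[u8]) -> IResult<&[u8], &[u8]> {
//!   take(4u8)(input)
//! }
//!
//! match take4(&b"abc"[..]) {
//!   // all 4 bytes were available: `rest` is the unparsed remainder
//!   Ok((rest, output)) => println!("parsed {:?}, {:?} remains", output, rest),
//!   // streaming parser: 3 bytes is not enough, so this arm is taken
//!   Err(Err::Incomplete(needed)) => println!("need more data: {:?}", needed),
//!   // recoverable error: another alternative could be tried
//!   Err(Err::Error(e)) => println!("parse error: {:?}", e),
//!   // unrecoverable error: no backtracking
//!   Err(Err::Failure(e)) => println!("parse failure: {:?}", e),
//! }
//! ```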
//!
//! Please refer to the ["choose a combinator" guide](https://github.com/Geal/nom/blob/main/doc/choosing_a_combinator.md) for an exhaustive list of parsers.
//! See also the rest of the documentation [here](https://github.com/Geal/nom/blob/main/doc).
//!
//! ## Making new parsers with function combinators
//!
//! nom is based on functions that generate parsers, with a signature like
//! this: `(arguments) -> impl Fn(Input) -> IResult<Input, Output, Error>`.
//! The arguments of a combinator can be direct values (like `take`, which takes
//! a number of bytes or characters as argument) or even other parsers (like
//! `delimited`, which takes 3 parsers as arguments and returns the result of
//! the second one if all are successful).
//!
//! Here are some examples:
//!
//! ```rust
//! use nom::IResult;
//! use nom::bytes::complete::{tag, take};
//! fn abcd_parser(i: &str) -> IResult<&str, &str> {
//!   tag("abcd")(i) // will consume bytes if the input begins with "abcd"
//! }
//!
//! fn take_10(i: &[u8]) -> IResult<&[u8], &[u8]> {
//!   take(10u8)(i) // will consume and return 10 bytes of input
//! }
//! ```
//!
//! ## Combining parsers
//!
//! There are higher level patterns, like the **`alt`** combinator, which
//! provides a choice between multiple parsers. If one branch fails, it tries
//! the next, and returns the result of the first parser that succeeds:
//!
//! ```rust
//! use nom::IResult;
//! use nom::branch::alt;
//! use nom::bytes::complete::tag;
//! use nom::error::{Error, ErrorKind};
//!
//! let mut alt_tags = alt((tag("abcd"), tag("efgh")));
//!
//! assert_eq!(alt_tags(&b"abcdxxx"[..]), Ok((&b"xxx"[..], &b"abcd"[..])));
//! assert_eq!(alt_tags(&b"efghxxx"[..]), Ok((&b"xxx"[..], &b"efgh"[..])));
//! assert_eq!(alt_tags(&b"ijklxxx"[..]), Err(nom::Err::Error(Error::new(&b"ijklxxx"[..], ErrorKind::Tag))));
//! ```
//!
//! The **`opt`** combinator makes a parser optional. If the child parser returns
//! an error, **`opt`** will still succeed and return `None`:
//!
//! ```rust
//! use nom::{IResult, combinator::opt, bytes::complete::tag};
//! fn abcd_opt(i: &[u8]) -> IResult<&[u8], Option<&[u8]>> {
//!   opt(tag("abcd"))(i)
//! }
//!
//! assert_eq!(abcd_opt(&b"abcdxxx"[..]), Ok((&b"xxx"[..], Some(&b"abcd"[..]))));
//! assert_eq!(abcd_opt(&b"efghxxx"[..]), Ok((&b"efghxxx"[..], None)));
//! ```
//!
//! **`many0`** applies a parser 0 or more times, and returns a vector of the aggregated results:
//!
//! ```rust
//! # #[cfg(feature = "alloc")]
//! # fn main() {
//! use nom::{IResult, multi::many0, bytes::complete::tag};
//! use std::str;
//!
//! fn multi(i: &str) -> IResult<&str, Vec<&str>> {
//!   many0(tag("abcd"))(i)
//! }
//!
//! let a = "abcdef";
//! let b = "abcdabcdef";
//! let c = "azerty";
//! assert_eq!(multi(a), Ok(("ef", vec!["abcd"])));
//! assert_eq!(multi(b), Ok(("ef", vec!["abcd", "abcd"])));
//! assert_eq!(multi(c), Ok(("azerty", Vec::new())));
//! # }
//! # #[cfg(not(feature = "alloc"))]
//! # fn main() {}
//! ```
//!
//! Here are some basic combinators available:
//!
//! - **`opt`**: Will make the parser optional (if it returns the `O` type, the new parser returns `Option<O>`)
//! - **`many0`**: Will apply the parser 0 or more times (if it returns the `O` type, the new parser returns `Vec<O>`)
//! - **`many1`**: Will apply the parser 1 or more times
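//!
//! As a brief sketch of the difference (mirroring the `many0` example above), **`many1`**
//! succeeds on the same inputs as `many0` except when the child parser never matches,
//! in which case it returns an error instead of an empty vector:
//!
//! ```rust
//! # #[cfg(feature = "alloc")]
//! # fn main() {
//! use nom::{IResult, multi::many1, bytes::complete::tag};
//!
//! fn multi1(i: &str) -> IResult<&str, Vec<&str>> {
//!   many1(tag("abcd"))(i)
//! }
//!
//! assert_eq!(multi1("abcdabcdef"), Ok(("ef", vec!["abcd", "abcd"])));
//! // zero matches: `many1` fails where `many0` would return an empty Vec
//! assert!(multi1("azerty").is_err());
//! # }
//! # #[cfg(not(feature = "alloc"))]
//! # fn main() {}
//! ```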
//!
//! There are more complex (and more useful) parsers like `tuple`, which is
//! used to apply a series of parsers then assemble their results.
//!
//! Example with `tuple`:
//!
//! ```rust
//! # fn main() {
//! use nom::{error::{Error, ErrorKind}, Needed,
//!   number::streaming::be_u16,
//!   bytes::streaming::{tag, take},
//!   sequence::tuple};
//!
//! let mut tpl = tuple((be_u16, take(3u8), tag("fg")));
//!
//! assert_eq!(
//!   tpl(&b"abcdefgh"[..]),
//!   Ok((
//!     &b"h"[..],
//!     (0x6162u16, &b"cde"[..], &b"fg"[..])
//!   ))
//! );
//! assert_eq!(tpl(&b"abcde"[..]), Err(nom::Err::Incomplete(Needed::new(2))));
//! let input = &b"abcdejk"[..];
//! assert_eq!(tpl(input), Err(nom::Err::Error(Error::new(&input[5..], ErrorKind::Tag))));
//! # }
//! ```
//!
//! But you can also use a sequence of combinators written in imperative style,
//! thanks to the `?` operator:
//!
//! ```rust
//! # fn main() {
//! use nom::{IResult, bytes::complete::tag};
//!
//! #[derive(Debug, PartialEq)]
//! struct A {
//!   a: u8,
//!   b: u8
//! }
//!
//! fn ret_int1(i: &[u8]) -> IResult<&[u8], u8> { Ok((i, 1)) }
//! fn ret_int2(i: &[u8]) -> IResult<&[u8], u8> { Ok((i, 2)) }
//!
//! fn f(i: &[u8]) -> IResult<&[u8], A> {
//!   // if successful, the parser returns `Ok((remaining_input, output_value))` that we can destructure
//!   let (i, _) = tag("abcd")(i)?;
//!   let (i, a) = ret_int1(i)?;
//!   let (i, _) = tag("efgh")(i)?;
//!   let (i, b) = ret_int2(i)?;
//!
//!   Ok((i, A { a, b }))
//! }
//!
//! let r = f(b"abcdefghX");
//! assert_eq!(r, Ok((&b"X"[..], A { a: 1, b: 2 })));
//! # }
//! ```
//!
//! ## Streaming / Complete
//!
//! Some of nom's modules have `streaming` or `complete` submodules. They hold
//! different variants of the same combinators.
//!
//! A streaming parser assumes that we might not have all of the input data.
//! This can happen with network protocols or large file parsers, where the
//! input buffer can be full and need to be resized or refilled.
//!
//! A complete parser assumes that we already have all of the input data.
//! This will be the common case with small files that can be read entirely
//! into memory.
//!
//! Here is how it works in practice:
//!
//! ```rust
//! use nom::{IResult, Err, Needed, error::{Error, ErrorKind}, bytes, character};
//!
//! fn take_streaming(i: &[u8]) -> IResult<&[u8], &[u8]> {
//!   bytes::streaming::take(4u8)(i)
//! }
//!
//! fn take_complete(i: &[u8]) -> IResult<&[u8], &[u8]> {
//!   bytes::complete::take(4u8)(i)
//! }
//!
//! // both parsers will take 4 bytes as expected
//! assert_eq!(take_streaming(&b"abcde"[..]), Ok((&b"e"[..], &b"abcd"[..])));
//! assert_eq!(take_complete(&b"abcde"[..]), Ok((&b"e"[..], &b"abcd"[..])));
//!
//! // if the input is smaller than 4 bytes, the streaming parser
//! // will return `Incomplete` to indicate that we need more data
//! assert_eq!(take_streaming(&b"abc"[..]), Err(Err::Incomplete(Needed::new(1))));
//!
//! // but the complete parser will return an error
//! assert_eq!(take_complete(&b"abc"[..]), Err(Err::Error(Error::new(&b"abc"[..], ErrorKind::Eof))));
//!
//! // the alpha0 function recognizes 0 or more alphabetic characters
//! fn alpha0_streaming(i: &str) -> IResult<&str, &str> {
//!   character::streaming::alpha0(i)
//! }
//!
//! fn alpha0_complete(i: &str) -> IResult<&str, &str> {
//!   character::complete::alpha0(i)
//! }
//!
//! // if there's a clear limit to the recognized characters, both parsers work the same way
//! assert_eq!(alpha0_streaming("abcd;"), Ok((";", "abcd")));
//! assert_eq!(alpha0_complete("abcd;"), Ok((";", "abcd")));
//!
//! // but when there's no limit, the streaming version returns `Incomplete`, because it cannot
//! // know if more input data should be recognized. The whole input could be "abcd;", or
//! // "abcde;"
//! assert_eq!(alpha0_streaming("abcd"), Err(Err::Incomplete(Needed::new(1))));
//!
//! // while the complete version knows that all of the data is there
//! assert_eq!(alpha0_complete("abcd"), Ok(("", "abcd")));
//! ```
//!
//! **Going further:** Read the [guides](https://github.com/Geal/nom/tree/main/doc),
//! check out the [recipes]!
#![cfg_attr(not(feature = "std"), no_std)]
#![cfg_attr(feature = "cargo-clippy", allow(clippy::doc_markdown))]
#![cfg_attr(feature = "docsrs", feature(doc_cfg))]
#![cfg_attr(feature = "docsrs", feature(extended_key_value_attributes))]
#![deny(missing_docs)]
#![cfg_attr(nightly, warn(rustdoc::missing_doc_code_examples))]
#[cfg(feature = "alloc")]
#[macro_use]
extern crate alloc;
#[cfg(doctest)]
extern crate doc_comment;

#[cfg(doctest)]
doc_comment::doctest!("../README.md");

/// Lib module to re-export everything needed from `std` or `core`/`alloc`. This is how `serde` does
/// it, albeit there it is not public.
#[cfg_attr(nightly, allow(rustdoc::missing_doc_code_examples))]
pub mod lib {
  /// `std` facade allowing `std`/`core` to be interchangeable. Reexports `alloc` crate optionally,
  /// as well as `core` or `std`
  #[cfg(not(feature = "std"))]
  #[cfg_attr(nightly, allow(rustdoc::missing_doc_code_examples))]
  /// internal std exports for no_std compatibility
  pub mod std {
    #[doc(hidden)]
    #[cfg(not(feature = "alloc"))]
    pub use core::borrow;

    #[cfg(feature = "alloc")]
    #[doc(hidden)]
    pub use alloc::{borrow, boxed, string, vec};

    #[doc(hidden)]
    pub use core::{cmp, convert, fmt, iter, mem, ops, option, result, slice, str};

    /// internal reproduction of std prelude
    #[doc(hidden)]
    pub mod prelude {
      pub use core::prelude as v1;
    }
  }

  #[cfg(feature = "std")]
  #[cfg_attr(nightly, allow(rustdoc::missing_doc_code_examples))]
  /// internal std exports for no_std compatibility
  pub mod std {
    #[doc(hidden)]
    pub use std::{
      alloc, borrow, boxed, cmp, collections, convert, fmt, hash, iter, mem, ops, option, result,
      slice, str, string, vec,
    };

    /// internal reproduction of std prelude
    #[doc(hidden)]
    pub mod prelude {
      pub use std::prelude as v1;
    }
  }
}

pub use self::bits::*;
pub use self::internal::*;
pub use self::traits::*;

pub use self::str::*;

#[macro_use]
mod macros;
#[macro_use]
pub mod error;

pub mod branch;
pub mod combinator;
mod internal;
pub mod multi;
pub mod sequence;
mod traits;

pub mod bits;
pub mod bytes;

pub mod character;

mod str;

pub mod number;
#[cfg(feature = "docsrs")]
#[cfg_attr(feature = "docsrs", doc = include_str!("../doc/nom_recipes.md"))]
pub mod recipes {}