As we'll be using it throughout the various scanner/* modules, not just
on the scanner struct itself.
This commit also improves eat_whitespace to chomp any valid YAML
whitespace, not just newlines and spaces.
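
For illustration only, the widened behaviour could look roughly like the
sketch below. It assumes the YAML 1.2 character set (space and tab as
white space, LF and CR as line breaks); the signature is hypothetical
and is not the one used in this repo.

```rust
/// Hypothetical sketch: count the leading bytes of `buffer` that are
/// YAML white space (space, tab) or line breaks (LF, CR).
fn eat_whitespace(buffer: &str) -> usize {
    buffer
        .bytes()
        .take_while(|b| matches!(b, b' ' | b'\t' | b'\n' | b'\r'))
        .count()
}
```

A caller would then skip that many bytes before scanning the next token.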
Split out tag scanning functions into their own module.
This commit includes two functions, scan_tag_uri and scan_tag_handle,
for processing tag prefixes/suffixes and handles respectively.
Note that these functions by themselves cannot properly parse either %TAG
directives or YAML node tags, but higher-level functions can use them to
correctly scan both.
This makes the naming more consistent: we now have isBreak! variants for
line breaks, isBlank! variants for space, and isWhiteSpace! variants for
both.
This commit also adds $error variants to isBlank! and isBreak! to remain
consistent with isWhiteSpace!.
We can no longer uphold the Iterator contract on Scanner directly,
because next_token now requires a scratch space handle, so the iterator
impl moves onto a separate struct, ScanIter.
For the moment, this struct is private but this may change in the
future.
This commit takes the first steps towards the final API of Scanner,
wherein it returns Result<Ref>s rather than Result<Token>s. This change
allows
the struct to access a scratch space which it can and will use when
borrowing from the underlying data is impossible, such as when
encountering escape sequences (which must be unescaped), line joining,
or data (type) transformation.
Regardless, these changes were required to correctly handle escape
sequences in tag directives.
Note that the test suite for lib/scanner is broken as of this commit; it
will be fixed in the next.
This function unescapes percent-encoded tag URIs, in accordance with
Section 5.6, Miscellaneous Characters (#ns-uri-char).
- Also add a unit test for the function
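
As a rough illustration of the unescaping step, here is a minimal
sketch. The names unescape_tag_uri and hex_value, the Vec<u8> scratch
buffer and the Option return type are all assumptions made for this
example, not the function added by this commit.

```rust
/// Hypothetical sketch: decode `%XX` escapes in `input`, appending the
/// resulting raw bytes to `scratch`. Returns None if an escape is
/// truncated or contains a non-hex digit.
fn unescape_tag_uri(input: &str, scratch: &mut Vec<u8>) -> Option<()> {
    let bytes = input.as_bytes();
    let mut i = 0;

    while i < bytes.len() {
        if bytes[i] == b'%' {
            // A '%' must be followed by exactly two hex digits
            let hi = hex_value(*bytes.get(i + 1)?)?;
            let lo = hex_value(*bytes.get(i + 2)?)?;
            scratch.push((hi << 4) | lo);
            i += 3;
        } else {
            // Everything else is copied through unchanged
            scratch.push(bytes[i]);
            i += 1;
        }
    }

    Some(())
}

/// Map an ASCII hex digit to its numeric value.
fn hex_value(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}
```

The sketch only decodes; it does not check that the remaining bytes are
valid ns-uri-char characters.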
Ref is a type that allows us to discriminate between different
lifetimes; specifically, whether the underlying Token is borrowed from
the data or borrowed from the scratch space.
This buys us the ability to attempt zero copy deserialization, but fall
back to copying if required.
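
To make the borrow-or-copy idea concrete, here is a simplified sketch.
It wraps &str instead of Token, and the variant names, the scan_scalar
function and its toy escape handling are assumptions for illustration,
not the definitions in this repo.

```rust
/// Hypothetical shape for Ref: records which buffer the value borrows from.
#[derive(Debug, PartialEq)]
enum Ref<'de, 'scratch> {
    /// Borrowed directly from the underlying data (zero copy)
    Borrow(&'de str),
    /// Borrowed from the scratch space, after copying
    Copy(&'scratch str),
}

/// Hypothetical scanner step: borrow when the data can be used as-is,
/// otherwise unescape into `scratch` and borrow from there.
fn scan_scalar<'de, 'scratch>(
    data: &'de str,
    scratch: &'scratch mut String,
) -> Ref<'de, 'scratch> {
    if !data.contains('\\') {
        return Ref::Borrow(data);
    }

    // Remember where our output starts; scratch may already hold data
    let start = scratch.len();
    let mut chars = data.chars();

    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.next() {
                Some('n') => scratch.push('\n'),
                Some('t') => scratch.push('\t'),
                Some(other) => scratch.push(other),
                None => scratch.push('\\'),
            },
            _ => scratch.push(c),
        }
    }

    Ref::Copy(&scratch[start..])
}
```

Callers that only need to read the value can treat both variants alike,
while callers that want to keep it beyond the scratch borrow can tell
from the variant whether a copy is needed.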
- On suspicious else, this is simply part of how I format this repo,
so this style lint will be ignored
- On the manual range impl, inclusive ranges in Rust are significantly
slower than writing the comparisons out by hand
- cow! is useful for non-test code
- check! is a macro for determining if a buffer has the given byte at
the given pos (default 0)
- also add an advance! variant that returns the removed slice
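
A rough sketch of what such macros might look like follows. The
argument syntax here is an assumption chosen for brevity and may differ
from the real check! and advance! in this repo; the buffer is assumed
to be a &str.

```rust
/// Hypothetical check!: does `buf` hold `byte` at `pos` (default 0)?
macro_rules! check {
    ($buf:expr, $byte:expr) => {
        check!($buf, $byte, 0)
    };
    ($buf:expr, $byte:expr, $pos:expr) => {
        ($buf).as_bytes().get($pos) == Some(&$byte)
    };
}

/// Hypothetical advance! variant: move `buf` forward by `n` bytes and
/// evaluate to the slice that was removed. Panics if `n` is out of
/// bounds or not on a char boundary, as str::split_at does.
macro_rules! advance {
    ($buf:ident, $n:expr) => {{
        let (head, tail) = $buf.split_at($n);
        $buf = tail;
        head
    }};
}
```

With a mutable binding such as `let mut buf = "!!str value";`,
`check!(buf, b'!')` tests the first byte and `advance!(buf, 2)` yields
the removed "!!" while leaving buf at "str value".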