110 commits

Author SHA1 Message Date
Paul Stemmet 53a8c8eccb scanner/key: adjustments for the API changes 2021-07-25 12:41:57 +01:00
Paul Stemmet 8986f36f00 scanner/tag: remove ref, return owned Slice variants
This is part of a set of API changes I'll be making, which will allow
allocations in the scanner code. This change is being made for a few
reasons.

1. Allows me to make the Scanner API nicer, as callers will only need
   to pass in the underlying data being scanned, and will not be tied to
   a mutable lifetime which limits them to scanning tokens one at a
   time.
2. Makes the code simpler, as I no longer need to ensure the mutable
   'owned' lifetime is honored throughout the call stack.
3. I'll need to allocate anyway for the indentation stack, and thus not
   allocating in other places that are sensible is less important.
2021-07-25 12:41:57 +01:00
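The trade-off described in 8986f36f00 (borrow where possible, allocate where needed) can be sketched with the standard library's Cow type. This is only an illustration: the Slice alias and the unescape_single_quoted function are hypothetical names, and the commit does not show how the real Slice is defined. The example uses the YAML rule that '' inside a single-quoted scalar stands for one quote character.

```rust
use std::borrow::Cow;

/// Hypothetical alias: a string slice that is either borrowed from the
/// underlying data or owned (allocated) by the scanner.
pub type Slice<'de> = Cow<'de, str>;

/// Unescape the body of a single-quoted scalar, where '' stands for '.
/// Borrows when no escape is present and allocates only when it must.
pub fn unescape_single_quoted(body: &str) -> Slice<'_> {
    if body.contains("''") {
        Cow::Owned(body.replace("''", "'"))
    } else {
        Cow::Borrowed(body)
    }
}
```

Because the result is not tied to a mutable scratch buffer, a caller can hold several such values at once.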
Paul Stemmet 8f84972cd5 scalar: tidy syntax / includes 2021-07-25 12:41:57 +01:00
Paul Stemmet ea64559444 lib/token: add marker
This commit adds a Marker enum which mirrors the variants of Token, but
is data-less.
2021-07-25 12:41:57 +01:00
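A data-less mirror enum like the one described in ea64559444 might look roughly like the sketch below. The variant list is a simplified assumption (the real Token has more variants), and the From conversion is a hypothetical convenience that the commit does not mention.

```rust
/// Hypothetical, simplified Token; the real enum has more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum Token<'a> {
    StreamStart,
    StreamEnd,
    Anchor(&'a str),
    Scalar(&'a str),
}

/// Mirrors the variants of Token, but carries no data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Marker {
    StreamStart,
    StreamEnd,
    Anchor,
    Scalar,
}

impl<'a> From<&Token<'a>> for Marker {
    /// Drop the payload and keep only the kind of token.
    fn from(token: &Token<'a>) -> Self {
        match token {
            Token::StreamStart => Marker::StreamStart,
            Token::StreamEnd => Marker::StreamEnd,
            Token::Anchor(_) => Marker::Anchor,
            Token::Scalar(_) => Marker::Scalar,
        }
    }
}
```

A Marker is Copy and has no lifetime, so it can be stored or compared without holding on to the scanned data.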
Paul Stemmet 37278ab219 WIP 2021-07-25 12:41:57 +01:00
Paul Stemmet cd7859f3d4 lib/scanner: save the scanned scalar's stats 2021-07-25 12:41:57 +01:00
Paul Stemmet 696b71b083 lib/scanner: remove reset_stale_keys, dbgs 2021-07-25 12:41:57 +01:00
Paul Stemmet edc70ed81a WIP 2021-07-25 12:41:57 +01:00
Paul Stemmet 7857839d6c scanner/scalar: add tests to catch trailing ws bugs 2021-07-25 12:41:57 +01:00
Paul Stemmet e416e3d0ab scanner/scalar: bugfix always count whitespace 2021-07-25 12:41:57 +01:00
Paul Stemmet a59527944b lib/scanner: add value token scanner, track keys 2021-07-25 12:41:57 +01:00
Paul Stemmet a0e184431f scanner/key: add, structs for managing key tokens
This module contains the beginnings of the state required to track and
store tokens which may be "found" out of sequence, notably when we first
need to parse a scalar (Token #1) then check if a value sequence follows
it (Token #2), and if so, return a key token first (Token #3), where the
correct order of tokens is:

    Token #3 -> Token #1 -> Token #2

We also need to track in the scanner whether a key is even possible, e.g.
if we just parsed a value token, the next scalar _is not_ a key.
2021-07-25 12:41:57 +01:00
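The out-of-sequence ordering described in a0e184431f can be sketched with a small token queue. Everything here is hypothetical (TokenQueue, push_scalar, push_value, drain, and the simplified Token); it only illustrates the idea of remembering where a possible key started and inserting the key token in front of it once a value indicator is seen.

```rust
use std::collections::VecDeque;

/// Hypothetical, simplified token type.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Key,
    Value,
    Scalar(String),
}

/// Hypothetical queue for tokens that may be found out of sequence.
#[derive(Debug, Default)]
pub struct TokenQueue {
    tokens: VecDeque<Token>,
    /// Position of a saved scalar that may still turn out to be a key.
    possible_key: Option<usize>,
}

impl TokenQueue {
    /// Token #1: save a scalar, remembering its position if a key is allowed here.
    pub fn push_scalar(&mut self, value: &str, key_allowed: bool) {
        self.possible_key = if key_allowed { Some(self.tokens.len()) } else { None };
        self.tokens.push_back(Token::Scalar(value.to_string()));
    }

    /// Token #2: a value indicator follows, so the saved scalar was a key.
    /// Token #3 (Key) is inserted in front of the scalar before Value is queued.
    pub fn push_value(&mut self) {
        if let Some(index) = self.possible_key.take() {
            self.tokens.insert(index, Token::Key);
        }
        self.tokens.push_back(Token::Value);
    }

    /// Hand out everything queued so far, in the corrected order.
    pub fn drain(&mut self) -> Vec<Token> {
        self.possible_key = None;
        self.tokens.drain(..).collect()
    }
}
```

With this sketch, scanning a scalar, then a value indicator, then a second scalar with key_allowed set to false drains as Key, Scalar, Value, Scalar.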
Paul Stemmet 81975e197f scanner/scalar: return ScalarRange over Ref
This allows callers to decide when to convert the range into a Token
ref, which will be important when we need to save a scalar because we
need to return a Key Token first.
2021-07-25 12:41:57 +01:00
Paul Stemmet ffcdce5961 lib/token: derive clone on style 2021-07-25 12:41:57 +01:00
Paul Stemmet cebd1d6e7d lib/scanner: add test for implicit key
So I can have a case to test my implementation against
2021-07-25 12:41:57 +01:00
Paul Stemmet 17f09e4d30 scanner/token: fix primary branch in scan_node_tag
We were double-consuming a character; add a unit test to catch this
issue in the future.
2021-06-29 22:50:19 +00:00
Paul Stemmet 298b15cad7 lib/scanner: document MStats 2021-06-29 23:14:30 +01:00
Paul Stemmet 6490de8974 lib/scanner: add stats test to unit tests 2021-06-29 23:14:30 +01:00
Paul Stemmet 393ce6372b scalar/flow: fix unit tests 2021-06-29 23:14:30 +01:00
Paul Stemmet 5e025374cc lib/scanner, scalar/flow: track stats in flow_scalar
Also fix the incorrect document stream indicator check now that we have
a column to check against
2021-06-29 23:14:30 +01:00
Paul Stemmet 32ada94850 lib/scanner: track stats in anchor 2021-06-29 23:14:30 +01:00
Paul Stemmet de2960a325 lib/scanner, scanner/tag: track stats in node,directive tags 2021-06-29 23:14:30 +01:00
Paul Stemmet 46fcd61aec lib/scanner: track stats in version directive 2021-06-29 23:14:30 +01:00
Paul Stemmet 911f861320 lib/scanner: track stats in document_marker 2021-06-29 23:14:30 +01:00
Paul Stemmet f2399e0eb4 lib/scanner: track stats in eat_whitespace 2021-06-29 23:14:30 +01:00
Paul Stemmet 69574b3628 scanner/macros: allow advance! to optionally update :stats 2021-06-29 23:14:30 +01:00
Paul Stemmet cb6d64dfc7 lib/scanner: add MStats
A struct for doing bookkeeping about where we are in the buffer:

1. How much we've read
2. How many lines we've seen
3. The current column

I'll likely add variants to advance!, as it's the primary method used to
traverse the buffer.

This will likely be passed as an extra argument down the various scan
call stacks, and care will need to be taken to ensure we're handling
line breaks correctly (because I bet we're not currently)

Tests will need to be updated to test that we're getting the stats we
expect.
2021-06-29 23:14:30 +01:00
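The three counters listed in cb6d64dfc7 could be kept in a struct along the lines of this sketch. The field names, the update and line_break methods, and the stats_for helper are all assumptions made for illustration; the column is counted in bytes here, which the real MStats may not do.

```rust
/// Hypothetical stand-in for MStats: bookkeeping about the buffer position.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct MStats {
    /// How many bytes have been read.
    pub read: usize,
    /// How many line breaks have been seen.
    pub lines: usize,
    /// The current column, counted in bytes in this sketch.
    pub column: usize,
}

impl MStats {
    /// Record `amount` bytes consumed on the current line.
    pub fn update(&mut self, amount: usize) {
        self.read += amount;
        self.column += amount;
    }

    /// Record one line break that occupied `bytes` bytes (2 for "\r\n").
    pub fn line_break(&mut self, bytes: usize) {
        self.read += bytes;
        self.lines += 1;
        self.column = 0;
    }
}

/// Walk a whole buffer, treating "\r\n" as a single line break.
pub fn stats_for(input: &str) -> MStats {
    let bytes = input.as_bytes();
    let mut stats = MStats::default();
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'\r' if bytes.get(i + 1) == Some(&b'\n') => {
                stats.line_break(2);
                i += 2;
            }
            b'\r' | b'\n' => {
                stats.line_break(1);
                i += 1;
            }
            _ => {
                stats.update(1);
                i += 1;
            }
        }
    }
    stats
}
```

The "\r\n" arm is the kind of line-break handling the commit message worries about: counting it as two breaks would inflate the line count.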
Paul Stemmet 8be1ca8329 lib/scanner: fix tokens! ScanIter lifetimes 2021-06-29 23:14:30 +01:00
Paul Stemmet af8b18c781 lib/scanner: clippy lints 2021-06-29 16:01:11 +01:00
Paul Stemmet 5eb1e739c0 lib/scanner: add unit tests for tag + flow scalar scanning
and fix two tests relating to Scanner.eat_whitespace
2021-06-29 16:01:11 +01:00
Paul Stemmet 13d7091939 lib/scanner: add flow scalar to next_token
both double and single quoted variants
2021-06-29 16:01:11 +01:00
Paul Stemmet 26cd16403b lib/scanner: add tag scan to next_token
An unfortunate glitch in the compiler requires that I use a match
statement over fall-through if guards, as the borrow checker is too
restrictive, and will not allow the code to compile despite being
clearly correct.

This required me to lift the stream checks into next_token, which means
we now have a redundant check.  Hopefully in the future the borrow
checker will become smarter.

This commit also refactors stream checks to use a const identifier shared
between the call site in next_token and the function proper, and misc
changes to Scanner.eat_whitespace
2021-06-29 16:01:11 +01:00
Paul Stemmet e5dda1467d scalar/flow: make scan_flow_scalar public in lib/scanner 2021-06-29 16:01:11 +01:00
Paul Stemmet 2654799739 lib/scanner: refactor tag directive scan to use scan_tag_directive 2021-06-29 16:01:11 +01:00
Paul Stemmet 09fc128545 scanner/tag: add scan_tag_directive, scan_node_tag
For scanning %TAG directive handle/prefixes and node !!tag
handle/suffixes.
2021-06-29 16:01:11 +01:00
Paul Stemmet 1f2f9b507e scanner/error: add InvalidTagSuffix variant 2021-06-29 16:01:11 +01:00
Paul Stemmet 07fda2c8a2 lib/scanner: refactor eat_whitespace into a free function
As we'll be using it throughout the various scanner/* modules, not just
on the scanner struct itself.

This commit also improves eat_whitespace to chomp any valid YAML
whitespace, not just newlines and spaces.
2021-06-29 16:01:11 +01:00
Paul Stemmet 9db21c0e23 scanner/macros: add advance! @line variant
for chomping a YAML line break
2021-06-29 16:01:11 +01:00
Paul Stemmet 157040bff0 lib/scanner: refactor tag directive scanner to use scanner/tag functions
This change is made to allow reuse of the underlying functions
between tag directive and node tag scanning.
2021-06-29 16:01:11 +01:00
Paul Stemmet f14a843549 scanner/tag: add scan_tag_uri, scan_tag_handle
Split out tag scanning functions into their own module.

This commit includes two functions, scan_tag_uri and scan_tag_handle, for
processing prefixes/suffixes and handles respectively.

Note that these functions by themselves cannot properly parse either %TAG
directives or YAML node tags; but higher level functions can use these to
correctly scan both.
2021-06-29 16:01:11 +01:00
Paul Stemmet 206ef90575 scalar/flow: update isBlankZ! -> isWhiteSpaceZ! 2021-06-29 16:01:11 +01:00
Paul Stemmet 6a4649c10f scanner/macros: rename isBlankZ! -> isWhiteSpaceZ!, add isWhiteSpace!
This makes the naming more consistent: we now have isBreak! variants for
line breaks, isBlank! variants for space and isWhiteSpace! variants for
both.

This commit also adds $error variants to isBlank! and isBreak! to remain
consistent with isWhiteSpace!.
2021-06-29 16:01:11 +01:00
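The naming scheme from 6a4649c10f can be illustrated with plain functions rather than macros. This is a sketch under assumptions: the function names are hypothetical, a blank is taken to be a space or a tab, a break is taken to be "\r" or "\n", and the Z suffix is assumed to mean "or end of input", which the commit message does not spell out.

```rust
/// Line break characters (assumed: carriage return and line feed).
pub fn is_break(byte: u8) -> bool {
    matches!(byte, b'\r' | b'\n')
}

/// Blank characters (assumed: space and tab).
pub fn is_blank(byte: u8) -> bool {
    matches!(byte, b' ' | b'\t')
}

/// White space is either a blank or a line break.
pub fn is_white_space(byte: u8) -> bool {
    is_blank(byte) || is_break(byte)
}

/// "Z" variant: also true when `index` is at or past the end of the buffer
/// (an assumption about what the Z suffix means).
pub fn is_white_space_z(buffer: &[u8], index: usize) -> bool {
    buffer.get(index).map_or(true, |&byte| is_white_space(byte))
}
```

Keeping the three predicates separate lets a caller ask for exactly the class it needs, for example blanks only when a line break would end the token.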
Paul Stemmet 95af7eb5b0 lib/scanner: clippy fixes 2021-06-27 17:35:02 +01:00
Paul Stemmet 89b7480cde lib/scanner: add unit test for tag directive escapes 2021-06-27 17:35:02 +01:00
Paul Stemmet 9a29c29f59 lib/scanner: update tokens! to use ScanIter
We can no longer uphold the Iterator contract on Scanner directly, because
next_token now requires a scratch space handle, so we move the
iterator impl onto a separate struct, ScanIter.

For the moment, this struct is private but this may change in the
future.
2021-06-27 17:35:02 +01:00
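The split described in 9a29c29f59 might be sketched as follows. The types are heavily simplified assumptions: this toy Scanner only splits on white space and returns owned tokens, whereas the real one returns values that can borrow from the scratch space. The point is only that a next_token taking an extra argument cannot be Iterator::next, so a wrapper that owns both the scanner and the scratch space implements Iterator instead.

```rust
/// Hypothetical, simplified token type.
#[derive(Debug, PartialEq)]
pub enum Token {
    Word(String),
}

/// Toy scanner over borrowed data.
pub struct Scanner<'de> {
    rest: &'de str,
}

impl<'de> Scanner<'de> {
    pub fn new(data: &'de str) -> Self {
        Self { rest: data }
    }

    /// Needs a caller-provided scratch buffer, so this signature cannot
    /// serve as Iterator::next on Scanner itself.
    pub fn next_token(&mut self, scratch: &mut String) -> Option<Token> {
        let trimmed = self.rest.trim_start();
        if trimmed.is_empty() {
            self.rest = trimmed;
            return None;
        }
        let end = trimmed.find(char::is_whitespace).unwrap_or(trimmed.len());
        scratch.clear();
        scratch.push_str(&trimmed[..end]);
        self.rest = &trimmed[end..];
        Some(Token::Word(scratch.clone()))
    }
}

/// Owns the scanner together with its scratch space and implements Iterator.
pub struct ScanIter<'de> {
    scanner: Scanner<'de>,
    scratch: String,
}

impl<'de> ScanIter<'de> {
    pub fn new(data: &'de str) -> Self {
        Self { scanner: Scanner::new(data), scratch: String::new() }
    }
}

impl<'de> Iterator for ScanIter<'de> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        self.scanner.next_token(&mut self.scratch)
    }
}
```

A test helper can then collect tokens with ScanIter::new(data).collect::<Vec<_>>() without handling the scratch space itself.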
Paul Stemmet 2dd5042fd6 lib/scanner: rewrite tag directive scan, return Ref over Token
This commit takes the first steps towards the final API of Scanner,
wherein it returns Result<Ref>s over Result<Token>s. This change allows
the struct to access a scratch space which it can and will use when
borrowing from the underlying data is impossible, such as when
encountering escape sequences (which must be unescaped), line joining,
or data (type) transformation.

Regardless, these changes were required to correctly handle escape
sequences in tag directives.

Note that the test suite for lib/scanner is broken as of this commit; it
will be fixed in the next.
2021-06-27 17:35:02 +01:00
Paul Stemmet d6f0c71e71 scanner/scalar: make submodules public (to scanner) 2021-06-27 17:35:02 +01:00
Paul Stemmet cd3e7beb1e lib/token: add helper methods to Token + Ref
- Token.borrowed + Token.copied for wrapping into Ref::Borrowed/Copied
- Ref.PartialEq<Token> for comparisons
2021-06-27 17:35:02 +01:00
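The helpers listed in cd3e7beb1e, together with the Ref type introduced in 2dd5042fd6, could take a shape similar to this sketch. The two-lifetime layout of Ref and the reduced Token are assumptions made for illustration; only the names borrowed, copied, Ref::Borrowed and Ref::Copied come from the commit messages.

```rust
/// Hypothetical, simplified token type.
#[derive(Debug, Clone, PartialEq)]
pub enum Token<'a> {
    Scalar(&'a str),
    Anchor(&'a str),
}

/// A token whose data lives either in the underlying input (Borrowed)
/// or in the scanner's scratch space (Copied).
#[derive(Debug, Clone)]
pub enum Ref<'borrow, 'copy> {
    Borrowed(Token<'borrow>),
    Copied(Token<'copy>),
}

impl<'a> Token<'a> {
    /// Wrap a token that points into the underlying data.
    pub fn borrowed<'copy>(self) -> Ref<'a, 'copy> {
        Ref::Borrowed(self)
    }

    /// Wrap a token that points into the scratch space.
    pub fn copied<'borrow>(self) -> Ref<'borrow, 'a> {
        Ref::Copied(self)
    }
}

/// Compare a Ref with a plain Token, ignoring where the data lives.
impl<'borrow, 'copy, 't> PartialEq<Token<'t>> for Ref<'borrow, 'copy> {
    fn eq(&self, other: &Token<'t>) -> bool {
        match self {
            Ref::Borrowed(token) => token == other,
            Ref::Copied(token) => token == other,
        }
    }
}
```

The PartialEq impl is what lets a unit test compare scanner output against plain Token literals without caring which variant came back.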
Paul Stemmet da9e4f14e8 scalar/escape: clippy lints 2021-06-26 09:24:08 +01:00
Paul Stemmet e1d79d4851 scalar/escape: add more unit tests for tag_uri_unescape 2021-06-26 09:24:08 +01:00