When checking for a terminating sequence in plain scalars, we either
need a flow indicator (in flow contexts only), or a ': ' byte sequence,
where the space can be any valid YAML whitespace.
The issue here is that the lazy variant was correctly identifying the
terminating sequence, _but not recording it_ for the Deferred's slice.
This commit fixes that, ensuring we always record the final 1 or 2 bytes
before exiting the main loop.
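
As a rough illustration of the check described above, here is a minimal
sketch. The names plain_terminator_len and recorded_end are hypothetical
and not taken from the codebase. It also assumes that line breaks count
as whitespace after ':', and it ignores a ':' at the very end of the
input.

```rust
/// Length in bytes of the terminating sequence at the start of `rest`,
/// or None if the plain scalar continues. (Hypothetical helper.)
fn plain_terminator_len(rest: &[u8], in_flow: bool) -> Option<usize> {
    match rest {
        // ':' followed by whitespace is a 2 byte terminator.
        // Treating line breaks as whitespace is an assumption of this sketch.
        [b':', b' ' | b'\t' | b'\r' | b'\n', ..] => Some(2),
        // Flow indicators only terminate in flow contexts: 1 byte.
        [b',' | b'[' | b']' | b'{' | b'}', ..] if in_flow => Some(1),
        _ => None,
    }
}

/// Index one past the last byte a deferred slice would cover when the
/// terminator is recorded before leaving the loop. (Hypothetical helper.)
fn recorded_end(input: &[u8], in_flow: bool) -> usize {
    let mut end = 0;
    while end < input.len() {
        if let Some(n) = plain_terminator_len(&input[end..], in_flow) {
            // Record the final 1 or 2 bytes *before* exiting the loop.
            end += n;
            break;
        }
        end += 1;
    }
    end
}
```

With this sketch, recorded_end(b"key: value", false) covers "key: " and
recorded_end(b"a,b", true) covers "a,".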
To remain consistent with scanner/tests, also derive the base TEST_FLAGS
from scanner/tests.TEST_FLAGS, minus options that do not make sense for
the test battery (O_EXTENDABLE).
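
A small sketch of the derivation, using plain u32 constants as a
stand-in for the real flag type. The bit values and the contents of
SCANNER_TEST_FLAGS (standing in for scanner/tests.TEST_FLAGS) are
assumptions made for illustration only.

```rust
// Hypothetical bit assignments; the real values may differ.
const O_EXTENDABLE: u32 = 1 << 0;
const O_LAZY: u32 = 1 << 1;

// Stand-in for scanner/tests.TEST_FLAGS; its contents are assumed here.
const SCANNER_TEST_FLAGS: u32 = O_EXTENDABLE | O_LAZY;

// Derive the local base flags by masking out O_EXTENDABLE.
const TEST_FLAGS: u32 = SCANNER_TEST_FLAGS & !O_EXTENDABLE;
```

Masking with the complement keeps every other option in sync with the
upstream constant, so later additions there carry over automatically.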
This makes explicit what was happening under the hood with the
'cxt.indent() + 0' expression. It also clearly describes the
circumstances in which it is possible to use the function safely.
Here we refactor the main functionality of the module into
scan_flow_scalar_eager, with scan_flow_scalar delegating to it
-- or scan_flow_scalar_lazy -- depending on whether O_LAZY is set.
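
The delegation might look roughly like the sketch below. The Scalar
type, the function signatures, the O_LAZY bit value and the bodies of
the eager and lazy variants are all simplified assumptions; only the
three function names and O_LAZY come from the description above.

```rust
// Hypothetical bit value for O_LAZY.
const O_LAZY: u32 = 1 << 1;

/// Simplified stand-in for the scanner's scalar token.
#[derive(Debug, PartialEq)]
enum Scalar<'a> {
    /// Fully processed content.
    Eager(String),
    /// Raw slice, to be processed later.
    Deferred(&'a str),
}

/// Does the work up front (here: just strips the surrounding quotes).
fn scan_flow_scalar_eager(input: &str) -> Scalar<'_> {
    Scalar::Eager(input.trim_matches('\'').to_owned())
}

/// Records the raw slice and defers processing.
fn scan_flow_scalar_lazy(input: &str) -> Scalar<'_> {
    Scalar::Deferred(input)
}

/// Entry point: picks a variant depending on whether O_LAZY is set.
fn scan_flow_scalar(input: &str, opts: u32) -> Scalar<'_> {
    if opts & O_LAZY != 0 {
        scan_flow_scalar_lazy(input)
    } else {
        scan_flow_scalar_eager(input)
    }
}
```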
In essence, this allows us to test the Scanner's ability to handle
chunked byte streams, hooking directly into the existing test suite.
It has three levels: large, medium and small. Large is probably the
smallest buffer size + increment that could be considered reasonable
(4k/64), while the smaller two test absurd buffers (8/8 and 1/1).
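
To make the three levels concrete, here is a sketch of how the visible
buffer could grow for each one. The Level enum and the windows helper
are hypothetical, and 4k is assumed to mean 4096 bytes.

```rust
/// Hypothetical test levels: (initial buffer size, increment).
#[derive(Clone, Copy)]
enum Level {
    Large,
    Medium,
    Small,
}

impl Level {
    fn sizes(self) -> (usize, usize) {
        match self {
            Level::Large => (4096, 64), // assumes 4k == 4096
            Level::Medium => (8, 8),
            Level::Small => (1, 1),
        }
    }
}

/// Buffer length visible to the scanner after each extension, for a
/// stream of `total` bytes.
fn windows(total: usize, level: Level) -> Vec<usize> {
    let (start, step) = level.sizes();
    let mut out = Vec::new();
    let mut len = start.min(total);
    loop {
        out.push(len);
        if len == total {
            break;
        }
        // Grow by one increment, never past the end of the stream.
        len = (len + step).min(total);
    }
    out
}
```

For a 20 byte stream, the medium level would expose 8, then 16, then
all 20 bytes, while the small level would extend one byte at a time.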
This simply prevents state corruption in the Scanner by waiting to make
the changes until _after_ any errors would have been returned.
While this works, it's not immediately obvious in the code why the
operations are ordered the way they are. I should probably document
this.
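
The ordering can be illustrated with a small sketch: every fallible
step runs against local values first, and the state is only touched
once nothing can fail any more. State and fetch_line are hypothetical
stand-ins, not the Scanner's real types.

```rust
/// Hypothetical, simplified scanner state.
#[derive(Debug, Default, PartialEq)]
struct State {
    offset: usize,
    tokens: usize,
}

/// Consumes one newline-terminated line, or fails without side effects.
fn fetch_line(state: &mut State, input: &[u8]) -> Result<(), &'static str> {
    // 1. Fallible work, operating only on locals.
    let rest = input.get(state.offset..).ok_or("offset out of bounds")?;
    let len = rest
        .iter()
        .position(|&b| b == b'\n')
        .ok_or("unterminated line")?;

    // 2. Commit: past this point no error can be returned.
    state.offset += len + 1;
    state.tokens += 1;
    Ok(())
}
```

If the second step were interleaved with the first, an early return
could leave offset advanced without a matching token.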
Before, there was a subtle error when eating whitespace wherein the
whitespace could be eaten twice, which corrupted the Scanner.stats.
Now we ensure that any movement is captured before returning the error
to the caller.
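
A sketch of the fixed behaviour, under the assumption that the error in
question is a "need more input" style error and that the caller retries
with the remaining buffer. NeedMore, Stats and skip_ws_then_peek are
hypothetical names.

```rust
/// Hypothetical error asking the caller for more input.
#[derive(Debug, PartialEq)]
struct NeedMore;

/// Hypothetical, simplified stand-in for Scanner.stats.
#[derive(Default)]
struct Stats {
    read: usize,
}

/// Skips leading spaces and tabs, then returns the next byte.
fn skip_ws_then_peek(buf: &mut &[u8], stats: &mut Stats) -> Result<u8, NeedMore> {
    let ws = buf
        .iter()
        .take_while(|b| matches!(**b, b' ' | b'\t'))
        .count();

    // Capture the movement in both the buffer and the stats *before*
    // any error is returned, so a retry cannot count it a second time.
    *buf = &buf[ws..];
    stats.read += ws;

    buf.first().copied().ok_or(NeedMore)
}
```

Because the buffer has already been advanced when NeedMore is returned,
a second call sees no leading whitespace and leaves stats.read alone.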
cache! allows the Scanner to state that it requires 'N' more codepoints
before it can correctly process the byte stream.
Its primary purpose is its interaction with O_EXTENDABLE, which allows
the caller to hint to the Scanner that the buffer could grow. Likewise,
cache! returns an error that hints to the caller that they should extend
the byte stream before calling the Scanner again -- or pass opts without
O_EXTENDABLE.
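
A much simplified sketch of such a macro follows. The argument list,
the ScanError::Extend variant, the O_EXTENDABLE bit value and the
behaviour without O_EXTENDABLE (reporting how many codepoints are
available) are all assumptions; the real cache! may differ.

```rust
// Hypothetical bit value for O_EXTENDABLE.
const O_EXTENDABLE: u32 = 1 << 0;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ScanError {
    /// Hint to the caller: extend the byte stream and call again.
    Extend,
}

/// Checks that `$buf` holds at least `$n` codepoints.
macro_rules! cache {
    ($buf:expr, $n:expr, $opts:expr) => {{
        // Count codepoints, not bytes, and stop as soon as we have enough.
        let available = $buf.chars().take($n).count();
        if available < $n && ($opts & O_EXTENDABLE) != 0 {
            Err(ScanError::Extend)
        } else {
            // Without O_EXTENDABLE the buffer is final, so report what exists.
            Ok(available)
        }
    }};
}
```

A caller that receives Extend can grow the buffer and retry, or call
again with opts that omit O_EXTENDABLE to signal that no more data is
coming.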
This struct is a C style bitflag container, which controls various
aspects of Scanner functionality.
The initial flags available are O_ZEROED, O_EXTENDABLE and O_LAZY. See
each flag's documentation for an explanation.
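
One common way to write a C style bitflag container in Rust is a
newtype over an integer, sketched below. The underlying type, the bit
values, the contains method and O_ZEROED being the empty set are
assumptions, not a description of the actual struct.

```rust
use std::ops::{BitAnd, BitOr, Not};

/// Hypothetical bitflag container over a u32.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct Flags(u32);

// Assumed bit assignments.
const O_ZEROED: Flags = Flags(0);
const O_EXTENDABLE: Flags = Flags(1 << 0);
const O_LAZY: Flags = Flags(1 << 1);

impl Flags {
    /// True if every bit set in `other` is also set in `self`.
    const fn contains(self, other: Flags) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for Flags {
    type Output = Flags;
    fn bitor(self, rhs: Flags) -> Flags {
        Flags(self.0 | rhs.0)
    }
}

impl BitAnd for Flags {
    type Output = Flags;
    fn bitand(self, rhs: Flags) -> Flags {
        Flags(self.0 & rhs.0)
    }
}

impl Not for Flags {
    type Output = Flags;
    fn not(self) -> Flags {
        Flags(!self.0)
    }
}
```

Flags then combine as in C: O_EXTENDABLE | O_LAZY sets both options,
and opts & !O_LAZY clears one of them.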
This split allows future maintainers (i.e. me) to quickly know whether a
function handles the conversion of bytes into tokens -- the scan_*
function family -- or handles updating the Scanner's state -- the fetch_*
function family.
Typically one might think of the call stack as:
1. a Scanner
2. fetches a token
3. by scanning the byte stream
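
The call stack above can be sketched as follows. Token, scan_word and
fetch_word are hypothetical and much simpler than real YAML tokens; the
point is only that the scan_* function is pure while the fetch_* method
owns every state change.

```rust
/// Hypothetical, simplified token type.
#[derive(Debug, PartialEq)]
enum Token {
    Word(String),
}

/// scan_*: converts bytes into a token and reports how many bytes it
/// read. It never touches Scanner state.
fn scan_word(input: &str) -> Option<(Token, usize)> {
    let len = input.find(char::is_whitespace).unwrap_or(input.len());
    if len == 0 {
        return None;
    }
    Some((Token::Word(input[..len].to_owned()), len))
}

/// Hypothetical, simplified Scanner.
#[derive(Default)]
struct Scanner {
    offset: usize,
    tokens: Vec<Token>,
}

impl Scanner {
    /// fetch_*: calls the matching scan_* function and records the
    /// result in the Scanner's state.
    fn fetch_word(&mut self, input: &str) -> bool {
        match scan_word(&input[self.offset..]) {
            Some((token, read)) => {
                self.offset += read;
                self.tokens.push(token);
                true
            }
            None => false,
        }
    }
}
```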
Refactor the scanning code out into scan_directive, which is called
from the relevant Scanner method. This makes directive scanning more
consistent with the other scanning functions.