This commit adds the 3rd of the 5 possible scalar types in YAML to the
scanner. It is compliant with the YAML spec, _except_ for its handling
of "JSON like" keys, which allow the following value token (e.g. ':')
to _not_ be followed by whitespace.
I frankly find this exception absurd, as the spec _clearly_ half-assed
this in so that they could declare YAML a "strict superset of JSON",
never mind that _a lot_ of the semantics of _every_ other context
for keys rely on a key being followed by whitespace.
I may eventually return to this and add it; I've a pretty good idea how --
we just need to keep track of the "last" token produced, as only
?'"]} characters would modify the behavior, but I'd need to
make sure I haven't missed any subtle side effects, as almost all other
key handling implicitly relies on: Key token === ": ".
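A minimal sketch of that "last token" idea, with hypothetical names (the real scanner's token type will differ):

```rust
// Illustrative only: the scanner would remember the most recent token it
// produced, and permit a ':' with no trailing whitespace only when that
// token ended a "JSON like" node (flow collection end or quoted scalar).
enum Token {
    FlowMappingEnd,  // '}'
    FlowSequenceEnd, // ']'
    DoubleQuoted,    // "..."
    SingleQuoted,    // '...'
    Scalar,          // plain scalar
    Other,
}

/// True when the previous token makes a whitespace-less ':' acceptable.
fn json_like_key(last: Token) -> bool {
    matches!(
        last,
        Token::FlowMappingEnd
            | Token::FlowSequenceEnd
            | Token::DoubleQuoted
            | Token::SingleQuoted
    )
}
```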
Before, the loop would incorrectly update scalar_stats _after_ reaching a
': ' terminus. This is now fixed, as I check for these cases before
re-entering the word loop.
the primary driver for scanning plain YAML scalars. This implementation
tries to fit as closely as possible to the YAML spec, particularly in
its handling of (the lack of) spacing requirements inside flow contexts,
comment detection, and the special casing of '-', '?' and ':' as the
first character in flow contexts.
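For illustration, the first-character rule could be sketched roughly like this. It is a simplified reading of the spec's ns-plain-first / ns-plain-safe productions, not the crate's actual code:

```rust
// "Plain safe" per YAML 1.2: not whitespace, and in flow contexts also
// not one of the flow indicators ,[]{}.
fn is_plain_safe(c: u8, in_flow: bool) -> bool {
    let ws = matches!(c, b' ' | b'\t' | b'\n' | b'\r');
    let flow_indicator = matches!(c, b',' | b'[' | b']' | b'{' | b'}');
    !ws && !(in_flow && flow_indicator)
}

// Whether `c` may begin a plain scalar. Most indicators cannot; '-', '?'
// and ':' can, but only when the *next* byte is plain safe.
fn can_start_plain(c: u8, next: Option<u8>, in_flow: bool) -> bool {
    const INDICATORS: &[u8] = b"-?:,[]{}#&*!|>'\"%@`";
    match c {
        b'-' | b'?' | b':' => next.map_or(false, |n| is_plain_safe(n, in_flow)),
        _ if INDICATORS.contains(&c) => false,
        _ => !matches!(c, b' ' | b'\t' | b'\n' | b'\r'),
    }
}
```

Note how the flow context only tightens the `next` byte check: '- [a, b]' starts a plain scalar in block context, but '-[' inside a flow collection does not.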
Two things that are notably missing:
1. Proper tab '\t' handling in block context indentation
2. A sane maximum whitespace limit and better handling of whitespace
   storage. Rather than storing every whitespace character seen, I could
   count the whitespace separated by line breaks and add it back later,
   such that the maximum described above would apply to total line
   breaks, with the intervening whitespace stored as a u64/usize
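The storage scheme in (2) might look something like this sketch. Names are illustrative, and the folding behavior is simplified (spaces preceding a break are treated as droppable, as YAML folding discards trailing whitespace):

```rust
// Count whitespace instead of buffering it byte-by-byte.
#[derive(Default)]
struct Whitespace {
    breaks: usize, // line breaks seen; this is what a max limit would cap
    spaces: u64,   // intervening spaces since the last break
}

impl Whitespace {
    fn push_space(&mut self) {
        self.spaces += 1;
    }

    fn push_break(&mut self) {
        self.breaks += 1;
        // Spaces before a break fold away under YAML folding rules.
        self.spaces = 0;
    }

    /// Re-expand the stored whitespace when appending to a scalar:
    /// no break -> literal spaces, one break -> a single space,
    /// n breaks -> n-1 newlines (YAML-style folding).
    fn replay(&self, out: &mut String) {
        match self.breaks {
            0 => (0..self.spaces).for_each(|_| out.push(' ')),
            1 => out.push(' '),
            n => (1..n).for_each(|_| out.push('\n')),
        }
    }
}
```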
While the previous commit did add support for _adding_ zero indented
sequences to the token stream, it unfortunately relied on the indent
stack flush that happens upon reaching the end of the stream to push the
stored BlockEnd tokens.
This commit adds better support for removing zero indented sequences
from the stack once finished.
The heuristic used here is:
A zero_indented BlockSequence starts when:
- The top stored indent is for a BlockMapping
- A BlockEntry occupies the same indentation level
And terminates when:
- The top indent stored is a BlockSequence & is tagged as zero indented
- A BlockEntry _does not_ occupy the same indentation level
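The heuristic could be sketched as follows; Indent and Block here are illustrative stand-ins for the scanner's real indent-stack types:

```rust
#[derive(PartialEq)]
enum Block {
    Mapping,
    Sequence { zero_indented: bool },
}

struct Indent {
    column: usize,
    kind: Block,
}

/// A zero indented BlockSequence starts when a BlockEntry ('- ') shows up
/// at the same column as the BlockMapping on top of the indent stack.
fn starts_zero_indented(top: &Indent, entry_column: usize) -> bool {
    top.kind == Block::Mapping && top.column == entry_column
}

/// ...and terminates when the top indent is a zero indented sequence but
/// the current token is not a BlockEntry at that same column.
fn ends_zero_indented(top: &Indent, is_entry: bool, column: usize) -> bool {
    top.kind == Block::Sequence { zero_indented: true }
        && !(is_entry && column == top.column)
}
```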
This fixes the edge case YAML allows where a sequence may be zero
indented relative to its parent key, yet still starts a new sequence.
E.g. using the following YAML:
key:
- "one"
- "two"
The following tokens would have been produced (before this commit):
StreamStart
BlockMappingStart
Key
Scalar('key')
Value
BlockEntry
Scalar('one')
BlockEntry
Scalar('two')
BlockEnd
StreamEnd
Note the lack of any indication that the values are in a sequence.
Post commit, the following is produced:
StreamStart
BlockMappingStart
Key
Scalar('key')
Value
BlockSequenceStart <--
BlockEntry
Scalar('one')
BlockEntry
Scalar('two')
BlockEnd <--
BlockEnd
StreamEnd
Before, we only checked for the existence of a saved key, but *didn't*
also check that it was still valid / possible.
This led to a subtle error wherein scalars that weren't valid keys
(anymore) would still be picked up and used.
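A sketch of the added check, with hypothetical field names; the real validity test likely involves more state (byte position, flow level, etc.):

```rust
// An implicit key cannot span lines, so a saved key that was read on an
// earlier line is stale even though it still "exists".
struct SavedKey {
    exists: bool,
    line: usize, // line the candidate key token was read on
}

fn key_usable(key: &SavedKey, current_line: usize) -> bool {
    // Before: only `key.exists` was consulted.
    // Now: the key must also still be possible at the current position.
    key.exists && key.line == current_line
}
```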
as before we were checking for the existence of a Value twice: once
after parsing a scalar, and again when actually adding the Value token
to the queue. This way we simplify the flow for scalar tokens and stop
doing unnecessary work.
This commit completely rewrites the key subsystem of the Scanner. Rather
than merely tracking whether a key could be added, Key now manages the
state tracking for potential implicit keys.
A TokenEntry is designed as a wrapper for Tokens returned from the
Scanner, ensuring that they are returned from the Queue in an order that
mirrors where in the buffer the token was read.
This will allow me to push Tokens out of order, particularly when
handling Keys, and still have them returned in the expected order.
This structure will be how tokens are returned from the Scanner,
replacing the current Vec. This change is occurring because:
The genesis of this structure is a need in the Scanner for fast pops,
and fast inserts. A binary heap gives me both, namely O(1) inserts and
O(log(n)) pops -- with allocations amortized.
This is because of how YAML handles implicit keys... in that you don't
know whether you have one until you hit a value (': '). The easiest
solution is just to save these potential implicit keys and then insert
them into the token list at the correct position, but this would require
memcpy'ing everything past key.pos and potentially cause many more
reallocations than required.
Enter the Queue. I couldn't just use std::collections::BinaryHeap for
two reasons:
1. It's a max-heap
2. It's not stable; the order of equal elements is unspecified
The Queue fixes both of these problems, first by internally using
std::cmp::Reverse, and second by guaranteeing that equal elements are
returned in the order added.
These two attributes allow me to use Scanner.stats.read (number of
bytes consumed so far) and a bit of elbow grease to get my tokens out in
the right order.
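A minimal sketch of that combination (not the crate's actual implementation): a BinaryHeap of Reverse-wrapped (position, serial, token) tuples yields min-first pops that break ties in insertion order.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// `pos` stands in for Scanner.stats.read (bytes consumed when the token
// was produced); `serial` is a monotonically increasing insertion counter
// that makes equal positions pop FIFO, i.e. a *stable* min-heap.
struct Queue<T> {
    heap: BinaryHeap<Reverse<(usize, u64, T)>>,
    serial: u64,
}

impl<T: Ord> Queue<T> {
    fn new() -> Self {
        Queue { heap: BinaryHeap::new(), serial: 0 }
    }

    fn push(&mut self, pos: usize, item: T) {
        // Tuples compare lexicographically, and `serial` is unique, so
        // `item` itself never decides the ordering.
        self.heap.push(Reverse((pos, self.serial, item)));
        self.serial += 1;
    }

    fn pop(&mut self) -> Option<T> {
        self.heap.pop().map(|Reverse((_, _, item))| item)
    }
}
```

The `T: Ord` bound is only there to satisfy BinaryHeap; a real Token type could be wrapped so it never needs to implement Ord itself.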