This struct is a C style bitflag container, which controls various
aspects of Scanner functionality.
The initial flags available are O_ZEROED, O_EXTENDABLE and O_LAZY; read
each flag's documentation for an explanation.
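As a rough sketch, such a container might look like the following; the
bit values and the meanings in the comments are illustrative
assumptions, not the actual definitions:

  /// Sketch of a C style bitflag container; bit values and meanings
  /// here are assumptions for illustration only.
  #[derive(Debug, Clone, Copy, PartialEq, Eq)]
  pub struct Flags(u32);

  pub const O_ZEROED: Flags = Flags(0);          // no flags set (assumed)
  pub const O_EXTENDABLE: Flags = Flags(1 << 0); // assumed meaning
  pub const O_LAZY: Flags = Flags(1 << 1);       // assumed meaning

  impl std::ops::BitOr for Flags {
      type Output = Self;
      fn bitor(self, rhs: Self) -> Self {
          Flags(self.0 | rhs.0)
      }
  }

  impl Flags {
      /// True if every flag set in `other` is also set in `self`.
      pub fn contains(self, other: Flags) -> bool {
          (self.0 & other.0) == other.0
      }
  }

A Scanner could then take e.g. O_EXTENDABLE | O_LAZY at construction
and branch on flags.contains(...) internally.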
This split allows future maintainers (i.e. me) to quickly know whether a
function handles the conversion of bytes into tokens -- the scan_*
function family -- or handles updating the Scanner's state -- the
fetch_* function family.
Typically one might think of the call stack as:
1. a Scanner
2. fetches a token
3. by scanning the byte stream
Accordingly, this commit refactors the scanning code out into
scan_directive, which is called from the relevant Scanner method. This
makes directive scanning more consistent with the other scanning
functions.
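For illustration, the split might look like this (all names and
signatures here are sketches, not the crate's actual API):

  struct Token;     // stand-in for the real token type
  struct ScanError; // stand-in for the real error type

  struct Scanner {
      buffer: String,
      tokens: Vec<Token>,
  }

  impl Scanner {
      /// fetch_* family: owns the Scanner state updates, delegating
      /// the byte-to-token conversion to the matching scan_* function.
      fn fetch_directive(&mut self) -> Result<(), ScanError> {
          // ...state bookkeeping (indents, keys, positions) goes here...
          let token = scan_directive(&self.buffer)?;
          self.tokens.push(token);
          Ok(())
      }
  }

  /// scan_* family: pure bytes -> token conversion, kept free of
  /// Scanner state so each scanner is easy to test in isolation.
  fn scan_directive(_buffer: &str) -> Result<Token, ScanError> {
      // ...parse a '%YAML' / '%TAG' directive here...
      Ok(Token)
  }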
Split out the _massive_ tests module into smaller focused modules, one
per area, explained below:
- anchor     | For anchor '&' and alias '*' node tags
- collection | For flow and block collections
- complex    | For interactions between token types
- directive  | For directives '%'
- document   | For doc starts '---' and endings '...'
- key        | For mapping keys, explicit and implicit
- tag        | For node type tags '!!', '!'
- whitespace | For whitespace chomping between tokens
This vastly reduces the size of lib/scanner's file, leading to notably
better performance from rustfmt and rust-analyzer.
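For illustration, the resulting test layout might look like this (exact
paths assumed):

  // lib/scanner tests module -- layout sketch
  mod anchor;     // '&' anchors and '*' aliases
  mod collection; // flow and block collections
  mod complex;    // interactions between token types
  mod directive;  // '%' directives
  mod document;   // '---' starts and '...' endings
  mod key;        // explicit and implicit mapping keys
  mod tag;        // '!!' and '!' node tags
  mod whitespace; // whitespace chomping between tokens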
This commit adds the 3rd of the 5 possible scalar types in YAML to the
scanner. It is compliant with the YAML spec, _except_ for its handling
of "JSON like" keys, which allow the following value token (e.g. ':')
to _not_ have whitespace after it.
I frankly find this exception absurd, as the spec _clearly_ half-assed
this in so that they could declare YAML a "strict superset of JSON",
never mind that _a lot_ of the semantics of _every_ other context for
keys rely on a key being followed by whitespace.
I may eventually return to this and add it; I've a pretty good idea how
-- we just need to keep track of the "last" token produced, as only the
?'"]} characters would modify the behavior -- but I'd need to make sure
I haven't missed any subtle side effects, as almost all other key
handling implicitly relies on: Key token === ": ".
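If I do add it, the core check might look something like this sketch
(token names are hypothetical; the trigger set mirrors the characters
above):

  enum Token {
      FlowScalarSingle, // ...'
      FlowScalarDouble, // ..."
      FlowSequenceEnd,  // ]
      FlowMappingEnd,   // }
      // ...other variants elided
  }

  /// True when the previous token could legally end a JSON like key,
  /// meaning a following ':' needs no trailing whitespace.
  fn json_like(last: Option<&Token>) -> bool {
      matches!(
          last,
          Some(
              Token::FlowScalarSingle
                  | Token::FlowScalarDouble
                  | Token::FlowSequenceEnd
                  | Token::FlowMappingEnd
          )
      )
  }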
Previously, the loop would incorrectly update scalar_stats _after_
reaching a ': ' terminus. This is now fixed, as I check for these cases
before re-entering the word loop.
This commit adds the primary driver for scanning plain YAML scalars.
The implementation tries to stay as close as possible to the YAML spec,
particularly in its handling of (the lack of) spacing requirements
inside flow contexts, comment detection and the special casing of '-',
'?' and ':' as the first character in flow contexts.
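For a taste of that special casing, a simplified sketch (the helper
name and exact byte set are my own, condensed from the spec's
plain-safe rules):

  /// In a flow context '-', '?' and ':' may start a plain scalar only
  /// if the next byte would continue one, rather than making the
  /// character an indicator. Simplified; not the full spec rule.
  fn starts_plain_scalar(first: u8, next: Option<u8>, in_flow: bool) -> bool {
      match first {
          b'-' | b'?' | b':' if in_flow => !matches!(
              next,
              None | Some(b' ' | b'\t' | b',' | b'[' | b']' | b'{' | b'}')
          ),
          _ => true, // other first-character rules elided
      }
  }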
Two things that are notably missing:
1. Proper tab '\t' handling in block context indentation
2. A sane maximum whitespace limit and better handling of whitespace
storage. Rather than storing every whitespace character seen, I could
instead count the whitespace separated by line breaks and add it back
later, such that the maximum described above would apply to total
line breaks, with the intervening whitespace stored as a u64/usize.
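A sketch of that idea (names and the cap are placeholders):

  /// One run of whitespace, stored as counts instead of raw bytes.
  struct WhitespaceRun {
      breaks: usize, // line breaks in this run
      spaces: usize, // intervening spaces, counted not stored
  }

  const MAX_BREAKS: usize = 1024; // placeholder cap, not decided yet

  fn push_run(
      runs: &mut Vec<WhitespaceRun>,
      breaks: usize,
      spaces: usize,
  ) -> Result<(), &'static str> {
      let total: usize = runs.iter().map(|r| r.breaks).sum();
      if total + breaks > MAX_BREAKS {
          return Err("whitespace limit exceeded");
      }
      runs.push(WhitespaceRun { breaks, spaces });
      Ok(())
  }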
While the previous commit did add support for _adding_ zero indented
sequences to the token stream, it unfortunately relied on the indent
stack flush that happens upon reaching the end of the stream to push
the stored BlockEnd tokens.
This commit adds better support for removing zero indented sequences
from the stack once finished.
The heuristic used here is:
A zero_indented BlockSequence starts when:
- The top stored indent is for a BlockMapping
- A BlockEntry occupies the same indentation level
And terminates when:
- The top indent stored is a BlockSequence & is tagged as zero indented
- A BlockEntry _does not_ occupy the same indentation level
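In code, the heuristic might look like this sketch (the indent stack
types are stand-ins for the scanner's real data structures):

  enum IndentKind {
      BlockMapping,
      BlockSequence { zero_indented: bool },
  }

  struct Indent {
      kind: IndentKind,
      column: usize,
  }

  /// A zero indented BlockSequence starts when a BlockEntry ('-')
  /// sits at the same column as the BlockMapping atop the stack...
  fn starts_zero_indented(top: &Indent, entry_column: usize) -> bool {
      matches!(top.kind, IndentKind::BlockMapping) && top.column == entry_column
  }

  /// ...and terminates when the top indent is a zero indented
  /// BlockSequence whose column the next BlockEntry does not match.
  fn ends_zero_indented(top: &Indent, entry_column: usize) -> bool {
      matches!(top.kind, IndentKind::BlockSequence { zero_indented: true })
          && top.column != entry_column
  }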
This fixes the edge case YAML allows where a sequence's entries may be
zero indented relative to their parent key, yet must still start a
sequence. E.g. using the following YAML:
key:
- "one"
- "two"
The following tokens would have been produced (before this commit):
StreamStart
BlockMappingStart
Key
Scalar('key')
Value
BlockEntry
Scalar('one')
BlockEntry
Scalar('two')
BlockEnd
StreamEnd
Note the lack of any indication that the values are in a sequence.
Post commit, the following is produced:
StreamStart
BlockMappingStart
Key
Scalar('key')
Value
BlockSequenceStart <--
BlockEntry
Scalar('one')
BlockEntry
Scalar('two')
BlockEnd <--
BlockEnd
StreamEnd