Feature/scanner/explict key #21

Merged

bazaah merged 7 commits from feature/scanner/explict-key into master

2021-08-08 09:59:11 +00:00

bazaah commented

2021-08-08 09:46:35 +00:00

(Migrated from github.com)

This PR adds support for explicit keys and fixes an issue with the tokenization of zero-indented block sequences.

Components

Added support for explicit (or complex) keys to the Scanner
Properly return BlockSequenceStart & BlockSequenceEnd tokens for zero-indented sequences
Unit tests for both features

Zero indentation

The following YAML documents are equivalent:

---
a sequence:
  - one
  - two
  - three
---
a sequence:
- one
- two
- three

and should produce the same tokens:

- StreamStart
- DocumentStart
- BlockMappingStart
- Key
- Scalar('a sequence')
- Value
- BlockSequenceStart # <-- note that we start a sequence as the value
- BlockEntry
- Scalar('one')
- BlockEntry
- Scalar('two')
- BlockEntry
- Scalar('three')
- BlockEnd           # <-- of the sequence
- BlockEnd
- StreamEnd

However, due to how the Scanner handles block scopes -- namely that it requires an indentation increase to enter a sub scope -- the two commented Tokens above would be omitted from the token stream for the second document.

Fixing this was tricky: in general we do not want to open a new block scope without an increase in indentation, except when all of the following hold:

The indentation stack's most recent entry was for a mapping
The new (current) one would be for a sequence
The current indentation level is equal to the indent stack's most recent entry's
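
The three conditions above can be sketched as a small predicate. This is only an illustration in Python and not the Scanner's actual code: the names `ScopeKind`, `IndentEntry` and `opens_new_block` are hypothetical, and the sketch assumes the indentation stack records both a column and the kind of block that opened it.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ScopeKind(Enum):
    """Kind of block scope recorded on the (hypothetical) indentation stack."""
    MAPPING = auto()
    SEQUENCE = auto()


@dataclass(frozen=True)
class IndentEntry:
    """One indentation stack entry: the column and what kind of block opened it."""
    column: int
    kind: ScopeKind


def opens_new_block(stack, column, kind):
    """Return True if a node of `kind` starting at `column` should open a new scope."""
    if not stack:
        # Assumption: with no enclosing block, the first block always opens a scope.
        return True
    top = stack[-1]
    if column > top.column:
        # The usual rule: an indentation increase enters a sub scope.
        return True
    # The zero-indentation exception: a sequence directly under a mapping
    # at the same column still opens its own scope.
    return (
        top.kind is ScopeKind.MAPPING
        and kind is ScopeKind.SEQUENCE
        and column == top.column
    )
```

For the second document, the stack holds a mapping at column 0 when the first `- one` is seen, so the predicate allows a sequence scope to open at column 0. For the following entries the top of the stack is already that sequence, so no further scope is opened.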

This was an interesting problem to solve.
