dolysis/yary - Stateless Git Forge

Commit Graph

Author	SHA1	Message	Date
Paul Stemmet	790b9b55d5	docs: add README, explaining library purpose and status	2022-03-18 15:40:12 +00:00
Paul Stemmet	a8230f86f2	lib: stub library documentation	2022-03-18 15:26:45 +00:00
Paul Stemmet	ba41069dc6	event/error: document ParseError variants	2022-03-18 15:26:45 +00:00
Paul Stemmet	95eeec30f5	lib: expose reader and event modules	2022-03-18 15:26:45 +00:00
Paul Stemmet	183dbc3b1b	lib/event: expose public API for YAML event streams This commit surfaces a public API for streaming YAML events from a read source. It provides callers an Events{} type that can be generated from any reader::Read implementation -- so for the moment, OwnedReader(s) and BorrowReader(s) -- via the module functions from_reader() and from_reader_with(). This type implements IntoIterator, and thus can be integrated with any iterator based flows, and benefits from the entire, extensive ecosystem around them. That said, I expect this to be a relatively unused part of this library in the long term, being the lowest level public API exposed by this library.	2022-03-18 15:26:45 +00:00
Paul Stemmet	65f990872f	event/flag: add public Flags exposed to callers These define the configuration that library users are allowed to set when iterating over Events. It currently only has one meaningful option, O_LAZY which reflects the behavior exposed by lib/scanner. This will likely change in the future, if more customization is desired when working with Event streams.	2022-03-18 15:26:45 +00:00
Paul Stemmet	44759d458d	event/parser: use relative paths in test macros	2022-03-18 15:26:45 +00:00
Paul Stemmet	bdfafc057f	event/parser: module doc	2022-03-18 15:26:45 +00:00
Paul Stemmet	d2c25e2bd0	lib/event: add module doc A large portion of this was split out from event/parser's module doc to coerce git to rename files.	2022-03-18 15:26:45 +00:00
Paul Stemmet	76824e9db7	lib/event: move Parser to lib/event/parser	2022-03-18 15:26:45 +00:00
Paul Stemmet	2e77556dd1	reader: fix visibility of public readers	2022-03-18 15:26:45 +00:00
Paul Stemmet	f7d75b836f	lib: pin rust version to 1.53 This isn't a hard guarantee, if a new version of Rust offers something useful, this will be moved with no warnings.	2022-03-18 15:26:45 +00:00
Paul Stemmet	d3fd96ea31	ci/github: add MSRV == 1.52 This can be bumped as needed but I'd like to stop checking it myself	2022-01-09 21:45:59 +00:00
Paul Stemmet	f288b71f83	ci/github: improve test naming - Remove unnecessary words from task names - Add rust version to action name :: [$os/$rustv] $taskname	2022-01-09 21:45:59 +00:00
Paul Stemmet	c14ba7829c	ci/github: improve toolchain install task - set default rust version instead of a folder override, as we always expect to use the provided version globally per run. - explicitly declare extra rustup components rather than implicitly rely on the current defaults	2022-01-09 21:45:59 +00:00
Paul Stemmet	056b9e27be	lib/event: add module documentation	2021-12-29 15:07:26 +00:00
Paul Stemmet	fcfb870583	lib/event/tests: add Parser tests	2021-12-29 15:07:26 +00:00
Paul Stemmet	2239a884fd	event/tests/macros: add tokens!, events!, event!, node!, scalar! These macros make up the test harness used by module tests. They allow us to declare a set of tokens! which will be matched against the expected events! that the tokens should produce. The others simplify the process of declaring some of the more nested event structures quickly	2021-12-29 15:07:26 +00:00
Paul Stemmet	2024724d04	lib/event: add handler for YAML nodes - node Note that this function must never call any other handlers, so the Parser remains non-recursive.	2021-12-29 15:07:26 +00:00
Paul Stemmet	5dc521c278	lib/event: add handlers for flow_sequence->mappings - flow_sequence_entry_mapping_key - flow_sequence_entry_mapping_value - flow_sequence_entry_mapping_end These are special cased due to how some of the implied values can pop up, and because we need far fewer rules than in the transition from block_{sequence,mapping}->flow_mapping.	2021-12-29 15:07:26 +00:00
Paul Stemmet	34d893f4b3	lib/event: add handlers for sequences/mappings - block_sequence_entry - block_mapping_key - block_mapping_value - flow_sequence_entry - flow_mapping_key - flow_mapping_value These were mostly straightforward, only tricky bit is handling all the cases in which YAML allows a (scalar) node to be "implied".	2021-12-29 15:07:26 +00:00
Paul Stemmet	89c5b2df5e	lib/event: add handlers for YAML document state - document_start - document_end - explicit_document_content Note that we guarantee at least one (DocumentStart, DocumentEnd) event pair in the event stream, regardless of whether these tokens exist or not. We also guarantee that each DocumentStart _will_ have a DocumentEnd eventually, again regardless of whether such exists in the token stream. This isn't explicitly required by the YAML spec, but makes usage of the Parser more pleasant to callers, as all "indentation" events -- documents, sequences, mappings -- have a guaranteed start and end event, without the caller needing to infer this behavior from the stream itself. If the caller is interested, each DocumentStart and DocumentEnd event records whether it was implicit (missing from the byte stream), or not.	2021-12-29 15:07:26 +00:00
Paul Stemmet	d33c7daf4e	lib/event: add stream_start, stream_end, empty_scalar handlers	2021-12-29 15:07:26 +00:00
Paul Stemmet	36bf2fef52	lib/event: add state handler skeletons	2021-12-29 15:07:26 +00:00
Paul Stemmet	f086975a63	lib/event: add Parser, EventIter skeletons This commit defines the public API of this module: the Parser. Next steps are to finish out all of the todo! methods on the StateMachine branches.	2021-12-29 15:07:26 +00:00
Paul Stemmet	763400291e	event/macros: add peek!, pop!, state!, consume!, initEvent! These macros will be used in the module proper, when operating the event state machine.	2021-12-29 15:07:26 +00:00
Paul Stemmet	5ddbd93ce7	event/types: add Event, EventData and child structures The most notable of the types included in this commit is EventData. Its parent, Event, is a small wrapper with some additional stream information encoded -- the approximate start and end bytes covered. EventData has 10 variants: 1. StreamStart 2. StreamEnd 3. DocumentStart 4. DocumentEnd 5. Alias 6. Scalar 7. MappingStart 8. MappingEnd 9. SequenceStart 10. SequenceEnd Combined, they allow us to express a stream of YAML in an iterative event model, that should hopefully be easy (at least compared to YAML proper) to consume. Expressed in pseudo Backus-Naur, this is the expected form of any given event stream: === Event Stream === stream := StreamStart document+ StreamEnd document := DocumentStart content? DocumentEnd content := Scalar | collection collection := sequence | mapping sequence := SequenceStart node* SequenceEnd mapping := MappingStart (node node)* MappingEnd node := Alias | content === Syntax === ? => 0 or 1 of prefix * => 0 or more of prefix + => 1 or more of prefix () => production grouping | => production logical OR === End ===	2021-12-29 15:07:26 +00:00
Paul Stemmet	a21385e92c	event/error: add module Error, Result typedef Plus some From impls for Reader, and Scanner error types.	2021-12-29 15:07:26 +00:00
Paul Stemmet	0f6fb62cb7	event/state: add StateMachine, Flags This module describes the various states that we can reach in a YAML Token stream, and provides the machinery for manipulating it.	2021-12-29 15:07:26 +00:00
Paul Stemmet	19f294cb1c	lib/event: add module stub This module will house the first, lowest level public API of this library, eventually exposing a structure that allows callers to consume high level YAML 'Events', likely with an Iterator interface.	2021-12-29 15:07:26 +00:00
Paul Stemmet	49116317c1	reader/owned: add test_reader! tests	2021-12-29 15:07:26 +00:00
Paul Stemmet	e1fe33e202	reader/owned: add OwnedReader This is an implementation of "Stacked Borrows" wherein memory is allocated in chunks, and once a chunk is reached, a new chunk is allocated and the old one's stack state (cap,len,ptr) is moved into the tail.	2021-12-29 15:07:26 +00:00
Paul Stemmet	fb90078a3e	reader/borrow: add test_reader! tests	2021-12-29 15:07:26 +00:00
Paul Stemmet	e49c473604	reader/borrow: add BorrowReader Naive implementation of Read using an existing, borrowed UTF8 slice (&str).	2021-12-29 15:07:26 +00:00
Paul Stemmet	8fa290d374	reader/test_util: add test_reader! macro This macro generates a test suite with provided Read implementation, allowing me to quickly and uniformly test reader implementations	2021-12-29 15:07:26 +00:00
Paul Stemmet	43353dae19	lib/reader: add Reader, PeekReader structs These will be used by higher level APIs to drive the underlying Read implementation	2021-12-29 15:07:26 +00:00
Paul Stemmet	a08aab7992	lib/reader: add trait Read Any Read implementation must uphold the contract: (&'de self) -> Tokens<'de> That is, any borrows into the backing bytes given out must not be mutated in any way. For an existing borrow (e.g &str) this is trivially possible, however things get much more complicated when dealing with an owned source that might not be complete -- a `std::io::Read` object, for example. While we could simply read the entire thing first, and then borrow from the complete byte stream this is less than ideal, particularly for Serde implementations as an owned source will only provide a DeserializeOwned implementation, consequently copying data. It also makes stream processing YAML arbitrarily limited to the total size of the stream, rather than the actual data stored -- e.g: sum(SCALAR.len()) + count(SCALAR) -- which is a strong limitation, given YAML's natural stream processing capabilities. To overcome this limitation, I've decided to introduce a "Stacked Borrow" pattern with the use of a little unsafe. ``` ; A rust vector is just a capacity, length and ptr to somewhere in the ; heap VEC := (cap,len,ptr) ; Each OwnedReader keeps two VECs, one for bytes (u8) and another for ; VECs of bytes OwnedReader := { head: (cap, len, ptr) tail: (cap, len, ptr) } ; Demonstration of the various memory segments stored on the program's heap ; and how the OwnedReader's ptrs connect HEAP := { head.ptr->[u8..] tail.ptr->[VEC..] tail[0].ptr->[u8..] tail[n].ptr->[u8..] } ``` The OwnedReader makes a promise to NEVER call realloc on an existing heap segment; therefore any references given out to heap segments are immutable, fulfilling the contract required by the Parser (and Scanner). Instead, if/when more of the byte stream is requested, it will allocate a new .head and swap the old .head onto the .tail stack, thus keeping the memory live. Notably, this process hasn't described how to determine if any .tail segments are no longer needed and unload them. Mostly because I haven't figured that part out completely yet. Probably keeping track of the lowest borrowed segment somehow and running reconciliation periodically. But it _is_ possible using this strategy.	2021-12-29 15:07:26 +00:00
Paul Stemmet	a8a2aee615	scanner/entry: add .marker() method This allows users of TokenEntry(s) to have a quick, cheap method of ascertaining what the underlying Token is, even if the entry itself is deferred.	2021-12-29 15:07:26 +00:00
Paul Stemmet	bdbf510a24	lib/scanner: clippy lints from 1.56	2021-12-29 15:07:26 +00:00
Paul Stemmet	ce8b59b646	lib/scanner: add offset controls	2021-12-29 15:07:26 +00:00
Paul Stemmet	0343c29021	lib/{token,scanner/entry}: derive Clone on more structs	2021-12-29 15:07:26 +00:00
Paul Stemmet	91545f4c70	lib: fix visibility on Queue, Scanner, TokenEntry Each of these will likely appear in parts of the public API, even if they aren't directly used. It's likely these will be "public but unreachable" -- e.g a public type in a private module. This will likely be revisited on the way to a stable 1.0 library version, but works for now.	2021-12-29 15:07:26 +00:00
Paul Stemmet	ccc1bc16ab	license/mpl2 * LICENSE: MPL 2.0 * lib/*: add MPL 2.0 header to source code Cargo: license = "MPL-2.0"	2021-09-17 17:32:30 +01:00
Paul Stemmet	7d90804cc5	ci/github: add matrix targets for test_lazy	2021-09-17 17:03:13 +01:00
Paul Stemmet	c977815ddc	scalar/block: module documentation updates	2021-09-17 17:03:13 +01:00
Paul Stemmet	9ed1bcc00e	scalar/plain: fix subtle slice error in scan_plain_scalar_lazy When checking for a terminating sequence in plain scalars, we either need a flow indicator (in flow contexts only), or a ': ' byte sequence, where the space can be any valid YAML whitespace. The issue here is that the lazy variant was correctly identifying the terminating sequence, _but not recording it_ for the Deferred's slice. This commit fixes that, ensuring we always record the final 1 or 2 bytes before exiting the main loop.	2021-09-17 17:03:13 +01:00
Paul Stemmet	1cdad01126	scalar/flow: add unit test for escaped double quote	2021-09-17 17:03:13 +01:00
Paul Stemmet	0c38dda908	scalar/flow: fixes to scan_flow_scalar_lazy's chomping 1. Handle linebreaks separately from other characters (for stats) 2. Don't quit early on an escaped double quote (\")	2021-09-17 17:03:13 +01:00
Paul Stemmet	0a4a7930a5	scalar/*/tests: rename TEST_OPTS -> TEST_FLAGS to remain consistent with scanner/tests, also derive the base TEST_FLAGS from scanner/tests.TEST_FLAGS, minus options that do not make sense for the test battery (O_EXTENDABLE)	2021-09-17 17:03:13 +01:00
Paul Stemmet	8d01532b1f	Cargo: add feature.test_lazy For testing the Scanner with O_LAZY active	2021-09-17 17:03:13 +01:00
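
The event stream grammar quoted in commit 5ddbd93ce7 can be checked mechanically. The following is a minimal sketch, not code from the library: `Kind` is a hypothetical, payload-free stand-in for the ten EventData variants, and `is_well_formed` and `Open` are illustrative names. It walks the events once with an explicit stack rather than recursion, and follows the grammar as written, so an Alias is accepted inside a collection but not as a document's top-level content.

```rust
/// Hypothetical stand-in for the ten EventData variants; the real
/// variants presumably carry payloads, these are bare tags.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    StreamStart,
    StreamEnd,
    DocumentStart,
    DocumentEnd,
    Alias,
    Scalar,
    MappingStart,
    MappingEnd,
    SequenceStart,
    SequenceEnd,
}

/// A collection that has been started but not yet ended.
enum Open {
    Sequence,
    /// `nodes` counts direct children; it must be even at MappingEnd.
    Mapping { nodes: usize },
}

/// Returns true if `events` matches
/// `stream := StreamStart document+ StreamEnd`.
pub fn is_well_formed(events: &[Kind]) -> bool {
    use Kind::*;

    // Peel off the mandatory StreamStart / StreamEnd pair.
    let (first, rest) = match events.split_first() {
        Some(parts) => parts,
        None => return false,
    };
    let (last, body) = match rest.split_last() {
        Some(parts) => parts,
        None => return false,
    };
    if *first != StreamStart || *last != StreamEnd {
        return false;
    }

    let mut docs = 0usize; // completed documents
    let mut in_doc = false; // between DocumentStart and DocumentEnd
    let mut has_content = false; // top-level content already seen
    let mut stack: Vec<Open> = Vec::new();

    for &ev in body {
        match ev {
            StreamStart | StreamEnd => return false,
            DocumentStart => {
                if in_doc {
                    return false;
                }
                in_doc = true;
                has_content = false;
            }
            DocumentEnd => {
                if !in_doc || !stack.is_empty() {
                    return false;
                }
                in_doc = false;
                docs += 1;
            }
            Scalar | Alias | SequenceStart | MappingStart => {
                if !in_doc {
                    return false;
                }
                match stack.last_mut() {
                    Some(Open::Mapping { nodes }) => *nodes += 1,
                    Some(Open::Sequence) => {}
                    None => {
                        // document := DocumentStart content? DocumentEnd
                        // content excludes Alias and appears at most once.
                        if has_content || ev == Alias {
                            return false;
                        }
                        has_content = true;
                    }
                }
                if ev == SequenceStart {
                    stack.push(Open::Sequence);
                }
                if ev == MappingStart {
                    stack.push(Open::Mapping { nodes: 0 });
                }
            }
            SequenceEnd => match stack.pop() {
                Some(Open::Sequence) => {}
                _ => return false,
            },
            MappingEnd => match stack.pop() {
                Some(Open::Mapping { nodes }) if nodes % 2 == 0 => {}
                _ => return false,
            },
        }
    }

    !in_doc && docs >= 1
}
```

Calling `is_well_formed` on StreamStart, DocumentStart, DocumentEnd, StreamEnd returns true (an empty document is allowed by `content?`), while a stream with no document at all is rejected by `document+`.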
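
The head/tail layout described in commit a08aab7992 relies on one property: retiring a chunk onto the tail must not move that chunk's bytes. The sketch below illustrates only that property in safe Rust. It is an assumption-laden stand-in, not the crate's OwnedReader: `ChunkStack`, `push_chunk`, `head_ptr` and `tail_ptr` are hypothetical names, it uses `Box<[u8]>` instead of raw (cap, len, ptr) triples, and it hands out raw addresses for inspection rather than borrowed slices, which is the part the commit message says needs a little unsafe.

```rust
/// Hypothetical illustration of the head/tail chunk layout.
/// Not the library's OwnedReader.
pub struct ChunkStack {
    /// The chunk currently being filled or read.
    head: Box<[u8]>,
    /// Retired chunks, kept alive so earlier borrows would stay valid.
    tail: Vec<Box<[u8]>>,
}

impl ChunkStack {
    /// Start with a single chunk copied from `first`.
    pub fn new(first: &[u8]) -> Self {
        Self {
            head: first.into(),
            tail: Vec::new(),
        }
    }

    /// Install a freshly allocated head and retire the old one.
    /// Moving a Box moves only its pointer and length; the heap
    /// bytes it points at stay where they are.
    pub fn push_chunk(&mut self, bytes: &[u8]) {
        let old = std::mem::replace(&mut self.head, bytes.into());
        self.tail.push(old);
    }

    /// Heap address of the current head chunk.
    pub fn head_ptr(&self) -> *const u8 {
        self.head.as_ptr()
    }

    /// Heap address of the `index`-th retired chunk, if any.
    pub fn tail_ptr(&self, index: usize) -> Option<*const u8> {
        self.tail.get(index).map(|chunk| chunk.as_ptr())
    }

    /// Number of retired chunks.
    pub fn retired(&self) -> usize {
        self.tail.len()
    }
}
```

Recording `head_ptr()` before a `push_chunk` call and comparing it with `tail_ptr(0)` afterwards shows the same address, even after the tail vector itself has grown and reallocated its table of handles, because only the handles move.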

285 Commits, All Branches
