Bugfixes and documentation cleanup for v1.0 release.

2026-05-01 11:30:29 -05:00
parent c9e98e14b5
commit f20f06db12
17 changed files with 332 additions and 177 deletions

@@ -173,6 +173,14 @@ The current output format is JSON, specified with:
--output-file merged.json
```
+ The current named JSON schemas are:
+ - `seriatim-minimal`
+ - `seriatim-intermediate`
+ - `seriatim-full`
+ The current runtime default selection is `seriatim-intermediate`, but default selection may change over time. Consumers that depend on a specific schema should request it explicitly.
Future output formats may include:
- Markdown.
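
The advice above about requesting a schema explicitly can be sketched as a small wrapper that builds the output flags. This is a minimal illustration only: `output_args` and `KNOWN_SCHEMAS` are hypothetical names, and the only parts taken from the documentation are the three schema names and the `--output-schema` and `--output-file` flags.

```python
from typing import List

# Schema names as listed in the documentation above.
KNOWN_SCHEMAS = ("seriatim-minimal", "seriatim-intermediate", "seriatim-full")


def output_args(schema: str, output_file: str) -> List[str]:
    """Build output flags with an explicit schema (hypothetical helper).

    Rejecting anything outside the named schemas keeps a caller from
    silently depending on whatever the runtime default happens to be.
    """
    if schema not in KNOWN_SCHEMAS:
        raise ValueError(
            f"unknown schema {schema!r}; expected one of {KNOWN_SCHEMAS}"
        )
    return ["--output-schema", schema, "--output-file", output_file]
```

A caller would append the returned list to the rest of its `seriatim merge` arguments, so a later change to the default selection does not alter the output it receives.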
@@ -216,7 +224,7 @@ seriatim merge \
--preprocessing-modules validate-raw,normalize-speakers,trim-text \
--postprocessing-modules detect-overlaps,resolve-overlaps,backchannel,filler,resolve-danglers,coalesce,detect-overlaps,autocorrect,assign-ids,validate-output \
--output-modules json \
- --output-schema default \
+ --output-schema seriatim-intermediate \
--output-file merged.json \
--report-file report.json
```
@@ -358,7 +366,7 @@ Initial classifications may include:
- `backchannel`
- `crosstalk`
- The `resolve-overlaps` module uses preserved word-level timing to replace detected overlap-group segments with smaller word-run segments when usable timing is available. Resolution expands each overlap window by the configured coalesce gap so nearby same-speaker context can be absorbed into the replacement runs. Groups without usable word timing remain unresolved for later passes or human review.
+ The `resolve-overlaps` module uses preserved word-level timing to replace detected overlap-group segments with smaller word-run segments when usable timing is available. Resolution expands each overlap window by the configured coalesce gap so nearby same-speaker context can be absorbed into the replacement runs. Once a segment is selected for replacement, all timed words from that segment participate in word-run construction so text is not clipped at the window boundary. Groups without usable word timing remain unresolved for later passes or human review.
Overlap resolution should be non-destructive. Original segment text, timing, and source metadata must remain recoverable.
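
The resolution behavior described above might look roughly like the following sketch. It is not the project's implementation: the `Word` and `Segment` types, the `resolve_group` function, and the definition of a "word run" as consecutive time-ordered words from one speaker are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Word:
    text: str
    start: float
    end: float


@dataclass
class Segment:
    speaker: str
    start: float
    end: float
    text: str
    words: List[Word] = field(default_factory=list)


def resolve_group(
    segments: List[Segment],
    window_start: float,
    window_end: float,
    coalesce_gap: float,
) -> Optional[List[Segment]]:
    """Return replacement word-run segments, or None if unresolved."""
    # Expand the overlap window by the coalesce gap on both sides.
    lo = window_start - coalesce_gap
    hi = window_end + coalesce_gap

    # Select every segment that intersects the expanded window.
    selected = [s for s in segments if s.start < hi and s.end > lo]

    # Without usable word timing, leave the group for later passes.
    if not selected or any(not s.words for s in selected):
        return None

    # All timed words of a selected segment participate, including
    # words outside the window, so text is not clipped at the boundary.
    timed = sorted(
        ((w, s.speaker) for s in selected for w in s.words),
        key=lambda pair: (pair[0].start, pair[0].end),
    )

    # Group consecutive same-speaker words into runs (assumed definition).
    runs: List[Segment] = []
    for word, speaker in timed:
        if runs and runs[-1].speaker == speaker:
            run = runs[-1]
            run.words.append(word)
            run.end = max(run.end, word.end)
            run.text += " " + word.text
        else:
            runs.append(Segment(speaker, word.start, word.end, word.text, [word]))
    return runs
```

Because the function builds new `Segment` objects and never mutates its inputs, the original segments stay available to the caller, which is one simple way to keep resolution non-destructive.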