Cleaned up documentation and development artifacts in advance of release

2026-04-27 21:48:04 -05:00
parent 6cb739be55
commit 28c2eea340
15 changed files with 336 additions and 92 deletions

@@ -4,7 +4,7 @@
The initial use case is merging independently transcribed speaker audio tracks from the same recorded session, such as a weekly tabletop RPG session. The architecture should also support meetings, podcasts, interviews, and other multi-speaker events.
`seriatim` will be implemented in Go.
`seriatim` is implemented in Go.
## Goals
@@ -21,19 +21,19 @@ The initial use case is merging independently transcribed speaker audio tracks f
9. Emit one or more output artifacts through output writers.
10. Produce report data for validation findings, corrections, and transformations.
## Non-goals for the MVP
## Non-goals
The MVP should not attempt to:
The 1.0 release does not attempt to:
- Perform transcription.
- Perform audio diarization.
- Use an LLM.
- Summarize transcript content.
- Infer speaker identity from audio or text.
- Fully resolve crosstalk.
- Fully resolve every crosstalk case.
- Load arbitrary third-party code as dynamic plugins.
The MVP should support runtime composition of built-in modules by canonical module name. Arbitrary external plugin loading can be considered later.
The application supports runtime composition of built-in modules by canonical module name. Arbitrary external plugin loading can be considered later.
## Core Assumption
@@ -78,7 +78,7 @@ The configuration stage produces an application config value that is passed thro
The input stage converts external inputs into raw transcript documents with source metadata.
The MVP input method is one or more JSON files passed with repeated `--input-file` flags:
The current input method is one or more JSON files passed with repeated `--input-file` flags:
```text
seriatim merge --input-file eric.json --input-file mike.json --output-file merged.json
@@ -107,7 +107,7 @@ Preprocessing starts with raw transcript documents from input readers and must e
Preprocessing modules are selected at runtime with a comma-separated list of canonical module names:
```text
--preprocessing-modules validate-raw,normalize-speakers,trim-text,autocorrect
--preprocessing-modules validate-raw,normalize-speakers,trim-text
```
Modules run in the exact order provided. Unknown module names are configuration errors.
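The ordering and unknown-name rules above can be sketched as a small parser. This is an illustration only, not the project's actual code: `ParseModuleList` and `knownPreprocessing` are hypothetical names, the known-name set is copied from the example flag value, and trimming whitespace around names is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// knownPreprocessing is a hypothetical set of built-in module names taken
// from the example flag value; the real registry may contain others.
var knownPreprocessing = map[string]bool{
	"validate-raw":       true,
	"normalize-speakers": true,
	"trim-text":          true,
}

// ParseModuleList splits a comma-separated flag value into module names.
// It preserves the order given (including repeats) and returns an error
// for empty or unknown names so they surface as configuration errors.
func ParseModuleList(value string, known map[string]bool) ([]string, error) {
	if strings.TrimSpace(value) == "" {
		return nil, nil // no modules requested
	}
	var names []string
	for _, part := range strings.Split(value, ",") {
		name := strings.TrimSpace(part) // assumption: surrounding spaces are ignored
		if name == "" {
			return nil, fmt.Errorf("empty module name in %q", value)
		}
		if !known[name] {
			return nil, fmt.Errorf("unknown module %q", name)
		}
		names = append(names, name)
	}
	return names, nil
}

func main() {
	names, err := ParseModuleList("validate-raw,normalize-speakers,trim-text", knownPreprocessing)
	fmt.Println(names, err)

	// A name that is not registered for this stage is rejected.
	_, err = ParseModuleList("validate-raw,autocorrect", knownPreprocessing)
	fmt.Println(err)
}
```

Returning a slice rather than a set keeps the caller's order and allows a name to appear more than once in a list.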
@@ -120,7 +120,6 @@ Potential preprocessing modules include:
- Speaker name normalization based on input filename.
- Timing validation and deterministic correction.
- Text trimming.
- Word replacement from `autocorrect.yml`.
Preprocessing should not depend on global chronological ordering across speakers. Modules that need the globally merged transcript belong in postprocessing.
@@ -147,7 +146,7 @@ The postprocessing stage applies zero or more modules to the merged transcript.
Postprocessing modules are selected at runtime with a comma-separated list of canonical module names:
```text
--postprocessing-modules detect-overlaps,resolve-overlaps,coalesce,assign-ids,validate-output
--postprocessing-modules detect-overlaps,resolve-overlaps,backchannel,filler,coalesce,detect-overlaps,autocorrect,assign-ids,validate-output
```
Modules run in the exact order provided. Unknown module names are configuration errors.
@@ -168,7 +167,7 @@ Any module that can reorder, split, merge, drop, or create segments must run bef
The output stage emits one or more artifacts from the final transcript and report model.
The MVP output format is JSON, specified with:
The current output format is JSON, specified with:
```text
--output-file merged.json
@@ -202,7 +201,7 @@ This classification should guide Go interfaces and package boundaries. It should
## Runtime Module Composition
The MVP should support runtime composition of built-in modules.
The application supports runtime composition of built-in modules.
Module names are canonical strings registered at startup. CLI flags refer to those names. The configuration stage resolves names into module instances before the pipeline runs.
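One way to picture name registration and resolution is a map from canonical names to factories. The sketch below is a hypothetical minimal design, not the actual `internal/pipeline` API: `Module`, `Factory`, `Registry`, and `named` are invented for illustration, and a real module interface would also accept and return transcript data.

```go
package main

import "fmt"

// Module is a hypothetical minimal interface used only for this sketch.
type Module interface {
	Name() string
}

// Factory builds a fresh module instance.
type Factory func() Module

// Registry maps canonical module names to factories.
type Registry struct {
	factories map[string]Factory
}

func NewRegistry() *Registry {
	return &Registry{factories: make(map[string]Factory)}
}

// Register adds a module under its canonical name at startup.
func (r *Registry) Register(name string, f Factory) error {
	if _, exists := r.factories[name]; exists {
		return fmt.Errorf("module %q already registered", name)
	}
	r.factories[name] = f
	return nil
}

// Resolve turns an ordered list of names into module instances before the
// pipeline runs. An unknown name fails the whole resolution.
func (r *Registry) Resolve(names []string) ([]Module, error) {
	modules := make([]Module, 0, len(names))
	for _, name := range names {
		f, ok := r.factories[name]
		if !ok {
			return nil, fmt.Errorf("unknown module %q", name)
		}
		modules = append(modules, f())
	}
	return modules, nil
}

// named is a placeholder module that only knows its own name.
type named string

func (n named) Name() string { return string(n) }

func main() {
	reg := NewRegistry()
	for _, n := range []string{"detect-overlaps", "assign-ids"} {
		n := n // capture a per-iteration copy for the closure
		if err := reg.Register(n, func() Module { return named(n) }); err != nil {
			panic(err)
		}
	}
	mods, err := reg.Resolve([]string{"detect-overlaps", "assign-ids", "detect-overlaps"})
	fmt.Println(len(mods), err)
}
```

Resolving every name up front means a misspelled module is reported before any transcript data is read.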
@@ -214,9 +213,9 @@ seriatim merge \
--input-file mike.json \
--speakers speakers.yml \
--autocorrect autocorrect.yml \
--preprocessing-modules validate-raw,normalize-speakers,trim-text,autocorrect \
--postprocessing-modules detect-overlaps,resolve-overlaps,coalesce,assign-ids,validate-output \
--output-module json \
--preprocessing-modules validate-raw,normalize-speakers,trim-text \
--postprocessing-modules detect-overlaps,resolve-overlaps,backchannel,filler,coalesce,detect-overlaps,autocorrect,assign-ids,validate-output \
--output-modules json \
--output-file merged.json \
--report-file report.json
```
@@ -358,7 +357,7 @@ Initial classifications may include:
- `backchannel`
- `crosstalk`
The MVP `resolve-overlaps` module may be a stub that marks groups as unresolved. This preserves the architecture for future word-level crosstalk serialization without complicating the initial implementation.
The `resolve-overlaps` module uses preserved word-level timing to replace detected overlap-group segments with smaller word-run segments when usable timing is available. Groups without usable word timing remain unresolved for later passes or human review.
Overlap resolution should be non-destructive. Original segment text, timing, and source metadata must remain recoverable.
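As a rough illustration of the word-run idea, the sketch below interleaves timed words from an overlap group by start time and groups consecutive words from the same speaker into runs. It is a simplified stand-in, not the algorithm `resolve-overlaps` actually uses: the `Word` and `Run` types and the `WordRuns` function are hypothetical, and each run keeps its original words so the source text and timing stay recoverable.

```go
package main

import (
	"fmt"
	"sort"
)

// Word is a hypothetical timed word carried over from a source segment.
type Word struct {
	Speaker string
	Text    string
	Start   float64 // seconds
	End     float64 // seconds
}

// Run is a contiguous stretch of words from one speaker.
type Run struct {
	Speaker string
	Text    string
	Start   float64
	End     float64
	Words   []Word // original words, kept so nothing is lost
}

// WordRuns orders words by start time and merges consecutive words from
// the same speaker into runs. The input slice is not modified.
func WordRuns(words []Word) []Run {
	sorted := make([]Word, len(words))
	copy(sorted, words)
	// Stable sort keeps input order for words with equal start times.
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Start < sorted[j].Start
	})

	var runs []Run
	for _, w := range sorted {
		if n := len(runs); n > 0 && runs[n-1].Speaker == w.Speaker {
			last := &runs[n-1]
			last.Text += " " + w.Text
			if w.End > last.End {
				last.End = w.End
			}
			last.Words = append(last.Words, w)
			continue
		}
		runs = append(runs, Run{
			Speaker: w.Speaker,
			Text:    w.Text,
			Start:   w.Start,
			End:     w.End,
			Words:   []Word{w},
		})
	}
	return runs
}

func main() {
	runs := WordRuns([]Word{
		{Speaker: "mike", Text: "yeah", Start: 0.6, End: 0.8},
		{Speaker: "eric", Text: "I", Start: 0.0, End: 0.2},
		{Speaker: "eric", Text: "think", Start: 0.2, End: 0.5},
		{Speaker: "eric", Text: "so", Start: 0.9, End: 1.0},
	})
	for _, r := range runs {
		fmt.Printf("%s [%.1f-%.1f] %s\n", r.Speaker, r.Start, r.End, r.Text)
	}
}
```

A group whose words lack usable timing would simply not be passed through a function like this and would stay marked as unresolved.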
@@ -403,31 +402,28 @@ To support this:
- Record application version in output metadata.
- Record enabled module names and module order in output metadata or report data.
## Suggested Go Package Layout
## Go Package Layout
```text
cmd/seriatim/ CLI entrypoint
internal/config/ CLI/env/config loading and validation
internal/pipeline/ Pipeline orchestration and module registry
internal/input/ Input readers
internal/raw/ Raw transcript structs
internal/schema/ Schema loading and validation helpers
internal/builtin/ Built-in pipeline modules
internal/artifact/ Conversion from internal model to public output schema
internal/buildinfo/ Build-time version metadata
internal/speaker/ Speaker map parsing and lookup
internal/model/ Canonical and merged transcript models
internal/preprocess/ Preprocessing modules
internal/merge/ Deterministic merge logic
internal/postprocess/ Postprocessing modules
internal/overlap/ Overlap detection and refinement helpers
internal/autocorrect/ Word replacement rules
internal/report/ Report model and event accumulation
internal/output/ Output writers
schema/ Public output contract and JSON Schema validation
```
Package boundaries should follow data ownership. Shared models belong in `internal/model`; stage-specific behavior belongs in the relevant stage package.
## MVP Defaults
## Default Modules
The MVP should define documented defaults equivalent to explicit module lists.
The default pipeline is equivalent to explicit module lists.
Recommended default preprocessing modules:
@@ -438,7 +434,11 @@ validate-raw,normalize-speakers,trim-text
Recommended default postprocessing modules:
```text
detect-overlaps,resolve-overlaps,assign-ids,validate-output
detect-overlaps,resolve-overlaps,backchannel,filler,coalesce,detect-overlaps,autocorrect,assign-ids,validate-output
```
Optional modules such as `autocorrect` and `coalesce` should be opt-in until their behavior is thoroughly specified and tested.
The default output module is:
```text
json
```
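The equivalence between the defaults and explicit module lists can be sketched with flag defaults. This is a minimal illustration using the standard library `flag` package, which may not be what `seriatim` uses: `Options`, `ParseOptions`, and the constant names are hypothetical, and only the flag names and default values come from this document.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// Default module lists as documented above.
const (
	defaultPreprocessing  = "validate-raw,normalize-speakers,trim-text"
	defaultPostprocessing = "detect-overlaps,resolve-overlaps,backchannel,filler,coalesce,detect-overlaps,autocorrect,assign-ids,validate-output"
	defaultOutput         = "json"
)

// Options holds the raw module-list flag values (hypothetical type).
type Options struct {
	Preprocessing  string
	Postprocessing string
	Output         string
}

// ParseOptions parses module flags. A flag that is omitted takes the
// documented default, so running with no module flags behaves the same
// as passing the default lists explicitly.
func ParseOptions(args []string) (Options, error) {
	var opts Options
	fs := flag.NewFlagSet("merge", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the sketch's output
	fs.StringVar(&opts.Preprocessing, "preprocessing-modules", defaultPreprocessing, "ordered preprocessing modules")
	fs.StringVar(&opts.Postprocessing, "postprocessing-modules", defaultPostprocessing, "ordered postprocessing modules")
	fs.StringVar(&opts.Output, "output-modules", defaultOutput, "output modules")
	if err := fs.Parse(args); err != nil {
		return Options{}, err
	}
	return opts, nil
}

func main() {
	implicit, _ := ParseOptions(nil)
	explicit, _ := ParseOptions([]string{
		"--preprocessing-modules", defaultPreprocessing,
		"--postprocessing-modules", defaultPostprocessing,
		"--output-modules", defaultOutput,
	})
	fmt.Println(implicit == explicit)
}
```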