Compare commits

7 Commits

SHA1 Message Date
e6d3b4a46e Harden trim integration (ci/woodpecker/tag/release: pipeline successful) 2026-05-08 15:00:46 +00:00
54f7717de8 Document trim command 2026-05-08 14:57:52 +00:00
c48b02d2ec Add trim report output 2026-05-08 14:56:24 +00:00
ac3dcf2557 Add trim CLI command 2026-05-08 14:53:59 +00:00
1c0e4438ae Recompute overlap groups during trim 2026-05-08 14:47:52 +00:00
52f7729100 Add artifact trim transformation 2026-05-08 14:44:31 +00:00
2c82f8bf5c Add trim selector parsing 2026-05-08 14:41:47 +00:00
13 changed files with 3075 additions and 3 deletions


@@ -1,8 +1,8 @@
# seriatim
`seriatim` merges per-speaker WhisperX-style JSON transcripts into a single JSON transcript that preserves speaker identity and chronological order. It also trims existing seriatim output artifacts by segment ID.
The current implementation supports the `merge` and `trim` commands. `merge` reads one or more input JSON files, optionally maps each input file to a canonical speaker using `speakers.yml`, sorts all segments by timestamp, detects and resolves overlaps when word-level timing is available, assigns consecutive numeric `id` values, and writes a merged JSON artifact. `trim` reads an existing seriatim output artifact and projects it to a retained segment subset.
## Usage
@@ -25,10 +25,20 @@ go run ./cmd/seriatim merge \
  --report-file report.json
```
Trim an existing seriatim artifact:
```sh
go run ./cmd/seriatim trim \
--input-file merged.json \
--output-file trimmed.json \
--keep "1-10, 15, 20-25"
```
## CLI
```text
seriatim merge [flags]
seriatim trim [flags]
```
Global flags:
@@ -54,6 +64,50 @@ Global flags:
| `--postprocessing-modules` | No | `detect-overlaps,resolve-overlaps,backchannel,filler,resolve-danglers,coalesce,detect-overlaps,autocorrect,assign-ids,validate-output` | Comma-separated postprocessing modules, evaluated in order. |
| `--coalesce-gap` | No | `3.0` | Maximum same-speaker gap in seconds for `coalesce`; also used as the `resolve-overlaps` context window. Must be a non-negative float. |
`trim` flags:
| Flag | Required | Default | Description |
| --- | --- | --- | --- |
| `--input-file` | Yes | none | Input seriatim output artifact JSON file. |
| `--output-file` | Yes | none | Trimmed transcript JSON output path. |
| `--keep` | Exactly one of `--keep` or `--remove` is required | none | Segment ID selector to retain. |
| `--remove` | Exactly one of `--keep` or `--remove` is required | none | Segment ID selector to drop. |
| `--output-schema` | No | preserve input artifact schema | Optional output schema override: `seriatim-minimal`, `seriatim-intermediate`, or `seriatim-full`. |
| `--report-file` | No | none | Optional report JSON output path. |
| `--allow-empty` | No | `false` | Allow trimming to zero retained segments. |
`trim` selection rules:
- `--keep` and `--remove` are mutually exclusive.
- Exactly one of `--keep` or `--remove` is required.
- Selection is by segment ID only.
- Invalid selected segment IDs fail the command by default.
`trim` selector syntax:
- Segment IDs are positive 1-based integers.
- Inclusive ranges are supported: `1-10`.
- Comma-separated selectors are supported: `1-10,15,20-25`.
- Whitespace around numbers, commas, and hyphens is allowed: `1 - 10, 15, 20 - 25`.
- Duplicate and overlapping ranges are accepted and normalized as a union.
- Descending ranges (for example `10-1`) are rejected.
`trim` behavior:
- `trim` consumes existing seriatim JSON output artifacts only.
- `trim` does not accept raw WhisperX transcript JSON as input.
- Retained output segment IDs are renumbered sequentially from `1` to `N`.
- Output preserves the input transcript order; selector order does not reorder segments.
- When output schema is `seriatim-full`, overlap groups are recomputed from retained segments.
- `--output-schema seriatim-full` is supported when trim has full-schema artifact data to emit; trim does not synthesize missing full-schema provenance from minimal/intermediate input artifacts.
- `trim` does not run merge postprocessors such as `resolve-overlaps`, `coalesce`, or `autocorrect`.
`trim` report output:
- When `--report-file` is provided, the report includes standard trim/validation/output events.
- The report includes a `trim-audit` event containing trim operation metadata, including selected IDs, retained/removed counts, removed IDs, and old-to-new segment ID mapping.
- Old-to-new ID mapping is emitted as a deterministic ordered array of `{old_id, new_id}` pairs.
Environment variables:
| Environment Variable | Default | Description |


@@ -1,6 +1,9 @@
# seriatim Architecture
`seriatim` is a deterministic transcript utility for:
- merging multiple per-speaker transcript inputs into a single chronologically ordered diarized transcript, and
- projecting existing seriatim transcript artifacts through deterministic segment-ID trimming.
The initial use case is merging independently transcribed speaker audio tracks from the same recorded session, such as a weekly tabletop RPG session. The architecture should also support meetings, podcasts, interviews, and other multi-speaker events.
@@ -20,6 +23,7 @@ The initial use case is merging independently transcribed speaker audio tracks f
8. Detect and annotate overlapping speech regions.
9. Emit one or more output artifacts through output writers.
10. Produce report data for validation findings, corrections, and transformations.
11. Support artifact-level transcript projection commands that operate on existing seriatim output.
## Non-goals
@@ -56,6 +60,8 @@ configuration check
Each stage has an explicit data contract. Input and output stages perform I/O. Processing stages should be deterministic transformations over in-memory models and should record report events for validation findings, corrections, and transformations.
`merge` runs this pipeline. `trim` is intentionally separate from this pipeline and operates at the artifact layer.
## Stage Contracts
### 1. Configuration Check
@@ -191,6 +197,23 @@ Future output formats may include:
Output writers should be selected from an explicit registry and should consume the final transcript model read-only. Multiple output writers may run for a single invocation.
### 7. Artifact Projection Stage (`trim` command)
`trim` is an artifact-level command that reads an existing seriatim output artifact and emits a projected artifact containing a segment-ID subset.
Design constraints:
- `trim` runs after `merge`, not as a merge postprocessor.
- `trim` validates the input artifact against supported seriatim output schemas.
- `trim` performs deterministic keep/remove selection by segment ID.
- `trim` renumbers retained IDs to `1..N` in transcript order.
- `trim` validates the final output against the selected output schema before writing.
- `trim` records audit metadata in report output.
`trim` is intentionally separate from merge postprocessing because it consumes already-emitted public artifacts. This separation keeps merge semantics stable and avoids rerunning merge-only transforms on projected artifacts.
`trim` must not rerun merge postprocessors such as `resolve-overlaps`, `coalesce`, or `autocorrect`.
## Module Classification
Modules should be classified by their contract and allowed effects.
@@ -397,6 +420,8 @@ A valid merged transcript should satisfy:
- Every referenced segment exists.
- Output validates against the selected output schema.
For full-schema trim output, overlap groups are recomputed from retained segments so overlap annotations and group references remain internally consistent after projection.
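One plausible shape for that recomputation is an interval sweep over retained segments in transcript order. This sketch assumes segments are ordered by start time; the `seg` type and `recomputeOverlapGroups` function are hypothetical, not the actual implementation:

```go
package main

import "fmt"

// seg is a minimal stand-in carrying only the timing needed for grouping.
type seg struct {
	ID         int
	Start, End float64
}

// recomputeOverlapGroups walks segments in transcript order and assigns
// a shared group ID to each maximal run of time-overlapping segments.
// The result maps segment ID -> overlap group ID (ungrouped IDs are absent).
func recomputeOverlapGroups(segs []seg) map[int]int {
	groups := map[int]int{}
	nextGroup := 1
	for i := 0; i < len(segs); {
		j := i
		end := segs[i].End
		// Extend the run while the next segment starts before the run ends.
		for j+1 < len(segs) && segs[j+1].Start < end {
			j++
			if segs[j].End > end {
				end = segs[j].End
			}
		}
		if j > i { // at least two segments overlap
			for k := i; k <= j; k++ {
				groups[segs[k].ID] = nextGroup
			}
			nextGroup++
		}
		i = j + 1
	}
	return groups
}

func main() {
	segs := []seg{{1, 0, 2}, {2, 1.5, 3}, {3, 5, 6}}
	fmt.Println(recomputeOverlapGroups(segs)) // map[1:1 2:1]
}
```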
## Determinism Requirements
Given the same inputs, config, and application version, `seriatim` should produce byte-stable JSON output where practical.
@@ -411,6 +436,12 @@ To support this:
- Record application version in output metadata.
- Record enabled module names and module order in output metadata or report data.
Trim-specific determinism requirements:
- Selector normalization and retained IDs are deterministic.
- Old-to-new ID mapping in trim reports is emitted in deterministic order.
- Full-schema overlap recomputation is deterministic for the same input artifact and selector.
## Go Package Layout
```text
@@ -419,6 +450,7 @@ internal/config/ CLI/env/config loading and validation
internal/pipeline/ Pipeline orchestration and module registry
internal/builtin/ Built-in pipeline modules
internal/artifact/ Conversion from internal model to public output schema
internal/trim/ Artifact parsing, trim selection, schema conversion, overlap recomputation for full schema
internal/buildinfo/ Build-time version metadata
internal/speaker/ Speaker map parsing and lookup
internal/model/ Canonical and merged transcript models
@@ -430,6 +462,12 @@ schema/ Public output contract and JSON Schema validation
Package boundaries should follow data ownership. Shared models belong in `internal/model`; stage-specific behavior belongs in the relevant stage package.
For trim:
- `internal/trim` contains pure transformation logic over artifact structs.
- CLI command code handles only flag parsing, file I/O, and report emission.
- Transform logic is deterministic and pure; all I/O stays in the command layer.
## Default Modules
The default pipeline is equivalent to explicit module lists.


@@ -17,5 +17,6 @@ func NewRootCommand() *cobra.Command {
}
cmd.AddCommand(newMergeCommand())
cmd.AddCommand(newTrimCommand())
return cmd
}

internal/cli/trim.go (new file, 191 lines)

@@ -0,0 +1,191 @@
package cli
import (
"encoding/json"
"fmt"
"os"
"sort"
"github.com/spf13/cobra"
"gitea.maximumdirect.net/eric/seriatim/internal/config"
"gitea.maximumdirect.net/eric/seriatim/internal/report"
triminternal "gitea.maximumdirect.net/eric/seriatim/internal/trim"
)
type trimAuditReport struct {
Operation string `json:"operation"`
InputFile string `json:"input_file"`
OutputFile string `json:"output_file"`
InputSchema string `json:"input_schema"`
OutputSchema string `json:"output_schema"`
Mode string `json:"mode"`
Selector string `json:"selector"`
SelectedIDs []int `json:"selected_ids"`
AllowEmpty bool `json:"allow_empty"`
InputSegmentCount int `json:"input_segment_count"`
RetainedSegmentCount int `json:"retained_segment_count"`
RemovedSegmentCount int `json:"removed_segment_count"`
RemovedInputIDs []int `json:"removed_input_ids"`
OldToNewIDMapping []trimIDMapping `json:"old_to_new_id_mapping"`
OverlapGroupsRecomputed bool `json:"overlap_groups_recomputed"`
}
type trimIDMapping struct {
OldID int `json:"old_id"`
NewID int `json:"new_id"`
}
func newTrimCommand() *cobra.Command {
var opts config.TrimOptions
cmd := &cobra.Command{
Use: "trim",
Short: "Trim an existing seriatim transcript artifact by segment ID",
RunE: func(cmd *cobra.Command, args []string) error {
trimOpts := opts
if !cmd.Flags().Changed("output-schema") {
trimOpts.OutputSchema = ""
}
cfg, err := config.NewTrimConfig(trimOpts)
if err != nil {
return err
}
selector, err := triminternal.ParseSelector(cfg.Selector)
if err != nil {
return fmt.Errorf("invalid selector %q: %w", cfg.Selector, err)
}
data, err := os.ReadFile(cfg.InputFile)
if err != nil {
return fmt.Errorf("read --input-file %q: %w", cfg.InputFile, err)
}
artifact, err := triminternal.ParseArtifactJSON(data)
if err != nil {
return fmt.Errorf("--input-file %q: %w", cfg.InputFile, err)
}
inputSegmentCount := artifact.SegmentCount()
inputSchema := artifact.Schema
mode := triminternal.ModeKeep
if cfg.Mode == "remove" {
mode = triminternal.ModeRemove
}
trimmed, err := triminternal.ApplyArtifact(artifact, triminternal.Options{
Mode: mode,
Selector: selector,
AllowEmpty: cfg.AllowEmpty,
})
if err != nil {
return err
}
outputSchema := artifact.Schema
if cfg.OutputSchema != "" {
outputSchema = cfg.OutputSchema
}
outputArtifact, err := triminternal.ConvertArtifact(trimmed.Artifact, outputSchema)
if err != nil {
return err
}
if err := triminternal.ValidateArtifact(outputArtifact); err != nil {
return fmt.Errorf("validate trimmed output: %w", err)
}
if err := writeOutputJSON(cfg.OutputFile, outputArtifact.Value()); err != nil {
return err
}
if cfg.ReportFile != "" {
audit := trimAuditReport{
Operation: "trim",
InputFile: cfg.InputFile,
OutputFile: cfg.OutputFile,
InputSchema: inputSchema,
OutputSchema: outputArtifact.Schema,
Mode: cfg.Mode,
Selector: cfg.Selector,
SelectedIDs: selector.IDs(),
AllowEmpty: cfg.AllowEmpty,
InputSegmentCount: inputSegmentCount,
RetainedSegmentCount: len(trimmed.OldToNewID),
RemovedSegmentCount: len(trimmed.RemovedIDs),
RemovedInputIDs: append([]int(nil), trimmed.RemovedIDs...),
OldToNewIDMapping: orderedIDMapping(trimmed.OldToNewID),
OverlapGroupsRecomputed: trimmed.OverlapGroupsRecomputed,
}
auditJSON, err := json.Marshal(audit)
if err != nil {
return fmt.Errorf("marshal trim audit report: %w", err)
}
rpt := report.Report{
Metadata: report.Metadata{
Application: outputArtifact.Application(),
Version: outputArtifact.Version(),
InputReader: "trim-artifact",
InputFiles: []string{cfg.InputFile},
OutputModules: []string{"json"},
},
Events: []report.Event{
report.Info("trim", "trim", fmt.Sprintf("trimmed %d input segment(s) into %d output segment(s) with mode=%s", inputSegmentCount, outputArtifact.SegmentCount(), cfg.Mode)),
report.Info("trim", "trim-audit", string(auditJSON)),
report.Info("trim", "validate-output", fmt.Sprintf("validated %d output segment(s)", outputArtifact.SegmentCount())),
report.Info("output", "json", "wrote transcript JSON"),
},
}
if err := report.WriteJSON(cfg.ReportFile, rpt); err != nil {
return err
}
}
return nil
},
}
flags := cmd.Flags()
flags.StringVar(&opts.InputFile, "input-file", "", "input seriatim transcript artifact JSON file")
flags.StringVar(&opts.OutputFile, "output-file", "", "output transcript JSON file")
flags.StringVar(&opts.ReportFile, "report-file", "", "optional report JSON file")
flags.StringVar(&opts.Keep, "keep", "", "segment ID selector to keep (for example: 1-10,15)")
flags.StringVar(&opts.Remove, "remove", "", "segment ID selector to remove (for example: 1-10,15)")
flags.StringVar(&opts.OutputSchema, "output-schema", "", "optional output JSON schema override: seriatim-minimal, seriatim-intermediate, or seriatim-full")
flags.BoolVar(&opts.AllowEmpty, "allow-empty", false, "allow trimming to an empty transcript")
return cmd
}
func writeOutputJSON(path string, value any) error {
file, err := os.Create(path)
if err != nil {
return err
}
defer file.Close()
enc := json.NewEncoder(file)
enc.SetIndent("", " ")
return enc.Encode(value)
}
func orderedIDMapping(mapping map[int]int) []trimIDMapping {
keys := make([]int, 0, len(mapping))
for oldID := range mapping {
keys = append(keys, oldID)
}
sort.Ints(keys)
pairs := make([]trimIDMapping, 0, len(keys))
for _, oldID := range keys {
pairs = append(pairs, trimIDMapping{
OldID: oldID,
NewID: mapping[oldID],
})
}
return pairs
}

internal/cli/trim_test.go (new file, 758 lines)

@@ -0,0 +1,758 @@
package cli
import (
"encoding/json"
"os"
"path/filepath"
"strings"
"testing"
"gitea.maximumdirect.net/eric/seriatim/internal/config"
"gitea.maximumdirect.net/eric/seriatim/internal/report"
"gitea.maximumdirect.net/eric/seriatim/schema"
)
func TestTrimKeepModeEndToEnd(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "2,4",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.Transcript
readJSON(t, output, &transcript)
if len(transcript.Segments) != 2 {
t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
}
if transcript.Segments[0].Text != "two" || transcript.Segments[1].Text != "four" {
t.Fatalf("unexpected kept text order: %#v", transcript.Segments)
}
assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
}
func TestTrimRemoveModeEndToEnd(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--remove", "2,4",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.Transcript
readJSON(t, output, &transcript)
if len(transcript.Segments) != 2 {
t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
}
if transcript.Segments[0].Text != "one" || transcript.Segments[1].Text != "three" {
t.Fatalf("unexpected remaining text order: %#v", transcript.Segments)
}
assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
}
func TestTrimMutualExclusionFailure(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1",
"--remove", "2",
)
if err == nil {
t.Fatal("expected mutual exclusion error")
}
if !strings.Contains(err.Error(), "mutually exclusive") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimMissingSelectionFailure(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
)
if err == nil {
t.Fatal("expected selection flag error")
}
if !strings.Contains(err.Error(), "exactly one of --keep or --remove is required") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimInvalidSelectedIDFailure(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "99",
)
if err == nil {
t.Fatal("expected missing selected ID error")
}
if !strings.Contains(err.Error(), "does not exist") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimOmittedOutputSchemaPreservesInputSchema(t *testing.T) {
dir := t.TempDir()
input := writeTrimMinimalFixture(t, dir, "input-minimal.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.MinimalTranscript
readJSON(t, output, &transcript)
if transcript.Metadata.OutputSchema != config.OutputSchemaMinimal {
t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaMinimal)
}
if len(transcript.Segments) != 1 || transcript.Segments[0].ID != 1 {
t.Fatalf("unexpected minimal trim output: %#v", transcript.Segments)
}
}
func TestTrimExplicitOutputSchemaChangesOutputSchema(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1,3",
"--output-schema", config.OutputSchemaMinimal,
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.MinimalTranscript
readJSON(t, output, &transcript)
if transcript.Metadata.OutputSchema != config.OutputSchemaMinimal {
t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaMinimal)
}
if len(transcript.Segments) != 2 {
t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
}
assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
}
func TestTrimExplicitOutputSchemaConvertsMinimalToIntermediate(t *testing.T) {
dir := t.TempDir()
input := writeTrimMinimalFixture(t, dir, "input-minimal.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1-2",
"--output-schema", config.OutputSchemaIntermediate,
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.IntermediateTranscript
readJSON(t, output, &transcript)
if transcript.Metadata.OutputSchema != config.OutputSchemaIntermediate {
t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaIntermediate)
}
if len(transcript.Segments) != 2 {
t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
}
assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
}
func TestTrimIntermediateInputPreservesIntermediateOutputAndCategories(t *testing.T) {
dir := t.TempDir()
input := writeTrimIntermediateFixture(t, dir, "input-intermediate.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "2",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.IntermediateTranscript
readJSON(t, output, &transcript)
if transcript.Metadata.OutputSchema != config.OutputSchemaIntermediate {
t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaIntermediate)
}
if len(transcript.Segments) != 1 {
t.Fatalf("segment count = %d, want 1", len(transcript.Segments))
}
if transcript.Segments[0].ID != 1 {
t.Fatalf("segment ID = %d, want 1", transcript.Segments[0].ID)
}
assertIntSliceEqual(t, []int{len(transcript.Segments[0].Categories)}, []int{2})
if transcript.Segments[0].Categories[0] != "filler" || transcript.Segments[0].Categories[1] != "backchannel" {
t.Fatalf("categories = %v, want [filler backchannel]", transcript.Segments[0].Categories)
}
}
func TestTrimFullInputPreservesFullShapeAndRecomputesOverlapGroups(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullOverlapFixture(t, dir, "input-full-overlap.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1,2",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.Transcript
readJSON(t, output, &transcript)
if len(transcript.Segments) != 2 {
t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
}
assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
if len(transcript.OverlapGroups) != 1 {
t.Fatalf("overlap group count = %d, want 1", len(transcript.OverlapGroups))
}
if transcript.OverlapGroups[0].ID != 1 {
t.Fatalf("overlap group id = %d, want 1", transcript.OverlapGroups[0].ID)
}
if transcript.Segments[0].OverlapGroupID != 1 || transcript.Segments[1].OverlapGroupID != 1 {
t.Fatalf("segment overlap IDs = %d,%d, want 1,1", transcript.Segments[0].OverlapGroupID, transcript.Segments[1].OverlapGroupID)
}
}
func TestTrimMalformedSelectorFailsWithClearError(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1-",
)
if err == nil {
t.Fatal("expected malformed selector error")
}
if !strings.Contains(err.Error(), "invalid selector") || !strings.Contains(err.Error(), "malformed element") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimMalformedInputArtifactFailsClearly(t *testing.T) {
dir := t.TempDir()
input := writeJSONFile(t, dir, "broken.json", `{"metadata":`)
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1",
)
if err == nil {
t.Fatal("expected malformed artifact error")
}
if !strings.Contains(err.Error(), "input JSON is malformed") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimDuplicateInputSegmentIDsFail(t *testing.T) {
dir := t.TempDir()
input := writeTrimMinimalWithIDsFixture(t, dir, "input-dup.json", []int{1, 1})
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1",
)
if err == nil {
t.Fatal("expected duplicate segment ID failure")
}
if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimNonSequentialInputSegmentIDsFail(t *testing.T) {
dir := t.TempDir()
input := writeTrimMinimalWithIDsFixture(t, dir, "input-nonseq.json", []int{1, 3})
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1",
)
if err == nil {
t.Fatal("expected non-sequential segment ID failure")
}
if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimKeepSelectorWithOverlappingRanges(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1-3,2-4",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.Transcript
readJSON(t, output, &transcript)
if len(transcript.Segments) != 4 {
t.Fatalf("segment count = %d, want 4", len(transcript.Segments))
}
assertSequentialIDs(t, []int{
transcript.Segments[0].ID,
transcript.Segments[1].ID,
transcript.Segments[2].ID,
transcript.Segments[3].ID,
})
}
func TestTrimRemoveSelectorWithOverlappingRanges(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--remove", "2-3,3-4",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.Transcript
readJSON(t, output, &transcript)
if len(transcript.Segments) != 1 {
t.Fatalf("segment count = %d, want 1", len(transcript.Segments))
}
if transcript.Segments[0].Text != "one" {
t.Fatalf("remaining segment = %#v, want one", transcript.Segments[0])
}
}
func TestTrimSelectorOrderDoesNotAffectTranscriptOrder(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "4,1,3",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var transcript schema.Transcript
readJSON(t, output, &transcript)
if len(transcript.Segments) != 3 {
t.Fatalf("segment count = %d, want 3", len(transcript.Segments))
}
got := []string{
transcript.Segments[0].Text,
transcript.Segments[1].Text,
transcript.Segments[2].Text,
}
want := []string{"one", "three", "four"}
if got[0] != want[0] || got[1] != want[1] || got[2] != want[2] {
t.Fatalf("segment text order = %v, want %v", got, want)
}
}
func TestTrimAllowEmptyBehavior(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--remove", "1-4",
)
if err == nil {
t.Fatal("expected empty-output error")
}
if !strings.Contains(err.Error(), "empty transcript") {
t.Fatalf("unexpected error: %v", err)
}
err = executeTrim(
"--input-file", input,
"--output-file", output,
"--remove", "1-4",
"--allow-empty",
)
if err != nil {
t.Fatalf("trim with --allow-empty failed: %v", err)
}
var transcript schema.Transcript
readJSON(t, output, &transcript)
if len(transcript.Segments) != 0 {
t.Fatalf("segment count = %d, want 0", len(transcript.Segments))
}
}
func TestTrimRejectsNonSeriatimInputArtifacts(t *testing.T) {
dir := t.TempDir()
input := writeJSONFile(t, dir, "raw-whisperx.json", `{
"segments": [
{"start": 1, "end": 2, "text": "hello"}
]
}`)
output := filepath.Join(dir, "trimmed.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1",
)
if err == nil {
t.Fatal("expected invalid artifact error")
}
if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestTrimReportFileContainsAuditFields(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
reportPath := filepath.Join(dir, "trim-report.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--report-file", reportPath,
"--remove", "4,2",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var rpt report.Report
readJSON(t, reportPath, &rpt)
if len(rpt.Events) == 0 {
t.Fatal("expected report events")
}
if !hasReportEvent(rpt, "trim", "trim", "trimmed 4 input segment(s) into 2 output segment(s) with mode=remove") {
t.Fatal("expected trim summary event")
}
if !hasReportEvent(rpt, "trim", "validate-output", "validated 2 output segment(s)") {
t.Fatal("expected validation event")
}
audit := extractTrimAuditEvent(t, rpt)
if audit.Operation != "trim" {
t.Fatalf("operation = %q, want trim", audit.Operation)
}
if audit.InputFile != input {
t.Fatalf("input_file = %q, want %q", audit.InputFile, input)
}
if audit.OutputFile != output {
t.Fatalf("output_file = %q, want %q", audit.OutputFile, output)
}
if audit.InputSchema != config.OutputSchemaFull || audit.OutputSchema != config.OutputSchemaFull {
t.Fatalf("schemas = %q -> %q, want full -> full", audit.InputSchema, audit.OutputSchema)
}
if audit.Mode != "remove" {
t.Fatalf("mode = %q, want remove", audit.Mode)
}
if audit.Selector != "4,2" {
t.Fatalf("selector = %q, want %q", audit.Selector, "4,2")
}
assertIntSliceEqual(t, audit.SelectedIDs, []int{2, 4})
if audit.AllowEmpty {
t.Fatal("allow_empty should be false")
}
if audit.InputSegmentCount != 4 || audit.RetainedSegmentCount != 2 || audit.RemovedSegmentCount != 2 {
t.Fatalf("counts = input:%d retained:%d removed:%d, want 4/2/2", audit.InputSegmentCount, audit.RetainedSegmentCount, audit.RemovedSegmentCount)
}
assertIntSliceEqual(t, audit.RemovedInputIDs, []int{2, 4})
if len(audit.OldToNewIDMapping) != 2 {
t.Fatalf("mapping length = %d, want 2", len(audit.OldToNewIDMapping))
}
if audit.OldToNewIDMapping[0].OldID != 1 || audit.OldToNewIDMapping[0].NewID != 1 {
t.Fatalf("mapping[0] = %#v, want old_id=1 new_id=1", audit.OldToNewIDMapping[0])
}
if audit.OldToNewIDMapping[1].OldID != 3 || audit.OldToNewIDMapping[1].NewID != 2 {
t.Fatalf("mapping[1] = %#v, want old_id=3 new_id=2", audit.OldToNewIDMapping[1])
}
if !audit.OverlapGroupsRecomputed {
t.Fatal("expected overlap_groups_recomputed=true for full schema trim")
}
}
func TestTrimReportOldToNewMappingIsDeterministicSorted(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
reportPath := filepath.Join(dir, "trim-report.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--report-file", reportPath,
"--keep", "4,1,3",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
var rpt report.Report
readJSON(t, reportPath, &rpt)
audit := extractTrimAuditEvent(t, rpt)
if len(audit.OldToNewIDMapping) != 3 {
t.Fatalf("mapping length = %d, want 3", len(audit.OldToNewIDMapping))
}
for index, expectedOld := range []int{1, 3, 4} {
if audit.OldToNewIDMapping[index].OldID != expectedOld {
t.Fatalf("mapping[%d].old_id = %d, want %d", index, audit.OldToNewIDMapping[index].OldID, expectedOld)
}
}
}
func TestTrimNoReportFileWhenOmitted(t *testing.T) {
dir := t.TempDir()
input := writeTrimFullFixture(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
reportPath := filepath.Join(dir, "trim-report.json")
err := executeTrim(
"--input-file", input,
"--output-file", output,
"--keep", "1",
)
if err != nil {
t.Fatalf("trim failed: %v", err)
}
_, statErr := os.Stat(reportPath)
if !os.IsNotExist(statErr) {
t.Fatalf("expected no report file at %q, got err=%v", reportPath, statErr)
}
}
func executeTrim(args ...string) error {
cmd := NewRootCommand()
cmd.SetArgs(append([]string{"trim"}, args...))
return cmd.Execute()
}
func writeTrimFullFixture(t *testing.T, dir string, name string) string {
t.Helper()
first := 10
second := 20
third := 30
fourth := 40
value := schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
InputReader: "json-files",
InputFiles: []string{"a.json"},
PreprocessingModules: []string{"validate-raw"},
PostprocessingModules: []string{"assign-ids"},
OutputModules: []string{"json"},
},
Segments: []schema.Segment{
{ID: 1, Source: "a.json", SourceSegmentIndex: &first, SourceRef: "a.json#10", Speaker: "A", Start: 1, End: 2, Text: "one", OverlapGroupID: 9},
{ID: 2, Source: "a.json", SourceSegmentIndex: &second, SourceRef: "a.json#20", Speaker: "B", Start: 2, End: 3, Text: "two", OverlapGroupID: 9},
{ID: 3, Source: "a.json", SourceSegmentIndex: &third, SourceRef: "a.json#30", Speaker: "C", Start: 4, End: 5, Text: "three", OverlapGroupID: 10},
{ID: 4, Source: "a.json", SourceSegmentIndex: &fourth, SourceRef: "a.json#40", Speaker: "D", Start: 5, End: 6, Text: "four", OverlapGroupID: 10},
},
OverlapGroups: []schema.OverlapGroup{
{ID: 9, Start: 1, End: 3, Segments: []string{"a.json#10", "a.json#20"}, Speakers: []string{"A", "B"}, Class: "unknown", Resolution: "unresolved"},
},
}
return writeTrimArtifactFile(t, dir, name, value)
}
func writeTrimMinimalFixture(t *testing.T, dir string, name string) string {
t.Helper()
value := schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: "seriatim",
Version: "v-test",
OutputSchema: config.OutputSchemaMinimal,
},
Segments: []schema.MinimalSegment{
{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one"},
{ID: 2, Start: 2, End: 3, Speaker: "B", Text: "two"},
},
}
return writeTrimArtifactFile(t, dir, name, value)
}
func writeTrimIntermediateFixture(t *testing.T, dir string, name string) string {
t.Helper()
value := schema.IntermediateTranscript{
Metadata: schema.IntermediateMetadata{
Application: "seriatim",
Version: "v-test",
OutputSchema: config.OutputSchemaIntermediate,
},
Segments: []schema.IntermediateSegment{
{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one", Categories: []string{"word-run"}},
{ID: 2, Start: 2, End: 3, Speaker: "B", Text: "two", Categories: []string{"filler", "backchannel"}},
},
}
return writeTrimArtifactFile(t, dir, name, value)
}
func writeTrimMinimalWithIDsFixture(t *testing.T, dir string, name string, ids []int) string {
t.Helper()
if len(ids) < 2 {
t.Fatalf("need at least two IDs, got %d", len(ids))
}
value := schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: "seriatim",
Version: "v-test",
OutputSchema: config.OutputSchemaMinimal,
},
Segments: []schema.MinimalSegment{
{ID: ids[0], Start: 1, End: 2, Speaker: "A", Text: "one"},
{ID: ids[1], Start: 2, End: 3, Speaker: "B", Text: "two"},
},
}
return writeTrimArtifactFile(t, dir, name, value)
}
func writeTrimFullOverlapFixture(t *testing.T, dir string, name string) string {
t.Helper()
first := 10
second := 20
third := 30
value := schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
InputReader: "json-files",
InputFiles: []string{"a.json"},
PreprocessingModules: []string{"validate-raw"},
PostprocessingModules: []string{"detect-overlaps", "assign-ids"},
OutputModules: []string{"json"},
},
Segments: []schema.Segment{
{ID: 1, Source: "a.json", SourceSegmentIndex: &first, SourceRef: "a.json#10", Speaker: "A", Start: 1, End: 3, Text: "one", OverlapGroupID: 5},
{ID: 2, Source: "a.json", SourceSegmentIndex: &second, SourceRef: "a.json#20", Speaker: "B", Start: 2, End: 4, Text: "two", OverlapGroupID: 5},
{ID: 3, Source: "a.json", SourceSegmentIndex: &third, SourceRef: "a.json#30", Speaker: "C", Start: 6, End: 7, Text: "three", OverlapGroupID: 6},
},
OverlapGroups: []schema.OverlapGroup{
{ID: 99, Start: 0, End: 100, Segments: []string{"stale"}, Speakers: []string{"stale"}, Class: "unknown", Resolution: "unresolved"},
},
}
return writeTrimArtifactFile(t, dir, name, value)
}
func writeTrimArtifactFile(t *testing.T, dir string, name string, value any) string {
t.Helper()
data, err := json.MarshalIndent(value, "", " ")
if err != nil {
t.Fatalf("marshal fixture: %v", err)
}
path := filepath.Join(dir, name)
if err := os.WriteFile(path, append(data, '\n'), 0o600); err != nil {
t.Fatalf("write fixture: %v", err)
}
return path
}
func assertSequentialIDs(t *testing.T, ids []int) {
t.Helper()
for index, id := range ids {
want := index + 1
if id != want {
t.Fatalf("id at index %d = %d, want %d", index, id, want)
}
}
}
func extractTrimAuditEvent(t *testing.T, rpt report.Report) trimAuditReport {
t.Helper()
for _, event := range rpt.Events {
if event.Stage == "trim" && event.Module == "trim-audit" {
var audit trimAuditReport
if err := json.Unmarshal([]byte(event.Message), &audit); err != nil {
t.Fatalf("decode trim audit event: %v", err)
}
return audit
}
}
t.Fatal("missing trim-audit event")
return trimAuditReport{}
}
func assertIntSliceEqual(t *testing.T, got []int, want []int) {
t.Helper()
if len(got) != len(want) {
t.Fatalf("slice length = %d, want %d", len(got), len(want))
}
for index := range got {
if got[index] != want[index] {
t.Fatalf("slice[%d] = %d, want %d (full got=%v, want=%v)", index, got[index], want[index], got, want)
}
}
}


@@ -47,6 +47,17 @@ type MergeOptions struct {
CoalesceGap string
}
// TrimOptions captures raw CLI option values before validation.
type TrimOptions struct {
InputFile string
OutputFile string
ReportFile string
Keep string
Remove string
OutputSchema string
AllowEmpty bool
}
// Config is the validated runtime configuration for a merge invocation.
type Config struct {
InputFiles []string
@@ -66,6 +77,17 @@ type Config struct {
FillerMaxDuration float64 FillerMaxDuration float64
} }
// TrimConfig is the validated runtime configuration for a trim invocation.
type TrimConfig struct {
InputFile string
OutputFile string
ReportFile string
Mode string
Selector string
OutputSchema string
AllowEmpty bool
}
// NewMergeConfig validates raw merge options and returns normalized config.
func NewMergeConfig(opts MergeOptions) (Config, error) {
cfg := Config{
@@ -168,6 +190,63 @@ func NewMergeConfig(opts MergeOptions) (Config, error) {
return cfg, nil
}
// NewTrimConfig validates raw trim options and returns normalized config.
func NewTrimConfig(opts TrimOptions) (TrimConfig, error) {
inputFile := filepath.Clean(strings.TrimSpace(opts.InputFile))
if strings.TrimSpace(opts.InputFile) == "" {
return TrimConfig{}, errors.New("--input-file is required")
}
if err := requireFile(inputFile, "--input-file"); err != nil {
return TrimConfig{}, err
}
outputFile, err := normalizeOutputPath(opts.OutputFile, "--output-file")
if err != nil {
return TrimConfig{}, err
}
reportFile := ""
if strings.TrimSpace(opts.ReportFile) != "" {
reportFile, err = normalizeOutputPath(opts.ReportFile, "--report-file")
if err != nil {
return TrimConfig{}, err
}
}
keep := strings.TrimSpace(opts.Keep)
remove := strings.TrimSpace(opts.Remove)
if keep == "" && remove == "" {
return TrimConfig{}, errors.New("exactly one of --keep or --remove is required")
}
if keep != "" && remove != "" {
return TrimConfig{}, errors.New("--keep and --remove are mutually exclusive")
}
mode := "keep"
selector := keep
if remove != "" {
mode = "remove"
selector = remove
}
outputSchema := strings.TrimSpace(opts.OutputSchema)
if outputSchema != "" {
if err := validateOutputSchema(outputSchema); err != nil {
return TrimConfig{}, err
}
}
return TrimConfig{
InputFile: inputFile,
OutputFile: outputFile,
ReportFile: reportFile,
Mode: mode,
Selector: selector,
OutputSchema: outputSchema,
AllowEmpty: opts.AllowEmpty,
}, nil
}
func parseModuleList(value string) ([]string, error) {
value = strings.TrimSpace(value)
if value == "" {


@@ -612,6 +612,105 @@ func TestCoalesceGapRejectsInvalidOverride(t *testing.T) {
}
}
func TestNewTrimConfigRequiresInputAndOutput(t *testing.T) {
dir := t.TempDir()
input := writeTempFile(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
_, err := NewTrimConfig(TrimOptions{
OutputFile: output,
Keep: "1",
})
if err == nil || !strings.Contains(err.Error(), "--input-file is required") {
t.Fatalf("expected input-file required error, got %v", err)
}
_, err = NewTrimConfig(TrimOptions{
InputFile: input,
Keep: "1",
})
if err == nil || !strings.Contains(err.Error(), "--output-file is required") {
t.Fatalf("expected output-file required error, got %v", err)
}
}
func TestNewTrimConfigRequiresExactlyOneSelectorFlag(t *testing.T) {
dir := t.TempDir()
input := writeTempFile(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
_, err := NewTrimConfig(TrimOptions{
InputFile: input,
OutputFile: output,
})
if err == nil || !strings.Contains(err.Error(), "exactly one of --keep or --remove is required") {
t.Fatalf("expected missing selector error, got %v", err)
}
_, err = NewTrimConfig(TrimOptions{
InputFile: input,
OutputFile: output,
Keep: "1",
Remove: "2",
})
if err == nil || !strings.Contains(err.Error(), "mutually exclusive") {
t.Fatalf("expected mutually exclusive selector error, got %v", err)
}
}
func TestNewTrimConfigAcceptsOutputSchemaOverride(t *testing.T) {
dir := t.TempDir()
input := writeTempFile(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
reportPath := filepath.Join(dir, "report.json")
cfg, err := NewTrimConfig(TrimOptions{
InputFile: input,
OutputFile: output,
ReportFile: reportPath,
Remove: "3-5",
OutputSchema: OutputSchemaMinimal,
AllowEmpty: true,
})
if err != nil {
t.Fatalf("config failed: %v", err)
}
if cfg.Mode != "remove" {
t.Fatalf("mode = %q, want remove", cfg.Mode)
}
if cfg.Selector != "3-5" {
t.Fatalf("selector = %q, want 3-5", cfg.Selector)
}
if cfg.OutputSchema != OutputSchemaMinimal {
t.Fatalf("output schema = %q, want %q", cfg.OutputSchema, OutputSchemaMinimal)
}
if !cfg.AllowEmpty {
t.Fatal("allow empty should be true")
}
if cfg.ReportFile != reportPath {
t.Fatalf("report file = %q, want %q", cfg.ReportFile, reportPath)
}
}
func TestNewTrimConfigRejectsInvalidOutputSchemaOverride(t *testing.T) {
dir := t.TempDir()
input := writeTempFile(t, dir, "input.json")
output := filepath.Join(dir, "trimmed.json")
_, err := NewTrimConfig(TrimOptions{
InputFile: input,
OutputFile: output,
Keep: "1",
OutputSchema: "compact",
})
if err == nil {
t.Fatal("expected output schema validation error")
}
if !strings.Contains(err.Error(), "--output-schema must be one of") {
t.Fatalf("unexpected error: %v", err)
}
}
func assertPositiveFloatEnvValidation(t *testing.T, envName string) {
t.Helper()

internal/trim/apply.go (new file, 367 lines)

@@ -0,0 +1,367 @@
package trim
import (
"fmt"
"gitea.maximumdirect.net/eric/seriatim/internal/model"
"gitea.maximumdirect.net/eric/seriatim/internal/overlap"
"gitea.maximumdirect.net/eric/seriatim/schema"
)
// Mode controls how selector IDs are applied.
type Mode string
const (
ModeKeep Mode = "keep"
ModeRemove Mode = "remove"
)
// Options configures transcript trimming.
type Options struct {
Mode Mode
Selector Selector
AllowEmpty bool
}
// Result contains trimming output and ID mapping metadata.
type Result struct {
Transcript schema.Transcript
OldToNewID map[int]int
RemovedIDs []int
}
// IntermediateResult contains trimming output for intermediate schema artifacts.
type IntermediateResult struct {
Transcript schema.IntermediateTranscript
OldToNewID map[int]int
RemovedIDs []int
}
// MinimalResult contains trimming output for minimal schema artifacts.
type MinimalResult struct {
Transcript schema.MinimalTranscript
OldToNewID map[int]int
RemovedIDs []int
}
// Apply trims a full seriatim output transcript by segment ID.
func Apply(input schema.Transcript, opts Options) (Result, error) {
if err := validateMode(opts.Mode); err != nil {
return Result{}, err
}
selected := opts.Selector.IDs()
if len(selected) == 0 {
return Result{}, fmt.Errorf("selector cannot be empty")
}
inputIDs := make([]int, len(input.Segments))
for index, segment := range input.Segments {
inputIDs[index] = segment.ID
}
idIndex, err := validateInputIDs(inputIDs)
if err != nil {
return Result{}, err
}
if err := validateSelectedIDsExist(selected, idIndex); err != nil {
return Result{}, err
}
kept := make([]schema.Segment, 0, len(input.Segments))
removed := make([]int, 0, len(input.Segments))
oldToNew := make(map[int]int, len(input.Segments))
for _, segment := range input.Segments {
keep := opts.Mode == ModeKeep && opts.Selector.Contains(segment.ID)
if opts.Mode == ModeRemove {
keep = !opts.Selector.Contains(segment.ID)
}
if !keep {
removed = append(removed, segment.ID)
continue
}
rewritten := copySegment(segment)
rewritten.ID = len(kept) + 1
rewritten.OverlapGroupID = 0
kept = append(kept, rewritten)
oldToNew[segment.ID] = rewritten.ID
}
if len(kept) == 0 && !opts.AllowEmpty {
return Result{}, fmt.Errorf("trim operation produced an empty transcript; set AllowEmpty to true to permit this")
}
kept, groups := recomputeOverlapGroups(kept)
if groups == nil {
groups = make([]schema.OverlapGroup, 0)
}
out := copyTranscript(input)
out.Segments = kept
out.OverlapGroups = groups
return Result{
Transcript: out,
OldToNewID: oldToNew,
RemovedIDs: removed,
}, nil
}
// ApplyIntermediate trims an intermediate seriatim output transcript by
// segment ID.
func ApplyIntermediate(input schema.IntermediateTranscript, opts Options) (IntermediateResult, error) {
if err := validateMode(opts.Mode); err != nil {
return IntermediateResult{}, err
}
selected := opts.Selector.IDs()
if len(selected) == 0 {
return IntermediateResult{}, fmt.Errorf("selector cannot be empty")
}
inputIDs := make([]int, len(input.Segments))
for index, segment := range input.Segments {
inputIDs[index] = segment.ID
}
idIndex, err := validateInputIDs(inputIDs)
if err != nil {
return IntermediateResult{}, err
}
if err := validateSelectedIDsExist(selected, idIndex); err != nil {
return IntermediateResult{}, err
}
kept := make([]schema.IntermediateSegment, 0, len(input.Segments))
removed := make([]int, 0, len(input.Segments))
oldToNew := make(map[int]int, len(input.Segments))
for _, segment := range input.Segments {
keep := opts.Mode == ModeKeep && opts.Selector.Contains(segment.ID)
if opts.Mode == ModeRemove {
keep = !opts.Selector.Contains(segment.ID)
}
if !keep {
removed = append(removed, segment.ID)
continue
}
rewritten := schema.IntermediateSegment{
ID: len(kept) + 1,
Start: segment.Start,
End: segment.End,
Speaker: segment.Speaker,
Text: segment.Text,
Categories: append([]string(nil), segment.Categories...),
}
kept = append(kept, rewritten)
oldToNew[segment.ID] = rewritten.ID
}
if len(kept) == 0 && !opts.AllowEmpty {
return IntermediateResult{}, fmt.Errorf("trim operation produced an empty transcript; set AllowEmpty to true to permit this")
}
return IntermediateResult{
Transcript: schema.IntermediateTranscript{
Metadata: schema.IntermediateMetadata{
Application: input.Metadata.Application,
Version: input.Metadata.Version,
OutputSchema: input.Metadata.OutputSchema,
},
Segments: kept,
},
OldToNewID: oldToNew,
RemovedIDs: removed,
}, nil
}
// ApplyMinimal trims a minimal seriatim output transcript by segment ID.
func ApplyMinimal(input schema.MinimalTranscript, opts Options) (MinimalResult, error) {
if err := validateMode(opts.Mode); err != nil {
return MinimalResult{}, err
}
selected := opts.Selector.IDs()
if len(selected) == 0 {
return MinimalResult{}, fmt.Errorf("selector cannot be empty")
}
inputIDs := make([]int, len(input.Segments))
for index, segment := range input.Segments {
inputIDs[index] = segment.ID
}
idIndex, err := validateInputIDs(inputIDs)
if err != nil {
return MinimalResult{}, err
}
if err := validateSelectedIDsExist(selected, idIndex); err != nil {
return MinimalResult{}, err
}
kept := make([]schema.MinimalSegment, 0, len(input.Segments))
removed := make([]int, 0, len(input.Segments))
oldToNew := make(map[int]int, len(input.Segments))
for _, segment := range input.Segments {
keep := opts.Mode == ModeKeep && opts.Selector.Contains(segment.ID)
if opts.Mode == ModeRemove {
keep = !opts.Selector.Contains(segment.ID)
}
if !keep {
removed = append(removed, segment.ID)
continue
}
rewritten := schema.MinimalSegment{
ID: len(kept) + 1,
Start: segment.Start,
End: segment.End,
Speaker: segment.Speaker,
Text: segment.Text,
}
kept = append(kept, rewritten)
oldToNew[segment.ID] = rewritten.ID
}
if len(kept) == 0 && !opts.AllowEmpty {
return MinimalResult{}, fmt.Errorf("trim operation produced an empty transcript; set AllowEmpty to true to permit this")
}
return MinimalResult{
Transcript: schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: input.Metadata.Application,
Version: input.Metadata.Version,
OutputSchema: input.Metadata.OutputSchema,
},
Segments: kept,
},
OldToNewID: oldToNew,
RemovedIDs: removed,
}, nil
}
func validateMode(mode Mode) error {
switch mode {
case ModeKeep, ModeRemove:
return nil
default:
return fmt.Errorf("invalid trim mode %q", mode)
}
}
func validateInputIDs(ids []int) (map[int]int, error) {
seen := make(map[int]int, len(ids))
for index, id := range ids {
if id <= 0 {
return nil, fmt.Errorf("input transcript has non-positive segment ID %d at index %d", id, index)
}
if firstIndex, exists := seen[id]; exists {
return nil, fmt.Errorf("input transcript has duplicate segment ID %d at indexes %d and %d", id, firstIndex, index)
}
seen[id] = index
}
for id := 1; id <= len(ids); id++ {
if _, exists := seen[id]; !exists {
return nil, fmt.Errorf("input transcript segment IDs must be sequential 1..%d; missing ID %d", len(ids), id)
}
}
return seen, nil
}
func validateSelectedIDsExist(selected []int, idIndex map[int]int) error {
for _, id := range selected {
if _, exists := idIndex[id]; !exists {
return fmt.Errorf("selected segment ID %d does not exist in input transcript", id)
}
}
return nil
}
func recomputeOverlapGroups(segments []schema.Segment) ([]schema.Segment, []schema.OverlapGroup) {
if len(segments) == 0 {
return segments, make([]schema.OverlapGroup, 0)
}
modelSegments := make([]model.Segment, len(segments))
for index, segment := range segments {
modelSegments[index] = model.Segment{
ID: segment.ID,
Source: segment.Source,
SourceSegmentIndex: copyIntPtr(segment.SourceSegmentIndex),
SourceRef: segment.SourceRef,
DerivedFrom: append([]string(nil), segment.DerivedFrom...),
Speaker: segment.Speaker,
Start: segment.Start,
End: segment.End,
Text: segment.Text,
Categories: append([]string(nil), segment.Categories...),
OverlapGroupID: segment.OverlapGroupID,
}
}
detected := overlap.Detect(model.MergedTranscript{
Segments: modelSegments,
})
rewrittenSegments := make([]schema.Segment, len(segments))
for index, segment := range segments {
rewritten := copySegment(segment)
rewritten.OverlapGroupID = detected.Segments[index].OverlapGroupID
rewrittenSegments[index] = rewritten
}
groups := make([]schema.OverlapGroup, len(detected.OverlapGroups))
for index, group := range detected.OverlapGroups {
groups[index] = schema.OverlapGroup{
ID: group.ID,
Start: group.Start,
End: group.End,
Segments: append([]string(nil), group.Segments...),
Speakers: append([]string(nil), group.Speakers...),
Class: group.Class,
Resolution: group.Resolution,
}
}
return rewrittenSegments, groups
}
func copyTranscript(input schema.Transcript) schema.Transcript {
return schema.Transcript{
Metadata: schema.Metadata{
Application: input.Metadata.Application,
Version: input.Metadata.Version,
InputReader: input.Metadata.InputReader,
InputFiles: append([]string(nil), input.Metadata.InputFiles...),
PreprocessingModules: append([]string(nil), input.Metadata.PreprocessingModules...),
PostprocessingModules: append([]string(nil), input.Metadata.PostprocessingModules...),
OutputModules: append([]string(nil), input.Metadata.OutputModules...),
},
Segments: append([]schema.Segment(nil), input.Segments...),
OverlapGroups: append([]schema.OverlapGroup(nil), input.OverlapGroups...),
}
}
func copySegment(input schema.Segment) schema.Segment {
return schema.Segment{
ID: input.ID,
Source: input.Source,
SourceSegmentIndex: copyIntPtr(input.SourceSegmentIndex),
SourceRef: input.SourceRef,
DerivedFrom: append([]string(nil), input.DerivedFrom...),
Speaker: input.Speaker,
Start: input.Start,
End: input.End,
Text: input.Text,
Categories: append([]string(nil), input.Categories...),
OverlapGroupID: input.OverlapGroupID,
}
}
func copyIntPtr(value *int) *int {
if value == nil {
return nil
}
copied := *value
return &copied
}

internal/trim/apply_test.go (new file, 668 lines)

@@ -0,0 +1,668 @@
package trim
import (
"strings"
"testing"
"gitea.maximumdirect.net/eric/seriatim/schema"
)
func TestApplyKeepModeRenumbersFromOne(t *testing.T) {
input := fullTranscriptFixture()
selector := mustParseSelector(t, "2,4")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
if len(result.Transcript.Segments) != 2 {
t.Fatalf("segment count = %d, want 2", len(result.Transcript.Segments))
}
assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2})
assertSegmentTexts(t, result.Transcript.Segments, []string{"beta", "delta"})
assertIntMap(t, result.OldToNewID, map[int]int{2: 1, 4: 2})
assertIntSlice(t, result.RemovedIDs, []int{1, 3})
}
func TestApplyRemoveModeRenumbersFromOne(t *testing.T) {
input := fullTranscriptFixture()
selector := mustParseSelector(t, "2,4")
result, err := Apply(input, Options{
Mode: ModeRemove,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2})
assertSegmentTexts(t, result.Transcript.Segments, []string{"alpha", "gamma"})
assertIntMap(t, result.OldToNewID, map[int]int{1: 1, 3: 2})
assertIntSlice(t, result.RemovedIDs, []int{2, 4})
}
func TestApplySelectorOrderDoesNotChangeTranscriptOrder(t *testing.T) {
input := fullTranscriptFixture()
selector := mustParseSelector(t, "4,1,3")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2, 3})
assertSegmentTexts(t, result.Transcript.Segments, []string{"alpha", "gamma", "delta"})
}
func TestApplyFailsWhenSelectedIDDoesNotExist(t *testing.T) {
input := fullTranscriptFixture()
selector := mustParseSelector(t, "2,99")
_, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err == nil {
t.Fatal("expected missing selected ID error")
}
if !strings.Contains(err.Error(), "does not exist") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestApplyFailsOnDuplicateInputIDs(t *testing.T) {
input := fullTranscriptFixture()
input.Segments[2].ID = 2
selector := mustParseSelector(t, "2")
_, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err == nil {
t.Fatal("expected duplicate input ID error")
}
if !strings.Contains(err.Error(), "duplicate segment ID") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestApplyFailsOnMissingOrNonSequentialInputIDs(t *testing.T) {
input := fullTranscriptFixture()
input.Segments[1].ID = 5
selector := mustParseSelector(t, "1")
_, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err == nil {
t.Fatal("expected non-sequential input ID error")
}
if !strings.Contains(err.Error(), "must be sequential") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestApplyFailsOnNonPositiveInputIDs(t *testing.T) {
input := fullTranscriptFixture()
input.Segments[0].ID = 0
selector := mustParseSelector(t, "1")
_, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err == nil {
t.Fatal("expected non-positive input ID error")
}
if !strings.Contains(err.Error(), "non-positive") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestApplyEmptyOutputFailsUnlessAllowEmpty(t *testing.T) {
input := fullTranscriptFixture()
selector := mustParseSelector(t, "1-4")
_, err := Apply(input, Options{
Mode: ModeRemove,
Selector: selector,
})
if err == nil {
t.Fatal("expected empty-output error")
}
if !strings.Contains(err.Error(), "empty transcript") {
t.Fatalf("unexpected error: %v", err)
}
allowed, err := Apply(input, Options{
Mode: ModeRemove,
Selector: selector,
AllowEmpty: true,
})
if err != nil {
t.Fatalf("apply with AllowEmpty failed: %v", err)
}
if len(allowed.Transcript.Segments) != 0 {
t.Fatalf("segment count = %d, want 0", len(allowed.Transcript.Segments))
}
assertIntMap(t, allowed.OldToNewID, map[int]int{})
assertIntSlice(t, allowed.RemovedIDs, []int{1, 2, 3, 4})
}
func TestApplyPreservesRetainedSegmentFieldsAndClearsOverlapIDs(t *testing.T) {
input := fullTranscriptFixture()
selector := mustParseSelector(t, "2")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
if len(result.Transcript.Segments) != 1 {
t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
}
segment := result.Transcript.Segments[0]
if segment.ID != 1 {
t.Fatalf("segment ID = %d, want 1", segment.ID)
}
if segment.Source != "b.json" {
t.Fatalf("source = %q, want %q", segment.Source, "b.json")
}
if segment.SourceSegmentIndex == nil || *segment.SourceSegmentIndex != 20 {
t.Fatalf("source_segment_index = %v, want 20", segment.SourceSegmentIndex)
}
if segment.SourceRef != "b.json#20" {
t.Fatalf("source_ref = %q, want %q", segment.SourceRef, "b.json#20")
}
if !equalStringSlices(segment.DerivedFrom, []string{"b.json#19", "b.json#20"}) {
t.Fatalf("derived_from = %v, want %v", segment.DerivedFrom, []string{"b.json#19", "b.json#20"})
}
if !equalStringSlices(segment.Categories, []string{"filler", "backchannel"}) {
t.Fatalf("categories = %v, want %v", segment.Categories, []string{"filler", "backchannel"})
}
if segment.Speaker != "Bob" {
t.Fatalf("speaker = %q, want Bob", segment.Speaker)
}
if segment.Start != 2 || segment.End != 3 {
t.Fatalf("times = %.3f-%.3f, want 2.000-3.000", segment.Start, segment.End)
}
if segment.Text != "beta" {
t.Fatalf("text = %q, want beta", segment.Text)
}
if segment.OverlapGroupID != 0 {
t.Fatalf("overlap_group_id = %d, want 0", segment.OverlapGroupID)
}
if len(result.Transcript.OverlapGroups) != 0 {
t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
}
}
func TestApplyFullSchemaRemovesStaleOverlapGroups(t *testing.T) {
input := overlapTranscriptFixture()
selector := mustParseSelector(t, "1,3")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
if len(result.Transcript.OverlapGroups) != 0 {
t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
}
for index, segment := range result.Transcript.Segments {
if segment.OverlapGroupID != 0 {
t.Fatalf("segment %d overlap_group_id = %d, want 0", index, segment.OverlapGroupID)
}
}
}
func TestApplyFullSchemaRecomputesOverlapGroup(t *testing.T) {
input := overlapTranscriptFixture()
selector := mustParseSelector(t, "1,2")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2})
assertIntSlice(t, []int{
result.Transcript.Segments[0].OverlapGroupID,
result.Transcript.Segments[1].OverlapGroupID,
}, []int{1, 1})
if len(result.Transcript.OverlapGroups) != 1 {
t.Fatalf("overlap_groups count = %d, want 1", len(result.Transcript.OverlapGroups))
}
group := result.Transcript.OverlapGroups[0]
if group.ID != 1 {
t.Fatalf("group ID = %d, want 1", group.ID)
}
if group.Start != 1 || group.End != 4 {
t.Fatalf("group times = %.3f-%.3f, want 1.000-4.000", group.Start, group.End)
}
if !equalStringSlices(group.Segments, []string{"a.json#10", "b.json#20"}) {
t.Fatalf("group segments = %v, want %v", group.Segments, []string{"a.json#10", "b.json#20"})
}
if !equalStringSlices(group.Speakers, []string{"Alice", "Bob"}) {
t.Fatalf("group speakers = %v, want %v", group.Speakers, []string{"Alice", "Bob"})
}
}
func TestApplyFullSchemaDropsGroupWhenFewerThanTwoSpeakersRemain(t *testing.T) {
input := overlapTranscriptFixture()
selector := mustParseSelector(t, "1")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
if len(result.Transcript.OverlapGroups) != 0 {
t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
}
if len(result.Transcript.Segments) != 1 {
t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
}
if result.Transcript.Segments[0].OverlapGroupID != 0 {
t.Fatalf("segment overlap_group_id = %d, want 0", result.Transcript.Segments[0].OverlapGroupID)
}
}
func TestApplyFullSchemaHandlesTransitiveOverlaps(t *testing.T) {
input := transitiveOverlapFixture()
selector := mustParseSelector(t, "1-3")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
if len(result.Transcript.OverlapGroups) != 1 {
t.Fatalf("overlap_groups count = %d, want 1", len(result.Transcript.OverlapGroups))
}
assertIntSlice(t, []int{
result.Transcript.Segments[0].OverlapGroupID,
result.Transcript.Segments[1].OverlapGroupID,
result.Transcript.Segments[2].OverlapGroupID,
}, []int{1, 1, 1})
group := result.Transcript.OverlapGroups[0]
if group.Start != 10 || group.End != 15 {
t.Fatalf("group times = %.3f-%.3f, want 10.000-15.000", group.Start, group.End)
}
}
func TestApplyFullSchemaBoundaryTouchingNotGrouped(t *testing.T) {
input := boundaryFixture()
selector := mustParseSelector(t, "1-2")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
if len(result.Transcript.OverlapGroups) != 0 {
t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
}
assertIntSlice(t, []int{
result.Transcript.Segments[0].OverlapGroupID,
result.Transcript.Segments[1].OverlapGroupID,
}, []int{0, 0})
}
func TestApplyIntermediateDoesNotIncludeOverlapGroups(t *testing.T) {
input := schema.IntermediateTranscript{
Metadata: schema.IntermediateMetadata{
Application: "seriatim",
Version: "v-test",
OutputSchema: "seriatim-intermediate",
},
Segments: []schema.IntermediateSegment{
{ID: 1, Start: 1, End: 3, Speaker: "Alice", Text: "alpha", Categories: []string{"word-run"}},
{ID: 2, Start: 2, End: 4, Speaker: "Bob", Text: "beta", Categories: []string{"filler"}},
},
}
selector := mustParseSelector(t, "1")
result, err := ApplyIntermediate(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply intermediate failed: %v", err)
}
if len(result.Transcript.Segments) != 1 {
t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
}
if result.Transcript.Segments[0].ID != 1 {
t.Fatalf("segment id = %d, want 1", result.Transcript.Segments[0].ID)
}
if err := schema.ValidateIntermediateTranscript(result.Transcript); err != nil {
t.Fatalf("intermediate output should remain valid: %v", err)
}
}
func TestApplyMinimalDoesNotIncludeOverlapGroups(t *testing.T) {
input := schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: "seriatim",
Version: "v-test",
OutputSchema: "seriatim-minimal",
},
Segments: []schema.MinimalSegment{
{ID: 1, Start: 1, End: 3, Speaker: "Alice", Text: "alpha"},
{ID: 2, Start: 2, End: 4, Speaker: "Bob", Text: "beta"},
},
}
selector := mustParseSelector(t, "2")
result, err := ApplyMinimal(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply minimal failed: %v", err)
}
if len(result.Transcript.Segments) != 1 {
t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
}
if result.Transcript.Segments[0].ID != 1 {
t.Fatalf("segment id = %d, want 1", result.Transcript.Segments[0].ID)
}
if err := schema.ValidateMinimalTranscript(result.Transcript); err != nil {
t.Fatalf("minimal output should remain valid: %v", err)
}
}
func TestApplyOutputInvariantsValidAfterRenumberAndOverlapRecompute(t *testing.T) {
input := overlapTranscriptFixture()
selector := mustParseSelector(t, "2,1")
result, err := Apply(input, Options{
Mode: ModeKeep,
Selector: selector,
})
if err != nil {
t.Fatalf("apply failed: %v", err)
}
if err := schema.ValidateTranscript(result.Transcript); err != nil {
t.Fatalf("trim output should remain valid: %v", err)
}
}
func mustParseSelector(t *testing.T, value string) Selector {
t.Helper()
selector, err := ParseSelector(value)
if err != nil {
t.Fatalf("selector parse failed for %q: %v", value, err)
}
return selector
}
func fullTranscriptFixture() schema.Transcript {
firstIndex := 10
secondIndex := 20
thirdIndex := 30
fourthIndex := 40
return schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
InputReader: "json-files",
InputFiles: []string{"a.json", "b.json"},
PreprocessingModules: []string{"validate-raw"},
PostprocessingModules: []string{"detect-overlaps"},
OutputModules: []string{"json"},
},
Segments: []schema.Segment{
{
ID: 1,
Source: "a.json",
SourceSegmentIndex: &firstIndex,
SourceRef: "a.json#10",
DerivedFrom: []string{"a.json#10"},
Speaker: "Alice",
Start: 1,
End: 2,
Text: "alpha",
Categories: []string{"word-run"},
OverlapGroupID: 7,
},
{
ID: 2,
Source: "b.json",
SourceSegmentIndex: &secondIndex,
SourceRef: "b.json#20",
DerivedFrom: []string{"b.json#19", "b.json#20"},
Speaker: "Bob",
Start: 2,
End: 3,
Text: "beta",
Categories: []string{"filler", "backchannel"},
OverlapGroupID: 7,
},
{
ID: 3,
Source: "c.json",
SourceSegmentIndex: &thirdIndex,
SourceRef: "c.json#30",
DerivedFrom: []string{"c.json#30"},
Speaker: "Carol",
Start: 3,
End: 4,
Text: "gamma",
Categories: []string{"normal"},
OverlapGroupID: 8,
},
{
ID: 4,
Source: "d.json",
SourceSegmentIndex: &fourthIndex,
SourceRef: "d.json#40",
DerivedFrom: []string{"d.json#40"},
Speaker: "Dan",
Start: 4,
End: 5,
Text: "delta",
Categories: []string{"normal"},
OverlapGroupID: 9,
},
},
OverlapGroups: []schema.OverlapGroup{
{
ID: 7,
Start: 1.5,
End: 3.1,
Segments: []string{"a.json#10", "b.json#20"},
Speakers: []string{"Alice", "Bob"},
Class: "unknown",
Resolution: "unresolved",
},
},
}
}
func overlapTranscriptFixture() schema.Transcript {
first := 10
second := 20
third := 30
return schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
InputReader: "json-files",
InputFiles: []string{"a.json", "b.json", "c.json"},
PreprocessingModules: []string{"validate-raw"},
PostprocessingModules: []string{"detect-overlaps"},
OutputModules: []string{"json"},
},
Segments: []schema.Segment{
{
ID: 1,
Source: "a.json",
SourceSegmentIndex: &first,
SourceRef: "a.json#10",
Speaker: "Alice",
Start: 1,
End: 4,
Text: "a",
OverlapGroupID: 99,
},
{
ID: 2,
Source: "b.json",
SourceSegmentIndex: &second,
SourceRef: "b.json#20",
Speaker: "Bob",
Start: 2,
End: 3,
Text: "b",
OverlapGroupID: 99,
},
{
ID: 3,
Source: "c.json",
SourceSegmentIndex: &third,
SourceRef: "c.json#30",
Speaker: "Carol",
Start: 10,
End: 11,
Text: "c",
OverlapGroupID: 100,
},
},
OverlapGroups: []schema.OverlapGroup{
{
ID: 99,
Start: 0,
End: 100,
Segments: []string{"stale#1", "stale#2"},
Speakers: []string{"stale"},
Class: "unknown",
Resolution: "unresolved",
},
},
}
}
func transitiveOverlapFixture() schema.Transcript {
one := 1
two := 2
three := 3
return schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
},
Segments: []schema.Segment{
{ID: 1, Source: "a.json", SourceSegmentIndex: &one, Speaker: "Alice", Start: 10, End: 14, Text: "a"},
{ID: 2, Source: "b.json", SourceSegmentIndex: &two, Speaker: "Bob", Start: 12, End: 13, Text: "b"},
{ID: 3, Source: "c.json", SourceSegmentIndex: &three, Speaker: "Carol", Start: 13.5, End: 15, Text: "c"},
},
OverlapGroups: []schema.OverlapGroup{{ID: 77}},
}
}
func boundaryFixture() schema.Transcript {
one := 1
two := 2
return schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
},
Segments: []schema.Segment{
{ID: 1, Source: "a.json", SourceSegmentIndex: &one, Speaker: "Alice", Start: 1, End: 2, Text: "a", OverlapGroupID: 7},
{ID: 2, Source: "b.json", SourceSegmentIndex: &two, Speaker: "Bob", Start: 2, End: 3, Text: "b", OverlapGroupID: 7},
},
OverlapGroups: []schema.OverlapGroup{{ID: 7, Start: 1, End: 3}},
}
}
func assertSegmentIDs(t *testing.T, segments []schema.Segment, want []int) {
t.Helper()
got := make([]int, len(segments))
for index, segment := range segments {
got[index] = segment.ID
}
assertIntSlice(t, got, want)
}
func assertSegmentTexts(t *testing.T, segments []schema.Segment, want []string) {
t.Helper()
got := make([]string, len(segments))
for index, segment := range segments {
got[index] = segment.Text
}
if !equalStringSlices(got, want) {
t.Fatalf("segment texts = %v, want %v", got, want)
}
}
func assertIntSlice(t *testing.T, got []int, want []int) {
t.Helper()
if len(got) != len(want) {
t.Fatalf("slice length = %d, want %d", len(got), len(want))
}
for index := range got {
if got[index] != want[index] {
t.Fatalf("slice[%d] = %d, want %d (full got=%v, want=%v)", index, got[index], want[index], got, want)
}
}
}
func assertIntMap(t *testing.T, got map[int]int, want map[int]int) {
t.Helper()
if len(got) != len(want) {
t.Fatalf("map length = %d, want %d", len(got), len(want))
}
for key, wantValue := range want {
gotValue, exists := got[key]
if !exists {
t.Fatalf("missing map key %d", key)
}
if gotValue != wantValue {
t.Fatalf("map[%d] = %d, want %d", key, gotValue, wantValue)
}
}
}
func equalStringSlices(got []string, want []string) bool {
if len(got) != len(want) {
return false
}
for index := range got {
if got[index] != want[index] {
return false
}
}
return true
}

internal/trim/artifact.go

@@ -0,0 +1,396 @@
package trim
import (
"encoding/json"
"fmt"
"gitea.maximumdirect.net/eric/seriatim/schema"
)
const (
SchemaMinimal = "seriatim-minimal"
SchemaIntermediate = "seriatim-intermediate"
SchemaFull = "seriatim-full"
)
// Artifact stores a parsed seriatim output artifact of one supported schema.
type Artifact struct {
Schema string
Full *schema.Transcript
Intermediate *schema.IntermediateTranscript
Minimal *schema.MinimalTranscript
}
// ApplyArtifactResult contains trimmed artifact output and ID mapping metadata.
type ApplyArtifactResult struct {
Artifact Artifact
OldToNewID map[int]int
RemovedIDs []int
OverlapGroupsRecomputed bool
}
// ParseArtifactJSON parses and validates a serialized seriatim output artifact.
func ParseArtifactJSON(data []byte) (Artifact, error) {
var decoded any
if err := json.Unmarshal(data, &decoded); err != nil {
return Artifact{}, fmt.Errorf("input JSON is malformed: %w", err)
}
var full schema.Transcript
if err := json.Unmarshal(data, &full); err == nil {
if err := schema.ValidateTranscript(full); err == nil {
return Artifact{
Schema: SchemaFull,
Full: &full,
}, nil
}
}
var intermediate schema.IntermediateTranscript
if err := json.Unmarshal(data, &intermediate); err == nil {
if err := schema.ValidateIntermediateTranscript(intermediate); err == nil {
return Artifact{
Schema: SchemaIntermediate,
Intermediate: &intermediate,
}, nil
}
}
var minimal schema.MinimalTranscript
if err := json.Unmarshal(data, &minimal); err == nil {
if err := schema.ValidateMinimalTranscript(minimal); err == nil {
return Artifact{
Schema: SchemaMinimal,
Minimal: &minimal,
}, nil
}
}
return Artifact{}, fmt.Errorf("input JSON is not a valid seriatim output artifact")
}
// ValidateArtifact validates an artifact against its declared schema.
func ValidateArtifact(artifact Artifact) error {
switch artifact.Schema {
case SchemaFull:
if artifact.Full == nil {
return fmt.Errorf("full artifact payload is missing")
}
return schema.ValidateTranscript(*artifact.Full)
case SchemaIntermediate:
if artifact.Intermediate == nil {
return fmt.Errorf("intermediate artifact payload is missing")
}
return schema.ValidateIntermediateTranscript(*artifact.Intermediate)
case SchemaMinimal:
if artifact.Minimal == nil {
return fmt.Errorf("minimal artifact payload is missing")
}
return schema.ValidateMinimalTranscript(*artifact.Minimal)
default:
return fmt.Errorf("unsupported artifact schema %q", artifact.Schema)
}
}
// Value returns the artifact value for JSON serialization.
func (artifact Artifact) Value() any {
switch artifact.Schema {
case SchemaFull:
if artifact.Full == nil {
return schema.Transcript{}
}
return *artifact.Full
case SchemaIntermediate:
if artifact.Intermediate == nil {
return schema.IntermediateTranscript{}
}
return *artifact.Intermediate
case SchemaMinimal:
if artifact.Minimal == nil {
return schema.MinimalTranscript{}
}
return *artifact.Minimal
default:
return nil
}
}
// SegmentCount returns the number of segments in the artifact.
func (artifact Artifact) SegmentCount() int {
switch artifact.Schema {
case SchemaFull:
if artifact.Full == nil {
return 0
}
return len(artifact.Full.Segments)
case SchemaIntermediate:
if artifact.Intermediate == nil {
return 0
}
return len(artifact.Intermediate.Segments)
case SchemaMinimal:
if artifact.Minimal == nil {
return 0
}
return len(artifact.Minimal.Segments)
default:
return 0
}
}
// Application returns artifact metadata application name.
func (artifact Artifact) Application() string {
switch artifact.Schema {
case SchemaFull:
if artifact.Full == nil {
return ""
}
return artifact.Full.Metadata.Application
case SchemaIntermediate:
if artifact.Intermediate == nil {
return ""
}
return artifact.Intermediate.Metadata.Application
case SchemaMinimal:
if artifact.Minimal == nil {
return ""
}
return artifact.Minimal.Metadata.Application
default:
return ""
}
}
// Version returns artifact metadata version.
func (artifact Artifact) Version() string {
switch artifact.Schema {
case SchemaFull:
if artifact.Full == nil {
return ""
}
return artifact.Full.Metadata.Version
case SchemaIntermediate:
if artifact.Intermediate == nil {
return ""
}
return artifact.Intermediate.Metadata.Version
case SchemaMinimal:
if artifact.Minimal == nil {
return ""
}
return artifact.Minimal.Metadata.Version
default:
return ""
}
}
// ApplyArtifact trims a parsed artifact while preserving its input schema.
func ApplyArtifact(input Artifact, opts Options) (ApplyArtifactResult, error) {
switch input.Schema {
case SchemaFull:
if input.Full == nil {
return ApplyArtifactResult{}, fmt.Errorf("full artifact payload is missing")
}
result, err := Apply(*input.Full, opts)
if err != nil {
return ApplyArtifactResult{}, err
}
out := result.Transcript
return ApplyArtifactResult{
Artifact: Artifact{
Schema: SchemaFull,
Full: &out,
},
OldToNewID: result.OldToNewID,
RemovedIDs: result.RemovedIDs,
OverlapGroupsRecomputed: true,
}, nil
case SchemaIntermediate:
if input.Intermediate == nil {
return ApplyArtifactResult{}, fmt.Errorf("intermediate artifact payload is missing")
}
result, err := ApplyIntermediate(*input.Intermediate, opts)
if err != nil {
return ApplyArtifactResult{}, err
}
out := result.Transcript
return ApplyArtifactResult{
Artifact: Artifact{
Schema: SchemaIntermediate,
Intermediate: &out,
},
OldToNewID: result.OldToNewID,
RemovedIDs: result.RemovedIDs,
OverlapGroupsRecomputed: false,
}, nil
case SchemaMinimal:
if input.Minimal == nil {
return ApplyArtifactResult{}, fmt.Errorf("minimal artifact payload is missing")
}
result, err := ApplyMinimal(*input.Minimal, opts)
if err != nil {
return ApplyArtifactResult{}, err
}
out := result.Transcript
return ApplyArtifactResult{
Artifact: Artifact{
Schema: SchemaMinimal,
Minimal: &out,
},
OldToNewID: result.OldToNewID,
RemovedIDs: result.RemovedIDs,
OverlapGroupsRecomputed: false,
}, nil
default:
return ApplyArtifactResult{}, fmt.Errorf("unsupported artifact schema %q", input.Schema)
}
}
// ConvertArtifact converts a parsed artifact to another supported output schema.
func ConvertArtifact(input Artifact, outputSchema string) (Artifact, error) {
if outputSchema == "" || outputSchema == input.Schema {
return input, nil
}
switch input.Schema {
case SchemaFull:
if input.Full == nil {
return Artifact{}, fmt.Errorf("full artifact payload is missing")
}
switch outputSchema {
case SchemaIntermediate:
out := intermediateFromFull(*input.Full)
return Artifact{
Schema: SchemaIntermediate,
Intermediate: &out,
}, nil
case SchemaMinimal:
out := minimalFromFull(*input.Full)
return Artifact{
Schema: SchemaMinimal,
Minimal: &out,
}, nil
default:
return Artifact{}, fmt.Errorf("unsupported output schema %q", outputSchema)
}
case SchemaIntermediate:
if input.Intermediate == nil {
return Artifact{}, fmt.Errorf("intermediate artifact payload is missing")
}
switch outputSchema {
case SchemaMinimal:
out := minimalFromIntermediate(*input.Intermediate)
return Artifact{
Schema: SchemaMinimal,
Minimal: &out,
}, nil
case SchemaFull:
return Artifact{}, fmt.Errorf("cannot emit %q from %q input artifact", SchemaFull, SchemaIntermediate)
default:
return Artifact{}, fmt.Errorf("unsupported output schema %q", outputSchema)
}
case SchemaMinimal:
if input.Minimal == nil {
return Artifact{}, fmt.Errorf("minimal artifact payload is missing")
}
switch outputSchema {
case SchemaIntermediate:
out := intermediateFromMinimal(*input.Minimal)
return Artifact{
Schema: SchemaIntermediate,
Intermediate: &out,
}, nil
case SchemaFull:
return Artifact{}, fmt.Errorf("cannot emit %q from %q input artifact", SchemaFull, SchemaMinimal)
default:
return Artifact{}, fmt.Errorf("unsupported output schema %q", outputSchema)
}
default:
return Artifact{}, fmt.Errorf("unsupported input schema %q", input.Schema)
}
}
func intermediateFromFull(input schema.Transcript) schema.IntermediateTranscript {
segments := make([]schema.IntermediateSegment, len(input.Segments))
for index, segment := range input.Segments {
segments[index] = schema.IntermediateSegment{
ID: segment.ID,
Start: segment.Start,
End: segment.End,
Speaker: segment.Speaker,
Text: segment.Text,
Categories: append([]string(nil), segment.Categories...),
}
}
return schema.IntermediateTranscript{
Metadata: schema.IntermediateMetadata{
Application: input.Metadata.Application,
Version: input.Metadata.Version,
OutputSchema: SchemaIntermediate,
},
Segments: segments,
}
}
func minimalFromFull(input schema.Transcript) schema.MinimalTranscript {
segments := make([]schema.MinimalSegment, len(input.Segments))
for index, segment := range input.Segments {
segments[index] = schema.MinimalSegment{
ID: segment.ID,
Start: segment.Start,
End: segment.End,
Speaker: segment.Speaker,
Text: segment.Text,
}
}
return schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: input.Metadata.Application,
Version: input.Metadata.Version,
OutputSchema: SchemaMinimal,
},
Segments: segments,
}
}
func minimalFromIntermediate(input schema.IntermediateTranscript) schema.MinimalTranscript {
segments := make([]schema.MinimalSegment, len(input.Segments))
for index, segment := range input.Segments {
segments[index] = schema.MinimalSegment{
ID: segment.ID,
Start: segment.Start,
End: segment.End,
Speaker: segment.Speaker,
Text: segment.Text,
}
}
return schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: input.Metadata.Application,
Version: input.Metadata.Version,
OutputSchema: SchemaMinimal,
},
Segments: segments,
}
}
func intermediateFromMinimal(input schema.MinimalTranscript) schema.IntermediateTranscript {
segments := make([]schema.IntermediateSegment, len(input.Segments))
for index, segment := range input.Segments {
segments[index] = schema.IntermediateSegment{
ID: segment.ID,
Start: segment.Start,
End: segment.End,
Speaker: segment.Speaker,
Text: segment.Text,
}
}
return schema.IntermediateTranscript{
Metadata: schema.IntermediateMetadata{
Application: input.Metadata.Application,
Version: input.Metadata.Version,
OutputSchema: SchemaIntermediate,
},
Segments: segments,
}
}
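`ParseArtifactJSON` above tries the richest schema first and falls back to progressively smaller ones, accepting the first payload that both unmarshals and passes its own validation. A standalone sketch of that try-in-order pattern, using toy stand-in types rather than the real `schema` package:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Toy stand-ins for the real schema types.
type fullDoc struct {
	Segments []struct {
		ID     int    `json:"id"`
		Source string `json:"source"`
	} `json:"segments"`
}

type minimalDoc struct {
	Segments []struct {
		ID int `json:"id"`
	} `json:"segments"`
}

// parseAny tries full first, then minimal; each candidate must
// unmarshal AND pass its own validation rule to be accepted.
func parseAny(data []byte) (string, error) {
	var full fullDoc
	if err := json.Unmarshal(data, &full); err == nil {
		valid := len(full.Segments) > 0
		for _, s := range full.Segments {
			if s.Source == "" {
				valid = false // the "full" schema requires a source
			}
		}
		if valid {
			return "full", nil
		}
	}
	var minimal minimalDoc
	if err := json.Unmarshal(data, &minimal); err == nil && len(minimal.Segments) > 0 {
		return "minimal", nil
	}
	return "", errors.New("not a valid artifact")
}

func main() {
	schema, _ := parseAny([]byte(`{"segments":[{"id":1}]}`))
	fmt.Println(schema) // "minimal": the segment lacks a "source" field
}
```

Ordering matters: because Go's `json.Unmarshal` ignores unknown fields, a smaller document often decodes into a larger struct too, so the most demanding schema must be attempted (and validated) first.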


@@ -0,0 +1,138 @@
package trim
import (
"encoding/json"
"strings"
"testing"
"gitea.maximumdirect.net/eric/seriatim/schema"
)
func TestParseArtifactJSONRejectsMalformedJSON(t *testing.T) {
_, err := ParseArtifactJSON([]byte(`{"metadata":`))
if err == nil {
t.Fatal("expected malformed JSON error")
}
if !strings.Contains(err.Error(), "input JSON is malformed") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestParseArtifactJSONRejectsDuplicateSegmentIDs(t *testing.T) {
first := 10
second := 20
value := schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
},
Segments: []schema.Segment{
{ID: 1, Source: "a.json", SourceSegmentIndex: &first, Speaker: "A", Start: 1, End: 2, Text: "one"},
{ID: 1, Source: "a.json", SourceSegmentIndex: &second, Speaker: "B", Start: 2, End: 3, Text: "two"},
},
OverlapGroups: []schema.OverlapGroup{},
}
data := mustMarshalJSON(t, value)
_, err := ParseArtifactJSON(data)
if err == nil {
t.Fatal("expected invalid artifact error")
}
if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestParseArtifactJSONRejectsNonSequentialSegmentIDs(t *testing.T) {
first := 10
second := 20
value := schema.Transcript{
Metadata: schema.Metadata{
Application: "seriatim",
Version: "v-test",
},
Segments: []schema.Segment{
{ID: 1, Source: "a.json", SourceSegmentIndex: &first, Speaker: "A", Start: 1, End: 2, Text: "one"},
{ID: 3, Source: "a.json", SourceSegmentIndex: &second, Speaker: "B", Start: 2, End: 3, Text: "two"},
},
OverlapGroups: []schema.OverlapGroup{},
}
data := mustMarshalJSON(t, value)
_, err := ParseArtifactJSON(data)
if err == nil {
t.Fatal("expected invalid artifact error")
}
if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
t.Fatalf("unexpected error: %v", err)
}
}
func TestConvertArtifactMinimalToIntermediate(t *testing.T) {
value := schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: "seriatim",
Version: "v-test",
OutputSchema: SchemaMinimal,
},
Segments: []schema.MinimalSegment{
{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one"},
{ID: 2, Start: 2, End: 3, Speaker: "B", Text: "two"},
},
}
artifact := Artifact{
Schema: SchemaMinimal,
Minimal: &value,
}
converted, err := ConvertArtifact(artifact, SchemaIntermediate)
if err != nil {
t.Fatalf("convert failed: %v", err)
}
if converted.Schema != SchemaIntermediate {
t.Fatalf("schema = %q, want %q", converted.Schema, SchemaIntermediate)
}
if converted.Intermediate == nil {
t.Fatal("expected intermediate artifact")
}
if len(converted.Intermediate.Segments) != 2 {
t.Fatalf("segment count = %d, want 2", len(converted.Intermediate.Segments))
}
if converted.Intermediate.Segments[0].ID != 1 || converted.Intermediate.Segments[1].ID != 2 {
t.Fatalf("unexpected IDs: %#v", converted.Intermediate.Segments)
}
}
func TestConvertArtifactMinimalToFullFails(t *testing.T) {
value := schema.MinimalTranscript{
Metadata: schema.MinimalMetadata{
Application: "seriatim",
Version: "v-test",
OutputSchema: SchemaMinimal,
},
Segments: []schema.MinimalSegment{
{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one"},
},
}
artifact := Artifact{
Schema: SchemaMinimal,
Minimal: &value,
}
_, err := ConvertArtifact(artifact, SchemaFull)
if err == nil {
t.Fatal("expected conversion error")
}
if !strings.Contains(err.Error(), "cannot emit") {
t.Fatalf("unexpected error: %v", err)
}
}
func mustMarshalJSON(t *testing.T, value any) []byte {
t.Helper()
data, err := json.Marshal(value)
if err != nil {
t.Fatalf("marshal: %v", err)
}
return data
}

internal/trim/selector.go

@@ -0,0 +1,156 @@
package trim
import (
"fmt"
"regexp"
"sort"
"strconv"
"strings"
)
var selectorElementPattern = regexp.MustCompile(`^([+-]?\d+)(?:\s*-\s*([+-]?\d+))?$`)
// Selector represents a normalized union of segment IDs.
type Selector struct {
ranges []idRange
}
type idRange struct {
start int
end int
}
// ParseSelector parses an inline segment selector expression.
func ParseSelector(input string) (Selector, error) {
if strings.TrimSpace(input) == "" {
return Selector{}, fmt.Errorf("selector cannot be empty")
}
parts := strings.Split(input, ",")
ranges := make([]idRange, 0, len(parts))
for index, raw := range parts {
element := strings.TrimSpace(raw)
if element == "" {
return Selector{}, fmt.Errorf("selector element %d cannot be empty", index+1)
}
rangeValue, err := parseElement(element)
if err != nil {
return Selector{}, fmt.Errorf("selector element %d %q: %w", index+1, element, err)
}
ranges = append(ranges, rangeValue)
}
normalized := normalizeRanges(ranges)
if len(normalized) == 0 {
return Selector{}, fmt.Errorf("selector cannot be empty")
}
return Selector{ranges: normalized}, nil
}
// Contains returns true when id is included by this selector.
func (s Selector) Contains(id int) bool {
if id <= 0 {
return false
}
index := sort.Search(len(s.ranges), func(i int) bool {
return s.ranges[i].end >= id
})
if index == len(s.ranges) {
return false
}
rangeValue := s.ranges[index]
return id >= rangeValue.start && id <= rangeValue.end
}
// IDs returns a deterministic ascending list of unique segment IDs.
func (s Selector) IDs() []int {
total := 0
for _, rangeValue := range s.ranges {
total += rangeValue.end - rangeValue.start + 1
}
ids := make([]int, 0, total)
for _, rangeValue := range s.ranges {
for id := rangeValue.start; id <= rangeValue.end; id++ {
ids = append(ids, id)
}
}
return ids
}
func parseElement(element string) (idRange, error) {
matches := selectorElementPattern.FindStringSubmatch(element)
if matches == nil {
return idRange{}, fmt.Errorf("malformed element")
}
start, err := parseID(matches[1])
if err != nil {
return idRange{}, err
}
if matches[2] == "" {
return idRange{start: start, end: start}, nil
}
end, err := parseID(matches[2])
if err != nil {
return idRange{}, fmt.Errorf("invalid range end: %w", err)
}
if start > end {
return idRange{}, fmt.Errorf("descending range %d-%d is invalid", start, end)
}
return idRange{start: start, end: end}, nil
}
func parseID(value string) (int, error) {
value = strings.TrimSpace(value)
if value == "" {
return 0, fmt.Errorf("missing segment ID")
}
id, err := strconv.Atoi(value)
if err != nil {
return 0, fmt.Errorf("segment ID must be an integer")
}
if id <= 0 {
return 0, fmt.Errorf("segment ID must be positive")
}
return id, nil
}
func normalizeRanges(in []idRange) []idRange {
if len(in) == 0 {
return nil
}
sorted := make([]idRange, len(in))
copy(sorted, in)
sort.Slice(sorted, func(i, j int) bool {
if sorted[i].start == sorted[j].start {
return sorted[i].end < sorted[j].end
}
return sorted[i].start < sorted[j].start
})
merged := make([]idRange, 0, len(sorted))
for _, next := range sorted {
if len(merged) == 0 {
merged = append(merged, next)
continue
}
last := &merged[len(merged)-1]
if next.start <= last.end+1 {
if next.end > last.end {
last.end = next.end
}
continue
}
merged = append(merged, next)
}
return merged
}


@@ -0,0 +1,127 @@
package trim
import (
"strings"
"testing"
)
func TestParseSelectorSingleID(t *testing.T) {
selector, err := ParseSelector("1")
if err != nil {
t.Fatalf("parse failed: %v", err)
}
assertIDs(t, selector, []int{1})
assertContains(t, selector, map[int]bool{1: true, 2: false, 0: false, -1: false})
}
func TestParseSelectorInclusiveRange(t *testing.T) {
selector, err := ParseSelector("1-3")
if err != nil {
t.Fatalf("parse failed: %v", err)
}
assertIDs(t, selector, []int{1, 2, 3})
}
func TestParseSelectorCommaSeparatedCombination(t *testing.T) {
selector, err := ParseSelector("1-3,8,10-12")
if err != nil {
t.Fatalf("parse failed: %v", err)
}
assertIDs(t, selector, []int{1, 2, 3, 8, 10, 11, 12})
}
func TestParseSelectorWhitespaceTolerance(t *testing.T) {
selector, err := ParseSelector(" 1 - 3 , 8 , 10 - 12 ")
if err != nil {
t.Fatalf("parse failed: %v", err)
}
assertIDs(t, selector, []int{1, 2, 3, 8, 10, 11, 12})
}
func TestParseSelectorDuplicatesAndOverlapsNormalizeUnion(t *testing.T) {
selector, err := ParseSelector("1-4,2,4,3-6,6")
if err != nil {
t.Fatalf("parse failed: %v", err)
}
assertIDs(t, selector, []int{1, 2, 3, 4, 5, 6})
assertContains(t, selector, map[int]bool{1: true, 5: true, 6: true, 7: false})
}
func TestParseSelectorDeterministicNormalizedOutput(t *testing.T) {
left, err := ParseSelector("8,1-3,2,10-12")
if err != nil {
t.Fatalf("parse left failed: %v", err)
}
right, err := ParseSelector("10-12,3,2,1,8")
if err != nil {
t.Fatalf("parse right failed: %v", err)
}
leftIDs := left.IDs()
rightIDs := right.IDs()
if !equalInts(leftIDs, rightIDs) {
t.Fatalf("normalized IDs mismatch: %v vs %v", leftIDs, rightIDs)
}
}
func TestParseSelectorFailures(t *testing.T) {
tests := []struct {
name string
selector string
wantError string
}{
{name: "empty", selector: "", wantError: "cannot be empty"},
{name: "whitespace only", selector: " ", wantError: "cannot be empty"},
{name: "zero", selector: "0", wantError: "must be positive"},
{name: "negative", selector: "-1", wantError: "must be positive"},
{name: "range includes zero", selector: "0-2", wantError: "must be positive"},
{name: "descending range", selector: "10-1", wantError: "descending range"},
{name: "empty element", selector: "1,,2", wantError: "cannot be empty"},
{name: "trailing comma", selector: "1,", wantError: "cannot be empty"},
{name: "malformed alpha", selector: "abc", wantError: "malformed element"},
{name: "malformed range", selector: "1-2-3", wantError: "malformed element"},
{name: "missing end", selector: "1-", wantError: "malformed element"},
{name: "missing start", selector: "-2", wantError: "must be positive"},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
_, err := ParseSelector(test.selector)
if err == nil {
t.Fatalf("expected error for %q", test.selector)
}
if !strings.Contains(err.Error(), test.wantError) {
t.Fatalf("error = %q, want substring %q", err.Error(), test.wantError)
}
})
}
}
func assertIDs(t *testing.T, selector Selector, want []int) {
t.Helper()
got := selector.IDs()
if !equalInts(got, want) {
t.Fatalf("IDs = %v, want %v", got, want)
}
}
func assertContains(t *testing.T, selector Selector, checks map[int]bool) {
t.Helper()
for id, want := range checks {
if got := selector.Contains(id); got != want {
t.Fatalf("Contains(%d) = %t, want %t", id, got, want)
}
}
}
func equalInts(left []int, right []int) bool {
if len(left) != len(right) {
return false
}
for index := range left {
if left[index] != right[index] {
return false
}
}
return true
}