Harden trim integration

Document trim command
Add trim report output
2026-05-08 15:00:46 +00:00 · 2026-05-08 14:57:52 +00:00 · 2026-05-08 14:56:24 +00:00 · 2026-05-08 14:53:59 +00:00 · 2026-05-08 14:47:52 +00:00 · 2026-05-08 14:44:31 +00:00
14 changed files with 3079 additions and 3 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -1,3 +1,7 @@
+# ---> Codex
+.codex
+AGENTS.md
+
 # ---> Go
 # If you prefer the allow list template instead of the deny list, see community template:
 # https://github.com/github/gitignore/blob/main/community/Golang/Go.AllowList.gitignore
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 # seriatim

-`seriatim` merges per-speaker WhisperX-style JSON transcripts into a single JSON transcript that preserves speaker identity and chronological order.
+`seriatim` merges per-speaker WhisperX-style JSON transcripts into a single JSON transcript that preserves speaker identity and chronological order. It also trims existing seriatim output artifacts by segment ID.

-The current implementation supports the `merge` command. It reads one or more input JSON files, optionally maps each input file to a canonical speaker using `speakers.yml`, sorts all segments by timestamp, detects and resolves overlaps when word-level timing is available, assigns consecutive numeric `id` values, and writes a merged JSON artifact.
+The current implementation supports the `merge` and `trim` commands. `merge` reads one or more input JSON files, optionally maps each input file to a canonical speaker using `speakers.yml`, sorts all segments by timestamp, detects and resolves overlaps when word-level timing is available, assigns consecutive numeric `id` values, and writes a merged JSON artifact. `trim` reads an existing seriatim output artifact and projects it to a retained segment subset.

 ## Usage

@@ -25,10 +25,20 @@ go run ./cmd/seriatim merge \
  --report-file report.json
 ```

+Trim an existing seriatim artifact:
+
+```sh
+go run ./cmd/seriatim trim \
+  --input-file merged.json \
+  --output-file trimmed.json \
+  --keep "1-10, 15, 20-25"
+```
+
 ## CLI

 ```text
 seriatim merge [flags]
+seriatim trim [flags]
 ```

 Global flags:
@@ -54,6 +64,50 @@ Global flags:
 | `--postprocessing-modules` | No | `detect-overlaps,resolve-overlaps,backchannel,filler,resolve-danglers,coalesce,detect-overlaps,autocorrect,assign-ids,validate-output` | Comma-separated postprocessing modules, evaluated in order. |
 | `--coalesce-gap` | No | `3.0` | Maximum same-speaker gap in seconds for `coalesce`; also used as the `resolve-overlaps` context window. Must be a non-negative float. |

+`trim` flags:
+
+| Flag | Required | Default | Description |
+| --- | --- | --- | --- |
+| `--input-file` | Yes | none | Input seriatim output artifact JSON file. |
+| `--output-file` | Yes | none | Trimmed transcript JSON output path. |
+| `--keep` | Exactly one of `--keep` or `--remove` is required | none | Segment ID selector to retain. |
+| `--remove` | Exactly one of `--keep` or `--remove` is required | none | Segment ID selector to drop. |
+| `--output-schema` | No | preserve input artifact schema | Optional output schema override: `seriatim-minimal`, `seriatim-intermediate`, or `seriatim-full`. |
+| `--report-file` | No | none | Optional report JSON output path. |
+| `--allow-empty` | No | `false` | Allow trimming to zero retained segments. |
+
+`trim` selection rules:
+
+- `--keep` and `--remove` are mutually exclusive.
+- Exactly one of `--keep` or `--remove` is required.
+- Selection is by segment ID only.
+- Selecting a segment ID that does not exist in the input artifact fails the command.
+
+`trim` selector syntax:
+
+- Segment IDs are positive 1-based integers.
+- Inclusive ranges are supported: `1-10`.
+- Comma-separated selectors are supported: `1-10,15,20-25`.
+- Whitespace around numbers, commas, and hyphens is allowed: `1 - 10, 15, 20 - 25`.
+- Duplicate and overlapping ranges are accepted and normalized as a union.
+- Descending ranges (for example `10-1`) are rejected.
+
+`trim` behavior:
+
+- `trim` consumes existing seriatim JSON output artifacts only.
+- `trim` does not accept raw WhisperX transcript JSON as input.
+- Retained output segment IDs are renumbered sequentially from `1` to `N`.
+- Transcript order is preserved from input transcript order; selector order does not reorder output.
+- When output schema is `seriatim-full`, overlap groups are recomputed from retained segments.
+- `--output-schema seriatim-full` is supported when trim has full-schema artifact data to emit; trim does not synthesize missing full-schema provenance from minimal/intermediate input artifacts.
+- `trim` does not run merge postprocessors such as `resolve-overlaps`, `coalesce`, or `autocorrect`.
+
+`trim` report output:
+
+- When `--report-file` is provided, the report includes standard trim/validation/output events.
+- The report includes a `trim-audit` event containing trim operation metadata, including selected IDs, retained/removed counts, removed IDs, and old-to-new segment ID mapping.
+- Old-to-new ID mapping is emitted as a deterministic ordered array of `{old_id, new_id}` pairs.
+
 Environment variables:

 | Environment Variable | Default | Description |
--- a/architecture.md
+++ b/architecture.md
@@ -1,6 +1,9 @@
 # seriatim Architecture

-`seriatim` is a deterministic transcript merge utility for combining multiple per-speaker transcript inputs into a single chronologically ordered diarized transcript.
+`seriatim` is a deterministic transcript utility for:
+
+- merging multiple per-speaker transcript inputs into a single chronologically ordered diarized transcript, and
+- projecting existing seriatim transcript artifacts through deterministic segment-ID trimming.

 The initial use case is merging independently transcribed speaker audio tracks from the same recorded session, such as a weekly tabletop RPG session. The architecture should also support meetings, podcasts, interviews, and other multi-speaker events.

@@ -20,6 +23,7 @@ The initial use case is merging independently transcribed speaker audio tracks f
 8. Detect and annotate overlapping speech regions.
 9. Emit one or more output artifacts through output writers.
 10. Produce report data for validation findings, corrections, and transformations.
+11. Support artifact-level transcript projection commands that operate on existing seriatim output.

 ## Non-goals

@@ -56,6 +60,8 @@ configuration check

 Each stage has an explicit data contract. Input and output stages perform I/O. Processing stages should be deterministic transformations over in-memory models and should record report events for validation findings, corrections, and transformations.

+`merge` runs this pipeline. `trim` is intentionally separate from this pipeline and operates at the artifact layer.
+
 ## Stage Contracts

 ### 1. Configuration Check
@@ -191,6 +197,23 @@ Future output formats may include:

 Output writers should be selected from an explicit registry and should consume the final transcript model read-only. Multiple output writers may run for a single invocation.

+### 7. Artifact Projection Stage (`trim` command)
+
+`trim` is an artifact-level command that reads an existing seriatim output artifact and emits a projected artifact containing a segment-ID subset.
+
+Design constraints:
+
+- `trim` runs after `merge`, not as a merge postprocessor.
+- `trim` validates the input artifact against supported seriatim output schemas.
+- `trim` performs deterministic keep/remove selection by segment ID.
+- `trim` renumbers retained IDs to `1..N` in transcript order.
+- `trim` validates the final output against the selected output schema before writing.
+- `trim` records audit metadata in report output.
+
+`trim` is intentionally separate from merge postprocessing because it consumes already-emitted public artifacts. This separation keeps merge semantics stable and avoids rerunning merge-only transforms on projected artifacts.
+
+`trim` must not rerun merge postprocessors such as `resolve-overlaps`, `coalesce`, or `autocorrect`.
+
 ## Module Classification

 Modules should be classified by their contract and allowed effects.
@@ -397,6 +420,8 @@ A valid merged transcript should satisfy:
 - Every referenced segment exists.
 - Output validates against the selected output schema.

+For full-schema trim output, overlap groups are recomputed from retained segments so overlap annotations and group references remain internally consistent after projection.
+
 ## Determinism Requirements

 Given the same inputs, config, and application version, `seriatim` should produce byte-stable JSON output where practical.
@@ -411,6 +436,12 @@ To support this:
 - Record application version in output metadata.
 - Record enabled module names and module order in output metadata or report data.

+Trim-specific determinism requirements:
+
+- Selector normalization and retained IDs are deterministic.
+- Old-to-new ID mapping in trim reports is emitted in deterministic order.
+- Full-schema overlap recomputation is deterministic for the same input artifact and selector.
+
 ## Go Package Layout

 ```text
@@ -419,6 +450,7 @@ internal/config/         CLI/env/config loading and validation
 internal/pipeline/       Pipeline orchestration and module registry
 internal/builtin/        Built-in pipeline modules
 internal/artifact/       Conversion from internal model to public output schema
+internal/trim/           Artifact parsing, trim selection, schema conversion, overlap recomputation for full schema
 internal/buildinfo/      Build-time version metadata
 internal/speaker/        Speaker map parsing and lookup
 internal/model/          Canonical and merged transcript models
@@ -430,6 +462,12 @@ schema/                  Public output contract and JSON Schema validation

 Package boundaries should follow data ownership. Shared models belong in `internal/model`; stage-specific behavior belongs in the relevant stage package.

+For trim:
+
+- `internal/trim` contains pure transformation logic over artifact structs.
+- CLI command code handles only flag parsing, file I/O, and report emission.
+- Because I/O is confined to the command layer, the transform logic stays deterministic and side-effect free.
+
 ## Default Modules

 The default pipeline is equivalent to explicit module lists.
--- a/internal/cli/root.go
+++ b/internal/cli/root.go
@@ -17,5 +17,6 @@ func NewRootCommand() *cobra.Command {
 	}

 	cmd.AddCommand(newMergeCommand())
+	cmd.AddCommand(newTrimCommand())
 	return cmd
 }
--- a/internal/cli/trim.go
+++ b/internal/cli/trim.go
@@ -0,0 +1,191 @@
+package cli
+
+import (
+	"encoding/json"
+	"fmt"
+	"os"
+	"sort"
+
+	"github.com/spf13/cobra"
+
+	"gitea.maximumdirect.net/eric/seriatim/internal/config"
+	"gitea.maximumdirect.net/eric/seriatim/internal/report"
+	triminternal "gitea.maximumdirect.net/eric/seriatim/internal/trim"
+)
+
+type trimAuditReport struct {
+	Operation               string          `json:"operation"`
+	InputFile               string          `json:"input_file"`
+	OutputFile              string          `json:"output_file"`
+	InputSchema             string          `json:"input_schema"`
+	OutputSchema            string          `json:"output_schema"`
+	Mode                    string          `json:"mode"`
+	Selector                string          `json:"selector"`
+	SelectedIDs             []int           `json:"selected_ids"`
+	AllowEmpty              bool            `json:"allow_empty"`
+	InputSegmentCount       int             `json:"input_segment_count"`
+	RetainedSegmentCount    int             `json:"retained_segment_count"`
+	RemovedSegmentCount     int             `json:"removed_segment_count"`
+	RemovedInputIDs         []int           `json:"removed_input_ids"`
+	OldToNewIDMapping       []trimIDMapping `json:"old_to_new_id_mapping"`
+	OverlapGroupsRecomputed bool            `json:"overlap_groups_recomputed"`
+}
+
+type trimIDMapping struct {
+	OldID int `json:"old_id"`
+	NewID int `json:"new_id"`
+}
+
+func newTrimCommand() *cobra.Command {
+	var opts config.TrimOptions
+
+	cmd := &cobra.Command{
+		Use:   "trim",
+		Short: "Trim an existing seriatim transcript artifact by segment ID",
+		RunE: func(cmd *cobra.Command, args []string) error {
+			trimOpts := opts
+			if !cmd.Flags().Changed("output-schema") {
+				trimOpts.OutputSchema = ""
+			}
+
+			cfg, err := config.NewTrimConfig(trimOpts)
+			if err != nil {
+				return err
+			}
+
+			selector, err := triminternal.ParseSelector(cfg.Selector)
+			if err != nil {
+				return fmt.Errorf("invalid selector %q: %w", cfg.Selector, err)
+			}
+
+			data, err := os.ReadFile(cfg.InputFile)
+			if err != nil {
+				return fmt.Errorf("read --input-file %q: %w", cfg.InputFile, err)
+			}
+
+			artifact, err := triminternal.ParseArtifactJSON(data)
+			if err != nil {
+				return fmt.Errorf("--input-file %q: %w", cfg.InputFile, err)
+			}
+			inputSegmentCount := artifact.SegmentCount()
+			inputSchema := artifact.Schema
+
+			mode := triminternal.ModeKeep
+			if cfg.Mode == "remove" {
+				mode = triminternal.ModeRemove
+			}
+
+			trimmed, err := triminternal.ApplyArtifact(artifact, triminternal.Options{
+				Mode:       mode,
+				Selector:   selector,
+				AllowEmpty: cfg.AllowEmpty,
+			})
+			if err != nil {
+				return err
+			}
+
+			outputSchema := artifact.Schema
+			if cfg.OutputSchema != "" {
+				outputSchema = cfg.OutputSchema
+			}
+
+			outputArtifact, err := triminternal.ConvertArtifact(trimmed.Artifact, outputSchema)
+			if err != nil {
+				return err
+			}
+
+			if err := triminternal.ValidateArtifact(outputArtifact); err != nil {
+				return fmt.Errorf("validate trimmed output: %w", err)
+			}
+
+			if err := writeOutputJSON(cfg.OutputFile, outputArtifact.Value()); err != nil {
+				return err
+			}
+
+			if cfg.ReportFile != "" {
+				audit := trimAuditReport{
+					Operation:               "trim",
+					InputFile:               cfg.InputFile,
+					OutputFile:              cfg.OutputFile,
+					InputSchema:             inputSchema,
+					OutputSchema:            outputArtifact.Schema,
+					Mode:                    cfg.Mode,
+					Selector:                cfg.Selector,
+					SelectedIDs:             selector.IDs(),
+					AllowEmpty:              cfg.AllowEmpty,
+					InputSegmentCount:       inputSegmentCount,
+					RetainedSegmentCount:    len(trimmed.OldToNewID),
+					RemovedSegmentCount:     len(trimmed.RemovedIDs),
+					RemovedInputIDs:         append([]int(nil), trimmed.RemovedIDs...),
+					OldToNewIDMapping:       orderedIDMapping(trimmed.OldToNewID),
+					OverlapGroupsRecomputed: trimmed.OverlapGroupsRecomputed,
+				}
+				auditJSON, err := json.Marshal(audit)
+				if err != nil {
+					return fmt.Errorf("marshal trim audit report: %w", err)
+				}
+
+				rpt := report.Report{
+					Metadata: report.Metadata{
+						Application:   outputArtifact.Application(),
+						Version:       outputArtifact.Version(),
+						InputReader:   "trim-artifact",
+						InputFiles:    []string{cfg.InputFile},
+						OutputModules: []string{"json"},
+					},
+					Events: []report.Event{
+						report.Info("trim", "trim", fmt.Sprintf("trimmed %d input segment(s) into %d output segment(s) with mode=%s", inputSegmentCount, outputArtifact.SegmentCount(), cfg.Mode)),
+						report.Info("trim", "trim-audit", string(auditJSON)),
+						report.Info("trim", "validate-output", fmt.Sprintf("validated %d output segment(s)", outputArtifact.SegmentCount())),
+						report.Info("output", "json", "wrote transcript JSON"),
+					},
+				}
+				if err := report.WriteJSON(cfg.ReportFile, rpt); err != nil {
+					return err
+				}
+			}
+
+			return nil
+		},
+	}
+
+	flags := cmd.Flags()
+	flags.StringVar(&opts.InputFile, "input-file", "", "input seriatim transcript artifact JSON file")
+	flags.StringVar(&opts.OutputFile, "output-file", "", "output transcript JSON file")
+	flags.StringVar(&opts.ReportFile, "report-file", "", "optional report JSON file")
+	flags.StringVar(&opts.Keep, "keep", "", "segment ID selector to keep (for example: 1-10,15)")
+	flags.StringVar(&opts.Remove, "remove", "", "segment ID selector to remove (for example: 1-10,15)")
+	flags.StringVar(&opts.OutputSchema, "output-schema", "", "optional output JSON schema override: seriatim-minimal, seriatim-intermediate, or seriatim-full")
+	flags.BoolVar(&opts.AllowEmpty, "allow-empty", false, "allow trimming to an empty transcript")
+
+	return cmd
+}
+
+func writeOutputJSON(path string, value any) error {
+	file, err := os.Create(path)
+	if err != nil {
+		return fmt.Errorf("create --output-file %q: %w", path, err)
+	}
+	defer file.Close()
+
+	enc := json.NewEncoder(file)
+	enc.SetIndent("", "  ")
+	return enc.Encode(value)
+}
+
+func orderedIDMapping(mapping map[int]int) []trimIDMapping {
+	keys := make([]int, 0, len(mapping))
+	for oldID := range mapping {
+		keys = append(keys, oldID)
+	}
+	sort.Ints(keys)
+
+	pairs := make([]trimIDMapping, 0, len(keys))
+	for _, oldID := range keys {
+		pairs = append(pairs, trimIDMapping{
+			OldID: oldID,
+			NewID: mapping[oldID],
+		})
+	}
+	return pairs
+}
--- a/internal/cli/trim_test.go
+++ b/internal/cli/trim_test.go
@@ -0,0 +1,758 @@
+package cli
+
+import (
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"gitea.maximumdirect.net/eric/seriatim/internal/config"
+	"gitea.maximumdirect.net/eric/seriatim/internal/report"
+	"gitea.maximumdirect.net/eric/seriatim/schema"
+)
+
+func TestTrimKeepModeEndToEnd(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "2,4",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.Transcript
+	readJSON(t, output, &transcript)
+	if len(transcript.Segments) != 2 {
+		t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
+	}
+	if transcript.Segments[0].Text != "two" || transcript.Segments[1].Text != "four" {
+		t.Fatalf("unexpected kept text order: %#v", transcript.Segments)
+	}
+	assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
+}
+
+func TestTrimRemoveModeEndToEnd(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--remove", "2,4",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.Transcript
+	readJSON(t, output, &transcript)
+	if len(transcript.Segments) != 2 {
+		t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
+	}
+	if transcript.Segments[0].Text != "one" || transcript.Segments[1].Text != "three" {
+		t.Fatalf("unexpected remaining text order: %#v", transcript.Segments)
+	}
+	assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
+}
+
+func TestTrimMutualExclusionFailure(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1",
+		"--remove", "2",
+	)
+	if err == nil {
+		t.Fatal("expected mutual exclusion error")
+	}
+	if !strings.Contains(err.Error(), "mutually exclusive") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimMissingSelectionFailure(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+	)
+	if err == nil {
+		t.Fatal("expected selection flag error")
+	}
+	if !strings.Contains(err.Error(), "exactly one of --keep or --remove is required") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimInvalidSelectedIDFailure(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "99",
+	)
+	if err == nil {
+		t.Fatal("expected missing selected ID error")
+	}
+	if !strings.Contains(err.Error(), "does not exist") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimOmittedOutputSchemaPreservesInputSchema(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimMinimalFixture(t, dir, "input-minimal.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.MinimalTranscript
+	readJSON(t, output, &transcript)
+	if transcript.Metadata.OutputSchema != config.OutputSchemaMinimal {
+		t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaMinimal)
+	}
+	if len(transcript.Segments) != 1 || transcript.Segments[0].ID != 1 {
+		t.Fatalf("unexpected minimal trim output: %#v", transcript.Segments)
+	}
+}
+
+func TestTrimExplicitOutputSchemaChangesOutputSchema(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1,3",
+		"--output-schema", config.OutputSchemaMinimal,
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.MinimalTranscript
+	readJSON(t, output, &transcript)
+	if transcript.Metadata.OutputSchema != config.OutputSchemaMinimal {
+		t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaMinimal)
+	}
+	if len(transcript.Segments) != 2 {
+		t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
+	}
+	assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
+}
+
+func TestTrimExplicitOutputSchemaConvertsMinimalToIntermediate(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimMinimalFixture(t, dir, "input-minimal.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1-2",
+		"--output-schema", config.OutputSchemaIntermediate,
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.IntermediateTranscript
+	readJSON(t, output, &transcript)
+	if transcript.Metadata.OutputSchema != config.OutputSchemaIntermediate {
+		t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaIntermediate)
+	}
+	if len(transcript.Segments) != 2 {
+		t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
+	}
+	assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
+}
+
+func TestTrimIntermediateInputPreservesIntermediateOutputAndCategories(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimIntermediateFixture(t, dir, "input-intermediate.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "2",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.IntermediateTranscript
+	readJSON(t, output, &transcript)
+	if transcript.Metadata.OutputSchema != config.OutputSchemaIntermediate {
+		t.Fatalf("output_schema = %q, want %q", transcript.Metadata.OutputSchema, config.OutputSchemaIntermediate)
+	}
+	if len(transcript.Segments) != 1 {
+		t.Fatalf("segment count = %d, want 1", len(transcript.Segments))
+	}
+	if transcript.Segments[0].ID != 1 {
+		t.Fatalf("segment ID = %d, want 1", transcript.Segments[0].ID)
+	}
+	assertIntSliceEqual(t, []int{len(transcript.Segments[0].Categories)}, []int{2})
+	if transcript.Segments[0].Categories[0] != "filler" || transcript.Segments[0].Categories[1] != "backchannel" {
+		t.Fatalf("categories = %v, want [filler backchannel]", transcript.Segments[0].Categories)
+	}
+}
+
+func TestTrimFullInputPreservesFullShapeAndRecomputesOverlapGroups(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullOverlapFixture(t, dir, "input-full-overlap.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1,2",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.Transcript
+	readJSON(t, output, &transcript)
+	if len(transcript.Segments) != 2 {
+		t.Fatalf("segment count = %d, want 2", len(transcript.Segments))
+	}
+	assertSequentialIDs(t, []int{transcript.Segments[0].ID, transcript.Segments[1].ID})
+	if len(transcript.OverlapGroups) != 1 {
+		t.Fatalf("overlap group count = %d, want 1", len(transcript.OverlapGroups))
+	}
+	if transcript.OverlapGroups[0].ID != 1 {
+		t.Fatalf("overlap group id = %d, want 1", transcript.OverlapGroups[0].ID)
+	}
+	if transcript.Segments[0].OverlapGroupID != 1 || transcript.Segments[1].OverlapGroupID != 1 {
+		t.Fatalf("segment overlap IDs = %d,%d, want 1,1", transcript.Segments[0].OverlapGroupID, transcript.Segments[1].OverlapGroupID)
+	}
+}
+
+func TestTrimMalformedSelectorFailsWithClearError(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1-",
+	)
+	if err == nil {
+		t.Fatal("expected malformed selector error")
+	}
+	if !strings.Contains(err.Error(), "invalid selector") || !strings.Contains(err.Error(), "malformed element") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimMalformedInputArtifactFailsClearly(t *testing.T) {
+	dir := t.TempDir()
+	input := writeJSONFile(t, dir, "broken.json", `{"metadata":`)
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1",
+	)
+	if err == nil {
+		t.Fatal("expected malformed artifact error")
+	}
+	if !strings.Contains(err.Error(), "input JSON is malformed") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimDuplicateInputSegmentIDsFail(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimMinimalWithIDsFixture(t, dir, "input-dup.json", []int{1, 1})
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1",
+	)
+	if err == nil {
+		t.Fatal("expected duplicate segment ID failure")
+	}
+	if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimNonSequentialInputSegmentIDsFail(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimMinimalWithIDsFixture(t, dir, "input-nonseq.json", []int{1, 3})
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1",
+	)
+	if err == nil {
+		t.Fatal("expected non-sequential segment ID failure")
+	}
+	if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimKeepSelectorWithOverlappingRanges(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1-3,2-4",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.Transcript
+	readJSON(t, output, &transcript)
+	if len(transcript.Segments) != 4 {
+		t.Fatalf("segment count = %d, want 4", len(transcript.Segments))
+	}
+	assertSequentialIDs(t, []int{
+		transcript.Segments[0].ID,
+		transcript.Segments[1].ID,
+		transcript.Segments[2].ID,
+		transcript.Segments[3].ID,
+	})
+}
+
+func TestTrimRemoveSelectorWithOverlappingRanges(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--remove", "2-3,3-4",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.Transcript
+	readJSON(t, output, &transcript)
+	if len(transcript.Segments) != 1 {
+		t.Fatalf("segment count = %d, want 1", len(transcript.Segments))
+	}
+	if transcript.Segments[0].Text != "one" {
+		t.Fatalf("remaining segment = %#v, want one", transcript.Segments[0])
+	}
+}
+
+func TestTrimSelectorOrderDoesNotAffectTranscriptOrder(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "4,1,3",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var transcript schema.Transcript
+	readJSON(t, output, &transcript)
+	if len(transcript.Segments) != 3 {
+		t.Fatalf("segment count = %d, want 3", len(transcript.Segments))
+	}
+	got := []string{
+		transcript.Segments[0].Text,
+		transcript.Segments[1].Text,
+		transcript.Segments[2].Text,
+	}
+	want := []string{"one", "three", "four"}
+	if got[0] != want[0] || got[1] != want[1] || got[2] != want[2] {
+		t.Fatalf("segment text order = %v, want %v", got, want)
+	}
+}
+
+func TestTrimAllowEmptyBehavior(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--remove", "1-4",
+	)
+	if err == nil {
+		t.Fatal("expected empty-output error")
+	}
+	if !strings.Contains(err.Error(), "empty transcript") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+
+	err = executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--remove", "1-4",
+		"--allow-empty",
+	)
+	if err != nil {
+		t.Fatalf("trim with --allow-empty failed: %v", err)
+	}
+
+	var transcript schema.Transcript
+	readJSON(t, output, &transcript)
+	if len(transcript.Segments) != 0 {
+		t.Fatalf("segment count = %d, want 0", len(transcript.Segments))
+	}
+}
+
+func TestTrimRejectsNonSeriatimInputArtifacts(t *testing.T) {
+	dir := t.TempDir()
+	input := writeJSONFile(t, dir, "raw-whisperx.json", `{
+		"segments": [
+			{"start": 1, "end": 2, "text": "hello"}
+		]
+	}`)
+	output := filepath.Join(dir, "trimmed.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1",
+	)
+	if err == nil {
+		t.Fatal("expected invalid artifact error")
+	}
+	if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestTrimReportFileContainsAuditFields(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+	reportPath := filepath.Join(dir, "trim-report.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--report-file", reportPath,
+		"--remove", "4,2",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var rpt report.Report
+	readJSON(t, reportPath, &rpt)
+	if len(rpt.Events) == 0 {
+		t.Fatal("expected report events")
+	}
+	if !hasReportEvent(rpt, "trim", "trim", "trimmed 4 input segment(s) into 2 output segment(s) with mode=remove") {
+		t.Fatal("expected trim summary event")
+	}
+	if !hasReportEvent(rpt, "trim", "validate-output", "validated 2 output segment(s)") {
+		t.Fatal("expected validation event")
+	}
+
+	audit := extractTrimAuditEvent(t, rpt)
+	if audit.Operation != "trim" {
+		t.Fatalf("operation = %q, want trim", audit.Operation)
+	}
+	if audit.InputFile != input {
+		t.Fatalf("input_file = %q, want %q", audit.InputFile, input)
+	}
+	if audit.OutputFile != output {
+		t.Fatalf("output_file = %q, want %q", audit.OutputFile, output)
+	}
+	if audit.InputSchema != config.OutputSchemaFull || audit.OutputSchema != config.OutputSchemaFull {
+		t.Fatalf("schemas = %q -> %q, want full -> full", audit.InputSchema, audit.OutputSchema)
+	}
+	if audit.Mode != "remove" {
+		t.Fatalf("mode = %q, want remove", audit.Mode)
+	}
+	if audit.Selector != "4,2" {
+		t.Fatalf("selector = %q, want %q", audit.Selector, "4,2")
+	}
+	assertIntSliceEqual(t, audit.SelectedIDs, []int{2, 4})
+	if audit.AllowEmpty {
+		t.Fatal("allow_empty should be false")
+	}
+	if audit.InputSegmentCount != 4 || audit.RetainedSegmentCount != 2 || audit.RemovedSegmentCount != 2 {
+		t.Fatalf("counts = input:%d retained:%d removed:%d, want 4/2/2", audit.InputSegmentCount, audit.RetainedSegmentCount, audit.RemovedSegmentCount)
+	}
+	assertIntSliceEqual(t, audit.RemovedInputIDs, []int{2, 4})
+	if len(audit.OldToNewIDMapping) != 2 {
+		t.Fatalf("mapping length = %d, want 2", len(audit.OldToNewIDMapping))
+	}
+	if audit.OldToNewIDMapping[0].OldID != 1 || audit.OldToNewIDMapping[0].NewID != 1 {
+		t.Fatalf("mapping[0] = %#v, want old_id=1 new_id=1", audit.OldToNewIDMapping[0])
+	}
+	if audit.OldToNewIDMapping[1].OldID != 3 || audit.OldToNewIDMapping[1].NewID != 2 {
+		t.Fatalf("mapping[1] = %#v, want old_id=3 new_id=2", audit.OldToNewIDMapping[1])
+	}
+	if !audit.OverlapGroupsRecomputed {
+		t.Fatal("expected overlap_groups_recomputed=true for full schema trim")
+	}
+}
+
+func TestTrimReportOldToNewMappingIsDeterministicSorted(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+	reportPath := filepath.Join(dir, "trim-report.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--report-file", reportPath,
+		"--keep", "4,1,3",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	var rpt report.Report
+	readJSON(t, reportPath, &rpt)
+	audit := extractTrimAuditEvent(t, rpt)
+	if len(audit.OldToNewIDMapping) != 3 {
+		t.Fatalf("mapping length = %d, want 3", len(audit.OldToNewIDMapping))
+	}
+	for index, expectedOld := range []int{1, 3, 4} {
+		if audit.OldToNewIDMapping[index].OldID != expectedOld {
+			t.Fatalf("mapping[%d].old_id = %d, want %d", index, audit.OldToNewIDMapping[index].OldID, expectedOld)
+		}
+	}
+}
+
+func TestTrimNoReportFileWhenOmitted(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTrimFullFixture(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+	reportPath := filepath.Join(dir, "trim-report.json")
+
+	err := executeTrim(
+		"--input-file", input,
+		"--output-file", output,
+		"--keep", "1",
+	)
+	if err != nil {
+		t.Fatalf("trim failed: %v", err)
+	}
+
+	_, statErr := os.Stat(reportPath)
+	if !os.IsNotExist(statErr) {
+		t.Fatalf("expected no report file at %q, got err=%v", reportPath, statErr)
+	}
+}
+
+func executeTrim(args ...string) error {
+	cmd := NewRootCommand()
+	cmd.SetArgs(append([]string{"trim"}, args...))
+	return cmd.Execute()
+}
+
+func writeTrimFullFixture(t *testing.T, dir string, name string) string {
+	t.Helper()
+
+	first := 10
+	second := 20
+	third := 30
+	fourth := 40
+	value := schema.Transcript{
+		Metadata: schema.Metadata{
+			Application:           "seriatim",
+			Version:               "v-test",
+			InputReader:           "json-files",
+			InputFiles:            []string{"a.json"},
+			PreprocessingModules:  []string{"validate-raw"},
+			PostprocessingModules: []string{"assign-ids"},
+			OutputModules:         []string{"json"},
+		},
+		Segments: []schema.Segment{
+			{ID: 1, Source: "a.json", SourceSegmentIndex: &first, SourceRef: "a.json#10", Speaker: "A", Start: 1, End: 2, Text: "one", OverlapGroupID: 9},
+			{ID: 2, Source: "a.json", SourceSegmentIndex: &second, SourceRef: "a.json#20", Speaker: "B", Start: 2, End: 3, Text: "two", OverlapGroupID: 9},
+			{ID: 3, Source: "a.json", SourceSegmentIndex: &third, SourceRef: "a.json#30", Speaker: "C", Start: 4, End: 5, Text: "three", OverlapGroupID: 10},
+			{ID: 4, Source: "a.json", SourceSegmentIndex: &fourth, SourceRef: "a.json#40", Speaker: "D", Start: 5, End: 6, Text: "four", OverlapGroupID: 10},
+		},
+		OverlapGroups: []schema.OverlapGroup{
+			{ID: 9, Start: 1, End: 3, Segments: []string{"a.json#10", "a.json#20"}, Speakers: []string{"A", "B"}, Class: "unknown", Resolution: "unresolved"},
+		},
+	}
+
+	return writeTrimArtifactFile(t, dir, name, value)
+}
+
+func writeTrimMinimalFixture(t *testing.T, dir string, name string) string {
+	t.Helper()
+
+	value := schema.MinimalTranscript{
+		Metadata: schema.MinimalMetadata{
+			Application:  "seriatim",
+			Version:      "v-test",
+			OutputSchema: config.OutputSchemaMinimal,
+		},
+		Segments: []schema.MinimalSegment{
+			{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one"},
+			{ID: 2, Start: 2, End: 3, Speaker: "B", Text: "two"},
+		},
+	}
+
+	return writeTrimArtifactFile(t, dir, name, value)
+}
+
+func writeTrimIntermediateFixture(t *testing.T, dir string, name string) string {
+	t.Helper()
+
+	value := schema.IntermediateTranscript{
+		Metadata: schema.IntermediateMetadata{
+			Application:  "seriatim",
+			Version:      "v-test",
+			OutputSchema: config.OutputSchemaIntermediate,
+		},
+		Segments: []schema.IntermediateSegment{
+			{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one", Categories: []string{"word-run"}},
+			{ID: 2, Start: 2, End: 3, Speaker: "B", Text: "two", Categories: []string{"filler", "backchannel"}},
+		},
+	}
+
+	return writeTrimArtifactFile(t, dir, name, value)
+}
+
+func writeTrimMinimalWithIDsFixture(t *testing.T, dir string, name string, ids []int) string {
+	t.Helper()
+
+	if len(ids) < 2 {
+		t.Fatalf("need at least two IDs, got %d", len(ids))
+	}
+	value := schema.MinimalTranscript{
+		Metadata: schema.MinimalMetadata{
+			Application:  "seriatim",
+			Version:      "v-test",
+			OutputSchema: config.OutputSchemaMinimal,
+		},
+		Segments: []schema.MinimalSegment{
+			{ID: ids[0], Start: 1, End: 2, Speaker: "A", Text: "one"},
+			{ID: ids[1], Start: 2, End: 3, Speaker: "B", Text: "two"},
+		},
+	}
+
+	return writeTrimArtifactFile(t, dir, name, value)
+}
+
+func writeTrimFullOverlapFixture(t *testing.T, dir string, name string) string {
+	t.Helper()
+
+	first := 10
+	second := 20
+	third := 30
+	value := schema.Transcript{
+		Metadata: schema.Metadata{
+			Application:           "seriatim",
+			Version:               "v-test",
+			InputReader:           "json-files",
+			InputFiles:            []string{"a.json"},
+			PreprocessingModules:  []string{"validate-raw"},
+			PostprocessingModules: []string{"detect-overlaps", "assign-ids"},
+			OutputModules:         []string{"json"},
+		},
+		Segments: []schema.Segment{
+			{ID: 1, Source: "a.json", SourceSegmentIndex: &first, SourceRef: "a.json#10", Speaker: "A", Start: 1, End: 3, Text: "one", OverlapGroupID: 5},
+			{ID: 2, Source: "a.json", SourceSegmentIndex: &second, SourceRef: "a.json#20", Speaker: "B", Start: 2, End: 4, Text: "two", OverlapGroupID: 5},
+			{ID: 3, Source: "a.json", SourceSegmentIndex: &third, SourceRef: "a.json#30", Speaker: "C", Start: 6, End: 7, Text: "three", OverlapGroupID: 6},
+		},
+		OverlapGroups: []schema.OverlapGroup{
+			{ID: 99, Start: 0, End: 100, Segments: []string{"stale"}, Speakers: []string{"stale"}, Class: "unknown", Resolution: "unresolved"},
+		},
+	}
+
+	return writeTrimArtifactFile(t, dir, name, value)
+}
+
+func writeTrimArtifactFile(t *testing.T, dir string, name string, value any) string {
+	t.Helper()
+
+	data, err := json.MarshalIndent(value, "", "  ")
+	if err != nil {
+		t.Fatalf("marshal fixture: %v", err)
+	}
+	path := filepath.Join(dir, name)
+	if err := os.WriteFile(path, append(data, '\n'), 0o600); err != nil {
+		t.Fatalf("write fixture: %v", err)
+	}
+	return path
+}
+
+func assertSequentialIDs(t *testing.T, ids []int) {
+	t.Helper()
+	for index, id := range ids {
+		want := index + 1
+		if id != want {
+			t.Fatalf("id at index %d = %d, want %d", index, id, want)
+		}
+	}
+}
+
+func extractTrimAuditEvent(t *testing.T, rpt report.Report) trimAuditReport {
+	t.Helper()
+
+	for _, event := range rpt.Events {
+		if event.Stage == "trim" && event.Module == "trim-audit" {
+			var audit trimAuditReport
+			if err := json.Unmarshal([]byte(event.Message), &audit); err != nil {
+				t.Fatalf("decode trim audit event: %v", err)
+			}
+			return audit
+		}
+	}
+	t.Fatal("missing trim-audit event")
+	return trimAuditReport{}
+}
+
+func assertIntSliceEqual(t *testing.T, got []int, want []int) {
+	t.Helper()
+	if len(got) != len(want) {
+		t.Fatalf("slice length = %d, want %d", len(got), len(want))
+	}
+	for index := range got {
+		if got[index] != want[index] {
+			t.Fatalf("slice[%d] = %d, want %d (full got=%v, want=%v)", index, got[index], want[index], got, want)
+		}
+	}
+}
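
Reviewer note on the report tests above: `TestTrimReportOldToNewMappingIsDeterministicSorted` expects the audit mapping ordered by old ID regardless of selector order, while `trim.Result.OldToNewID` is a Go map, whose iteration order is unspecified. The report code is not part of this excerpt, so the following is only a minimal sketch of one way to get a stable order. The `idMapping` type and `sortedMapping` helper are hypothetical names, chosen to mirror the `OldID`/`NewID` fields the tests read.

```go
package main

import (
	"fmt"
	"sort"
)

// idMapping is a hypothetical report entry (assumed, not taken from the diff).
type idMapping struct {
	OldID int
	NewID int
}

// sortedMapping flattens an old-to-new ID map into a slice ordered by old ID,
// so serialized output does not depend on map iteration order.
func sortedMapping(oldToNew map[int]int) []idMapping {
	out := make([]idMapping, 0, len(oldToNew))
	for oldID, newID := range oldToNew {
		out = append(out, idMapping{OldID: oldID, NewID: newID})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].OldID < out[j].OldID })
	return out
}

func main() {
	// Mirrors the --keep "4,1,3" case: old IDs 1, 3, 4 become 1, 2, 3.
	fmt.Println(sortedMapping(map[int]int{4: 3, 1: 1, 3: 2}))
}
```

Sorting at serialization time keeps `Result` cheap to build and puts the ordering guarantee in one place.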
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -47,6 +47,17 @@ type MergeOptions struct {
 	CoalesceGap           string
 }

+// TrimOptions captures raw CLI option values before validation.
+type TrimOptions struct {
+	InputFile    string
+	OutputFile   string
+	ReportFile   string
+	Keep         string
+	Remove       string
+	OutputSchema string
+	AllowEmpty   bool
+}
+
 // Config is the validated runtime configuration for a merge invocation.
 type Config struct {
 	InputFiles             []string
@@ -66,6 +77,17 @@ type Config struct {
 	FillerMaxDuration      float64
 }

+// TrimConfig is the validated runtime configuration for a trim invocation.
+type TrimConfig struct {
+	InputFile    string
+	OutputFile   string
+	ReportFile   string
+	Mode         string
+	Selector     string
+	OutputSchema string
+	AllowEmpty   bool
+}
+
 // NewMergeConfig validates raw merge options and returns normalized config.
 func NewMergeConfig(opts MergeOptions) (Config, error) {
 	cfg := Config{
@@ -168,6 +190,63 @@ func NewMergeConfig(opts MergeOptions) (Config, error) {
 	return cfg, nil
 }

+// NewTrimConfig validates raw trim options and returns normalized config.
+func NewTrimConfig(opts TrimOptions) (TrimConfig, error) {
+	inputFile := filepath.Clean(strings.TrimSpace(opts.InputFile))
+	if strings.TrimSpace(opts.InputFile) == "" {
+		return TrimConfig{}, errors.New("--input-file is required")
+	}
+	if err := requireFile(inputFile, "--input-file"); err != nil {
+		return TrimConfig{}, err
+	}
+
+	outputFile, err := normalizeOutputPath(opts.OutputFile, "--output-file")
+	if err != nil {
+		return TrimConfig{}, err
+	}
+
+	reportFile := ""
+	if strings.TrimSpace(opts.ReportFile) != "" {
+		reportFile, err = normalizeOutputPath(opts.ReportFile, "--report-file")
+		if err != nil {
+			return TrimConfig{}, err
+		}
+	}
+
+	keep := strings.TrimSpace(opts.Keep)
+	remove := strings.TrimSpace(opts.Remove)
+	if keep == "" && remove == "" {
+		return TrimConfig{}, errors.New("exactly one of --keep or --remove is required")
+	}
+	if keep != "" && remove != "" {
+		return TrimConfig{}, errors.New("--keep and --remove are mutually exclusive")
+	}
+
+	mode := "keep"
+	selector := keep
+	if remove != "" {
+		mode = "remove"
+		selector = remove
+	}
+
+	outputSchema := strings.TrimSpace(opts.OutputSchema)
+	if outputSchema != "" {
+		if err := validateOutputSchema(outputSchema); err != nil {
+			return TrimConfig{}, err
+		}
+	}
+
+	return TrimConfig{
+		InputFile:    inputFile,
+		OutputFile:   outputFile,
+		ReportFile:   reportFile,
+		Mode:         mode,
+		Selector:     selector,
+		OutputSchema: outputSchema,
+		AllowEmpty:   opts.AllowEmpty,
+	}, nil
+}
+
 func parseModuleList(value string) ([]string, error) {
 	value = strings.TrimSpace(value)
 	if value == "" {
--- a/internal/config/config_test.go
+++ b/internal/config/config_test.go
@@ -612,6 +612,105 @@ func TestCoalesceGapRejectsInvalidOverride(t *testing.T) {
 	}
 }

+func TestNewTrimConfigRequiresInputAndOutput(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTempFile(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	_, err := NewTrimConfig(TrimOptions{
+		OutputFile: output,
+		Keep:       "1",
+	})
+	if err == nil || !strings.Contains(err.Error(), "--input-file is required") {
+		t.Fatalf("expected input-file required error, got %v", err)
+	}
+
+	_, err = NewTrimConfig(TrimOptions{
+		InputFile: input,
+		Keep:      "1",
+	})
+	if err == nil || !strings.Contains(err.Error(), "--output-file is required") {
+		t.Fatalf("expected output-file required error, got %v", err)
+	}
+}
+
+func TestNewTrimConfigRequiresExactlyOneSelectorFlag(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTempFile(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	_, err := NewTrimConfig(TrimOptions{
+		InputFile:  input,
+		OutputFile: output,
+	})
+	if err == nil || !strings.Contains(err.Error(), "exactly one of --keep or --remove is required") {
+		t.Fatalf("expected missing selector error, got %v", err)
+	}
+
+	_, err = NewTrimConfig(TrimOptions{
+		InputFile:  input,
+		OutputFile: output,
+		Keep:       "1",
+		Remove:     "2",
+	})
+	if err == nil || !strings.Contains(err.Error(), "mutually exclusive") {
+		t.Fatalf("expected mutually exclusive selector error, got %v", err)
+	}
+}
+
+func TestNewTrimConfigAcceptsOutputSchemaOverride(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTempFile(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+	reportPath := filepath.Join(dir, "report.json")
+
+	cfg, err := NewTrimConfig(TrimOptions{
+		InputFile:    input,
+		OutputFile:   output,
+		ReportFile:   reportPath,
+		Remove:       "3-5",
+		OutputSchema: OutputSchemaMinimal,
+		AllowEmpty:   true,
+	})
+	if err != nil {
+		t.Fatalf("config failed: %v", err)
+	}
+	if cfg.Mode != "remove" {
+		t.Fatalf("mode = %q, want remove", cfg.Mode)
+	}
+	if cfg.Selector != "3-5" {
+		t.Fatalf("selector = %q, want 3-5", cfg.Selector)
+	}
+	if cfg.OutputSchema != OutputSchemaMinimal {
+		t.Fatalf("output schema = %q, want %q", cfg.OutputSchema, OutputSchemaMinimal)
+	}
+	if !cfg.AllowEmpty {
+		t.Fatal("allow empty should be true")
+	}
+	if cfg.ReportFile != reportPath {
+		t.Fatalf("report file = %q, want %q", cfg.ReportFile, reportPath)
+	}
+}
+
+func TestNewTrimConfigRejectsInvalidOutputSchemaOverride(t *testing.T) {
+	dir := t.TempDir()
+	input := writeTempFile(t, dir, "input.json")
+	output := filepath.Join(dir, "trimmed.json")
+
+	_, err := NewTrimConfig(TrimOptions{
+		InputFile:    input,
+		OutputFile:   output,
+		Keep:         "1",
+		OutputSchema: "compact",
+	})
+	if err == nil {
+		t.Fatal("expected output schema validation error")
+	}
+	if !strings.Contains(err.Error(), "--output-schema must be one of") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
 func assertPositiveFloatEnvValidation(t *testing.T, envName string) {
 	t.Helper()

--- a/internal/trim/apply.go
+++ b/internal/trim/apply.go
@@ -0,0 +1,367 @@
+package trim
+
+import (
+	"fmt"
+
+	"gitea.maximumdirect.net/eric/seriatim/internal/model"
+	"gitea.maximumdirect.net/eric/seriatim/internal/overlap"
+	"gitea.maximumdirect.net/eric/seriatim/schema"
+)
+
+// Mode controls how selector IDs are applied.
+type Mode string
+
+const (
+	ModeKeep   Mode = "keep"
+	ModeRemove Mode = "remove"
+)
+
+// Options configures transcript trimming.
+type Options struct {
+	Mode       Mode
+	Selector   Selector
+	AllowEmpty bool
+}
+
+// Result contains trimming output and ID mapping metadata.
+type Result struct {
+	Transcript schema.Transcript
+	OldToNewID map[int]int
+	RemovedIDs []int
+}
+
+// IntermediateResult contains trimming output for intermediate schema artifacts.
+type IntermediateResult struct {
+	Transcript schema.IntermediateTranscript
+	OldToNewID map[int]int
+	RemovedIDs []int
+}
+
+// MinimalResult contains trimming output for minimal schema artifacts.
+type MinimalResult struct {
+	Transcript schema.MinimalTranscript
+	OldToNewID map[int]int
+	RemovedIDs []int
+}
+
+// Apply trims a full seriatim output transcript by segment ID.
+func Apply(input schema.Transcript, opts Options) (Result, error) {
+	if err := validateMode(opts.Mode); err != nil {
+		return Result{}, err
+	}
+
+	selected := opts.Selector.IDs()
+	if len(selected) == 0 {
+		return Result{}, fmt.Errorf("selector cannot be empty")
+	}
+
+	inputIDs := make([]int, len(input.Segments))
+	for index, segment := range input.Segments {
+		inputIDs[index] = segment.ID
+	}
+
+	idIndex, err := validateInputIDs(inputIDs)
+	if err != nil {
+		return Result{}, err
+	}
+
+	if err := validateSelectedIDsExist(selected, idIndex); err != nil {
+		return Result{}, err
+	}
+
+	kept := make([]schema.Segment, 0, len(input.Segments))
+	removed := make([]int, 0, len(input.Segments))
+	oldToNew := make(map[int]int, len(input.Segments))
+	for _, segment := range input.Segments {
+		keep := opts.Mode == ModeKeep && opts.Selector.Contains(segment.ID)
+		if opts.Mode == ModeRemove {
+			keep = !opts.Selector.Contains(segment.ID)
+		}
+
+		if !keep {
+			removed = append(removed, segment.ID)
+			continue
+		}
+
+		rewritten := copySegment(segment)
+		rewritten.ID = len(kept) + 1
+		rewritten.OverlapGroupID = 0
+		kept = append(kept, rewritten)
+		oldToNew[segment.ID] = rewritten.ID
+	}
+
+	if len(kept) == 0 && !opts.AllowEmpty {
+		return Result{}, fmt.Errorf("trim operation produced an empty transcript; set AllowEmpty to true to permit this")
+	}
+
+	kept, groups := recomputeOverlapGroups(kept)
+	if groups == nil {
+		groups = make([]schema.OverlapGroup, 0)
+	}
+
+	out := copyTranscript(input)
+	out.Segments = kept
+	out.OverlapGroups = groups
+	return Result{
+		Transcript: out,
+		OldToNewID: oldToNew,
+		RemovedIDs: removed,
+	}, nil
+}
+
+// ApplyIntermediate trims an intermediate seriatim output transcript by
+// segment ID.
+func ApplyIntermediate(input schema.IntermediateTranscript, opts Options) (IntermediateResult, error) {
+	if err := validateMode(opts.Mode); err != nil {
+		return IntermediateResult{}, err
+	}
+
+	selected := opts.Selector.IDs()
+	if len(selected) == 0 {
+		return IntermediateResult{}, fmt.Errorf("selector cannot be empty")
+	}
+
+	inputIDs := make([]int, len(input.Segments))
+	for index, segment := range input.Segments {
+		inputIDs[index] = segment.ID
+	}
+	idIndex, err := validateInputIDs(inputIDs)
+	if err != nil {
+		return IntermediateResult{}, err
+	}
+	if err := validateSelectedIDsExist(selected, idIndex); err != nil {
+		return IntermediateResult{}, err
+	}
+
+	kept := make([]schema.IntermediateSegment, 0, len(input.Segments))
+	removed := make([]int, 0, len(input.Segments))
+	oldToNew := make(map[int]int, len(input.Segments))
+	for _, segment := range input.Segments {
+		keep := opts.Mode == ModeKeep && opts.Selector.Contains(segment.ID)
+		if opts.Mode == ModeRemove {
+			keep = !opts.Selector.Contains(segment.ID)
+		}
+		if !keep {
+			removed = append(removed, segment.ID)
+			continue
+		}
+
+		rewritten := schema.IntermediateSegment{
+			ID:         len(kept) + 1,
+			Start:      segment.Start,
+			End:        segment.End,
+			Speaker:    segment.Speaker,
+			Text:       segment.Text,
+			Categories: append([]string(nil), segment.Categories...),
+		}
+		kept = append(kept, rewritten)
+		oldToNew[segment.ID] = rewritten.ID
+	}
+
+	if len(kept) == 0 && !opts.AllowEmpty {
+		return IntermediateResult{}, fmt.Errorf("trim operation produced an empty transcript; set AllowEmpty to true to permit this")
+	}
+
+	return IntermediateResult{
+		Transcript: schema.IntermediateTranscript{
+			Metadata: schema.IntermediateMetadata{
+				Application:  input.Metadata.Application,
+				Version:      input.Metadata.Version,
+				OutputSchema: input.Metadata.OutputSchema,
+			},
+			Segments: kept,
+		},
+		OldToNewID: oldToNew,
+		RemovedIDs: removed,
+	}, nil
+}
+
+// ApplyMinimal trims a minimal seriatim output transcript by segment ID.
+func ApplyMinimal(input schema.MinimalTranscript, opts Options) (MinimalResult, error) {
+	if err := validateMode(opts.Mode); err != nil {
+		return MinimalResult{}, err
+	}
+
+	selected := opts.Selector.IDs()
+	if len(selected) == 0 {
+		return MinimalResult{}, fmt.Errorf("selector cannot be empty")
+	}
+
+	inputIDs := make([]int, len(input.Segments))
+	for index, segment := range input.Segments {
+		inputIDs[index] = segment.ID
+	}
+	idIndex, err := validateInputIDs(inputIDs)
+	if err != nil {
+		return MinimalResult{}, err
+	}
+	if err := validateSelectedIDsExist(selected, idIndex); err != nil {
+		return MinimalResult{}, err
+	}
+
+	kept := make([]schema.MinimalSegment, 0, len(input.Segments))
+	removed := make([]int, 0, len(input.Segments))
+	oldToNew := make(map[int]int, len(input.Segments))
+	for _, segment := range input.Segments {
+		keep := opts.Mode == ModeKeep && opts.Selector.Contains(segment.ID)
+		if opts.Mode == ModeRemove {
+			keep = !opts.Selector.Contains(segment.ID)
+		}
+		if !keep {
+			removed = append(removed, segment.ID)
+			continue
+		}
+
+		rewritten := schema.MinimalSegment{
+			ID:      len(kept) + 1,
+			Start:   segment.Start,
+			End:     segment.End,
+			Speaker: segment.Speaker,
+			Text:    segment.Text,
+		}
+		kept = append(kept, rewritten)
+		oldToNew[segment.ID] = rewritten.ID
+	}
+
+	if len(kept) == 0 && !opts.AllowEmpty {
+		return MinimalResult{}, fmt.Errorf("trim operation produced an empty transcript; set AllowEmpty to true to permit this")
+	}
+
+	return MinimalResult{
+		Transcript: schema.MinimalTranscript{
+			Metadata: schema.MinimalMetadata{
+				Application:  input.Metadata.Application,
+				Version:      input.Metadata.Version,
+				OutputSchema: input.Metadata.OutputSchema,
+			},
+			Segments: kept,
+		},
+		OldToNewID: oldToNew,
+		RemovedIDs: removed,
+	}, nil
+}
+
+func validateMode(mode Mode) error {
+	switch mode {
+	case ModeKeep, ModeRemove:
+		return nil
+	default:
+		return fmt.Errorf("invalid trim mode %q", mode)
+	}
+}
+
+func validateInputIDs(ids []int) (map[int]int, error) {
+	seen := make(map[int]int, len(ids))
+	for index, id := range ids {
+		if id <= 0 {
+			return nil, fmt.Errorf("input transcript has non-positive segment ID %d at index %d", id, index)
+		}
+		if firstIndex, exists := seen[id]; exists {
+			return nil, fmt.Errorf("input transcript has duplicate segment ID %d at indexes %d and %d", id, firstIndex, index)
+		}
+		seen[id] = index
+	}
+
+	for id := 1; id <= len(ids); id++ {
+		if _, exists := seen[id]; !exists {
+			return nil, fmt.Errorf("input transcript segment IDs must be sequential 1..%d; missing ID %d", len(ids), id)
+		}
+	}
+	return seen, nil
+}
+
+func validateSelectedIDsExist(selected []int, idIndex map[int]int) error {
+	for _, id := range selected {
+		if _, exists := idIndex[id]; !exists {
+			return fmt.Errorf("selected segment ID %d does not exist in input transcript", id)
+		}
+	}
+	return nil
+}
+
+func recomputeOverlapGroups(segments []schema.Segment) ([]schema.Segment, []schema.OverlapGroup) {
+	if len(segments) == 0 {
+		return segments, make([]schema.OverlapGroup, 0)
+	}
+
+	modelSegments := make([]model.Segment, len(segments))
+	for index, segment := range segments {
+		modelSegments[index] = model.Segment{
+			ID:                 segment.ID,
+			Source:             segment.Source,
+			SourceSegmentIndex: copyIntPtr(segment.SourceSegmentIndex),
+			SourceRef:          segment.SourceRef,
+			DerivedFrom:        append([]string(nil), segment.DerivedFrom...),
+			Speaker:            segment.Speaker,
+			Start:              segment.Start,
+			End:                segment.End,
+			Text:               segment.Text,
+			Categories:         append([]string(nil), segment.Categories...),
+			OverlapGroupID:     segment.OverlapGroupID,
+		}
+	}
+
+	detected := overlap.Detect(model.MergedTranscript{
+		Segments: modelSegments,
+	})
+	rewrittenSegments := make([]schema.Segment, len(segments))
+	for index, segment := range segments {
+		rewritten := copySegment(segment)
+		rewritten.OverlapGroupID = detected.Segments[index].OverlapGroupID
+		rewrittenSegments[index] = rewritten
+	}
+
+	groups := make([]schema.OverlapGroup, len(detected.OverlapGroups))
+	for index, group := range detected.OverlapGroups {
+		groups[index] = schema.OverlapGroup{
+			ID:         group.ID,
+			Start:      group.Start,
+			End:        group.End,
+			Segments:   append([]string(nil), group.Segments...),
+			Speakers:   append([]string(nil), group.Speakers...),
+			Class:      group.Class,
+			Resolution: group.Resolution,
+		}
+	}
+	return rewrittenSegments, groups
+}
+
+func copyTranscript(input schema.Transcript) schema.Transcript {
+	return schema.Transcript{
+		Metadata: schema.Metadata{
+			Application:           input.Metadata.Application,
+			Version:               input.Metadata.Version,
+			InputReader:           input.Metadata.InputReader,
+			InputFiles:            append([]string(nil), input.Metadata.InputFiles...),
+			PreprocessingModules:  append([]string(nil), input.Metadata.PreprocessingModules...),
+			PostprocessingModules: append([]string(nil), input.Metadata.PostprocessingModules...),
+			OutputModules:         append([]string(nil), input.Metadata.OutputModules...),
+		},
+		Segments:      append([]schema.Segment(nil), input.Segments...),
+		OverlapGroups: append([]schema.OverlapGroup(nil), input.OverlapGroups...),
+	}
+}
+
+func copySegment(input schema.Segment) schema.Segment {
+	return schema.Segment{
+		ID:                 input.ID,
+		Source:             input.Source,
+		SourceSegmentIndex: copyIntPtr(input.SourceSegmentIndex),
+		SourceRef:          input.SourceRef,
+		DerivedFrom:        append([]string(nil), input.DerivedFrom...),
+		Speaker:            input.Speaker,
+		Start:              input.Start,
+		End:                input.End,
+		Text:               input.Text,
+		Categories:         append([]string(nil), input.Categories...),
+		OverlapGroupID:     input.OverlapGroupID,
+	}
+}
+
+func copyIntPtr(value *int) *int {
+	if value == nil {
+		return nil
+	}
+	copied := *value
+	return &copied
+}
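
Reviewer note on `apply.go`: `Apply`, `ApplyIntermediate`, and `ApplyMinimal` share the same selection loop. It walks segments in input order, decides keep or remove from the mode and selector, renumbers retained segments from 1, and records the old-to-new mapping and removed IDs. The sketch below isolates that loop on a simplified type. The `segment` struct, the `trimByID` helper, and the plain `map[int]bool` selector are illustrative assumptions, not names from the package.

```go
package main

import "fmt"

// segment is a hypothetical stand-in for the schema segment types.
type segment struct {
	ID   int
	Text string
}

// trimByID retains or drops the selected IDs, preserves input order, and
// renumbers retained segments consecutively from 1.
func trimByID(input []segment, selected map[int]bool, keepMode bool) ([]segment, map[int]int, []int) {
	kept := make([]segment, 0, len(input))
	removed := make([]int, 0, len(input))
	oldToNew := make(map[int]int, len(input))
	for _, seg := range input {
		// Keep mode retains selected segments; remove mode retains the rest.
		if selected[seg.ID] != keepMode {
			removed = append(removed, seg.ID)
			continue
		}
		newID := len(kept) + 1
		oldToNew[seg.ID] = newID
		kept = append(kept, segment{ID: newID, Text: seg.Text})
	}
	return kept, oldToNew, removed
}

func main() {
	input := []segment{{1, "alpha"}, {2, "beta"}, {3, "gamma"}, {4, "delta"}}
	// Remove mode with selector {2, 4}, as in TestApplyRemoveModeRenumbersFromOne.
	kept, mapping, removed := trimByID(input, map[int]bool{2: true, 4: true}, false)
	fmt.Println(kept, mapping, removed)
}
```

Because the loop iterates over the input rather than the selector, the order of IDs in the selector cannot reorder the output, which is the property `TestApplySelectorOrderDoesNotChangeTranscriptOrder` checks.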
--- a/internal/trim/apply_test.go
+++ b/internal/trim/apply_test.go
@@ -0,0 +1,668 @@
+package trim
+
+import (
+	"strings"
+	"testing"
+
+	"gitea.maximumdirect.net/eric/seriatim/schema"
+)
+
+func TestApplyKeepModeRenumbersFromOne(t *testing.T) {
+	input := fullTranscriptFixture()
+	selector := mustParseSelector(t, "2,4")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	if len(result.Transcript.Segments) != 2 {
+		t.Fatalf("segment count = %d, want 2", len(result.Transcript.Segments))
+	}
+	assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2})
+	assertSegmentTexts(t, result.Transcript.Segments, []string{"beta", "delta"})
+	assertIntMap(t, result.OldToNewID, map[int]int{2: 1, 4: 2})
+	assertIntSlice(t, result.RemovedIDs, []int{1, 3})
+}
+
+func TestApplyRemoveModeRenumbersFromOne(t *testing.T) {
+	input := fullTranscriptFixture()
+	selector := mustParseSelector(t, "2,4")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeRemove,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2})
+	assertSegmentTexts(t, result.Transcript.Segments, []string{"alpha", "gamma"})
+	assertIntMap(t, result.OldToNewID, map[int]int{1: 1, 3: 2})
+	assertIntSlice(t, result.RemovedIDs, []int{2, 4})
+}
+
+func TestApplySelectorOrderDoesNotChangeTranscriptOrder(t *testing.T) {
+	input := fullTranscriptFixture()
+	selector := mustParseSelector(t, "4,1,3")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2, 3})
+	assertSegmentTexts(t, result.Transcript.Segments, []string{"alpha", "gamma", "delta"})
+}
+
+func TestApplyFailsWhenSelectedIDDoesNotExist(t *testing.T) {
+	input := fullTranscriptFixture()
+	selector := mustParseSelector(t, "2,99")
+
+	_, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err == nil {
+		t.Fatal("expected missing selected ID error")
+	}
+	if !strings.Contains(err.Error(), "does not exist") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestApplyFailsOnDuplicateInputIDs(t *testing.T) {
+	input := fullTranscriptFixture()
+	input.Segments[2].ID = 2
+	selector := mustParseSelector(t, "2")
+
+	_, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err == nil {
+		t.Fatal("expected duplicate input ID error")
+	}
+	if !strings.Contains(err.Error(), "duplicate segment ID") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestApplyFailsOnMissingOrNonSequentialInputIDs(t *testing.T) {
+	input := fullTranscriptFixture()
+	input.Segments[1].ID = 5
+	selector := mustParseSelector(t, "1")
+
+	_, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err == nil {
+		t.Fatal("expected non-sequential input ID error")
+	}
+	if !strings.Contains(err.Error(), "must be sequential") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestApplyFailsOnNonPositiveInputIDs(t *testing.T) {
+	input := fullTranscriptFixture()
+	input.Segments[0].ID = 0
+	selector := mustParseSelector(t, "1")
+
+	_, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err == nil {
+		t.Fatal("expected non-positive input ID error")
+	}
+	if !strings.Contains(err.Error(), "non-positive") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestApplyEmptyOutputFailsUnlessAllowEmpty(t *testing.T) {
+	input := fullTranscriptFixture()
+	selector := mustParseSelector(t, "1-4")
+
+	_, err := Apply(input, Options{
+		Mode:     ModeRemove,
+		Selector: selector,
+	})
+	if err == nil {
+		t.Fatal("expected empty-output error")
+	}
+	if !strings.Contains(err.Error(), "empty transcript") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+
+	allowed, err := Apply(input, Options{
+		Mode:       ModeRemove,
+		Selector:   selector,
+		AllowEmpty: true,
+	})
+	if err != nil {
+		t.Fatalf("apply with AllowEmpty failed: %v", err)
+	}
+	if len(allowed.Transcript.Segments) != 0 {
+		t.Fatalf("segment count = %d, want 0", len(allowed.Transcript.Segments))
+	}
+	assertIntMap(t, allowed.OldToNewID, map[int]int{})
+	assertIntSlice(t, allowed.RemovedIDs, []int{1, 2, 3, 4})
+}
+
+func TestApplyPreservesRetainedSegmentFieldsAndClearsOverlapIDs(t *testing.T) {
+	input := fullTranscriptFixture()
+	selector := mustParseSelector(t, "2")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	if len(result.Transcript.Segments) != 1 {
+		t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
+	}
+	segment := result.Transcript.Segments[0]
+	if segment.ID != 1 {
+		t.Fatalf("segment ID = %d, want 1", segment.ID)
+	}
+	if segment.Source != "b.json" {
+		t.Fatalf("source = %q, want %q", segment.Source, "b.json")
+	}
+	if segment.SourceSegmentIndex == nil || *segment.SourceSegmentIndex != 20 {
+		t.Fatalf("source_segment_index = %v, want 20", segment.SourceSegmentIndex)
+	}
+	if segment.SourceRef != "b.json#20" {
+		t.Fatalf("source_ref = %q, want %q", segment.SourceRef, "b.json#20")
+	}
+	if !equalStringSlices(segment.DerivedFrom, []string{"b.json#19", "b.json#20"}) {
+		t.Fatalf("derived_from = %v, want %v", segment.DerivedFrom, []string{"b.json#19", "b.json#20"})
+	}
+	if !equalStringSlices(segment.Categories, []string{"filler", "backchannel"}) {
+		t.Fatalf("categories = %v, want %v", segment.Categories, []string{"filler", "backchannel"})
+	}
+	if segment.Speaker != "Bob" {
+		t.Fatalf("speaker = %q, want Bob", segment.Speaker)
+	}
+	if segment.Start != 2 || segment.End != 3 {
+		t.Fatalf("times = %.3f-%.3f, want 2.000-3.000", segment.Start, segment.End)
+	}
+	if segment.Text != "beta" {
+		t.Fatalf("text = %q, want beta", segment.Text)
+	}
+	if segment.OverlapGroupID != 0 {
+		t.Fatalf("overlap_group_id = %d, want 0", segment.OverlapGroupID)
+	}
+	if len(result.Transcript.OverlapGroups) != 0 {
+		t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
+	}
+}
+
+func TestApplyFullSchemaRemovesStaleOverlapGroups(t *testing.T) {
+	input := overlapTranscriptFixture()
+	selector := mustParseSelector(t, "1,3")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	if len(result.Transcript.OverlapGroups) != 0 {
+		t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
+	}
+	for index, segment := range result.Transcript.Segments {
+		if segment.OverlapGroupID != 0 {
+			t.Fatalf("segment %d overlap_group_id = %d, want 0", index, segment.OverlapGroupID)
+		}
+	}
+}
+
+func TestApplyFullSchemaRecomputesOverlapGroup(t *testing.T) {
+	input := overlapTranscriptFixture()
+	selector := mustParseSelector(t, "1,2")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	assertSegmentIDs(t, result.Transcript.Segments, []int{1, 2})
+	assertIntSlice(t, []int{
+		result.Transcript.Segments[0].OverlapGroupID,
+		result.Transcript.Segments[1].OverlapGroupID,
+	}, []int{1, 1})
+	if len(result.Transcript.OverlapGroups) != 1 {
+		t.Fatalf("overlap_groups count = %d, want 1", len(result.Transcript.OverlapGroups))
+	}
+	group := result.Transcript.OverlapGroups[0]
+	if group.ID != 1 {
+		t.Fatalf("group ID = %d, want 1", group.ID)
+	}
+	if group.Start != 1 || group.End != 4 {
+		t.Fatalf("group times = %.3f-%.3f, want 1.000-4.000", group.Start, group.End)
+	}
+	if !equalStringSlices(group.Segments, []string{"a.json#10", "b.json#20"}) {
+		t.Fatalf("group segments = %v, want %v", group.Segments, []string{"a.json#10", "b.json#20"})
+	}
+	if !equalStringSlices(group.Speakers, []string{"Alice", "Bob"}) {
+		t.Fatalf("group speakers = %v, want %v", group.Speakers, []string{"Alice", "Bob"})
+	}
+}
+
+func TestApplyFullSchemaDropsGroupWhenFewerThanTwoSpeakersRemain(t *testing.T) {
+	input := overlapTranscriptFixture()
+	selector := mustParseSelector(t, "1")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	if len(result.Transcript.OverlapGroups) != 0 {
+		t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
+	}
+	if len(result.Transcript.Segments) != 1 {
+		t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
+	}
+	if result.Transcript.Segments[0].OverlapGroupID != 0 {
+		t.Fatalf("segment overlap_group_id = %d, want 0", result.Transcript.Segments[0].OverlapGroupID)
+	}
+}
+
+func TestApplyFullSchemaHandlesTransitiveOverlaps(t *testing.T) {
+	input := transitiveOverlapFixture()
+	selector := mustParseSelector(t, "1-3")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	if len(result.Transcript.OverlapGroups) != 1 {
+		t.Fatalf("overlap_groups count = %d, want 1", len(result.Transcript.OverlapGroups))
+	}
+	assertIntSlice(t, []int{
+		result.Transcript.Segments[0].OverlapGroupID,
+		result.Transcript.Segments[1].OverlapGroupID,
+		result.Transcript.Segments[2].OverlapGroupID,
+	}, []int{1, 1, 1})
+	group := result.Transcript.OverlapGroups[0]
+	if group.Start != 10 || group.End != 15 {
+		t.Fatalf("group times = %.3f-%.3f, want 10.000-15.000", group.Start, group.End)
+	}
+}
+
+func TestApplyFullSchemaBoundaryTouchingNotGrouped(t *testing.T) {
+	input := boundaryFixture()
+	selector := mustParseSelector(t, "1-2")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	if len(result.Transcript.OverlapGroups) != 0 {
+		t.Fatalf("overlap_groups count = %d, want 0", len(result.Transcript.OverlapGroups))
+	}
+	assertIntSlice(t, []int{
+		result.Transcript.Segments[0].OverlapGroupID,
+		result.Transcript.Segments[1].OverlapGroupID,
+	}, []int{0, 0})
+}
+
+func TestApplyIntermediateDoesNotIncludeOverlapGroups(t *testing.T) {
+	input := schema.IntermediateTranscript{
+		Metadata: schema.IntermediateMetadata{
+			Application:  "seriatim",
+			Version:      "v-test",
+			OutputSchema: "seriatim-intermediate",
+		},
+		Segments: []schema.IntermediateSegment{
+			{ID: 1, Start: 1, End: 3, Speaker: "Alice", Text: "alpha", Categories: []string{"word-run"}},
+			{ID: 2, Start: 2, End: 4, Speaker: "Bob", Text: "beta", Categories: []string{"filler"}},
+		},
+	}
+	selector := mustParseSelector(t, "1")
+	result, err := ApplyIntermediate(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply intermediate failed: %v", err)
+	}
+	if len(result.Transcript.Segments) != 1 {
+		t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
+	}
+	if result.Transcript.Segments[0].ID != 1 {
+		t.Fatalf("segment id = %d, want 1", result.Transcript.Segments[0].ID)
+	}
+	if err := schema.ValidateIntermediateTranscript(result.Transcript); err != nil {
+		t.Fatalf("intermediate output should remain valid: %v", err)
+	}
+}
+
+func TestApplyMinimalDoesNotIncludeOverlapGroups(t *testing.T) {
+	input := schema.MinimalTranscript{
+		Metadata: schema.MinimalMetadata{
+			Application:  "seriatim",
+			Version:      "v-test",
+			OutputSchema: "seriatim-minimal",
+		},
+		Segments: []schema.MinimalSegment{
+			{ID: 1, Start: 1, End: 3, Speaker: "Alice", Text: "alpha"},
+			{ID: 2, Start: 2, End: 4, Speaker: "Bob", Text: "beta"},
+		},
+	}
+	selector := mustParseSelector(t, "2")
+	result, err := ApplyMinimal(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply minimal failed: %v", err)
+	}
+	if len(result.Transcript.Segments) != 1 {
+		t.Fatalf("segment count = %d, want 1", len(result.Transcript.Segments))
+	}
+	if result.Transcript.Segments[0].ID != 1 {
+		t.Fatalf("segment id = %d, want 1", result.Transcript.Segments[0].ID)
+	}
+	if err := schema.ValidateMinimalTranscript(result.Transcript); err != nil {
+		t.Fatalf("minimal output should remain valid: %v", err)
+	}
+}
+
+func TestApplyOutputInvariantsValidAfterRenumberAndOverlapRecompute(t *testing.T) {
+	input := overlapTranscriptFixture()
+	selector := mustParseSelector(t, "2,1")
+
+	result, err := Apply(input, Options{
+		Mode:     ModeKeep,
+		Selector: selector,
+	})
+	if err != nil {
+		t.Fatalf("apply failed: %v", err)
+	}
+
+	if err := schema.ValidateTranscript(result.Transcript); err != nil {
+		t.Fatalf("trim output should remain valid: %v", err)
+	}
+}
+
+func mustParseSelector(t *testing.T, value string) Selector {
+	t.Helper()
+	selector, err := ParseSelector(value)
+	if err != nil {
+		t.Fatalf("selector parse failed for %q: %v", value, err)
+	}
+	return selector
+}
+
+func fullTranscriptFixture() schema.Transcript {
+	firstIndex := 10
+	secondIndex := 20
+	thirdIndex := 30
+	fourthIndex := 40
+
+	return schema.Transcript{
+		Metadata: schema.Metadata{
+			Application:           "seriatim",
+			Version:               "v-test",
+			InputReader:           "json-files",
+			InputFiles:            []string{"a.json", "b.json", "c.json", "d.json"},
+			PreprocessingModules:  []string{"validate-raw"},
+			PostprocessingModules: []string{"detect-overlaps"},
+			OutputModules:         []string{"json"},
+		},
+		Segments: []schema.Segment{
+			{
+				ID:                 1,
+				Source:             "a.json",
+				SourceSegmentIndex: &firstIndex,
+				SourceRef:          "a.json#10",
+				DerivedFrom:        []string{"a.json#10"},
+				Speaker:            "Alice",
+				Start:              1,
+				End:                2,
+				Text:               "alpha",
+				Categories:         []string{"word-run"},
+				OverlapGroupID:     7,
+			},
+			{
+				ID:                 2,
+				Source:             "b.json",
+				SourceSegmentIndex: &secondIndex,
+				SourceRef:          "b.json#20",
+				DerivedFrom:        []string{"b.json#19", "b.json#20"},
+				Speaker:            "Bob",
+				Start:              2,
+				End:                3,
+				Text:               "beta",
+				Categories:         []string{"filler", "backchannel"},
+				OverlapGroupID:     7,
+			},
+			{
+				ID:                 3,
+				Source:             "c.json",
+				SourceSegmentIndex: &thirdIndex,
+				SourceRef:          "c.json#30",
+				DerivedFrom:        []string{"c.json#30"},
+				Speaker:            "Carol",
+				Start:              3,
+				End:                4,
+				Text:               "gamma",
+				Categories:         []string{"normal"},
+				OverlapGroupID:     8,
+			},
+			{
+				ID:                 4,
+				Source:             "d.json",
+				SourceSegmentIndex: &fourthIndex,
+				SourceRef:          "d.json#40",
+				DerivedFrom:        []string{"d.json#40"},
+				Speaker:            "Dan",
+				Start:              4,
+				End:                5,
+				Text:               "delta",
+				Categories:         []string{"normal"},
+				OverlapGroupID:     9,
+			},
+		},
+		OverlapGroups: []schema.OverlapGroup{
+			{
+				ID:         7,
+				Start:      1.5,
+				End:        3.1,
+				Segments:   []string{"a.json#10", "b.json#20"},
+				Speakers:   []string{"Alice", "Bob"},
+				Class:      "unknown",
+				Resolution: "unresolved",
+			},
+		},
+	}
+}
+
+func overlapTranscriptFixture() schema.Transcript {
+	first := 10
+	second := 20
+	third := 30
+
+	return schema.Transcript{
+		Metadata: schema.Metadata{
+			Application:           "seriatim",
+			Version:               "v-test",
+			InputReader:           "json-files",
+			InputFiles:            []string{"a.json", "b.json", "c.json"},
+			PreprocessingModules:  []string{"validate-raw"},
+			PostprocessingModules: []string{"detect-overlaps"},
+			OutputModules:         []string{"json"},
+		},
+		Segments: []schema.Segment{
+			{
+				ID:                 1,
+				Source:             "a.json",
+				SourceSegmentIndex: &first,
+				SourceRef:          "a.json#10",
+				Speaker:            "Alice",
+				Start:              1,
+				End:                4,
+				Text:               "a",
+				OverlapGroupID:     99,
+			},
+			{
+				ID:                 2,
+				Source:             "b.json",
+				SourceSegmentIndex: &second,
+				SourceRef:          "b.json#20",
+				Speaker:            "Bob",
+				Start:              2,
+				End:                3,
+				Text:               "b",
+				OverlapGroupID:     99,
+			},
+			{
+				ID:                 3,
+				Source:             "c.json",
+				SourceSegmentIndex: &third,
+				SourceRef:          "c.json#30",
+				Speaker:            "Carol",
+				Start:              10,
+				End:                11,
+				Text:               "c",
+				OverlapGroupID:     100,
+			},
+		},
+		OverlapGroups: []schema.OverlapGroup{
+			{
+				ID:         99,
+				Start:      0,
+				End:        100,
+				Segments:   []string{"stale#1", "stale#2"},
+				Speakers:   []string{"stale"},
+				Class:      "unknown",
+				Resolution: "unresolved",
+			},
+		},
+	}
+}
+
+func transitiveOverlapFixture() schema.Transcript {
+	one := 1
+	two := 2
+	three := 3
+	return schema.Transcript{
+		Metadata: schema.Metadata{
+			Application: "seriatim",
+			Version:     "v-test",
+		},
+		Segments: []schema.Segment{
+			{ID: 1, Source: "a.json", SourceSegmentIndex: &one, Speaker: "Alice", Start: 10, End: 14, Text: "a"},
+			{ID: 2, Source: "b.json", SourceSegmentIndex: &two, Speaker: "Bob", Start: 12, End: 13, Text: "b"},
+			{ID: 3, Source: "c.json", SourceSegmentIndex: &three, Speaker: "Carol", Start: 13.5, End: 15, Text: "c"},
+		},
+		OverlapGroups: []schema.OverlapGroup{{ID: 77}},
+	}
+}
+
+func boundaryFixture() schema.Transcript {
+	one := 1
+	two := 2
+	return schema.Transcript{
+		Metadata: schema.Metadata{
+			Application: "seriatim",
+			Version:     "v-test",
+		},
+		Segments: []schema.Segment{
+			{ID: 1, Source: "a.json", SourceSegmentIndex: &one, Speaker: "Alice", Start: 1, End: 2, Text: "a", OverlapGroupID: 7},
+			{ID: 2, Source: "b.json", SourceSegmentIndex: &two, Speaker: "Bob", Start: 2, End: 3, Text: "b", OverlapGroupID: 7},
+		},
+		OverlapGroups: []schema.OverlapGroup{{ID: 7, Start: 1, End: 3}},
+	}
+}
+
+func assertSegmentIDs(t *testing.T, segments []schema.Segment, want []int) {
+	t.Helper()
+	got := make([]int, len(segments))
+	for index, segment := range segments {
+		got[index] = segment.ID
+	}
+	assertIntSlice(t, got, want)
+}
+
+func assertSegmentTexts(t *testing.T, segments []schema.Segment, want []string) {
+	t.Helper()
+	got := make([]string, len(segments))
+	for index, segment := range segments {
+		got[index] = segment.Text
+	}
+	if !equalStringSlices(got, want) {
+		t.Fatalf("segment texts = %v, want %v", got, want)
+	}
+}
+
+func assertIntSlice(t *testing.T, got []int, want []int) {
+	t.Helper()
+	if len(got) != len(want) {
+		t.Fatalf("slice length = %d, want %d", len(got), len(want))
+	}
+	for index := range got {
+		if got[index] != want[index] {
+			t.Fatalf("slice[%d] = %d, want %d (full got=%v, want=%v)", index, got[index], want[index], got, want)
+		}
+	}
+}
+
+func assertIntMap(t *testing.T, got map[int]int, want map[int]int) {
+	t.Helper()
+	if len(got) != len(want) {
+		t.Fatalf("map length = %d, want %d", len(got), len(want))
+	}
+	for key, wantValue := range want {
+		gotValue, exists := got[key]
+		if !exists {
+			t.Fatalf("missing map key %d", key)
+		}
+		if gotValue != wantValue {
+			t.Fatalf("map[%d] = %d, want %d", key, gotValue, wantValue)
+		}
+	}
+}
+
+func equalStringSlices(got []string, want []string) bool {
+	if len(got) != len(want) {
+		return false
+	}
+	for index := range got {
+		if got[index] != want[index] {
+			return false
+		}
+	}
+	return true
+}
--- a/internal/trim/artifact.go
+++ b/internal/trim/artifact.go
@@ -0,0 +1,396 @@
+package trim
+
+import (
+	"encoding/json"
+	"fmt"
+
+	"gitea.maximumdirect.net/eric/seriatim/schema"
+)
+
+const (
+	SchemaMinimal      = "seriatim-minimal"
+	SchemaIntermediate = "seriatim-intermediate"
+	SchemaFull         = "seriatim-full"
+)
+
+// Artifact holds a parsed seriatim output artifact; Schema names which payload field is set.
+type Artifact struct {
+	Schema       string
+	Full         *schema.Transcript
+	Intermediate *schema.IntermediateTranscript
+	Minimal      *schema.MinimalTranscript
+}
+
+// ApplyArtifactResult contains trimmed artifact output and ID mapping metadata.
+type ApplyArtifactResult struct {
+	Artifact                Artifact
+	OldToNewID              map[int]int
+	RemovedIDs              []int
+	OverlapGroupsRecomputed bool
+}
+
+// ParseArtifactJSON parses a serialized artifact, trying the full, intermediate, then minimal schema.
+func ParseArtifactJSON(data []byte) (Artifact, error) {
+	var decoded any
+	if err := json.Unmarshal(data, &decoded); err != nil {
+		return Artifact{}, fmt.Errorf("input JSON is malformed: %w", err)
+	}
+
+	var full schema.Transcript
+	if err := json.Unmarshal(data, &full); err == nil {
+		if err := schema.ValidateTranscript(full); err == nil {
+			return Artifact{
+				Schema: SchemaFull,
+				Full:   &full,
+			}, nil
+		}
+	}
+
+	var intermediate schema.IntermediateTranscript
+	if err := json.Unmarshal(data, &intermediate); err == nil {
+		if err := schema.ValidateIntermediateTranscript(intermediate); err == nil {
+			return Artifact{
+				Schema:       SchemaIntermediate,
+				Intermediate: &intermediate,
+			}, nil
+		}
+	}
+
+	var minimal schema.MinimalTranscript
+	if err := json.Unmarshal(data, &minimal); err == nil {
+		if err := schema.ValidateMinimalTranscript(minimal); err == nil {
+			return Artifact{
+				Schema:  SchemaMinimal,
+				Minimal: &minimal,
+			}, nil
+		}
+	}
+
+	return Artifact{}, fmt.Errorf("input JSON is not a valid seriatim output artifact")
+}
+
+// ValidateArtifact validates an artifact against its declared schema.
+func ValidateArtifact(artifact Artifact) error {
+	switch artifact.Schema {
+	case SchemaFull:
+		if artifact.Full == nil {
+			return fmt.Errorf("full artifact payload is missing")
+		}
+		return schema.ValidateTranscript(*artifact.Full)
+	case SchemaIntermediate:
+		if artifact.Intermediate == nil {
+			return fmt.Errorf("intermediate artifact payload is missing")
+		}
+		return schema.ValidateIntermediateTranscript(*artifact.Intermediate)
+	case SchemaMinimal:
+		if artifact.Minimal == nil {
+			return fmt.Errorf("minimal artifact payload is missing")
+		}
+		return schema.ValidateMinimalTranscript(*artifact.Minimal)
+	default:
+		return fmt.Errorf("unsupported artifact schema %q", artifact.Schema)
+	}
+}
+
+// Value returns the payload for JSON serialization, or nil if the schema is unsupported.
+func (artifact Artifact) Value() any {
+	switch artifact.Schema {
+	case SchemaFull:
+		if artifact.Full == nil {
+			return schema.Transcript{}
+		}
+		return *artifact.Full
+	case SchemaIntermediate:
+		if artifact.Intermediate == nil {
+			return schema.IntermediateTranscript{}
+		}
+		return *artifact.Intermediate
+	case SchemaMinimal:
+		if artifact.Minimal == nil {
+			return schema.MinimalTranscript{}
+		}
+		return *artifact.Minimal
+	default:
+		return nil
+	}
+}
+
+// SegmentCount returns the number of segments in the artifact.
+func (artifact Artifact) SegmentCount() int {
+	switch artifact.Schema {
+	case SchemaFull:
+		if artifact.Full == nil {
+			return 0
+		}
+		return len(artifact.Full.Segments)
+	case SchemaIntermediate:
+		if artifact.Intermediate == nil {
+			return 0
+		}
+		return len(artifact.Intermediate.Segments)
+	case SchemaMinimal:
+		if artifact.Minimal == nil {
+			return 0
+		}
+		return len(artifact.Minimal.Segments)
+	default:
+		return 0
+	}
+}
+
+// Application returns the application name recorded in the artifact metadata.
+func (artifact Artifact) Application() string {
+	switch artifact.Schema {
+	case SchemaFull:
+		if artifact.Full == nil {
+			return ""
+		}
+		return artifact.Full.Metadata.Application
+	case SchemaIntermediate:
+		if artifact.Intermediate == nil {
+			return ""
+		}
+		return artifact.Intermediate.Metadata.Application
+	case SchemaMinimal:
+		if artifact.Minimal == nil {
+			return ""
+		}
+		return artifact.Minimal.Metadata.Application
+	default:
+		return ""
+	}
+}
+
+// Version returns the version recorded in the artifact metadata.
+func (artifact Artifact) Version() string {
+	switch artifact.Schema {
+	case SchemaFull:
+		if artifact.Full == nil {
+			return ""
+		}
+		return artifact.Full.Metadata.Version
+	case SchemaIntermediate:
+		if artifact.Intermediate == nil {
+			return ""
+		}
+		return artifact.Intermediate.Metadata.Version
+	case SchemaMinimal:
+		if artifact.Minimal == nil {
+			return ""
+		}
+		return artifact.Minimal.Metadata.Version
+	default:
+		return ""
+	}
+}
+
+// ApplyArtifact trims a parsed artifact while preserving its input schema.
+func ApplyArtifact(input Artifact, opts Options) (ApplyArtifactResult, error) {
+	switch input.Schema {
+	case SchemaFull:
+		if input.Full == nil {
+			return ApplyArtifactResult{}, fmt.Errorf("full artifact payload is missing")
+		}
+		result, err := Apply(*input.Full, opts)
+		if err != nil {
+			return ApplyArtifactResult{}, err
+		}
+		out := result.Transcript
+		return ApplyArtifactResult{
+			Artifact: Artifact{
+				Schema: SchemaFull,
+				Full:   &out,
+			},
+			OldToNewID:              result.OldToNewID,
+			RemovedIDs:              result.RemovedIDs,
+			OverlapGroupsRecomputed: true,
+		}, nil
+	case SchemaIntermediate:
+		if input.Intermediate == nil {
+			return ApplyArtifactResult{}, fmt.Errorf("intermediate artifact payload is missing")
+		}
+		result, err := ApplyIntermediate(*input.Intermediate, opts)
+		if err != nil {
+			return ApplyArtifactResult{}, err
+		}
+		out := result.Transcript
+		return ApplyArtifactResult{
+			Artifact: Artifact{
+				Schema:       SchemaIntermediate,
+				Intermediate: &out,
+			},
+			OldToNewID:              result.OldToNewID,
+			RemovedIDs:              result.RemovedIDs,
+			OverlapGroupsRecomputed: false,
+		}, nil
+	case SchemaMinimal:
+		if input.Minimal == nil {
+			return ApplyArtifactResult{}, fmt.Errorf("minimal artifact payload is missing")
+		}
+		result, err := ApplyMinimal(*input.Minimal, opts)
+		if err != nil {
+			return ApplyArtifactResult{}, err
+		}
+		out := result.Transcript
+		return ApplyArtifactResult{
+			Artifact: Artifact{
+				Schema:  SchemaMinimal,
+				Minimal: &out,
+			},
+			OldToNewID:              result.OldToNewID,
+			RemovedIDs:              result.RemovedIDs,
+			OverlapGroupsRecomputed: false,
+		}, nil
+	default:
+		return ApplyArtifactResult{}, fmt.Errorf("unsupported artifact schema %q", input.Schema)
+	}
+}
+
+// ConvertArtifact converts an artifact to outputSchema; it cannot produce the full schema from a reduced one.
+func ConvertArtifact(input Artifact, outputSchema string) (Artifact, error) {
+	if outputSchema == "" || outputSchema == input.Schema {
+		return input, nil
+	}
+
+	switch input.Schema {
+	case SchemaFull:
+		if input.Full == nil {
+			return Artifact{}, fmt.Errorf("full artifact payload is missing")
+		}
+		switch outputSchema {
+		case SchemaIntermediate:
+			out := intermediateFromFull(*input.Full)
+			return Artifact{
+				Schema:       SchemaIntermediate,
+				Intermediate: &out,
+			}, nil
+		case SchemaMinimal:
+			out := minimalFromFull(*input.Full)
+			return Artifact{
+				Schema:  SchemaMinimal,
+				Minimal: &out,
+			}, nil
+		default:
+			return Artifact{}, fmt.Errorf("unsupported output schema %q", outputSchema)
+		}
+	case SchemaIntermediate:
+		if input.Intermediate == nil {
+			return Artifact{}, fmt.Errorf("intermediate artifact payload is missing")
+		}
+		switch outputSchema {
+		case SchemaMinimal:
+			out := minimalFromIntermediate(*input.Intermediate)
+			return Artifact{
+				Schema:  SchemaMinimal,
+				Minimal: &out,
+			}, nil
+		case SchemaFull:
+			return Artifact{}, fmt.Errorf("cannot emit %q from %q input artifact", SchemaFull, SchemaIntermediate)
+		default:
+			return Artifact{}, fmt.Errorf("unsupported output schema %q", outputSchema)
+		}
+	case SchemaMinimal:
+		if input.Minimal == nil {
+			return Artifact{}, fmt.Errorf("minimal artifact payload is missing")
+		}
+		switch outputSchema {
+		case SchemaIntermediate:
+			out := intermediateFromMinimal(*input.Minimal)
+			return Artifact{
+				Schema:       SchemaIntermediate,
+				Intermediate: &out,
+			}, nil
+		case SchemaFull:
+			return Artifact{}, fmt.Errorf("cannot emit %q from %q input artifact", SchemaFull, SchemaMinimal)
+		default:
+			return Artifact{}, fmt.Errorf("unsupported output schema %q", outputSchema)
+		}
+	default:
+		return Artifact{}, fmt.Errorf("unsupported input schema %q", input.Schema)
+	}
+}
+
+func intermediateFromFull(input schema.Transcript) schema.IntermediateTranscript {
+	segments := make([]schema.IntermediateSegment, len(input.Segments))
+	for index, segment := range input.Segments {
+		segments[index] = schema.IntermediateSegment{
+			ID:         segment.ID,
+			Start:      segment.Start,
+			End:        segment.End,
+			Speaker:    segment.Speaker,
+			Text:       segment.Text,
+			Categories: append([]string(nil), segment.Categories...),
+		}
+	}
+	return schema.IntermediateTranscript{
+		Metadata: schema.IntermediateMetadata{
+			Application:  input.Metadata.Application,
+			Version:      input.Metadata.Version,
+			OutputSchema: SchemaIntermediate,
+		},
+		Segments: segments,
+	}
+}
+
+func minimalFromFull(input schema.Transcript) schema.MinimalTranscript {
+	segments := make([]schema.MinimalSegment, len(input.Segments))
+	for index, segment := range input.Segments {
+		segments[index] = schema.MinimalSegment{
+			ID:      segment.ID,
+			Start:   segment.Start,
+			End:     segment.End,
+			Speaker: segment.Speaker,
+			Text:    segment.Text,
+		}
+	}
+	return schema.MinimalTranscript{
+		Metadata: schema.MinimalMetadata{
+			Application:  input.Metadata.Application,
+			Version:      input.Metadata.Version,
+			OutputSchema: SchemaMinimal,
+		},
+		Segments: segments,
+	}
+}
+
+func minimalFromIntermediate(input schema.IntermediateTranscript) schema.MinimalTranscript {
+	segments := make([]schema.MinimalSegment, len(input.Segments))
+	for index, segment := range input.Segments {
+		segments[index] = schema.MinimalSegment{
+			ID:      segment.ID,
+			Start:   segment.Start,
+			End:     segment.End,
+			Speaker: segment.Speaker,
+			Text:    segment.Text,
+		}
+	}
+	return schema.MinimalTranscript{
+		Metadata: schema.MinimalMetadata{
+			Application:  input.Metadata.Application,
+			Version:      input.Metadata.Version,
+			OutputSchema: SchemaMinimal,
+		},
+		Segments: segments,
+	}
+}
+
+func intermediateFromMinimal(input schema.MinimalTranscript) schema.IntermediateTranscript {
+	segments := make([]schema.IntermediateSegment, len(input.Segments))
+	for index, segment := range input.Segments {
+		segments[index] = schema.IntermediateSegment{
+			ID:      segment.ID,
+			Start:   segment.Start,
+			End:     segment.End,
+			Speaker: segment.Speaker,
+			Text:    segment.Text,
+		}
+	}
+	return schema.IntermediateTranscript{
+		Metadata: schema.IntermediateMetadata{
+			Application:  input.Metadata.Application,
+			Version:      input.Metadata.Version,
+			OutputSchema: SchemaIntermediate,
+		},
+		Segments: segments,
+	}
+}
--- a/internal/trim/artifact_test.go
+++ b/internal/trim/artifact_test.go
@@ -0,0 +1,138 @@
+package trim
+
+import (
+	"encoding/json"
+	"strings"
+	"testing"
+
+	"gitea.maximumdirect.net/eric/seriatim/schema"
+)
+
+func TestParseArtifactJSONRejectsMalformedJSON(t *testing.T) {
+	_, err := ParseArtifactJSON([]byte(`{"metadata":`))
+	if err == nil {
+		t.Fatal("expected malformed JSON error")
+	}
+	if !strings.Contains(err.Error(), "input JSON is malformed") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestParseArtifactJSONRejectsDuplicateSegmentIDs(t *testing.T) {
+	first := 10
+	second := 20
+	value := schema.Transcript{
+		Metadata: schema.Metadata{
+			Application: "seriatim",
+			Version:     "v-test",
+		},
+		Segments: []schema.Segment{
+			{ID: 1, Source: "a.json", SourceSegmentIndex: &first, Speaker: "A", Start: 1, End: 2, Text: "one"},
+			{ID: 1, Source: "a.json", SourceSegmentIndex: &second, Speaker: "B", Start: 2, End: 3, Text: "two"},
+		},
+		OverlapGroups: []schema.OverlapGroup{},
+	}
+	data := mustMarshalJSON(t, value)
+
+	_, err := ParseArtifactJSON(data)
+	if err == nil {
+		t.Fatal("expected invalid artifact error")
+	}
+	if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestParseArtifactJSONRejectsNonSequentialSegmentIDs(t *testing.T) {
+	first := 10
+	second := 20
+	value := schema.Transcript{
+		Metadata: schema.Metadata{
+			Application: "seriatim",
+			Version:     "v-test",
+		},
+		Segments: []schema.Segment{
+			{ID: 1, Source: "a.json", SourceSegmentIndex: &first, Speaker: "A", Start: 1, End: 2, Text: "one"},
+			{ID: 3, Source: "a.json", SourceSegmentIndex: &second, Speaker: "B", Start: 2, End: 3, Text: "two"},
+		},
+		OverlapGroups: []schema.OverlapGroup{},
+	}
+	data := mustMarshalJSON(t, value)
+
+	_, err := ParseArtifactJSON(data)
+	if err == nil {
+		t.Fatal("expected invalid artifact error")
+	}
+	if !strings.Contains(err.Error(), "not a valid seriatim output artifact") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func TestConvertArtifactMinimalToIntermediate(t *testing.T) {
+	value := schema.MinimalTranscript{
+		Metadata: schema.MinimalMetadata{
+			Application:  "seriatim",
+			Version:      "v-test",
+			OutputSchema: SchemaMinimal,
+		},
+		Segments: []schema.MinimalSegment{
+			{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one"},
+			{ID: 2, Start: 2, End: 3, Speaker: "B", Text: "two"},
+		},
+	}
+	artifact := Artifact{
+		Schema:  SchemaMinimal,
+		Minimal: &value,
+	}
+
+	converted, err := ConvertArtifact(artifact, SchemaIntermediate)
+	if err != nil {
+		t.Fatalf("convert failed: %v", err)
+	}
+	if converted.Schema != SchemaIntermediate {
+		t.Fatalf("schema = %q, want %q", converted.Schema, SchemaIntermediate)
+	}
+	if converted.Intermediate == nil {
+		t.Fatal("expected intermediate artifact")
+	}
+	if len(converted.Intermediate.Segments) != 2 {
+		t.Fatalf("segment count = %d, want 2", len(converted.Intermediate.Segments))
+	}
+	if converted.Intermediate.Segments[0].ID != 1 || converted.Intermediate.Segments[1].ID != 2 {
+		t.Fatalf("unexpected IDs: %#v", converted.Intermediate.Segments)
+	}
+}
+
+func TestConvertArtifactMinimalToFullFails(t *testing.T) {
+	value := schema.MinimalTranscript{
+		Metadata: schema.MinimalMetadata{
+			Application:  "seriatim",
+			Version:      "v-test",
+			OutputSchema: SchemaMinimal,
+		},
+		Segments: []schema.MinimalSegment{
+			{ID: 1, Start: 1, End: 2, Speaker: "A", Text: "one"},
+		},
+	}
+	artifact := Artifact{
+		Schema:  SchemaMinimal,
+		Minimal: &value,
+	}
+
+	_, err := ConvertArtifact(artifact, SchemaFull)
+	if err == nil {
+		t.Fatal("expected conversion error")
+	}
+	if !strings.Contains(err.Error(), "cannot emit") {
+		t.Fatalf("unexpected error: %v", err)
+	}
+}
+
+func mustMarshalJSON(t *testing.T, value any) []byte {
+	t.Helper()
+	data, err := json.Marshal(value)
+	if err != nil {
+		t.Fatalf("marshal: %v", err)
+	}
+	return data
+}
--- a/internal/trim/selector.go
+++ b/internal/trim/selector.go
@@ -0,0 +1,156 @@
+package trim
+
+import (
+	"fmt"
+	"regexp"
+	"sort"
+	"strconv"
+	"strings"
+)
+
+var selectorElementPattern = regexp.MustCompile(`^([+-]?\d+)(?:\s*-\s*([+-]?\d+))?$`)
+
+// Selector represents a normalized union of segment IDs.
+type Selector struct {
+	ranges []idRange
+}
+
+type idRange struct {
+	start int
+	end   int
+}
+
+// ParseSelector parses an inline segment selector expression.
+func ParseSelector(input string) (Selector, error) {
+	if strings.TrimSpace(input) == "" {
+		return Selector{}, fmt.Errorf("selector cannot be empty")
+	}
+
+	parts := strings.Split(input, ",")
+	ranges := make([]idRange, 0, len(parts))
+	for index, raw := range parts {
+		element := strings.TrimSpace(raw)
+		if element == "" {
+			return Selector{}, fmt.Errorf("selector element %d cannot be empty", index+1)
+		}
+
+		rangeValue, err := parseElement(element)
+		if err != nil {
+			return Selector{}, fmt.Errorf("selector element %d %q: %w", index+1, element, err)
+		}
+		ranges = append(ranges, rangeValue)
+	}
+
+	normalized := normalizeRanges(ranges)
+	if len(normalized) == 0 {
+		return Selector{}, fmt.Errorf("selector cannot be empty")
+	}
+	return Selector{ranges: normalized}, nil
+}
+
+// Contains reports whether id is included in this selector.
+func (s Selector) Contains(id int) bool {
+	if id <= 0 {
+		return false
+	}
+	index := sort.Search(len(s.ranges), func(i int) bool {
+		return s.ranges[i].end >= id
+	})
+	if index == len(s.ranges) {
+		return false
+	}
+	rangeValue := s.ranges[index]
+	return id >= rangeValue.start && id <= rangeValue.end
+}
+
+// IDs returns a deterministic ascending list of unique segment IDs.
+func (s Selector) IDs() []int {
+	total := 0
+	for _, rangeValue := range s.ranges {
+		total += rangeValue.end - rangeValue.start + 1
+	}
+
+	ids := make([]int, 0, total)
+	for _, rangeValue := range s.ranges {
+		for id := rangeValue.start; id <= rangeValue.end; id++ {
+			ids = append(ids, id)
+		}
+	}
+	return ids
+}
+
+func parseElement(element string) (idRange, error) {
+	matches := selectorElementPattern.FindStringSubmatch(element)
+	if matches == nil {
+		return idRange{}, fmt.Errorf("malformed element")
+	}
+
+	start, err := parseID(matches[1])
+	if err != nil {
+		return idRange{}, err
+	}
+
+	if matches[2] == "" {
+		return idRange{start: start, end: start}, nil
+	}
+
+	end, err := parseID(matches[2])
+	if err != nil {
+		return idRange{}, fmt.Errorf("invalid range end: %w", err)
+	}
+	if start > end {
+		return idRange{}, fmt.Errorf("descending range %d-%d is invalid", start, end)
+	}
+	return idRange{start: start, end: end}, nil
+}
+
+func parseID(value string) (int, error) {
+	value = strings.TrimSpace(value)
+	if value == "" {
+		return 0, fmt.Errorf("missing segment ID")
+	}
+
+	id, err := strconv.Atoi(value)
+	if err != nil {
+		return 0, fmt.Errorf("segment ID must be an integer: %w", err)
+	}
+	if id <= 0 {
+		return 0, fmt.Errorf("segment ID must be positive")
+	}
+	return id, nil
+}
+
+func normalizeRanges(in []idRange) []idRange {
+	if len(in) == 0 {
+		return nil
+	}
+
+	sorted := make([]idRange, len(in))
+	copy(sorted, in)
+	sort.Slice(sorted, func(i, j int) bool {
+		if sorted[i].start == sorted[j].start {
+			return sorted[i].end < sorted[j].end
+		}
+		return sorted[i].start < sorted[j].start
+	})
+
+	merged := make([]idRange, 0, len(sorted))
+	for _, next := range sorted {
+		if len(merged) == 0 {
+			merged = append(merged, next)
+			continue
+		}
+
+		last := &merged[len(merged)-1]
+		if next.start <= last.end+1 {
+			if next.end > last.end {
+				last.end = next.end
+			}
+			continue
+		}
+
+		merged = append(merged, next)
+	}
+
+	return merged
+}
--- a/internal/trim/selector_test.go
+++ b/internal/trim/selector_test.go
@@ -0,0 +1,127 @@
+package trim
+
+import (
+	"strings"
+	"testing"
+)
+
+func TestParseSelectorSingleID(t *testing.T) {
+	selector, err := ParseSelector("1")
+	if err != nil {
+		t.Fatalf("parse failed: %v", err)
+	}
+	assertIDs(t, selector, []int{1})
+	assertContains(t, selector, map[int]bool{1: true, 2: false, 0: false, -1: false})
+}
+
+func TestParseSelectorInclusiveRange(t *testing.T) {
+	selector, err := ParseSelector("1-3")
+	if err != nil {
+		t.Fatalf("parse failed: %v", err)
+	}
+	assertIDs(t, selector, []int{1, 2, 3})
+}
+
+func TestParseSelectorCommaSeparatedCombination(t *testing.T) {
+	selector, err := ParseSelector("1-3,8,10-12")
+	if err != nil {
+		t.Fatalf("parse failed: %v", err)
+	}
+	assertIDs(t, selector, []int{1, 2, 3, 8, 10, 11, 12})
+}
+
+func TestParseSelectorWhitespaceTolerance(t *testing.T) {
+	selector, err := ParseSelector(" 1 - 3 ,  8 , 10 - 12 ")
+	if err != nil {
+		t.Fatalf("parse failed: %v", err)
+	}
+	assertIDs(t, selector, []int{1, 2, 3, 8, 10, 11, 12})
+}
+
+func TestParseSelectorDuplicatesAndOverlapsNormalizeUnion(t *testing.T) {
+	selector, err := ParseSelector("1-4,2,4,3-6,6")
+	if err != nil {
+		t.Fatalf("parse failed: %v", err)
+	}
+	assertIDs(t, selector, []int{1, 2, 3, 4, 5, 6})
+	assertContains(t, selector, map[int]bool{1: true, 5: true, 6: true, 7: false})
+}
+
+func TestParseSelectorDeterministicNormalizedOutput(t *testing.T) {
+	left, err := ParseSelector("8,1-3,2,10-12")
+	if err != nil {
+		t.Fatalf("parse left failed: %v", err)
+	}
+	right, err := ParseSelector("10-12,3,2,1,8")
+	if err != nil {
+		t.Fatalf("parse right failed: %v", err)
+	}
+
+	leftIDs := left.IDs()
+	rightIDs := right.IDs()
+	if !equalInts(leftIDs, rightIDs) {
+		t.Fatalf("normalized IDs mismatch: %v vs %v", leftIDs, rightIDs)
+	}
+}
+
+func TestParseSelectorFailures(t *testing.T) {
+	tests := []struct {
+		name      string
+		selector  string
+		wantError string
+	}{
+		{name: "empty", selector: "", wantError: "cannot be empty"},
+		{name: "whitespace only", selector: "   ", wantError: "cannot be empty"},
+		{name: "zero", selector: "0", wantError: "must be positive"},
+		{name: "negative", selector: "-1", wantError: "must be positive"},
+		{name: "range includes zero", selector: "0-2", wantError: "must be positive"},
+		{name: "descending range", selector: "10-1", wantError: "descending range"},
+		{name: "empty element", selector: "1,,2", wantError: "cannot be empty"},
+		{name: "trailing comma", selector: "1,", wantError: "cannot be empty"},
+		{name: "malformed alpha", selector: "abc", wantError: "malformed element"},
+		{name: "malformed range", selector: "1-2-3", wantError: "malformed element"},
+		{name: "missing end", selector: "1-", wantError: "malformed element"},
+		{name: "missing start", selector: "-2", wantError: "must be positive"},
+	}
+
+	for _, test := range tests {
+		t.Run(test.name, func(t *testing.T) {
+			_, err := ParseSelector(test.selector)
+			if err == nil {
+				t.Fatalf("expected error for %q", test.selector)
+			}
+			if !strings.Contains(err.Error(), test.wantError) {
+				t.Fatalf("error = %q, want substring %q", err.Error(), test.wantError)
+			}
+		})
+	}
+}
+
+func assertIDs(t *testing.T, selector Selector, want []int) {
+	t.Helper()
+	got := selector.IDs()
+	if !equalInts(got, want) {
+		t.Fatalf("IDs = %v, want %v", got, want)
+	}
+}
+
+func assertContains(t *testing.T, selector Selector, checks map[int]bool) {
+	t.Helper()
+	for id, want := range checks {
+		if got := selector.Contains(id); got != want {
+			t.Fatalf("Contains(%d) = %t, want %t", id, got, want)
+		}
+	}
+}
+
+func equalInts(left []int, right []int) bool {
+	if len(left) != len(right) {
+		return false
+	}
+	for index := range left {
+		if left[index] != right[index] {
+			return false
+		}
+	}
+	return true
+}
Author	SHA1	Message	Date
Eric Rakestraw	e6d3b4a46e	Harden trim integration	2026-05-08 15:00:46 +00:00
Eric Rakestraw	54f7717de8	Document trim command	2026-05-08 14:57:52 +00:00
Eric Rakestraw	c48b02d2ec	Add trim report output	2026-05-08 14:56:24 +00:00
Eric Rakestraw	ac3dcf2557	Add trim CLI command	2026-05-08 14:53:59 +00:00
Eric Rakestraw	1c0e4438ae	Recompute overlap groups during trim	2026-05-08 14:47:52 +00:00
Eric Rakestraw	52f7729100	Add artifact trim transformation	2026-05-08 14:44:31 +00:00
Eric Rakestraw	2c82f8bf5c	Add trim selector parsing	2026-05-08 14:41:47 +00:00
Eric Rakestraw	d865bda4a9	Updated .gitignore to ignore .codex and related files	2026-05-08 14:36:00 +00:00