Files
feedkit/doc.go
Eric Rakestraw 96039f6530 refactor!: introduce generic processors registry and remove normalize registry adapter
- add new `processors` package with canonical `Processor` interface
- add `processors.Registry` with Register/Build/BuildChain factory model
- switch `pipeline.Pipeline` to `[]processors.Processor`
- replace `normalize.Registry` + registry adapter with direct `normalize.Processor`
- remove `normalize/registry.go`
- update root docs to position normalize as one optional processing stage
- add tests for processors registry, normalize processor behavior, and pipeline flow

BREAKING CHANGE:
- `pipeline.Processor` removed; use `processors.Processor`
- `normalize.Registry` and old normalize processor adapter APIs removed
- downstream daemons must update processor wiring to new `processors.Registry`
  and `normalize.NewProcessor(...)`
2026-03-16 13:14:24 -05:00

111 lines
3.2 KiB
Go

// Package feedkit provides domain-agnostic plumbing for feed-processing daemons.
//
// A feed daemon ingests upstream input, turns it into event.Event values, applies
// optional processing, and emits to sinks.
//
// Conceptual flow:
//
// Collect -> Process (optional stages, including normalize) -> Route -> Emit
//
// In feedkit this maps to:
//
// Collect: sources + scheduler
// Process: pipeline + processors + normalize (optional stage)
// Route: dispatch
// Emit: sinks
// Config: config
//
// feedkit intentionally does not define domain payload schemas or domain-specific
// validation rules. Those belong in each concrete daemon.
//
// Public packages
//
// - config
// YAML config loading/validation (strict decode + domain-agnostic checks).
//
// SourceConfig supports both polling and streaming sources:
//
// - mode: "poll" | "stream" | omitted (auto by driver type)
//
// - every: poll interval (required for mode="poll")
//
// - kinds: optional expected emitted kinds
//
// - kind: legacy singular fallback
//
// - params: driver-specific settings
//
// - event
// Domain-agnostic event envelope (ID, Kind, Source, EmittedAt, Schema, Payload).
//
// - sources
// Source abstractions and source-driver registry.
//
// There are two source interfaces:
//
// - PollSource: Poll(ctx) ([]event.Event, error)
//
// - StreamSource: Run(ctx, out) error
//
// Both share Input{Name()}. A source may emit 0..N events per poll/run step,
// and may emit multiple event kinds.
//
// - scheduler
// Runs one goroutine per job:
//
// - PollSource jobs run on Every (+ jitter)
//
// - StreamSource jobs run continuously
//
// - pipeline
// Processor chain between scheduler and dispatch.
// Processors can transform, drop, or reject events.
//
// - processors
// Generic processor interface and named factory registry for wiring chains.
//
// - normalize
// Concrete pipeline processor for raw->canonical mapping.
// If no normalizer matches, the event passes through unchanged by default.
//
// - dispatch
// Routes events to sinks and isolates slow sinks via per-sink queues/workers.
//
// - sinks
// Sink abstractions + sink registry.
//
// Typical wiring (daemon main.go)
//
// 1. Load config.
// 2. Register source drivers and build sources from config.Sources.
// 3. Register sink drivers and build sinks from config.Sinks.
// 4. Compile routes.
// 5. Start scheduler (sources -> bus) and dispatcher (bus -> pipeline -> sinks).
//
// Sketch:
//
// cfg, _ := config.Load("config.yml")
// srcReg := sources.NewRegistry()
// // domain registers poll/stream drivers...
//
// var jobs []scheduler.Job
// for _, sc := range cfg.Sources {
// src, _ := srcReg.BuildInput(sc)
// jobs = append(jobs, scheduler.Job{
// Source: src,
// Every: sc.Every.Duration,
// })
// }
//
// bus := make(chan event.Event, 256)
// s := &scheduler.Scheduler{Jobs: jobs, Out: bus, Logf: logf}
// // start dispatcher similarly...
//
// # Context and cancellation
//
// All blocking work should honor context cancellation:
// - source polling/streaming I/O
// - sink consumption
// - any expensive processor work
package feedkit