Files
feedkit/doc.go

114 lines
3.4 KiB
Go

// Package feedkit provides domain-agnostic plumbing for feed-processing daemons.
//
// A feed daemon ingests upstream input, turns it into event.Event values, applies
// optional processing, and emits to sinks.
//
// Conceptual flow:
//
// Collect -> Process (optional stages, including normalize) -> Route -> Emit
//
// In feedkit this maps to:
//
// Collect: sources + scheduler
// Process: pipeline + processors + normalize (optional stage)
// Route: dispatch
// Emit: sinks
// Config: config
//
// feedkit intentionally does not define domain payload schemas or domain-specific
// validation rules. Those belong in each concrete daemon.
//
// Public packages
//
// - config
// YAML config loading/validation (strict decode + domain-agnostic checks).
//
// SourceConfig supports both polling and streaming sources:
//
// - mode: "poll" | "stream" | omitted (auto by driver type)
//
// - every: poll interval (required for mode="poll")
//
// - kinds: optional expected emitted kinds
//
// - kind: legacy singular fallback
//
// - params: driver-specific settings
//
// - event
// Domain-agnostic event envelope (ID, Kind, Source, EmittedAt, Schema, Payload).
//
// - sources
// Source abstractions and source-driver registry.
//
// There are two source interfaces:
//
// - PollSource: Poll(ctx) ([]event.Event, error)
//
// - StreamSource: Run(ctx, out) error
//
// Both share Input{Name()}. A source may emit 0..N events per poll/run step,
// and may emit multiple event kinds.
//
// - scheduler
// Runs one goroutine per job:
//
// - PollSource jobs run on Every (+ jitter)
//
// - StreamSource jobs run continuously
//
// - pipeline
// Processor chain between scheduler and dispatch.
// Processors can transform, drop, or reject events.
//
// - processors
// Generic processor interface and named factory registry for wiring chains.
//
// - normalize
// Concrete pipeline processor for raw->canonical mapping.
// If no normalizer matches, the event passes through unchanged by default.
//
// - dispatch
// Routes events to sinks and isolates slow sinks via per-sink queues/workers.
//
// - sinks
// Sink abstractions + sink registry.
// Built-ins include stdout, NATS, and Postgres. For Postgres, downstream
// code registers table schemas/mappers while feedkit manages DDL, writes,
// and optional prune helpers.
//
// Typical wiring (daemon main.go)
//
// 1. Load config.
// 2. Register source drivers and build sources from config.Sources.
// 3. Register sink drivers and build sinks from config.Sinks.
// 4. Compile routes.
// 5. Start scheduler (sources -> bus) and dispatcher (bus -> pipeline -> sinks).
//
// Sketch:
//
// cfg, _ := config.Load("config.yml")
// srcReg := sources.NewRegistry()
// // domain registers poll/stream drivers...
//
// var jobs []scheduler.Job
// for _, sc := range cfg.Sources {
// src, _ := srcReg.BuildInput(sc)
// jobs = append(jobs, scheduler.Job{
// Source: src,
// Every: sc.Every.Duration,
// })
// }
//
// bus := make(chan event.Event, 256)
// s := &scheduler.Scheduler{Jobs: jobs, Out: bus, Logf: logf}
// // start dispatcher similarly...
//
// # Context and cancellation
//
// All blocking work should honor context cancellation:
// - source polling/streaming I/O
// - sink consumption
// - any expensive processor work
package feedkit