Cleaned up documentation and removed stubs and TODOs throughout the application
136
doc.go
@@ -1,130 +1,56 @@
-// Package feedkit provides domain-agnostic plumbing for feed-processing daemons.
+// Package feedkit provides a high-level map of the feedkit package set.
 //
-// A feed daemon ingests upstream input, turns it into event.Event values, applies
-// optional processing, and emits to sinks.
+// Most real applications do not import the root package directly. Instead, they
+// compose the subpackages that handle configuration, collection, processing,
+// routing, and sinks.
 //
-// Conceptual flow:
+// The usual flow through feedkit is:
 //
-// Collect -> Process (optional stages, including dedupe + normalize) -> Route -> Emit
+// Collect -> Process -> Route -> Emit
 //
-// In feedkit this maps to:
-//
-// Collect: sources + scheduler
-// Process: pipeline + processors + processors/dedupe + processors/normalize (optional stages)
-// Route: dispatch
-// Emit: sinks
-// Config: config
-//
-// feedkit intentionally does not define domain payload schemas or domain-specific
-// validation rules. Those belong in each concrete daemon.
-//
-// Public packages
+// That flow maps to packages like this:
 //
 // - config
-// YAML config loading/validation (strict decode + domain-agnostic checks).
-//
-// SourceConfig supports both polling and streaming sources:
-//
-// - mode: "poll" | "stream" | omitted (auto by driver type)
-//
-// - every: poll interval (required for mode="poll")
-//
-// - kinds: optional expected emitted kinds
-//
-// - kind: legacy singular fallback
-//
-// - params: driver-specific settings
+// Loads and validates daemon config. This package owns domain-agnostic
+// config shape and consistency checks.
 //
 // - event
-// Domain-agnostic event envelope (ID, Kind, Source, EmittedAt, Schema, Payload).
+// Defines the event.Event envelope shared across sources, processors,
+// dispatch, and sinks.
 //
 // - sources
-// Source abstractions and source-driver registry.
-//
-// There are two source interfaces:
-//
-// - PollSource: Poll(ctx) ([]event.Event, error)
-//
-// - StreamSource: Run(ctx, out) error
-//
-// Both share Input{Name()}. A source may emit 0..N events per poll/run step,
-// and may emit multiple event kinds.
-//
-// For HTTP-backed polling sources, sources.NewHTTPSource provides a shared
-// helper for generic params:
-//
-// - params.url
-//
-// - params.user_agent
-//
-// - params.conditional (optional, default true)
-//
-// When conditional polling is enabled, feedkit opportunistically uses ETag
-// and Last-Modified validators. A 304 Not Modified response is treated as a
-// successful poll that emits no events.
+// Defines polling and streaming source interfaces, the source registry, and
+// reusable source helpers.
 //
 // - scheduler
-// Runs one goroutine per job:
-//
-// - PollSource jobs run on Every (+ jitter)
-//
-// - StreamSource jobs run continuously
-//
-// - pipeline
-// Processor chain between scheduler and dispatch.
-// Processors can transform, drop, or reject events.
+// Runs configured sources on a cadence or as long-lived stream workers.
 //
 // - processors
-// Generic processor interface and named factory registry for wiring chains.
+// Defines the generic processor interface and registry used to build
+// ordered processor chains.
 //
 // - processors/dedupe
-// Built-in in-memory LRU dedupe processor keyed by Event.ID.
+// Built-in in-memory dedupe processor keyed by Event.ID.
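
Note on dedupe: the earlier wording called this processor an LRU. As an illustration of that idea only, a bounded "seen" set keyed by event ID might look like the sketch below. `lruSeen` and its methods are hypothetical names, and nothing here is taken from processors/dedupe itself.

```go
package main

import (
	"container/list"
	"fmt"
)

// lruSeen remembers the most recently seen IDs, up to capacity.
type lruSeen struct {
	capacity int
	order    *list.List               // front = most recently seen
	index    map[string]*list.Element // ID -> element in order
}

func newLRUSeen(capacity int) *lruSeen {
	return &lruSeen{
		capacity: capacity,
		order:    list.New(),
		index:    make(map[string]*list.Element),
	}
}

// Seen reports whether id was already remembered, and records it either way.
func (s *lruSeen) Seen(id string) bool {
	if el, ok := s.index[id]; ok {
		s.order.MoveToFront(el) // refresh recency for a repeat
		return true
	}
	s.index[id] = s.order.PushFront(id)
	if s.order.Len() > s.capacity {
		oldest := s.order.Back()
		s.order.Remove(oldest)
		delete(s.index, oldest.Value.(string))
	}
	return false
}

func main() {
	seen := newLRUSeen(2)
	for _, id := range []string{"a", "b", "a", "c", "b"} {
		fmt.Printf("%s duplicate=%v\n", id, seen.Seen(id))
	}
}
```

The sketch is not safe for concurrent use. A real processor would need a mutex or a single owning goroutine.
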
 //
 // - processors/normalize
-// Concrete pipeline processor for raw->canonical mapping.
-// If no normalizer matches, the event passes through unchanged by default.
+// Built-in normalization processor plus helper APIs for raw-to-canonical
+// event mapping.
 //
+// - pipeline
+// Applies an ordered processor chain between collection and dispatch.
+//
 // - dispatch
-// Routes events to sinks and isolates slow sinks via per-sink queues/workers.
+// Compiles routes and fans events out to sinks with per-sink isolation.
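
Note on per-sink isolation: one way to picture "per-sink queues/workers" is a bounded channel and a goroutine per sink, with a non-blocking send during fan-out. The sketch below makes several assumptions: `sinkWorker`, `fanOut`, and `start` are hypothetical names, and dropping on a full queue is a policy chosen for this example, not a statement about what dispatch does.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Event is a local stand-in for event.Event.
type Event struct{ ID string }

// sinkWorker owns one bounded queue; one goroutine drains it.
type sinkWorker struct {
	name    string
	queue   chan Event
	dropped int // only touched by the goroutine that calls fanOut
}

// fanOut offers e to every sink without blocking, so a full queue only
// affects its own sink. Dropping on overflow is an assumption of this sketch.
func fanOut(workers []*sinkWorker, e Event) {
	for _, w := range workers {
		select {
		case w.queue <- e:
		default:
			w.dropped++
		}
	}
}

// start launches the consumer goroutine for w.
func start(w *sinkWorker, consume func(Event), wg *sync.WaitGroup) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		for e := range w.queue {
			consume(e)
		}
	}()
}

func main() {
	fast := &sinkWorker{name: "fast", queue: make(chan Event, 8)}
	slow := &sinkWorker{name: "slow", queue: make(chan Event, 1)}

	var wg sync.WaitGroup
	fastCount := 0
	start(fast, func(Event) { fastCount++ }, &wg)
	start(slow, func(Event) { time.Sleep(50 * time.Millisecond) }, &wg)

	workers := []*sinkWorker{fast, slow}
	for i := 0; i < 5; i++ {
		fanOut(workers, Event{ID: fmt.Sprint(i)})
	}
	close(fast.queue)
	close(slow.queue)
	wg.Wait()

	fmt.Println("fast delivered:", fastCount, "slow dropped:", slow.dropped)
}
```

The fast sink receives every event regardless of how far behind the slow sink falls. How many events the slow sink drops depends on timing.
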
 //
 // - sinks
-// Sink abstractions + sink registry.
-// Built-ins include stdout, NATS, and Postgres. For Postgres, downstream
-// code registers table schemas/mappers while feedkit manages DDL, writes,
-// optional automatic retention pruning (via sink params.prune), and
-// manual prune helpers. Postgres table schemas must declare PruneColumn.
+// Defines sink interfaces, the sink registry, built-in sinks, and Postgres
+// schema registration helpers.
 //
-// Typical wiring (daemon main.go)
+// feedkit is intentionally domain-agnostic. Domain schemas, domain event kinds,
+// upstream-specific parsing, and daemon lifecycle remain the responsibility of
+// each concrete application.
 //
-// 1. Load config.
-// 2. Register source drivers and build sources from config.Sources.
-// 3. Register sink drivers and build sinks from config.Sinks.
-// 4. Compile routes.
-// 5. Start scheduler (sources -> bus) and dispatcher (bus -> pipeline -> sinks).
-//
-// Sketch:
-//
-//	cfg, _ := config.Load("config.yml")
-//	srcReg := sources.NewRegistry()
-//	// domain registers poll/stream drivers...
-//
-//	var jobs []scheduler.Job
-//	for _, sc := range cfg.Sources {
-//		src, _ := srcReg.BuildInput(sc)
-//		jobs = append(jobs, scheduler.Job{
-//			Source: src,
-//			Every: sc.Every.Duration,
-//		})
-//	}
-//
-//	bus := make(chan event.Event, 256)
-//	s := &scheduler.Scheduler{Jobs: jobs, Out: bus, Logf: logf}
-//	// start dispatcher similarly...
-//
-// # Context and cancellation
-//
-// All blocking work should honor context cancellation:
-// - source polling/streaming I/O
-// - sink consumption
-// - any expensive processor work
+// For repository-level overview and usage narrative, see README.md. For
+// code-level details, each subpackage doc.go is the source of truth for that
+// package's public API surface and optional helpers.
 package feedkit