# feedkit

**feedkit** provides the domain-agnostic core plumbing for *feed-processing daemons*.

A feed daemon is a long-running process that:

- polls one or more upstream providers (HTTP APIs, RSS feeds, etc.)
- normalizes upstream data into a consistent internal representation
- applies lightweight policy (dedupe, rate-limit, filtering)
- emits events to one or more sinks (stdout, files, databases, brokers)

feedkit is designed to be reused by many concrete daemons (e.g. `weatherfeeder`, `newsfeeder`, `rssfeeder`) without embedding *any* domain-specific logic.

---

## Philosophy

feedkit is **not a framework**.

It does **not**:

- define domain schemas
- enforce allowed event kinds
- hide control flow behind inversion-of-control magic
- own your application lifecycle

Instead, it provides **small, composable primitives** that concrete daemons wire together explicitly.

The goal is clarity, predictability, and long-term maintainability.

---

## Conceptual pipeline

Collect → Normalize → Filter / Policy → Route → Persist / Emit

In feedkit terms:

| Stage     | Package(s)                                          |
|-----------|-----------------------------------------------------|
| Collect   | `sources`, `scheduler`                              |
| Normalize | *(today: domain code; planned: pipeline processor)* |
| Policy    | `pipeline`                                          |
| Route     | `dispatch`                                          |
| Emit      | `sinks`                                             |
| Configure | `config`                                            |

---

## Public API overview

### `config`: Configuration loading & validation

**Status:** 🟢 Stable

- Loads YAML configuration
- Strict decoding (`KnownFields(true)`)
- Domain-agnostic validation only (shape, required fields, references)
- Flexible `Params map[string]any` with typed helpers

Key types:

- `config.Config`
- `config.SourceConfig`
- `config.SinkConfig`
- `config.Load(path)`

---

### `event`: Domain-agnostic event envelope

**Status:** 🟢 Stable

Defines the canonical event structure that moves through feedkit.


Includes:

- Stable ID
- Kind (stringly-typed, domain-defined)
- Source name
- Timestamps (`EmittedAt`, optional `EffectiveAt`)
- Optional `Schema` for payload versioning
- Opaque `Payload`

Key types:

- `event.Event`
- `event.Kind`
- `event.ParseKind`
- `event.Event.Validate`

feedkit infrastructure never inspects `Payload`.

---

### `sources`: Polling abstraction

**Status:** 🟢 Stable (interface); 🔵 evolving patterns

Defines the contract implemented by domain-specific polling jobs.

```go
type Source interface {
	Name() string
	Kind() event.Kind
	Poll(ctx context.Context) ([]event.Event, error)
}
```

Includes a registry (`sources.Registry`) so daemons can register drivers (e.g. `openweather_observation`, `rss_feed`) without switch statements.

Note: Today, most sources both fetch and normalize. A dedicated normalization hook is planned (see below).

### `scheduler`: Time-based polling

**Status:** 🟢 Stable

Runs one goroutine per source on a configured interval with jitter.

Features:

- Per-source interval
- Deterministic jitter (avoids thundering herd)
- Immediate poll at startup
- Context-aware shutdown

Key types:

- `scheduler.Scheduler`
- `scheduler.Job`

### `pipeline`: Event processing chain

**Status:** 🟡 Partial (API stable, processors evolving)

Allows events to be transformed, dropped, or rejected between collection and dispatch.

```go
type Processor interface {
	Process(ctx context.Context, in event.Event) (*event.Event, error)
}
```

Current state:

- `pipeline.Pipeline` is fully implemented

Placeholder files exist for:

- `dedupe` (planned)
- `ratelimit` (planned)

This is the intended home for:

- normalization
- deduplication
- rate limiting
- lightweight policy enforcement

### `dispatch`: Routing & fan-out

**Status:** 🟢 Stable

Routes events to sinks based on kind and isolates slow sinks.


Features:

- Compiled routing rules
- Per-sink buffered queues
- Bounded enqueue timeouts
- Per-consume timeouts
- Sink panic isolation
- Context-aware shutdown

Key types:

- `dispatch.Dispatcher`
- `dispatch.Route`
- `dispatch.Fanout`

### `sinks`: Output adapters

**Status:** 🟡 Mixed

Defines where events go after processing.

```go
type Sink interface {
	Name() string
	Consume(ctx context.Context, e event.Event) error
}
```

Registry-based construction allows daemons to opt into any sink drivers.

| Sink       | Status         |
|------------|----------------|
| `stdout`   | 🟢 Implemented |
| `nats`     | 🟢 Implemented |
| `file`     | 🔴 Stub        |
| `postgres` | 🔴 Stub        |

All sinks are required to respect context cancellation.

### Normalization (planned)

**Status:** 🔵 Planned (API design in progress)

Currently, most domain implementations normalize upstream data inside `sources.Source.Poll`, which leads to:

- very large source files
- mixed responsibilities (HTTP + mapping)
- duplicated helper code

The intended evolution is:

- Sources emit raw events (e.g. `json.RawMessage`)
- A dedicated normalization processor runs in the pipeline
- Normalizers are selected by `Event.Schema`, `Kind`, or `Source`

This keeps:

- `feedkit` domain-agnostic
- `sources` small and focused
- normalization logic centralized and testable

### Runner helper (planned)

**Status:** 🔵 Planned (optional convenience)

Most daemons wire together the same steps:

- load config
- build sources
- build sinks
- compile routes
- start scheduler
- start dispatcher

A small, opt-in `Runner` helper may be added to reduce boilerplate while keeping the system explicit and debuggable.

This is not intended to become a framework.

## Stability summary

| Area                       | Status     |
|----------------------------|------------|
| Event model                | 🟢 Stable  |
| Config API                 | 🟢 Stable  |
| Scheduler                  | 🟢 Stable  |
| Dispatcher                 | 🟢 Stable  |
| Source interface           | 🟢 Stable  |
| Pipeline core              | 🟡 Partial |
| Normalization              | 🔵 Planned |
| Dedupe/Ratelimit           | 🔵 Planned |
| `file` and `postgres` sinks | 🔴 Stub    |

Legend:

- 🟢 Stable: API considered solid
- 🟡 Partial: usable, but incomplete
- 🔵 Planned: design direction agreed, not yet implemented
- 🔴 Stub: placeholder only

## Non-goals

`feedkit` intentionally does not:

- define domain payload schemas
- enforce domain-specific validation
- manage persistence semantics beyond sink adapters
- own observability, metrics, or tracing (left to daemons)

Those concerns belong in concrete implementations.

## See also

- `NAMING.md`: repository and daemon naming conventions
- `event/doc.go`: detailed event semantics
- **Concrete example:** weatherfeeder (reference implementation)

---