Cleaned up documentation and removed stubs and TODOs throughout the application

2026-03-28 13:02:37 -05:00
parent 3ef93faf69
commit 3281368922
18 changed files with 403 additions and 345 deletions
--- a/README.md
+++ b/README.md
@@ -1,127 +1,92 @@
 # feedkit

-`feedkit` provides domain-agnostic plumbing for feed-processing daemons.
+`feedkit` is a small Go toolkit for building feed-processing daemons.

-A daemon built on feedkit typically:
-- ingests upstream input (polling APIs or consuming streams)
+It gives you the reusable plumbing around collection, processing, routing, and
+emission, while leaving domain concepts, schemas, and application wiring in
+your daemon. The intended shape is a family of sibling applications such as
+`weatherfeeder`, `newsfeeder`, or `earthquakefeeder` that all share the same
+infrastructure patterns without sharing domain logic.
+
+## What It Does
+
+A daemon built on `feedkit` typically:
+- ingests upstream input by polling HTTP APIs or consuming streams
 - emits domain-agnostic `event.Event` values
-- applies optional processing (normalization, dedupe, policy)
-- routes events to sinks (stdout, NATS, files, databases, etc.)
+- optionally processes those events with stages like dedupe or normalization
+- routes events to one or more sinks such as stdout, NATS, or Postgres
+
+Conceptually, the pipeline is:
+
+`Collect -> Process -> Route -> Emit`

 ## Philosophy

-feedkit is not a framework. It provides small composable packages and leaves
-lifecycle, domain schemas, and domain-specific validation in your daemon.
+`feedkit` is intentionally not a framework.

-## Conceptual pipeline
+It does not try to own:
+- your domain payload schemas
+- your domain event kinds
+- your daemon lifecycle or `main.go`
+- your observability stack or deployment model

-Collect -> Process (optional stages, including dedupe + normalize) -> Route -> Emit
+Instead, it provides small composable packages that are easy to wire together in
+different daemons.

-| Stage | Package(s) |
-|---|---|
-| Collect | `sources`, `scheduler` |
-| Process | `pipeline`, `processors`, `processors/dedupe`, `processors/normalize` (optional stages) |
-| Route | `dispatch` |
-| Emit | `sinks` |
-| Configure | `config` |
+## When To Use It

-## Core packages
+`feedkit` is a good fit when you want:
+- multiple small ingestion daemons with shared infrastructure patterns
+- clear separation between raw upstream payloads and normalized canonical models
+- reusable routing and sink behavior across domains
+- strong config and event-envelope conventions without centralizing domain rules

-### `config`
+It is a poor fit if you want a monolithic framework that dictates application
+structure end-to-end.

-Loads YAML config with strict decoding and domain-agnostic validation.
+## Built-In Capabilities

-`SourceConfig` supports both source modes:
-- `mode: poll` requires `every`
-- `mode: stream` forbids `every`
-- omitted `mode` means auto (inferred from the registered driver type)
+`feedkit` currently includes:
+- strict YAML config loading and validation
+- polling and streaming source abstractions
+- scheduler orchestration for configured sources
+- optional pipeline processors
+- built-in dedupe and normalization processors
+- route compilation and sink fanout
+- built-in sinks for `stdout`, `nats`, and `postgres`

-It also supports optional expected source kinds:
-- `kinds: ["observation", "alert"]` (preferred)
-- `kind: "observation"` (legacy fallback)
+The Postgres sink is intentionally split between feedkit-owned infrastructure
+and daemon-owned schema mapping. `feedkit` manages connection setup, DDL,
+writes, and pruning; downstream applications define the schema and event mapper.

-### `event`
+## Typical Wiring

-Defines the domain-agnostic event envelope (`event.Event`) used across the system.
-
-### `sources`
-
-Defines source interfaces and driver registry:
-
-```go
-type Input interface {
-    Name() string
-}
-
-type PollSource interface {
-    Input
-    Poll(ctx context.Context) ([]event.Event, error)
-}
-
-type StreamSource interface {
-    Input
-    Run(ctx context.Context, out chan<- event.Event) error
-}
-```
-
-Notes:
-- a poll can emit `0..N` events
-- stream sources emit events continuously
-- a single source may emit multiple event kinds
-- driver implementations live in downstream daemons and are registered via `sources.Registry`
-
-### `scheduler`
-
-Runs one goroutine per source job:
-- poll sources: cadence driven (`every` + jitter)
-- stream sources: continuous run loop
-
-### `pipeline`
-
-Optional processing chain between collection and dispatch.
-Processors can transform, drop, or reject events.
-
-### `processors`
-
-Defines the generic processor interface and a named-driver registry used by
-daemons to build ordered processor chains.
-
-### `processors/dedupe`
-
-Built-in in-memory LRU dedupe processor that drops repeated events by `Event.ID`.
-
-### `processors/normalize`
-
-Concrete normalization processor implementation. Typical use: sources emit raw
-payload events, then a normalize stage maps them to canonical schemas.
-
-### `dispatch`
-
-Compiles routes and fans out events to sinks with per-sink queue/worker isolation.
-
-### `sinks`
-
-Defines sink interface and sink registry. Built-ins include:
-- `stdout`
-- `nats`
-- `postgres`
-
-Detailed Postgres configuration and wiring examples live in package docs:
-`sinks/doc.go`.
-
-## Typical wiring
+At a high level, a daemon built on `feedkit` does this:

 1. Load config.
-2. Register/build sources from `cfg.Sources`.
-3. Register/build sinks from `cfg.Sinks`.
-4. Compile routes.
-5. Start scheduler (`sources -> bus`).
-6. Start dispatcher (`bus -> pipeline -> sinks`).
+2. Register domain-specific source drivers.
+3. Register built-in and/or custom sinks.
+4. Build sources, sinks, and optional processor chain from config.
+5. Compile routes.
+6. Start the scheduler and dispatcher.

-## Non-goals
+The package docs are the better source of truth for code-level details. In
+particular, each subpackage `doc.go` describes its external API surface and any
+optional helper APIs in `helpers.go`.

-feedkit intentionally does not:
-- define domain payload schemas
-- enforce domain-specific event kinds
-- own application lifecycle
-- prescribe observability stack choices
+## Package Layout
+
+The major packages are:
+- `config`: config loading and validation
+- `event`: the domain-agnostic event envelope
+- `sources`: source interfaces and reusable source helpers
+- `scheduler`: source execution and cadence management
+- `processors`: processor interfaces and registry
+- `processors/dedupe`: built-in in-memory dedupe processor
+- `processors/normalize`: built-in normalization processor and helpers
+- `pipeline`: optional processor chain
+- `dispatch`: route compilation and fanout
+- `sinks`: sink interfaces, built-ins, and Postgres registration helpers
+
+The root package docs in `doc.go` provide a concise package-by-package map for
+Go documentation consumers.