normalizers: added a structure for normalizers; refactoring sources -> sources+normalizers is still todo.

2026-01-14 10:35:16 -06:00
parent aa4774e0dd
commit efc44e8c6a
7 changed files with 272 additions and 1 deletions

internal/normalizers/doc.go Normal file

@@ -0,0 +1,141 @@
// Package normalizers defines weatherfeeder's **prescriptive** conventions for
// writing feedkit normalizers and provides the recommended project layout.
//
// Summary
// -------
// weatherfeeder ingests multiple upstream providers whose payloads differ.
// Sources should focus on polling/fetching. Normalizers should focus on
// transforming provider-specific raw payloads into canonical internal models.
//
// This package is domain code (weatherfeeder). feedkit's normalize package is
// infrastructure (registry + processor).
//
// Directory layout (required)
// ---------------------------
// Normalizers are organized by provider:
//
// internal/normalizers/<provider>/
//
// Example:
//
// internal/normalizers/nws/observation.go
// internal/normalizers/nws/common.go
// internal/normalizers/openweather/observation.go
// internal/normalizers/openmeteo/observation.go
// internal/normalizers/common/units.go
//
// Rules:
//
// 1. One normalizer per file.
// Each file contains exactly one Normalizer implementation (one type).
//
// 2. Provider-level shared helpers live in:
// internal/normalizers/<provider>/common.go
//
// 3. Cross-provider helpers live in:
// internal/normalizers/common/
//
// 4. Matching is standardized on Event.Schema.
// (Do not match on Event.Source or Event.Kind in weatherfeeder normalizers.)
//
// Schema conventions (required)
// -----------------------------
// Sources emit RAW events with provider-specific schemas.
// Normalizers convert RAW -> CANONICAL schemas.
//
// Raw schemas:
//
// raw.<provider>.<thing>.vN
//
// Canonical schemas:
//
// weather.<kind>.vN
//
// weatherfeeder centralizes schema strings in internal/standards/schemas.go.
// Always use those constants (do not inline schema strings).
//
// Example mappings:
//
// standards.SchemaRawOpenWeatherCurrentV1 -> standards.SchemaWeatherObservationV1
// standards.SchemaRawOpenMeteoCurrentV1 -> standards.SchemaWeatherObservationV1
// standards.SchemaRawNWSObservationV1 -> standards.SchemaWeatherObservationV1
//
// Normalizer structure (required template)
// ----------------------------------------
// Each normalizer file must follow this structure (with helpful comments):
//
//	type OpenWeatherObservationNormalizer struct{}
//
//	func (OpenWeatherObservationNormalizer) Match(e event.Event) bool { ... }
//
//	func (OpenWeatherObservationNormalizer) Normalize(ctx context.Context, in event.Event) (*event.Event, error) {
//		// 1) Decode raw payload (recommended: json.RawMessage)
//		// 2) Parse into provider structs
//		// 3) Map provider -> canonical internal/model types
//		// 4) Build output event (copy input, modify intentionally)
//		// 5) Set EffectiveAt if applicable
//		// 6) Validate out.Validate()
//	}
//
// Required doc comment content
// ----------------------------
// Every normalizer type must have a doc comment that states:
//
// - what it converts (e.g., “OpenWeather current -> WeatherObservation”)
// - which raw schema it matches (constant name + value)
// - which canonical schema it produces (constant name + value)
// - any special caveats (units, day/night inference, missing fields, etc.)
//
// Event field handling (strong defaults)
// --------------------------------------
// Normalizers should treat the incoming event envelope as stable identity and
// should only change fields intentionally.
//
// Default behavior:
//
// - Keep: ID, Kind, Source, EmittedAt
// - Set: Schema to the canonical schema
// - Set: Payload to the canonical payload (internal/model/*)
// - Optional: EffectiveAt (often derived from observation timestamp)
// - Avoid changing Kind unless you have a clear “raw kind vs canonical kind” design.
//
// Always validate the output event:
//
// if err := out.Validate(); err != nil { ... }
//
// Payload representation for RAW events
// -------------------------------------
// weatherfeeder recommends RAW payloads be stored as json.RawMessage for JSON APIs.
// This keeps sources small and defers schema-specific decoding to normalizers.
//
// If a source already decodes into typed provider structs, it can still emit a
// RAW event by re-marshaling to json.RawMessage. Better: skip the decode in the
// source and decode once in the normalizer, keeping “fetch” separate from “transform”.
//
// Registration pattern
// --------------------
// feedkit normalization uses a match-driven registry (“first match wins”).
//
// Provider subpackages should expose:
//
// func Register(reg *normalize.Registry)
//
// And internal/normalizers/builtins.go should provide one entrypoint:
//
// func RegisterBuiltins(reg *normalize.Registry)
//
// which calls each provider's Register() in a stable order.
//
// Testing guidance (recommended)
// ------------------------------
// Add a unit test per normalizer:
//
// internal/normalizers/openweather/observation_test.go
//
// Tests should:
//
// - build a RAW event with Schema=standards.SchemaRaw... and Payload=json.RawMessage
// - run the normalizer
// - assert canonical Schema + key payload fields + EffectiveAt
// - assert out.Validate() passes
package normalizers
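
The six-step template and the event field defaults described in doc.go can be sketched as a single runnable program. This is a minimal illustration under stated assumptions, not the code from this commit: Event, its Validate method, WeatherObservation, the schema string values, and the helper normalizeSample are hypothetical stand-ins for feedkit's event.Event, the internal/model types, and the constants in internal/standards/schemas.go. The payload field names (dt, main.temp) and the Kelvin unit are likewise assumed for the example.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Assumed schema values; the real constants live in internal/standards/schemas.go.
const (
	SchemaRawOpenWeatherCurrentV1 = "raw.openweather.current.v1"
	SchemaWeatherObservationV1    = "weather.observation.v1"
)

// Event is a hypothetical stand-in for feedkit's event.Event.
type Event struct {
	ID, Kind, Source, Schema string
	EmittedAt                time.Time
	EffectiveAt              *time.Time
	Payload                  any
}

// Validate is a simplified stand-in for the real envelope validation.
func (e Event) Validate() error {
	if e.ID == "" || e.Schema == "" || e.Payload == nil {
		return errors.New("event: ID, Schema and Payload are required")
	}
	return nil
}

// WeatherObservation is a hypothetical canonical model.
type WeatherObservation struct {
	TemperatureC float64
	ObservedAt   time.Time
}

// OpenWeatherObservationNormalizer converts OpenWeather current -> WeatherObservation.
// Caveat (assumption for this sketch): the raw temperature is in Kelvin.
type OpenWeatherObservationNormalizer struct{}

func (OpenWeatherObservationNormalizer) Match(e Event) bool {
	return e.Schema == SchemaRawOpenWeatherCurrentV1
}

func (OpenWeatherObservationNormalizer) Normalize(_ context.Context, in Event) (*Event, error) {
	// 1) Decode raw payload.
	raw, ok := in.Payload.(json.RawMessage)
	if !ok {
		return nil, fmt.Errorf("payload: want json.RawMessage, got %T", in.Payload)
	}
	// 2) Parse into a provider struct (field names assumed).
	var p struct {
		Dt   int64 `json:"dt"`
		Main struct {
			Temp float64 `json:"temp"`
		} `json:"main"`
	}
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, fmt.Errorf("decode openweather payload: %w", err)
	}
	// 3) Map provider -> canonical model.
	observedAt := time.Unix(p.Dt, 0).UTC()
	obs := WeatherObservation{TemperatureC: p.Main.Temp - 273.15, ObservedAt: observedAt}
	// 4) Copy the input envelope and change only Schema and Payload.
	out := in
	out.Schema = SchemaWeatherObservationV1
	out.Payload = obs
	// 5) Set EffectiveAt from the observation timestamp.
	out.EffectiveAt = &observedAt
	// 6) Validate before returning.
	if err := out.Validate(); err != nil {
		return nil, err
	}
	return &out, nil
}

// normalizeSample builds one RAW event and runs the normalizer on it.
func normalizeSample() (*Event, error) {
	in := Event{
		ID: "evt-1", Kind: "observation", Source: "openweather",
		Schema:    SchemaRawOpenWeatherCurrentV1,
		EmittedAt: time.Unix(1700000100, 0).UTC(),
		Payload:   json.RawMessage(`{"dt":1700000000,"main":{"temp":293.15}}`),
	}
	return OpenWeatherObservationNormalizer{}.Normalize(context.Background(), in)
}

func main() {
	out, err := normalizeSample()
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Schema, out.EffectiveAt.Format(time.RFC3339))
}
```

Copying the input event by value and overwriting only Schema, Payload, and EffectiveAt keeps ID, Kind, Source, and EmittedAt intact without listing them, which is one way to honor the "change fields intentionally" default.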