Add foundational documentation templates to support product design and architecture planning, including ADR, archetypes, LLM systems, dev setup, and shared modules.

This commit is contained in:
olekhondera
2025-12-12 02:31:03 +02:00
parent 5053235e95
commit c905cbb725
26 changed files with 759 additions and 65 deletions

DOCS.md — 27 changes

@@ -14,20 +14,33 @@ Technical index for developers and AI agents. Use this as the entry point to all
## Documentation Structure (root: `docs/`)
### 1) General Project Docs
- `docs/archetypes.md` — product archetypes and optional modules; pick what applies.
- `docs/project-overview.md` — goals, target users, chosen archetype, key flows (pick relevant), non-functional requirements.
- `docs/phases-plan.md` — Phase 0–4 plan with tasks across product, frontend, backend, infra, data/LLM.
- `docs/content-structure.md` — high-level UI/content structure and page sections (archetype-specific examples).
- `docs/dev-setup.md` — local/CI setup once code exists.
### 1c) Architecture Decision Records (`/docs/adr`)
- `docs/adr/README.md` — how ADRs work.
- `docs/adr/0000-template.md` — ADR template.
### 1b) LLM / AI System (`/docs/llm`)
- `docs/llm/prompting.md` — prompt structure, versioning, output schemas.
- `docs/llm/evals.md` — evaluation strategy, datasets, regression gates.
- `docs/llm/safety.md` — privacy, injection defenses, `reasoning_trace` policy.
- `docs/llm/caching-costs.md` — caching layers, budgets, monitoring.
- `docs/llm/rag-embeddings.md` — RAG/embeddings design and evaluation.
### 2) Frontend (`/docs/frontend`)
- `docs/frontend/overview.md` — frontend scope and key user journeys (depends on archetype/modules).
- `docs/frontend/architecture.md` — **canonical, locked frontend decisions** (after Phase 1).
- `docs/frontend/FRONTEND_ARCHITECTURE_PLAN.md` — working architecture notes; **archive/delete after Phase 1**.
- `docs/frontend/ui-ux-guidelines.md` — UX/UI guidance for data review and approval flows.
- `docs/frontend/seo-performance.md` — performance and SEO recommendations.
### 3) Backend (`/docs/backend`)
- `docs/backend/overview.md` — backend scope (integrations, AI capability, optional pipelines/approvals/billing).
- `docs/backend/architecture.md` — **canonical, locked backend decisions** (after Phase 1).
- `docs/backend/api-design.md` — API resources and conventions (entities, rules, approvals, reports, billing, events).
- `docs/backend/security.md` — authN/Z, secret handling, webhook verification, audit/event logging.
- `docs/backend/payment-flow.md` — payment integration (provider-agnostic template; single source of truth for payment flows and webhooks).


@@ -10,7 +10,7 @@
- Base documentation structure (`/docs`) and navigation index.
- `agents/` folder with example agent profiles for different roles (frontend/backend/security/test/codereview/promptengineer) and shared rules (`RULES.md`).
- Recommendations on frontend/backend architecture, security, API design, and payments (provider-agnostic).
- Example modules and practices (optional): multi-tenancy, integrations/ingestion, background job queues, event log, explainability (`reasoning_trace`), billing — **keep only what your product needs**.
- Suggested stack: Next.js (TypeScript, Tailwind, React Query/SWR) on the frontend; Node.js + Express/Fastify, Prisma/Drizzle, Postgres (+ optional `pgvector`) on the backend; Docker for deployment.
---
@@ -26,7 +26,12 @@
## Documentation Navigation
- **General docs (`/docs`):**
  [`docs/archetypes.md`](docs/archetypes.md) — product archetypes and optional modules;
  [`docs/project-overview.md`](docs/project-overview.md) — project overview (template);
  [`docs/phases-plan.md`](docs/phases-plan.md) — phase-by-phase plan;
  [`docs/content-structure.md`](docs/content-structure.md) — content/page structure.
  [`docs/adr/README.md`](docs/adr/README.md) — how to record architecture decisions (ADRs);
  [`docs/dev-setup.md`](docs/dev-setup.md) — dev setup and commands (once code exists).
- **Frontend (`/docs/frontend`)** — overview, architecture (feature-first), UX/UI guides, SEO/performance.
- **Backend (`/docs/backend`)** — overview, architecture (modular monolith), API design, security, payment flow (provider-agnostic), events and webhooks.
- **Indexes/rules:** [`DOCS.md`](DOCS.md) — documentation index; [`RULES.md`](RULES.md) and the [`agents/`](agents/) folder — agent rules and profiles.
@@ -34,7 +39,7 @@
---
## How to Use the Template
- **Quick start:** start with `docs/archetypes.md`, choose an archetype and set of modules, then fill in `docs/project-overview.md` and `docs/phases-plan.md`.
- **Technical design:** use `docs/frontend/architecture.md` and `docs/backend/architecture.md`; for API and security — `docs/backend/api-design.md` and `docs/backend/security.md`; for payments — `docs/backend/payment-flow.md` (as an example/skeleton).
- **Working with agents:** check `RULES.md` before a task; choose an agent based on the profile descriptions in `agents/` (see the selection protocol in `RULES.md`).
- **Making changes:** update the documents as decisions are made; add files under `docs/` for new subsystems (English preferred).


@@ -9,6 +9,8 @@ Update this file whenever you make a decision that changes the stack, scope, con
## 1. Project Context (fill in)
- **Domain / product type:** _[e.g., customer support AI, compliance assistant, content classifier]_
- **Chosen archetype:** _[A–E from `/docs/archetypes.md`]_
- **Modules in scope:** _[core AI, integrations, pipeline, human feedback, reporting, billing, tenancy, events, queues]_
- **Primary users:** _[roles/personas]_
- **Key success metrics:** _[latency, accuracy, cost, adoption, etc.]_
- **Current phase:** _Phase 0 / 1 / 2 / 3 / 4_
@@ -57,3 +59,16 @@ List any intentional departures from `docs/` recommendations and why.
## 6. Change Log (optional)
- _YYYY-MM-DD_: _[decision summary]_
## 7. Architecture Decision Records (ADRs)
Record major decisions and their rationale in `docs/adr/`.
Example ADRs you may need:
- _ADR: modular monolith vs microservices_
- _ADR: vector storage choice (pgvector vs external)_
- _ADR: billing model/provider_
- _ADR: RAG strategy and retrieval stack_
- _ADR: auth/tenancy model_
Once accepted, link ADRs here:
- `docs/adr/0001-<title>.md` — _accepted_

apps/api/README.md — new file, 5 lines

@@ -0,0 +1,5 @@
# apps/api
Placeholder for the backend API/service (e.g., Node.js + TypeScript).
Scaffold code here in Phase 2.

apps/web/README.md — new file, 5 lines

@@ -0,0 +1,5 @@
# apps/web
Placeholder for the frontend application (e.g., Next.js).
Scaffold code here in Phase 2.

docs/adr/0000-template.md — new file, 32 lines

@@ -0,0 +1,32 @@
# ADR-0000: <Decision Title>
---
**Status:** proposed | accepted | superseded | rejected
**Date:** YYYY-MM-DD
**Owners:** <names/roles>
**Related:** <links to other ADRs / docs>
---
## Context
What problem are we solving?
What constraints matter (scale, compliance, team, timeline, cost)?
## Decision
State the decision clearly and specifically.
## Alternatives Considered
List viable alternatives with short pros/cons.
## Consequences
### Positive
-
### Negative / Risks
-
## Migration / Rollback Plan
If we change our mind later, how do we move safely?
## References
- Links to relevant docs, benchmarks, vendor docs, or prototypes.

docs/adr/README.md — new file, 44 lines

@@ -0,0 +1,44 @@
# Architecture Decision Records (ADR)
ADRs are short, versioned documents that capture **why** we made a major architectural/product decision.
They prevent “tribal knowledge” and make reversals/migrations explicit.
---
**Last Updated:** 2025-12-12
**Phase:** Phase 0 (Planning)
**Status:** Draft — adopt in Phase 1
**Owner:** Tech Leads
---
## When to write an ADR
Create an ADR whenever a decision:
- is hard to reverse later,
- affects multiple modules/teams,
- has meaningful trade-offs,
- changes default recommendations in `/docs`.
Examples:
- modular monolith vs microservices,
- Postgres + `pgvector` vs external vector DB,
- billing model (subscription vs usage),
- RAG vs non-RAG architecture,
- auth provider choice.
## Where ADRs live
One file per decision in `docs/adr/`.
Naming convention:
- `0001-short-title.md`
- `0002-another-decision.md`
Status lifecycle:
`proposed` → `accepted` → (`superseded` | `rejected`)
## How ADRs relate to `RECOMMENDATIONS.md`
- `RECOMMENDATIONS.md` is the **current snapshot** of locked decisions and constraints.
- ADRs are the **history and rationale** behind those decisions.
Link every ADR from `RECOMMENDATIONS.md` once accepted.
## Template
Start from `docs/adr/0000-template.md`.

docs/archetypes.md — new file, 86 lines

@@ -0,0 +1,86 @@
# Product Archetypes & Optional Modules (Starter Template)
This template is intentionally **modular**. It contains a set of reusable building blocks and an example “pipeline-based AI SaaS” path.
To keep it universal, **choose an archetype first**, then keep only the modules that apply to your product.
---
**Last Updated:** 2025-12-11
**Phase:** Phase 0 (Planning)
**Status:** Draft
**Owner:** Product + Tech Leads
---
## 1. Choose a Product Archetype
Pick the closest archetype and adapt terminology/screens/modules accordingly.
### A) Chat-first Assistant
Typical products: customer support assistant, internal copilot, personal agent.
- Primary surface: chat / conversational UI, or API chat endpoint.
- Data: conversation history + optional knowledge/RAG sources.
- Feedback: thumbs up/down, corrections, or escalation to humans.
- Optional modules: multi-tenancy, integrations, billing, audit/events.
### B) Workflow / Generation Tool
Typical products: content generation studio, design assistant, coding tool.
- Primary surface: forms + iterative generation, drafts, versions.
- Data: artifacts (drafts, assets), prompts, evaluations.
- Feedback: edit-and-retry loops, version diffing, human review optional.
- Optional modules: billing, audit/events, integrations for input/output.
### C) Classification / Decision Support Pipeline
Typical products: document labeling, risk scoring, compliance triage.
- Primary surface: list/review UI for items; batch processing.
- Data: items/records, rules, embeddings, traces.
- Feedback: approvals/overrides, rule creation.
- Optional modules: multi-tenancy, billing, integrations, events feed.
### D) Agentic Automation / Orchestration
Typical products: multi-step agents, task executors, ops automation.
- Primary surface: tasks/jobs + progress, optional chat.
- Data: tasks, tools/integrations, execution logs, policies.
- Feedback: human checkpoints, rollback, audit.
- Optional modules: queues/workers, events feed, multi-tenancy, billing.
### E) Internal / One-off AI Tool
Typical products: internal dashboards, research tools, prototypes.
- Primary surface: lightweight UI or scripts.
- Data: domain-specific, usually single-tenant.
- Feedback: simple review loops.
- Optional modules: keep only what you truly need.
## 2. Optional Modules (Mix & Match)
These are reusable “lego blocks”. Rename entities to your domain.
1. **Core AI capability**
- Chat, generation, classification, RAG, or automation.
2. **Integrations / Ingestion**
- OAuth2, webhooks, file uploads, connector SDKs.
3. **Processing pipeline**
- Rules → embeddings → LLM fallback (or any staged flow).
4. **Human feedback loop**
- Approvals, edits, corrections, escalation.
5. **Artifacts / Items store**
- Messages, drafts, tasks, records — whatever your domain produces.
6. **Events / Audit log**
- `EventLog`, trace IDs, optional `reasoning_trace`.
7. **Multi-tenancy & RBAC**
- Tenants, roles, isolation (optional for single-tenant/internal tools).
8. **Billing**
- Provider-hosted subscriptions/usage (optional).
9. **Reporting / Analytics**
- Dashboards, exports, evaluation metrics.
10. **Queues / Workers**
- For long-running jobs, retries, backoff (optional).
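The staged flow named in module 3 (rules → embeddings → LLM fallback) can be sketched as follows. This is a minimal illustration, not part of the template: the rule set, the word-overlap stand-in for embedding similarity, the threshold, and the `llm` callback are all assumptions.

```typescript
type Stage = "rules" | "embeddings" | "llm";

interface Outcome {
  label: string;
  sourceAgent: Stage; // mirrors the template's `source_agent` logging idea
  trace: string[];    // mirrors the optional `reasoning_trace`
}

// Illustrative deterministic rules — the cheapest stage runs first.
const rules = [
  { name: "invoice-keyword", label: "billing", match: (t: string) => t.toLowerCase().includes("invoice") },
];

// Stand-in for embedding similarity: word overlap against labeled exemplars.
function nearestExemplar(text: string): { label: string; score: number } {
  const exemplars = [{ label: "support", text: "help with my account" }];
  const words = new Set(text.toLowerCase().split(/\s+/));
  let best = { label: "unknown", score: 0 };
  for (const e of exemplars) {
    const score = e.text.toLowerCase().split(/\s+/).filter((w) => words.has(w)).length;
    if (score > best.score) best = { label: e.label, score };
  }
  return best;
}

function classify(text: string, llm: (t: string) => string): Outcome {
  const trace: string[] = [];
  for (const r of rules) {
    if (r.match(text)) {
      trace.push(`rule hit: ${r.name}`);
      return { label: r.label, sourceAgent: "rules", trace };
    }
  }
  const near = nearestExemplar(text);
  if (near.score >= 2) { // hypothetical similarity threshold
    trace.push(`embedding match: ${near.label} (${near.score})`);
    return { label: near.label, sourceAgent: "embeddings", trace };
  }
  trace.push("fell through to LLM");
  return { label: llm(text), sourceAgent: "llm", trace };
}
```

Each stage records why it fired into the trace, so a later human-review UI can surface the rule hit or similarity score alongside the result.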
## 3. How to Adapt the Template Quickly
1. In `RECOMMENDATIONS.md`, write:
- chosen archetype (A–E),
- which modules you keep/remove,
- locked stack decisions and constraints.
2. Rename “records/rules/approvals” to your domain terms.
3. Delete or archive irrelevant docs/modules early to reduce drift.
4. Update `DOCS.md` navigation and phase status after decisions.


@@ -16,6 +16,8 @@
- Tenant-scoped resources; role-based authorization.
- Idempotent ingestion/webhook endpoints; trace IDs for debugging.
> Resource set is archetype-specific. Endpoints below are a **pipeline/classification example** — adapt for chat-first, generation, or automation products.
## 2. Core Resources (high-level)
- `/auth` — login, tenant context, token refresh.
- `/tenants` — tenant profile, roles, invites.


@@ -13,6 +13,9 @@
---
> Recommendations for Phase 0. Lock decisions in Phase 1.
> After Phase 1, this file is the **canonical record of locked backend architecture decisions**.
> Keep exploratory notes in a separate `*_PLAN.md` (if you use one) and archive/delete it after Phase 1.
> The module list below reflects a pipeline/classification archetype. Keep/rename/omit modules per `/docs/archetypes.md`.
## 1. Approach & Stack
- Style: modular monolith with clear modules; containerized.


@@ -12,22 +12,22 @@
---
## 1. Role of Backend
- Own business logic for integrations, AI capability (chat/generation/pipelines/automation), optional human feedback loops, reporting, billing, and audit.
- Integrate safely with external providers (OAuth2/webhooks, payment provider, LLM provider) and expose consistent APIs + events.
- Enforce security appropriate to your archetype (single- or multi-tenant), webhook verification, and event/audit logging.
## 2. Main Domain Areas
- **Auth & Tenancy (optional):** users, roles, tenant isolation if needed.
- **Integrations / Ingestion (optional):** OAuth2/webhooks/files; connection health.
- **Core AI Module:** chat, generation, classification, RAG, or agentic automation.
- **Processing Pipeline (optional):** staged evaluation (rules/embeddings/LLM); `reasoning_trace` JSONB if used.
- **Human Feedback Loop (optional):** approvals/edits/ratings/escalations; audit trail.
- **Reporting & Exports (optional):** dashboards/summaries with history.
- **Billing (optional):** provider-hosted subscriptions/usage, webhooks.
- **Events / Audit:** `/api/events` feed for observability and downstream agents.
## 3. Integrations
- **External data providers:** OAuth2 + webhooks; signatures/verification; idempotent writes via workers.
- **Payment provider:** subscriptions, checkout/portal; webhooks for lifecycle events.
- **LLM provider:** chosen LLM API via a single helper; configurable model/params.
- **Queues:** BullMQ (Redis) for ingestion/categorization/notifications.
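The "single helper" idea for the LLM provider might look like the sketch below: every call funnels through one function that redacts, logs, and applies configurable parameters. The `Transport` signature, `SECRET_PATTERN`, and the log shape are assumptions for illustration, not a real SDK.

```typescript
interface LLMOptions {
  model: string;
  temperature: number;
}

// Abstract over the actual provider SDK/HTTP call.
type Transport = (prompt: string, opts: LLMOptions) => string;

// Hypothetical redaction rules: card-like digit runs and "sk-..." style keys.
const SECRET_PATTERN = /\b\d{13,16}\b|\bsk-[A-Za-z0-9]+/g;

function redact(text: string): string {
  return text.replace(SECRET_PATTERN, "[REDACTED]");
}

function makeLlmHelper(
  transport: Transport,
  defaults: LLMOptions,
  log: (event: Record<string, unknown>) => void,
) {
  return (prompt: string, overrides: Partial<LLMOptions> = {}): string => {
    const safePrompt = redact(prompt);          // centralized redaction
    const opts = { ...defaults, ...overrides }; // centralized parameter control
    log({ event: "llm_call", model: opts.model, promptChars: safePrompt.length });
    return transport(safePrompt, opts);
  };
}
```

Because redaction and logging live in one place, no call site can accidentally bypass them — which is the point of the pattern.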


@@ -40,6 +40,27 @@
## 6. LLM Safety
- All LLM calls go through a single helper; centralize redaction, logging, and parameter control.
- Strip/obfuscate sensitive fields before sending to LLM; log only references in traces.
- Detailed LLM safety and `reasoning_trace` policy live in `/docs/llm/safety.md`.
### 6.1 AI-Specific Threats & Controls (summary)
These apply to any archetype that uses LLMs or RAG.
- **Prompt injection / jailbreak**
- Treat all user input and retrieved content as **untrusted**.
- Delimit untrusted blocks explicitly and never allow them to override system constraints.
- Detect injection patterns; on suspicion → refuse or route to human review.
- **Outbound-data policy**
- Use **allowlists** for what may be sent to the model.
- Mandatory redaction pipeline before every LLM call (PII/PHI/PCI/secrets).
- Never send cross-tenant data; never send raw billing/auth secrets.
- **Output validation**
- Validate model outputs against strict schemas (types, enums, bounds).
- Reject/repair invalid outputs; fall back to safe defaults or human checkpoints for high-risk actions.
- For agentic tools: validate tool arguments and enforce per-tool scopes.
- **Trusted vs untrusted context (RAG)**
- Retrieved documents are untrusted unless curated.
- Keep retrieval tenant-scoped; record only doc IDs in traces.
- If grounding is required and context is insufficient → ask user or defer.
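As one concrete instance of the output-validation control above, a hand-rolled type guard can reject any model output that falls outside a strict schema before the system acts on it. The `label`/`confidence` fields are hypothetical; substitute your domain's schema (or a schema library) as appropriate.

```typescript
interface Classification {
  label: "approve" | "reject" | "needs_review";
  confidence: number; // must be within [0, 1]
}

// Returns null for anything out-of-schema; callers then repair, use a safe
// default, or route to a human checkpoint.
function parseModelOutput(raw: string): Classification | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not even valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const allowedLabels = ["approve", "reject", "needs_review"];
  if (typeof d.label !== "string" || !allowedLabels.includes(d.label)) return null;
  if (typeof d.confidence !== "number" || d.confidence < 0 || d.confidence > 1) return null;
  return { label: d.label as Classification["label"], confidence: d.confidence };
}
```

The enum check is what blocks a prompt-injected "label" from triggering an action the system never defined.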
## 7. Audit & Events
- Log domain events to `EventLog` with `source_agent`; include user ID, tenant, timestamps, and relevant context.
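One possible shape for `EventLog` entries with the fields named above (`source_agent`, user ID, tenant, timestamps), plus an append helper. All names are illustrative; a real system would persist to a table and use UUIDs.

```typescript
interface EventLogEntry {
  id: string;
  tenantId: string | null; // null for single-tenant/internal tools
  userId: string | null;
  sourceAgent: string;     // maps to the `source_agent` field above
  type: string;            // e.g. "record.approved" (hypothetical event name)
  context: Record<string, unknown>; // references/IDs only, never raw sensitive data
  createdAt: string;       // ISO timestamp
}

function appendEvent(
  log: EventLogEntry[],
  entry: Omit<EventLogEntry, "id" | "createdAt">,
): EventLogEntry {
  const full: EventLogEntry = {
    ...entry,
    id: `evt_${log.length + 1}`, // real systems would use UUIDs
    createdAt: new Date().toISOString(),
  };
  log.push(full);
  return full;
}
```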


@@ -13,6 +13,8 @@
Structure of key screens/flows for a generic AI-assisted product. Use this as a template and adapt to your domain.
> Screen sets depend on the chosen archetype. The sections below illustrate a **pipeline/classification** example; for chat-first or generation tools, replace with your own flow map (see `/docs/archetypes.md`).
## General Principles
- Web-first (desktop + mobile web); clear, concise copy for target users in your domain.
- Emphasize trust: audit trail, optional reasoning traces, secure webhooks, role-based access, provider-hosted billing.
@@ -31,17 +33,17 @@ Structure of key screens/flows for a generic AI-assisted product. Use this as
- Connect data sources: OAuth2 to external providers, webhook URL display.
- Confirm webhooks/health; show status and retry guidance.
## 3. Items / Artifacts & AI Processing (example)
- Items list with filters (date, source, status).
- Detail view: raw fields, matched rule/embedding score (if used), LLM `reasoning_trace`, history, attachments.
- Bulk actions: reprocess/regenerate, send to review, archive (if allowed).
## 4. Human Feedback Loop (optional example)
- Review/approval queue: items requiring human input; assign/reassign; batch actions.
- Override/edit outcome with optional rule creation (scope, conditions, action); preview impact/safety notes.
- Event log snippet (who/what/when, `source_agent`).
## 5. Rules / Policies Management (optional example)
- Rules list (name, priority, hit rate, last updated, enabled/disabled).
- Rule editor: conditions (source fields, ranges, text, embedding similarity tag), actions (labels/categories/tags), evaluation order.
- Test/preview on a sample set before saving.

docs/dev-setup.md — new file, 62 lines

@@ -0,0 +1,62 @@
# Development Setup (Starter Template)
---
**Last Updated:** 2025-12-12
**Phase:** Phase 0 (Planning)
**Status:** Draft — finalize when code starts
**Owner:** Tech Leads
**References:**
- `/RECOMMENDATIONS.md`
- `/RULES.md`
---
This file describes how to run the project locally and in CI once code exists.
Lock concrete commands and versions in Phase 1/2.
## 1. Prerequisites
- **Node.js:** LTS version (specify in `RECOMMENDATIONS.md` and `.nvmrc`/`.tool-versions` later).
- **Package manager:** npm / pnpm / yarn (pick one and lock it).
- **Database/Redis:** only if your modules require them.
## 2. Repo Layout (when using a monorepo)
Example skeleton:
```
apps/
  web/       # frontend app (e.g., Next.js)
  api/       # backend API (e.g., Node/TS)
packages/
  shared/    # shared types/utils
docs/
...
```
If you do not use a monorepo, document the real structure here.
## 3. Environment Variables
- Keep local env in `.env.local` (never commit secrets).
- Provide `.env.example` once variables are known.
- Describe required vars per app:
- `apps/web` — public env (`NEXT_PUBLIC_*`) + auth client config.
- `apps/api` — DB/Redis URLs, provider secrets, webhook secrets, LLM keys, billing keys.
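A minimal startup check for the per-app variables listed above can be sketched like this; the variable names used in the usage note are placeholders from this document, not a fixed contract.

```typescript
// Report which required environment variables are missing or empty,
// so the app can fail fast with a clear error instead of crashing later.
function missingEnv(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

For example, `apps/api` might call `missingEnv(["DATABASE_URL", "LLM_API_KEY"], process.env)` at boot and exit if anything is reported.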
## 4. Common Commands (fill in when code exists)
> Respect any forbidden commands in `/RULES.md`.
- Install deps: `<pm> install`
- Build: `<pm> run build`
- Tests: `<pm> test` or `<pm> run test:*`
- Lint: `<pm> run lint`
- Format: `<pm> run format`
- Local dev server (if allowed): `<pm> run dev`
## 5. Tooling
- **Type checking:** TypeScript strict mode.
- **Linting:** ESLint (or equivalent).
- **Formatting:** Prettier/Biome (pick one).
- **E2E:** Playwright/Cypress (if applicable).
## 6. CI Notes
- Use the same commands as local (no hidden CI-only steps).
- Cache package manager store.
- Run unit → integration → e2e in increasing cost order.


@@ -1,26 +1,30 @@
# Frontend Architecture Plan — AI Product Template
> **Status:** Phase 0 (Planning) — recommendations to finalize in Phase 1
> **Scope:** Next.js (App Router, TS) + Tailwind + React Query/SWR. The concrete flows (chat, generation, pipeline, approvals, billing) depend on the chosen archetype in `/docs/archetypes.md`.
> **Sources:** Next.js App Router best practices (per latest docs), product/domain docs in `/docs`
> **Purpose:** This file is a scratchpad for exploring options and tradeoffs.
> Move finalized decisions into `/docs/frontend/architecture.md`.
> **Archive or delete this plan after Phase 1** to avoid duplication.
## 0. Goals & Non-Goals
- Ship a clear, predictable frontend for your chosen archetype: fast core interactions (chat/workflow/pipeline/automation), reliable status, explainability when applicable (scores, traces, source).
- Optimize for long-running or webhook-driven states if your product uses them; no direct provider/LLM calls from the browser.
- Non-goal: custom card forms or direct provider SDKs in the browser.
## 1. Routing & Layout
- App Router with `app/`; root `layout.tsx` imports globals (per Next.js docs).
- Layout shells: public (marketing), app shell (auth/tenant-protected), auth screens.
- Minimal routes depend on archetype. Example for pipeline products: `/`, `/records`, `/records/review`, `/rules`, `/reports`, `/settings/billing`, `/settings/integrations`, `/settings/team`.
- Navigation state from `next/navigation`; avoid dynamic `href` objects in `<Link>` (App Router requirement).
## 2. Feature/Folder Structure
```
src/
  app/       # routes/layouts
  features/  # archetype flows (e.g., onboarding, chat, items, approvals, reports, billing, settings)
  entities/  # domain entities (tenant/user optional; items/messages/drafts/tasks, etc.)
  shared/    # ui kit, hooks, api client, lib (formatting, domain utils), auth client
  widgets/   # composed UI blocks (dashboards, charts, tables)
```


@@ -14,6 +14,9 @@
---
> Recommendations for Phase 0. Lock decisions in Phase 1.
> After Phase 1, this file is the **canonical record of locked frontend architecture decisions**.
> Use `FRONTEND_ARCHITECTURE_PLAN.md` for working notes and archive/delete it after Phase 1.
> Modules like records/rules/approvals/billing are **examples for one archetype**. Rename or omit per `/docs/archetypes.md`.
## 1. Stack
- Framework: Next.js (App Router) + TypeScript.


@@ -12,17 +12,18 @@
---
## 1. Role of Frontend
- Deliver the primary user experience for your chosen archetype (chat, generation workflow, pipeline review, automation dashboard) plus onboarding/settings and optional billing.
- Keep flows fast, explainable (surface reasoning traces/scores when used), and safe (reflect provider states, avoid double actions).
## 2. Core Screens & Flows
- Core screens depend on the chosen archetype (see `/docs/archetypes.md`). Example for pipeline products:
  - Marketing/landing with CTA to signup.
  - Onboarding: signup/login, plan selection (provider Checkout/Portal if applicable), source connection (OAuth2/webhooks), team invites.
  - Items/records: lists/filters, detail drawer (raw fields, scores, LLM reasoning trace), bulk actions.
  - Human review (optional): approval/override UI, optional rules/policies editor.
  - Reports (optional): dashboards/summaries, exports.
  - Billing & Settings (optional): subscription status, payment method, tenant/team management, integrations health, audit/event log view.
## 3. Technical Principles
- Next.js (App Router) with TypeScript; Tailwind for styling; React Query/SWR for data fetching and cache orchestration.

- `/docs/backend/payment-flow.md`
---
> This guide reflects a **pipeline / human-review** archetype. For chat-first or generation tools, keep the same principles (clarity, accessibility, safe actions) but replace the concrete flows and terminology per `/docs/archetypes.md`.
## 1. Tone & Clarity
- Professional and concise. Emphasize trust (audit log, reasoning traces, secure billing and webhooks).
- Avoid jargon without tooltips; show why a decision was made (rule hit, similarity, LLM reasoning).

**New file:** `docs/llm/caching-costs.md`
# LLM System: Caching & Cost Control (Starter Template)
---
**Last Updated:** 2025-12-12
**Phase:** Phase 0 (Planning)
**Status:** Draft — finalize in Phase 1
**Owner:** AI/LLM Lead + Backend Architect
**References:**
- `/docs/llm/prompting.md`
- `/docs/llm/evals.md`
---
This document defines how to keep LLM usage reliable and within budget.
## 1. Goals
- Minimize cost while preserving quality.
- Keep latency predictable for user flows.
- Avoid repeated work (idempotency + caching).
## 2. Budgets & Limits
Define per tenant and per feature:
- monthly token/cost cap,
- per-request max tokens,
- max retries/timeouts,
- concurrency limits.
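The limits above can be enforced with a small pre-call guard. A minimal TypeScript sketch, where the field names (`monthlyCostCapUsd`, `inFlight`, etc.) are illustrative assumptions rather than a locked schema:

```typescript
// Hypothetical per-tenant budget state; checked before every LLM call.
interface TenantBudget {
  monthlyCostCapUsd: number;
  spentThisMonthUsd: number;
  perRequestMaxTokens: number;
  maxConcurrent: number;
  inFlight: number; // currently running requests for this tenant
}

type BudgetDecision = { allowed: true } | { allowed: false; reason: string };

function checkBudget(b: TenantBudget, requestedTokens: number): BudgetDecision {
  if (b.spentThisMonthUsd >= b.monthlyCostCapUsd)
    return { allowed: false, reason: "monthly cost cap reached" };
  if (requestedTokens > b.perRequestMaxTokens)
    return { allowed: false, reason: "per-request token limit exceeded" };
  if (b.inFlight >= b.maxConcurrent)
    return { allowed: false, reason: "concurrency limit reached" };
  return { allowed: true };
}
```

A denied decision should feed the fallback behavior in section 5 (queue, degrade, or ask the user) rather than fail silently.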
## 3. Caching Layers
Pick what applies:
1. **Input normalization cache**
- canonicalize inputs (trim, stable ordering) to increase hit rate.
2. **LLM response cache**
- key: `(prompt_version, model, canonical_input_hash, retrieval_config_hash)`.
- TTL depends on volatility of the task.
3. **Embeddings cache**
- store embeddings for reusable texts/items.
4. **RAG retrieval cache**
- cache top-k doc IDs for stable queries.
> Never cache raw PII; cache keys use hashes of redacted inputs.
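The response-cache key from layer 2 can be sketched as follows; `canonicalize` and the field names are illustrative assumptions, and the canonical input is presumed already redacted upstream:

```typescript
import { createHash } from "node:crypto";

// Hypothetical cache-key builder for the LLM response cache.
interface CacheKeyParts {
  promptVersion: string;       // e.g. "classify@1.2.0"
  model: string;               // e.g. "provider:model"
  canonicalInput: string;      // normalized + redacted text, never raw PII
  retrievalConfigHash: string; // hash of retrieval settings, if RAG is used
}

// Canonicalization (trim + collapse whitespace) increases the hit rate.
function canonicalize(input: string): string {
  return input.trim().replace(/\s+/g, " ");
}

function cacheKey(parts: CacheKeyParts): string {
  const inputHash = createHash("sha256")
    .update(canonicalize(parts.canonicalInput))
    .digest("hex");
  return [parts.promptVersion, parts.model, inputHash, parts.retrievalConfigHash].join(":");
}
```

Because the prompt version and retrieval config are part of the key, bumping either automatically invalidates stale entries.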
## 4. Cost Controls
- Prefer cheaper models for low-risk tasks; escalate to stronger models only when needed.
- Use staged pipelines (rules/heuristics/RAG) to reduce LLM calls.
- Batch non-interactive jobs (classification, report gen).
- Track tokens in/out per request and per tenant.
## 5. Fallbacks
- On timeouts/errors: retry with backoff, then fallback to safe default or human review.
- On budget exhaustion: degrade gracefully (limited features, queue jobs, ask user).
## 6. Monitoring
- Dashboards for cost, latency, cache hit rate, retry rate.
- Alerts for spikes, anomaly tenants, or runaway loops.

**New file:** `docs/llm/evals.md`
# LLM System: Evals & Quality (Starter Template)
---
**Last Updated:** 2025-12-12
**Phase:** Phase 0 (Planning)
**Status:** Draft — finalize in Phase 1
**Owner:** AI/LLM Lead + Test Engineer
**References:**
- `/docs/llm/prompting.md`
- `/docs/llm/safety.md`
---
This document defines how you measure LLM quality and prevent regressions.
## 1. Goals
- Detect prompt/model regressions before production.
- Track accuracy, safety, latency, and cost over time.
- Provide a repeatable path for improving prompts and RAG.
## 2. Eval Suite Types
Mix 3 layers depending on archetype:
1. **Unit evals (offline, deterministic)**
- Small golden set, strict expected outputs.
2. **Integration evals (offline, realistic)**
- Full pipeline including retrieval, tools, and postprocessing.
3. **Online evals (production, controlled)**
- Shadow runs, A/B, canary prompts, RUM-style metrics.
## 3. Datasets
- Maintain **versioned eval datasets** with:
- input,
- expected output or rubric,
- metadata (domain, difficulty, edge cases).
- Include adversarial cases:
- prompt injection,
- ambiguous queries,
- long/noisy inputs,
- PII-rich inputs (to test redaction).
## 4. Metrics (suggested)
Choose per archetype:
- **Task quality:** accuracy/F1, exact-match, rubric score, human preference rate.
- **Safety:** refusal correctness, policy violations, PII leakage rate.
- **Robustness:** format-valid rate, tool-call correctness, retry rate.
- **Performance:** p50/p95 latency, tokens in/out, cost per task.
## 5. Regression Policy
- Every prompt or model change must run evals.
- Define gates:
- no safety regressions,
- quality must improve or stay within tolerance,
- latency/cost budgets respected.
- If a gate fails: block rollout or require explicit override in `RECOMMENDATIONS.md`.
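The gates above can be expressed as a small check; the metric names and the 20% latency/cost tolerances below are illustrative assumptions, not locked budgets:

```typescript
// Hypothetical metrics captured by an eval run.
interface EvalMetrics {
  quality: number;          // e.g. rubric score in 0..1
  safetyViolations: number; // count across the suite
  p95LatencyMs: number;
  costPerTaskUsd: number;
}

interface GateResult { pass: boolean; reasons: string[] }

function checkGates(
  baseline: EvalMetrics,
  candidate: EvalMetrics,
  qualityTolerance = 0.01,
): GateResult {
  const reasons: string[] = [];
  if (candidate.safetyViolations > baseline.safetyViolations)
    reasons.push("safety regression"); // hard gate, never tolerated
  if (candidate.quality < baseline.quality - qualityTolerance)
    reasons.push("quality below tolerance");
  if (candidate.p95LatencyMs > baseline.p95LatencyMs * 1.2)
    reasons.push("latency budget exceeded");
  if (candidate.costPerTaskUsd > baseline.costPerTaskUsd * 1.2)
    reasons.push("cost budget exceeded");
  return { pass: reasons.length === 0, reasons };
}
```

A failed result blocks rollout in CI; the `reasons` list gives reviewers the exact gate that tripped.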
## 6. Human Review Loop
- For tasks without ground truth, use rubric-based human grading.
- Sample strategy:
- new prompt versions → 100% review on small batch,
- stable versions → periodic audits.
## 7. Logging for Evals
- Store eval runs with:
- prompt version,
- model/provider version,
- retrieval config version (if used),
- inputs/outputs,
- metrics + artifacts.
## 8. Open Questions to Lock in Phase 1
- Where datasets live (repo vs storage)?
- Which metrics are hard gates for MVP?
- Online eval strategy (shadow vs A/B) and sample sizes?

**New file:** `docs/llm/prompting.md`
# LLM System: Prompting (Starter Template)
---
**Last Updated:** 2025-12-12
**Phase:** Phase 0 (Planning)
**Status:** Draft — finalize in Phase 1
**Owner:** AI/LLM Lead
**References:**
- `/docs/archetypes.md`
- `/docs/llm/safety.md`
- `/docs/llm/evals.md`
---
This document defines how prompts are designed, versioned, and executed.
It is **archetype-agnostic**: adapt the “interaction surface” (chat, workflow generation, pipeline classification, agentic tasks) to your product.
## 1. Goals
- Produce **consistent, auditable outputs** across models/providers.
- Make prompt changes **safe and reversible** (versioning + evals).
- Keep sensitive data out of prompts unless strictly required (see safety).
## 2. Single LLM Entry Point
All LLM calls go through one abstraction (e.g., `callLLM()` / “LLM Gateway”):
- Centralizes model selection, temperature/top_p defaults, retries, timeouts.
- Applies redaction and policy checks before sending prompts.
- Emits structured logs + trace IDs to `EventLog`.
- Enforces output schema validation.
> Lock the exact interface and defaults in Phase 1.
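A minimal sketch of such an entry point, assuming a pluggable `provider` function and a toy email-only regex as a stand-in for the real redaction step (see `/docs/llm/safety.md`); this is not a definitive implementation:

```typescript
// Hypothetical request/result shapes; not a real SDK API.
interface LLMRequest {
  promptVersion: string; // e.g. "classify@1.2.0"
  model: string;
  input: string;         // untrusted; redacted before sending
  maxRetries?: number;
}

interface LLMResult {
  output: string;
  traceId: string;
}

// Pluggable provider call keeps the gateway model-agnostic.
type Provider = (model: string, prompt: string) => Promise<string>;

async function callLLM(req: LLMRequest, provider: Provider): Promise<LLMResult> {
  // 1. Redact before anything leaves the process (toy email-only rule).
  const redacted = req.input.replace(/\S+@\S+/g, "{{EMAIL_REDACTED}}");
  // 2. Stable trace ID for EventLog correlation.
  const traceId = `trace_${Date.now().toString(36)}_${Math.random().toString(36).slice(2, 8)}`;
  // 3. Retries with exponential backoff.
  const retries = req.maxRetries ?? 2;
  for (let attempt = 0; ; attempt++) {
    try {
      const output = await provider(req.model, redacted);
      // 4. A structured log (promptVersion, model, traceId) would be emitted here.
      return { output, traceId };
    } catch (err) {
      if (attempt >= retries) throw err;
      await new Promise((r) => setTimeout(r, 2 ** attempt * 100));
    }
  }
}
```

Schema validation of `output` (section 6) would slot in between the provider call and the return.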
## 3. Prompt Types
Define prompt families that match your archetype:
- **Chat-first:** system prompt + conversation memory + optional retrieval context.
- **Generation/workflow:** task prompt + constraints + examples + output schema.
- **Classification/pipeline:** short instruction + label set + few-shot examples + JSON output.
- **Agentic automation:** planner prompt + tool policy + step budget + “stop/ask-human” rules.
## 4. Prompt Structure (recommended)
Use a predictable layout for every prompt:
1. **System / role:** who the model is, high-level mission.
2. **Safety & constraints:** what not to do, privacy rules, refusal triggers.
3. **Task spec:** exact objective and success criteria.
4. **Context:** domain data, retrieved snippets, tool outputs (clearly delimited).
5. **Few-shot examples:** 1–3 archetype-relevant pairs.
6. **Output schema:** strict JSON/XML/markdown template.
### Example skeleton
```text
[SYSTEM]
You are ...
[CONSTRAINTS]
- Never ...
- If unsure, respond with ...
[TASK]
Given input X, produce Y.
[CONTEXT]
<untrusted_input>
...
</untrusted_input>
[EXAMPLES]
Input: ...
Output: ...
[OUTPUT_SCHEMA]
{ "label": "...", "confidence": 0..1, "reasoning_trace": {...} }
```
## 5. Prompt Versioning
- Store prompts in a dedicated location (e.g., `prompts/` folder or DB table).
- **Semantic versioning**: `prompt_name@major.minor.patch`.
- **major:** behavior change or schema change.
- **minor:** quality improvement (new examples, clearer instruction).
- **patch:** typos / no behavior change.
- Every version is linked to:
- model/provider version,
- eval suite run,
- changelog entry.
## 6. Output Schemas & Validation
- Prefer **strict JSON** for machine-consumed outputs.
- Validate outputs server-side:
- required fields present,
- types/enum values correct,
- confidence in range,
- no disallowed keys (PII, secrets).
- If validation fails: retry with a “fix-format” prompt or fall back to a safe default.
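Server-side validation can be sketched as follows; the label set and allowed keys are hypothetical examples, not a fixed schema:

```typescript
// Illustrative allowlist for a classification output.
const ALLOWED_KEYS = new Set(["label", "confidence", "reasoning_trace"]);
const LABELS = new Set(["approve", "reject", "needs_review"]);

function validateOutput(raw: string): { ok: boolean; errors: string[] } {
  const errors: string[] = [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["invalid JSON"] };
  }
  const obj = parsed as Record<string, unknown>;
  // Disallowed keys (rather than missing keys) are how PII/secrets sneak in.
  for (const key of Object.keys(obj))
    if (!ALLOWED_KEYS.has(key)) errors.push(`disallowed key: ${key}`);
  if (typeof obj.label !== "string" || !LABELS.has(obj.label))
    errors.push("label missing or not in label set");
  const c = obj.confidence;
  if (typeof c !== "number" || c < 0 || c > 1)
    errors.push("confidence missing or out of range");
  return { ok: errors.length === 0, errors };
}
```

On failure, the caller retries with a fix-format prompt or falls back, per the bullet above.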
## 7. Context Management
- Separate **trusted** vs **untrusted** context:
- Untrusted: user input, webhook payloads, retrieved docs.
- Trusted: system instructions, tool policies, fixed label sets.
- Delimit untrusted context explicitly to reduce prompt injection risk.
- Keep context minimal; avoid leaking irrelevant tenant/user data.
## 8. Memory (if applicable)
For chat/agentic archetypes:
- Short-term memory: last N turns.
- Long-term memory: curated summaries or embeddings with strict privacy rules.
- Never store raw PII in memory unless required and approved.
## 9. Open Questions to Lock in Phase 1
- Which models/providers are supported at launch?
- Default parameters and retry/backoff policy?
- Where prompts live (repo vs DB) and who can change them?
- How schema validation + fallback works per archetype?

**New file:** `docs/llm/rag-embeddings.md`
# LLM System: RAG & Embeddings (Starter Template)
---
**Last Updated:** 2025-12-12
**Phase:** Phase 0 (Planning)
**Status:** Draft — finalize in Phase 1
**Owner:** AI/LLM Lead + Backend Architect
**References:**
- `/docs/backend/architecture.md`
- `/docs/llm/evals.md`
- `/docs/llm/safety.md`
---
This document describes retrieval-augmented generation (RAG) and embeddings.
Use it only if your archetype needs external knowledge or similarity search.
## 1. When to Use RAG
- You need grounded answers from a knowledge base.
- Inputs are large or dynamic (docs, tickets, policies).
- You want controllable citations/explainability.
Do **not** use RAG when:
- the task is purely generative with no grounding,
- retrieval latency/cost outweighs benefit.
## 2. Data Sources
- Curated docs, useruploaded files, internal DB records, external APIs.
- Mark each source as trusted/untrusted and apply safety rules.
## 3. Chunking & Indexing
- Define chunk size/overlap per domain.
- Store embeddings in a vector index (e.g., `pgvector`, managed vector DB).
- Keep an embedding model/version field to support migrations.
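Chunking can be sketched as a simple fixed-size splitter with overlap; the default sizes below are placeholders to tune per domain, not recommendations:

```typescript
// Illustrative character-based chunker with overlap between chunks.
function chunkText(text: string, chunkSize = 800, overlap = 200): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks: string[] = [];
  // Each chunk starts (chunkSize - overlap) characters after the previous one,
  // so consecutive chunks share `overlap` characters of context.
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Real pipelines usually split on semantic boundaries (paragraphs, headings) before falling back to fixed sizes.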
## 4. Retrieval Strategy
- Default: semantic search top-k + optional filters (tenant, type, recency).
- Rerank if quality requires it.
- Always include retrieved doc IDs in `reasoning_trace` (not raw text).
## 5. RAG Prompting Pattern
- Provide retrieved snippets in a clearly delimited block.
- Instruct model to answer **only** using retrieved context when grounding is required.
- If context is insufficient → ask for clarification or defer.
## 6. Evaluating Retrieval
- Measure recall/precision of retrieval separately from generation quality.
- Add “no-answer” test cases to avoid hallucinations.
## 7. Privacy & MultiTenancy
- Tenant-scoped indexes or strict filters.
- Never retrieve cross-tenant.
- Redact PII before embedding if embeddings can be exposed or logged.

**New file:** `docs/llm/safety.md`
# LLM System: Safety, Privacy & Reasoning Traces (Starter Template)
---
**Last Updated:** 2025-12-12
**Phase:** Phase 0 (Planning)
**Status:** Draft — finalize in Phase 1
**Owner:** Security + AI/LLM Lead
**References:**
- `/docs/backend/security.md`
- `/docs/llm/prompting.md`
---
This document defines the safety posture for any LLM-backed feature: privacy, injection defenses, tool safety, and what you log.
## 1. Safety Goals
- Prevent leakage of PII/tenant secrets to LLMs, logs, or UI.
- Resist prompt injection and untrusted context manipulation.
- Ensure outputs are safe to act on (validated, bounded, auditable).
## 2. Data Classification & Handling
Define categories for your domain:
- **Public:** safe to send and store.
- **Internal:** safe to send only if necessary; store minimally.
- **Sensitive (PII/PHI/PCI/Secrets):** never send unless explicitly approved; never store in traces.
## 3. Redaction Pipeline (before LLM)
Apply a mandatory preprocessing step in `callLLM()`:
1. Detect sensitive fields (allowlist what *can* be sent, not what can't).
2. Redact or hash PII (names, emails, phone, addresses, IDs, card data).
3. Replace with stable placeholders: `{{USER_EMAIL_HASH}}`.
4. Attach a “redaction summary” to logs (no raw PII).
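Steps 2–3 can be sketched with regex-based redaction and stable hashed placeholders; the patterns are illustrative and deliberately incomplete, since a real pipeline should allowlist fields rather than rely on regexes alone:

```typescript
import { createHash } from "node:crypto";

// Toy PII patterns; a production redactor needs many more detectors.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const PHONE_RE = /\+?\d[\d\s().-]{7,}\d/g;

// Stable placeholder: the same value always maps to the same tag,
// so downstream logic can correlate without seeing the raw value.
function stableTag(kind: string, value: string): string {
  const h = createHash("sha256").update(value).digest("hex").slice(0, 8);
  return `{{${kind}_${h}}}`;
}

function redact(text: string): { redacted: string; summary: Record<string, number> } {
  const summary: Record<string, number> = { EMAIL: 0, PHONE: 0 };
  let redacted = text.replace(EMAIL_RE, (m) => { summary.EMAIL++; return stableTag("EMAIL", m); });
  redacted = redacted.replace(PHONE_RE, (m) => { summary.PHONE++; return stableTag("PHONE", m); });
  return { redacted, summary };
}
```

The `summary` counts (never the raw matches) are what step 4 attaches to logs.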
## 4. Prompt Injection & Untrusted Context
- Delimit untrusted input (`<untrusted_input>...</untrusted_input>`).
- Never allow untrusted text to override system constraints.
- For RAG: treat retrieved docs as untrusted unless curated.
- If injection detected → refuse or ask for human review.
## 5. Tool / Agent Safety (if applicable)
- Tool allowlist with scopes and rate limits.
- Confirm destructive actions with humans (“human checkpoint”).
- Constrain tool output length and validate outputs before reuse.
## 6. `reasoning_trace` Specification
`reasoning_trace` is **optional** and should be safe to show to humans.
Store only **structured, privacysafe metadata**, never raw prompts or user PII.
### Allowed fields (example)
```json
{
"prompt_version": "classify@1.2.0",
"model": "provider:model",
"inputs": { "redacted": true, "source_ids": ["..."] },
"steps": [
{ "type": "rule_hit", "rule_id": "r_123", "confidence": 0.72 },
{ "type": "retrieval", "top_k": 5, "doc_ids": ["d1","d2"] },
{ "type": "llm_call", "confidence": 0.64 }
],
"output": { "label": "X", "confidence": 0.64 },
"trace_id": "..."
}
```
### Explicitly disallowed in traces
- Raw user input, webhook payloads, or document text.
- Emails, phone numbers, addresses, names, gov IDs.
- Payment data, auth tokens, API keys, secrets.
- Full prompts or full LLM responses (store refs or summaries only).
### How we guarantee “no PII” in traces
1. **Schema allowlist:** trace is validated against a strict schema with only allowed keys.
2. **Redaction required:** `callLLM()` sets `inputs.redacted=true` only after redaction succeeded.
3. **PII linting:** server-side scan of trace JSON for patterns (emails, phones, IDs) before storing.
4. **UI gating:** only safe fields are rendered; raw text never shown from trace.
5. **Audits:** periodic sampling in Phase 3+ to verify zero leakage.
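Checks 1 and 3 above can be combined into a small lint pass; the key allowlist and PII patterns are illustrative assumptions:

```typescript
// Top-level keys allowed in a reasoning_trace (from the example schema).
const TRACE_ALLOWED = new Set(["prompt_version", "model", "inputs", "steps", "output", "trace_id"]);

// Toy PII detectors run over the serialized trace as a last line of defense.
const PII_PATTERNS = [
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/, // email
  /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/,                // US-style phone
];

function lintTrace(trace: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(trace))
    if (!TRACE_ALLOWED.has(key)) problems.push(`disallowed top-level key: ${key}`);
  const serialized = JSON.stringify(trace);
  for (const re of PII_PATTERNS)
    if (re.test(serialized)) problems.push(`possible PII matches ${re}`);
  return problems;
}
```

A non-empty result rejects the trace before it is stored, feeding the audit loop in check 5.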
## 7. Storage & Retention
- Traces stored per tenant; encrypted at rest.
- Retention window aligned with compliance needs.
- Ability to disable traces globally or per tenant.
## 8. Open Questions to Lock in Phase 1
- Exact redaction rules and allowlist fields.
- Whether to store any raw LLM outputs outside traces (audit vault).
- Who can access traces in UI and API.

This document describes a phased approach you can reuse for building AI-assisted products. Phases structure the work, assign responsibilities, and track progress.
> Phases and tasks are **modular**. After choosing an archetype in `/docs/archetypes.md`, delete or ignore tasks that don't apply to your product.
## Phase Summary
- **Phase 0 — Discovery & Requirements**
- **Phase 1 — Architecture & Design**
- `backend/architecture.md` with modules (ingestion, processing/classification, approvals, reporting, billing), queues, DB schema notes (JSONB traces, `pgvector`).
- `backend/api-design.md` with resources/endpoints and event feed.
- UX prototypes for key flows.
- Initial **ADR set** in `docs/adr/` for all locked decisions.
- Initial **repo skeleton** (`apps/web`, `apps/api`, `packages/shared`) and `docs/dev-setup.md` ready for Phase 2 coding.
### Typical Tasks
- Frontend: lock App Router + Tailwind; define shell/navigation; design loading/error patterns for lists, approvals, reports.

## 1. Project Goal
Provide a universal starting point for new products that leverage AI assistance. The template outlines how to structure documentation and technical decisions for **different AI product archetypes** (chat assistants, generation tools, pipelines, agentic automation, internal tools).
Capabilities such as integrations/ingestion, staged processing (rules/embeddings/LLM), human review, reporting, multi-tenancy, and billing are **optional modules** you can mix & match.
Emphasis areas: trust (auditability, optional reasoning traces), speed (near-real-time processing), and clarity (human approvals and transparent reporting), adaptable to your domain.
> **First step:** choose an archetype and modules in `/docs/archetypes.md`, then tailor the rest of this document.
## 2. Target Audience
- Define your primary users and stakeholders for the target domain.
## 3. Unique Value Proposition
1. **Clear AI interaction surface**
   Chat, workflow UI, batch pipeline, or API — with optional explainability (`reasoning_trace`) and trace IDs.
2. **Optional human feedback loop**
   Approvals/overrides, edits, ratings, or escalation — all actions audited (`EventLog`, `source_agent`).
3. **Optional integrations & ingestion**
   OAuth2/webhooks/files/SDKs; idempotent handling and safe retries.
4. **Optional reporting & billing**
   Dashboards/exports and provider-hosted subscriptions/usage if your product needs them.
5. **Security by design (single or multi-tenant)**
   Tenant isolation and RBAC when applicable; webhook verification and audit logging for all products.
## 4. Key Features (example template)
### 4.1. Onboarding & Integrations (optional)
- User/tenant signup/login if your product needs accounts.
- Connect external data sources (OAuth2, webhooks, file uploads, etc.) if you ingest data.
- Configure subscription/billing if applicable.
- **Example domain model for pipeline products:** `Tenant`, `User`, `Record`, `Rule`, `Attachment`, `Report`, `EventLog`, `Embedding`; `EventLog.source_agent`, `Record.reasoning_trace` JSONB.
  Rename or replace with your domain entities for other archetypes (messages, drafts, tasks, etc.).
### 4.2. Ingestion
- Webhooks/workers normalize provider payloads into a unified schema (if you have ingestion).
- Idempotent writes; deduplication; `INGESTED` events with `source_agent`.
### 4.3. Classification/Processing
- If you use staged processing: rules/heuristics → embeddings/RAG → LLM fallback via `callLLM()`.
- Optionally persist `reasoning_trace` JSONB on items; log `PROCESSED` (or similar).
### 4.4. Approval & Rules
- UI for reviewing/overriding outcomes; edits/ratings; optional rule creation from overrides (if relevant).
- Log `APPROVED`, `RULE_CREATED`; track who/when/why.
### 4.5. Reporting & Events

# packages/shared
Placeholder for shared packages: types, API client, UI kit, or utilities.
Use only if you run a monorepo.