Add foundational documentation templates to support product design and architecture planning, including ADR, archetypes, LLM systems, dev setup, and shared modules.

olekhondera
2025-12-12 02:31:03 +02:00
parent 5053235e95
commit c905cbb725
26 changed files with 759 additions and 65 deletions

@@ -16,6 +16,8 @@
- Tenant-scoped resources; role-based authorization.
- Idempotent ingestion/webhook endpoints; trace IDs for debugging.
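The idempotency and trace-ID points above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the caller supplies an idempotency key (for example a provider event ID), and `IdempotentIngest`, `IngestResult`, and the in-memory dictionaries are hypothetical stand-ins for a durable store with a unique constraint on `(tenant_id, idempotency_key)`.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IngestResult:
    record_id: str
    trace_id: str
    duplicate: bool


@dataclass
class IdempotentIngest:
    """Hypothetical in-memory stand-in for a durable, tenant-scoped store."""
    _seen: dict = field(default_factory=dict)     # (tenant_id, key) -> record_id
    _records: dict = field(default_factory=dict)  # record_id -> payload

    def handle(self, tenant_id: str, idempotency_key: str, payload: dict) -> IngestResult:
        trace_id = uuid.uuid4().hex  # fresh trace ID per request, for debugging
        key = (tenant_id, idempotency_key)
        if key in self._seen:
            # Replayed delivery: return the original record, do not re-process.
            return IngestResult(self._seen[key], trace_id, duplicate=True)
        record_id = uuid.uuid4().hex
        self._seen[key] = record_id
        self._records[record_id] = payload
        return IngestResult(record_id, trace_id, duplicate=False)
```

Keying on the tenant as well as the idempotency key keeps one tenant's replay from colliding with another tenant's event that happens to share an ID.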
> Resource set is archetype-specific. Endpoints below are a **pipeline/classification example**; adapt for chat-first, generation, or automation products.
## 2. Core Resources (high-level)
- `/auth` — login, tenant context, token refresh.
- `/tenants` — tenant profile, roles, invites.

@@ -12,7 +12,10 @@
- `/docs/backend/payment-flow.md`
---
> Recommendations for Phase 0. Lock decisions in Phase 1.
> Recommendations for Phase 0. Lock decisions in Phase 1.
> After Phase 1, this file is the **canonical record of locked backend architecture decisions**.
> Keep exploratory notes in a separate `*_PLAN.md` (if you use one) and archive/delete it after Phase 1.
> The module list below reflects a pipeline/classification archetype. Keep/rename/omit modules per `/docs/archetypes.md`.
## 1. Approach & Stack
- Style: modular monolith with clear modules; containerized.

@@ -12,22 +12,22 @@
---
## 1. Role of Backend
- Own business logic for ingestion, processing/classification (rules + embeddings + LLM fallback), approvals, reporting, billing, and audit.
- Own business logic for integrations, AI capability (chat/generation/pipelines/automation), optional human feedback loops, reporting, billing, and audit.
- Integrate safely with external providers (OAuth2/webhooks, payment provider, LLM provider) and expose consistent APIs + events.
- Enforce security: tenant isolation, RBAC, webhook verification, event/audit logging.
- Enforce security appropriate to your archetype (single- or multi-tenant), webhook verification, and event/audit logging.
## 2. Main Domain Areas
- **Auth & Tenants:** authentication/authorization, roles, tenant-scoped access.
- **Integrations:** external providers via OAuth2/webhooks; connection health.
- **Records:** normalized feeds, statuses (ingested, processed, needs_approval, approved, failed), `reasoning_trace` JSONB.
- **Rules & Processing:** rules engine, embeddings similarity, LLM fallback; logging with `source_agent`.
- **Approvals:** human-in-the-loop decisions, overrides, optional rule creation; audit trail.
- **Reports & Exports:** dashboards/summaries with export history.
- **Billing:** provider-hosted subscriptions, tenant-scoped access control, webhooks.
- **Events:** `/api/events` feed for downstream agents and internal observability.
- **Auth & Tenancy (optional):** users, roles, tenant isolation if needed.
- **Integrations / Ingestion (optional):** OAuth2/webhooks/files; connection health.
- **Core AI Module:** chat, generation, classification, RAG, or agentic automation.
- **Processing Pipeline (optional):** staged evaluation (rules/embeddings/LLM); `reasoning_trace` JSONB if used.
- **Human Feedback Loop (optional):** approvals/edits/ratings/escalations; audit trail.
- **Reporting & Exports (optional):** dashboards/summaries with history.
- **Billing (optional):** provider-hosted subscriptions/usage, webhooks.
- **Events / Audit:** `/api/events` feed for observability and downstream agents.
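For the optional processing pipeline, staged evaluation with a `reasoning_trace` might look like the sketch below. It assumes each stage returns a `(label, confidence)` pair or `None`; the stage functions, the `0.8` threshold, and the trace entry shape are hypothetical, while the `processed` / `needs_approval` statuses and `source_agent` come from the documents in this change.

```python
from typing import Callable, Optional

# A stage returns (label, confidence), or None when it cannot decide.
Stage = Callable[[dict], Optional[tuple[str, float]]]


def rules_stage(record: dict) -> Optional[tuple[str, float]]:
    # Stub rule: exact keyword match gives a high-confidence label.
    return ("invoice", 1.0) if "invoice" in record.get("text", "").lower() else None


def embeddings_stage(record: dict) -> Optional[tuple[str, float]]:
    # Stub: pretend similarity search produced a weak guess.
    return ("receipt", 0.6)


def llm_stage(record: dict) -> Optional[tuple[str, float]]:
    # Stub: a real stage would go through the single LLM helper.
    return None


STAGES: list[tuple[str, Stage]] = [
    ("rules", rules_stage),
    ("embeddings", embeddings_stage),
    ("llm", llm_stage),
]


def run_pipeline(record: dict, stages: list[tuple[str, Stage]], threshold: float = 0.8) -> dict:
    """Try cheap stages first; stop at the first confident answer."""
    trace = []  # JSON-serializable, suitable for a JSONB column
    for name, stage in stages:
        outcome = stage(record)
        if outcome is None:
            trace.append({"source_agent": name, "decision": None})
            continue
        label, confidence = outcome
        trace.append({"source_agent": name, "decision": label, "confidence": confidence})
        if confidence >= threshold:
            return {"status": "processed", "label": label, "reasoning_trace": trace}
    # No stage was confident enough: hand off to the human feedback loop.
    return {"status": "needs_approval", "label": None, "reasoning_trace": trace}
```

Ordering stages from cheapest to most expensive means the LLM is only consulted when rules and embeddings fail to reach the threshold.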
## 3. Integrations
- **External data providers:** OAuth2 + webhooks; signatures/verification; idempotent writes via workers.
- **Payment provider:** subscriptions, checkout/portal; webhooks for lifecycle events.
- **LLM provider:** OpenAI API via single helper; configurable model.
- **LLM provider:** chosen LLM API via a single helper; configurable model/params.
- **Queues:** BullMQ (Redis) for ingestion/categorization/notifications.
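Webhook signature verification, mentioned above for external providers, is commonly done with an HMAC over the raw request body. The sketch below assumes a hex-encoded HMAC-SHA256 scheme with a shared secret; that scheme and the function names are assumptions for illustration, and each real provider documents its own header format and signing rules.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Return True only if the header matches the expected signature."""
    expected = sign(secret, body)
    # Constant-time comparison; encode to bytes so non-ASCII input cannot raise.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

Verification should run on the exact bytes received, before any JSON parsing, and only verified payloads should be enqueued for workers.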

@@ -40,6 +40,27 @@
## 6. LLM Safety
- All LLM calls go through a single helper; centralize redaction, logging, and parameter control.
- Strip/obfuscate sensitive fields before sending to LLM; log only references in traces.
- The detailed LLM safety and `reasoning_trace` policies live in `/docs/llm/safety.md`.
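A single helper that centralizes redaction and trace logging could be sketched as below. Everything named here is hypothetical: the `ALLOWED_FIELDS` allowlist, the two regex patterns (deliberately crude, not a complete PII detector), and the `send` callable standing in for whichever LLM client is chosen.

```python
import re
from typing import Callable

ALLOWED_FIELDS = {"description", "amount", "category_hint"}  # hypothetical allowlist
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
LONG_DIGITS_RE = re.compile(r"\b\d{12,19}\b")  # crude card/account-number pattern


def redact(text: str) -> str:
    """Replace obvious sensitive tokens with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return LONG_DIGITS_RE.sub("[NUMBER]", text)


def build_llm_payload(record: dict) -> dict:
    """Keep only allowlisted fields, redacted and stringified."""
    return {k: redact(str(v)) for k, v in record.items() if k in ALLOWED_FIELDS}


def call_llm(record: dict, record_id: str, send: Callable[[dict], str], trace: list) -> str:
    """Single entry point: redact, log a reference (not the content), then send."""
    payload = build_llm_payload(record)
    trace.append({"record_id": record_id, "fields_sent": sorted(payload)})
    return send(payload)
```

Because the trace stores only the record ID and field names, the sensitive values never reach the log even if redaction misses something in the payload.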
### 6.1 AI-Specific Threats & Controls (summary)
These apply to any archetype that uses LLMs or RAG.
- **Prompt injection / jailbreak**
  - Treat all user input and retrieved content as **untrusted**.
  - Delimit untrusted blocks explicitly and never allow them to override system constraints.
  - Detect injection patterns; on suspicion → refuse or route to human review.
- **Outbound-data policy**
  - Use **allowlists** for what may be sent to the model.
  - Mandatory redaction pipeline before every LLM call (PII/PHI/PCI/secrets).
  - Never send cross-tenant data; never send raw billing/auth secrets.
- **Output validation**
  - Validate model outputs against strict schemas (types, enums, bounds).
  - Reject/repair invalid outputs; fall back to safe defaults or human checkpoints for high-risk actions.
  - For agentic tools: validate tool arguments and enforce per-tool scopes.
- **Trusted vs untrusted context (RAG)**
  - Retrieved documents are untrusted unless curated.
  - Keep retrieval tenant-scoped; record only doc IDs in traces.
  - If grounding is required and context is insufficient → ask user or defer.
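The output-validation control above (types, enums, bounds, then a safe fallback) can be illustrated with a small standard-library sketch. The expected shape (`label` plus `confidence`), the `ALLOWED_LABELS` enum, and the `human_review` action name are assumptions made for the example; a real project might use a schema library instead.

```python
import json
from typing import Optional

ALLOWED_LABELS = {"invoice", "receipt", "other"}  # hypothetical enum


def validate_output(raw: str) -> Optional[dict]:
    """Return the parsed output if it matches the strict schema, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Exact key set: reject missing and unexpected fields alike.
    if not isinstance(data, dict) or set(data) != {"label", "confidence"}:
        return None
    label, confidence = data["label"], data["confidence"]
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None
    return {"label": label, "confidence": float(confidence)}


def decide(raw: str) -> dict:
    """Accept valid output; route anything else to a human checkpoint."""
    parsed = validate_output(raw)
    if parsed is None:
        return {"action": "human_review", "reason": "invalid_model_output"}
    return {"action": "accept", **parsed}
```

Rejecting unknown keys as well as bad values keeps a manipulated model response from smuggling extra instructions or fields into downstream code.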
## 7. Audit & Events
- Log domain events to `EventLog` with `source_agent`; include user ID, tenant, timestamps, and relevant context.