# Backend: Architecture (Recommendations)

---
**Last Updated:** 2025-01-17
**Phase:** Phase 0 (Planning)
**Status:** Draft — finalize in Phase 1
**Owner:** Backend Architect
**References:**
- `/docs/project-overview.md`
- `/docs/backend/api-design.md`
- `/docs/backend/security.md`
- `/docs/backend/payment-flow.md`
---

> Recommendations for Phase 0. Lock decisions in Phase 1.

## 1. Approach & Stack

- Style: modular monolith with clear modules; containerized.
- Language/Runtime: Node.js (LTS) + TypeScript.
- Framework: Express or Fastify with modular structure and DI where helpful.
- DB: Postgres (managed: Supabase/RDS). Vector: `pgvector` for embeddings.
- Queue: BullMQ (Redis) for ingestion, categorization, notifications.
- Auth: Clerk/Auth.js (or equivalent) with tenant-aware RBAC.
- Payments: provider-hosted subscriptions; no raw card data stored.
- LLM: OpenAI API (or equivalent) via a single helper (e.g., `callLLM()`) with configurable model/params.

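
To make the single-helper idea concrete, here is a minimal sketch of what `callLLM()` could look like. Only the name `callLLM()` comes from this document; the option fields, the default values, and the injected `LLMTransport` function are assumptions for illustration, and no real provider SDK is called.

```typescript
// Hypothetical option and transport types; none of these names come from a real SDK.
interface LLMOptions {
  model: string;
  temperature: number;
  maxTokens: number;
}

interface LLMRequest extends LLMOptions {
  prompt: string;
}

// The transport is injected so the provider (OpenAI or an equivalent) can be swapped.
type LLMTransport = (request: LLMRequest) => Promise<string>;

// Assumed defaults; real values would come from per-environment configuration.
const DEFAULT_LLM_OPTIONS: LLMOptions = {
  model: "example-model",
  temperature: 0,
  maxTokens: 256,
};

// Merge per-call overrides over the defaults without mutating either object.
function resolveLLMOptions(overrides: Partial<LLMOptions> = {}): LLMOptions {
  return { ...DEFAULT_LLM_OPTIONS, ...overrides };
}

// Single entry point for every LLM call in the codebase.
async function callLLM(
  prompt: string,
  transport: LLMTransport,
  overrides: Partial<LLMOptions> = {},
): Promise<string> {
  const request: LLMRequest = { prompt, ...resolveLLMOptions(overrides) };
  return transport(request);
}
```

Routing every call through one function keeps model selection, parameters, and any later additions (logging, rate limiting) in a single place, and the injected transport lets tests run without network access.
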
## 2. Modules (logical)

- `auth` — tenants, users, roles, sessions.
- `integrations` — external connectors, OAuth2, webhooks, connection health.
- `records` — normalized record store, statuses, `reasoning_trace` JSONB.
- `rules` — rule definitions, evaluation order, testing, hit stats.
- `processing` — pipeline: rule engine → embeddings similarity → LLM fallback; emits a `PROCESSED` event and updates records.
- `approvals` — queues for human review, overrides, optional rule creation; logs `TX_APPROVED`/`RULE_CREATED` with `source_agent`.
- `reports` — dashboards/exports, history.
- `billing` — provider checkout/portal, webhooks, plan enforcement per tenant.
- `events` — audit/event log (`EventLog`), read-only `/api/events` for downstream agents.
- `files/receipts` — attachment storage metadata (`Receipt` with file URL/mime).

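
The `processing` pipeline order (rule engine, then embeddings similarity, then LLM fallback) can be sketched as a chain of stages where each stage either answers or defers. This is an illustrative sketch only: the `Stage` type, the toy stages, and the category names are hypothetical, and the stages are synchronous for brevity, whereas real embedding and LLM lookups would be asynchronous.

```typescript
// Hypothetical shapes; the trace fields follow the `reasoning_trace` description in this document.
type TraceSource = "rule" | "embedding" | "llm";

interface ReasoningTrace {
  source: TraceSource;
  model: string | null;
  rationale: string;
  confidence: number;
}

interface Categorization {
  category: string;
  trace: ReasoningTrace;
}

// A stage returns a result when it can answer, or null to defer to the next stage.
type Stage = (text: string) => Categorization | null;

// Try each stage in order and stop at the first one that produces a result.
function categorize(text: string, stages: Stage[]): Categorization | null {
  for (const stage of stages) {
    const result = stage(text);
    if (result !== null) return result;
  }
  return null; // nothing answered: leave the record for human review
}

// Toy rule stage: a single keyword rule, for illustration only.
const ruleStage: Stage = (text) =>
  text.toLowerCase().includes("uber")
    ? {
        category: "travel",
        trace: { source: "rule", model: null, rationale: "keyword rule: uber", confidence: 1 },
      }
    : null;

// Stand-in for a pgvector nearest-neighbour lookup; it always defers in this sketch.
const embeddingStage: Stage = () => null;

// Stand-in for the LLM fallback; a real stage would go through the shared LLM helper.
const llmStage: Stage = (text) => ({
  category: "uncategorized",
  trace: {
    source: "llm",
    model: "example-model",
    rationale: `no rule or neighbour matched "${text}"`,
    confidence: 0.4,
  },
});
```

Calling `categorize("UBER TRIP 1234", [ruleStage, embeddingStage, llmStage])` stops at the rule stage, so the cheaper stages shield the LLM from most traffic, and the returned trace records which stage made the decision.
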
## 3. API Layers

- HTTP API (REST) with versioning (`/api/v1`).
- Service layer for business logic and database transactions.
- Repositories for data access; use migrations for schema evolution.

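
As a rough illustration of how the three layers could fit together, the sketch below wires a handler, a service, and a repository. All names here (`RecordRow`, `RecordService`, `InMemoryRecordRepository`, the route string) are hypothetical, and the repository is an in-memory, synchronous stand-in; a Postgres-backed one would be asynchronous.

```typescript
// Hypothetical domain type; not taken from the project schema.
interface RecordRow {
  id: string;
  tenantId: string;
  status: "INGESTED" | "PROCESSED";
}

// Repository layer: data access only, always scoped by tenant.
interface RecordRepository {
  findById(tenantId: string, id: string): RecordRow | undefined;
  save(row: RecordRow): void;
}

class InMemoryRecordRepository implements RecordRepository {
  private readonly rows = new Map<string, RecordRow>();

  findById(tenantId: string, id: string): RecordRow | undefined {
    const row = this.rows.get(id);
    return row !== undefined && row.tenantId === tenantId ? row : undefined;
  }

  save(row: RecordRow): void {
    this.rows.set(row.id, { ...row });
  }
}

// Service layer: business rules; a database transaction would begin and end here.
class RecordService {
  private readonly repo: RecordRepository;

  constructor(repo: RecordRepository) {
    this.repo = repo;
  }

  markProcessed(tenantId: string, id: string): RecordRow {
    const row = this.repo.findById(tenantId, id);
    if (row === undefined) throw new Error("record not found");
    const updated: RecordRow = { ...row, status: "PROCESSED" };
    this.repo.save(updated);
    return updated;
  }
}

// HTTP layer: translates a request into a service call and a status code.
// The route below is an assumed example under the `/api/v1` prefix.
const API_PREFIX = "/api/v1";
const MARK_PROCESSED_ROUTE = `${API_PREFIX}/records/:id/process`;

function handleMarkProcessed(service: RecordService, tenantId: string, id: string) {
  try {
    return { status: 200, body: service.markProcessed(tenantId, id) };
  } catch {
    return { status: 404, body: { error: "record not found" } };
  }
}
```

Because the service depends only on the `RecordRepository` interface, the storage implementation can change without touching business logic or handlers.
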
## 4. Infrastructure & Ops

- Environments: dev/stage/prod; Docker images; CI/CD.
- Observability: structured logging, metrics, tracing; dead-letter queues for failed jobs.
- Secrets management per environment; rotate webhook/LLM/payment provider secrets.

## 5. Data & Schema Notes

- Records: store raw payload + normalized fields + `reasoning_trace` JSONB (model, rationale, confidence, source).
- EventLog: include `source_agent` (default `balance`) and payload for auditability; support filtering by tenant, time, and type.
- Embeddings: table keyed by record text fields (or other domain signals) to support similarity search; index with `pgvector`.
- Multi-tenant: all core tables carry `tenantId` and every query is scoped by it; each `User` has a role per tenant.

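
The `EventLog` notes above can be sketched in TypeScript as follows. The field names, the `newEvent` and `filterEvents` helpers, and the in-memory array are assumptions for illustration; in the real system the filter would be a tenant-scoped SQL query. Making `tenantId` a required field of the filter is one way to stop unscoped queries from compiling.

```typescript
// Hypothetical TypeScript mirror of an EventLog row; column names are assumptions.
interface EventLogEntry {
  id: string;
  tenantId: string;
  type: string; // e.g. "INGESTED", "PROCESSED"
  sourceAgent: string; // defaults to "balance"
  payload: unknown;
  createdAt: Date;
}

// tenantId is mandatory, so every lookup is tenant-scoped by construction.
interface EventFilter {
  tenantId: string;
  type?: string;
  from?: Date; // inclusive lower bound
  to?: Date; // exclusive upper bound
}

type NewEventInput = Omit<EventLogEntry, "sourceAgent"> & { sourceAgent?: string };

// Apply the documented default for source_agent when the caller omits it.
function newEvent(input: NewEventInput): EventLogEntry {
  return { ...input, sourceAgent: input.sourceAgent ?? "balance" };
}

// In-memory equivalent of a query filtered by tenant, type, and time window.
function filterEvents(events: EventLogEntry[], filter: EventFilter): EventLogEntry[] {
  return events.filter((e) => {
    if (e.tenantId !== filter.tenantId) return false;
    if (filter.type !== undefined && e.type !== filter.type) return false;
    if (filter.from !== undefined && e.createdAt.getTime() < filter.from.getTime()) return false;
    if (filter.to !== undefined && e.createdAt.getTime() >= filter.to.getTime()) return false;
    return true;
  });
}
```
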
## 6. Payment & Messaging (High-Level)

- Payment provider: initiate sessions via backend; handle webhooks idempotently; map provider status to internal billing/subscription states; update tenant access.
- Notifications: optional email/webhook callbacks to surface ingestion/categorization failures; keep PII out of notification payloads.

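
A minimal sketch of idempotent webhook handling with status mapping is shown below. The provider status values, the internal states, and the choice to keep access during a grace period are all assumptions; each provider defines its own vocabulary. The in-memory `Set` stands in for what would more likely be a unique constraint on the provider event ID in Postgres.

```typescript
// Hypothetical provider statuses and internal states; not taken from any specific provider.
type ProviderStatus = "active" | "trialing" | "past_due" | "canceled";
type SubscriptionState = "ACTIVE" | "GRACE" | "INACTIVE";

interface BillingWebhookEvent {
  eventId: string; // provider-assigned ID, used as the idempotency key
  tenantId: string;
  status: ProviderStatus;
}

// Map the provider's vocabulary onto internal subscription states.
function mapProviderStatus(status: ProviderStatus): SubscriptionState {
  switch (status) {
    case "active":
    case "trialing":
      return "ACTIVE";
    case "past_due":
      return "GRACE";
    case "canceled":
      return "INACTIVE";
  }
}

class BillingWebhookHandler {
  private readonly seenEventIds = new Set<string>();
  private readonly tenantStates = new Map<string, SubscriptionState>();

  // Returns true when the event was applied, false when it was a duplicate delivery.
  handle(event: BillingWebhookEvent): boolean {
    if (this.seenEventIds.has(event.eventId)) return false;
    this.seenEventIds.add(event.eventId);
    this.tenantStates.set(event.tenantId, mapProviderStatus(event.status));
    return true;
  }

  // Assumed policy: tenants keep access while ACTIVE or in GRACE.
  hasAccess(tenantId: string): boolean {
    const state = this.tenantStates.get(tenantId);
    return state === "ACTIVE" || state === "GRACE";
  }
}
```

Providers commonly redeliver webhooks, so applying the same event twice must leave tenant state unchanged; keying on the provider's event ID achieves that.
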
## 7. Queues (BullMQ)

- `records:ingest` — normalize webhook payloads, write `Record`, emit `INGESTED`.
- `records:process` — rule engine → embeddings similarity → LLM fallback; emit `PROCESSED` with `reasoning_trace`.
- `reports:generate` — build domain reports/exports, emit `REPORT_GENERATED`.
- Dead-letter queues per stream; retries with backoff; idempotent handlers keyed by external event IDs.

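
The retry, backoff, dead-letter, and idempotency behavior described above can be illustrated without BullMQ itself. The sketch below is plain TypeScript under stated assumptions: `RetryingWorker`, `backoffDelayMs`, the base delay, and the cap are hypothetical names and values, and delays are recorded rather than slept so the example stays synchronous. BullMQ has its own job options for attempts and backoff, which this sketch does not use.

```typescript
interface Job {
  externalEventId: string; // idempotency key from the upstream system
  payload: unknown;
}

interface RunResult {
  outcome: "done" | "duplicate" | "dead-lettered";
  attempts: number;
  delaysMs: number[]; // waits that would occur between attempts
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

class RetryingWorker {
  readonly deadLetter: Job[] = [];
  private readonly processed = new Set<string>();
  private readonly handler: (job: Job) => void;
  private readonly maxAttempts: number;

  constructor(handler: (job: Job) => void, maxAttempts = 3) {
    this.handler = handler;
    this.maxAttempts = maxAttempts;
  }

  run(job: Job): RunResult {
    // Idempotency: skip jobs whose external event ID has already succeeded.
    if (this.processed.has(job.externalEventId)) {
      return { outcome: "duplicate", attempts: 0, delaysMs: [] };
    }
    const delaysMs: number[] = [];
    for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
      try {
        this.handler(job);
        this.processed.add(job.externalEventId);
        return { outcome: "done", attempts: attempt, delaysMs };
      } catch {
        // Record the wait before the next attempt; a real worker would delay here.
        if (attempt < this.maxAttempts) delaysMs.push(backoffDelayMs(attempt));
      }
    }
    // All attempts failed: park the job in the dead-letter list for inspection.
    this.deadLetter.push(job);
    return { outcome: "dead-lettered", attempts: this.maxAttempts, delaysMs };
  }
}
```

A handler that fails twice and then succeeds finishes on the third attempt after waits of 1000 ms and 2000 ms; a handler that always fails ends up in `deadLetter` once `maxAttempts` is exhausted.
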