DataSpear
SQL / Query-to-API control plane with security-by-default and deterministic behavior
Overview
DataSpear is a backend control plane that lets teams publish query-defined APIs safely and deterministically. Instead of building a backend service by hand for every analytics use case, users define SQL-backed endpoints, and DataSpear exposes them as versioned, production-ready REST APIs with guardrails such as read-only enforcement, execution limits, and audit logging.
Core Capabilities
- Automatically publishes arbitrary SQL queries as REST endpoints with schema-aware parameter detection
- Generates OpenAPI-compatible parameter schemas directly from query definitions
- Supports immutable API versioning: each version is identified by a hash of its query specification, and requests execute against the pinned version
- Designed to expose data safely without revealing full schemas or granting raw database access
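
The capabilities above can be illustrated with a minimal sketch. It is not DataSpear's actual implementation: the `:name` placeholder syntax, the function names, and the use of SHA-256 over canonical JSON are all assumptions made for illustration. The regex is a heuristic and does not parse string literals or comments.

```python
import hashlib
import json
import re

# Assumed convention: named parameters are written as :name in the SQL text.
# The negative lookbehind skips PostgreSQL casts such as total::numeric.
PARAM_PATTERN = re.compile(r"(?<!:):([A-Za-z_][A-Za-z0-9_]*)")


def detect_parameters(sql: str) -> list:
    """Return parameter names in first-seen order, without duplicates."""
    seen = {}
    for name in PARAM_PATTERN.findall(sql):
        seen.setdefault(name, None)
    return list(seen)


def build_parameter_schema(sql: str, types: dict = None) -> dict:
    """Build a JSON-Schema-style object describing the query parameters.

    `types` is a hypothetical mapping from parameter name to JSON type;
    parameters without an entry default to "string".
    """
    types = types or {}
    names = detect_parameters(sql)
    return {
        "type": "object",
        "properties": {n: {"type": types.get(n, "string")} for n in names},
        "required": names,
        "additionalProperties": False,
    }


def version_hash(spec: dict) -> str:
    """Hash a query specification so that equal specs share one version id."""
    # Sorted keys and fixed separators make the serialization canonical,
    # so key order in the input dict does not change the hash.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Hashing a canonical serialization means that re-publishing an unchanged specification yields the same version identifier, while any edit to the SQL or parameter types produces a new one.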
Security & Governance
- Built-in authentication and role-based access control (Admin, Editor, Viewer) with strict org isolation
- Read-only enforcement for non-admin users, preventing unsafe or mutating queries by default
- Hard execution safeguards including statement timeouts, row limits, and payload size caps
- Every API request is audit-logged with parameters, execution time, row counts, and error metadata
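
As a rough sketch of how such guardrails might look in code, the example below checks a query against a role and applies row and payload caps. The names (`Limits`, `PolicyViolation`, `check_query`, `encode_payload`), the default limit values, and the choice to truncate rows rather than reject the request are assumptions, not documented DataSpear behavior. The keyword scan is a deliberately conservative heuristic, not a SQL parser.

```python
import json
import re
from dataclasses import dataclass

# Keywords that indicate a mutating or DDL statement (conservative list).
MUTATING_KEYWORDS = {
    "INSERT", "UPDATE", "DELETE", "MERGE", "TRUNCATE", "DROP",
    "ALTER", "CREATE", "GRANT", "REVOKE", "COPY", "INTO",
}


@dataclass(frozen=True)
class Limits:
    statement_timeout_ms: int = 5000      # assumed default
    max_rows: int = 1000                  # assumed default
    max_payload_bytes: int = 1_000_000    # assumed default


class PolicyViolation(Exception):
    """Raised when a query or response breaks a guardrail."""


def is_read_only(sql: str) -> bool:
    """Heuristically decide whether a statement only reads data."""
    # Remove comments, then blank out string literals so that keywords
    # inside them are not counted.
    stripped = re.sub(r"--[^\n]*|/\*.*?\*/", " ", sql, flags=re.S)
    stripped = re.sub(r"'(?:[^']|'')*'", "''", stripped)
    tokens = re.findall(r"[A-Za-z_]+", stripped.upper())
    if not tokens or tokens[0] not in {"SELECT", "WITH"}:
        return False
    return not any(token in MUTATING_KEYWORDS for token in tokens)


def check_query(sql: str, role: str) -> None:
    """Reject mutating queries for every role except admin."""
    if role != "admin" and not is_read_only(sql):
        raise PolicyViolation("non-admin roles may only run read-only queries")


def timeout_statement(limits: Limits) -> str:
    """PostgreSQL statement that applies the timeout to the current transaction."""
    return f"SET LOCAL statement_timeout = {int(limits.statement_timeout_ms)}"


def encode_payload(rows: list, limits: Limits) -> bytes:
    """Truncate to the row limit, then enforce the payload size cap."""
    body = json.dumps(rows[: limits.max_rows]).encode("utf-8")
    if len(body) > limits.max_payload_bytes:
        raise PolicyViolation("response exceeds payload size cap")
    return body
```

A heuristic like this errs on the side of rejecting queries (for example, `SELECT ... FOR UPDATE` is refused). In a real deployment it would typically be paired with database-level protection such as a read-only role or transaction.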
Architecture & Execution Engine
- FastAPI-based control plane with Pydantic schemas for strict input validation
- PostgreSQL serves as the sole durable store for metadata, versions, policies, and audit logs
- Execution engine enforces policies before query execution and isolates metadata from target data sources
- Designed with replayable configuration, deterministic behavior, and migration-safe metadata evolution
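
The ordering described above, with policy checks before execution and an audit entry for every request, could be sketched as follows. `run_request`, `AuditEntry`, and the injected `check_policy` and `execute` callables are hypothetical names; the in-memory list stands in for the PostgreSQL-backed audit log.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class AuditEntry:
    endpoint: str
    params: Dict[str, Any]
    duration_ms: float
    row_count: int
    error: Optional[str]


def run_request(
    endpoint: str,
    sql: str,
    params: Dict[str, Any],
    *,
    check_policy: Callable[[str], None],
    execute: Callable[[str, Dict[str, Any]], List[Dict[str, Any]]],
    audit_log: List[AuditEntry],
) -> List[Dict[str, Any]]:
    """Check policy, then execute, and always record an audit entry."""
    started = time.perf_counter()
    rows: List[Dict[str, Any]] = []
    error: Optional[str] = None
    try:
        # The policy check raises before the target data source is touched.
        check_policy(sql)
        rows = execute(sql, params)
        return rows
    except Exception as exc:
        error = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        # Runs on success and on failure, so no request goes unlogged.
        audit_log.append(
            AuditEntry(
                endpoint=endpoint,
                params=dict(params),
                duration_ms=(time.perf_counter() - started) * 1000.0,
                row_count=len(rows),
                error=error,
            )
        )
```

Passing the policy check and the executor in as callables keeps the control-plane logic separate from any particular target data source, which is one way to read the isolation goal in the list above.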
Extensibility & Roadmap
- Pluggable connector architecture, starting with PostgreSQL and designed to extend to data warehouses and data lakes
- Planned support for Snowflake, BigQuery, Athena, ClickHouse, Elasticsearch, and vector databases
- Future capabilities include deterministic caching, rate limiting, EXPLAIN-based query warnings, and GitOps-style config export
- Long-term vision: any data source → any API, safely
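
One possible shape for a pluggable connector interface is sketched below. The `Connector` protocol, its `execute` signature, the registry functions, and the `InMemoryConnector` test double are all illustrative assumptions; the actual connector API is not described in this document.

```python
from typing import Any, Dict, List, Protocol


class Connector(Protocol):
    """Interface each data source adapter would implement (hypothetical)."""

    name: str

    def execute(
        self,
        sql: str,
        params: Dict[str, Any],
        *,
        timeout_ms: int,
        max_rows: int,
    ) -> List[Dict[str, Any]]:
        ...


class InMemoryConnector:
    """Stand-in connector that returns canned rows, useful for tests."""

    name = "memory"

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self._rows = rows

    def execute(self, sql, params, *, timeout_ms, max_rows):
        # A real connector would send the query to its backend with the
        # timeout applied; this one only honours the row limit.
        return self._rows[:max_rows]


_REGISTRY: Dict[str, Connector] = {}


def register(connector: Connector) -> None:
    """Add a connector under its name, refusing duplicates."""
    if connector.name in _REGISTRY:
        raise ValueError(f"connector {connector.name!r} is already registered")
    _REGISTRY[connector.name] = connector


def get_connector(name: str) -> Connector:
    """Look up a registered connector by name."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise LookupError(f"no connector registered for {name!r}") from None
```

Under this design, adding a new backend would mean writing one class that satisfies the protocol and registering it, without changing the control plane itself.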
Why this project?
- Eliminates repetitive backend development for analytics-driven APIs
- Brings software engineering discipline (versioning, auditability, safety) to data access
- Enables internal analytics teams to ship APIs without becoming backend specialists
- Acts as a foundation for governed data access in AI and LLM-driven systems