Run OpenQlik in your own datacenter
We are the only premium voice AI platform built end-to-end on open-source models, so the entire stack (LLM, TTS, STT, real-time voice, and orchestration) can be deployed inside your perimeter. Run it as SaaS, in a private cloud, or fully air-gapped.
Reference architecture
Every component runs inside your perimeter. Nothing phones home.
Components shipped
Supported environments
From a laptop pilot to a multi-region GPU cluster.
Docker Compose
Single-node deployment for pilots, demos, and small production workloads.
- Single-command bring-up via docker compose up
- Bundled PostgreSQL, Redis, MinIO, vector DB
- GPU passthrough via NVIDIA Container Toolkit
- Suggested GPU: 1× 24 GB (e.g. NVIDIA L4, A10 or RTX 4090) — sized to your model choice
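Before a single-node bring-up, it can help to confirm that the host actually exposes a GPU in the suggested memory class. The sketch below is an illustration and not part of OpenQlik: it parses the CSV output of nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits, which reports memory in MiB. The function names and the min_mib default are assumptions. Cards sold as 24 GB can report somewhat less than 24576 MiB, so the default threshold is deliberately set a little lower; adjust it to your model choice.

```python
import subprocess

# Standard nvidia-smi query; prints one "name, memory_mib" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Turn 'name, memory_mib' lines into a list of (name, mib) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so GPU names containing commas survive.
        name, mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mib.strip())))
    return gpus


def gpus_meeting_minimum(gpus, min_mib=22000):
    """Return names of GPUs with at least min_mib MiB (threshold is an assumption)."""
    return [name for name, mib in gpus if mib >= min_mib]


def query_local_gpus():
    """Run nvidia-smi on this host and parse its output."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

On a host with the NVIDIA driver installed, gpus_meeting_minimum(query_local_gpus()) lists the cards that clear the threshold; an empty list suggests the node is undersized for the pilot tier.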
Kubernetes
Helm charts for HA, autoscaling, and multi-tenant isolation across GPU node pools.
- Official Helm chart with values overrides per environment
- Horizontal autoscaling on inference + control plane
- Works with EKS, AKS, GKE, OpenShift, Rancher, vanilla k8s
- GPU operator + node selectors for mixed CPU/GPU pools
Bare Metal
Offline installer for fully air-gapped, sovereign, or regulated environments.
- Offline tarball with all images, weights, and dependencies
- Systemd-based service supervision
- Datacenter GPUs: NVIDIA H100 / A100 / L40S, AMD MI300 (ROCm) — sizing depends on workload
- Optional HA via keepalived + Patroni for Postgres
Pick a tier based on concurrent voice sessions
GPU recommendations depend on the model tier you choose. These are reference configurations our deployment engineers use; final sizing is confirmed during discovery.
Small (pilot)
Up to ~10 concurrent voice sessions
Mid (production)
Up to ~100 concurrent voice sessions
Large (enterprise)
1,000+ concurrent voice sessions
CPU, RAM and storage scale with the same tier — see the installation checklist for the full bill of materials.
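As a rough illustration of how the tiers above map to a session target, here is a minimal Python sketch. The function name pick_tier is hypothetical, and one boundary is an assumption: the tiers do not say where 101 to 999 sessions fall, so this sketch rounds those up to Large. Final sizing is still confirmed during discovery.

```python
def pick_tier(concurrent_sessions):
    """Map an expected concurrent-session count to a reference tier.

    Thresholds follow the tier descriptions (~10, ~100, 1,000+).
    Assumption: anything above 100 sessions is treated as Large.
    """
    if concurrent_sessions < 0:
        raise ValueError("concurrent_sessions must be non-negative")
    if concurrent_sessions <= 10:
        return "Small (pilot)"
    if concurrent_sessions <= 100:
        return "Mid (production)"
    return "Large (enterprise)"


if __name__ == "__main__":
    for sessions in (8, 100, 250, 1500):
        print(sessions, "->", pick_tier(sessions))
```

Because the published limits are approximate, treat a result near a boundary as a prompt to size up rather than as a hard answer.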
Production-ready in 4 phases
The same checklist our deployment engineers use with banks, telcos, and healthcare customers.
1. Discovery & sizing
- Confirm expected concurrent voice sessions and TTS/STT minutes per month
- Inventory available GPU SKUs and pick a tier from the sizing guide above
- Choose deployment topology: Docker Compose / Kubernetes / bare-metal
- Identify air-gap, sovereignty, or regulatory requirements (HIPAA, GDPR, PDPL)
2. Infrastructure prerequisites
- Linux hosts (Ubuntu 22.04+ / RHEL 9+) with kernel 5.15+
- NVIDIA driver 535+ and Container Toolkit, or AMD ROCm 6+
- Kubernetes 1.28+ (if k8s) with GPU operator and a CSI storage class
- PostgreSQL 15+, Redis 7+, S3-compatible object storage (MinIO supported)
- Internal DNS, TLS certificates, and a load balancer (NGINX/HAProxy/F5)
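The version floors in this phase lend themselves to an automated pre-flight check. The following is a minimal sketch, not an OpenQlik tool: the dictionary keys and function names are hypothetical, and collecting the version strings (from uname -r, nvidia-smi, kubectl version, and so on) is left to the caller. It only shows the comparison logic against the minimums listed above.

```python
import re

# Minimum versions taken from the prerequisites list; key names are illustrative.
MINIMUMS = {
    "kernel": (5, 15),
    "nvidia_driver": (535,),
    "rocm": (6,),
    "kubernetes": (1, 28),
    "postgresql": (15,),
    "redis": (7,),
}


def parse_version(text):
    """Extract leading numeric components, e.g. '5.15.0-91-generic' -> (5, 15, 0)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if not match:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare a version string against a minimum tuple, padding with zeros."""
    parsed = parse_version(found)
    width = max(len(parsed), len(minimum))
    padded_found = parsed + (0,) * (width - len(parsed))
    padded_min = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_found >= padded_min


def check_prerequisites(found_versions):
    """Return {component: bool} for every known component in found_versions."""
    return {
        name: meets_minimum(version, MINIMUMS[name])
        for name, version in found_versions.items()
        if name in MINIMUMS
    }
```

A caller would pass a mapping such as {"kernel": "5.15.0-91-generic", "redis": "7.2.4"} and flag any component that comes back False before scheduling the install.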
3. Identity & security
- Wire SSO via SAML 2.0, OIDC, or LDAP/AD
- Define RBAC roles, workspace boundaries, and per-tenant quotas
- Enable audit log shipping to your SIEM (Splunk, ELK, Sentinel)
- Configure PII redaction policies and data retention windows
- Generate offline license + signing keys for air-gapped environments
4. Install & validate
- Pull or sideload OpenQlik images and open-source model weights
- Run helm install openqlik or the offline installer bundle
- Smoke test: TTS, STT, agent orchestration, real-time voice round-trip
- Load test target concurrency with the bundled k6 scenarios
- Hand off runbooks for upgrades, backups, and incident response
Ready for an on-prem deployment?
Our solutions team will run sizing, draft a topology, and ship a pilot in under 2 weeks.