CATALOGUE DES COMPOSANTS
Le Stack complet
50 composants de production répartis dans 9 catégories — tous open-source, tous éprouvés.
Infrastructure & OS
The bare-metal foundation: immutable OS, container runtime, and cluster orchestration.
Talos Linux
productionMinimal, immutable Linux distribution designed specifically for Kubernetes. No SSH, no shell — managed entirely through API.
Rôle : Node operating system for all 6 cluster nodes (3 control plane, 2 Talos workers, 1 DGX Spark)
containerd
productionIndustry-standard container runtime with low overhead and broad compatibility.
Rôle : Container runtime on all nodes
Kubernetes
productionProduction-grade container orchestration system for automating deployment, scaling, and management.
Rôle : Core orchestration platform running v1.34.1
Networking & Service Mesh
eBPF-powered networking, Gateway API ingress, service mesh, and DNS resolution.
Cilium
productioneBPF-based networking, observability, and security. Replaces kube-proxy with high-performance service load balancing.
Rôle : CNI plugin, network policy enforcement, L2 ARP announcement, Gateway API implementation
Hubble
productionNetwork observability platform built on Cilium eBPF data plane for deep visibility into communication and behavior.
Rôle : Network flow observability, service dependency mapping
Gateway API
productionNext-generation Kubernetes ingress API with expressive routing, TLS termination, and traffic splitting.
Rôle : Single shared gateway handling all HTTP/HTTPS traffic at 192.168.0.200
APISIX
productionHigh-performance, cloud-native API gateway with rich traffic management features.
Rôle : Advanced API gateway for complex routing scenarios
CoreDNS
productionFlexible, extensible DNS server for Kubernetes service discovery.
Rôle : Cluster DNS with wildcard resolution for *.apps.edgeprime.io
Linkerd
deployedUltralight service mesh providing mTLS, observability, and reliability features.
Rôle : Service mesh for zero-trust networking with automatic mTLS
Security & Identity
Zero-trust security: SSO, secrets management, policy enforcement, runtime detection, and certificate automation.
Keycloak
productionEnterprise identity and access management with OIDC, SAML, social login, and LDAP integration.
Rôle : Centralized SSO for all platform services — Vault, Harbor, Grafana, ArgoCD, OneDev, AFFiNE
HashiCorp Vault
productionSecrets management, encryption as a service, and privileged access management.
Rôle : HA deployment (3-replica Raft cluster) storing all platform secrets, DNS credentials, TLS certificates
External Secrets Operator
productionKubernetes operator that synchronizes secrets from external stores into Kubernetes secrets.
Rôle : Bridges Vault ↔ Kubernetes: syncs secrets to pods, pushes certificates back to Vault
cert-manager
productionAutomatic TLS certificate management with Let's Encrypt ACME protocol support.
Rôle : Automated certificate issuance via DNS-01 challenges with Cloudflare
Kyverno
productionKubernetes-native policy engine for validation, mutation, and generation of resources.
Rôle : Enforces security policies: label requirements, container restrictions, cross-tenant isolation
Falco
productionRuntime security monitoring using eBPF probes to detect anomalous container behavior.
Rôle : Real-time threat detection: shell spawning, privilege escalation, sensitive file access
Kubescape
deployedKubernetes security platform for continuous scanning against NSA, MITRE, and CIS benchmarks.
Rôle : Compliance scanning and hardening recommendations
Open AppSec
deployedML-based web application firewall and API security.
Rôle : WAF protection for exposed services
Observability
Full-spectrum observability: metrics, logs, traces, profiles, and cost monitoring in a unified stack.
Prometheus
productionPull-based metrics collection with multi-dimensional data model and powerful PromQL query language.
Rôle : Primary metrics scraping for all platform services via ServiceMonitors
Grafana
productionVisualization platform connecting metrics, logs, traces, and profiles in unified dashboards.
Rôle : Central observability UI with pre-built dashboards for every platform component
Mimir
productionHorizontally-scalable long-term metrics storage with Prometheus-compatible API.
Rôle : Indefinite metrics retention with high compression and fast queries
Loki
productionLog aggregation system inspired by Prometheus — indexes labels, not full log lines.
Rôle : Centralized logging with LogQL queries across all namespaces
Tempo
productionDistributed tracing backend supporting Jaeger, Zipkin, and OpenTelemetry formats.
Rôle : End-to-end request tracing across microservices
Pyroscope
productionContinuous profiling platform for CPU, memory, goroutine, and lock contention analysis.
Rôle : Runtime performance profiling with flame graph visualization
Grafana Alloy
productionUnified telemetry collector replacing Promtail, Grafana Agent, and OpenTelemetry Collector.
Rôle : Single agent collecting metrics, logs, traces, and profiles from all nodes
OpenCost
productionReal-time Kubernetes cost monitoring with per-namespace and per-workload breakdown.
Rôle : Infrastructure cost visibility and optimization recommendations
GitOps & CI/CD
Git-driven deployment pipelines with progressive delivery and infrastructure-as-code.
Argo CD
productionGitOps continuous delivery tool that reconciles desired state from Git with cluster state.
Rôle : Core GitOps engine with App-of-Apps pattern managing 40+ applications
Terraform
productionInfrastructure as Code for provisioning and managing cloud-agnostic resources.
Rôle : Manages Vault secrets, Keycloak OIDC clients, Grafana dashboards, Harbor config
OneDev
productionSelf-hosted Git repository manager with integrated CI/CD pipelines and code review.
Rôle : Private Git hosting with container-based CI runners
Kargo
plannedProgressive delivery engine adding multi-stage promotion workflows on top of Argo CD.
Rôle : Environment promotion pipelines: dev → staging → production
Storage & Registry
Distributed block storage, S3-compatible object storage, and secure container registry.
Longhorn
productionCloud-native distributed block storage with 3-way replication, snapshots, and backups.
Rôle : Primary storage class for all stateful workloads with automatic replication
Harbor
productionEnterprise container registry with vulnerability scanning, image signing, and RBAC.
Rôle : Private registry with Trivy scanning, OIDC auth, and replication policies
Garage
productionS3-compatible distributed object storage designed for self-hosted deployments.
Rôle : Cost-effective object storage for backups, logs, and unstructured data
Velero
deployedKubernetes backup and disaster recovery tool with snapshot and restore capabilities.
Rôle : Cluster-wide backup to S3 with scheduled policies
Databases & Messaging
Managed PostgreSQL, Redis-compatible cache, distributed KV store, Kafka streaming, and multi-model databases.
CloudNativePG
productionKubernetes operator for PostgreSQL with HA clustering, automated failover, and point-in-time recovery.
Rôle : Manages PostgreSQL clusters for 5+ applications (Keycloak, Backstage, Matomo, etc.)
Dragonfly
productionRedis-compatible in-memory data store with superior performance through modern algorithms.
Rôle : High-performance caching layer replacing Redis
Strimzi (Apache Kafka)
productionKubernetes operator for Apache Kafka with native CRD-based management.
Rôle : Event streaming platform for asynchronous communication
TiKV
productionDistributed transactional key-value store with ACID transactions and Raft consensus.
Rôle : Backend storage engine for SurrealDB with strong consistency
SurrealDB
productionMulti-model database supporting document, graph, and key-value data models.
Rôle : Flexible database for applications needing graph + document queries
Qdrant
productionVector database for similarity search, powering semantic search and AI applications.
Rôle : Vector embeddings store for AI/ML workloads
Application Platform
Developer portal, BaaS, workflow automation, analytics, and self-service tools.
Backstage
productionOpen platform for building developer portals with service catalog and self-service templates.
Rôle : Self-service portal for certificate management and tenant onboarding
Supabase
productionOpen-source Firebase alternative: PostgreSQL, auth, real-time, storage, and edge functions.
Rôle : Backend-as-a-Service for rapid application development
n8n
productionSelf-hosted workflow automation with 400+ integrations and visual builder.
Rôle : Event-driven automation for platform operations and notifications
Matomo
productionPrivacy-focused web analytics platform — self-hosted Google Analytics alternative.
Rôle : Visitor tracking without third-party data sharing
Homepage
productionApplication dashboard providing a unified start page for all platform services.
Rôle : Central dashboard linking all 20+ platform services
AFFiNE
productionPrivacy-focused knowledge management workspace — alternative to Notion.
Rôle : Team documentation and knowledge management
KubeVirt
deployedRun virtual machines alongside containers on the same Kubernetes infrastructure.
Rôle : VM workloads for legacy applications that can't be containerized
AI & Machine Learning
Edge AI inference on NVIDIA DGX Spark with Blackwell GPU — LLM model serving via AIBrix and vLLM on bare-metal Kubernetes.
NVIDIA DGX Spark
productionDesktop AI supercomputer powered by Grace Blackwell GB10 Superchip — 1 PFLOP FP4, 128GB unified LPDDR5x memory, ARM64 architecture.
Rôle : Dedicated GPU worker node (gx10) with Blackwell GPU, CUDA 13.0, and ConnectX-7 networking
AIBrix
productionOpen-source Kubernetes-native AI inference platform with prefix-cache-aware routing, LLM-specific autoscaling, and distributed KV cache.
Rôle : LLM model serving control plane — 3-wave ArgoCD deployment with Envoy Gateway routing
vLLM
productionHigh-throughput LLM inference engine with PagedAttention, continuous batching, and OpenAI-compatible API.
Rôle : Inference runtime serving Qwen, Llama, and Mistral models via NVIDIA NGC images on ARM64
NVIDIA GPU Operator
productionKubernetes operator automating GPU driver, container toolkit, device plugin, and DCGM exporter lifecycle.
Rôle : GPU resource management with driver-less mode for DGX OS — exposes nvidia.com/gpu to scheduler