Architecture
How Fungi's components fit together.
System Overview
┌─────────────────────────────────────────────┐
│               gRPC / REST API               │
├─────────────────────────────────────────────┤
│   SQL (DataFusion)   │    DataStream API    │
├─────────────────────────────────────────────┤
│         Streaming Execution Engine          │
│  ┌─────────┐  ┌──────────┐  ┌─────────────┐ │
│  │Operator │  │  State   │  │  Watermark  │ │
│  │  Graph  │  │ Manager  │  │  & Windows  │ │
│  └─────────┘  └──────────┘  └─────────────┘ │
├─────────────────────────────────────────────┤
│    Kafka     │ State Backend │ Checkpoints  │
└─────────────────────────────────────────────┘
Crates
| Crate | Purpose |
|---|---|
| fungi-common | Shared types, errors, config |
| fungi-state | State backends (Memory, RocksDB) |
| fungi-streaming | Watermarks, windows, timers |
| fungi-execution | Operator graph, task executor, Python UDF |
| fungi-checkpoint | Checkpoint coordinator |
| fungi-sql | DataFusion integration |
| fungi-flink | Flink DataStream API |
| fungi-connector | Kafka source/sink |
| fungi-server | gRPC/HTTP server |
| fungi-cli | Command-line interface |
| fungi-cluster | Distributed protocol |
| fungi-scheduler | Job/task scheduling |
| fungi-reliability | DLQ, retries, circuit breaker |
| fungi-security | Auth, RBAC, audit |
| fungi-tenancy | Multi-tenant isolation |
| fungi-dashboard | Leptos WASM dashboard |
| fungi-mcp | MCP server for AI agents |
Data Flow
Source → Deserialize → Timestamp → KeyBy → Window → Aggregate → Serialize → Sink
- Source reads from Kafka / file / API
- Deserialize → Arrow RecordBatch
- Timestamp assigns event time
- KeyBy partitions for parallelism
- Window groups by time (tumbling / sliding / session)
- Aggregate computes (sum / count / avg / custom)
- Serialize → bytes
- Sink writes to Kafka / DB / file
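
The stages above can be sketched end to end in a few lines of plain Rust. This is an illustration only, not Fungi's API: the `Event` struct, the CSV-like line format, and every function name here are assumptions made for the example, and plain structs stand in for Arrow data to keep it dependency-free. It uses a tumbling window and a sum aggregate.

```rust
use std::collections::BTreeMap;

/// A decoded record: the output of the Deserialize and Timestamp stages.
/// (Hypothetical type, not part of Fungi.)
#[derive(Debug, Clone, PartialEq)]
struct Event {
    key: String,
    event_time_ms: u64,
    value: f64,
}

/// Deserialize + Timestamp: parse one "key,event_time_ms,value" line.
/// Returns None for malformed input instead of panicking.
fn deserialize(line: &str) -> Option<Event> {
    let mut parts = line.split(',');
    let key = parts.next()?.trim().to_string();
    let event_time_ms = parts.next()?.trim().parse().ok()?;
    let value = parts.next()?.trim().parse().ok()?;
    Some(Event { key, event_time_ms, value })
}

/// Window: start of the tumbling window containing `ts_ms`.
/// `size_ms` must be non-zero.
fn tumbling_window_start(ts_ms: u64, size_ms: u64) -> u64 {
    ts_ms - ts_ms % size_ms
}

/// KeyBy + Window + Aggregate: sum values per (key, window start).
fn aggregate(events: &[Event], size_ms: u64) -> BTreeMap<(String, u64), f64> {
    let mut sums = BTreeMap::new();
    for e in events {
        let window = tumbling_window_start(e.event_time_ms, size_ms);
        *sums.entry((e.key.clone(), window)).or_insert(0.0) += e.value;
    }
    sums
}

/// Serialize: one "key,window_start,sum" line per result.
fn serialize(sums: &BTreeMap<(String, u64), f64>) -> Vec<String> {
    sums.iter()
        .map(|((key, window), sum)| format!("{key},{window},{sum}"))
        .collect()
}

/// Run every stage over an in-memory source; the returned lines stand in for the sink.
fn run_pipeline(source: &[&str], size_ms: u64) -> Vec<String> {
    let events: Vec<Event> = source.iter().filter_map(|l| deserialize(l)).collect();
    serialize(&aggregate(&events, size_ms))
}

fn main() {
    let source = ["a,1000,1.5", "b,1500,2.0", "a,4000,3.0", "a,6000,4.0", "not a record"];
    for line in run_pipeline(&source, 5_000) {
        println!("{line}");
    }
}
```

With a 5-second window, the two early `a` events land in the window starting at 0 and the third in the window starting at 5000; the malformed line is dropped at the deserialize step.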
All internal data flows use Arrow RecordBatch, a columnar format that is zero-copy and SIMD-friendly.
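
To make "zero-copy" and "SIMD-friendly" concrete, here is a rough sketch of the idea behind a columnar buffer, using only the standard library. It is not the Arrow crate's API: the `Column` type and its methods are hypothetical, and real Arrow arrays carry more (null bitmaps, data types, multiple buffers). The point is that a slice is just a new (offset, length) view onto a shared buffer, and that values sit contiguously in memory.

```rust
use std::sync::Arc;

/// One column of i64 values: a shared buffer plus a view (offset, len) into it.
/// (Hypothetical type for illustration; not an Arrow type.)
#[derive(Clone)]
struct Column {
    buffer: Arc<[i64]>,
    offset: usize,
    len: usize,
}

impl Column {
    /// Build a column from owned values. This is the only step that allocates.
    fn from_vec(values: Vec<i64>) -> Self {
        let len = values.len();
        Column { buffer: Arc::from(values), offset: 0, len }
    }

    /// Zero-copy slice: the new column points into the same buffer.
    fn slice(&self, offset: usize, len: usize) -> Self {
        assert!(offset + len <= self.len, "slice out of bounds");
        Column {
            buffer: Arc::clone(&self.buffer),
            offset: self.offset + offset,
            len,
        }
    }

    /// The values visible through this view, as one contiguous slice.
    fn values(&self) -> &[i64] {
        &self.buffer[self.offset..self.offset + self.len]
    }

    /// A simple loop over contiguous memory, a shape that optimizing
    /// compilers can often auto-vectorize.
    fn sum(&self) -> i64 {
        self.values().iter().sum()
    }

    /// True when both columns are views onto the same allocation.
    fn shares_buffer_with(&self, other: &Column) -> bool {
        Arc::ptr_eq(&self.buffer, &other.buffer)
    }
}

fn main() {
    let col = Column::from_vec(vec![1, 2, 3, 4, 5, 6]);
    let tail = col.slice(2, 3); // views the elements 3, 4, 5
    println!("{:?} sums to {}", tail.values(), tail.sum());
    println!("shared buffer: {}", col.shares_buffer_with(&tail));
}
```

Passing `tail` to another operator moves only a pointer and two integers, which is why handing batches between pipeline stages can avoid copying the underlying data.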
Execution Modes
| Mode | Use Case |
|---|---|
| Embedded | Single process, library API, CLI --local |
| Server | Single node, gRPC API, dashboard |
| Distributed | JobManager + N TaskManagers, K8s |
See Concepts / Distributed for the multi-node model.