# Multi-tenant application patterns
Many SaaS providers and large enterprise platform teams use a single Temporal Namespace with per-tenant Task Queues or Task Queue Fairness to power their multi-tenant applications. These approaches maximize resource efficiency while maintaining logical separation between tenants.
This guide covers architectural patterns, design considerations, and practical examples for building multi-tenant applications with Temporal.
## Architectural principles
When designing a multi-tenant Temporal application, follow these principles:
- Define your tenant model - Determine what constitutes a tenant in your business (customers, pricing tiers, teams, etc.)
- Prefer simplicity - Start with the simplest pattern that meets your needs
- Understand Temporal limits - Design within the constraints of your Temporal deployment
- Test at scale - Performance testing must drive your capacity decisions
- Plan for growth - Consider how you'll onboard new tenants and scale workers
## Architectural patterns
There are four main patterns for multi-tenant applications in Temporal, listed from most to least recommended:
### 1. Task Queues per tenant (recommended)
Use different Task Queues for each tenant's Workflows and Activities.
This is the recommended pattern for most use cases. Each tenant gets dedicated Task Queue(s), with Workers polling multiple tenant Task Queues in a single process.
Pros:
- Strong isolation between tenants
- Efficient resource utilization
- Flexible worker scaling
- Easy to add new tenants
- Can handle thousands of tenants per Namespace
Cons:
- Requires worker configuration management
- Potential for uneven resource distribution
- Need to prevent "noisy neighbor" issues at the worker level
### 2. Single Task Queue with Fairness
Use a single Task Queue with Fairness keys to distribute work across tenants.
This pattern uses Task Queue Fairness to manage multi-tenant workloads within a single Task Queue. Each tenant is assigned a fairness key, and fairness weights control how much of the Task Queue's capacity each tenant receives.
You can also set per-fairness-key rate limits (requests per second) to cap individual tenant throughput, preventing any single tenant from consuming too much capacity.
Pros:
- Priority and Fairness keys and weights can be adjusted without redeployment
- Onboarding new tenants doesn't require spinning up additional Workers
- Simpler Worker topology than per-tenant Task Queues
Cons:
- Fairness is probabilistic and may be harder to debug than strict isolation
- The fairness weight applies at schedule time, not at dispatch time, so it only affects newly-scheduled Tasks
- When using Worker Versioning, Fairness isn't guaranteed between versions
This pattern works well when you have many tenants with different service tiers and want to manage them without the operational overhead of per-tenant Task Queues or Workers.
### 3. Shared Workflow Task Queues, separate Activity Task Queues
Share Workflow Task Queues but use different Activity Task Queues per tenant.
Use this pattern when Workflows are lightweight but Activities have heavy resource requirements or external dependencies that need isolation.
Pros:
- Easier worker management than full isolation
- Activity-level tenant isolation
- Good for compute-intensive Activities
Cons:
- Less isolation than pattern #1
- Workflow visibility is shared
- More complex to reason about
### 4. Namespace per tenant
Use a separate Namespace for each tenant.
Only practical for a smaller number of high-value tenants due to operational overhead. Most teams find this manageable for fewer than 50 tenants, though organizations with strong automation may scale higher. This pattern is not a good fit if you expect a very large number of tenants (10,000+).
Pros:
- Complete isolation between tenants: no noisy neighbor problem
- Each Namespace has its own rate limits that can be provisioned on demand per customer
- Each Namespace can be deployed across multiple regions globally
- Per-Namespace observability is available by default
- Maximum security boundary
Cons:
- Higher operational overhead
- Credential and connectivity management per Namespace
- Requires a new Worker pool deployment for each customer (minimum 2 per Namespace for high availability)
- Not cost-effective at scale
## Pattern comparison
|  | Task Queues per tenant | Fairness-based | Shared Workflow / Separate Activity TQs | Namespace per tenant |
|---|---|---|---|---|
| Isolation | Task Queue level | Probabilistic (weighted) | Activity-level only | Complete |
| Noisy neighbor protection | Strong | Weight-based throttling | Activity-level | Full (separate rate limits) |
| Worker management | Moderate (config per tenant) | Simple (single Task Queue) | Moderate | High (Worker pool per tenant) |
| Onboarding new tenants | Config update and restart | Set fairness and priority values (no new Workers) | Config update and Worker restart | New Namespace and Worker pool |
| Observability | Per-Task Queue metrics | Per-Task Queue metrics | Mixed | Per-Namespace |
| Rate limiting | Shared across Task Queue | Per-key rate limits | Shared across Namespace | Independent per Namespace |
| Scale ceiling | Thousands of tenants | Thousands of tenants | Thousands of tenants | 10,000 (Namespace limit) |
| Best for | Most multi-tenant apps | Tiered SaaS with many tenants | Heavy Activity workloads | High-value, compliance-sensitive tenants |
## Task Queue isolation pattern
This section details the recommended pattern for most multi-tenant applications.
### Worker design
When a Worker starts up:
- Load tenant configuration - Retrieve the list of tenants this Worker should handle (from config file, API, or database)
- Create Task Queues - For each tenant, generate a unique Task Queue name (e.g., `customer-{tenant-id}`)
- Register Workflows and Activities - Register your Workflow and Activity implementations once, passing the tenant-specific Task Queue name
- Poll multiple Task Queues - A single Worker process polls all assigned tenant Task Queues
```go
// Example: a single Go Worker process polling multiple tenant Task Queues
for _, tenant := range assignedTenants {
	taskQueue := fmt.Sprintf("customer-%s", tenant.ID)
	w := worker.New(temporalClient, taskQueue, worker.Options{})
	w.RegisterWorkflow(YourWorkflow)
	w.RegisterActivity(YourActivity)
	// Start polling this tenant's Task Queue in the background.
	if err := w.Start(); err != nil {
		log.Fatalln("unable to start Worker for", taskQueue, err)
	}
}
```
### Routing requests to Task Queues
Your application needs to route Workflow starts and other operations to the correct tenant Task Queue:
```go
// Example: starting a Workflow on a specific tenant's Task Queue
taskQueue := fmt.Sprintf("customer-%s", tenantID)
workflowOptions := client.StartWorkflowOptions{
	ID:        workflowID,
	TaskQueue: taskQueue,
}
run, err := temporalClient.ExecuteWorkflow(ctx, workflowOptions, YourWorkflow, input)
```
Consider creating an API or service that:
- Maps tenant IDs to Task Queue names
- Tracks which Workers are handling which tenants
- Allows both your application and Workers to read the mappings of:
- Tenant IDs to Task Queues
- Workers to tenants
### Capacity planning
Key questions to answer through performance testing:
Namespace capacity:
- How many concurrent Task Queue pollers can your Namespace support?
- What are your Actions Per Second (APS) limits?
- What are your Operations Per Second (OPS) limits?
Worker capacity:
- How many tenants can a single Worker process handle?
- What are the CPU and memory requirements per tenant?
- How many concurrent Workflow executions per tenant?
- How many concurrent Activity executions per tenant?
SDK configuration to tune:
- MaxConcurrentWorkflowTaskExecutionSize
- MaxConcurrentActivityExecutionSize
- MaxConcurrentWorkflowTaskPollers
- MaxConcurrentActivityTaskPollers
- Worker replicas (in Kubernetes deployments)
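These knobs correspond to fields on the Go SDK's worker.Options. The fragment below is a sketch (it assumes `temporalClient` and `taskQueue` are already in scope), and the numbers are illustrative starting points to refine through load testing, not recommendations.

```go
// Illustrative tuning values; derive real numbers from load testing.
opts := worker.Options{
	MaxConcurrentWorkflowTaskExecutionSize: 200,
	MaxConcurrentActivityExecutionSize:     200,
	MaxConcurrentWorkflowTaskPollers:       5,
	MaxConcurrentActivityTaskPollers:       5,
}
w := worker.New(temporalClient, taskQueue, opts)
```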
### Provisioning new tenants
Automate tenant onboarding with a Temporal Workflow:
1. Create a tenant onboarding Workflow that:
   - Validates tenant information
   - Provisions infrastructure
   - Deploys/updates Worker configuration
   - Triggers Worker restarts or scaling
   - Verifies the tenant is operational
2. Store tenant-to-Worker mappings in a database or configuration service
3. Update Worker deployments to pick up new tenant assignments
## Practical example
Scenario: A SaaS company has 1,000 customers and expects to grow to 5,000 customers over 3 years. They have 2 Workflows and ~25 Activities per Workflow. All customers are on the same tier (no segmentation yet).
### Assumptions
| Item | Value |
|---|---|
| Current customers | 1,000 |
| Workflow Task Queues per customer | 1 |
| Activity Task Queues per customer | 1 |
| Max Task Queue pollers per Namespace | 20,000 (per Cloud limits) |
| SDK concurrent Workflow task pollers | 5 |
| SDK concurrent Activity task pollers | 5 |
| Max concurrent Workflow executions | 200 |
| Max concurrent Activity executions | 200 |
### Capacity calculations
Task Queue poller limits:
- Each Worker uses 10 pollers per tenant (5 Workflow + 5 Activity)
- Maximum Workers in Namespace: 20,000 pollers ÷ 10 = 2,000 Workers
Worker capacity:
- Each Worker can theoretically handle 200 Workflows and 200 Activities concurrently
- Conservative estimate: 250 tenants per Worker (accounting for overhead)
- For 1,000 customers: 4 Workers minimum (plus replicas for HA)
- For 5,000 customers: 20 Workers minimum (plus replicas for HA)
Namespace capacity:
- At 250 tenants per Worker, need 2 Workers per group of tenants (for HA)
- Maximum tenants in Namespace: (2,000 Workers ÷ 2) × 250 = 250,000 tenants
These are theoretical calculations based on SDK defaults. Always perform load testing to determine actual capacity for your specific workload. Monitor CPU, memory, and Temporal metrics during testing.
While testing, also pay attention to your metrics capacity and cardinality.
### Worker assignment strategies
Option 1: Static configuration
- Each Worker reads a config file listing assigned tenant IDs
- Simple to implement
- Requires deployment to add tenants
Option 2: Dynamic API
- Workers call an API on startup to get assigned tenants
- Workers identified by static ID (1 to N)
- API returns tenant list based on Worker ID
- More flexible, no deployment needed for new tenants
## Best practices
### Monitoring
Track these metrics per tenant:
- Workflow completion rates
- Activity execution rates
- Task Queue backlog
- Worker resource utilization
- Workflow failure rates
### Handling noisy neighbors
Even with Task Queue isolation, monitor for tenants that:
- Generate excessive load
- Have high failure rates
- Cause Worker resource exhaustion
Strategies:
- Implement per-tenant rate limiting in your application
- Implement fairness keys and apply per-key rate limits
- Move problematic tenants to dedicated Workers
- Use Workflow/Activity timeouts aggressively
### Tenant lifecycle
Plan for:
- Onboarding - Automated provisioning Workflow
- Scaling - When to add new Workers for growing tenants
- Offboarding - Graceful tenant removal and data cleanup
- Rebalancing - Redistributing tenants across Workers
### Search Attributes
Use Search Attributes to enable tenant-scoped queries:
```go
// Add the tenant ID as a Search Attribute when starting the Workflow
searchAttributes := map[string]interface{}{
	"TenantId": tenantID,
}
workflowOptions := client.StartWorkflowOptions{
	ID:               workflowID,
	TaskQueue:        taskQueue,
	SearchAttributes: searchAttributes,
}
```
This allows filtering Workflows by tenant in the UI and SDK with a List Filter such as:
```
TenantId = 'customer-123' AND ExecutionStatus = 'Running'
```