Scalable bot architecture

A bot that scales is not “a bigger script”: it is events → queue → workers → persistence → integrations, with platform limits and observability from day one.

Minimum layers

Mistakes we often fix

The mistakes we fix most often: running Discord and WhatsApp in one deployment without credential isolation, shipping without structured logs, offering no human handoff, and ignoring rate limits. We keep a messaging bots hub and one main service landing page instead of ten thin clones.

1. Diagnosing why bots crash in production

The pattern is almost always the same. A single-file Python or Node script works fine in the demo, and the bot launches with one channel. Then traffic grows, and within a month you start seeing duplicate messages, ghost replies, conversations stuck mid-flow, creeping hosting bills, and the customer support backlog the bot was supposed to remove. The root cause is rarely the messaging library. It is the absence of the architectural primitives that any production system needs: idempotency, queues, persistent state and observability.

Once you cross ~50 active conversations per hour, "bot" stops being a script and starts being a distributed system. Scaling means treating the webhook as untrusted input, the queue as the system's heartbeat, and the database as the only source of truth. Below is the minimum architecture we ship for every bot project, regardless of channel.

2. The eight architectural decisions that matter

3. Reference stack we ship

Layer | Light (≤500 conv/day) | Mid (≤20k conv/day) | Heavy (100k+ conv/day)
Edge / webhook | Single Node/Python service on Render or Railway | Fastify/FastAPI behind a load balancer, 2–4 instances | Edge runtime (Cloudflare Workers) or Kubernetes ingress
Queue | BullMQ + Redis (managed) | RabbitMQ or SQS + DLQ | Kafka or Redis Streams with consumer groups
Database | Managed Postgres (Neon, Supabase) | RDS Postgres + read replica | Postgres + Redis cache + partitioning by tenant
Observability | Logtail / Better Stack | Grafana Cloud + Sentry | Datadog + OpenTelemetry collector
Monthly infra cost | $40–$120 | $300–$900 | $2,500–$8,000

4. Frequently asked questions

Stateless or stateful bot — which one should I build?

Build a stateless bot worker on top of a stateful persistence layer (PostgreSQL, Redis). The webhook handler should be pure (validate, enqueue, ACK in <1s). Conversation state lives in the database keyed by tenant + user. This pattern lets you autoscale workers horizontally without sticky sessions and survives restarts without losing context.
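
As a minimal sketch of this pattern, the worker below keeps nothing in memory between messages and reads and writes conversation state keyed by tenant and user. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so it runs anywhere. The table name, the function names and the two-step order flow are all hypothetical.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the state store (SQLite here, standing in for PostgreSQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conversation_state ("
        "tenant_id TEXT NOT NULL, user_id TEXT NOT NULL, state TEXT NOT NULL, "
        "PRIMARY KEY (tenant_id, user_id))"
    )
    return conn


def load_state(conn, tenant_id, user_id):
    """Return the saved state, or a fresh one for a new conversation."""
    row = conn.execute(
        "SELECT state FROM conversation_state WHERE tenant_id = ? AND user_id = ?",
        (tenant_id, user_id),
    ).fetchone()
    return json.loads(row[0]) if row else {"step": "start"}


def save_state(conn, tenant_id, user_id, state):
    """Upsert the state so any worker instance can pick up the next message."""
    conn.execute(
        "INSERT INTO conversation_state (tenant_id, user_id, state) VALUES (?, ?, ?) "
        "ON CONFLICT (tenant_id, user_id) DO UPDATE SET state = excluded.state",
        (tenant_id, user_id, json.dumps(state)),
    )
    conn.commit()


def handle_message(conn, tenant_id, user_id, text):
    """Stateless worker step: load state, decide the reply, persist the new state."""
    state = load_state(conn, tenant_id, user_id)
    if state["step"] == "awaiting_order":
        reply, new_state = f"Looking up order {text}.", {"step": "start"}
    else:
        reply, new_state = "What is your order number?", {"step": "awaiting_order"}
    save_state(conn, tenant_id, user_id, new_state)
    return reply
```

Because every call starts by loading state from the store, two consecutive messages from the same user can be handled by different worker processes, which is what removes the need for sticky sessions.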

Do I really need a queue between the webhook and my logic?

Yes, once you serve more than a handful of conversations per minute. Telegram, Discord and WhatsApp expect a 200 OK in under a few seconds or they retry — and a retry storm without idempotency duplicates messages. A queue (Redis Streams, RabbitMQ, SQS) decouples ingestion from processing, absorbs traffic spikes and lets you replay failed events safely.
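
The validate, enqueue, ACK flow can be sketched as follows. This is an illustration under stated assumptions, not any platform's real API: the in-process queue.Queue stands in for Redis Streams, RabbitMQ or SQS, the HMAC-SHA256 signature scheme and SHARED_SECRET are hypothetical, and the function returns only an HTTP status code instead of plugging into a web framework.

```python
import hashlib
import hmac
import json
import queue

SHARED_SECRET = b"replace-me"  # hypothetical secret; each platform has its own scheme
events = queue.Queue(maxsize=10_000)  # stand-in for a durable external queue


def sign(body: bytes) -> str:
    """Compute the hex HMAC-SHA256 signature expected for a request body."""
    return hmac.new(SHARED_SECRET, body, hashlib.sha256).hexdigest()


def handle_webhook(body: bytes, signature: str) -> int:
    """Validate, enqueue and return an HTTP status without running business logic."""
    if not hmac.compare_digest(sign(body), signature):
        return 401  # untrusted input: reject anything with a bad signature
    try:
        event = json.loads(body)
    except ValueError:  # covers malformed JSON and invalid UTF-8
        return 400
    try:
        events.put_nowait(event)
    except queue.Full:
        return 503  # ask the platform to retry later instead of dropping the event
    return 200  # ACK immediately; workers do the slow part


def drain_once(process) -> bool:
    """Worker side: take one event off the queue and process it, if any is waiting."""
    try:
        event = events.get_nowait()
    except queue.Empty:
        return False
    process(event)
    events.task_done()
    return True
```

The handler never calls the CRM, the database or an LLM, so its response time stays flat while workers absorb the slow work at their own pace.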

What are the real rate limits per platform?

Telegram: about 30 messages/sec overall, 1 message/sec per chat, ~20 messages/min per group. Discord: 50 requests/sec globally per bot, plus per-route limits reported in the response headers (commonly around 5 messages per 5 s per channel). WhatsApp Business API: tier-based (1K → 100K conversations/day), 80 messages/sec per phone number by default, plus Meta's conversation-based pricing. Always implement exponential backoff and respect the Retry-After header.
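
One possible shape for the retry logic is sketched below: exponential backoff with full jitter, overridden by the server's Retry-After value when one is provided. The RateLimited exception and the send callable are hypothetical placeholders for whatever your HTTP client raises on a 429 response.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by the sender on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds from the Retry-After header, if present


def backoff_delay(attempt, retry_after=None, base=0.5, cap=30.0):
    """Seconds to wait before the next attempt (attempt counts from 0)."""
    if retry_after is not None:
        return float(retry_after)  # the platform's own hint always wins
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)  # full jitter avoids synchronized retries


def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # give up and let the caller route to a dead-letter queue
            sleep(backoff_delay(attempt, exc.retry_after))
```

The sleep parameter is injectable so the function can be tested without real waiting. In a queue-based worker you would more likely re-enqueue the job with a delay than block the worker thread.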

How do I make webhooks idempotent?

Persist a hash of (platform, update_id or message_id, tenant_id) before processing. If the key already exists, ACK and skip. Use a unique constraint inside a database transaction, or Redis SET with the NX and EX options (SETNX on its own does not set a TTL). Idempotency turns network retries from a duplicate-message bug into a no-op, which is essential because WhatsApp and Stripe redeliver the same webhook several times when it is not acknowledged.
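
A minimal sketch of the claim-before-processing step, assuming a relational store with a unique key. SQLite stands in for the production database here, and the table and function names are hypothetical. The same idea maps to a Redis SET with NX and an expiry.

```python
import hashlib
import json
import sqlite3


def open_dedup_store(path=":memory:"):
    """Create the table whose primary key enforces one row per event."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events (event_key TEXT PRIMARY KEY)"
    )
    return conn


def event_key(platform, event_id, tenant_id):
    """Hash (platform, event id, tenant) into a fixed-length idempotency key."""
    # JSON encoding keeps field boundaries unambiguous before hashing.
    raw = json.dumps([platform, str(event_id), tenant_id]).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def claim_event(conn, platform, event_id, tenant_id):
    """Return True if this is the first time the event is seen, False on a retry."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO processed_events (event_key) VALUES (?)",
        (event_key(platform, event_id, tenant_id),),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows inserted means the key already existed
```

A worker would call claim_event first and skip processing when it returns False. The insert is atomic, so two workers racing on the same retry cannot both win.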

What does observability look like for a serious bot?

Structured logs with correlation IDs per conversation; metrics for queue depth, worker latency p50/p95/p99, error rate per platform, rate-limit hits, and human-handoff conversion; tracing across webhook → queue → worker → CRM call (OpenTelemetry). Alert on p95 latency, dead-letter queue size and failed deliveries — not on raw CPU.
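
To make two of these items concrete, here is a small sketch of a JSON log line carrying a correlation ID and a nearest-rank percentile over latency samples. The field names and helper functions are assumptions for illustration. In production the percentiles would normally come from your metrics backend, not from application code.

```python
import json
import math
import time


def log_event(stage, correlation_id, **fields):
    """Emit one JSON log line; the correlation ID ties all stages of a conversation together."""
    record = {
        "ts": time.time(),
        "stage": stage,
        "correlation_id": correlation_id,
        **fields,
    }
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


def percentile(samples, pct):
    """Nearest-rank percentile (pct in 1..100) of a non-empty list of numbers."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Logging the same correlation_id at the webhook, queue and worker stages lets you reconstruct a single conversation's path with one search query, and a rising p95 from percentile-style metrics is usually the earliest sign that workers are falling behind.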

Need a bot that scales past the demo?

We design and operate production bot infrastructure for Discord, Telegram and WhatsApp Business API — with queues, idempotency, CRM integrations and observability built in from day one. Tell us about your channel mix and traffic, and we will reply with a phased plan and fixed-price range within 24 hours.

Last updated: June 2026 · Written by the RoviDev studio.
