Why observability in Odoo makes the difference
Most Odoo deployments are managed reactively: someone calls saying Odoo is slow, you open an SSH session, look at the log with tail -f, and try to figure out what happened. This approach has a fundamental problem: by the time you find out about the issue, the damage is already done. A query that takes 45 seconds triggers no alarm until the server becomes saturated. A worker that silently dies leaves users without service and nobody notices.
Proactive observability — centralising logs, measuring metrics, setting thresholds and receiving alerts before the user complains — is the difference between a managed system and one administered by crisis. This guide describes the architecture we have implemented in production for several clients, including Rehabmedic, where the combination of ELK Stack and Telegram alerts allowed us to detect and resolve incidents before they impacted the business.
Observability architecture: overview
┌─────────────────────────────────────────────────────┐
│ ODOO SERVER                                         │
│   /var/log/odoo/odoo.log                            │
│   /var/log/postgresql/postgresql.log                │
│   system metrics (CPU, memory, disk)                │
│         │                                           │
│   ┌─────▼──────┐                                    │
│   │  Filebeat  │  (lightweight agent, no logic)     │
│   └─────┬──────┘                                    │
└─────────│───────────────────────────────────────────┘
          │ TCP/TLS :5044
┌─────────▼───────────────────────────────────────────┐
│ ELK SERVER                                          │
│                                                     │
│ ┌──────────┐    ┌──────────────┐    ┌───────────┐   │
│ │ Logstash │───▶│Elasticsearch │───▶│  Kibana   │   │
│ │ (parsing │    │ (storage +   │    │(dashboards│   │
│ │ + enrich)│    │  search)     │    │ + alerts) │   │
│ └──────────┘    └──────┬───────┘    └───────────┘   │
└────────────────────────│────────────────────────────┘
                         │ Watcher / ElastAlert
                 ┌───────▼──────────┐
                 │  Telegram bot    │
                 │  (HTTP alerts)   │
                 └──────────────────┘
The components are:
- Filebeat: agent installed on the Odoo server. Reads log files and forwards them to Logstash with minimal resource usage.
- Logstash: processing pipeline. Parses Odoo logs (proprietary format), extracts structured fields, enriches with metadata and normalises.
- Elasticsearch: search and analytics database where all indexed events are stored.
- Kibana: web interface for exploration, dashboards and alert configuration (Watcher or Kibana Alerting).
- ElastAlert / Python script: alerting engine that evaluates conditions against Elasticsearch and fires notifications to Telegram.
What data to collect from Odoo
Odoo generates several event streams that should be monitored separately:
1. Odoo application log (/var/log/odoo/odoo.log)
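This is the primary source. Odoo's log lines look like this (simplified: a stock install also prints the database name, or ? when there is none, between the level and the logger):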
2026-05-31 08:42:17,123 12345 INFO odoo.http: HTTP GET /web/dataset/call_kw 200 0.045s
2026-05-31 08:42:18,456 12346 WARNING odoo.addons.sale.models.order: Order SO-1234 warning: ...
2026-05-31 08:42:19,789 12347 ERROR odoo.sql_db: bad query: ...
Fields to extract: timestamp, PID, level (INFO/WARNING/ERROR/CRITICAL), logger (module), message, URL (if HTTP request), response time, HTTP status code.
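As a rough illustration of the field extraction described above, here is a minimal Python sketch that parses the sample lines with regular expressions. The function name parse_odoo_line and the output keys are hypothetical, and the HTTP pattern only mirrors the shape of the sample line shown here; real werkzeug access lines look different and would need their own pattern. The optional db group is there because stock Odoo prints the database name between the level and the logger.

```python
import re
from typing import Optional

# Timestamp, PID, level, optional database name, logger, message.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<pid>\d+) "
    r"(?P<level>[A-Z]+) "
    r"(?:(?P<db>[^\s:]+) )?"      # absent in the simplified sample lines
    r"(?P<logger>\S+?): "
    r"(?P<message>.*)$",
    re.DOTALL,                    # keep multi-line tracebacks in the message
)

# Shape of the HTTP sample line: "HTTP GET /path 200 0.045s" (assumed format).
HTTP_RE = re.compile(
    r"^HTTP (?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3}) "
    r"(?P<seconds>\d+(?:\.\d+)?)s$"
)


def parse_odoo_line(line: str) -> Optional[dict]:
    """Return the structured fields of one log event, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["pid"] = int(event["pid"])
    http = HTTP_RE.match(event["message"])
    if http and event["logger"] == "odoo.http":
        event["http_method"] = http["method"]
        event["request_path"] = http["path"]
        event["http_status"] = int(http["status"])
        # Milliseconds are easier to threshold than fractional seconds.
        event["response_time_ms"] = round(float(http["seconds"]) * 1000)
    return event
```

Lines that do not start with a timestamp (for example a stray traceback line) return None, which is the same situation the Logstash pipeline tags as a parse failure.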
2. Slow queries in PostgreSQL
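Set log_min_duration_statement = 1000 in postgresql.conf to log every query that takes longer than 1 second. These entries land in the PostgreSQL server log under /var/log/postgresql/ (the exact file name depends on the distribution and version, for example postgresql-16-main.log) and are critical for detecting database bottlenecks.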
3. Odoo workers and processes
In multi-worker mode, Odoo spawns child processes. Monitor how many workers are active, how many are idle vs busy, and whether any restart abnormally.
4. Cron jobs
Odoo's scheduled jobs can fail silently. Detect errors in the log matching the pattern cron or ir.cron in the logger.
5. System metrics
CPU, memory, disk usage, active network connections. Metricbeat (part of the Elastic stack) or node_exporter + Prometheus are good complementary options.
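Where installing Metricbeat or node_exporter is not yet an option, a few of these numbers can be read with the Python standard library alone. The sketch below is only a stopgap under that assumption; the function names are hypothetical and the 85 % default is borrowed from the disk rule in the alert table of this guide.

```python
import os
import shutil

DISK_ALERT_PERCENT = 85.0  # assumption: same threshold as the "Disk > 85 %" rule


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def load_per_cpu() -> float:
    """One-minute load average divided by the CPU count (Unix only)."""
    one_minute, _, _ = os.getloadavg()
    return one_minute / (os.cpu_count() or 1)


def disk_alert(path: str = "/", threshold: float = DISK_ALERT_PERCENT) -> bool:
    """True when disk usage for `path` is above the threshold."""
    return disk_used_percent(path) > threshold
```

A load_per_cpu() value that stays above 1.0 for several minutes is a common rule of thumb for a saturated host, but the right threshold depends on the workload.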
Configuring Filebeat on the Odoo server
# /etc/filebeat/filebeat.yml
# Note: recent Filebeat releases deprecate the log input in favour of
# filestream, where multiline is configured under parsers.
filebeat.inputs:
  - type: log
    id: odoo-application
    enabled: true
    paths:
      - /var/log/odoo/odoo.log
    fields:
      service: odoo
      environment: production
    fields_under_root: true
    # Python tracebacks span several lines; group them into a single event
    multiline.type: pattern
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'
    multiline.negate: true
    multiline.match: after

  - type: log
    id: postgresql
    enabled: true
    paths:
      - /var/log/postgresql/postgresql-16-main.log
    fields:
      service: postgresql
      environment: production
    fields_under_root: true
    multiline.type: pattern
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'
    multiline.negate: true
    multiline.match: after

output.logstash:
  hosts: ["10.0.2.10:5044"]
  ssl.certificate_authorities: ["/etc/filebeat/certs/ca.crt"]
  ssl.certificate: "/etc/filebeat/certs/filebeat.crt"
  ssl.key: "/etc/filebeat/certs/filebeat.key"

logging.level: warning
logging.to_files: true
logging.files:
  path: /var/log/filebeat
The multiline block is essential: Python tracebacks span multiple lines, and without grouping each traceback line is indexed as a separate event, making search impossible.
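To make the effect of pattern, negate: true and match: after concrete, the following Python sketch imitates the grouping rule: a line that matches the date pattern opens a new event, and every line that does not match is appended to the event before it. This is an illustration of the rule only, not Filebeat's implementation, and group_multiline is a hypothetical name.

```python
import re
from typing import Iterable, List

# Same expression as multiline.pattern in filebeat.yml.
START_OF_EVENT = re.compile(r"^\d{4}-\d{2}-\d{2}")


def group_multiline(lines: Iterable[str]) -> List[str]:
    """Merge continuation lines into the event opened by the last dated line."""
    events: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if START_OF_EVENT.match(line) or not events:
            events.append(line)           # a timestamp opens a new event
        else:
            events[-1] += "\n" + line     # anything else continues the previous one
    return events
```

Fed an ERROR line followed by a three-line traceback and then an INFO line, the function returns two events instead of five, which is what makes the traceback searchable as a unit.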
Logstash pipeline: parsing the Odoo log format
# /etc/logstash/conf.d/odoo.conf
input {
  beats {
    port => 5044
    ssl => true
    ssl_certificate => "/etc/logstash/certs/logstash.crt"
    ssl_key => "/etc/logstash/certs/logstash.key"
    ssl_certificate_authorities => ["/etc/logstash/certs/ca.crt"]
  }
}

filter {
  if [service] == "odoo" {
    # The optional db_name group covers stock Odoo, which prints the database
    # name (or ?) between the level and the logger.
    grok {
      match => {
        "message" => "%{TIMESTAMP_ISO8601:odoo_timestamp} %{NUMBER:pid:int} %{LOGLEVEL:log_level} (?:(?<db_name>[^\s:]+) )?%{NOTSPACE:logger}: %{GREEDYDATA:log_message}"
      }
      tag_on_failure => ["_grokparsefailure_odoo"]
    }

    # Parse HTTP lines that carry a response time
    if [logger] == "odoo.http" {
      grok {
        match => {
          "log_message" => "HTTP %{WORD:http_method} %{URIPATH:request_path} %{NUMBER:http_status:int} %{NUMBER:response_time_s:float}s"
        }
        tag_on_failure => ["_grok_http_failure"]
      }
      # Convert the time to ms to make alert thresholds easier to express
      if [response_time_s] {
        ruby {
          code => "event.set('response_time_ms', (event.get('response_time_s').to_f * 1000).round)"
        }
      }
    }

    date {
      match => ["odoo_timestamp", "yyyy-MM-dd HH:mm:ss,SSS"]
      target => "@timestamp"
      timezone => "Europe/Madrid"
    }

    # Tag slow or failing queries reported in the Odoo log
    if [log_message] =~ /slow query/ or [log_message] =~ /bad query/ {
      mutate { add_tag => ["slow_query"] }
    }

    # Classify business severity
    if [log_level] in ["ERROR", "CRITICAL"] {
      mutate { add_field => { "alert_severity" => "high" } }
    } else if [log_level] == "WARNING" {
      mutate { add_field => { "alert_severity" => "medium" } }
    }
  }

  if [service] == "postgresql" {
    # PostgreSQL severities include LOG, STATEMENT and DETAIL, which the
    # LOGLEVEL pattern does not match, so the level is captured with WORD.
    grok {
      match => {
        "message" => "%{TIMESTAMP_ISO8601:pg_timestamp} %{WORD:pg_tz} \[%{NUMBER:pg_pid:int}\] %{WORD:pg_user}@%{WORD:pg_db} %{WORD:log_level}: %{GREEDYDATA:log_message}"
      }
      tag_on_failure => ["_grokparsefailure_pg"]
    }

    if [log_message] =~ /duration:/ {
      grok {
        match => { "log_message" => "duration: %{NUMBER:pg_query_duration_ms:float} ms" }
      }
      if [pg_query_duration_ms] and [pg_query_duration_ms] > 5000 {
        mutate { add_tag => ["slow_query", "pg_slow_query"] }
      }
    }
  }

  mutate {
    remove_field => ["agent", "ecs", "input", "log"]
  }
}

output {
  elasticsearch {
    hosts => ["https://10.0.2.10:9200"]
    index => "odoo-logs-%{+YYYY.MM.dd}"
    user => "logstash_writer"
    password => "<LOGSTASH_PASSWORD>"
    ssl_certificate_verification => true
    cacert => "/etc/logstash/certs/ca.crt"
  }
}
Kibana dashboards: what to visualise
Once the logs are in Elasticsearch, Kibana allows you to build operational dashboards. These are the most useful panels for Odoo operations:
Dashboard 1: General status (on-call view)
- Error count by level in the last 24 h (ERROR, CRITICAL, WARNING).
- Time-series trend of errors and warnings (bar chart by hour).
- Top 10 loggers with the most errors (identify the problematic module).
- Average HTTP response time and 95th percentile (p95 > 3s is an alert signal).
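The p95 panel can also be reproduced outside Kibana with a percentiles aggregation. The sketch below builds the request body and evaluates the response; it assumes the response_time_ms field produced by the Logstash pipeline in this guide, and the function names, the aggregation name response_time and the now-5m default window are illustrative choices rather than anything prescribed by Elasticsearch.

```python
from typing import Any, Dict

P95_ALERT_MS = 3000  # "p95 > 3s is an alert signal"


def p95_request_body(window: str = "now-5m") -> Dict[str, Any]:
    """Search body asking for the median and p95 of HTTP response times."""
    return {
        "size": 0,  # only the aggregation is needed, not the documents
        "query": {
            "bool": {
                "filter": [
                    {"exists": {"field": "response_time_ms"}},
                    {"range": {"@timestamp": {"gte": window}}},
                ]
            }
        },
        "aggs": {
            "response_time": {
                "percentiles": {"field": "response_time_ms", "percents": [50, 95]}
            }
        },
    }


def p95_breached(response: Dict[str, Any], threshold_ms: float = P95_ALERT_MS) -> bool:
    """True when the p95 in a search response exceeds the threshold."""
    values = response["aggregations"]["response_time"]["values"]
    p95 = values.get("95.0")  # null (None) when no document matched
    return p95 is not None and p95 > threshold_ms
```

The body can be sent with any Elasticsearch client against the odoo-logs-* indices; keeping the evaluation in a separate function makes the threshold easy to test without a cluster.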
Dashboard 2: Database performance
- Slow queries per hour (PG queries > 1s, > 5s, > 30s).
- Top 20 slowest queries with their truncated SQL text.
- Users/sessions generating the most load.
Dashboard 3: Workers and process health
- Worker restarts (pattern: process with a PID that disappears and a new one appears).
- Cron errors (filter by logger: ir.cron and log_level: ERROR).
- Longpolling: active connections (gevent metric).
Dashboards are exported as NDJSON objects and imported into any Kibana instance with one click, making replication to staging environments straightforward.
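The worker-restart panel in Dashboard 3 relies on PID churn: a PID seen in one time window vanishes and a new one shows up in the next. A minimal sketch of that comparison follows, assuming the set of PIDs per window has already been obtained (for example from a terms aggregation on the pid field); both function names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def pid_churn(previous: Iterable[int], current: Iterable[int]) -> Tuple[Set[int], Set[int]]:
    """Return (disappeared, appeared) PIDs between two consecutive windows."""
    prev, cur = set(previous), set(current)
    return prev - cur, cur - prev


def looks_like_restart(previous: Iterable[int], current: Iterable[int]) -> bool:
    """Heuristic: at least one PID vanished and at least one new PID appeared."""
    gone, new = pid_churn(previous, current)
    return bool(gone) and bool(new)
```

This is a heuristic, not proof of a crash: an idle worker may simply log nothing during a window, and Odoo also recycles workers on purpose when they hit their request or memory limits, so the signal is best read together with the error panels.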
Proactive alerts via Telegram bot
Telegram alerts are the layer that turns passive observability into active observability: the team receives an instant message when a threshold is exceeded, without having to watch Kibana.
Creating the Telegram bot
- Search for @BotFather on Telegram and run /newbot.
- Save the token (BOT_TOKEN).
- Add the bot to the alerts channel or group, post a message there, and obtain the CHAT_ID with: curl https://api.telegram.org/bot<BOT_TOKEN>/getUpdates.
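The getUpdates response is JSON, and the chat ID sits a few levels deep. As a convenience, the sketch below pulls every chat ID out of an already-decoded response. The helper name extract_chats is hypothetical, and it only looks at the message, channel_post and my_chat_member update types, which is an assumption about which updates the bot will have received.

```python
from typing import Any, Dict


def extract_chats(payload: Dict[str, Any]) -> Dict[int, str]:
    """Map chat ID -> display name for every chat found in a getUpdates payload."""
    chats: Dict[int, str] = {}
    for update in payload.get("result", []):
        # Group messages arrive under "message", channel posts under
        # "channel_post", and membership changes under "my_chat_member".
        for key in ("message", "channel_post", "my_chat_member"):
            chat = update.get(key, {}).get("chat")
            if chat:
                label = chat.get("title") or chat.get("username") or chat.get("type", "")
                chats[chat["id"]] = label
    return chats
```

Group and channel IDs are negative numbers; the value to store as TELEGRAM_CHAT_ID is the key of the entry whose title matches the alerts group.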
Python alerting script (lightweight ElastAlert alternative)
For small or medium environments, a Python script run via cron every minute is simpler and more transparent than full ElastAlert:
#!/usr/bin/env python3
# /opt/odoo-monitor/alert_odoo.py
"""Odoo -> Telegram alert monitor.

Run every minute via cron:
* * * * * /opt/odoo-monitor/venv/bin/python /opt/odoo-monitor/alert_odoo.py
"""
import os
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

import requests
from elasticsearch import Elasticsearch

ES_HOST = os.environ["ES_HOST"]  # https://10.0.2.10:9200
ES_USER = os.environ["ES_USER"]
ES_PASS = os.environ["ES_PASS"]
BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]
INDEX = "odoo-logs-*"
LOCAL_TZ = ZoneInfo("Europe/Madrid")

es = Elasticsearch(ES_HOST, basic_auth=(ES_USER, ES_PASS), verify_certs=True)


def send_telegram(message: str) -> None:
    """Send a Markdown-formatted message to the alerts chat."""
    url = f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage"
    resp = requests.post(url, json={
        "chat_id": CHAT_ID,
        "text": message,
        "parse_mode": "Markdown"
    }, timeout=10)
    # Fail loudly so a rejected message shows up in the cron output
    resp.raise_for_status()


def count_errors_last_minute() -> int:
    """Count ERROR and CRITICAL events indexed during the last minute."""
    now = datetime.now(timezone.utc)
    one_min_ago = now - timedelta(minutes=1)
    resp = es.count(index=INDEX, body={
        "query": {
            "bool": {
                "must": [
                    {"terms": {"log_level.keyword": ["ERROR", "CRITICAL"]}},
                    {"range": {"@timestamp": {"gte": one_min_ago.isoformat(), "lte": now.isoformat()}}}
                ]
            }
        }
    })
    return resp["count"]


def get_slow_queries_last_minute() -> list:
    """Return up to five events tagged slow_query, slowest first."""
    now = datetime.now(timezone.utc)
    one_min_ago = now - timedelta(minutes=1)
    resp = es.search(index=INDEX, body={
        "size": 5,
        "query": {
            "bool": {
                "must": [
                    {"term": {"tags": "slow_query"}},
                    {"range": {"@timestamp": {"gte": one_min_ago.isoformat()}}}
                ]
            }
        },
        # unmapped_type keeps the sort valid on indices that have no PostgreSQL events yet
        "sort": [{"pg_query_duration_ms": {"order": "desc", "unmapped_type": "float"}}],
        "_source": ["pg_query_duration_ms", "log_message", "@timestamp"]
    })
    return [h["_source"] for h in resp["hits"]["hits"]]


def main():
    # Alert 1: too many errors in the last minute
    error_count = count_errors_last_minute()
    if error_count >= 5:
        stamp = datetime.now(LOCAL_TZ).strftime("%Y-%m-%d %H:%M:%S")
        msg = (
            f"*\u26a0\ufe0f ODOO ALERT: ERRORS IN PRODUCTION*\n"
            f"*{error_count} errors* detected in the last minute.\n"
            f"Check Kibana: https://kibana.skanndar.internal/app/dashboards\n"
            f"`{stamp} Europe/Madrid`"
        )
        send_telegram(msg)

    # Alert 2: slow queries (> 5 s)
    slow_queries = get_slow_queries_last_minute()
    if slow_queries:
        top = slow_queries[0]
        duration_s = top.get("pg_query_duration_ms", 0) / 1000
        # Backticks would break the Markdown code span, so replace them
        snippet = top.get("log_message", "")[:120].replace("`", "'")
        msg = (
            f"*\U0001F422 SLOW QUERY IN POSTGRESQL*\n"
            f"Duration: *{duration_s:.1f}s*\n"
            f"`{snippet}...`"
        )
        send_telegram(msg)


if __name__ == "__main__":
    main()
Save the script to /opt/odoo-monitor/alert_odoo.py, create a virtual environment in /opt/odoo-monitor/venv, install the dependencies into it with pip install elasticsearch requests, and add the job to the system crontab:
# /etc/cron.d/odoo-monitor
* * * * * odoomonitor /opt/odoo-monitor/venv/bin/python /opt/odoo-monitor/alert_odoo.py
Environment variables are managed via a .env file loaded by the cron wrapper or by systemd if a service is preferred:
ES_HOST=https://10.0.2.10:9200
ES_USER=alert_reader
ES_PASS=<PASSWORD>
TELEGRAM_BOT_TOKEN=<TOKEN>
TELEGRAM_CHAT_ID=<CHAT_ID>
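If neither a cron wrapper nor systemd is used, the script could read the .env file itself before touching os.environ. A minimal sketch follows, assuming plain KEY=VALUE lines with optional quotes and # comments; parse_env and load_env_file are hypothetical helpers, not part of the script above.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks, comments and malformed lines."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")   # split on the first "=" only
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str, override: bool = False) -> Dict[str, str]:
    """Load a .env file into os.environ; existing variables win unless override=True."""
    values = parse_env(Path(path).read_text(encoding="utf-8"))
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Whichever loading mechanism is chosen, the file holds credentials, so it should be readable only by the account that runs the monitor (for example mode 600 owned by odoomonitor).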
Recommended alert types for Odoo
| Condition | Threshold | Severity | Action |
|---|---|---|---|
| CRITICAL errors in 1 min | ≥ 1 | Critical | Immediate Telegram + PagerDuty |
| ERROR errors in 1 min | ≥ 5 | High | Immediate Telegram |
| PG query > 30 s | Any | High | Telegram with SQL snippet |
| HTTP response time p95 > 5 s | 3 min sustained | High | Telegram |
| Worker restarted | Any | Medium | Telegram |
| Cron job failure | ≥ 2 in 10 min | Medium | Telegram |
| PG query > 5 s | ≥ 10 in 5 min | Medium | Telegram (digest every 15 min) |
| Disk > 85 % | Any | Medium | Telegram |
| No Odoo logs for 5 min | Absence of events | Critical | Telegram (Odoo down) |
The last rule — alerting when logs do not arrive — is especially valuable: it detects when Odoo or Filebeat has crashed without generating any explicit error.
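The absence rule comes down to comparing the timestamp of the newest indexed event with the current time. The sketch below shows that comparison in isolation; it assumes the newest @timestamp has already been fetched (for example with a size-1 search sorted by @timestamp descending), and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SILENCE_THRESHOLD = timedelta(minutes=5)  # "No Odoo logs for 5 min"


def parse_es_timestamp(value: str) -> datetime:
    """Convert an Elasticsearch @timestamp such as 2026-05-31T06:42:17.123Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_silent(last_event: Optional[datetime], now: datetime,
              threshold: timedelta = SILENCE_THRESHOLD) -> bool:
    """True when no event exists or the newest one is older than the threshold."""
    if last_event is None:
        return True
    return now - last_event > threshold
```

Because Filebeat and Logstash add some delay, a threshold much shorter than a few minutes tends to produce false alarms during normal ingestion lag.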
Odoo observability best practices
Retention and storage costs
Odoo production logs can generate between 500 MB and 5 GB per day depending on the logging level. Define a retention policy (ILM in Elasticsearch) with three phases: hot (7 days, SSD), warm (30 days, HDD), cold (90 days, compressed or S3). For medium-sized installations, 3 months of retention fits in under 100 GB.
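One way to express the three phases is as an ILM policy document. The sketch below only builds the JSON body; it reads the 7, 30 and 90 day figures as index ages at which an index leaves hot, leaves warm and is deleted, which is an interpretation rather than something the text spells out. Placement on SSD or HDD depends on how the cluster's data tiers are set up and is not part of the policy body. The function name and the priority values are illustrative.

```python
import json
from typing import Any, Dict


def build_ilm_policy(warm_after: str = "7d", cold_after: str = "30d",
                     delete_after: str = "90d") -> Dict[str, Any]:
    """Build an ILM policy body with hot, warm, cold and delete phases."""
    return {
        "policy": {
            "phases": {
                "hot": {"min_age": "0ms", "actions": {"set_priority": {"priority": 100}}},
                "warm": {"min_age": warm_after, "actions": {"set_priority": {"priority": 50}}},
                "cold": {"min_age": cold_after, "actions": {"set_priority": {"priority": 0}}},
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


def policy_as_json(policy: Dict[str, Any]) -> str:
    """Serialise the policy for a PUT _ilm/policy/<name> request."""
    return json.dumps(policy, indent=2)
```

The resulting document still has to be attached to the odoo-logs-* indices through an index template (index.lifecycle.name) before it has any effect.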
Do not log at DEBUG level in production
The log_level = debug setting in odoo.conf generates 10–50× more data volume and includes sensitive information (field values, tokens). Use warn or info in production. Enable debug only temporarily and only on test databases.
Rotating Odoo logs with logrotate
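# /etc/logrotate.d/odoo
# Note: recent Odoo versions write through a WatchedFileHandler (which reopens
# the file after rotation) and treat SIGHUP as a restart request. Check how
# your version behaves; if that applies, the postrotate block can be dropped.
/var/log/odoo/odoo.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    postrotate
        /bin/kill -HUP $(cat /var/run/odoo/odoo.pid 2>/dev/null) 2>/dev/null || true
    endscript
}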
Separating business metrics from system metrics
ELK is ideal for logs and full-text search. For time-series metrics (CPU, PG connections, average response time), Prometheus + Grafana scales better and consumes fewer resources. A mature architecture combines both: ELK for logs and forensic analysis, Prometheus/Grafana for metrics and performance alerts.
Grouped alerts, not individual ones
If Odoo has a bug that generates 1,000 errors in a minute, you don't want to receive 1,000 Telegram messages. The example script groups them: it sends a single message with the count. For more sophisticated alerts, ElastAlert supports frequency, spike, flatline and cardinality as rule types, allowing complex patterns without writing code.
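Counting already collapses a burst into one message per run, but a condition that persists would still produce one message every minute. A cooldown per alert type is a simple complement. The sketch below keeps the last send time in a dictionary; the 15-minute default and the names are assumptions, and in a cron-driven script the dictionary would have to be persisted between runs (for example as a small JSON file).

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

COOLDOWN = timedelta(minutes=15)  # assumption: same cadence as the digest rule


def should_send(alert_key: str, now: datetime, last_sent: Dict[str, datetime],
                cooldown: timedelta = COOLDOWN) -> bool:
    """Allow one notification per alert key per cooldown period."""
    previous = last_sent.get(alert_key)
    if previous is not None and now - previous < cooldown:
        return False          # still inside the quiet period
    last_sent[alert_key] = now
    return True
```

Using a separate key per rule (for instance "errors" and "slow_query") keeps a noisy rule from silencing an unrelated one.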
Securing the ELK stack
Since Elasticsearch 8.x, basic security is enabled by default (TLS between nodes, mandatory authentication). In earlier versions it was opt-in and many installations were left exposed. Always verify that Elasticsearch is not reachable from the internet on port 9200 and that Kibana requires authentication.
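A quick way to verify the exposure is to attempt a TCP connection to port 9200 from a machine outside the private network. The sketch below uses only the standard library; port_is_open is a hypothetical helper, and a False result only shows that this particular vantage point cannot connect, not that the port is closed to everyone.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, DNS failure
        return False
```

Run from an external host against the server's public address, the expected result for port 9200 is False; the same check against 5601 shows whether Kibana is reachable, after which authentication has to be verified separately.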
Result: what you will see in production
With this architecture in place, the operations team has:
- A dashboard in Kibana showing Odoo's status in real time, with drill-down to the exact error message in seconds.
- Telegram alerts that arrive before the user calls, with enough context to start diagnosing without opening SSH.
- A 90-day history enabling trend analysis: "are slow queries increasing on Tuesdays because of the billing cron?", "from which module version did the errors start?".
- Objective evidence for optimisation decisions: knowing that 80% of errors come from a single custom module changes sprint priorities.