Migrating from Datadog to obskit + OSS Stack

This guide is for teams running Datadog APM, metrics, and logging that want to move to an open-source observability stack with no vendor lock-in.

Target OSS stack

  • Traces → Grafana Tempo (OTLP ingest)
  • Metrics → Prometheus + Grafana
  • Logs → Grafana Loki (via promtail or OTLP)
  • Dashboards → Grafana
  • Alerts → Alertmanager / Grafana Alerting

Why Migrate from Datadog?

| Concern | Datadog | obskit + OSS |
|---|---|---|
| Vendor lock-in | Datadog-proprietary SDK and trace format | OTLP open standard; switch backends without code changes |
| Cost | ~$31–$40 per host per month for APM; log ingestion billed per GB | Self-hosted; infrastructure cost only |
| Data residency | Data leaves your network for Datadog's SaaS | Runs entirely in your own infrastructure |
| Cardinality limits | Datadog enforces tag cardinality limits | You control Prometheus cardinality |
| Custom spans | Requires Datadog-specific ddtrace decorators | Standard OTel API; portable |
| Portability | ddtrace is Datadog-only | obskit uses OTel; works with Tempo, Jaeger, Zipkin |

Installation

Bash
# Remove Datadog SDK
pip uninstall ddtrace datadog

# Install obskit
pip install "obskit[all]"

# Infrastructure (docker-compose or Helm)
# See: docs/guides/docker-compose.md
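
If you don't already run the backends, a minimal docker-compose sketch for the stack looks roughly like this. Image tags, ports, and the absence of volume mounts are illustrative defaults; see the Docker Compose guide referenced above for a complete, persistent setup:

```yaml
# docker-compose.yml (sketch — no persistence, default ports assumed)
services:
  tempo:
    image: grafana/tempo:latest
    command: ["-config.file=/etc/tempo.yaml"]
    ports:
      - "4317:4317"   # OTLP gRPC ingest
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
  loki:
    image: grafana/loki:latest
    ports:
      - "3100:3100"
  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
```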

Tracing: ddtrace → setup_tracing with OTLP to Tempo

Before — ddtrace

Python
# In your application startup (often via DD_TRACE_ENABLED env var)
import ddtrace
ddtrace.patch_all()

from ddtrace import tracer

# Manual spans
with tracer.trace("process_order", service="order-service", resource="POST /orders") as span:
    span.set_tag("order.id", order_id)
    result = process()
    span.set_tag("result.status", result.status)
Bash
# Environment variables
DD_AGENT_HOST=localhost
DD_TRACE_AGENT_PORT=8126
DD_SERVICE=order-service
DD_ENV=production
DD_VERSION=1.2.3

After — obskit + Tempo

Python
from obskit.tracing import setup_tracing, trace_span
from obskit.config import configure

configure(
    service_name="order-service",
    environment="production",
    version="1.2.3",
)
# Auto-instruments FastAPI, SQLAlchemy, httpx, Redis, etc.
setup_tracing(exporter_endpoint="http://tempo:4317")

# Manual spans
with trace_span("process_order", attributes={"order.id": order_id}) as span:
    result = process()
    span.set_attribute("result.status", result.status)
Bash
# Environment variables
OBSKIT_SERVICE_NAME=order-service
OBSKIT_ENVIRONMENT=production
OBSKIT_VERSION=1.2.3
OBSKIT_OTLP_ENDPOINT=http://tempo:4317
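
To see the semantics of the migration in isolation, here is a stdlib-only stand-in for the span context manager — this is an illustrative sketch, not obskit's implementation. It shows how ddtrace's set_tag maps to set_attribute and how a span times itself:

```python
import time
from contextlib import contextmanager

class _Span:
    """Minimal stand-in for an OTel span: collects attributes, times itself."""
    def __init__(self, name, attributes=None):
        self.name = name
        self.attributes = dict(attributes or {})
        self.duration = None
        self._start = time.perf_counter()

    def set_attribute(self, key, value):
        # Equivalent of ddtrace's span.set_tag()
        self.attributes[key] = value

@contextmanager
def trace_span(name, attributes=None):
    # Mirrors the call shape used above: attributes known up front are
    # passed at creation; values known only mid-span use set_attribute().
    span = _Span(name, attributes)
    try:
        yield span
    finally:
        span.duration = time.perf_counter() - span._start

# Usage mirrors the migration example:
with trace_span("process_order", attributes={"order.id": 42}) as span:
    span.set_attribute("result.status", "ok")
```

In the real API the exporter, not the caller, consumes the recorded attributes and duration; the calling code is identical either way.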

Metrics: DogStatsd → Prometheus + Grafana

Datadog uses the DogStatsd format for custom metrics. obskit uses Prometheus counters, histograms, and gauges — the standard for cloud-native observability.

Before — DogStatsd

Python
from datadog import initialize, statsd

initialize(statsd_host="localhost", statsd_port=8125)

def handle_payment(amount: float, currency: str) -> None:
    statsd.increment("payment.processed", tags=[f"currency:{currency}"])
    statsd.histogram("payment.amount", amount, tags=[f"currency:{currency}"])
    with statsd.timed("payment.duration", tags=[f"currency:{currency}"]):
        process_payment(amount, currency)

After — obskit REDMetrics

Python
import time
from obskit.metrics import REDMetrics

payments = REDMetrics("payment_service")

def handle_payment(amount: float, currency: str) -> None:
    start = time.perf_counter()
    try:
        process_payment(amount, currency)
        payments.record_request(
            "/payments", "POST",
            status="success",
            duration=time.perf_counter() - start,
            extra_labels={"currency": currency},
        )
    except Exception:
        payments.record_request(
            "/payments", "POST",
            status="error",
            duration=time.perf_counter() - start,
            extra_labels={"currency": currency},
        )
        raise
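
The timing-and-status boilerplate above repeats in every handler, so it is worth factoring into a decorator. This sketch uses a plain callback instead of REDMetrics so it runs standalone; in your code the callback would be payments.record_request:

```python
import time
from functools import wraps

def timed_request(record, route, method):
    """Wrap a handler, reporting status and duration via `record`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception:
                record(route, method, status="error",
                       duration=time.perf_counter() - start)
                raise
            record(route, method, status="success",
                   duration=time.perf_counter() - start)
            return result
        return wrapper
    return decorator

# Stand-in recorder for the sketch; swap in payments.record_request
calls = []

@timed_request(lambda *a, **kw: calls.append((a, kw)), "/payments", "POST")
def handle_payment(amount: float, currency: str) -> str:
    return f"{amount} {currency}"
```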

DogStatsd metric type mapping

| DogStatsd | obskit / Prometheus |
|---|---|
| statsd.increment("metric") | Counter("metric_total", …).inc() |
| statsd.gauge("metric", value) | Gauge("metric", …).set(value) |
| statsd.histogram("metric", value) | Histogram("metric", …).observe(value) |
| statsd.timed("metric") context manager | REDMetrics.record_request() with duration |
| statsd.event(title, text) | Grafana annotation via API |
| statsd.service_check(name, status) | obskit.health.HealthChecker |
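
Once the metrics are in Prometheus, the Datadog queries you relied on become PromQL. The metric names below are illustrative — they assume names derived from the payment_service prefix above; check your service's /metrics endpoint for the exact ones:

```promql
# Request rate (replaces DogStatsd count-based dashboards)
rate(payment_service_requests_total{route="/payments"}[5m])

# Error ratio
sum(rate(payment_service_requests_total{status="error"}[5m]))
  / sum(rate(payment_service_requests_total[5m]))

# p95 latency (replaces statsd.timed percentiles)
histogram_quantile(0.95,
  sum(rate(payment_service_request_duration_seconds_bucket[5m])) by (le))
```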

Logs: Datadog Agent Log Collection → Grafana Loki

Before — Datadog JSON logs

Python
import json
import logging

class DatadogFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "message": record.getMessage(),
            "level": record.levelname.lower(),
            "dd.trace_id": getattr(record, "dd.trace_id", None),
            "dd.span_id": getattr(record, "dd.span_id", None),
        })

logger = logging.getLogger(__name__)
handler = logging.StreamHandler()
handler.setFormatter(DatadogFormatter())
logger.addHandler(handler)

After — obskit + Loki

Python
from obskit.logging import get_logger
from obskit.config import configure

configure(service_name="order-service", log_format="json")
logger = get_logger(__name__)

# trace_id and span_id are injected automatically
# Logs are in OpenTelemetry-compatible JSON format
logger.info("payment_processed", amount=99.99, currency="USD")

Log shipping to Loki: Use promtail to tail stdout and push to Loki, or set OBSKIT_OTLP_ENDPOINT to ship logs directly via OTLP.

YAML
# promtail config snippet
scrape_configs:
  - job_name: order-service
    static_configs:
      - targets: [localhost]
        labels:
          app: order-service
          __path__: /var/log/order-service/*.log
    pipeline_stages:
      - json:
          expressions:
            level: level
            trace_id: trace_id
      - labels:
          level:
          trace_id:
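
As an alternative to promtail, an OpenTelemetry Collector can receive OTLP logs from obskit and forward them to Loki's native OTLP endpoint (available in Loki 3.0+). A sketch, assuming default ports:

```yaml
# otel-collector config (sketch)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  otlphttp:
    endpoint: http://loki:3100/otlp
service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [otlphttp]
```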

Datadog APM Correlations → Grafana Correlations

In Datadog, you click a log and jump to the trace. Grafana provides the same feature via Correlations (configured in Grafana UI: Explore → Correlations).

obskit automatically includes trace_id in every log record during an active span. Configure a Grafana correlation from Loki to Tempo:

  • Field: trace_id
  • Target datasource: Tempo
  • URL: ${__value.raw}

This recreates the Datadog log→trace drill-down experience.
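
The same drill-down can also be provisioned declaratively as a derived field on the Loki datasource, which avoids clicking through the UI on every Grafana install. A sketch — the matcherRegex must match your JSON log field, and datasourceUid must match your Tempo datasource's UID:

```yaml
# grafana provisioning: datasources/loki.yaml (sketch)
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    url: http://loki:3100
    jsonData:
      derivedFields:
        - name: trace_id
          matcherRegex: '"trace_id":"([a-f0-9]+)"'
          datasourceUid: tempo
          url: '$${__value.raw}'   # $$ escapes the variable in provisioning files
```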


Datadog Service Map → Grafana Service Graph

Datadog's Service Map is replicated in Grafana via the Service Graph panel (requires Tempo with metrics-generator enabled).

YAML
# tempo.yaml
metrics_generator:
  registry:
    external_labels:
      source: tempo
  storage:
    path: /tmp/tempo/generator
  traces_storage:
    path: /tmp/tempo/traces
  processor:
    service_graphs:
      enabled: true
    span_metrics:
      enabled: true
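
Depending on your Tempo version, two more pieces may be needed: the processors are activated per tenant via overrides, and the generated metrics must be written somewhere Grafana can query them. A sketch to merge into the tempo.yaml above, assuming Prometheus is started with --web.enable-remote-write-receiver:

```yaml
# tempo.yaml (additions — merge with the metrics_generator block above)
metrics_generator:
  storage:
    remote_write:
      - url: http://prometheus:9090/api/v1/write
overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics]
```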

Cost Comparison

Rough estimate for a 10-service deployment

| Component | Datadog | OSS Stack |
|---|---|---|
| APM | ~$40/host/month | Tempo: $0 (compute only) |
| Logs (10 GB/day) | ~$1.50/GB ≈ ~$450/month | Loki: ~$0.03/GB stored ≈ ~$9/month |
| Metrics | ~$5 per 100 custom metrics/month | Prometheus: $0 |
| Total (estimate) | ~$1,000/month | ~$50–$200/month (compute) |

Actual costs depend heavily on traffic volume, retention, and cloud provider.


Migration Checklist

  • [ ] Deploy the OSS stack (docker-compose or Helm) — see Docker Compose guide
  • [ ] Replace ddtrace.patch_all() with setup_tracing(exporter_endpoint="http://tempo:4317")
  • [ ] Replace ddtrace.tracer.trace() spans with trace_span() / async_trace_span()
  • [ ] Replace DogStatsd statsd.increment/gauge/histogram with REDMetrics
  • [ ] Replace Datadog logger formatter with obskit.logging.get_logger()
  • [ ] Configure promtail or OTLP collector to ship logs to Loki
  • [ ] Set up Grafana correlations (Loki trace_id → Tempo)
  • [ ] Set up Grafana Service Graph (Tempo metrics-generator)
  • [ ] Migrate Datadog monitors to Grafana Alerting rules
  • [ ] Remove ddtrace and datadog packages from requirements.txt
  • [ ] Update DD_* environment variables to OBSKIT_*
  • [ ] Run python -m obskit.core.diagnose to verify the install
  • [ ] Verify traces appear in Tempo
  • [ ] Verify metrics appear in Prometheus / Grafana
  • [ ] Verify logs appear in Loki with trace_id labels