OpenTelemetry Collector

The OTel Collector is a vendor-neutral proxy that receives, processes, and exports telemetry. It sits between your application and your observability backends.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                          Collector                          │
│                                                             │
│  ┌────────────┐    ┌────────────┐    ┌───────────┐          │
│  │ Receivers  │───▶│ Processors │───▶│ Exporters │          │
│  └────────────┘    └────────────┘    └───────────┘          │
│        │                 │                                  │
│        ▼                 ▼                                  │
│  ┌────────────┐    ┌────────────┐                           │
│  │ Extensions │    │ Connectors │                           │
│  └────────────┘    └────────────┘                           │
└─────────────────────────────────────────────────────────────┘

Pipeline

Receivers

Receivers ingest telemetry in a range of formats and protocols, both open (OTLP) and tool-specific:

Receiver               Protocol            Signal
otlp                   gRPC/HTTP           traces, metrics, logs
jaeger                 Thrift/gRPC         traces
zipkin                 HTTP                traces
prometheus             HTTP pull           metrics
prometheusremotewrite  HTTP remote write   metrics
hostmetrics            System calls        metrics
kafka                  Kafka               traces, metrics, logs
filelog                File tail           logs
syslog                 Syslog              logs
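
As a rough illustration of how a few of these receivers are configured, the fragment below scrapes one Prometheus target, collects host CPU and memory metrics, and tails log files. The job name, scrape target, and log path are placeholders invented for this sketch, and hostmetrics and filelog are assumed to be available in the distribution you run.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: example-app              # hypothetical job name
          scrape_interval: 30s
          static_configs:
            - targets: ["localhost:9100"]    # hypothetical scrape target
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu:
      memory:
  filelog:
    include: ["/var/log/example/*.log"]      # hypothetical log path
```

A receiver only becomes active once a pipeline under service.pipelines lists it.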

Processors

Processors act on data mid-pipeline:

Processor              Function
batch                  Batches spans/metrics/logs to reduce export calls
memory_limiter         Prevents OOM by rejecting data when memory is high
transform              Modify attributes using OTTL (OpenTelemetry Transformation Language)
filter                 Filter spans/metrics/logs by criteria
resource               Add/modify resource attributes
attributes             Add/modify span/log attributes
probabilistic_sampler  Sample a % of traces
tail_sampling          Sample based on policies (error, latency, etc.)
routing                Route to different exporters based on criteria
k8sattributes          Inject Kubernetes metadata (pod name, namespace, etc.)
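
To make the attribute-level processors more concrete, here is a minimal sketch that deletes one attribute, drops health-check spans with an OTTL condition, and sets an attribute with an OTTL statement. The attribute keys and values are illustrative assumptions, not taken from any real service.

```yaml
processors:
  attributes:
    actions:
      - key: http.request.header.authorization   # example key to scrub
        action: delete
  filter:
    error_mode: ignore
    traces:
      span:
        # a span matching any listed condition is dropped
        - 'attributes["http.route"] == "/healthz"'
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - set(attributes["team"], "payments")   # hypothetical attribute and value
```

Processors run in the order they are listed in each pipeline, so the order in service.pipelines matters more than the order of definitions here.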

Exporters

Exporters send data to backends:

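Exporter               Backend
otlp                   Any OTLP-capable backend, over gRPC
otlphttp               Any OTLP-capable backend, over HTTP
jaeger                 Jaeger (removed from recent releases; Jaeger accepts OTLP directly)
zipkin                 Zipkin
prometheus             Prometheus (exposes an endpoint for Prometheus to scrape)
prometheusremotewrite  Prometheus-compatible backends via remote write
loki                   Grafana Loki (logs)
datadog                Datadog
awsxray                AWS X-Ray
awsemf                 AWS CloudWatch EMF (metrics)
azuremonitor           Azure Monitor
googlecloud            Google Cloud (Cloud Monitoring, Cloud Trace, Cloud Logging)
debug                  Console output (debug); replaces the older logging exporter
file                   File (debug)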
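
A short sketch of exporter configuration, assuming placeholder backend URLs. It shows OTLP over HTTP, Prometheus remote write, and the debug exporter, which recent Collector releases use for console output.

```yaml
exporters:
  otlphttp:
    endpoint: https://otlp.example.com:4318                       # hypothetical backend URL
  prometheusremotewrite:
    endpoint: http://prometheus.example.com:9090/api/v1/write     # hypothetical remote-write URL
  debug:
    verbosity: detailed    # print full telemetry records to the Collector's console output
```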

Minimal Collector Config

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
 
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
  memory_limiter:
    limit_mib: 512
    check_interval: 1s
 
exporters:
  otlp:
    endpoint: tempo:4317   # host:port for the gRPC exporter
    tls:
      insecure: true
 
service:
  pipelines:
    # memory_limiter goes first so it can refuse data before other processors buffer it
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]

Deployment Modes

Agent Mode (Sidecar / DaemonSet)

The Collector runs as a sidecar in each pod or as a DaemonSet on each node. Applications send telemetry to it locally.

Pod → OTel Agent (localhost) → OTel Gateway → Backend

Use when: You want to reduce the number of backend connections opened by applications and add local batching and compression.
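
A minimal sketch of a sidecar-style agent that accepts OTLP on localhost and forwards compressed batches to a gateway. The gateway address and the memory limit are assumptions chosen for illustration.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317     # only local applications can connect
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 256                   # illustrative limit for a small agent
  batch:
exporters:
  otlp:
    endpoint: otel-gateway.observability.svc:4317   # hypothetical gateway Service
    compression: gzip
    tls:
      insecure: true                 # assumes plaintext traffic inside the cluster
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```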

Gateway Mode (Central)

A single Collector Deployment acts as a central aggregation point.

App → OTel Agent ──┐
App → OTel Agent ──┼──▶ OTel Gateway → Backend
App → OTel Agent ──┘

Use when: You want a single choke point for routing, filtering, and sampling.
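
A gateway configuration might look roughly like the following: it listens on all interfaces for traffic from agents, applies sampling centrally, and exports to the backend. The backend address, memory limit, and sampling percentage are placeholders.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317       # accepts traffic from all agents
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 1024                  # illustrative limit
  probabilistic_sampler:
    sampling_percentage: 25          # illustrative: keep about a quarter of traces
  batch:
exporters:
  otlp:
    endpoint: backend.example.com:4317   # hypothetical backend
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, probabilistic_sampler, batch]
      exporters: [otlp]
```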

Standalone

The Collector runs as a single process that receives telemetry and exports it directly to the backend, with no agent/gateway split.

Use when: Small deployments, local development.
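
For local development, a standalone Collector can be as small as the sketch below, which prints incoming traces to the console instead of sending them to a backend. It assumes a Collector release that includes the debug exporter.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317
      http:
        endpoint: localhost:4318
exporters:
  debug:
    verbosity: basic      # summary output; "detailed" prints every record
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
```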

Extensions

Extensions are non-pipeline components (health checks, monitoring, etc.):

Extension       Purpose
zpages          In-process debug pages (trace stats, span names)
health_check    HTTP health endpoint at / (port 13133 by default)
pprof           Go profiling endpoint at localhost:1777
memory_ballast  Allocates a heap ballast to reduce GC pressure (deprecated; see Performance Notes)
oidc            Authenticates incoming requests at receivers by validating OIDC tokens

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  # memory_ballast is deprecated (see Performance Notes); enable it only on
  # older Collector versions that still ship it:
  # memory_ballast:
  #   size_mib: 64
 
service:
  extensions: [health_check, zpages]
  pipelines:
    # ...
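
Authentication extensions have no effect until a receiver or exporter references them by name. The sketch below attaches an OIDC authenticator to the OTLP gRPC receiver; it assumes the contrib authenticator is registered under the type oidc, and the issuer URL and audience are placeholders.

```yaml
extensions:
  oidc:
    issuer_url: https://auth.example.com/    # hypothetical identity provider
    audience: otel-collector                 # hypothetical audience value
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        auth:
          authenticator: oidc                # must match the extension name above
service:
  extensions: [oidc]
  pipelines:
    # ...
```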

Processors in Detail

Batch Processor

Batches data to reduce HTTP/gRPC call overhead:

processors:
  batch:
    timeout: 5s                # Flush after this long, even if the batch is not full
    send_batch_size: 8192      # Or as soon as this many items are queued (target size)
    send_batch_max_size: 8192  # Upper bound on a batch; must be >= send_batch_size

Memory Limiter

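Protects against OOM when the backend is slow or unavailable:

processors:
  memory_limiter:
    limit_mib: 512           # Hard limit on heap memory
    spike_limit_mib: 128     # Expected spike between checks; soft limit = 512 - 128 = 384 MiB
    check_interval: 1s       # How often memory usage is measured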
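
When the Collector runs in containers with different memory limits, the limiter can also be expressed as a percentage of available memory rather than a fixed size. A minimal sketch with illustrative percentages:

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80          # hard limit: 80% of available memory
    spike_limit_percentage: 20    # soft limit: 80% - 20% = 60% of available memory
```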

Tail Sampling

Sample traces after collection based on policies:

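processors:
  tail_sampling:
    decision_wait: 10s                 # Wait this long after a trace's first span before deciding
    num_traces: 100000                 # Traces held in memory while waiting
    expected_new_traces_per_sec: 100   # Sizing hint for internal data structures
    policies:                          # A trace is kept if any policy matches
      - name: errors-policy
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: slow-traces-policy
        type: latency
        latency: {threshold_ms: 1000}
      - name: probabilistic-policy
        type: probabilistic
        probabilistic: {sampling_percentage: 10}
      - name: latency-slo-policy       # Matches only if every sub-policy matches
        type: and
        and:
          and_sub_policy:
            - name: latency-sub-policy
              type: latency
              latency: {threshold_ms: 100}
            - name: ok-status-sub-policy
              type: status_code
              status_code: {status_codes: [OK]}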

K8s Attributes Processor

Adds Kubernetes metadata to spans, metrics, and logs:

processors:
  k8sattributes:
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.deployment.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.pod.start_time
        - k8s.container.name
        - k8s.container.restart_count
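    filter:
      # Only pods on this Collector's node; assumes KUBE_NODE_NAME is set from spec.nodeName.
      # The node filter matches an exact node name, not a regular expression.
      node_from_env_var: KUBE_NODE_NAME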
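
The processor also has to work out which pod a given piece of telemetry came from. A minimal sketch of association rules, tried in order, assuming the k8s.pod.ip resource attribute may or may not be present on incoming data:

```yaml
processors:
  k8sattributes:
    pod_association:
      - sources:
          - from: resource_attribute   # prefer an explicit pod IP attribute
            name: k8s.pod.ip
      - sources:
          - from: connection           # otherwise fall back to the sender's IP address
```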

Connectors (Beta)

Connectors join two pipelines: a connector acts as an exporter at the end of one pipeline and as a receiver at the start of another, which enables signal-to-signal routing (here, traces to metrics):

connectors:
  spanmetrics:   # default settings; the pipelines below do the wiring
 
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp, spanmetrics]  # spanmetrics connector
    metrics:
      receivers: [spanmetrics]         # receives from spanmetrics connector
      exporters: [prometheus]

This creates RED metrics (Request rate, Error rate, Duration) from traces automatically.

Resource Attributes in Collector

The resource processor adds/overrides resource attributes:

processors:
  resource:
    attributes:
      - action: upsert
        key: cloud.region
        value: us-east-1
      - action: upsert
        key: environment
        value: ${env:ENV}
        # The ENV environment variable becomes the "environment" attribute.
        # (from_attribute copies another attribute; it does not read env vars.)
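
The other actions follow the same shape. A short sketch, with hypothetical keys and values, showing insert, a copy from an existing attribute, and delete:

```yaml
processors:
  resource:
    attributes:
      - action: insert                       # sets the value only if the key is absent
        key: service.namespace
        value: shop                          # hypothetical value
      - action: upsert
        key: namespace                       # hypothetical target key
        from_attribute: k8s.namespace.name   # copies the value of an existing attribute
      - action: delete
        key: process.command_line
```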

Performance Notes

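  • The batch processor is essential: it can reduce the number of export calls by 10-100x, depending on traffic
  • The memory limiter should always be enabled in production, as the first processor in each pipeline
  • The memory_ballast extension was meant to reduce Go GC pauses, but Collector v0.91+ recommends NOT using it (memory management changed); setting the GOMEMLIMIT environment variable is the usual replacement
  • Queue consumers on exporters enable parallel sends:
    exporters:
      otlp:
        sending_queue:
          num_consumers: 10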

Collector Binary

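The Collector ships in two commonly used distributions:

Binary           Use
otelcol          Core distribution: the core components plus a small curated set from contrib
otelcol-contrib  Contrib distribution: core plus the components from opentelemetry-collector-contrib

Some components described here, such as tail_sampling and k8sattributes, are only available in otelcol-contrib.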

For Kubernetes, use the OpenTelemetry Operator or the Helm chart:

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install otel-collector open-telemetry/opentelemetry-collector \
  --set image.repository=otel/opentelemetry-collector-k8s \
  --set mode=daemonset \
  --set config.receivers.otlp.protocols.grpc.endpoint=0.0.0.0:4317 \
  --set config.receivers.otlp.protocols.http.endpoint=0.0.0.0:4318
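
Long chains of --set flags get hard to read, so the same settings can be kept in a values file. The sketch below is one possible equivalent; the file name and the image repository are assumptions (recent chart versions appear to expect an image to be chosen explicitly).

```yaml
# values.yaml (hypothetical file name)
mode: daemonset
image:
  repository: otel/opentelemetry-collector-k8s   # assumed image
config:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
```

It would then be passed to helm install with -f values.yaml in place of the individual --set flags.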