data_streams

DataDog data streams library for Elixir


Keywords
data-streams, datadog, elixir, kafka
License
Other

Documentation

Data Streams Ex

This is a port of the data-streams-go library to Elixir.

Introduction

This product is meant to measure end to end latency in async pipelines. It's in an Alpha phase. We currently support instrumentations of async pipelines using Kafka. Integrations with other systems will follow soon.

Glossary

  • Data stream: A set of services connected together via queues
  • Pathway: A single branch of connected services
  • Queue: A connection between two services
  • Edge latency: Latency of a queue between two services
  • Latency from origin: Latency from the first tracked service, down to the current service
  • Checkpoint: records at what time a specific operation on a payload occurred (eg: The payload was sent to Kafka). The product can then measure latency between checkpoints.

The product can measure edge latency, and latency from origin, for a set of checkpoints connected together via queues. To do so, we propagate timestamps, and a hash of the path that messages took with the payload.

Installation

Just add data_streams to your mix.exs file like so:

def deps do
  [
    {:data_streams, "~> 1.2.2"}
  ]
end

Documentation is automatically generated and published to HexDocs.

Elixir instrumentation

Prerequisites

You will need to configure the pipeline with the trace agent URL and enable it to start on application start. This can be done via your config/ files:

config :data_streams, :pipeline,
  enabled?: true,
  host: "localhost",
  port: 8126

The host and port should point to your Datadog agent.

The instrumentation relies on creating checkpoints at various points in your data stream services with specific tags, recording the pathway that messages take along the way. For a complete picture of your services, you will also need to configure some metadata about the current service running. You can do this in your config/ files as well:

config :data_streams, :metadata,
  service: "my-service",
  env: "production",
  primary_tag: "datacenter:d1" # You can leave this blank if you don't have a primary tag.

We recommend you keep these tags matching all your other instrumentation, like Open Telemetry and :telemetry, to ensure Datadog can aggregate data accurately.

Integrations

This library contains integration modules to help integrate with various async data pipelines. See one of these modules for usage details.

  • Datadog.DataStreams.Integrations.Kafka