kafka_replayer

Timestamp-based Kafka topic replayer


Keywords
kafka, consumer, replayer, replay
License
Apache-2.0
Install
pip install kafka_replayer==1.0.1

Documentation

Python Kafka Replayer

kafka_replayer is a library that helps consume time ranges of messages from Kafka topics. While the standard Kafka consumer API allows seeking to a specific offset and replaying from there, using offsets as the replay abstraction is cumbersome and potentially error-prone. This library does the translation from timestamps to offsets transparently.

This library is written in Python, and leverages kafka-python's consumer to poll Kafka for messages.
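
To illustrate what translating a timestamp into an offset involves, here is a minimal sketch that binary-searches an in-memory list standing in for one partition. It is not kafka_replayer's actual implementation. The `Record` tuple and the function names `first_offset_at_or_after` and `records_in_range` are hypothetical, and the sketch assumes contiguous offsets, non-decreasing timestamps, and an inclusive end timestamp.

```python
from bisect import bisect_left
from collections import namedtuple

# Hypothetical stand-in for a Kafka record; only the fields needed here.
Record = namedtuple("Record", ["offset", "timestamp"])


def first_offset_at_or_after(records, timestamp_ms):
    """Return the offset of the first record with timestamp >= timestamp_ms.

    Assumes `records` is ordered by offset with non-decreasing timestamps.
    Returns None when every record is older than timestamp_ms.
    """
    timestamps = [r.timestamp for r in records]
    idx = bisect_left(timestamps, timestamp_ms)
    return records[idx].offset if idx < len(records) else None


def records_in_range(records, start_ms, end_ms):
    """Yield records with start_ms <= timestamp <= end_ms (inclusive end is an assumption)."""
    start_offset = first_offset_at_or_after(records, start_ms)
    if start_offset is None:
        return
    # Assumes offsets are contiguous, so offset minus base offset is a list index.
    start_idx = start_offset - records[0].offset
    for record in records[start_idx:]:
        if record.timestamp > end_ms:
            break
        yield record
```

Against a real broker the lookup would be a request to Kafka rather than a list search; kafka-python's `KafkaConsumer.offsets_for_times` is one call that performs it on brokers that support timestamp lookups.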

Installing

$ pip install kafka_replayer

Using

import json
import kafka_replayer

def des_fn(x):
    # Decode JSON payloads; empty keys or values become None
    return json.loads(x) if x else None
replayer = kafka_replayer.KafkaReplayer('my-topic',
                                        bootstrap_servers=['localhost:9092'],
                                        key_deserializer=des_fn,
                                        value_deserializer=des_fn)

# Replay all records between the start and end millis timestamps
for record in replayer.replay(1469467314341, 1469467907549):
    print(record)
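
The start and end arguments are milliseconds since the Unix epoch. As a convenience, they can be derived from `datetime` values. The helper below is a sketch for Python 3 and is not part of kafka_replayer; the name `to_epoch_millis` is hypothetical.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to integer milliseconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = dt - EPOCH
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    return (delta.days * 86400 + delta.seconds) * 1000 + delta.microseconds // 1000


# The same range as the example above, expressed as UTC datetimes.
start_ms = to_epoch_millis(datetime(2016, 7, 25, 17, 21, 54, 341000, tzinfo=timezone.utc))
end_ms = to_epoch_millis(datetime(2016, 7, 25, 17, 31, 47, 549000, tzinfo=timezone.utc))
```

The resulting `start_ms` and `end_ms` can then be passed to `replayer.replay(start_ms, end_ms)`.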

Documentation

http://pythonhosted.org/kafka_replayer/

License

See LICENSE.