baidu/tera


An Internet-Scale Database.

License: BSD-3-Clause

Language: C++

Keywords: baidu, bigtable, c-plus-plus, data, database, hbase, nosql, storage


Tera - An Internet-Scale Database

Build Status Coverity Scan Build Status Documentation Status

Copyright 2015, Baidu, Inc.

Tera is a high performance distributed NoSQL database, which is inspired by google's BigTable and designed for real-time applications. Tera can easily scale to petabytes of data across thousands of commodity servers. Besides, Tera is widely used in many Baidu products with varied demands,which range from throughput-oriented applications to latency-sensitive service, including web indexing, WebPage DB, LinkBase DB, etc. (中文)

Features

  • Linear and modular scalability
  • Automatic and configurable sharding
  • Ranged and hashed sharding strategies
  • MVCC
  • Column-oriented storage and locality group support
  • Strictly consistent
  • Automatic failover support
  • Online schema change
  • Snapshot support
  • Support RAMDISK/SSD/DFS tiered cache
  • Block cache and Bloom Filters for real-time queries
  • Multi-type table support (RAMDISK/SSD/DISK table)
  • Easy to use C++/Java/Python/REST-ful API

Data model

Tera is the collection of many sparse, distributed, multidimensional tables. The table is indexed by a row key, column key, and a timestamp; each value in the table is an uninterpreted array of bytes.

  • (row:string, (column family+qualifier):string, time:int64) → string

To learn more about the schema, you can refer to BigTable.

Architecture

架构图

Tera has three major components: sdk, master and tablet servers.

  • SDK: a library that is linked into every application client to access Tera cluster.
  • Master: master is responsible for managing tablet servers and tablets, automatic load balance and garbage collection of files in filesystem.
  • Tablet Server: tablet server is the core module in tera, and it uses an enhance Leveldb as a basic storage engine. Tablet server manages a set of tablets, handles read/write/scan requests and schedule tablet split and merge online.

Building blocks

Tera is built on several pieces of open source infrastructure.

  • Filesystem (required)

    Tera uses the distributed file system to store transaction log and data files. So Tera uses an abstract file system interface, called Env, to adapt to different implementations of file systems (e.g., BFS, HDFS, HDFS2, POXIS filesystem).

  • Distributed lock service (required)

    Tera relies on a highly-available and persistent distributed lock service, which is used for a variety of tasks: to ensure that there is at most one active master at any time; to store meta table's location, to discover new tablet server and finalize tablet server deaths. Tera has an adapter class to adapt to different implementations of lock service (e.g., ZooKeeper, Nexus)

  • High performance RPC framework (required)

    Tera is designed to handle a variety of demanding workloads, which range from throughput-oriented applications to latency-sensitive service. So Tera needs a high performance network programming framework. Now Tera heavily relies on Sofa-pbrpc to meet the performance demand.

  • Cluster management system (not necessary)

    A Tera cluster in Baidu typically operates in a shared pool of machines that runs a wide variety of other distributed applications. So Tera can be deployed in a cluster management system Galaxy, which uses for scheduling jobs, managing resources on shared machines, dealing with machine failures, and monitoring machine status. Besides, Tera can also be deployed on RAW machine or in Docker container.

Documents

Quick start

Contributing to Tera

Contributions are welcomed and greatly appreciated.

Read Roadmap to get a general knowledge about our development plan.

See Contributions for more details.

Follow us

To join us, please send resume to {dist-lab, tera_dev, opensearch} at baidu.com.

Project Statistics

Sourcerank 7
Repository Size 14.2 MB
Stars 1,582
Forks 409
Watchers 201
Open issues 167
Dependencies 4
Contributors 24
Tags 23
Created
Last updated
Last pushed

Top Contributors See all

xupeilin 00k Yan Shiguang taocipian lylei ajie Sun Junyi lylei9 Miao Zhang Yang Ce Tony Chou sunjoy1984 zhengran junhui huang imotai baobaoyeye yyq224444 kreats Tlxxxxxxx owenliang

Recent Tags See all

v0.7.0 August 09, 2017
v0.5.7 April 16, 2017
v0.4.6 February 08, 2017
v0.5.5 January 19, 2017
v0.5.4 January 11, 2017
v0.4.5 January 09, 2017
v0.5.3-bin December 23, 2016
v0.5.3 December 23, 2016
v0.4.4 December 20, 2016
v0.5.2 December 08, 2016
v0.5.1 November 07, 2016
v0.3.2 August 31, 2016
0.3.2 August 29, 2016
0.4.2 August 29, 2016
0.5.1 August 11, 2016

Interesting Forks See all

bluebore/tera
A distributed table storage system.
C++ - BSD-3-Clause - Last pushed - 4 stars
opera/tera
An Internet-Scale Database.
C++ - BSD-3-Clause - Last pushed - 1 stars
cheneydeng/tera
An Internet-Scale Database.
C++ - Updated - 1 stars
Leeshine/tera
An Internet-Scale Database.
C++ - Updated - 1 stars
abael/tera
An Internet-Scale Database.
C++ - Updated - 1 stars

Something wrong with this page? Make a suggestion

Last synced: 2017-06-24 16:44:17 UTC

Login to resync this repository