trustgraph-parquet

TrustGraph runs flexible AI processing components as a configurable processing pipeline.


Keywords
agent-development, agent-framework, agentic, ai-agents, ai-developer-tools, ai-development, ai-engine, ai-infra, ai-privacy, ai-rag-product-development, data-management, developer-tools, development-engine, development-environment, development-platform, graph-rag, graphrag, knowledge-graph, on-device-ai, rag
Licenses
GPL-3.0/GPL-3.0+
Install
pip install trustgraph-parquet==0.16.5

Documentation

Connect Data Silos with Explainable AI

TrustGraph is a complete AI-powered data engineering platform. Extract your documents into knowledge graphs and vector embeddings with customizable data extraction agents. Deploy AI agents that leverage your data to generate reliable and accurate AI responses.

Key Features

  • 📄 Document Extraction: Bulk ingest documents such as .pdf, .txt, and .md
  • 🪓 Adjustable Chunking: Choose your chunking algorithm and parameters
  • 🔁 No-code LLM Integration: Anthropic, AWS Bedrock, AzureAI, AzureOpenAI, Cohere, Google AI Studio, Google VertexAI, Llamafiles, Ollama, and OpenAI
  • 📖 Entity, Topic, and Relationship Knowledge Graphs
  • 🔢 Mapped Vector Embeddings
  • ❔ No-code GraphRAG Queries: Automatically perform a semantic similarity search and subgraph extraction to build the context for LLM generative responses
  • 🤖 Agent Flow: Define custom tools used by a ReAct-style Agent Manager that fully controls the response flow, including the ability to perform GraphRAG requests
  • 🎛️ Production-Grade reliability, scalability, and accuracy
  • 🔍 Observability: Get insights into system performance with Prometheus and Grafana
  • 🗄️ AI Powered Data Warehouse: Load only the subgraph and vector embeddings you use most often
  • 🪴 Customizable and Extensible: Tailor for your data and use cases
  • 🖥️ Configuration UI: Build the YAML configuration with drop-down menus and selectable parameters

Get Started

There are two primary ways of interacting with TrustGraph:

  • TrustGraph CLI
  • Configuration UI

The TrustGraph CLI installs the commands for interacting with TrustGraph while it is running. The Configuration UI enables customization of TrustGraph deployments prior to launching.

Install the TrustGraph CLI

pip3 install trustgraph-cli==0.15.6

Note

The TrustGraph CLI version must match the desired TrustGraph release version.
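
As a minimal sketch of the version rule in the note above, the snippet below compares the installed trustgraph-cli version with a target release. It assumes that "match" means exact string equality, which this document does not spell out; the helper names installed_version and cli_matches_release are hypothetical and not part of TrustGraph.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def cli_matches_release(cli_version: Optional[str], release_version: str) -> bool:
    """True only when the CLI version equals the release version exactly.

    Assumption: exact equality is what "must match" means here.
    """
    return cli_version is not None and cli_version.strip() == release_version.strip()


if __name__ == "__main__":
    release = "0.15.6"  # the stable release listed in this document
    found = installed_version("trustgraph-cli")
    if cli_matches_release(found, release):
        print(f"trustgraph-cli {found} matches release {release}")
    else:
        print(f"trustgraph-cli is {found}, expected {release}")
```

Running a check like this before launching a deployment can catch a CLI that was installed for a different release.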

The full CLI docs are here.

Configuration UI

While TrustGraph is endlessly customizable through the YAML launch files, the Configuration UI can build a custom configuration in seconds that deploys with Docker, Podman, Minikube, or Google Cloud. There is a Configuration UI for both the latest and stable TrustGraph releases.

The Configuration UI has three sections:

  • Component Selection ✅: Choose from the available deployment platforms, LLMs, graph store, VectorDB, chunking algorithm, chunking parameters, and LLM parameters
  • Customization 🧰: Customize the prompts for the LLM System, Data Extraction Agents, and Agent Flow
  • Finish Deployment 🚀: Download the launch YAML files with deployment instructions

The Configuration UI will generate the YAML files in deploy.zip. Once deploy.zip has been downloaded and unzipped, launching TrustGraph is as simple as navigating to the deploy directory and running:

docker compose up -d

Tip

Docker is the recommended container orchestration platform for first getting started with TrustGraph.

When finished, shutting down TrustGraph is as simple as:

docker compose down -v

TrustGraph Releases

TrustGraph releases are available here. Download deploy.zip for the desired release version.

Release Type    Release Version
Latest          0.16.3
Stable          0.15.6

TrustGraph is fully containerized and is launched with a YAML configuration file. Unzipping the deploy.zip will add the deploy directory with the following subdirectories:

  • docker-compose
  • minikube-k8s
  • gcp-k8s

Each directory contains the pre-built YAML configuration files needed to launch TrustGraph:

Model Deployment      Graph Store  Launch File
AWS Bedrock API       Cassandra    tg-bedrock-cassandra.yaml
AWS Bedrock API       Neo4j        tg-bedrock-neo4j.yaml
AzureAI API           Cassandra    tg-azure-cassandra.yaml
AzureAI API           Neo4j        tg-azure-neo4j.yaml
AzureOpenAI API       Cassandra    tg-azure-openai-cassandra.yaml
AzureOpenAI API       Neo4j        tg-azure-openai-neo4j.yaml
Anthropic API         Cassandra    tg-claude-cassandra.yaml
Anthropic API         Neo4j        tg-claude-neo4j.yaml
Cohere API            Cassandra    tg-cohere-cassandra.yaml
Cohere API            Neo4j        tg-cohere-neo4j.yaml
Google AI Studio API  Cassandra    tg-googleaistudio-cassandra.yaml
Google AI Studio API  Neo4j        tg-googleaistudio-neo4j.yaml
Llamafile API         Cassandra    tg-llamafile-cassandra.yaml
Llamafile API         Neo4j        tg-llamafile-neo4j.yaml
Ollama API            Cassandra    tg-ollama-cassandra.yaml
Ollama API            Neo4j        tg-ollama-neo4j.yaml
OpenAI API            Cassandra    tg-openai-cassandra.yaml
OpenAI API            Neo4j        tg-openai-neo4j.yaml
VertexAI API          Cassandra    tg-vertexai-cassandra.yaml
VertexAI API          Neo4j        tg-vertexai-neo4j.yaml
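
The launch file names in the table above follow a regular pattern: tg-<model>-<graph store>.yaml. The sketch below encodes that pattern as a lookup. The model slugs are copied from the table; the helper name launch_file and the dictionary layout are hypothetical conveniences, not part of TrustGraph.

```python
# Model deployment name -> slug used in the launch file name (taken from the table).
MODEL_SLUGS = {
    "AWS Bedrock": "bedrock",
    "AzureAI": "azure",
    "AzureOpenAI": "azure-openai",
    "Anthropic": "claude",
    "Cohere": "cohere",
    "Google AI Studio": "googleaistudio",
    "Llamafile": "llamafile",
    "Ollama": "ollama",
    "OpenAI": "openai",
    "VertexAI": "vertexai",
}

# Graph stores listed in the table.
GRAPH_STORES = ("cassandra", "neo4j")


def launch_file(model: str, graph_store: str) -> str:
    """Return the launch file name for a model deployment and graph store."""
    if model not in MODEL_SLUGS:
        raise ValueError(f"unknown model deployment: {model!r}")
    store = graph_store.lower()
    if store not in GRAPH_STORES:
        raise ValueError(f"unknown graph store: {graph_store!r}")
    return f"tg-{MODEL_SLUGS[model]}-{store}.yaml"


if __name__ == "__main__":
    print(launch_file("Anthropic", "Neo4j"))
```

A lookup like this covers all twenty combinations in the table and rejects anything the table does not list.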

Once a configuration launch file has been selected, deploy TrustGraph with:

Docker:

docker compose -f <launch-file.yaml> up -d

Kubernetes:

kubectl apply -f <launch-file.yaml>
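
For scripted deployments, the two commands above can be assembled programmatically. This is a minimal sketch that only builds the argument list; the function name deploy_command and the platform labels "docker" and "kubernetes" are hypothetical. Passing the result to subprocess.run(cmd, check=True) would execute it, assuming the relevant tool is installed.

```python
import shlex
from typing import List


def deploy_command(platform: str, launch_file: str) -> List[str]:
    """Build the deploy command for the chosen platform as an argument list."""
    if platform == "docker":
        return ["docker", "compose", "-f", launch_file, "up", "-d"]
    if platform == "kubernetes":
        return ["kubectl", "apply", "-f", launch_file]
    raise ValueError(f"unsupported platform: {platform!r}")


if __name__ == "__main__":
    cmd = deploy_command("docker", "tg-ollama-cassandra.yaml")
    # shlex.join renders the list as a copy-pasteable shell command.
    print(shlex.join(cmd))
```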

Architecture

TrustGraph is designed to be modular to support as many LLMs and environments as possible. A natural fit for a modular architecture is to decompose functions into a set of modules connected through a pub/sub backbone. Apache Pulsar serves as this pub/sub backbone. Pulsar acts as the data broker, managing the data processing queues connected to the processing modules.

Pulsar Workflows

  • For processing flows, Pulsar accepts the output of a processing module and queues it for input to the next subscribed module.
  • For services such as LLMs and embeddings, Pulsar provides a client/server model. A Pulsar queue is used as the input to the service. When processed, the output is then delivered to a separate queue where a client subscriber can request that output.
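
The two workflow patterns above can be illustrated with in-memory queues. This is a conceptual sketch only: Python's standard queue.Queue stands in for Pulsar queues, no Pulsar client API is used, and the names run_flow and QueueService are hypothetical.

```python
import queue
from typing import Any, Callable, List, Tuple


def run_flow(items: List[Any], stages: List[Callable[[Any], Any]]) -> List[Any]:
    """Processing flow: each stage's output is queued as input to the next stage."""
    current: queue.Queue = queue.Queue()
    for item in items:
        current.put(item)
    for stage in stages:
        nxt: queue.Queue = queue.Queue()
        while not current.empty():
            nxt.put(stage(current.get()))
        current = nxt
    results = []
    while not current.empty():
        results.append(current.get())
    return results


class QueueService:
    """Client/server pattern: one queue for requests, a separate one for responses."""

    def __init__(self, handler: Callable[[Any], Any]) -> None:
        self.requests: queue.Queue = queue.Queue()
        self.responses: queue.Queue = queue.Queue()
        self.handler = handler

    def submit(self, request_id: str, payload: Any) -> None:
        """Client side: place a request on the input queue."""
        self.requests.put((request_id, payload))

    def process_one(self) -> None:
        """Service side: handle one request and publish the result."""
        request_id, payload = self.requests.get()
        self.responses.put((request_id, self.handler(payload)))

    def fetch(self) -> Tuple[str, Any]:
        """Client side: read the next response from the output queue."""
        return self.responses.get()


if __name__ == "__main__":
    print(run_flow(["a b", "c"], [str.upper, str.split]))
    service = QueueService(len)  # stand-in for an LLM or embeddings service
    service.submit("r1", "hello")
    service.process_one()
    print(service.fetch())
```

Tagging each request with an identifier lets a client pair a response with the request that produced it, since requests and responses travel on separate queues.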

Data Extraction Agents

TrustGraph extracts knowledge from documents into an ultra-dense knowledge graph using three autonomous data extraction agents. Each agent focuses on one of the elements needed to build the knowledge graph. The agents are:

  • Topic Extraction Agent
  • Entity Extraction Agent
  • Relationship Extraction Agent

The agent prompts are built through templates, enabling customized data extraction agents for a specific use case. The data extraction agents are launched automatically with the loader commands.

PDF file:

tg-load-pdf <document.pdf>

Text or Markdown file:

tg-load-text <document.txt>
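
When loading a mixed batch of documents, the right loader can be picked from the file extension. The sketch below maps the extensions named in this document (.pdf, .txt, .md) to the two loader commands shown above. The helper name loader_command is hypothetical, and it only builds the argument list rather than running the loader.

```python
from pathlib import Path
from typing import List

# Extension -> loader command, as described above:
# PDF files use tg-load-pdf; text and Markdown files use tg-load-text.
LOADERS = {
    ".pdf": "tg-load-pdf",
    ".txt": "tg-load-text",
    ".md": "tg-load-text",
}


def loader_command(document: str) -> List[str]:
    """Return the loader command (as an argument list) for a document path."""
    suffix = Path(document).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"no loader for file type: {suffix or '(none)'}")
    return [LOADERS[suffix], document]


if __name__ == "__main__":
    for name in ["report.pdf", "notes.md"]:
        print(" ".join(loader_command(name)))
```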

RAG Queries

Once the knowledge graph and embeddings have been built or a knowledge core has been loaded, RAG queries are launched with a single line:

tg-query-graph-rag -q "Write a blog post about the 5 key takeaways from SB1047 and how they will impact AI development."
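
Conceptually, a GraphRAG query pairs a semantic similarity search with subgraph extraction to build the context handed to the LLM. The toy sketch below illustrates that idea with hand-written two-dimensional vectors and a tiny triple list. It is not TrustGraph's implementation: the vectors, triples, and function names are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_entities(query_vec: List[float], entity_vecs: Dict[str, List[float]], k: int) -> List[str]:
    """Semantic similarity search: the k entities closest to the query vector."""
    ranked = sorted(entity_vecs, key=lambda e: cosine(query_vec, entity_vecs[e]), reverse=True)
    return ranked[:k]


def subgraph(triples: List[Triple], entities: List[str]) -> List[Triple]:
    """Subgraph extraction: keep triples that touch any selected entity."""
    wanted = set(entities)
    return [t for t in triples if t[0] in wanted or t[2] in wanted]


def build_context(triples: List[Triple]) -> str:
    """Flatten the subgraph into text that could be placed in an LLM prompt."""
    return "\n".join(f"{s} {p} {o}" for s, p, o in triples)


if __name__ == "__main__":
    # Toy embeddings and graph, made up for illustration.
    vecs = {"TrustGraph": [1.0, 0.0], "Pulsar": [0.8, 0.6], "Cassandra": [0.0, 1.0]}
    graph = [("TrustGraph", "uses", "Pulsar"), ("Cassandra", "is-a", "graph store")]
    selected = top_entities([1.0, 0.1], vecs, k=1)
    print(build_context(subgraph(graph, selected)))
```

In a real deployment the embeddings come from an embeddings model and the triples from the graph store; only the overall shape of the query path is shown here.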

Deploy and Manage TrustGraph

🚀 Full Deployment Guide 🚀

TrustGraph Developer's Guide

Developing for TrustGraph