pgai allows you to develop RAG, semantic search, and other AI applications directly in PostgreSQL

pgai simplifies the process of building search, Retrieval Augmented Generation (RAG), and other AI applications with PostgreSQL. It complements popular extensions for vector search in PostgreSQL like pgvector and pgvectorscale, building on top of their capabilities.

Overview

The goal of pgai is to make working with AI easier and more accessible to developers. Because data is the foundation of most AI applications, pgai makes it easier to leverage your data in AI workflows. In particular, pgai supports:

Working with embeddings generated from your data:

Automatically create and sync vector embeddings for your data (learn more)
Search your data using vector and semantic search (learn more)
Implement Retrieval Augmented Generation inside a single SQL statement (learn more)
Perform high-performance, cost-efficient ANN search on large vector workloads with pgvectorscale, which complements pgvector.

Leverage LLMs for data processing tasks:

Retrieve LLM chat completions from models like Claude Sonnet 3.5, OpenAI GPT4o, Cohere Command, and Llama 3 (via Ollama). (learn more)
Reason over your data and facilitate use cases like classification, summarization, and data enrichment on your existing relational data in PostgreSQL (see an example).

Learn more about pgai: To learn more about the pgai extension and why we built it, read pgai: Giving PostgreSQL Developers AI Engineering Superpowers.

Contributing: We welcome contributions to pgai! See the Contributing page for more information.

Demo: pgai Vectorizer

Supported features in each model

Model	Tokenize	Embed	Chat Complete	Generate	Moderate	Classify	Rerank
Ollama		✔️	✔️	✔️
OpenAI	✔️️	✔️	✔️		✔️
Anthropic				✔️
Cohere	✔️	✔️	✔️			✔️	✔️

Getting Started

Here's how to get started with pgai:

For a quick start, try out automatic data embedding using pgai Vectorizer:

Try our cloud offering by creating a free trial account and heading over to our pgai Vectorizer documentation.
or check out our quick start guide to get up and running in less than 10 minutes with a self-hosted Postgres instance.

For other use cases, first Install pgai in Timescale Cloud, a pre-built Docker image, or from source. Then, choose your own adventure:

Automate AI embedding with pgai Vectorizer.
Use pgai to integrate AI from your provider. Some examples:
- Ollama - configure pgai for Ollama, then use the model to embed, chat complete and generate.
- OpenAI - configure pgai for OpenAI, then use the model to tokenize, embed, chat complete and moderate. This page also includes advanced examples.
- Anthropic - configure pgai for Anthropic, then use the model to generate content.
- Cohere - configure pgai for Cohere, then use the model to tokenize, embed, chat complete, classify, and rerank.
Leverage LLMs for data processing tasks such as classification, summarization, and data enrichment (see the OpenAI example).

Installation

The fastest ways to run PostgreSQL with the pgai extension are to:

Create your database environment. Either:
Enable the pgai extension.
Use pgai.

Use a pre-built Docker container

Run the TimescaleDB Docker image, then enable the pgai extension.

Use a Timescale Cloud service

pgai is available for new or existing Timescale Cloud services. For any service, enable the pgai extension.

Install from source

To install pgai from source on a PostgreSQL server:

Install the prerequisite software system-wide
- PostgreSQL: Version 16 or newer is required.
- Python3: if running python3 --version in Terminal returns command not found, download and install the latest version of Python3.
- Pip: if running pip --version in Terminal returns command not found:
  - Standard installation: use one of the pip supported methods.
  - Virtual environment: usually, pip is automatically installed if you are working in a Python virtual environment. If you are running PostgreSQL in a virtual environment, pgai requires several python packages. Set the PYTHONPATH and VIRTUAL_ENV environment variables before you start your PostgreSQL server.
```
PYTHONPATH=/path/to/venv/lib/python3.12/site-packages \
VIRTUAL_ENV=/path/to/venv \
pg_ctl -D /path/to/data -l logfile start
```
- PL/Python: follow How to install Postgres 16 with plpython3u: Recipes for macOS, Ubuntu, Debian, CentOS, Docker.
  
  macOS: the standard PostgreSQL brew in Homebrew does not include the plpython3 extension. These instructions show how to install from an alternate tap.
  - Postgresql plugin for the asdf version manager: set the --with-python option when installing PostgreSQL:
```
POSTGRES_EXTRA_CONFIGURE_OPTIONS=--with-python asdf install postgres 16.3
```
- pgvector: follow the install instructions from the official repository.
These extensions are automatically added to your PostgreSQL database when you Enable the pgai extension.
Install the pgai PostgreSQL extension:
```
just ext install
```
We use just to run project commands. If you don't have just you can install the extension with:
```
projects/extension/build.py install
```
Enable the pgai extension.

Enable the pgai extension in your database

Connect to your database with a postgres client like psql v16 or PopSQL.

psql -d "postgres://<username>:<password>@<host>:<port>/<database-name>"

Create the pgai extension:
```
CREATE EXTENSION IF NOT EXISTS ai CASCADE;
```
The CASCADE automatically installs pgvector and plpython3u extensions.

Usage of pgai

The main features in pgai are:

Working with embeddings generated from your data:

Automatically create and sync vector embeddings for your data
Search your data using vector and semantic search
Implement Retrieval Augmented Generation inside a single SQL statement

Leverage LLMs for data processing tasks: You can use pgai to integrate AI from the following providers:

Learn how to moderate content directly in the database using triggers and background jobs.

Automatically create and sync LLM embeddings for your data

The pgvector and pgvectorscale extensions allow you to store vector embeddings in your database and perform fast and efficient vector search. The pgai Vectorizer builds on top of these extensions to automatically create and synchronize embeddings for any text data in your database.

With one line of code, you can define a vectorizer that creates embeddings for data in a table:

SELECT ai.create_vectorizer(
    <table_name>::regclass,
    destination => <embedding_table_name>,
    embedding => ai.embedding_openai(<model_name>, <dimensions>),
    chunking => ai.chunking_recursive_character_text_splitter(<column_name>)
);

This newly created vectorizer will automatically track any changes to the data in the source table and update the destination embedding table with the new embeddings asynchronously.

Automate AI embedding with pgai Vectorizer shows you how to implement embeddings in your own data. When you create Vectorizers in a Timescale Cloud database, embeddings are automatically created and synchronized in the background. On a self-hosted Postgres installation, you use a Vectorizer Worker to asynchronously processes your vectorizers.

Search your data using vector and semantic search

pgai exposes a set of functions to directly interact with the LLM models through SQL, enabling you to do semantic search directly in your database:

SELECT
   chunk,
   embedding <=> ai.openai_embed(<embedding_model>, 'some-query') as distance
FROM <embedding_table>
ORDER BY distance
LIMIT 5;

This is a perfectly normal SQL query. You can combine it with where clauses and other SQL features to further refine your search. pgai solves the missing where clause in vector search problem for real.

Implement Retrieval Augmented Generation inside a single SQL statement

Similar to semantic search, pgai LLM functions enable you to implement RAG directly in your database. For example:

Create a RAG function:

CREATE OR REPLACE FUNCTION generate_rag_response(query_text TEXT)
RETURNS TEXT AS $$
DECLARE
   context_chunks TEXT;
   response TEXT;
BEGIN
   -- Perform similarity search to find relevant blog posts
   SELECT string_agg(title || ': ' || chunk, ' ') INTO context_chunks
   FROM (
       SELECT title, chunk
       FROM blogs_embedding
       ORDER BY embedding <=> ai.openai_embed('text-embedding-3-small', query_text)
       LIMIT 3
   ) AS relevant_posts;

   -- Generate a summary using gpt-4o-mini
   SELECT ai.openai_chat_complete(
       'gpt-4o-mini',
       jsonb_build_array(
           jsonb_build_object('role', 'system', 'content', 'You are a helpful assistant. Use only the context provided to answer the question. Also mention the titles of the blog posts you use to answer the question.'),
           jsonb_build_object('role', 'user', 'content', format('Context: %s\n\nUser Question: %s\n\nAssistant:', context_chunks, query_text))
       )
   )->'choices'->0->'message'->>'content' INTO response;

   RETURN response;
END;
$$ LANGUAGE plpgsql;

Execute your function in a SQL query:

SELECT generate_rag_response('Give me some startup advice');

Get involved

pgai is still at an early stage. Now is a great time to help shape the direction of this project; we are currently deciding priorities. Have a look at the list of features we're thinking of working on. Feel free to comment, expand the list, or hop on the Discussions forum.

To get started, take a look at how to contribute and how to set up a dev/test environment.

About Timescale

Timescale is a PostgreSQL database company. To learn more visit the timescale.com.

Timescale Cloud is a high-performance, developer focused, cloud platform that provides PostgreSQL services for the most demanding AI, time-series, analytics, and event workloads. Timescale Cloud is ideal for production applications and provides high availability, streaming backups, upgrades over time, roles and permissions, and great security.