An OpenAI implementation of the OmniAI interface supporting ChatGPT, Whisper, Text-to-Voice, Voice-to-Text, and more. This library is community-maintained.
gem install omniai-openai
A client is set up as follows if ENV['OPENAI_API_KEY'] exists:
client = OmniAI::OpenAI::Client.new
A client may also be passed the following options:
- api_key (required - default is ENV['OPENAI_API_KEY'])
- organization (optional)
- project (optional)
- host (optional) useful for usage with Ollama or LocalAI
Global configuration is supported for the following options:
OmniAI::OpenAI.configure do |config|
  config.api_key = 'sk-...' # default: ENV['OPENAI_API_KEY']
  config.organization = '...' # default: ENV['OPENAI_ORGANIZATION']
  config.project = '...' # default: ENV['OPENAI_PROJECT']
  config.host = '...' # default: 'https://api.openai.com' - override for usage with LocalAI / Ollama
end
Usage with LocalAI
LocalAI offers built-in compatibility with the OpenAI specification. To initialize a client that points to LocalAI, change the host accordingly:
client = OmniAI::OpenAI::Client.new(host: 'http://localhost:8080', api_key: nil)
For details on installation or running LocalAI see the getting started tutorial.
Usage with Ollama
Ollama offers built-in compatibility with the OpenAI specification. To initialize a client that points to Ollama, change the host accordingly:
client = OmniAI::OpenAI::Client.new(host: 'http://localhost:11434', api_key: nil)
For details on installation or running Ollama, check out the project README.
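When an application switches between OpenAI, LocalAI, and Ollama, the per-backend options can be kept in one place. The following is a minimal sketch, not part of omniai-openai: BACKEND_OPTIONS and backend_options are hypothetical names, and the hosts are the ones used in the examples above.

```ruby
# Hypothetical lookup table (not part of omniai-openai). The hosts are the
# ones used in the examples above; adjust the ports to match a local setup.
BACKEND_OPTIONS = {
  openai: { host: 'https://api.openai.com' },
  localai: { host: 'http://localhost:8080', api_key: nil },
  ollama: { host: 'http://localhost:11434', api_key: nil },
}.freeze

# Returns the keyword options for a named backend, raising on unknown names.
def backend_options(name)
  BACKEND_OPTIONS.fetch(name.to_sym) do
    raise ArgumentError, "unknown backend: #{name.inspect}"
  end
end

options = backend_options(:ollama)
# The options can then be splatted into the constructor:
# client = OmniAI::OpenAI::Client.new(**options)
```

Raising on an unknown name keeps a typo from silently falling back to the hosted OpenAI API.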
A chat completion is generated by passing in prompts using any of a variety of formats:
completion = client.chat('Tell me a joke!')
completion.choice.message.content # 'Why did the chicken cross the road? To get to the other side.'
completion = client.chat({
  role: OmniAI::Chat::Role::USER,
  content: 'Is it wise to jump off a bridge?'
})
completion.choice.message.content # 'No.'
completion = client.chat([
  {
    role: OmniAI::Chat::Role::SYSTEM,
    content: 'You are a helpful assistant.'
  },
  'What is the capital of Canada?',
])
completion.choice.message.content # 'The capital of Canada is Ottawa.'
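To make the accepted shapes concrete, the sketch below expands a string, a hash, or a mixed array into a uniform list of role/content hashes. It is an illustration only and not the library's implementation: normalize_messages is a hypothetical helper, and it assumes that a bare string stands for a user message and that roles are the plain strings 'user' and 'system'.

```ruby
# Hypothetical helper (not part of omniai-openai): expands bare strings into
# user messages so every entry has an explicit role and content.
# Assumption: a bare string is treated as a user message.
def normalize_messages(prompts)
  list = prompts.is_a?(Array) ? prompts : [prompts]
  list.map do |prompt|
    prompt.is_a?(Hash) ? prompt : { role: 'user', content: prompt.to_s }
  end
end

messages = normalize_messages([
  { role: 'system', content: 'You are a helpful assistant.' },
  'What is the capital of Canada?',
])
# messages now holds two hashes: the system message, then a user message.
```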
model takes an optional string (default is gpt-4o):
completion = client.chat('How fast is a cheetah?', model: OmniAI::OpenAI::Chat::Model::GPT_3_5_TURBO)
completion.choice.message.content # 'A cheetah can reach speeds over 100 km/h.'
temperature takes an optional float between 0.0 and 2.0 (default is 0.7):
completion = client.chat('Pick a number between 1 and 5', temperature: 2.0)
completion.choice.message.content # '3'
OpenAI API Reference temperature
stream takes an optional proc to stream responses in real-time chunks instead of waiting for a complete response:
stream = proc do |chunk|
  print(chunk.choice.delta.content) # 'Better', 'three', 'hours', ...
end
client.chat('Be poetic.', stream:)
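A streaming proc can also collect the chunks into a single string for use once the stream has finished. The sketch below simulates the chunks with Struct stand-ins so that it runs without a client; only the chunk.choice.delta.content call chain comes from the example above, and the nil check is an assumption about chunks that carry no text.

```ruby
# Stand-ins for the streamed chunk objects, for illustration only. The real
# chunks come from the library; only the call chain
# chunk.choice.delta.content is taken from the example above.
Delta = Struct.new(:content)
Choice = Struct.new(:delta)
Chunk = Struct.new(:choice)

buffer = +''
stream = proc do |chunk|
  content = chunk.choice.delta.content
  next if content.nil? # assumption: some chunks may carry no text
  buffer << content
end

# Simulate chunks arriving. With a real client this would instead be:
# client.chat('Be poetic.', stream:)
['Better ', 'three ', 'hours'].each do |text|
  stream.call(Chunk.new(Choice.new(Delta.new(text))))
end
stream.call(Chunk.new(Choice.new(Delta.new(nil))))

buffer # 'Better three hours'
```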
format takes an optional symbol (:json) that sets the response_format to json_object:
completion = client.chat([
  { role: OmniAI::Chat::Role::SYSTEM, content: OmniAI::Chat::JSON_PROMPT },
  { role: OmniAI::Chat::Role::USER, content: 'What is the name of the drummer for the Beatles?' }
], format: :json)
JSON.parse(completion.choice.message.content) # { "name": "Ringo" }
OpenAI API Reference response_format
When using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message.
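Even in JSON mode it can be worth guarding the parse, for example when a response is cut off before the JSON is complete. The sketch below uses only Ruby's standard JSON library; parse_json_content is a hypothetical helper, not part of omniai-openai.

```ruby
require 'json'

# Hypothetical helper (not part of omniai-openai): parses the text of a
# completion and returns nil instead of raising when it is not valid JSON.
def parse_json_content(content)
  JSON.parse(content)
rescue JSON::ParserError, TypeError
  nil
end

parsed = parse_json_content('{ "name": "Ringo" }')
parsed['name']                    # 'Ringo'
parse_json_content('Ringo Starr') # nil
# With a real completion:
# parse_json_content(completion.choice.message.content)
```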
A transcription is generated by passing in a path to a file:
transcription = client.transcribe(file.path)
transcription.text # '...'
prompt is optional and can provide additional context for transcribing:
transcription = client.transcribe(file.path, prompt: '')
transcription.text # '...'
format is optional and supports json, text, srt or vtt:
transcription = client.transcribe(file.path, format: OmniAI::Transcribe::Format::TEXT)
transcription.text # '...'
OpenAI API Reference response_format
language is optional and may improve accuracy and latency:
transcription = client.transcribe(file.path, language: OmniAI::Transcribe::Language::SPANISH)
transcription.text
temperature is optional and must be between 0.0 (more deterministic) and 1.0 (less deterministic):
transcription = client.transcribe(file.path, temperature: 0.2)
transcription.text
OpenAI API Reference temperature
Speech can be generated by passing text with a block:
File.open('example.ogg', 'wb') do |file|
  client.speak('How can a clam cram in a clean cream can?') do |chunk|
    file << chunk
  end
end
If a block is not provided then a tempfile is returned:
tempfile = client.speak('Can you can a can as a canner can can a can?')
tempfile.close
tempfile.unlink
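Closing and unlinking by hand is easy to skip when an exception is raised in between. The sketch below wraps the cleanup in an ensure clause; with_cleanup is a hypothetical helper, and a plain Tempfile from Ruby's standard library stands in for the one returned by client.speak.

```ruby
require 'tempfile'

# Hypothetical helper (not part of omniai-openai): yields a tempfile and
# always closes and deletes it afterwards, even if the block raises.
def with_cleanup(tempfile)
  yield tempfile
ensure
  tempfile.close
  tempfile.unlink
end

# A plain Tempfile stands in for the one returned by client.speak(...).
stand_in = Tempfile.new(['speech', '.mp3'])
path = stand_in.path
size = with_cleanup(stand_in) do |file|
  file.write('audio bytes')
  file.flush
  File.size(file.path)
end
# size is the number of bytes written; the file at path no longer exists.
```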
voice is optional and must be one of the supported voices:
client.speak('She sells seashells by the seashore.', voice: OmniAI::OpenAI::Speak::Voice::SHIMMER)
model is optional and must be either tts-1 or tts-1-hd (default):
client.speak('I saw a kitten eating chicken in the kitchen.', model: OmniAI::OpenAI::Speak::Model::TTS_1)
speed is optional and must be between 0.25 and 4.0:
client.speak('How much wood would a woodchuck chuck if a woodchuck could chuck wood?', speed: 4.0)
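When the speed comes from user input, it can be clamped before the call rather than rejected by the API. This is a minimal sketch: clamp_speed, MIN_SPEED, and MAX_SPEED are hypothetical names, and the 4.0 upper bound matches the example above.

```ruby
# Hypothetical helper (not part of omniai-openai): keeps a requested speed
# inside the supported range before it is passed to client.speak.
MIN_SPEED = 0.25
MAX_SPEED = 4.0 # upper bound taken from the example above

def clamp_speed(speed)
  speed.to_f.clamp(MIN_SPEED, MAX_SPEED)
end

clamp_speed(10)  # 4.0
clamp_speed(0.1) # 0.25
clamp_speed(1.5) # 1.5
# client.speak('...', speed: clamp_speed(requested_speed))
```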
format is optional and supports MP3 (default), OPUS, AAC, FLAC, WAV or PCM:
client.speak('A pessimistic pest exists amidst us.', format: OmniAI::OpenAI::Speak::Format::FLAC)