Lord of Large Language Models (LoLLMs) Server is a text generation server based on large language models. It provides a Flask-based API for generating text using various pre-trained language models. This server is designed to be easy to install and use, allowing developers to integrate powerful text generation capabilities into their applications.
- Fully integrated library with access to bindings, personalities and helper tools.
- Generate text using large language models.
- Supports multiple personalities for generating text with different styles and tones.
- Real-time text generation with WebSocket-based communication.
- RESTful API for listing personalities and adding new personalities.
- Easy integration with various applications and frameworks.
- Possibility to send files to personalities
- Possibility to run on multiple nodes and provide a generation service to many outputs at once.
- Data stays local even in the remote version. Only generations are sent to the host node. The logs, data and discussion history are kept in your local disucssion folder.
You can install LoLLMs using pip, the Python package manager. Open your terminal or command prompt and run the following command:
pip install --upgrade lollms
Or if you want to get the latest version from the git:
pip install --upgrade git+https://github.com/ParisNeo/lollms.git
If you want to use cuda. Either install it directly or use conda to install everything:
conda create --name lollms python=3.10
Activate the environment
conda activate lollms
Install cudatoolkit
conda install -c anaconda cudatoolkit
Install lollms
pip install --upgrade lollms
Now you are ready.
To simply configure your environment run the settings app:
lollms-settings
The tool is intuitive and will guide you through configuration process.
The first time you will be prompted to select a binding.
Once the binding is selected, you have to install at least a model. You have two options:
1- install from internet. Just give the link to a model on hugging face. For example. if you select the default llamacpp python bindings (7), you can install this model:
https://huggingface.co/TheBloke/airoboros-7b-gpt4-GGML/resolve/main/airoboros-7b-gpt4.ggmlv3.q4_1.bin
2- install from local drive. Just give the path to a model on your pc. The model will not be copied. We only create a reference to the model. This is useful if you use multiple clients so that you can mutualize your models with other tools.
Now you are ready to use the server.
Here is the smallest possible example that allows you to use the full potential of the tool with nearly no code
from lollms.console import Conversation
cv = Conversation(None)
cv.start_conversation()
Now you can reimplement the start_conversation method to do the things you want:
from lollms.console import Conversation
class MyConversation(Conversation):
def __init__(self, cfg=None):
super().__init__(cfg, show_welcome_message=False)
def start_conversation(self):
prompt = "Once apon a time"
def callback(text, type=None):
print(text, end="", flush=True)
return True
print(prompt, end="", flush=True)
output = self.safe_generate(prompt, callback=callback)
if __name__ == '__main__':
cv = MyConversation()
cv.start_conversation()
Or if you want here is a conversation tool written in few lines
from lollms.console import Conversation
class MyConversation(Conversation):
def __init__(self, cfg=None):
super().__init__(cfg, show_welcome_message=False)
def start_conversation(self):
full_discussion=""
while True:
prompt = input("You: ")
if prompt=="exit":
return
if prompt=="menu":
self.menu.main_menu()
full_discussion += self.personality.user_message_prefix+prompt+self.personality.link_text
full_discussion += self.personality.ai_message_prefix
def callback(text, type=None):
print(text, end="", flush=True)
return True
print(self.personality.name+": ",end="",flush=True)
output = self.safe_generate(full_discussion, callback=callback)
full_discussion += output.strip()+self.personality.link_text
print()
if __name__ == '__main__':
cv = MyConversation()
cv.start_conversation()
Here we use the safe_generate method that does all the cropping for you ,so you can chat forever and will never run out of context.
Once installed, you can start the LoLLMs Server using the lollms-server
command followed by the desired parameters.
lollms-server --host <host> --port <port> --config <config_file> --bindings_path <bindings_path> --personalities_path <personalities_path> --models_path <models_path> --binding_name <binding_name> --model_name <model_name> --personality_full_name <personality_full_name>
-
--host
: The hostname or IP address to bind the server (default: localhost). -
--port
: The port number to run the server (default: 9600). -
--config
: Path to the configuration file (default: None). -
--bindings_path
: The path to the Bindings folder (default: "./bindings_zoo"). -
--personalities_path
: The path to the personalities folder (default: "./personalities_zoo"). -
--models_path
: The path to the models folder (default: "./models"). -
--binding_name
: The default binding to be used (default: "llama_cpp_official"). -
--model_name
: The default model name (default: "Manticore-13B.ggmlv3.q4_0.bin"). -
--personality_full_name
: The full name of the default personality (default: "personality").
Start the server with default settings:
lollms-server
Start the server on a specific host and port:
lollms-server --host 0.0.0.0 --port 5000
-
connect
: Triggered when a client connects to the server. -
disconnect
: Triggered when a client disconnects from the server. -
list_personalities
: List all available personalities. -
add_personality
: Add a new personality to the server. -
generate_text
: Generate text based on the provided prompt and selected personality.
-
GET /personalities
: List all available personalities. -
POST /personalities
: Add a new personality to the server.
Sure! Here are examples of how to communicate with the LoLLMs Server using JavaScript and Python.
// Establish a WebSocket connection with the server
const socket = io.connect('http://localhost:9600');
// Event: When connected to the server
socket.on('connect', () => {
console.log('Connected to the server');
// Request the list of available personalities
socket.emit('list_personalities');
});
// Event: Receive the list of personalities from the server
socket.on('personalities_list', (data) => {
const personalities = data.personalities;
console.log('Available Personalities:', personalities);
// Select a personality and send a text generation request
const selectedPersonality = personalities[0];
const prompt = 'Once upon a time...';
socket.emit('generate_text', { personality: selectedPersonality, prompt: prompt });
});
// Event: Receive the generated text from the server
socket.on('text_generated', (data) => {
const generatedText = data.text;
console.log('Generated Text:', generatedText);
// Do something with the generated text
});
// Event: When disconnected from the server
socket.on('disconnect', () => {
console.log('Disconnected from the server');
});
import socketio
# Create a SocketIO client
sio = socketio.Client()
# Event: When connected to the server
@sio.on('connect')
def on_connect():
print('Connected to the server')
# Request the list of available personalities
sio.emit('list_personalities')
# Event: Receive the list of personalities from the server
@sio.on('personalities_list')
def on_personalities_list(data):
personalities = data['personalities']
print('Available Personalities:', personalities)
# Select a personality and send a text generation request
selected_personality = personalities[0]
prompt = 'Once upon a time...'
sio.emit('generate_text', {'personality': selected_personality, 'prompt': prompt})
# Event: Receive the generated text from the server
@sio.on('text_generated')
def on_text_generated(data):
generated_text = data['text']
print('Generated Text:', generated_text)
# Do something with the generated text
# Event: When disconnected from the server
@sio.on('disconnect')
def on_disconnect():
print('Disconnected from the server')
# Connect to the server
sio.connect('http://localhost:9600')
# Keep the client running
sio.wait()
Make sure to have the necessary dependencies installed for the JavaScript and Python examples. For JavaScript, you need the socket.io-client
package, and for Python, you need the python-socketio
package.
Contributions to the LoLLMs Server project are welcome and appreciated. If you would like to contribute, please follow the guidelines outlined in the CONTRIBUTING.md file.
LoLLMs Server is licensed under the Apache 2.0 License. See the LICENSE file for more information.
The source code for LoLLMs Server can be found on GitHub