Introduction
In today’s data-driven world, efficient data management and security have become paramount. Weaviate, an open-source vector database, offers a robust solution for storing and retrieving data objects based on their semantic properties. This article covers the security settings I found helpful for locking down my environments when testing Weaviate with my AI-powered applications. I will walk you through setting up Weaviate with Docker Compose and securing it with Nginx and Let’s Encrypt, a combination that keeps the setup convenient while encrypting traffic to the database.
Understanding the Components
Before we dive into the setup process, it’s crucial to understand the key components involved:
- Weaviate: Weaviate is an open-source vector database with features like vector search, structured filtering, and a GraphQL API for seamless data access. It efficiently handles semantic search and data retrieval, making it an invaluable tool for software engineers, data engineers, and data scientists.
- Docker Compose: Docker Compose simplifies the management of multiple Docker containers, grouping them as a single application. This tool proves invaluable in handling complex containerized applications through its straightforward service definition and execution capabilities.
- Nginx: Nginx is a popular web server and reverse proxy server. In our setup, Nginx acts as a reverse proxy for Weaviate, handling SSL encryption and forwarding requests to the Weaviate container.
- Let’s Encrypt: Let’s Encrypt is a trusted certificate authority that offers automated and free SSL certificates. By utilizing Let’s Encrypt, we can secure our Nginx server with an SSL certificate, ensuring data protection through encryption.
Prerequisites
- A registered domain name. This tutorial will use example.com throughout.
- Both of the following DNS records set up for your server:
  - An A record with example.com pointing to your server’s public IP address.
  - An A record with vector.example.com pointing to your server’s public IP address.
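Before requesting certificates, it can help to confirm that both A records already resolve to the server. The following is a minimal sketch using only the Python standard library; the function name check_a_records and the address 203.0.113.10 are illustrative assumptions, not part of the original setup.

```python
import socket
from typing import Callable, Dict, Iterable


def check_a_records(
    hostnames: Iterable[str],
    expected_ip: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, bool]:
    """Map each hostname to True if it resolves to expected_ip, else False."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolve(name) == expected_ip
        except OSError:
            # socket.gaierror is a subclass of OSError: the record is
            # missing or has not propagated yet.
            results[name] = False
    return results


# Example call (hypothetical IP; replace with your server's public address):
# check_a_records(["example.com", "vector.example.com"], "203.0.113.10")
```

The resolver is passed in as a parameter so the check can be exercised without real DNS lookups; by default it uses socket.gethostbyname, which returns a single IPv4 address.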
Setting Up Weaviate with Docker Compose and Securing It
Step 1: Install Docker and Docker Compose:
Begin by installing Docker and Docker Compose on your server, following the official Docker documentation. If you are new to Docker Compose, please check out the Docker Introduction for Weaviate Users.
Step 2: Configure the Docker Compose File:
Create a new file named docker-compose.yml and copy the provided configuration. This configuration file defines two services: weaviate and nginx. Weaviate Configurator is a quick way to get started with generating the docker-compose.yml. The weaviate service runs the Weaviate container, exposing port 8080, and employs volume mounting for persistent data storage. The nginx service runs the Nginx container, exposing ports 80 and 443, and includes volume mounts for Nginx configuration and SSL certificates.
Although the AI applications I am currently testing connect to Weaviate over a Virtual Network (VNet), I have also configured Weaviate to disallow anonymous access and to require an API key, as seen in the following docker-compose.yml example:
version: '3.4'
services:
  weaviate:
    image: semitechnologies/weaviate:1.21.1
    ports:
      - 8080:8080
    volumes:
      - weaviate_data:/var/lib/weaviate
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'false'
      AUTHENTICATION_APIKEY_ENABLED: 'true'
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: '**'
      AUTHENTICATION_APIKEY_USERS: '**'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: 'text2vec-transformers'
      ENABLE_MODULES: 'text2vec-transformers'
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: 0 # set to 1 to enable
volumes:
  weaviate_data:
…
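The two redacted values above are where the API keys and their user names go. As a rough sketch, assuming Weaviate reads both settings as comma-separated lists with one key per user (check the Weaviate authentication documentation for your version), the values could be generated like this. The helper name build_apikey_env and the user names are hypothetical.

```python
import secrets
from typing import Dict, List


def build_apikey_env(users: List[str], key_bytes: int = 32) -> Dict[str, str]:
    """Generate one random API key per user and return the two env values.

    Assumption: both settings are comma-separated lists, paired by position.
    """
    # token_urlsafe only emits letters, digits, '-' and '_', so a key can
    # never contain the comma used as the list separator.
    keys = [secrets.token_urlsafe(key_bytes) for _ in users]
    return {
        "AUTHENTICATION_APIKEY_ALLOWED_KEYS": ",".join(keys),
        "AUTHENTICATION_APIKEY_USERS": ",".join(users),
    }


# Hypothetical user names; paste the printed values into docker-compose.yml.
for name, value in build_apikey_env(["app-a", "app-b"]).items():
    print(f"{name}: '{value}'")
```

Generating keys with the secrets module rather than writing them by hand avoids short or guessable values. Treat the output like a password and keep it out of version control.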
Step 3: Install and Set Up Nginx:
Because Nginx is available in Ubuntu’s default repositories, we can easily install it using the apt packaging system.
Since this is our first time working with the apt packaging system in this session, let’s start by updating our local package index to make sure we have access to the latest packages. After that, we can go ahead and install Nginx:
sudo apt update
sudo apt install nginx
Before testing Nginx, adjust the firewall to allow access to the service. On installation, Nginx registers itself as a service with ufw, which makes it straightforward to allow Nginx access.
Three ufw profiles are available for Nginx:
- Nginx Full: This profile opens both port 80 (normal, unencrypted web traffic) and port 443 (TLS/SSL encrypted traffic).
- Nginx HTTP: This profile opens only port 80 (normal, unencrypted web traffic).
- Nginx HTTPS: This profile opens only port 443 (TLS/SSL encrypted traffic).
Enable HTTP and SSL traffic by typing:
sudo ufw allow 'Nginx Full'
Step 4: Obtain the Let’s Encrypt SSL Certificate:
To secure your Nginx server with an SSL certificate from Let’s Encrypt, install Certbot and the Nginx plugin by running the command:
sudo apt install certbot python3-certbot-nginx
Certbot provides several ways to obtain SSL certificates through its plugins. The Nginx plugin takes care of reconfiguring Nginx and reloading its configuration whenever necessary. Run certbot with the --nginx plugin, using the -d flag to specify each domain name the certificate should be valid for:
sudo certbot --nginx -d example.com -d vector.example.com
Note that Let’s Encrypt certificates are only valid for ninety days, which encourages users to automate renewal. The certbot package is set up to handle this for us: it installs a systemd timer that runs twice a day and renews any certificate that will expire within the next thirty days.
You can query the status of the timer with systemctl:
sudo systemctl status certbot.timer
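To confirm independently that renewal is keeping up, you can read the expiry date from the certificate the server actually presents. This is a minimal sketch using Python’s standard ssl module; the function names are illustrative, and the thirty-day comparison simply mirrors the renewal window described above.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter date.

    not_after uses the format returned by getpeercert(), for example
    "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires_at - now) / 86400


def fetch_not_after(hostname: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect over TLS and return the served certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


# Example (requires network access to your own host):
# remaining = days_until_expiry(fetch_not_after("vector.example.com"))
# if remaining < 30:
#     print("Certificate is inside the renewal window; check certbot.timer")
```

Because the default SSL context verifies the certificate chain and hostname, fetch_not_after raises an exception if the certificate has already expired or does not match the host.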
Step 5: Start the Weaviate and Nginx Containers:
Navigate to the directory containing the docker-compose.yml file and run the command docker-compose up -d to start the containers. The -d flag ensures that the containers run in detached mode, allowing them to operate independently.
docker-compose up -d
Step 6: Access Weaviate via Nginx:
Congratulations! You have now successfully set up Weaviate using Docker Compose and secured it with Nginx and Let’s Encrypt. Access Weaviate by visiting https://vector.example.com or https://your-server-ip. Remember to replace example.com or your-server-ip with your own domain or server IP address. Keep in mind that the certificate was issued for the domain names only, so clients connecting by IP address will see a certificate warning.
Connection Example with Python Client:
import weaviate

client = weaviate.Client(
    url="https://vector.example.com/",
    auth_client_secret=weaviate.AuthApiKey(api_key="****"),
)
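If you want to check the endpoint and key without installing the client library, a plain HTTPS request works too. The sketch below assumes that Weaviate accepts the API key as a bearer token in the Authorization header and that /v1/meta is reachable through the proxy; the helper names are illustrative.

```python
import json
import urllib.request


def build_meta_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for Weaviate's /v1/meta endpoint."""
    url = base_url.rstrip("/") + "/v1/meta"
    # Assumption: the API key is sent as "Authorization: Bearer <key>".
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def fetch_meta(base_url: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_meta_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (requires network access and a valid key):
# meta = fetch_meta("https://vector.example.com/", "****")
# print(meta.get("version"))
```

With anonymous access disabled, the same call with a wrong key should fail with an HTTP error (urllib raises urllib.error.HTTPError), which is a quick way to confirm that the access restriction is in effect.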
Conclusion:
Setting up Weaviate with Docker Compose offers a convenient approach to managing and securing your data. By incorporating Nginx as a reverse proxy and utilizing Let’s Encrypt for SSL encryption, you ensure the protection of your valuable information. Please note that this article provides a basic setup and does not cover advanced configuration options or production considerations. Refer to the official documentation of Weaviate, Docker Compose, Nginx, and Let’s Encrypt for more comprehensive information and customization options. With Weaviate, you can leverage its powerful features for semantic search and data retrieval, empowering you in various use cases such as e-commerce search, data classification, and automated data harmonization.
