Deploy and Secure Airbyte with Nginx Reverse Proxy, Basic Authentication & Let’s Encrypt SSL Certificates

Index

Introduction

Architecture

Solution Deployment

A. Deploy VMs for Nginx and Airbyte

B. Configure HTTP Basic Authentication and Deploy Nginx

C. Enable HTTPS and Let’s Encrypt SSL Certificates

Summary

References

Introduction

In this post we will deploy Airbyte, one of the most exciting open-source ELT tools in modern data engineering. This is part of an ongoing series of posts on deploying and using Airbyte for data engineering use cases. A deployment guide for Airbyte on OCI is already available. The setup described here is a production-grade deployment built from components on Oracle Cloud Infrastructure (OCI) at minimal cost: by using the Always Free tier available on OCI, you can build it for almost $0.

Airbyte is still in alpha and currently has no authentication scheme and no support for HTTPS with TLS/SSL certificates. The tool is excellent for modern data engineering use cases, but without those features enterprise adoption is difficult, because HTTPS and SSL certificates are a must for any organization that wants to build a data pipeline on its critical data. To overcome these limitations we redesigned the Airbyte deployment architecture and hardened it: Nginx, secured with free Let’s Encrypt SSL certificates, acts as a reverse proxy and forwards connections to Airbyte, which runs on a private IP and is not directly exposed to the Internet.

This architecture can also be built on AWS or Azure using the same architectural underpinnings of private IP, public IP, reverse proxy and so on. It is a cloud-agnostic architecture using a mix of basic IaaS and open-source software. For example, on AWS you can have a VPC with a private and a public subnet, deploy Nginx on an EC2 instance in the public subnet, and deploy Airbyte on Docker in the private subnet.

Architecture

The stack has the important components listed below. It is a mix of network and IaaS components on Oracle Cloud Infrastructure that host Nginx and Airbyte. For the Nginx deployment we will use an OCI ARM-based A1 instance, and for Airbyte we will use an AMD E4 Flex instance. Both of these instances are available in the Always Free tier.

  1. OCI ARM A1 instance: Nginx and SSL certificates
  2. OCI AMD E4 Flex instance: Airbyte on Docker
  3. OCI DNS public zone: domain management, where the A records are added
  4. VCN (Virtual Cloud Network): 2 subnets, 1 public subnet hosting the Nginx VM and 1 private subnet running the Airbyte Docker container

Solution Deployment

A. Deploy the Virtual Machines for Nginx and Airbyte in the Public and Private Subnets Respectively

  1. Deploy an OCI ARM instance in the public subnet and install Nginx on it. Ensure port 80 is allowed in the security list of the public subnet as a stateless rule.

Refer to the deployment guide for Nginx on an OCI ARM VM: https://medium.com/oracledevs/deploy-nginx-on-the-new-oci-arm-a1-instance-in-under-2-mins-977f68a7984d

Private IP of Nginx instance : 10.10.1.138

a. Enable the following SELinux boolean on the OCI ARM VM so that Nginx is allowed to open network connections to the Airbyte VM


sudo setsebool -P httpd_can_network_connect 1

b. Install Nginx: create a file named /etc/yum.repos.d/nginx.repo and paste the configuration below:

[nginx]
name=nginx repo
baseurl=https://nginx.org/packages/rhel/$releasever/$basearch/
gpgcheck=0
enabled=1

## Install nginx
sudo yum install nginx
sudo systemctl start nginx
sudo systemctl status nginx
sudo systemctl enable nginx

## Open HTTP port 80 on the instance for external access
sudo firewall-cmd --zone=public --permanent --add-port=80/tcp
sudo firewall-cmd --zone=public --permanent --add-service=http
sudo firewall-cmd --reload
sudo firewall-cmd --zone=public --permanent --list-ports

2. Deploy the Airbyte instance on an OCI VM in the private subnet. Ensure the security list of the private subnet allows port 8000 as a stateless rule.

Private IP of VM hosting Airbyte : 10.10.1.147

For the Airbyte deployment guide on an OCI VM, refer to: https://docs.airbyte.io/deploying-airbyte/on-oci-vm

B. Configure HTTP Basic Authentication and Install Nginx

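1. Configure Nginx to act as a reverse proxy for Airbyte with HTTP basic authentication

a. Install httpd-tools on the ARM instance, create the password file for the user admin, then open nginx.conf for editing

## httpd-tools provides the htpasswd command
sudo yum install -y httpd-tools
sudo mkdir -p /etc/apache2/
sudo htpasswd -c /etc/apache2/.htpasswd admin
sudo vim /etc/nginx/nginx.conf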

Add the below in the nginx.conf file, save it and reload nginx

user root;

events {
    worker_connections 4096; ## Default: 1024
}

http {
    server {
        listen 80;
        listen [::]:80;
        server_name 10.10.1.138;

        location / {
            proxy_pass http://10.10.1.147:8000;
            proxy_set_header X-Forwarded-User $http_authorization;
            auth_basic "Administrator's Area";
            auth_basic_user_file /etc/apache2/.htpasswd;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_pass_header Accept;
            proxy_pass_header Server;
            proxy_http_version 1.1;
            proxy_set_header Authorization $http_authorization;
            proxy_pass_header Authorization;
            proxy_set_header ns_server-ui yes;
        }
    }
}

b. Check the configuration is ok

sudo nginx -t

c. Restart nginx

sudo systemctl restart nginx
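
The server block from step a can also be generated from two values, which makes it harder to mistype the addresses. The sketch below is only an illustration: render_proxy_server is a hypothetical helper name, and it prints a reduced set of the headers used above rather than the full configuration. The output is meant to be pasted inside the http { } block of nginx.conf.

```shell
# Sketch: print an nginx server block for the Airbyte reverse proxy.
# render_proxy_server is a hypothetical helper, not part of nginx or Airbyte.
render_proxy_server() {
  server_name="$1"   # e.g. 10.10.1.138 or airbyte.yourdomain.com
  upstream="$2"      # e.g. 10.10.1.147:8000
  # \$ keeps nginx variables such as $host literal inside the here-document.
  cat <<EOF
server {
    listen 80;
    listen [::]:80;
    server_name ${server_name};

    location / {
        auth_basic "Administrator's Area";
        auth_basic_user_file /etc/apache2/.htpasswd;

        proxy_pass http://${upstream};
        proxy_http_version 1.1;
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
        proxy_set_header X-Forwarded-For \$proxy_add_x_forwarded_for;
    }
}
EOF
}

# Example: print the block for the addresses used in this post.
render_proxy_server "10.10.1.138" "10.10.1.147:8000"
```

Because the text is produced by the shell, it contains only plain ASCII quotes, which avoids the non-ASCII character problem that certbot can run into later.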

2. Connect to the public IP where Nginx runs. It will prompt for the HTTP basic authentication password, then act as a reverse proxy and forward connections to the Airbyte instance running on a private IP

http://<public-ip>/

Tail the error log to check for any issues


sudo tail -30f /var/log/nginx/error.log
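
The same check can be scripted from any machine that can reach the public IP. This is a minimal sketch, assuming curl is installed and that the Airbyte UI answers with status 200 at / once the login succeeds; http_status and expect_status are hypothetical helper names. Nginx itself returns 401 when auth_basic is enabled and no valid credentials are sent.

```shell
# Sketch: confirm the proxy rejects anonymous requests and accepts valid credentials.
# http_status and expect_status are hypothetical helper names.

# Print only the HTTP status code for a URL; extra arguments are passed to curl.
http_status() {
  url="$1"; shift
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$@" "$url"
}

# Compare an observed status code with the expected one and report the result.
expect_status() {
  label="$1"; actual="$2"; expected="$3"
  if [ "$actual" = "$expected" ]; then
    echo "OK   ${label}: ${actual}"
  else
    echo "FAIL ${label}: got ${actual}, expected ${expected}"
    return 1
  fi
}

# Usage (replace <public-ip> and the password with your own values):
#   expect_status "anonymous"  "$(http_status "http://<public-ip>/")" 401
#   expect_status "with login" "$(http_status "http://<public-ip>/" -u admin:yourpassword)" 200
```

A 502 on the second request usually means Nginx could not reach 10.10.1.147:8000, in which case the error log and the private subnet security list are the first places to look.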

C. Enable HTTPS and Acquire Let’s Encrypt SSL Certificates for Your Domain/Sub-Domain

Before we get started, add 2 A records in your DNS domain management, for example airbyte.yourdomain.com and www.airbyte.yourdomain.com, both pointing to the public IP of your ARM instance. This can be done with any third-party provider such as Crazy Domains or GoDaddy, but in my case I use OCI to manage my domain.
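
Certbot can only validate the domain once the A records resolve to the Nginx VM, so it may be worth checking them first. The following is a sketch under assumptions: has_ip and check_a_record are hypothetical helper names, dig is assumed to be installed (on Oracle Linux it ships in the bind-utils package), and 203.0.113.10 is a placeholder address, not a real public IP.

```shell
# Sketch: check that a hostname's A record points at the Nginx VM's public IP.
# has_ip and check_a_record are hypothetical helpers.

# Succeed if the expected IP appears as a whole line on standard input.
has_ip() {
  grep -Fxq -- "$1"
}

# Resolve the A record with dig and compare it with the expected address.
check_a_record() {
  domain="$1"; expected_ip="$2"
  if dig +short A "$domain" | has_ip "$expected_ip"; then
    echo "${domain} resolves to ${expected_ip}"
  else
    echo "${domain} does not resolve to ${expected_ip} yet" >&2
    return 1
  fi
}

# Usage (203.0.113.10 is a placeholder for your public IP):
#   check_a_record airbyte.yourdomain.com 203.0.113.10
#   check_a_record www.airbyte.yourdomain.com 203.0.113.10
```

New records can take a while to propagate, so a failure right after creating them is not necessarily a configuration error.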

1. Enable the EPEL repo on Oracle Linux and install certbot

sudo yum install -y yum-utils
sudo yum-config-manager --enable ol7_optional_latest
sudo yum-config-manager --enable ol7_developer_EPEL
cd /tmp
wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo rpm -Uvh /tmp/epel-release-latest-7.noarch.rpm
sudo yum install certbot
sudo yum install python-certbot-nginx

2. Edit nginx.conf and change server_name to the new domain names for which you created the A records above

sudo vi /etc/nginx/nginx.conf

user root;

events {
    worker_connections 4096; ## Default: 1024
}

http {
    server {
        listen 80;
        listen [::]:80;
        server_name airbyte.yourdomain.com www.airbyte.yourdomain.com;

        location / {
            proxy_pass http://10.10.1.147:8000;
            proxy_set_header X-Forwarded-User $http_authorization;
            auth_basic "Administrators Area";
            auth_basic_user_file /etc/apache2/.htpasswd;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_pass_header Accept;
            proxy_pass_header Server;
            proxy_http_version 1.1;
            proxy_set_header Authorization $http_authorization;
            proxy_pass_header Authorization;
            proxy_set_header ns_server-ui yes;
        }
    }
}

3. Reload Nginx

sudo nginx -s reload

4. Run certbot to get the Let’s Encrypt SSL certificates for the new sub-domain

sudo certbot --nginx -d airbyte.yourdomain.com -d www.airbyte.yourdomain.com

If certbot fails with "An unexpected error occurred: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 35: ordinal not in range(128)", one of the configuration files contains non-ASCII characters, typically typographic quotes copied from a web page. Run the command below to find them:

sudo grep -r -P '[^\x00-\x7f]' /etc/apache2 /etc/letsencrypt /etc/nginx
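
If the offending characters are typographic quotes, they can be replaced in one pass instead of by hand. This is a small sketch, not a general cleaner: to_ascii_quotes is a hypothetical helper name, and it handles only the four curly quote characters (U+201C, U+201D, U+2018, U+2019). Any other non-ASCII character that grep reports still has to be fixed manually.

```shell
# Sketch: replace typographic quotes with plain ASCII quotes on standard input.
# to_ascii_quotes is a hypothetical helper name. The byte sequences are the
# UTF-8 encodings of the four curly quote characters, written in octal.
to_ascii_quotes() (
  ldq=$(printf '\342\200\234')   # left double quote  (U+201C)
  rdq=$(printf '\342\200\235')   # right double quote (U+201D)
  lsq=$(printf '\342\200\230')   # left single quote  (U+2018)
  rsq=$(printf '\342\200\231')   # right single quote (U+2019)
  sed -e "s/${ldq}/\"/g" -e "s/${rdq}/\"/g" \
      -e "s/${lsq}/'/g"  -e "s/${rsq}/'/g"
)

# Usage: write a cleaned copy, review it, then copy it into place.
#   to_ascii_quotes < /etc/nginx/nginx.conf > /tmp/nginx.conf.ascii
#   diff /etc/nginx/nginx.conf /tmp/nginx.conf.ascii
```

Writing to a separate file first keeps the original configuration intact until the diff has been reviewed and nginx -t passes on the cleaned version.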

Run certbot again after fixing the special characters in the nginx.conf file

sudo certbot --nginx -d airbyte.yourdomain.com -d www.airbyte.yourdomain.com

Select option 2 to redirect all HTTP traffic to HTTPS. Make sure to add port 443 to the security list of the public subnet as a stateless rule.

5. Open HTTPS port 443 in firewalld on the Nginx VM

sudo firewall-cmd --zone=public --permanent --add-port=443/tcp
sudo firewall-cmd --zone=public --permanent --add-service=https
sudo firewall-cmd --reload
sudo firewall-cmd --zone=public --permanent --list-ports

Log in at https://airbyte.yourdomain.com

Use the username ‘admin’ and the password that was created with htpasswd in section B, step 1
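
Once the login works over HTTPS, it can be useful to confirm how long the issued certificate remains valid. The sketch below assumes openssl is installed and that certbot stored the certificate in its default location under /etc/letsencrypt/live/; cert_valid_for_days is a hypothetical helper name.

```shell
# Sketch: check that a certificate stays valid for at least N more days.
# cert_valid_for_days is a hypothetical helper built on "openssl x509 -checkend".
cert_valid_for_days() {
  cert_file="$1"; days="$2"
  # -checkend exits 0 when the certificate will NOT expire within the given seconds.
  openssl x509 -checkend "$((days * 86400))" -noout -in "$cert_file" >/dev/null
}

# Usage, from a root shell (the live directory is readable by root only):
#   cert=/etc/letsencrypt/live/airbyte.yourdomain.com/fullchain.pem
#   cert_valid_for_days "$cert" 30 && echo "valid for at least 30 more days"
```

The function returns a non-zero status when the certificate expires inside the window, so it can be used directly in an if statement or a monitoring script.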

Summary

We have seen how easy it is to create a production-grade Airbyte deployment on Oracle Cloud Infrastructure using Let’s Encrypt, Nginx and HTTP basic authentication. This stack can be built at almost zero cost using the Oracle Cloud Always Free tier (barring the DNS domain cost).

References:

[1] Oracle Cloud Compute E4 platform — https://blogs.oracle.com/cloud-infrastructure/post/announcing-oracle-cloud-compute-e4-platform-on-third-gen-amd-epyc-processors

[2] Oracle Ampere A1 Compute — https://www.oracle.com/au/cloud/compute/arm/

[3] Oracle Cloud Always Free Tier — https://www.oracle.com/au/cloud/free/

[4] Deploying Airbyte on OCI VM — https://docs.airbyte.io/deploying-airbyte/on-oci-vm

[5] Nginx on OCI ARM — https://medium.com/oracledevs/deploy-nginx-on-the-new-oci-arm-a1-instance-in-under-2-mins-977f68a7984d

[6] Nginx Reverse Proxy — https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/

[7] Lets Encrypt certbot — https://certbot.eff.org/

[8] Nginx Restricting Access with HTTP Basic Authentication — https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/

Shadab Mohammad

Principal Cloud Solutions Architect, Database & Analytics at Oracle
