— A. Deploy VMs for Nginx and Airbyte
— B. Configure HTTP Basic Authentication and Deploy Nginx
— C. Enable HTTPS and Let's Encrypt SSL Certificates
— D. Create Replication from Snowflake to Oracle Database
In this post we will use Airbyte, one of the most exciting open-source ELT tools in modern data engineering, to create near real-time replication from Snowflake to a 2-node Oracle RAC database on OCI. Please note that Airbyte is an Extract, Load, and Transform (ELT) tool, not an ETL tool like Airflow, which can execute DAGs (Directed Acyclic Graphs). …
A. Deploy VMs for Nginx and Airbyte
B. Configure HTTP Basic Authentication and Deploy Nginx
C. Enable HTTPS and Let's Encrypt SSL Certificates
In this post we will deploy Airbyte, one of the most exciting open-source ELT tools in modern data engineering. This post is part of an ongoing series on deploying and using Airbyte for data engineering use cases. There is already a deployment guide available for Airbyte on OCI. …
When you have large files to transfer to OCI, either from on-premises or from another cloud, the best way to do it is a multipart upload. Multipart uploads accommodate objects that are too large for a single upload operation. OCI now supports multipart uploads directly from cURL, without using the SDKs or oci-cli. With a Pre-Authenticated Request (PAR) URL you can avoid configuring API keys and sharing them with external parties. A PAR URL provides a quick and secure way to upload files to OCI Object Storage.
The maximum size for an uploaded object is 10…
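The idea behind a multipart upload can be sketched independently of any OCI API: the object is cut into numbered parts, each described by an offset and a length, and the parts are uploaded one by one. The helper below is a minimal illustration only; the function name `plan_parts` and the sizes used are assumptions for this sketch and are not OCI limits or part of any Oracle tooling.

```python
from typing import List, Tuple


def plan_parts(total_size: int, part_size: int) -> List[Tuple[int, int, int]]:
    """Split an object of total_size bytes into numbered parts.

    Returns a list of (part_number, offset, length) tuples.
    Part numbers start at 1; the last part may be shorter than part_size.
    This is a hypothetical helper and makes no calls to OCI.
    """
    if total_size < 0:
        raise ValueError("total_size must not be negative")
    if part_size <= 0:
        raise ValueError("part_size must be positive")

    parts = []
    offset = 0
    part_number = 1
    while offset < total_size:
        # The final part only covers whatever bytes remain.
        length = min(part_size, total_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


if __name__ == "__main__":
    # A 25-byte object with 10-byte parts yields three parts.
    for part in plan_parts(25, 10):
        print(part)
```

Each tuple tells the uploader which byte range to read from the local file and which part number to send it as; the actual upload calls are left out on purpose.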
A data lake is the evolution of the data warehouse in both an etymological and a functional sense. In the past, enterprises had relational databases as data sources, from which data was extracted, transformed, and then loaded into a central repository. These processes ran daily in batch mode and were appropriately termed ETL (Extract, Transform, Load), and the central database was referred to as a data warehouse. Data warehouse became a generic term for a huge database where you park your goods (data in this case) for a long period of time before extracting value from it…
WordPress is the largest content management system (CMS) in the world and the engine behind a large share of the websites hosted globally. It is considered an old-school CMS based on PHP and Apache, yet it is a powerful platform and flexible enough to adapt to modern CI/CD pipelines. AWS CodeCommit is a highly scalable source control service for hosting private Git repositories.
In this article we will integrate WordPress with AWS CodeCommit to push your code to a private Git repository.
1. Create a Repository in CodeCommit called “Demo-Website” and Create a WordPress…
Airbyte is an up-and-coming open-source ELT platform. Airbyte supports both real-time and batch-based ELT operations across a variety of sources and destinations. Its easy-to-use interface makes replication and transformation of data between disparate data sources a breeze.
In this article we will deploy Airbyte on an Oracle Cloud Infrastructure (OCI) VM. Once the deployment is complete, we will create PostgreSQL to MySQL replication using Airbyte on the OCI VM. The Airbyte deployment guide for OCI is also available here.
Go to OCI Console > Compute > Instances > Create Instance
This is a syndicated post and also appears on the official OCI blog here
Polyglot persistence is a term you might have heard a lot recently in discussions of the new cloud computing paradigm. Though it sounds fairly sophisticated, a simple definition of the term in the context of computing is “Polyglot persistence is the use of different data stores for storing and processing data for different functionality of an application. For example, an e-commerce website that sells products online will use a NoSQL store for the session state of users shopping on the website while…
Oracle recently launched the new Arm-based A1 instance on Oracle Cloud Infrastructure (OCI). The new instances are available in all commercial OCI regions.
You can get started here to deploy the new Arm instance using the Oracle Linux 8 Cloud Developer Image. Once the image is deployed, SSH into the instance and run the commands below to set up an Nginx reverse proxy/web server.
If the package is not available, refer to this link.
To check if the Nginx Webserver is up and running, open http://<publicip-address-of-instance> from a browser. …
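The same check can be scripted instead of done in a browser. The sketch below is a minimal example using only the Python standard library; the function name `is_up` and the 5-second default timeout are assumptions for illustration, and the instance address is still a placeholder you need to fill in.

```python
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP GET on url answers with status 200.

    Any connection failure, timeout, or HTTP error status counts as "down".
    Hypothetical helper for illustration; it is not part of Nginx or OCI.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # URLError covers refused connections and HTTP 4xx/5xx responses;
        # OSError covers socket-level timeouts.
        return False


if __name__ == "__main__":
    # Replace the placeholder with the public IP address of your instance.
    print(is_up("http://<publicip-address-of-instance>"))
```

If this returns False while Nginx is running, the instance firewall or the OCI security list for port 80 is a likely place to look.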
In this post, we will create a Java package using Eclipse IDE, and create an executable jar file to query your Redshift cluster.
Eclipse is one of the most popular IDEs for Java programming. You can download it free from here → https://www.eclipse.org/downloads/
Java can be downloaded from Oracle’s website. I prefer to use Oracle Java as it is more widely compatible with the ecosystem out there.
Download Redshift JDBC driver from this link: https://docs.aws.amazon.com/redshift/latest/mgmt/configure-jdbc-connection.html
Create a demo Redshift cluster and make sure the security group has the IP address or CIDR range of the machine from which you will…
Oracle Cloud Infrastructure (OCI) has an Internet-scale object store called the OCI Object Storage service. OCI Object Storage provides Amazon S3-compatible APIs, which let you access it from any tool or SDK that supports Amazon S3.
Amazon’s Python SDK is called Boto3. Boto3 can be used for multiple AWS services, but in this blog post we will focus on making calls to S3. We will create a Boto3 Python script to query an OCI bucket and then upload a file to the bucket using Amazon’s S3 APIs. …
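A minimal sketch of that approach might look as follows. It assumes Boto3 is installed, that you have generated an OCI Customer Secret Key (used here as the S3 access key and secret key), and that the S3 compatibility endpoint follows the `<namespace>.compat.objectstorage.<region>.oraclecloud.com` pattern. The function names, bucket name, and file names are hypothetical and are not taken from the original script.

```python
def oci_s3_endpoint(namespace: str, region: str) -> str:
    """Build the OCI S3 compatibility endpoint URL (assumed pattern)."""
    return f"https://{namespace}.compat.objectstorage.{region}.oraclecloud.com"


def make_client(namespace: str, region: str, access_key: str, secret_key: str):
    """Create a Boto3 S3 client that talks to OCI instead of AWS."""
    import boto3  # third-party dependency: pip install boto3

    return boto3.client(
        "s3",
        region_name=region,
        endpoint_url=oci_s3_endpoint(namespace, region),
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def upload_and_list(client, bucket: str, local_path: str, object_name: str):
    """Upload one file, then return the object names found in the bucket."""
    client.upload_file(local_path, bucket, object_name)
    response = client.list_objects(Bucket=bucket)
    # "Contents" is absent when the bucket is empty.
    return [item["Key"] for item in response.get("Contents", [])]


if __name__ == "__main__":
    # All values below are placeholders; substitute your own.
    s3 = make_client("mynamespace", "us-ashburn-1", "ACCESS_KEY", "SECRET_KEY")
    print(upload_and_list(s3, "demo-bucket", "hello.txt", "hello.txt"))
```

Only `endpoint_url` and the credentials differ from a plain AWS setup, which is the point of the S3 compatibility layer.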