Kubefeeds Team A dedicated and highly skilled team at Kubefeeds, driven by a passion for Kubernetes and Cloud-Native technologies, delivering innovative solutions with expertise and enthusiasm.

Migrate 🪳Coackroach DB into Postgres🐘

6 min read

Migrating from CockroachDB to PostgreSQL involves several steps because CockroachDB and PostgreSQL share some similarities, but they also have important differences, particularly in terms of distributed systems, consistency models, and features.

Infrastructure

  1. Cockroach DB Cluster running in AWS [EC2] with 3 AZ.
  2. Amazon S3 (Simple Storage Service).
  3. Amazon EKS (Elastic Kubernetes Service).
  4. Cockroach DB running local on your computer.

Important Details

  • Database name ‘lynx_core_prod’

CRDB Backup Preparation

1- Login to CockroachDB via Session Manager.

sudo -i
su - cockroach
cockroach sql --certs-dir=certs/

2- Create Cockroach Backup.

BACKUP DATABASE lynx_core_prod INTO 'userfile://defaultdb.public.userfiles_$user/lynx_core_prod-backup' AS OF SYSTEM TIME '-10s';

3- Save new backup [lynx_core_prod-backup] to local directory and push it to AWS S3.

cockroach userfile get --certs-dir=certs/ 'userfile://defaultdb.public.userfiles_$user/lynx_core_prod-backup'

aws s3 cp lynx_core_prod-backup s3://gersonsplace-bucket/crdb-backup --recursive

At this point we already have a backup in S3 of the Database we need to migrate to Postgres now it is time to export DDL and make it compatible with Postgres. [This can be tricky and you need to probably make some manual work depending on how big your DB is]

Restore CRDB backup into a local CRDB database on your Computer.

1- Copy your Database from S3 to your Local computer

aws s3 cp s3://gersonsplace-bucket/crdb-backup /crdb_to_postgres --recursive

2- Run a local Cockroach Database local

brew install cockroach

3- Start a single-node Cockroach DB and access SQL Shell

cockroach start-single-node --insecure --listen-addr=localhost:26258 --http-addr=localhost:8081

cockroach sql --insecure --host=localhost:26258

4- Now you need to create a new folder to be able to mount the backup on your local CRDB and then restore backup.

cd /Users/gersonsplace/cockroach-data/

mkdir extern/backup

Note: Copy the backup to cockroach-data/extern/backup folder.

5- Validate backup on your local CRDB Database using cockroach sql.

show backups in 'nodelocal://1/backup';

6- Read and shows files inside the backup using cockroach sql (Optional)

show backup from LATEST in 'nodelocal://1/backup';

7- Restore backup into your local CRDB database with skip_localities_check

RESTORE DATABASE lynx_core_prod FROM LATEST IN 'nodelocal://1/backup' with skip_localities_check;

Congratulations at this point you have your Backup restored on your local CRDB database!!! 

Prepare CRDB to be migrated into Postgres

1- Export DDL [Data definition language] create the following script and run it ./ddl.sh

#!/bin/bash

echo "Exporting DDL for all tables to ddl.sql..."

# Query to get all table names
 psql -U root -h localhost -p 26258 -d lynx_core_prod  -Ato ddl.sql -c " show create all tables "

echo "DDL export complete. Check ddl.sql for the results."

Above script is going to create ddl.sql file it would contain all tables from your Database in the CockroachDB style so here comes the tricky part and probably some manual work if someone know a better approach feel free to share it but this process worked for me.

Example of how 2 tables in the ddl.sql
Image description

Example of the same 2 tables after making them postgres compatible

Image description

As I mentioned this could be tricky but not impossible depending on the size of your CRDB and numbers of tables.

Once you have an idea how your tables are going to look like with the postgres compatible format you can move to the next step.

Run Postgres Database local on your computer

1- Install Postgres
brew install postgresql@14
pg_ctl -D /Users/user/postgressql -l logfile start'
psql postgres

Connect local CRDB and Postgres together

1- Create a foreign server in postgres database and allow it to connect to CRDB

CREATE EXTENSION postgres_fdw;
CREATE SERVER crdb FOREIGN DATA WRAPPER postgres_fdw OPTIONS (
    host 'localhost',
    port '26258',
    dbname 'lynx_core_prod',
    sslmode 'disable'
);
CREATE USER MAPPING FOR root SERVER crdb OPTIONS (
 user 'root', password ''
);

2- Now you need to create a foreign table in postgres for each table in CRDB example using the 2 tables I mentioned above.

Image description
Note: Notice that I use foreign server to be able to access the table on CRDB from postgres.

`SERVER crdb OPTIONS (
    table_name 'table-name'`

3- Now you need to create a local table to be able to insert data from foreign tables I used prefix local_ to create local table and to be able to Alter once I finish with the import. Example using the 2 tables I mentioned above.

Image description

4- At this point you can drop foreign server

DROP EXTENSION postgres_fdw;
DROP SERVER crdb CASCADE;
DROP USER MAPPING FOR root SERVER crdb;

5- After importing Data to the local table and dropping foreign server it is time to use Alter to change name to the correct one. Example using the 2 tables I mentioned above.
ALTER TABLE public.local_schema_migrations RENAME TO schema_migrations;
ALTER TABLE client_side.local_mounts RENAME TO mounts;

6- Finally you can now check your local Postgres DB and it should be the same as CRDB Database including all Data.

Image description

Lets make things more interesting and do the same process but now in Kubernetes

Requirements

  • Minikube or EKS with IRSA to access S3 and be able to download backup using initcontainer and emptyDir.

  • CRDB Backup.

  • 3 files create_local_tables.sql, create_foreign_tables.sql and rename_tables.sql these files are the same files that we explained before in order to be able to make CRDB to run in Postgres but know we would use them to automate process in Kubernetes.

In this example I would use Minikube taking into consideration that we already have our backup file and the 3 files .sql save in the following path cockroach-data/extern/backup

Start Minikube with --mount-string flag pointing to where you have the CRDB backup /Users/user/cockroach-data/extern/backup

minikube start --mount-string="/Users/user/cockroach-data/extern/backup:/cockroach-backup" --mount

Let’s create k8s manifests

RBAC

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: crdb-to-postgres-migration
  namespace: crdb-to-postgres
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: crdb-to-postgres-migration-role
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: crdb-to-postgres-migration-crb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: crdb-to-postgres-migration-role
subjects:
  - kind: ServiceAccount
    name: crdb-to-postgres-migration
    namespace: crdb-to-postgres

Deployment

I explain deployment so you would find the following

  • cockroach container (Restore CRDB Backup)
  • postgres container (Create foreign server, create foreign tables, create local tables, import data from CRDB, drops foreign server and rename local tables to the actual table name.
  • export-pg-dump (To be able to restore postgres into other DB)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: migration-demo
  namespace: crdb-to-postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: cockroach
        image: cockroachdb/cockroach:v24.2.0
        command: ["/bin/sh", "-c"]
        args:
          - |
            cockroach start-single-node --insecure --http-port=8080 --port=26257 --store=/cockroach-data &

            echo "#### Waiting for CockroachDB to start... ####" &&
            while ! curl -s http://localhost:8080/_status/cluster; do
              sleep 1;
            done &&

            echo "### CockroachDB is up and running! ###" &&
            cockroach sql --insecure --execute "RESTORE DATABASE lynx_core_prod FROM LATEST IN 'nodelocal://1/backup' WITH skip_localities_check;" &&

            echo "### Restore operation completed. ####" &&
            tail -f /dev/null # Keep the container running
        volumeMounts:
        - name: cockroach-backup
          mountPath: /cockroach-data/extern/backup
        ports:
        - containerPort: 8080
        - containerPort: 26257
        resources:
          requests:
            cpu: 50m
            memory: 100Mi
          limits:
            cpu: 1000m
            memory: 2000Mi
      - name: postgres
        image: postgres:latest
        command: ["sh", "-c"]
        args:
          - |
            docker-entrypoint.sh postgres &

            echo " #### Waiting for PostgreSQL to start... #### " &&

            until pg_isready -h localhost; do
              sleep 15;
            done &&

            echo " #### PostgreSQL is up and running! #### " &&

            psql -U root -c "CREATE DATABASE lynx_core_prod;" &&
            psql -U root -d lynx -c "CREATE EXTENSION IF NOT EXISTS postgres_fdw;" &&
            psql -U root -d lynx -c "CREATE SERVER crdb FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '26257', dbname 'lynx_core_prod', sslmode 'disable');" &&
            psql -U root -d lynx -c "CREATE USER MAPPING FOR root SERVER crdb OPTIONS (user 'root', password '');" &&
            psql -U root -d lynx -c "CREATE SCHEMA client_side;" &&
            psql -U root -d lynx -c "CREATE SCHEMA settings;" &&
            psql -U root -d lynx -c "IMPORT FOREIGN SCHEMA public FROM SERVER crdb INTO public;" &&
            psql -U root -d lynx -c "IMPORT FOREIGN SCHEMA client_side FROM SERVER crdb INTO client_side;" &&
            psql -U root -d lynx -c "IMPORT FOREIGN SCHEMA settings FROM SERVER crdb INTO settings;" &&

            echo "#### Creating foreign_tables... ####" &&
            sleep 5 &&
            psql -U root -d lynx -f /scripts/create_foreign_tables.sql &&

            echo "#### Creating local_tables... ####" &&
            sleep 5 &&
            psql -U root -d lynx -f /scripts/create_local_tables.sql &&

            echo "#### Dropping  CRDB_Server... ####" &&
            psql -U root -d lynx -c "DROP SERVER crdb CASCADE;" &&
            sleep 2 &&

            echo "#### Renaming_tables... ####" &&
            psql -U root -d lynx -f /scripts/rename_tables.sql &&

            echo "#### All Database commands executed.... ####" &&
            sleep 10 &&

            echo "#### lynx Database dump created at /pg_dump/lynx_dump_$(date +%Y%m%d).sql ####" &&
            pg_dump -U root -d lynx -f /pg_dump/lynx_dump_$(date +%Y%m%d).sql &&

            tail -f /dev/null
        env:
        - name: POSTGRES_DB
          value: root
        - name: POSTGRES_USER
          value: root
        - name: POSTGRES_PASSWORD
          value: root
        ports:
        - containerPort: 5432
        resources:
          requests:
            cpu: 50m
            memory: 100Mi
          limits:
            cpu: 1000m
            memory: 2000Mi
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
        - name: cockroach-backup
          mountPath: /scripts
        - name: pg-dump
          mountPath: /pg_dump
      - name: export-pg-dump
        image: bitnami/kubectl:latest
        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
        command:
        - /bin/sh
        - -c
        - |
          sleep 60
          echo "#### Copying "lynx_dump_$(date +%Y%m%d).sql" to /extern/backup ####"

          kubectl cp /pg_dump/lynx_dump_$(date +%Y%m%d).sql /Users/gerson/cockroach-data/extern/backup/
          tail -f /dev/null # Keep the container running
        volumeMounts:
        - name: pg-dump
          mountPath: /pg_dump
        resources:
          requests:
            cpu: 50m
            memory: 100Mi
          limits:
            cpu: 100m
            memory: 200Mi
      volumes:
      - name: postgres-data
        emptyDir: {}
      - name: cockroach-backup
        hostPath:
          path: /cockroach-backup 
          type: Directory
      - name: pg-dump
        emptyDir: {}
      serviceAccountName: crdb-to-postgres-migration

After the whole process finish you can export the pg_dump and run it in another postgres database useful for development

1- Copy pg_dump to local PC
kubectl cp pod-name:/pg_dump/lynx_dump_$(date +%Y%m%d).sql /Users/user/lynx_dump_$(date +%Y%m%d).sql -c export-pg-dump -n crdb-to-postgres

2- Create Database in postgres
createdb lynx_core_prod
psql -U root -d lynx_core_prod < lynx_dump_20241217.sql

Congratulations you were able to migrate CRDB🪳 database into Postgres🐘 DB :)Pura Vida!

Kubefeeds Team A dedicated and highly skilled team at Kubefeeds, driven by a passion for Kubernetes and Cloud-Native technologies, delivering innovative solutions with expertise and enthusiasm.