Migrating from CockroachDB to PostgreSQL involves several steps because CockroachDB and PostgreSQL share some similarities, but they also have important differences, particularly in terms of distributed systems, consistency models, and features.
Infrastructure
- Cockroach DB Cluster running in AWS [EC2] with 3 AZ.
- Amazon S3 (Simple Storage Service).
- Amazon EKS (Elastic Kubernetes Service).
- Cockroach DB running local on your computer.
Important Details
- Database name ‘lynx_core_prod’
CRDB Backup Preparation
1- Login to CockroachDB via Session Manager.
sudo -i
su - cockroach
cockroach sql --certs-dir=certs/
2- Create Cockroach Backup.
BACKUP DATABASE lynx_core_prod INTO 'userfile://defaultdb.public.userfiles_$user/lynx_core_prod-backup' AS OF SYSTEM TIME '-10s';
3- Save new backup [lynx_core_prod-backup] to local directory and push it to AWS S3.
cockroach userfile get --certs-dir=certs/ 'userfile://defaultdb.public.userfiles_$user/lynx_core_prod-backup'
aws s3 cp lynx_core_prod-backup s3://gersonsplace-bucket/crdb-backup --recursive
At this point we already have a backup in S3 of the Database we need to migrate to Postgres now it is time to export DDL and make it compatible with Postgres. [This can be tricky and you need to probably make some manual work depending on how big your DB is]
Restore CRDB backup into a local CRDB database on your Computer.
1- Copy your Database from S3 to your Local computer
aws s3 cp s3://gersonsplace-bucket/crdb-backup /crdb_to_postgres --recursive
2- Run a local Cockroach Database local
brew install cockroach
3- Start a single-node Cockroach DB and access SQL Shell
cockroach start-single-node --insecure --listen-addr=localhost:26258 --http-addr=localhost:8081
cockroach sql --insecure --host=localhost:26258
4- Now you need to create a new folder to be able to mount the backup on your local CRDB and then restore backup.
cd /Users/gersonsplace/cockroach-data/
mkdir extern/backup
Note: Copy the backup to cockroach-data/extern/backup folder.
5- Validate backup on your local CRDB Database using cockroach sql.
show backups in 'nodelocal://1/backup';
6- Read and shows files inside the backup using cockroach sql (Optional)
show backup from LATEST in 'nodelocal://1/backup';
7- Restore backup into your local CRDB database with skip_localities_check
RESTORE DATABASE lynx_core_prod FROM LATEST IN 'nodelocal://1/backup' with skip_localities_check;
Congratulations at this point you have your Backup restored on your local CRDB database!!!
Prepare CRDB to be migrated into Postgres
1- Export DDL [Data definition language] create the following script and run it ./ddl.sh
#!/bin/bash
echo "Exporting DDL for all tables to ddl.sql..."
# Query to get all table names
psql -U root -h localhost -p 26258 -d lynx_core_prod -Ato ddl.sql -c " show create all tables "
echo "DDL export complete. Check ddl.sql for the results."
Above script is going to create ddl.sql
file it would contain all tables from your Database in the CockroachDB style so here comes the tricky part and probably some manual work if someone know a better approach feel free to share it but this process worked for me.
Example of how 2 tables in the ddl.sql
Example of the same 2 tables after making them postgres compatible
As I mentioned this could be tricky but not impossible depending on the size of your CRDB and numbers of tables.
Once you have an idea how your tables are going to look like with the postgres compatible format you can move to the next step.
Run Postgres Database local on your computer
1- Install Postgres
brew install postgresql@14
pg_ctl -D /Users/user/postgressql -l logfile start'
psql postgres
Connect local CRDB and Postgres together
1- Create a foreign server
in postgres database and allow it to connect to CRDB
CREATE EXTENSION postgres_fdw;
CREATE SERVER crdb FOREIGN DATA WRAPPER postgres_fdw OPTIONS (
host 'localhost',
port '26258',
dbname 'lynx_core_prod',
sslmode 'disable'
);
CREATE USER MAPPING FOR root SERVER crdb OPTIONS (
user 'root', password ''
);
2- Now you need to create a foreign table
in postgres for each table in CRDB example using the 2 tables I mentioned above.
Note: Notice that I use foreign server
to be able to access the table on CRDB from postgres.
`SERVER crdb OPTIONS (
table_name 'table-name'`
3- Now you need to create a local table to be able to insert data from foreign tables I used prefix local_
to create local table and to be able to Alter
once I finish with the import. Example using the 2 tables I mentioned above.
4- At this point you can drop foreign server
DROP EXTENSION postgres_fdw;
DROP SERVER crdb CASCADE;
DROP USER MAPPING FOR root SERVER crdb;
5- After importing Data
to the local table and dropping foreign server
it is time to use Alter
to change name to the correct one. Example using the 2 tables I mentioned above.
ALTER TABLE public.local_schema_migrations RENAME TO schema_migrations;
ALTER TABLE client_side.local_mounts RENAME TO mounts;
6- Finally you can now check your local Postgres DB and it should be the same as CRDB Database including all Data.
Lets make things more interesting and do the same process but now in Kubernetes
Requirements
-
Minikube or EKS with IRSA to access S3 and be able to download backup using initcontainer and
emptyDir
. -
CRDB Backup.
-
3 files
create_local_tables.sql
,create_foreign_tables.sql
andrename_tables.sql
these files are the same files that we explained before in order to be able to make CRDB to run in Postgres but know we would use them to automate process in Kubernetes.
In this example I would use Minikube
taking into consideration that we already have our backup file and the 3 files .sql save in the following path cockroach-data/extern/backup
Start Minikube with --mount-string
flag pointing to where you have the CRDB backup /Users/user/cockroach-data/extern/backup
minikube start --mount-string="/Users/user/cockroach-data/extern/backup:/cockroach-backup" --mount
Let’s create k8s manifests
RBAC
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: crdb-to-postgres-migration
namespace: crdb-to-postgres
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: crdb-to-postgres-migration-role
rules:
- apiGroups: ["*"]
resources: ["*"]
verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: crdb-to-postgres-migration-crb
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: crdb-to-postgres-migration-role
subjects:
- kind: ServiceAccount
name: crdb-to-postgres-migration
namespace: crdb-to-postgres
Deployment
I explain deployment so you would find the following
-
cockroach container
(Restore CRDB Backup) -
postgres container
(Create foreign server, create foreign tables, create local tables, import data from CRDB, drops foreign server and rename local tables to the actual table name. -
export-pg-dump
(To be able to restore postgres into other DB)
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: migration-demo
namespace: crdb-to-postgres
spec:
replicas: 1
selector:
matchLabels:
app: db
template:
metadata:
labels:
app: db
spec:
containers:
- name: cockroach
image: cockroachdb/cockroach:v24.2.0
command: ["/bin/sh", "-c"]
args:
- |
cockroach start-single-node --insecure --http-port=8080 --port=26257 --store=/cockroach-data &
echo "#### Waiting for CockroachDB to start... ####" &&
while ! curl -s http://localhost:8080/_status/cluster; do
sleep 1;
done &&
echo "### CockroachDB is up and running! ###" &&
cockroach sql --insecure --execute "RESTORE DATABASE lynx_core_prod FROM LATEST IN 'nodelocal://1/backup' WITH skip_localities_check;" &&
echo "### Restore operation completed. ####" &&
tail -f /dev/null # Keep the container running
volumeMounts:
- name: cockroach-backup
mountPath: /cockroach-data/extern/backup
ports:
- containerPort: 8080
- containerPort: 26257
resources:
requests:
cpu: 50m
memory: 100Mi
limits:
cpu: 1000m
memory: 2000Mi
- name: postgres
image: postgres:latest
command: ["sh", "-c"]
args:
- |
docker-entrypoint.sh postgres &
echo " #### Waiting for PostgreSQL to start... #### " &&
until pg_isready -h localhost; do
sleep 15;
done &&
echo " #### PostgreSQL is up and running! #### " &&
psql -U root -c "CREATE DATABASE lynx_core_prod;" &&
psql -U root -d lynx -c "CREATE EXTENSION IF NOT EXISTS postgres_fdw;" &&
psql -U root -d lynx -c "CREATE SERVER crdb FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '26257', dbname 'lynx_core_prod', sslmode 'disable');" &&
psql -U root -d lynx -c "CREATE USER MAPPING FOR root SERVER crdb OPTIONS (user 'root', password '');" &&
psql -U root -d lynx -c "CREATE SCHEMA client_side;" &&
psql -U root -d lynx -c "CREATE SCHEMA settings;" &&
psql -U root -d lynx -c "IMPORT FOREIGN SCHEMA public FROM SERVER crdb INTO public;" &&
psql -U root -d lynx -c "IMPORT FOREIGN SCHEMA client_side FROM SERVER crdb INTO client_side;" &&
psql -U root -d lynx -c "IMPORT FOREIGN SCHEMA settings FROM SERVER crdb INTO settings;" &&
echo "#### Creating foreign_tables... ####" &&
sleep 5 &&
psql -U root -d lynx -f /scripts/create_foreign_tables.sql &&
echo "#### Creating local_tables... ####" &&
sleep 5 &&
psql -U root -d lynx -f /scripts/create_local_tables.sql &&
echo "#### Dropping CRDB_Server... ####" &&
psql -U root -d lynx -c "DROP SERVER crdb CASCADE;" &&
sleep 2 &&
echo "#### Renaming_tables... ####" &&
psql -U root -d lynx -f /scripts/rename_tables.sql &&
echo "#### All Database commands executed.... ####" &&
sleep 10 &&
echo "#### lynx Database dump created at /pg_dump/lynx_dump_$(date +%Y%m%d).sql ####" &&
pg_dump -U root -d lynx -f /pg_dump/lynx_dump_$(date +%Y%m%d).sql &&
tail -f /dev/null
env:
- name: POSTGRES_DB
value: root
- name: POSTGRES_USER
value: root
- name: POSTGRES_PASSWORD
value: root
ports:
- containerPort: 5432
resources:
requests:
cpu: 50m
memory: 100Mi
limits:
cpu: 1000m
memory: 2000Mi
volumeMounts:
- name: postgres-data
mountPath: /var/lib/postgresql/data
- name: cockroach-backup
mountPath: /scripts
- name: pg-dump
mountPath: /pg_dump
- name: export-pg-dump
image: bitnami/kubectl:latest
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
command:
- /bin/sh
- -c
- |
sleep 60
echo "#### Copying "lynx_dump_$(date +%Y%m%d).sql" to /extern/backup ####"
kubectl cp /pg_dump/lynx_dump_$(date +%Y%m%d).sql /Users/gerson/cockroach-data/extern/backup/
tail -f /dev/null # Keep the container running
volumeMounts:
- name: pg-dump
mountPath: /pg_dump
resources:
requests:
cpu: 50m
memory: 100Mi
limits:
cpu: 100m
memory: 200Mi
volumes:
- name: postgres-data
emptyDir: {}
- name: cockroach-backup
hostPath:
path: /cockroach-backup
type: Directory
- name: pg-dump
emptyDir: {}
serviceAccountName: crdb-to-postgres-migration
After the whole process finish you can export the pg_dump and run it in another postgres database useful for development
1- Copy pg_dump to local PC
kubectl cp pod-name:/pg_dump/lynx_dump_$(date +%Y%m%d).sql /Users/user/lynx_dump_$(date +%Y%m%d).sql -c export-pg-dump -n crdb-to-postgres
2- Create Database in postgres
createdb lynx_core_prod
psql -U root -d lynx_core_prod < lynx_dump_20241217.sql
Congratulations you were able to migrate CRDB🪳 database into Postgres🐘 DB :)Pura Vida!