Migrating Legacy SQL Databases To Aurora PostgreSQL

This post is rather lengthy, so here is an executive summary of what is covered:

  • Why migrate from SQL Server to Aurora Postgres?
  • Setting up an AWS Aurora PostgreSQL instance with Babelfish
  • Creating a Babelfish Postgres database and converting the schema (Babelfish Compass tool or AWS Schema Conversion Tool)
  • Copying the data with the AWS Data Migration Service

Why migrate from SQL Server to Aurora Postgres?

Migrating databases from one platform to another is a task that most database administrators don’t like to contend with. Migrations are usually time-consuming, and the end product often doesn’t work as well as the original (unless significant effort is made to improve the app/database during the migration). So… Why bother? Over the past couple of years I’ve spoken with several DBAs, and the reason(s) for migrating can be summarized as:

  • Cost reduction – Reducing licensing costs, consolidating hardware, etc.
  • Database estate consolidation – Moving all databases to a common platform to simplify operational management and data access.
  • Skills consolidation – Management wants to improve support and development response time by having everyone working on a common platform.

Our decision to move some (not all) SQL databases to AWS Aurora was partially a cost decision, but also an effort to simplify operational management by diversifying our database platforms. Now that second reason might seem counter-intuitive, but here are some details about the situation and how we came to our decision:

  • We have a number of applications that were written by people who are no longer with our organization.
  • Most applications were written using language(s) that none of our current developers use. Documentation ranges from minimal to non-existent.
  • Each application is used infrequently by a small number of users.
  • The databases are all on a SQL Server 2012 instance, and Microsoft stopped support in July 2022.
  • The server hardware support ends in the near future.
  • For some of the applications, we are required to maintain access to the data for a specific number of years.

Some of the solutions we came up with were:

  1. Build a new server (with newer hardware, Operating System, and SQL Server version), keeping the old server name.
  2. Migrate the databases to a newer, existing SQL server and update the application code to point to the new server.
  3. Migrate the databases to a cloud-hosted database platform and update the application code.

Option 1 was deemed too expensive. Option 2 would just push the problem out a few years; additionally, the databases would be taking up resources on higher-performance systems, and they don’t really need that level of performance. Option 3 would be less expensive than Option 1, and would not take up the space/resources of Option 2. Moving the databases specifically to AWS Aurora Postgres would be less expensive than Amazon RDS for SQL Server (more on that in the next section). Aurora would also reduce the complexity of some of our tier 1 production SQL Server tasks (maintenance planning and such). Essentially, Aurora would be used as a tier 2 platform (still production, but not requiring the performance and management of our on-prem tier 1 servers).

AWS Aurora PostgreSQL Instance

The first step of the migration process was to create an AWS Aurora Postgres instance. We opted to use Aurora Postgres with Babelfish. Babelfish acts as a translation layer that lets applications written for SQL Server (speaking T-SQL over the TDS protocol) work with Postgres databases. Without Babelfish, each of the database applications would have to be rewritten before it would work with the Postgres databases. Note: When Babelfish is enabled, the Postgres instance will have configuration options for two TCP ports (Postgres and Babelfish). The default port for talking directly to the Postgres databases is 5432; Babelfish listens on a separate TCP port for the application to talk to. To keep things simple, the Babelfish port can be set to the same port number the SQL Server database used (usually 1433).
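
To sanity-check the Babelfish endpoint, you can point any SQL Server client at the TDS port and run ordinary T-SQL. A minimal sketch (the exact output will vary by Babelfish version):

    -- Connect SSMS or sqlcmd to the Babelfish (TDS) port, e.g., 1433
    SELECT @@VERSION;   -- Babelfish reports a SQL Server-style version string
    SELECT DB_NAME();   -- shows the current T-SQL database context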

Right-sizing the Postgres instance is essential. For our first-run Proof of Concept, we opted to use a Serverless instance. Since the application was used infrequently, and by a small number of people, the Serverless option reduced the monthly operating cost by about 65% (compared to a SQL Server Standard instance). Unfortunately, further testing found that we could only add one application database to the Postgres instance. To create a single Postgres instance that would support multiple application databases, we ultimately decided to use a db.t3.medium instance. The cost was marginally higher than the Serverless instance, but still less than half of the monthly cost of one of our (Standard Edition) SQL Servers.

Creating a Babelfish Postgres Database

One architectural difference between Microsoft SQL Server and PostgreSQL databases is how they handle object naming. SQL Server supports a 3-part naming convention (i.e., DatabaseName.Schema.Object), while PostgreSQL only supports a 2-part naming convention (i.e., Schema.Object) within the connected database. To get around this limitation, Babelfish stores the T-SQL databases inside a single Postgres database (babelfish_db) and keeps track of the SQL objects that are renamed along the way. By default, Babelfish will merge the SQL database and schema names into a new schema name (e.g., the dbo schema in the AppTest database on a SQL Server will be renamed to schema apptest_dbo in Postgres). For this reason, it is essential that databases and objects are created through the Babelfish interface (TCP port), and not created directly through the Postgres interface.
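
To make the mapping concrete, here is an illustrative pair of queries (the AppTest database comes from the example above; the Orders table is hypothetical):

    -- SQL Server: 3-part name
    SELECT * FROM AppTest.dbo.Orders;

    -- Postgres (port 5432): 2-part name, after Babelfish merges database + schema
    SELECT * FROM apptest_dbo.orders;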

To create the application database in Postgres:

  1. Open SQL Server Management Studio (SSMS).
  2. Connect to the Babelfish port (e.g., 1433) on the Postgres server.
  3. Execute a CREATE DATABASE statement (a sketch follows this list).
    • Note: Using a Postgres management application (e.g., PGAdmin or DBeaver), you can connect to the Postgres port (5432) and view the new application database as a schema (e.g., dbcentral_dbo) under Schemas in the babelfish_db database.
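
A minimal sketch of steps 2–3 (the DBCentral database name is chosen to match the dbcentral_dbo schema referenced later in this post):

    -- From SSMS, connected to the Babelfish (TDS) port, e.g., 1433
    CREATE DATABASE DBCentral;
    GO

    -- Verify through the T-SQL interface
    SELECT name FROM sys.databases;
    GO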

Once the database has been created, there are two options for migrating the SQL database schema into the Postgres database:

  • Babelfish Compass Tool
  • AWS Schema Conversion Tool

Babelfish Compass Tool

I won’t go in-depth on the Compass tool in this post. In short, it works for many situations, and it’s a good option if you need to put your schema conversion process into a code repository. The general process for using the Compass tool is:

  1. Create the SQL database DDL script.
    • In SSMS, right-click on the SQL database and select Generate Scripts.
  2. Use the Compass tool to create an assessment report of the DDL script.
  3. Review the assessment report and make necessary changes to the DDL script.
  4. Create a Babelfish Postgres database (see the section above).
  5. Using SSMS, run the DDL script against the Babelfish database.
  6. Test the database (if necessary, make additional changes to the DDL script and iterate).

AWS Schema Conversion Tool (SCT)

In our initial Proof of Concept testing, I found the AWS Schema Conversion Tool to be more novice-friendly, and quicker for one-off migrations, than the Compass tool.

SCT steps:

  1. Before running the SCT, a user will need to be created on the SQL Server with these permissions (a hedged GRANT sketch follows this list):
    • On the database to be migrated:
      • VIEW DEFINITION
      • VIEW DATABASE STATE
    • On the master database:
      • VIEW SERVER STATE
      • VIEW ANY DEFINITION
    • If there are SQL Agent jobs associated with the database, add the user to the SQLAgentUserRole role in msdb.
  2. The Schema Conversion Tool (wizard) will prompt for a Project name, a location to store the project files, source and destination database engines, and connection parameters for the source (SQL) server.
  3. The SCT will then connect to the source server to load the metadata, and may display a warning for any databases that it does not have sufficient permissions on.
  4. After selecting the schema & objects to analyze, the SCT will display a migration assessment.
  5. The last step in the wizard is to provide the destination database information.
  6. Upon closing the migration assessment, the SCT app will display the source server objects that will be migrated and will mark any objects that require user intervention. Selecting an object will display the SQL create statement with any issues highlighted. The SQL statements can be modified directly to correct the problems.
  7. Once all of the Create statements have been corrected, click on the Add Target menu and select Amazon RDS for SQL Server. It will then prompt for the connection details of the Babelfish port on the Aurora Postgres instance that was created previously. Note: do not use the Amazon Aurora (PostgreSQL) option; doing so will connect directly to the Postgres instance and not use the Babelfish functionality.
  8. Next, create a Mapping Rule from the Source server to the Target server.
    • If the Mapping Rules are not displayed in the center window, right-click on the SQL Server database name in the left window and select Create Mapping.
    • Delete any existing mappings that go to //Servers/<Babelfish (virtual)>
    • Create a new mapping to the Babelfish port (1433) of the Postgres server.
  9. Right-click on the Source database and select Convert schema; the Schema Conversion Tool will begin generating the schema conversion tasks.
  10. Once the conversion process has completed, expand the SQL Servers list in the Target servers window (right side), right-click on the new database, and select Apply to database. SCT will now begin uploading the converted database to Postgres.
  11. After all of the objects have been applied (created), SCT will return a Success/Fail report.
  12. At this time the database connection in SSMS can be refreshed, and all of the new objects will be displayed.
    • Refreshing the Postgres connection in PGAdmin (or DBeaver) will show that the same objects have been created in the new schema (e.g., dbcentral_dbo) of the babelfish_db database.
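
For step 1 above, a minimal T-SQL sketch of the permissions setup might look like the following. The sct_user login name, its password, and the AppTest database are placeholders; adjust to your environment:

    -- Server-level permissions (run in the master database)
    USE master;
    CREATE LOGIN sct_user WITH PASSWORD = 'ChangeMe!';  -- placeholder password
    GRANT VIEW SERVER STATE TO sct_user;
    GRANT VIEW ANY DEFINITION TO sct_user;
    GO

    -- Permissions on the database to be migrated
    USE AppTest;
    CREATE USER sct_user FOR LOGIN sct_user;
    GRANT VIEW DEFINITION TO sct_user;
    GRANT VIEW DATABASE STATE TO sct_user;
    GO

    -- Only needed if SQL Agent jobs are associated with the database
    USE msdb;
    CREATE USER sct_user FOR LOGIN sct_user;
    ALTER ROLE SQLAgentUserRole ADD MEMBER sct_user;
    GO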

AWS Data Migration Service

The next step is to copy the data into the Postgres tables. This can be accomplished using the AWS Data Migration Service (AWS DMS). AWS DMS is a managed migration/replication service that can do a one-off data copy, or ongoing replication (SQL Server to Aurora) if the migration project doesn’t have a hard cutover date. More on this in a future post.

Once the data has been copied into the Postgres tables, the final migration steps are:

  1. Create any application logins & roles (this can be done in PGAdmin; a sketch follows this list). The migration process automatically creates _db_owner and _dbo roles (e.g., dbcentral_db_owner and dbcentral_dbo) that an app login can be added to.
  2. Update the application’s database connection info to point to the Babelfish port on the Postgres server.
  3. Run any QA testing to verify the app works correctly.
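
A minimal Postgres-side sketch of step 1, assuming the DBCentral example database from earlier and a hypothetical app_login:

    -- Run against the Postgres port (5432), e.g., from PGAdmin's Query Tool
    CREATE ROLE app_login LOGIN PASSWORD 'ChangeMe!';  -- placeholder password

    -- Add the login to the auto-created Babelfish role for the migrated schema
    GRANT dbcentral_dbo TO app_login;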
