CSC/ECE 517 Spring 2024 - NTNX-1 : Extend NDB Operator to Support Postgres HA

From Expertiza_Wiki
Revision as of 00:14, 25 March 2024

NTNX4 Design Document

Provisioning Postgres HA Instances

Justin Orringer, Kandarpkumar Patel, Cody Irion


Existing Architecture

When provisioning a new database, the NDB Operator monitors the cluster for newly created custom resources. Once a custom resource is created, the operator syncs the change with the NDB Server, which records all databases to be provisioned.

The operator then reconciles the database/NDB CR, and watches its status thereafter. If a user modifies that DB CR, the reconcile loop begins again.

[[File:NDBdiagram.jpg|2500px]]
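
The loop described above can be illustrated with a small, self-contained sketch. It is not the operator's actual code: the DatabaseCR type, the fakeNDBServer type, and the "synced" status string are all hypothetical stand-ins, and a real operator would talk to the Kubernetes API and the NDB Server rather than to an in-memory map. The sketch only shows the idea that a new or modified CR triggers a sync, while an unchanged CR does not.

```go
package main

import "fmt"

// DatabaseCR is a hypothetical, stripped-down stand-in for the database custom resource.
type DatabaseCR struct {
	Name       string
	Generation int    // assumed to increase whenever the user edits the CR
	Status     string // last status recorded by the operator (illustrative)
}

// fakeNDBServer is an in-memory stand-in for the NDB Server (an assumption for this sketch).
type fakeNDBServer struct {
	synced map[string]int // CR name -> generation that was last synced
}

// reconcile syncs the CR with the server when the CR is new or has changed.
// It reports whether a sync happened, so repeated calls on an unchanged CR do nothing.
func reconcile(cr *DatabaseCR, server *fakeNDBServer) bool {
	last, known := server.synced[cr.Name]
	if known && last == cr.Generation {
		return false // nothing changed since the last sync
	}
	server.synced[cr.Name] = cr.Generation
	cr.Status = "synced"
	return true
}

func main() {
	server := &fakeNDBServer{synced: map[string]int{}}
	cr := &DatabaseCR{Name: "example-db", Generation: 1}

	fmt.Println("new CR synced:", reconcile(cr, server))
	fmt.Println("unchanged CR synced:", reconcile(cr, server))

	cr.Generation++ // the user modifies the CR
	fmt.Println("modified CR synced:", reconcile(cr, server))
}
```

Keeping the reconcile step idempotent, as in this sketch, means the loop can be rerun safely every time the CR is observed.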

Postgres HA Instance

Postgres HA instances use redundancy to keep a PostgreSQL database available during hardware failures, software faults, and other disruptions. The redundancy mechanisms (database replication, failover, and load balancing) are managed through HAProxy. The proxy routes each query to a database node based on the load and status of the nodes.

The NDB API already supports Postgres High Availability databases, but the Kubernetes operator does not yet support provisioning them.


Problem Statement

We need to extend the NDB Operator to support Postgres HA. As shown below, this adds fields to the existing architecture. We will also write end-to-end and unit tests that cover provisioning and removing databases with the new HA options.


Design and Workflow

To start, we compared the API payloads for creating a Postgres database and a Postgres HA database to see which parameters are unique to Postgres HA. The values unique to the Postgres HA payload can be seen here:

Next, we compared these unique parameters with the pull request from last semester that began implementing Postgres HA. This told us which parts have been partially or fully implemented and which have not been added at all.

Comparing the Postgres HA payload with the changes in the existing PR shows that most of the parameters were implemented in a hard-coded fashion. With that implementation, a user can create a Postgres HA instance by setting the new isHighAvailability parameter in the NDB Custom Resource (referred to later as CR) to true. This provisions a Postgres HA instance with preset values for the various HA options.

Our implementation will instead move the isHighAvailability field into the database parameters' AdditionalArgument map. The benefit is that the custom resource needs no extra field when an HA instance is not wanted.

Most of the work in our implementation is allowing the default values below to be overridden through optional parameters. The user supplies these optional parameters when creating the CR, as key/value pairs inside the additionalArguments section.

A provided parameter overrides the corresponding default value, assuming it is valid. If a parameter is absent, its default value is left unchanged.
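
As a minimal sketch of this override rule, the Go snippet below starts from the preset values listed in this document and replaces a default only when the user supplied a valid value for it. Several things here are assumptions made for illustration: that additionalArguments is available as a string-to-string map, the function names, and the validation rules (booleans must parse as booleans, the WAL expiry must be an integer, other values must be non-empty). The operator's real validation may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// defaultHAArguments returns the preset values named in the design document.
// Values are kept as strings because additionalArguments is assumed to be a
// string-to-string map.
func defaultHAArguments() map[string]string {
	return map[string]string{
		"provision_virtual_ip":    "true",
		"deploy_haproxy":          "true",
		"failover_mode":           "Automatic",
		"archive_wal_expire_days": "-1",
		"patroni_cluster_name":    "patroni",
	}
}

// validateHAArgument applies illustrative (hypothetical) validation rules.
func validateHAArgument(name, value string) error {
	switch name {
	case "provision_virtual_ip", "deploy_haproxy":
		if _, err := strconv.ParseBool(value); err != nil {
			return fmt.Errorf("%s must be a boolean, got %q", name, value)
		}
	case "archive_wal_expire_days":
		if _, err := strconv.Atoi(value); err != nil {
			return fmt.Errorf("%s must be an integer, got %q", name, value)
		}
	default:
		if value == "" {
			return fmt.Errorf("%s must not be empty", name)
		}
	}
	return nil
}

// resolveHAArguments overrides each default with the user's value when one is
// present and valid. Keys that are not HA options are ignored here.
func resolveHAArguments(additionalArguments map[string]string) (map[string]string, error) {
	resolved := defaultHAArguments()
	for name := range resolved {
		value, provided := additionalArguments[name]
		if !provided {
			continue // absent parameter: keep the default
		}
		if err := validateHAArgument(name, value); err != nil {
			return nil, err
		}
		resolved[name] = value
	}
	return resolved, nil
}

func main() {
	resolved, err := resolveHAArguments(map[string]string{
		"patroni_cluster_name": "my-cluster", // illustrative override
	})
	if err != nil {
		fmt.Println("invalid argument:", err)
		return
	}
	names := make([]string, 0, len(resolved))
	for name := range resolved {
		names = append(names, name)
	}
	sort.Strings(names) // print in a stable order
	for _, name := range names {
		fmt.Printf("%s=%s\n", name, resolved[name])
	}
}
```

Rejecting an invalid value outright, rather than silently falling back to the default, is one possible design choice; it surfaces user mistakes early instead of provisioning with unexpected settings.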


Of the actionArguments above, the unimplemented ones are:

"actionArguments": [
   {
     "name": "provision_virtual_ip",
     "value": true
   },
   {
     "name": "deploy_haproxy",
     "value": true
   },
   {
     "name": "failover_mode",
     "value": "Automatic"
   },
   {
     "name": "archive_wal_expire_days",
     "value": "-1"
   },
   {
     "name": "patroni_cluster_name",
     "value": "patroni"
   }
 ],
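
To show how a set of resolved options could be turned into a fragment of this shape, here is a short Go sketch using the standard encoding/json package. The ActionArgument type, the helper names, and the fixed ordering are assumptions for illustration and are not taken from the operator's code; only the "name" and "value" keys and the option names come from the list above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ActionArgument mirrors one entry of the actionArguments list shown above.
// The Go type itself is hypothetical.
type ActionArgument struct {
	Name  string      `json:"name"`
	Value interface{} `json:"value"`
}

// haArgumentOrder fixes the output order so the payload is deterministic.
var haArgumentOrder = []string{
	"provision_virtual_ip",
	"deploy_haproxy",
	"failover_mode",
	"archive_wal_expire_days",
	"patroni_cluster_name",
}

// buildActionArguments converts resolved options into an ordered list,
// skipping any option that is not present in the map.
func buildActionArguments(resolved map[string]interface{}) []ActionArgument {
	args := make([]ActionArgument, 0, len(haArgumentOrder))
	for _, name := range haArgumentOrder {
		if value, ok := resolved[name]; ok {
			args = append(args, ActionArgument{Name: name, Value: value})
		}
	}
	return args
}

// renderActionArguments wraps the list in an object and encodes it as JSON.
func renderActionArguments(args []ActionArgument) (string, error) {
	out, err := json.Marshal(map[string][]ActionArgument{"actionArguments": args})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	args := buildActionArguments(map[string]interface{}{
		"provision_virtual_ip":    true,
		"deploy_haproxy":          true,
		"failover_mode":           "Automatic",
		"archive_wal_expire_days": "-1",
		"patroni_cluster_name":    "patroni",
	})
	payload, err := renderActionArguments(args)
	if err != nil {
		fmt.Println("encoding failed:", err)
		return
	}
	fmt.Println(payload)
}
```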