CSC/ECE 517 Spring 2023- NTNX-4. Extend NDB operator provision postregresql aws

From Expertiza_Wiki
Revision as of 01:15, 22 March 2023 by Asubram9 (talk | contribs) (First draft)

NTNX-4. Extend Nutanix Database Service Kubernetes operator to provision Postgres Single Instance databases on AWS EC2

Problem Statement

Nutanix Database Service (NDB, also abbreviated NDS on this page) is the only hybrid cloud database-as-a-service for Microsoft SQL Server, Oracle Database, PostgreSQL, MongoDB, and MySQL. It is designed to manage hundreds to thousands of databases efficiently. This project extends the Nutanix Database Service Kubernetes operator so that it can provision Postgres Single Instance databases on AWS EC2.

Architecture

Workflow

  • Apply the instance CRD manifest file to create an instance of the NDB resource in the Kubernetes cluster.
  • The operator controller builds a provisioning request and sends it to the Nutanix Database Service endpoint that handles provisioning in the cloud (AWS in our case).
  • The reconcile function loops and takes action depending on the provisioning status: Empty, Provisioning, or Ready.
  • Once the database is provisioned in AWS and the status is Ready, we set up a Kubernetes networking service and an endpoint with the same name in the cluster. Any application pod in this cluster communicates with this endpoint rather than directly with the database.
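The status-driven branching in the workflow above can be sketched as a small Go function. This is only an illustration: the helper name nextAction, the status string values, and the action descriptions are assumptions made for this example and are not taken from the operator's code.

```go
package main

import "fmt"

// Status values mirror the three states named in the workflow.
// The exact string values are assumptions for illustration.
const (
	StatusEmpty        = ""
	StatusProvisioning = "PROVISIONING"
	StatusReady        = "READY"
)

// nextAction is a hypothetical helper that maps the current provisioning
// status to the step a reconcile loop would take next.
func nextAction(status string) string {
	switch status {
	case StatusEmpty:
		return "send provisioning request"
	case StatusProvisioning:
		return "poll status and requeue"
	case StatusReady:
		return "set up service and endpoint"
	default:
		return "unknown status, requeue"
	}
}

func main() {
	for _, s := range []string{StatusEmpty, StatusProvisioning, StatusReady} {
		fmt.Printf("%q -> %s\n", s, nextAction(s))
	}
}
```

Keeping the decision in one pure function makes each branch easy to unit test without a running cluster.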

Design

Single Responsibility Principle (SRP)

SRP is adopted here to separate the business logic of the operator from the reconciliation loop.

  • The logic for remoteType: cloud is kept out of the client (ndbClient) and placed in helper methods. This ensures that the client is responsible only for making GET/POST requests to the specified endpoint, without needing to know whether the request is for on-prem or cloud.
  • The reconciliation loop is responsible for ensuring that the state of the system matches the desired state declared in the custom resource. It listens to events from the Kubernetes API server and performs the necessary actions to reconcile the system to that state. The business logic of the operator, on the other hand, implements the higher-level functionality, such as managing the deployment of a complex application or managing a database cluster.
  • SRP is also followed within the reconciliation logic. A separate file, utils/secret.go, implements GetDataFromSecret and GetAllDataFromSecret, which hold the logic to get data from the resource denoted by a name/namespace combination.

Operator Pattern

The operator pattern is based on the idea of declarative configuration management. Rather than writing imperative code to perform a series of steps to reach a desired state, an operator declares the desired state of the system and uses the Kubernetes API to perform the necessary actions to reconcile the system to that desired state.

We continue to follow this pattern in our implementation. We have a manifest file for an NDB custom resource responsible for provisioning and managing PostgreSQL in AWS, and the logic for it is handled in the controller code (controllers/database_controllers.go).

 EXAMPLE MANIFEST FILE
  ...
  
  apiVersion: ndb.nutanix.com/v1alpha1
  kind: Database
  metadata:
    name: db
  spec:
    ndb:
      remoteType: "cloud"
      clusterId: "Nutanix Cluster Id"
      credentialSecret: ndb-secret-name
      server: https://[NDB IP]:8443/era/v1.0
      skipCertificateVerification: true
    databaseInstance:
      databaseInstanceName: "Database Instance Name"
      databaseNames:
        - database_one
        - database_two
        - database_three
      credentialSecret: db-instance-secret-name
      size: 10
      timezone: "UTC"
      type: postgres
  ...

The controller logic and related helper code use the newly created 'remoteType' field to select the NDS endpoint and construct the provisioning request appropriately. In this way the configuration (manifest file) is kept separate from the logic (controller code).

Implementation related to this project

This section explains the changes that have been made, or are yet to be made, as part of this project.

Note: Most of these changes have not been made yet, because our team has not yet obtained access from the Nutanix team to connect the service with an EC2 instance. The subsections below indicate which changes are still pending; they will be made in the future.

Generate Secrets manifest and set name field + credentials in it

The secrets.yaml manifest file creates a Secret resource instance that holds the NDB instance name and the database credentials.

Note: We have created this file, but it currently contains template values. After getting access, we will fill them in accordingly.

Modify database_types.go to allow remote types

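We have added a remoteType field to the NDB struct along with validation markers. Each kubebuilder marker goes on its own comment line, and the Enum values are separated by semicolons:

  // +kubebuilder:validation:Required
  // +kubebuilder:validation:Enum=cloud;on-prem
  RemoteType string `json:"remoteType"`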


Add helper method to figure out endpoint depending on remote type

This logic currently lives in the ndbClient and should be moved to a separate helper method. The purpose of the helper is to determine whether the provisioning is for on-prem or cloud and to choose the endpoint accordingly.
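A minimal sketch of such a helper is shown here. The mapping of on-prem to the 0.9 API and cloud to the 1.0 API follows the test scenarios listed in the test plan on this page; the function names, the base URL, and the "/databases/provision" path suffix are assumptions made for illustration and are not confirmed against the operator or the NDB API.

```go
package main

import (
	"fmt"
	"strings"
)

// apiVersionFor returns the API version segment for a remote type.
// on-prem -> v0.9 and cloud -> v1.0 follow the test plan on this page.
func apiVersionFor(remoteType string) (string, error) {
	switch remoteType {
	case "on-prem":
		return "v0.9", nil
	case "cloud":
		return "v1.0", nil
	default:
		return "", fmt.Errorf("unsupported remoteType %q", remoteType)
	}
}

// provisionURL joins a base server address with the version segment.
// The "/databases/provision" suffix is a hypothetical path.
func provisionURL(base, remoteType string) (string, error) {
	version, err := apiVersionFor(remoteType)
	if err != nil {
		return "", err
	}
	return strings.TrimRight(base, "/") + "/" + version + "/databases/provision", nil
}

func main() {
	for _, rt := range []string{"on-prem", "cloud"} {
		u, err := provisionURL("https://ndb.example:8443/era", rt)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(rt, "->", u)
	}
}
```

With the selection isolated in a helper like this, the client only receives a final URL and stays unaware of the remote type.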

Modify helper methods to generate request, provision database and setup connectivity

Below are the functions that are core to the provisioning logic. They are currently written for provisioning a database in Nutanix cloud infrastructure (on-prem).

GenerateProvisioningRequest (database_reconciler_helpers.go)

The request data structure built by this function should be modified or extended to include the fields that the NDS endpoint requires in the JSON request for provisioning on cloud.

ProvisionDatabase (database_reconciler_helpers.go)

This method posts a request to the NDS endpoint to provision a database. It should be modified to support provisioning on cloud as well.

SetupConnectivity (database_reconciler_helpers.go)

This function checks for a service (without label selectors) and creates one if it does not exist, and sets the database as the owner of the created service. It also checks for an endpoints object for the service and creates one if it does not already exist.

Test plan

Running tests

Pull the code, cd into the directory and run:

  make test

This runs all the required tests as specified in the Makefile.

Test Scenarios

  1. Test construction of the provision request. All test cases currently written for on-prem must also pass for the cloud remoteType.
  2. Add tests for the fields that are mandatory in requests constructed for provisioning in cloud.
  3. Verify that the correct API endpoints are hit based on the remote type specified: on-prem should hit the 0.9 API and cloud should hit the 1.0 API.
  4. Test the connectivity between the Kubernetes service and the deployed DB on EC2.

Actions to be taken before Manual testing

Manual testing requires access to the Nutanix Database Service.

  1. Obtain access to Nutanix Database Service SaaS and access keys for the AWS EC2 instance.
  2. Add the appropriate credentials to the secrets.yaml file.
  3. Apply your manifest instance using:
  kubectl apply -f <your manifest instance file>

You can verify the status of provisioning in your NDB dashboard.

Future implementation

  • Obtain EC2 keys and connect NDS to the EC2 instance
  • Construct provision request for provisioning DB in cloud
  • Add logic to select proper endpoint and request based on the remoteType
  • Add logic to de-provision the database.

Github

Repo (Public): https://github.com/arvindsrinivas1/ndb-operator
Pull Request: https://github.com/nutanix-cloud-native/ndb-operator/pull/73

Contributors

Arvind Srinivas Subramanian (asubram9)