CSC/ECE 517 Spring 2024 - NTNX-2 : Snapshot Functionality for provisioned databases


Problem Background

NDB Operator

The NDB Operator is a Kubernetes operator for managing databases provisioned through the Nutanix Database Service (NDB). It automates tasks such as provisioning, scaling, and backing up databases, which makes them easier to manage from within Kubernetes. It works with tools like Ansible, Jenkins, and Terraform to set up and run databases automatically. The NDB Operator also supports security and compliance through features like data encryption and access control. It is designed to take advantage of Kubernetes features such as auto-scaling and load balancing, so developers can focus on their main tasks and spend less time managing databases. It also gives applications a simple way to connect to the databases it manages.

Nutanix Database Service

Nutanix Database Service (NDB) is a Database-as-a-Service platform that simplifies managing databases across different environments, such as on-premises and cloud. It supports databases like SQL Server, Oracle, and MongoDB, automating tasks like provisioning, cloning, and backup. NDB offers a self-service experience for developers, streamlining database setup and management, allowing DBAs to focus on more strategic tasks while maintaining control over their database environments.

Problem Statement and Architecture

The project involves implementing a Kubernetes controller and a custom resource for managing snapshot operations of databases provisioned by the Nutanix Database Service (NDB). This task encompasses developing a comprehensive system that captures, manages, and reports on the snapshots, providing a Kubernetes-native interface for these functionalities. Users should be able to create, view, manage, and filter snapshots, accessing details like time, name, and other metadata. The solution must integrate seamlessly with NDB, ensuring robust management and real-time status reporting of snapshot operations within the Kubernetes environment.

NDB Architecture

Functional Design

To implement a snapshot custom resource and controller that manages and reports on snapshots for provisioned databases, we need to integrate the existing APIs and extend their functionality with Kubernetes custom resources and controllers. The existing API for managing database snapshots is described below.


Take the /snapshots endpoint as an example. It can be summarized as follows:

 Path: /snapshots
 Method: GET
 Request Payload:
 {
   value-type (string): Filters the snapshots based on the specified type, such as "type", "status", "protection-domain-id", "database-node", "snapshot-id".
   value (string): Corresponds to the value-type to provide a specific filter value.
   database-ids (string): Fetches snapshots for the specified comma-separated list of database IDs.
   all (boolean, default: false): If true, fetches all snapshots regardless of other filters.
   time-zone (string, default: UTC): Sets the time zone for the snapshot timestamps.
 }
 Authorization: Basic Auth (username-password based)
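As an illustration of how a client might call this endpoint, the following Go sketch builds the GET request. It assumes the filters are sent as URL query parameters; the base URL, the credentials, the filter value "ACTIVE", and the names ListSnapshotsOptions and buildListSnapshotsRequest are placeholders chosen for this example and are not part of the NDB API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// ListSnapshotsOptions mirrors the filters listed above. The struct itself is
// hypothetical; only the parameter names come from the API summary.
type ListSnapshotsOptions struct {
	ValueType   string   // sent as "value-type", e.g. "status"
	Value       string   // sent as "value"
	DatabaseIDs []string // sent as "database-ids", comma-separated
	All         bool     // sent as "all"
	TimeZone    string   // sent as "time-zone"; UTC is used when empty
}

// buildListSnapshotsRequest assembles GET <baseURL>/snapshots with Basic Auth.
// Assumption: the filters travel as URL query parameters.
func buildListSnapshotsRequest(baseURL, username, password string, opts ListSnapshotsOptions) (*http.Request, error) {
	q := url.Values{}
	if opts.ValueType != "" {
		q.Set("value-type", opts.ValueType)
		q.Set("value", opts.Value)
	}
	if len(opts.DatabaseIDs) > 0 {
		q.Set("database-ids", strings.Join(opts.DatabaseIDs, ","))
	}
	if opts.All {
		q.Set("all", "true")
	}
	tz := opts.TimeZone
	if tz == "" {
		tz = "UTC"
	}
	q.Set("time-zone", tz)

	endpoint := strings.TrimRight(baseURL, "/") + "/snapshots?" + q.Encode()
	req, err := http.NewRequest(http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(username, password)
	return req, nil
}

func main() {
	// Placeholder server address and credentials.
	req, err := buildListSnapshotsRequest("https://ndb.example.com/api", "admin", "secret",
		ListSnapshotsOptions{ValueType: "status", Value: "ACTIVE", DatabaseIDs: []string{"db-1", "db-2"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

The request is only built here, not sent, so the sketch can be run without access to an NDB server.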

Description:

Retrieves a list of all snapshots, with options to filter based on type, status, specific databases, and other criteria. The result would be


2. Define the Snapshot Custom Resource Definition (CRD) and Implement the Snapshot Controller

● Resource Definition Example: Define a CRD named Snapshot that includes the necessary metadata for snapshot operations. Example as below:

Kind: Snapshot 
Spec:
 name: String - The name of the snapshot.
 ip: String - The IP address of the database. 
 timeMachineID: String - An identifier for the Time Machine. 
 username: String - The username for authentication. 
 password: String - The password for authentication.
Status:
 operationID: String - A unique identifier for the snapshot operation.
 status: String - Reflects the outcome of the snapshot operation (e.g., OK, Failed).
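The resource definition above could map onto Go types along the following lines. This is a simplified sketch that uses only the standard library: in an actual operator the Snapshot type would typically also embed the Kubernetes type and object metadata, and the JSON field names and the decodeSnapshot helper shown here are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SnapshotSpec holds the user-supplied parameters from the Spec section above.
type SnapshotSpec struct {
	Name          string `json:"name"`
	IP            string `json:"ip"`
	TimeMachineID string `json:"timeMachineID"`
	Username      string `json:"username"`
	Password      string `json:"password"`
}

// SnapshotStatus holds the fields that the controller reports back.
type SnapshotStatus struct {
	OperationID string `json:"operationID,omitempty"`
	Status      string `json:"status,omitempty"` // e.g. "OK" or "Failed"
}

// Snapshot is a simplified stand-in for the custom resource object.
type Snapshot struct {
	Kind   string         `json:"kind"`
	Spec   SnapshotSpec   `json:"spec"`
	Status SnapshotStatus `json:"status"`
}

// decodeSnapshot parses a JSON document and checks that its kind is "Snapshot".
// It is a hypothetical helper used only to exercise the types.
func decodeSnapshot(data []byte) (Snapshot, error) {
	var s Snapshot
	if err := json.Unmarshal(data, &s); err != nil {
		return Snapshot{}, err
	}
	if s.Kind != "Snapshot" {
		return Snapshot{}, fmt.Errorf("unexpected kind %q", s.Kind)
	}
	return s, nil
}

func main() {
	// Placeholder values; none of them refer to a real environment.
	manifest := []byte(`{
		"kind": "Snapshot",
		"spec": {"name": "nightly", "ip": "10.0.0.5", "timeMachineID": "tm-1",
		         "username": "admin", "password": "secret"}
	}`)
	snap, err := decodeSnapshot(manifest)
	if err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(snap, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping Spec and Status as separate structs matches the split described above: users write only the Spec, while the controller is the only writer of the Status.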

Controller Functions: Develop a Kubernetes controller that manages the lifecycle of Snapshot custom resources, including creation, update, and deletion.

● Watch Changes: The controller should monitor changes to Snapshot resources and interact with the Nutanix API to execute snapshot operations.

● Status Update: The controller should update the custom resource's status to reflect the progress and outcome of the snapshot operations.
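The watch-and-update behaviour described above can be sketched as a single reconcile pass. The example below is a standard-library simplification, not the operator's actual controller: a real controller would normally be built on a framework such as controller-runtime and would receive events from the Kubernetes API server. The SnapshotTaker interface, the fakeNDB type, and the operation ID format are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Minimal versions of the resource types, reduced to what this sketch needs.
type SnapshotSpec struct{ Name, TimeMachineID string }
type SnapshotStatus struct{ OperationID, Status string }
type Snapshot struct {
	Spec   SnapshotSpec
	Status SnapshotStatus
}

// SnapshotTaker is a hypothetical abstraction over the NDB snapshot API call.
type SnapshotTaker interface {
	TakeSnapshot(timeMachineID, name string) (operationID string, err error)
}

// reconcile performs one pass: it submits the snapshot request if that has
// not happened yet and records the outcome in the resource's status.
func reconcile(snap *Snapshot, ndb SnapshotTaker) error {
	if snap.Status.OperationID != "" {
		return nil // already submitted; nothing to do in this pass
	}
	opID, err := ndb.TakeSnapshot(snap.Spec.TimeMachineID, snap.Spec.Name)
	if err != nil {
		snap.Status.Status = "Failed"
		return fmt.Errorf("taking snapshot %q: %w", snap.Spec.Name, err)
	}
	snap.Status.OperationID = opID
	snap.Status.Status = "OK"
	return nil
}

// fakeNDB is an in-memory stand-in for the NDB service.
type fakeNDB struct{}

func (fakeNDB) TakeSnapshot(timeMachineID, name string) (string, error) {
	if timeMachineID == "" {
		return "", errors.New("time machine ID is required")
	}
	return "op-" + name, nil
}

func main() {
	snap := &Snapshot{Spec: SnapshotSpec{Name: "nightly", TimeMachineID: "tm-1"}}
	if err := reconcile(snap, fakeNDB{}); err != nil {
		fmt.Println("reconcile error:", err)
	}
	fmt.Printf("status=%s operationID=%s\n", snap.Status.Status, snap.Status.OperationID)
}
```

Checking for an existing operation ID before calling the API keeps the pass idempotent, so repeated reconcile calls for the same resource do not request duplicate snapshots.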


3. Integrated API Development, Code Implementation, and User Interface Enhancement


● API and Data Handling: Develop a new API endpoint (e.g., /<TimeMachineID>/snapshotsv2) in the NDB service that accepts additional parameters and interfaces with the existing snapshot API. This endpoint should handle extra metadata, ensuring seamless integration and efficient processing of snapshot-related data.

● Go File Creation and Kubernetes Integration: Implement the snapshot functionality by creating essential Go files such as snapshot.go, snapshot_request_types.go, snapshot_response_types.go, and snapshot_helpers.go. Concurrently, define a Kubernetes resource object in snapshot_types.go to encapsulate the specifications (Spec) and operational status (Status) of the snapshot, facilitating its management within the Kubernetes ecosystem.

● User Interface and Access Control: Provide a configuration file (snapshot.yaml) in which users specify the parameters for snapshot creation. Additionally, revise or create RBAC policies in files such as roles.yaml, snapshot_editor_role.yaml, and snapshot_viewer_role.yaml to define the actions permitted for each user role, ensuring secure and controlled access to the snapshot functionality.

Implementation and Workflow

1. Check the prerequisites for snapshot creation and define the Snapshot Custom Resource Definition (CRD)

● Kubernetes cluster: Verify that the Kubernetes cluster is running and accessible.

● Nutanix Database Service (NDB): Ensure that NDB is properly configured and accessible from within the cluster.

● Create a new CRD to define the snapshot resource, which should include metadata such as the snapshot name, IP address, TimeMachineID, username, and password.

● Define the Spec and Status sections, where Spec includes parameters needed for creating a snapshot, and Status contains the operation ID and status (e.g., success or failure).


2. Implement the Snapshot Controller

The changes add the following files:

1) snapshot_controller.go [File]

2) snapshot_controller_helper.go [File]

3) snapshot.go [File]

Develop a Kubernetes controller to handle the lifecycle of snapshot CRs, including creation, update, and deletion. The controller should watch for changes in snapshot CRs and interact with the Nutanix API to perform snapshot operations. It should also update the CR's status to reflect the progress and outcome of the snapshot operations.


3. Integrate with the Existing API

● Create a new API endpoint in the NDB service that accepts additional parameters and calls the existing snapshot API.

● Ensure the new API can handle the additional metadata and pass it to the existing snapshot API.
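One way to picture the wrapper described in the two points above is a function that decodes the extended payload, forwards the part the existing API understands, and carries the extra metadata alongside the result. Everything in this sketch is hypothetical: the payload shapes, the type and function names, and the way the existing API is represented as a function value are assumptions made for illustration, not the NDB service's actual interface.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// SnapshotV2Request is a hypothetical payload for the new endpoint: the
// snapshot name plus free-form metadata that the existing API does not accept.
type SnapshotV2Request struct {
	Name     string            `json:"name"`
	Metadata map[string]string `json:"metadata,omitempty"`
}

// LegacySnapshotRequest is a hypothetical stand-in for the existing API's payload.
type LegacySnapshotRequest struct {
	Name string `json:"name"`
}

// LegacyCreateFunc stands in for the call into the existing snapshot API.
type LegacyCreateFunc func(timeMachineID string, req LegacySnapshotRequest) (operationID string, err error)

// SnapshotV2Response returns the operation ID together with the stored metadata.
type SnapshotV2Response struct {
	OperationID string            `json:"operationID"`
	Metadata    map[string]string `json:"metadata,omitempty"`
}

// handleSnapshotV2 validates the extended payload, forwards the legacy part
// to the existing API, and attaches the metadata to the response.
func handleSnapshotV2(timeMachineID string, body []byte, create LegacyCreateFunc) (SnapshotV2Response, error) {
	var req SnapshotV2Request
	if err := json.Unmarshal(body, &req); err != nil {
		return SnapshotV2Response{}, fmt.Errorf("decoding request: %w", err)
	}
	if timeMachineID == "" || req.Name == "" {
		return SnapshotV2Response{}, errors.New("time machine ID and snapshot name are required")
	}
	opID, err := create(timeMachineID, LegacySnapshotRequest{Name: req.Name})
	if err != nil {
		return SnapshotV2Response{}, fmt.Errorf("existing snapshot API: %w", err)
	}
	return SnapshotV2Response{OperationID: opID, Metadata: req.Metadata}, nil
}

func main() {
	// An in-memory replacement for the existing API, used only for this demo.
	create := func(timeMachineID string, req LegacySnapshotRequest) (string, error) {
		return "op-" + timeMachineID + "-" + req.Name, nil
	}
	body := []byte(`{"name":"nightly","metadata":{"owner":"team-a"}}`)
	resp, err := handleSnapshotV2("tm-1", body, create)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", resp)
}
```

Passing the existing API in as a function value keeps the wrapper easy to test, since the real call can be swapped for an in-memory replacement.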


Write Code to Support New Features

● Create snapshot.go, snapshot_request_types.go, snapshot_response_types.go, and snapshot_helpers.go in Go, to implement the snapshot functionality.

● Create snapshot_types.go for representing the Kubernetes resource object, including the Spec and Status sections.


Implement User Interface and Access Control

● Create snapshot.yaml to allow users to provide parameters needed for creating a snapshot.

● Modify or create RBAC policies (such as roles.yaml, snapshot_editor_role.yaml, snapshot_viewer_role.yaml) to control the operations that different roles of users can perform.

Test Plan

To ensure that our custom resource and controller can take, retrieve, and manage snapshots for provisioned databases while reporting their status, we use the following test plan.

CRD Verification: Check that the Snapshot CRD is correctly defined and can be instantiated.

● Create a Snapshot resource using kubectl apply and verify that it is accepted by the Kubernetes API.

Controller Operations: Test the lifecycle operations handled by the Snapshot controller.

● Create: Trigger snapshot creation through the Snapshot custom resource and verify that the controller initiates the snapshot process.

● Update: Modify an existing Snapshot resource and observe if the controller processes the update correctly.

● Delete: Delete a Snapshot resource and ensure the controller cleans up accordingly.

The following focused tests check that the custom resource and controller manage the snapshot lifecycle for provisioned databases and report statuses accurately:

| Test | Test Description |
|---|---|
| Snapshot Creation Success | Tests the successful creation of a snapshot by the controller when a Snapshot custom resource is applied. Verifies that the snapshot is taken in the database and that the status is reported back as successful in the custom resource. |
| Snapshot Creation Failure Handling | Attempts to create a snapshot with invalid parameters to test the controller's ability to handle failure. Verifies that the status is reported back as failed, and the appropriate error message is logged. |
| Snapshot Listing and Metadata Validation | Ensures that the controller can list all snapshots and that the metadata associated with each snapshot is correct and complete. Validates the ability to filter snapshots based on metadata criteria. |
| Snapshot Deletion Success | Confirms that the controller can successfully delete a snapshot and accurately reflects this in the status of the Snapshot custom resource. Also verifies that the snapshot is no longer retrievable from the database. |
| Snapshot Status Updates | Monitors the status field of the Snapshot custom resource for accurate real-time updates throughout the snapshot lifecycle, including during creation, successful completion, failure, and deletion. |
| Role-Based Access Control (RBAC) Compliance | Verifies that the Snapshot custom resource and controller adhere to RBAC policies, ensuring that only authorized users can perform create, read, update, or delete operations on snapshots. |

This test plan, currently consisting of six tests, provides a thorough overview of the snapshot lifecycle management, encompassing creation, retrieval, deletion, status reporting, and access control. It will be subject to further refinement and expansion throughout the project’s development.
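As a rough idea of how the creation success and failure cases could be exercised without a real NDB instance, the sketch below runs both against an in-process fake server. It is an assumption-laden illustration: the use of POST, the JSON body, the fake server's validation rule, and the function names takeSnapshot, newFakeNDB, and runChecks are all invented for this example, and only the /<TimeMachineID>/snapshotsv2 path shape and the OK/Failed status values come from the design above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// takeSnapshot posts a snapshot request and maps the HTTP outcome to the
// status values used in the custom resource ("OK" or "Failed").
func takeSnapshot(client *http.Client, baseURL, timeMachineID, name string) (string, error) {
	body, err := json.Marshal(map[string]string{"name": name})
	if err != nil {
		return "Failed", err
	}
	resp, err := client.Post(baseURL+"/"+timeMachineID+"/snapshotsv2", "application/json", bytes.NewReader(body))
	if err != nil {
		return "Failed", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "Failed", fmt.Errorf("unexpected response: %s", resp.Status)
	}
	return "OK", nil
}

// newFakeNDB starts a local test server that rejects requests without a name.
func newFakeNDB() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req map[string]string
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil || req["name"] == "" {
			http.Error(w, "snapshot name is required", http.StatusBadRequest)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

// runChecks runs the two creation cases and returns a description of each failure.
func runChecks() []string {
	srv := newFakeNDB()
	defer srv.Close()

	cases := []struct{ label, name, want string }{
		{"creation success", "nightly", "OK"},
		{"creation failure handling", "", "Failed"},
	}
	var failures []string
	for _, c := range cases {
		got, _ := takeSnapshot(srv.Client(), srv.URL, "tm-1", c.name)
		if got != c.want {
			failures = append(failures, fmt.Sprintf("%s: got %q, want %q", c.label, got, c.want))
		}
	}
	return failures
}

func main() {
	if failures := runChecks(); len(failures) > 0 {
		for _, f := range failures {
			fmt.Println("FAIL:", f)
		}
		return
	}
	fmt.Println("all checks passed")
}
```

The same table of cases could later be moved into a Go test file and extended with the listing, deletion, status, and RBAC scenarios from the test plan.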

References:

https://www.nutanix.com/what-we-do

https://github.com/dvrohan/ndb-operator

Team:

Mentor

Kartiki Bhandakkar <kbhanda3@ncsu.edu>

Student

Zhi Zhang zzhan224@ncsu.edu

Yucheng Zhu yzhu67@ncsu.edu

Shardul Ladekar sladeka@ncsu.edu