CSC/ECE 517 Fall 2023 - NTX-4 Extend NDB Operator capabilities to support Postgres HA

From Expertiza_Wiki
Revision as of 02:38, 2 November 2023 by Syacham (talk | contribs)

Kubernetes

Kubernetes, often abbreviated as K8s, is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. It was originally developed by Google and is now maintained by the Cloud Native Computing Foundation (CNCF). Its robust and flexible container management architecture simplifies the deployment and maintenance of complex, distributed applications.

Key Concepts and Components of Kubernetes

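1. Containers: Containers are lightweight, isolated environments that package an application together with its dependencies, and Kubernetes is built to orchestrate them. Docker is one of the most widely used tools for building container images; Kubernetes itself runs containers through a Container Runtime Interface (CRI) compatible runtime such as containerd or CRI-O.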

2. Nodes: Nodes are the machines, physical or virtual, on which your containerized applications run. They supply the cluster's compute resources and are responsible for running the containers scheduled onto them.

3. Pods: Pods are the smallest deployable units in Kubernetes. A pod holds one or more containers that share the same network namespace and storage volumes, so containers in the same pod can communicate with one another over `localhost`.

4. ReplicaSets and Deployments: These controllers maintain the desired number of pod replicas. A ReplicaSet keeps a specified number of identical pods running, while a Deployment manages ReplicaSets to scale applications and roll out updates.

5. Services: A Service gives the applications running in a set of pods a stable network endpoint, even as individual pods are created and destroyed. Services are used for tasks such as service discovery and load balancing.

6. Ingress: Ingress resources and ingress controllers provide a mechanism to control external network access to Services inside the cluster.

7. ConfigMaps and Secrets: These objects keep configuration separate from application code. ConfigMaps hold non-sensitive configuration data, while Secrets hold sensitive data such as passwords or API keys.

8. Namespaces: Namespaces let you logically divide and isolate resources within a cluster. They are useful for organizing applications and for multi-tenancy.

9. Kubelet: The kubelet is an agent that runs on every node in the cluster and makes sure that the containers described in each pod's specification are running.

10. Control Plane (Master Node): The control plane, which is made up of the Kubernetes master components, manages and supervises the cluster. It consists of the API server, the scheduler, the controller manager, and etcd (a key-value store for cluster data).

11. kubectl: This command-line tool is used to communicate with a Kubernetes cluster. It lets you create, inspect, update, and delete cluster resources.

Kubernetes is a popular choice for managing containerized applications, microservices, and cloud-native workloads because it is highly extensible and integrates with a wide range of tools and services. It offers a uniform platform for automating deployment, scaling, and operations in modern cloud-native systems, and it abstracts away many of the challenges of managing containers.

Secret:

A Secret is an object that holds a small amount of sensitive information, such as a password, a token, or a key. Such information might otherwise be placed in a Pod specification or a container image. Using a Secret means that confidential data does not need to be included in application code.

Because Secrets can be created independently of the Pods that use them, there is less risk of a Secret (and its data) being exposed while Pods are created, viewed, and edited. Kubernetes, and applications that run in the cluster, can also take additional precautions with Secrets, such as avoiding writing sensitive data to nonvolatile storage.

Secrets and ConfigMaps are similar, but Secrets are made especially for storing private information.
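
As a rough illustration of the shape of a Secret, the following Go sketch builds a manifest for a generic (`Opaque`) Secret and base64-encodes its values, which is what the `data` field of a Secret expects. The struct, the helper names, and the `db-credentials` example are assumptions made for this illustration; a real project would normally use the types from the Kubernetes client libraries rather than hand-written structs.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// secretManifest mirrors the basic shape of a Kubernetes Secret manifest.
// It is a hand-written stand-in used only for this illustration.
type secretManifest struct {
	APIVersion string            `json:"apiVersion"`
	Kind       string            `json:"kind"`
	Metadata   map[string]string `json:"metadata"`
	Type       string            `json:"type"`
	Data       map[string]string `json:"data"`
}

// encodeSecretData base64-encodes every plain-text value, as required
// for entries in the `data` field of a Secret.
func encodeSecretData(plain map[string]string) map[string]string {
	encoded := make(map[string]string, len(plain))
	for key, value := range plain {
		encoded[key] = base64.StdEncoding.EncodeToString([]byte(value))
	}
	return encoded
}

// newOpaqueSecret builds a manifest for a generic (Opaque) Secret.
func newOpaqueSecret(name string, plain map[string]string) secretManifest {
	return secretManifest{
		APIVersion: "v1",
		Kind:       "Secret",
		Metadata:   map[string]string{"name": name},
		Type:       "Opaque",
		Data:       encodeSecretData(plain),
	}
}

func main() {
	// The name and the password below are placeholder values.
	secret := newOpaqueSecret("db-credentials", map[string]string{
		"password": "example-password",
	})
	out, err := json.MarshalIndent(secret, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Note that base64 is an encoding, not encryption: anyone who can read the Secret object can decode its values, so access to Secrets still has to be restricted.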


Custom Resource Definition:

A custom resource is an object that extends the Kubernetes API, letting us add our own API to a project or cluster. A custom resource definition (CRD) file defines our own object types and lets the API server handle the entire lifecycle of those objects.
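
To make the idea concrete, the sketch below models an instance of a custom resource as plain Go structs and prints it as JSON. Everything in it is hypothetical: the `example.com/v1` group and version, the `Database` kind, and the `engine` and `sizeGB` fields are invented for illustration and are not the NDB Operator's actual schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DatabaseSpec is the desired state a user declares.
// The fields are hypothetical and exist only for this example.
type DatabaseSpec struct {
	Engine string `json:"engine"`
	SizeGB int    `json:"sizeGB"`
}

// Database is a hypothetical custom resource. Like built-in objects,
// it carries an apiVersion, a kind, metadata, and a spec.
type Database struct {
	APIVersion string            `json:"apiVersion"`
	Kind       string            `json:"kind"`
	Metadata   map[string]string `json:"metadata"`
	Spec       DatabaseSpec      `json:"spec"`
}

// newDatabase fills in the type information that a CRD for this
// hypothetical kind would register with the API server.
func newDatabase(name, engine string, sizeGB int) Database {
	return Database{
		APIVersion: "example.com/v1",
		Kind:       "Database",
		Metadata:   map[string]string{"name": name},
		Spec:       DatabaseSpec{Engine: engine, SizeGB: sizeGB},
	}
}

func main() {
	db := newDatabase("orders-db", "postgres", 10)
	out, err := json.MarshalIndent(db, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Once a CRD for such a kind is installed, objects of that kind can be created, listed, and deleted through the API server in the same way as built-in resources.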


Kubernetes Operator:

A Kubernetes operator is a specialized method of packaging, deploying, and managing Kubernetes applications. It uses the Kubernetes API and tooling to create, configure, and automate complex application instances on behalf of users. Operators extend Kubernetes controllers with domain-specific knowledge so that they can manage the entire application lifecycle. In addition to continuously monitoring and maintaining an application, they can scale, upgrade, and manage different parts of it, including kernel modules.

To manage components and applications, operators use custom resources (CRs) defined by custom resource definitions (CRDs). They watch CR types and use embedded logic that encodes best practices to translate high-level user directives into low-level actions. These custom resources can be managed with kubectl and governed by role-based access control policies. Operators enable automation beyond the capabilities of Kubernetes' built-in automation features, in line with site reliability engineering (SRE) and DevOps practices. They are usually developed by people who understand the business logic of the particular application, and they encode human operational knowledge into software, eliminating manual tasks.
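
The translation from a high-level declaration into low-level actions is commonly implemented as a reconcile loop that compares desired state with observed state. The following is a minimal sketch of that pattern in plain Go. The types, the `reconcile` function, and the action strings are assumptions for illustration; a real operator would read both states from the API server and act through a Kubernetes client.

```go
package main

import "fmt"

// desiredState is what the user declared in the custom resource.
type desiredState struct {
	Replicas int
}

// observedState is what currently exists in the cluster.
type observedState struct {
	Replicas int
}

// reconcile compares desired and observed state and returns the
// low-level actions needed to make the two converge.
func reconcile(desired desiredState, observed observedState) []string {
	var actions []string
	// Too few replicas: create the missing ones.
	for i := observed.Replicas; i < desired.Replicas; i++ {
		actions = append(actions, fmt.Sprintf("create replica %d", i))
	}
	// Too many replicas: delete the surplus, highest index first.
	for i := observed.Replicas; i > desired.Replicas; i-- {
		actions = append(actions, fmt.Sprintf("delete replica %d", i-1))
	}
	return actions
}

func main() {
	desired := desiredState{Replicas: 3}
	observed := observedState{Replicas: 1}
	for _, action := range reconcile(desired, observed) {
		fmt.Println(action)
	}
}
```

When desired and observed state already match, `reconcile` returns no actions, which is what makes it safe to run the loop repeatedly.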

The Operator Framework is a collection of open-source tools that speed up the development of operators. It provides the Operator SDK, which lets developers build operators without in-depth knowledge of the Kubernetes API; Operator Lifecycle Manager (OLM), which oversees the installation and management of operators; and Operator Metering, which provides usage reporting for specialized services.


Nutanix Database Service


NDB Kubernetes Operator

NDB Operator is a Kubernetes operator that aims to simplify setting up and maintaining database clusters within Kubernetes clusters. A Kubernetes operator is an application that has operational knowledge of another application. Once deployed in the Kubernetes cluster, it can watch the endpoints of interest and make changes to the application it manages. With the NDB Operator, an NDB Cluster can be deployed, managed, and modified with minimal human intervention.

Developers can now provision PostgreSQL, MySQL, and MongoDB databases directly from their K8s cluster, saving days or even weeks of work. They can use the open-source NDB Operator on their preferred K8s platform while taking advantage of NDB's complete database lifecycle management.

Problem Statement

The problem statement requires us to extend the NDB Operator's capabilities to support Postgres HA (High Availability). NDB currently supports Postgres High Availability databases, but the NDB Operator cannot manage them. Our task is to identify what additions the project needs in order to support Postgres HA and to implement them. We will also perform end-to-end testing of the provisioning and deprovisioning processes to make sure they work correctly.

Postgres HA involves implementing measures that keep a PostgreSQL database system operational and accessible in the face of hardware failures, software issues, or other disruptions. These measures include replication, failover, load balancing, and more.
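
As a simplified illustration of how replication and failover fit together, the sketch below picks which node should act as the primary: the current primary if it is healthy, otherwise the healthy replica with the least replication lag. The `node` type, its fields, and the `electPrimary` function are hypothetical and are not part of PostgreSQL or NDB. Real deployments delegate this decision to dedicated HA tooling, which also has to deal with consensus and split-brain scenarios.

```go
package main

import "fmt"

// node describes one PostgreSQL instance in a hypothetical HA cluster.
type node struct {
	Name      string
	Healthy   bool
	IsPrimary bool
	LagBytes  int64 // replication lag behind the primary
}

// electPrimary returns the name of the node that should serve as
// primary. A healthy current primary is kept; otherwise the healthy
// replica with the smallest lag is chosen. The boolean is false when
// no healthy node exists.
func electPrimary(nodes []node) (string, bool) {
	best := -1
	for i, n := range nodes {
		if !n.Healthy {
			continue
		}
		if n.IsPrimary {
			return n.Name, true
		}
		if best == -1 || n.LagBytes < nodes[best].LagBytes {
			best = i
		}
	}
	if best == -1 {
		return "", false
	}
	return nodes[best].Name, true
}

func main() {
	cluster := []node{
		{Name: "pg-0", Healthy: false, IsPrimary: true},
		{Name: "pg-1", Healthy: true, LagBytes: 4096},
		{Name: "pg-2", Healthy: true, LagBytes: 512},
	}
	if name, ok := electPrimary(cluster); ok {
		fmt.Println("primary:", name)
	} else {
		fmt.Println("no healthy node available")
	}
}
```

In this example the original primary `pg-0` is unhealthy, so the replica with the least lag, `pg-2`, is selected.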

Approach

- To start, a hardcoded file containing the basic configuration for Postgres HA can be created.
- The hardcoded file will contain details such as databaseInstanceName, description, clusterId, credentialSecret, and size.
- Changes to integrate Postgres HA support can then be made, starting from ndb_api.
- The changes will flow from ndb_api -> controller adapters -> controllers -> api.
- In ndb_api, another appendRequest method for Postgres HA needs to be added to db_helpers.go. This method will supply the additional specifications for this database type.
- Within api, the Instance struct needs to be updated to include parameters such as the number of copies (replicas) of the instance and how often data is copied from the primary.
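
The last two steps could look roughly like the sketch below. It is only an assumption about the eventual design, not the NDB Operator's actual code. The first five `Instance` fields come from the list above; `IsHighAvailability`, `ReplicaCount`, and `SyncIntervalSeconds` are hypothetical additions, and `ProvisionRequest` and `appendPostgresHARequest` are simplified stand-ins for the real request type and the appendRequest method in db_helpers.go.

```go
package main

import (
	"errors"
	"fmt"
)

// Instance holds the fields listed in the approach, plus hypothetical
// HA parameters (the last three fields are assumptions).
type Instance struct {
	DatabaseInstanceName string
	Description          string
	ClusterId            string
	CredentialSecret     string
	Size                 int
	IsHighAvailability   bool // hypothetical: opt in to Postgres HA
	ReplicaCount         int  // hypothetical: copies of the instance
	SyncIntervalSeconds  int  // hypothetical: how often data is copied
}

// ProvisionRequest is a simplified, hypothetical stand-in for the
// provisioning payload that ndb_api sends to NDB.
type ProvisionRequest struct {
	Name                string
	Description         string
	ClusterId           string
	Clustered           bool
	NodeCount           int
	SyncIntervalSeconds int
}

// appendPostgresHARequest adds HA-specific settings to a request.
// It leaves the request untouched for non-HA instances.
func appendPostgresHARequest(req *ProvisionRequest, inst Instance) error {
	if !inst.IsHighAvailability {
		return nil
	}
	if inst.ReplicaCount < 1 {
		return errors.New("postgres HA requires at least one replica")
	}
	req.Clustered = true
	req.NodeCount = inst.ReplicaCount + 1 // one primary plus the replicas
	req.SyncIntervalSeconds = inst.SyncIntervalSeconds
	return nil
}

func main() {
	inst := Instance{
		DatabaseInstanceName: "example-ha-db",
		ClusterId:            "example-cluster-id",
		Size:                 10,
		IsHighAvailability:   true,
		ReplicaCount:         2,
		SyncIntervalSeconds:  30,
	}
	req := &ProvisionRequest{
		Name:        inst.DatabaseInstanceName,
		Description: inst.Description,
		ClusterId:   inst.ClusterId,
	}
	if err := appendPostgresHARequest(req, inst); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *req)
}
```

Keeping the HA-specific logic in its own function means the existing single-instance provisioning path does not change, which should make the end-to-end tests for the two paths easier to keep separate.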

Relevant Links

Link to GitHub repository: https://github.com/rithvik2607/ndb-operator

Team

Mentor
Nandini Mundra

Student Team
Sai Rithvik Ayithapu (sayitha@ncsu.edu)
Sreehith Yachamaneni (syacham@ncsu.edu)
Rushil Patel (rdpate24@ncsu.edu)