CSC/ECE 517 Spring 2023 - NTNX-1. Support provisioning MongoDb via NDB Kubernetes Operator

From Expertiza_Wiki

Background

Kubernetes

Kubernetes is an open-source container orchestration platform used for automating the deployment, scaling, and management of containerized applications. Kubernetes enables developers to deploy and manage containerized applications across a distributed network of computers or servers. It uses a declarative model to define the desired state of an application and automatically manages the containerized components to ensure that the actual state matches the desired state. Kubernetes offers a wide range of features for managing containerized applications, including automatic scaling, rolling updates, self-healing, service discovery, and load balancing, and can be run on public or private cloud infrastructure or on-premises data centers.

Nutanix Database Service

Nutanix Database Service is a hybrid multi-cloud database-as-a-service for various databases including Microsoft SQL Server, Oracle Database, PostgreSQL, MongoDB, and MySQL. It offers the ability to manage hundreds to thousands of databases effectively, easily create new ones, and automate tedious administrative tasks such as patching and backups. Additionally, users can select operating systems, database versions, and extensions to meet compliance and application requirements. With Nutanix Database Service, customers from all over the world have streamlined their databases across various locations and accelerated software development.

Features offered by NDB Service:

  1. It enables users to manage the complete database lifecycle, including database provisioning, scaling, version upgrades, and patch automation.
  2. The product allows users to manage hundreds to thousands of databases, such as Microsoft SQL Server, Oracle, PostgreSQL, MySQL, and MongoDB, across various platforms, including on-premises, colocation facilities, and multiple public clouds, all from a single control point.
  3. Users can provision databases for both dev/test and production purposes through API integration with infrastructure management and development tools, such as ServiceNow.
  4. The product enables users to quickly deploy patches across some or all of their databases to prevent the latest security threats. Additionally, it provides role-based access controls to restrict access to databases, ensuring compliance with regulatory requirements and best practices.
  5. It also provides features for data protection, compliance, and security, including data encryption, role-based access control, and audit logging.
  6. It integrates with popular DevOps tools like Ansible, Jenkins, and Terraform to automate the deployment and management of databases.

NDB Kubernetes Operator

The NDB Kubernetes Operator is a tool that simplifies the deployment and management of open-source databases on Kubernetes. It allows users to deploy and manage popular databases like MySQL, PostgreSQL, and MariaDB on Kubernetes using a declarative approach, and it is one of the ways to consume the NDB service. Kubernetes was built primarily to manage stateless workloads, whereas databases on NDB are stateful and need to be handled in a very specific way, which is why an operator is needed.

The NDB Kubernetes Operator provides automated deployment, scaling, backup, recovery, and monitoring of databases, making it easier to manage databases in a Kubernetes environment. It also integrates with popular DevOps tools like Ansible, Jenkins, and Terraform to automate the deployment and management of databases. It also provides features for data protection, compliance, and security, including data encryption, role-based access control, and audit logging. It allows users to leverage the benefits of Kubernetes, including automatic scaling, rolling updates, self-healing, service discovery, and load balancing.

With the NDB Kubernetes Operator, developers and DevOps teams can focus on the high-level aspects of their applications rather than the low-level details of managing databases, making application deployment and management more scalable and reliable.

Existing Architecture and Problem Statement

Problem Statement: Support provisioning MongoDb via NDB Kubernetes Operator

The current version of the NDB Kubernetes operator has limited support for database provisioning, with PostgreSQL being the only database type currently available. As part of this project, we seek to introduce a new database type - MongoDB, and extend the existing interfaces to enable support for NoSQL databases. Subsequently, we will perform end-to-end testing of the provisioning and deprovisioning processes to ensure their smooth functionality.

NDB Architecture

The Nutanix Database Service architecture is a distributed system that is designed to provide high availability, scalability, and performance for various types of databases, including Microsoft SQL Server, Oracle Database, PostgreSQL, MySQL, and MongoDB. The architecture is built on top of Nutanix's hyperconverged infrastructure, which provides a scalable and flexible platform for running enterprise workloads.

The Nutanix Database Service architecture consists of several layers. At the bottom layer is the Nutanix hyperconverged infrastructure, which provides storage, compute, and networking resources for running the databases. On top of this layer is the Nutanix Acropolis operating system, which provides the core virtualization and management capabilities.

Above the Nutanix Acropolis layer is the Nutanix Era layer, which provides the database lifecycle management capabilities for the Nutanix Database Service. This layer includes the Nutanix Era Manager, which is a centralized management console that provides a single pane of glass for managing the databases across multiple clouds and data centers.

The Nutanix Era layer also includes the Nutanix Era Orchestrator, which is responsible for automating the provisioning, scaling, patching, and backup of the databases. The Orchestrator is designed to work with various databases and provides a declarative model for defining the desired state of the databases.

Finally, at the top layer is the Nutanix Era Application, which is a web-based interface that allows developers and database administrators to easily provision and manage the databases. The Era Application provides a self-service interface for provisioning databases, as well as a suite of tools for monitoring and troubleshooting database performance.

Design & Workflow

One of the biggest bottlenecks in connecting a database provisioned by the operator to an application is automatically sharing the database instance's connection details with the application. While sharing the username and password is relatively simple, sharing the database instance host IP with the application pod can be more complicated as the IP is assigned after the database is provisioned. To address this, there are two potential methods: creating a K8s service that maps to the external NDB service endpoint once the database instance is provisioned or creating a configmap with the IP for the database and referencing the configmap in the application pod.

In terms of making the application wait for the database before starting up, there are three options: handling the wait in the application logic (not recommended), failing the application and hence the pod in case of a database connection failure so that Kubernetes can attempt to restart the application pod until it succeeds, or using init-containers to wait on a Kubernetes pod or service.

Out of these options, creating a K8s service that maps to the external NDB service endpoint is the preferred method and is recommended by Google Cloud's Kubernetes Best Practices. This method provides a decoupling between the database instance and the application pod, and an init-container can wait for the service to be ready and start up the application container only after the service and the underlying database instance on NDB are available. Using these two mechanisms together can enable automatic connectivity between the database instance and the application pod(s).
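As a rough illustration of the init-container idea, the following Go sketch polls an endpoint until it accepts TCP connections and only then lets the application proceed. The names waitUntil and tcpProbe and the timings are assumptions made for this example and are not part of the operator. The demo dials a local listener so that it runs anywhere; an actual init container would dial the DNS name and port of the Kubernetes service instead.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// errNotReady marks a failed readiness probe (hypothetical sentinel error).
var errNotReady = errors.New("database endpoint not ready")

// waitUntil calls probe repeatedly until it succeeds or the timeout elapses.
func waitUntil(probe func() error, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := probe()
		if err == nil {
			return nil
		}
		// Stop if the next attempt would start after the deadline.
		if time.Now().Add(interval).After(deadline) {
			return fmt.Errorf("gave up after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// tcpProbe returns a probe that succeeds once a TCP connection to addr opens.
func tcpProbe(addr string) func() error {
	return func() error {
		conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
		if err != nil {
			return fmt.Errorf("%w: %v", errNotReady, err)
		}
		return conn.Close()
	}
}

func main() {
	// Local listener standing in for the database endpoint.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("could not start demo listener:", err)
		return
	}
	defer ln.Close()

	probe := tcpProbe(ln.Addr().String())
	if err := waitUntil(probe, 2*time.Second, 100*time.Millisecond); err != nil {
		fmt.Println("database not reachable:", err)
		return
	}
	fmt.Println("database reachable, starting application")
}
```

Keeping the probe separate from the retry loop means the same loop could wait on a different readiness check without changes.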

We will be following the Kubernetes Operator Pattern in the project.

The Kubernetes operator pattern is a way to extend the Kubernetes API by defining custom resources and controllers that manage those resources. Operators automate common deployment, scaling, and management tasks for complex applications, such as databases, message queues, and monitoring systems, that require more than just creating a set of pods.

Here are the key components of the Kubernetes operator pattern:

  • Custom resource definition (CRD): A CRD defines a new type of Kubernetes resource that can be managed by an operator. The CRD specifies the API schema and validation rules for the custom resource.

The snippet above shows the configuration of the custom resource we will be provisioning. The server URL and cluster change every time a new Era test drive is created, since a single test drive lasts at most four hours. Applying this configuration is expected to provision a MongoDB database on the Nutanix Database Service.

  • Controller: A controller watches for changes to the custom resources and takes actions to ensure the desired state of the resources is maintained. The controller reconciles the actual state of the resources with the desired state by interacting with the Kubernetes API server and other APIs, such as cloud providers or external systems.
  • Operator business logic: The operator business logic implements the custom behavior required to manage the custom resources. This can include creating and deleting Kubernetes resources, interacting with external systems, performing backups and restores, and handling failure scenarios.
  • Domain-specific language (DSL): A DSL provides an abstraction layer that simplifies the implementation of the operator business logic. DSLs can be created using a range of programming languages or tools, such as Ansible or Helm, to express the specific requirements of the target system.

The Kubernetes operator pattern is a powerful way to automate complex tasks and simplify the management of applications running in Kubernetes clusters. By using custom resources and controllers, operators can help you automate tasks like backups, scaling, and failover, freeing up time and resources to focus on higher-level tasks.
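To make the reconciliation idea concrete, here is a minimal Go sketch of a controller-style loop that compares desired and actual state and acts on the difference. It is not the NDB operator's controller: DatabaseSpec, fakeNDB, and reconcile are hypothetical names, and the in-memory map stands in for calls to an external API.

```go
package main

import (
	"fmt"
	"sort"
)

// DatabaseSpec is the desired state, as a custom resource might declare it
// (hypothetical, trimmed-down shape).
type DatabaseSpec struct {
	Name string
	Type string
}

// fakeNDB stands in for the external database service; it only tracks
// which databases currently exist (name -> type).
type fakeNDB struct {
	databases map[string]string
}

// reconcile drives the actual state toward the desired state and returns
// the actions it took, sorted for deterministic output.
func reconcile(ndb *fakeNDB, desired []DatabaseSpec) []string {
	var actions []string
	wanted := make(map[string]bool, len(desired))

	// Provision anything that is desired but missing.
	for _, spec := range desired {
		wanted[spec.Name] = true
		if _, exists := ndb.databases[spec.Name]; !exists {
			ndb.databases[spec.Name] = spec.Type
			actions = append(actions, "provision "+spec.Name)
		}
	}

	// Deprovision anything that exists but is no longer desired.
	for name := range ndb.databases {
		if !wanted[name] {
			delete(ndb.databases, name)
			actions = append(actions, "deprovision "+name)
		}
	}

	sort.Strings(actions)
	return actions
}

func main() {
	ndb := &fakeNDB{databases: map[string]string{"old-db": "postgres"}}
	desired := []DatabaseSpec{{Name: "orders-db", Type: "mongodb"}}

	fmt.Println(reconcile(ndb, desired)) // [deprovision old-db provision orders-db]
	fmt.Println(reconcile(ndb, desired)) // [] (already converged)
}
```

The second call performs no actions, which illustrates the property a reconciler aims for: running it again on a converged system changes nothing.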

MongoDB provisioning payload:

The image above shows the MongoDB provisioning payload - which will be sent to the Nutanix Database Service client API for provisioning a Mongo database.

Potential Design Patterns, Principles, and Code Refactoring strategies

The codebase could be restructured in an object-oriented fashion with classes. In addition, here are some of the design patterns and principles we could use:

Builder: Objects could be constructed step by step through method chaining rather than by passing every field to a single constructor.
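A minimal sketch of what method chaining could look like in Go is shown below. ProvisionRequest, its fields, and the builder methods are hypothetical and chosen only for illustration; they do not mirror the operator's real request type.

```go
package main

import (
	"errors"
	"fmt"
)

// ProvisionRequest is a hypothetical, trimmed-down request type.
type ProvisionRequest struct {
	DatabaseType     string
	Name             string
	ComputeProfileId string
}

// ProvisionRequestBuilder accumulates fields through chained calls.
type ProvisionRequestBuilder struct {
	req ProvisionRequest
}

// NewProvisionRequestBuilder starts a builder for the given database type.
func NewProvisionRequestBuilder(dbType string) *ProvisionRequestBuilder {
	return &ProvisionRequestBuilder{req: ProvisionRequest{DatabaseType: dbType}}
}

// WithName sets the database name and returns the builder for chaining.
func (b *ProvisionRequestBuilder) WithName(name string) *ProvisionRequestBuilder {
	b.req.Name = name
	return b
}

// WithComputeProfile sets the compute profile and returns the builder.
func (b *ProvisionRequestBuilder) WithComputeProfile(id string) *ProvisionRequestBuilder {
	b.req.ComputeProfileId = id
	return b
}

// Build validates the accumulated fields and returns the finished request.
func (b *ProvisionRequestBuilder) Build() (ProvisionRequest, error) {
	if b.req.Name == "" {
		return ProvisionRequest{}, errors.New("database name is required")
	}
	return b.req, nil
}

func main() {
	req, err := NewProvisionRequestBuilder("mongodb").
		WithName("orders-db").
		WithComputeProfile("compute-profile-1").
		Build()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", req)
}
```

Validation lives in Build, so an incomplete request is rejected in one place rather than at every call site.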

Factory: Instead of using the regular way (such as the ‘new’ keyword) to instantiate an object, a factory method would be used to do the same. This pattern can be used if we are creating a superclass for provisioning databases, and subclasses for provisioning different kinds of databases (MongoDB, MySQL, etc). This is because if we want to add another kind of database to our project, and we are creating new databases by conditional checking, our code could get messy. Thus, a factory method could instead create objects in a smarter way for the different database classes (or modules in our case).

Facade: This pattern could be used for masking the complicated provisioning payloads.

Open/Closed principle: As per this principle, a software entity should be open for extension but closed for modification. We could have an interface with provisioning and deprovisioning methods, and each database type could provide its own implementation of those methods without changing the existing code.

Adapter: The Adapter design pattern could also be used to adapt to different databases.

DRY (Don’t Repeat Yourself): The DRY principle will be applied in several ways in our project:

  • Reusing the provisioning function
  • Using constants instead of variables
  • Extracting all common functionality into reusable modules and functions.

Code changes

The GenerateProvisioningRequest API in api/v1alpha1/ndb_api_helpers.go generates and returns a request for provisioning a database on the Nutanix Database Service. Because this API is a core part of our project’s functionality and purpose, we will refactor it. Here’s how we plan on doing so:


Following the Factory design pattern, let’s first take a look at how our API works. It generates a provisioning request whose payload depends on the database type. An outline of how the MongoDB provisioning payload might look was given earlier. Here is part of how this looks in the code:



The payloads differ primarily in their ActionArguments. Instead of repeating the entire payload for each database type, we followed the DRY (Don't Repeat Yourself) principle and created a function that takes in the database type and returns the corresponding action arguments.

We also added new functions that return the payload when the above function is called.

Through these code changes, we implemented the Factory Design Pattern.
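The idea can be sketched in Go as follows. This is an illustration under stated assumptions, not the project's code: the ActionArgument shape, the function names, and the argument list are hypothetical. Only the default listener ports (5432 for PostgreSQL, 3306 for MySQL, 27017 for MongoDB) are well-known values, and the real payloads carry more arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// ActionArgument is a name/value pair in the provisioning payload.
// The shape is an assumption made for this sketch.
type ActionArgument struct {
	Name  string
	Value string
}

// Per-database argument lists (illustrative; real lists are longer).
func postgresActionArguments() []ActionArgument {
	return []ActionArgument{{Name: "listener_port", Value: "5432"}}
}

func mysqlActionArguments() []ActionArgument {
	return []ActionArgument{{Name: "listener_port", Value: "3306"}}
}

func mongoDBActionArguments() []ActionArgument {
	return []ActionArgument{{Name: "listener_port", Value: "27017"}}
}

// actionArgumentsFor is the factory-style entry point: callers pass a
// database type and never construct type-specific arguments themselves.
func actionArgumentsFor(dbType string) ([]ActionArgument, error) {
	switch strings.ToLower(dbType) {
	case "postgres":
		return postgresActionArguments(), nil
	case "mysql":
		return mysqlActionArguments(), nil
	case "mongodb":
		return mongoDBActionArguments(), nil
	default:
		return nil, fmt.Errorf("unsupported database type %q", dbType)
	}
}

func main() {
	for _, dbType := range []string{"postgres", "mongodb", "oracle"} {
		args, err := actionArgumentsFor(dbType)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %+v\n", dbType, args)
	}
}
```

With this layout, supporting another database type means adding one function and one case, while the shared part of the payload stays untouched.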

The code snippet for unit tests for the above functionality is discussed in the Test section.

Demo


Updated final project Demo video:

https://www.youtube.com/watch?v=CqZ7U9pxZ60

Old video for program 3:

https://www.youtube.com/watch?v=lUarCdA8RP0

Test Plan

Here is the video for test execution:

https://www.youtube.com/watch?v=O9Z9ypvQJDM

Here’s an example test for GenerateProvisioningRequest.

The purpose of the test is to verify that the function generates a valid provisioning request for different database types. The test sets up a test server using the GetServerTestHelper function, creates an ndbclient object, and then iterates over a list of database types (PostgreSQL, MySQL, and MongoDB).

For each database type, the test creates a dbSpec object that defines the desired state of the database instance. It then creates a reqData map that specifies additional provisioning request parameters such as the password and SSH public key. The GenerateProvisioningRequest function is called with the ndbclient, dbSpec, and reqData arguments to generate the actual provisioning request.

The test then checks that the generated request is valid by asserting that the DatabaseType field matches the expected value for the given database type. It also checks that the SoftwareProfileId, SoftwareProfileVersionId, ComputeProfileId, NetworkProfileId, and DbParameterProfileId fields are not empty, and that the TimeMachineInfo.SlaId field is set to NONE_SLA_ID. If any of the assertions fail, the test logs an error message using the t.Errorf or t.Logf functions, indicating which specific assertion failed.
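The checks described above can be approximated by a plain Go function, sketched below. The struct is a hypothetical, trimmed-down stand-in that only borrows the field names mentioned in the text. The value of noneSLAID is a placeholder, since the real NONE_SLA_ID value is not given here, and the database type string "mongodb" is likewise illustrative.

```go
package main

import "fmt"

// noneSLAID is a placeholder for the project's NONE_SLA_ID constant.
const noneSLAID = "placeholder-none-sla-id"

// TimeMachineInfo and ProvisioningRequest are simplified stand-ins that
// reuse the field names from the description above.
type TimeMachineInfo struct {
	SlaId string
}

type ProvisioningRequest struct {
	DatabaseType             string
	SoftwareProfileId        string
	SoftwareProfileVersionId string
	ComputeProfileId         string
	NetworkProfileId         string
	DbParameterProfileId     string
	TimeMachineInfo          TimeMachineInfo
}

// checkRequest returns one message per failed check; an empty result
// means the request passed every check.
func checkRequest(req ProvisioningRequest, wantType string) []string {
	var problems []string
	if req.DatabaseType != wantType {
		problems = append(problems,
			fmt.Sprintf("DatabaseType = %q, want %q", req.DatabaseType, wantType))
	}
	required := []struct{ name, value string }{
		{"SoftwareProfileId", req.SoftwareProfileId},
		{"SoftwareProfileVersionId", req.SoftwareProfileVersionId},
		{"ComputeProfileId", req.ComputeProfileId},
		{"NetworkProfileId", req.NetworkProfileId},
		{"DbParameterProfileId", req.DbParameterProfileId},
	}
	for _, field := range required {
		if field.value == "" {
			problems = append(problems, field.name+" is empty")
		}
	}
	if req.TimeMachineInfo.SlaId != noneSLAID {
		problems = append(problems, "TimeMachineInfo.SlaId is not NONE_SLA_ID")
	}
	return problems
}

func main() {
	req := ProvisioningRequest{
		DatabaseType:             "mongodb",
		SoftwareProfileId:        "sw-1",
		SoftwareProfileVersionId: "sw-v-1",
		NetworkProfileId:         "net-1",
		DbParameterProfileId:     "param-1",
		TimeMachineInfo:          TimeMachineInfo{SlaId: noneSLAID},
	}
	// ComputeProfileId is left empty on purpose to trigger one failure.
	for _, problem := range checkRequest(req, "mongodb") {
		fmt.Println("FAIL:", problem)
	}
}
```

In a test file, each returned message would typically be reported through t.Errorf so that all failed checks are listed in a single run.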


Several other tests cover narrower cases of provisioning request generation. One example is TestGenerateProvisioningRequestReturnsErrorIfDBPasswordIsEmpty, which ensures that the function returns an error when the database password parameter is empty. The test sets up an HTTP server, creates an ndbclient instance, and then generates a provisioning request with an empty database password for three different database types: postgres, mysql, and mongodb. If GenerateProvisioningRequest returns no error, the test fails with an error message.

We have added a unit test for getting the correct Action Arguments on the basis of database type. Following is the code snippet for the same.


Following is the code coverage we achieved after adding the new test.

Github

Mentors

  • Prof. Edward F. Gehringer
  • Krunal Jhaveri
  • Manav Rajvanshi
  • Krishna Saurabh Vankadaru
  • Kartiki Bhandakkar

Contributors

  • Ajith Kumar Vinayakamoorthy Patchaimayil (avinaya)
  • Kartik Soni (ksoni)
  • Nandini Mundra (nmundra)

References

[1] Nutanix. (n.d.). Nutanix Database Service. Retrieved from https://www.nutanix.com/products/database-service

[2] Kubernetes Operator Pattern https://kubernetes.io/docs/concepts/extend-kubernetes/operator

[3] MongoDB https://www.mongodb.com/