CSC/ECE 517 Fall 2023 - G2352. Add GitLab support for using GraphQL to query repository information

From Expertiza_Wiki

This wiki page describes the changes made for project G2352, "Add GitLab support for using GraphQL to query repository information."

GitLab

GitLab is a collaborative platform for the entire DevOps lifecycle, not only a version control system. It gives a team a single application in which to collaborate on code, manage projects, automate workflows, and deploy applications. GitLab hosts source code repositories and also provides tools for project management, continuous integration (CI), and continuous delivery (CD), so that one platform covers the path from an idea to a deployed application.

Key features

1. Version Control System (VCS): GitLab provides a powerful and user-friendly version control system based on Git. It enables teams to manage and track changes to their codebase efficiently. Developers can collaborate seamlessly, branching and merging with ease.

2. Issue Tracking and Project Management: GitLab's project management tools go beyond just code. You can track issues, plan sprints, and manage your entire project's lifecycle in a centralized space. This simplifies collaboration, ensuring everyone is on the same page from ideation to deployment.

3. Integrated CI/CD: One of GitLab's standout features is its integrated Continuous Integration and Continuous Delivery capabilities. With built-in CI/CD pipelines, you can automate testing and deployment, ensuring that your applications are always in a deployable state.

4. Code Review and Collaboration: Facilitate collaboration among your team members with GitLab's robust code review features. Conduct efficient peer reviews, comment on specific lines of code, and ensure the quality of your codebase before merging changes.

5. Container Registry: GitLab includes a built-in container registry, allowing you to store, manage, and deploy Docker images directly from your GitLab instance. This streamlines the process of building and deploying containerized applications.

6. Collaboration and Transparency: GitLab promotes collaboration and transparency by providing a centralized hub for your development efforts. Teams can communicate, contribute, and track progress within the same environment. The platform's open and accessible nature fosters a culture of openness and inclusivity.

7. Community and Documentation: The GitLab community is vibrant and diverse. From beginners to seasoned developers, everyone can find resources, ask questions, and contribute to the platform's improvement. Extensive documentation and guides make it easy for users to get started and make the most of GitLab's features.

In essence, GitLab isn't just a tool; it's a platform that empowers teams to streamline their development processes. It unifies version control, project management, CI/CD, and more, offering a comprehensive solution for modern software development. With GitLab, your team can transform ideas into reality, collaborating efficiently at every step of the development journey.

GraphQL

GraphQL is a revolutionary query language for APIs that challenges the traditional RESTful approach to fetching and updating data. Developed by Facebook, GraphQL provides a more efficient and flexible alternative for building APIs. Unlike REST, where clients often receive more data than needed, GraphQL allows clients to specify the exact data requirements, minimizing over-fetching and under-fetching issues. This adaptability makes GraphQL a powerful tool for developers seeking a more precise and streamlined approach to data retrieval.

At the heart of GraphQL is its schema and query language. The schema defines the types of data that can be queried and the relationships between them. Clients can then send queries to the GraphQL server, specifying the exact data they need. This level of granularity not only enhances performance but also simplifies the development process. Additionally, GraphQL supports real-time data with subscriptions, enabling applications to receive live updates as soon as data changes on the server.
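As a rough illustration of the query language, the sketch below builds the JSON body that a client would typically POST to a GraphQL server: a query document naming only the fields it wants, plus a variables object. The `project` field and `fullPath` argument mirror the GitLab schema used later on this page; the helper name `build_payload` and the sample path are assumptions made for the example.

```python
import json

# A query document: the client lists exactly the fields it wants back.
QUERY = """
query ($fullPath: ID!) {
  project(fullPath: $fullPath) {
    name
    createdAt
  }
}
"""


def build_payload(query: str, variables: dict) -> str:
    """Serialize a GraphQL request body (hypothetical helper for illustration)."""
    return json.dumps({"query": query, "variables": variables})


# "group/project" is a placeholder path, not a real repository.
payload = build_payload(QUERY, {"fullPath": "group/project"})
print(payload)
```

Only `name` and `createdAt` would come back in the response, which is the over-fetching remedy described above.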

GitLab GraphQL API

Origin and Development

GitLab's GraphQL API is a product of GitLab Inc. GitLab began in 2011 as an open-source project by Dmitriy Zaporozhets and Valery Sizov, with the aim of providing a comprehensive platform for the entire DevOps lifecycle. Over the years, GitLab has evolved from a source code management tool to an integrated solution that includes project management, continuous integration, continuous delivery, and more. The GraphQL API for GitLab was introduced to address the need for a flexible and efficient way to interact with GitLab's vast array of features. GraphQL itself was developed by Facebook in 2012 and open-sourced in 2015. It gained popularity because it lets clients request exactly the data they need, reducing the over-fetching and under-fetching common in traditional RESTful APIs.

Current Usage and Adoption

GitLab's GraphQL API has become an integral part of the GitLab ecosystem, providing developers with a powerful and versatile interface for accessing and manipulating GitLab data. It enables users to perform a wide range of operations, from querying project details to managing issues, branches, and merge requests. Many organizations and projects use GitLab's GraphQL API to integrate GitLab functionality into their workflows. This includes automating processes, building custom tools, and creating integrations with third-party services. GitLab continues to refine and expand its GraphQL API in response to user feedback and evolving industry practices.

Features of the API

1. Flexible Data Retrieval: GraphQL allows clients to specify the exact data they need, reducing the amount of unnecessary data transferred over the network. This flexibility is particularly beneficial for optimizing performance in applications.

2. Single Request for Multiple Resources: With GraphQL, you can request multiple resources in a single query, reducing the number of requests required to fetch data. This can lead to more efficient and faster interactions with the API.

3. Introspection and Documentation: GraphQL's introspective nature enables clients to explore the API schema and understand available types, queries, and mutations. This self-documenting feature simplifies API discovery and usage.

4. Real-time Data with Subscriptions: GitLab's GraphQL API supports real-time data updates through subscriptions. This is crucial for applications that require live updates, such as collaborative editing or real-time monitoring.

5. Community and Ecosystem: GitLab has a vibrant community that actively contributes to the GraphQL API's growth. The community-driven aspect ensures a diverse set of tools, libraries, and resources are available to developers.
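Introspection (feature 3 above) can be sketched as follows. The query uses the standard `__schema` introspection field defined by the GraphQL specification. The response shown is a hand-written, truncated sample rather than real GitLab output, and `root_query_names` is a hypothetical helper.

```python
# Standard introspection query: ask the schema which root-level query fields exist.
INTROSPECTION_QUERY = """
{
  __schema {
    queryType {
      fields {
        name
      }
    }
  }
}
"""

# Hand-written, truncated sample of the response shape (not real server output).
sample_response = {
    "data": {
        "__schema": {
            "queryType": {
                "fields": [{"name": "project"}, {"name": "currentUser"}]
            }
        }
    }
}


def root_query_names(response: dict) -> list:
    """Return the sorted names of the root query fields in an introspection response."""
    fields = response["data"]["__schema"]["queryType"]["fields"]
    return sorted(field["name"] for field in fields)


print(root_query_names(sample_response))
```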

Available Queries

At the root level, the GraphQL API offers the following queries:

Query         Description
project       Project information and many of its associations, such as issues and merge requests.
group         Basic group information and epics.
user          Information about a particular user.
namespace     The namespace and the projects in it.
currentUser   Information about the authenticated user.
users         Information about a collection of users.
metaData      Metadata about GitLab and the GraphQL API.
snippets      Snippets visible to the authenticated user.
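A minimal sketch of how one of these root-level queries might be sent, using only the Python standard library. It assumes the public GitLab.com endpoint (self-managed instances use their own host) and a personal access token passed as a bearer token; `build_request` and the token value are placeholders introduced for the example. The network call is left commented out so the sketch runs without credentials.

```python
import json
import urllib.request

# Endpoint for GitLab.com; a self-managed instance would use its own host.
GITLAB_GRAPHQL_URL = "https://gitlab.com/api/graphql"


def build_request(query: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a GraphQL query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GITLAB_GRAPHQL_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# "YOUR_TOKEN" is a placeholder, not a working credential.
request = build_request("{ currentUser { name username } }", "YOUR_TOKEN")
print(request.get_method(), request.full_url)

# To actually send the request (requires network access and a real token):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```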


GitLab's GraphQL API represents a significant step forward in providing developers with a flexible, efficient, and powerful interface for interacting with GitLab's extensive features. While it comes with its set of challenges, the benefits of efficient data retrieval, real-time capabilities, and a thriving community make GitLab's GraphQL API a valuable asset for modern software development and DevOps workflows. As the landscape continues to evolve, GitLab remains committed to refining and expanding its GraphQL API to meet the evolving needs of the development community.

Problem Statement

Phase 1

What needs to be done:

  • First, understand the given code. The README.md file explains how the code works, and running the demo.py file shows the response.
  • Find the corresponding parts in GitLab.
  • Develop GraphQL queries to query the same or similar information from GitLab with its GraphQL API.

Phase 2

Design and implement a user-friendly frontend Graphical User Interface (GUI) using React with TypeScript. This GUI will serve as the interface for presenting information retrieved from both GitHub and GitLab repositories. The backend will be powered by Flask, providing APIs to fetch and deliver the relevant data to the frontend.

Workflow Diagram

Phase 1

Phase 2

Design Pattern

This project is dedicated to retrieving GitLab data about the users who contribute to a given project. Our objective was to fetch a comprehensive array of user-related information, including:

  • Project contributors
  • Each contributor's contributions
  • Each contributor's commit information

Our goal is to get a comprehensive picture of a user's contributions and activities inside the GitLab ecosystem by streamlining the extraction process and utilizing GitLab's robust GraphQL APIs.

Frontend GUI (React + TypeScript):

  • Design a responsive and intuitive user interface to display information from GitHub and GitLab repositories.
  • Implement functionalities for users to interact with the displayed data, including searching, sorting, and filtering.
  • Ensure a seamless user experience with a clean and visually appealing design.

Backend API (Flask):

  • Develop Flask APIs to communicate with GitHub and GitLab repositories, retrieving necessary information.
  • Design endpoints for different types of data requests from the queries written.
  • Ensure security measures, including authentication, to protect sensitive data.

Workflow:

  • The React frontend communicates with the Flask backend through well-defined APIs.
  • The Flask backend interacts with GitHub and GitLab APIs to fetch repository information.
  • The retrieved data is processed and presented in an organized manner on the React frontend.

File(s) Modified

The following files have been modified to incorporate the new GitLab accessing queries.

PR raised for the Phase-1 changes

A whole new folder for accommodating the UI changes is created with the necessary files and packages included within.

PR raised for the Phase-2 changes

- react_frontend

GitHub repository link: https://github.ncsu.edu/ttodkar/Gitlab_query

Solutions/Details of Changes Made

Phase 1 solution approach

1. Identify GraphQL Queries: Focused on contributors, their contributions, and commit details. Developed GraphQL queries that precisely retrieve the required information. Tested these queries using tools like Postman to ensure they returned accurate and expected results.

2. Test Queries with Postman: Used Postman to verify the correctness of the GraphQL queries. Confirmed that the queries provide the necessary data and that the responses align with expectations. Iterated and refined the queries as needed until they consistently returned accurate results.

3. Implement Queries in Python: Created separate Python files for each of the GraphQL queries identified earlier. Organized the project folder to include these files for clarity and maintainability. Wrote Python code that sends the GraphQL queries to the GitLab API, retrieves the responses, and stores them in a usable format.


Repository Contributors Query:

class ProjectContributorsQuery(Query):
   """Fetch the author (name and id) of every commit in the project's merge requests."""
   def __init__(self):
       super().__init__(
           fields=[
               QueryNode(
                   "project",
                   args={"fullPath": "$repo_name"},
                   fields=[
                       QueryNode("mergeRequests",fields=[
                               QueryNode("nodes",fields=[
                                       QueryNode("commits",fields=[
                                               QueryNode("nodes",fields=[
                                                       QueryNode("author",fields=[
                                                               "name",
                                                               "id",
                                                           ])
                                                   ])
                                           ])
                                   ])
                           ])
                   ])
           ])
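To make the nesting above easier to read, the sketch below shows how such a tree could be rendered into GraphQL text. The `QueryNode` class here is a simplified stand-in written for this illustration; it is an assumption about the general idea, not the project's actual implementation, and the `render` method is hypothetical.

```python
class QueryNode:
    """Simplified stand-in for the project's QueryNode (an assumption, not the real class)."""

    def __init__(self, name, args=None, fields=None):
        self.name = name
        self.args = args or {}
        self.fields = fields or []

    def render(self) -> str:
        # Arguments become `name(key: value, ...)`. Values are emitted as-is,
        # so "$repo_name" stays a placeholder to be substituted later.
        args = ", ".join(f"{key}: {value}" for key, value in self.args.items())
        head = f"{self.name}({args})" if args else self.name
        if not self.fields:
            return head
        # Children are either plain field names or nested QueryNode objects.
        body = " ".join(
            child.render() if isinstance(child, QueryNode) else child
            for child in self.fields
        )
        return f"{head} {{ {body} }}"


tree = QueryNode(
    "project",
    args={"fullPath": "$repo_name"},
    fields=[
        QueryNode("mergeRequests", fields=[
            QueryNode("nodes", fields=[
                QueryNode("commits", fields=[
                    QueryNode("nodes", fields=[
                        QueryNode("author", fields=["name", "id"]),
                    ]),
                ]),
            ]),
        ]),
    ],
)

print("query { " + tree.render() + " }")
```

Each `QueryNode` corresponds to one level of braces in the resulting query, which is why the contributors query is six levels deep.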


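Repository Commits Query:

class ProjectQuery(Query):
   """Fetch the project's creation date and, per merge request, diff statistics and commit details."""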
   def __init__(self):
       super().__init__(
           fields=[
               QueryNode(
                   "project",
                   args={"fullPath": "$repo_name"},
                   fields=[
                       "createdAt",
                       QueryNode(
                           "mergeRequests",
                           fields=[
                               "count",
                               QueryNode(
                                   "nodes",
                                   fields=[
                                       QueryNode(
                                           "diffStatsSummary",
                                           fields=[
                                               "additions",
                                               "deletions",
                                               "fileCount",
                                           ],
                                       ),
                                       "commitCount",
                                       QueryNode(
                                           "commits",
                                           fields=[
                                               QueryNode(
                                                   "nodes",
                                                   fields=[
                                                       "committedDate",
                                                       "message",
                                                       QueryNode(
                                                           "author",
                                                           fields=[
                                                               "email",
                                                               "name",
                                                               "username",
                                                           ],
                                                       ),
                                                   ],
                                               ),
                                           ],
                                       ),
                                   ],
                               ),
                           ],
                       ),
                   ],
               ),
           ]
       )
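A response to the query above nests commits inside merge requests, so some post-processing is usually needed. The following sketch tallies commits per author from a response of that shape. The sample data is invented for illustration, `commits_per_author` is a hypothetical helper, and the handling of a missing `author` is an assumption about commits that are not linked to a GitLab user.

```python
from collections import Counter

# Invented sample that follows the shape requested by ProjectQuery.
sample_response = {
    "data": {
        "project": {
            "createdAt": "2023-09-01T00:00:00Z",
            "mergeRequests": {
                "count": 2,
                "nodes": [
                    {"commitCount": 2, "commits": {"nodes": [
                        {"message": "Add query", "author": {"username": "alice"}},
                        {"message": "Fix typo", "author": {"username": "bob"}},
                    ]}},
                    {"commitCount": 2, "commits": {"nodes": [
                        {"message": "Add tests", "author": {"username": "alice"}},
                        {"message": "Imported commit", "author": None},
                    ]}},
                ],
            },
        }
    }
}


def commits_per_author(response: dict) -> dict:
    """Count commits per author username across all merge requests."""
    counts = Counter()
    for merge_request in response["data"]["project"]["mergeRequests"]["nodes"]:
        for commit in merge_request["commits"]["nodes"]:
            # Assumed case: the author may be null, so fall back to "unknown".
            author = commit.get("author") or {}
            counts[author.get("username", "unknown")] += 1
    return dict(counts)


print(commits_per_author(sample_response))
```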


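Repository Contributor's Contributions Query:

class ProjectContributorsContribution(Query):
   """Fetch the authenticated user's merge requests in the project, with diff statistics and commits."""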
   def __init__(self):
       super().__init__(
           fields=[
               QueryNode(
                   "currentUser",
                   fields=[
                       QueryNode(
                           "authoredMergeRequests",
                           args={"projectPath": "$repo_name"},
                           fields=[
                               "count",
                               QueryNode(
                                   "nodes",
                                   fields=[
                                       QueryNode(
                                           "diffStatsSummary",
                                           fields=[
                                               "additions",
                                               "deletions",
                                               "fileCount",
                                           ],
                                       ),
                                       QueryNode(
                                           "commits",
                                           fields=[
                                               QueryNode(
                                                   "nodes",
                                                   fields=[
                                                       "committedDate",
                                                       "message",
                                                   ],
                                               ),
                                           ],
                                       ),
                                   ],
                               ),
                           ],
                       ),
                       "id",
                   ],
               ),
           ],
       )
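The raw response to this query is deeply nested. One way to reduce it to a few totals is sketched below. The sample response is invented to match the fields requested above, and `summarize_contributions` is a hypothetical helper rather than the project's own formatting code.

```python
# Invented sample that follows the shape requested by ProjectContributorsContribution.
sample_response = {
    "data": {
        "currentUser": {
            "id": "gid://gitlab/User/1",
            "authoredMergeRequests": {
                "count": 2,
                "nodes": [
                    {
                        "diffStatsSummary": {"additions": 10, "deletions": 2, "fileCount": 3},
                        "commits": {"nodes": [
                            {"committedDate": "2023-10-01T12:00:00Z", "message": "Add query"},
                        ]},
                    },
                    {
                        "diffStatsSummary": {"additions": 5, "deletions": 1, "fileCount": 1},
                        "commits": {"nodes": [
                            {"committedDate": "2023-10-02T09:30:00Z", "message": "Fix bug"},
                            {"committedDate": "2023-10-02T10:00:00Z", "message": "Add test"},
                        ]},
                    },
                ],
            },
        }
    }
}


def summarize_contributions(response: dict) -> dict:
    """Total the additions, deletions, and commits over the user's merge requests."""
    merge_requests = response["data"]["currentUser"]["authoredMergeRequests"]
    summary = {
        "merge_requests": merge_requests["count"],
        "additions": 0,
        "deletions": 0,
        "commits": 0,
    }
    for merge_request in merge_requests["nodes"]:
        stats = merge_request["diffStatsSummary"]
        summary["additions"] += stats["additions"]
        summary["deletions"] += stats["deletions"]
        summary["commits"] += len(merge_request["commits"]["nodes"])
    return summary


print(summarize_contributions(sample_response))
```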

4. Format and Filter Results: Analyzed the raw GraphQL responses and identified the specific information needed for the project. Modified the Python code to format and filter the results accordingly. Implemented changes to display only the relevant information in the desired format, ensuring clarity and user-friendliness.

5. Integration with demo.py: Integrated the newly created Python files containing GraphQL queries into the larger project, specifically in the demo.py file. Updated demo.py to include function calls for both GitHub and GitLab queries, coordinating the two within the project.

6. User Input Integration: Modified the code to accept user input, specifically the project path saved on their GitLab account. This ensures that the queries are tailored to the user's specific repository. Implemented a mechanism to pass the user's project path as input for the GitLab query, allowing dynamic and personalized retrieval of results.

7. Documentation: Documented the changes made, including details about the implemented queries, modifications to existing code, and the overall functionality. Provided clear instructions on how users can input their project path and retrieve the desired information using the updated system.

8. Version Control: Committed the changes to version control to maintain a record of the modifications made to the project. Ensured that the version control system reflects the latest updates and improvements.
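The user-input step (step 6) can be sketched as follows. This assumes the `$repo_name` placeholder seen in the queries is filled in by plain string substitution, which may differ from how the project's client actually performs it. `build_query` and `parse_args` are hypothetical names, and the query is shortened for the example.

```python
import argparse
from string import Template

# Shortened query with the same "$repo_name" placeholder used in the classes above.
QUERY_TEMPLATE = Template(
    'query { project(fullPath: "$repo_name") { name createdAt } }'
)


def build_query(repo_name: str) -> str:
    """Substitute the user's project path into the query text."""
    return QUERY_TEMPLATE.substitute(repo_name=repo_name)


def parse_args(argv=None) -> argparse.Namespace:
    """Read the project path from the command line (or from `argv` if given)."""
    parser = argparse.ArgumentParser(description="Query a GitLab project")
    parser.add_argument("repo_name", help="full project path, e.g. group/project")
    return parser.parse_args(argv)


# An explicit list is passed here so the sketch runs as-is;
# pass None to read the real command line instead.
args = parse_args(["group/project"])
print(build_query(args.repo_name))
```

Plain substitution does not escape quotes in the input, so passing the path as a GraphQL variable is generally the safer design when input is untrusted.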

By following these steps, we ensured a systematic and well-documented process for implementing and integrating GraphQL queries into the GitLab project, enhancing its functionality and usability for end-users.

Phase 2 solution approach

React Frontend Workflow:

1. User Input: Users input their GitHub username, repository name, and user ID on the webpage.

2. Communication with Backend: The entered information is sent to the backend server for processing.

3. GraphQL Query Building: The backend server, equipped with a GraphQL query builder, creates a specific query based on the user input.

4. GraphQL Request: The constructed GraphQL query is sent to GitHub to access the desired metrics.

5. GitHub API Interaction: GitHub processes the request and sends back the response in JSON format.

from flask import request

# `app` (the Flask application), `client` (the GraphQL client), and the query
# classes used below are defined elsewhere in the project.


@app.route("/api/github/repositorycommits")
def fetch_github_commits():
    username = request.args.get("owner")
    repo_name = request.args.get("repoName")
    page_size = request.args.get("pgSize")
    # The client is iterated here, and only the first response is returned.
    for response in client.execute(
        query=RepositoryCommits(),
        substitutions={"owner": username, "repoName": repo_name, "pgSize": page_size},
    ):
        return response


@app.route("/api/github/repositorycontributorscontribution")
def fetch_github_commit():
    username = request.args.get("owner")
    repo_name = request.args.get("repoName")
    uid = request.args.get("uid")
    # The user ID is passed as a nested object, as in the original call.
    response = client.execute(
        query=RepositoryContributorsContribution(),
        substitutions={"owner": username, "repoName": repo_name, "id": {"id": uid}},
    )
    return response


@app.route("/api/github/repositorycontributors")
def fetch_github_contributors():
    username = request.args.get("owner")
    repo_name = request.args.get("repoName")
    response = client.execute(
        query=RepositoryContributors(),
        substitutions={"owner": username, "repoName": repo_name},
    )
    return response
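For reference, a caller could build the URL for the commits endpoint as sketched below. The route and parameter names (`owner`, `repoName`, `pgSize`) come from the code above; the base URL assumes Flask's default development address, and `commits_url` plus the sample values are made up for the example.

```python
from urllib.parse import urlencode

# Assumed base URL: Flask's development server listens on port 5000 by default.
BASE_URL = "http://localhost:5000"


def commits_url(owner: str, repo_name: str, page_size: int) -> str:
    """Build the request URL for the repository commits endpoint."""
    params = urlencode({"owner": owner, "repoName": repo_name, "pgSize": page_size})
    return f"{BASE_URL}/api/github/repositorycommits?{params}"


# Placeholder owner and repository names.
print(commits_url("octocat", "hello-world", 10))
```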


6. Response Handling: The backend server receives and interprets the JSON response from GitHub.

7. Rendering on Webpage: The backend server returns the relevant information from the GitHub response to the React frontend, which renders it on the webpage.

8. User Interaction: Users can now interact with the webpage to explore and visualize the requested GitHub metrics.

The React frontend serves as the user interface, accepting user input and communicating with the backend server. The backend server, utilizing GraphQL, communicates with the GitHub API to fetch the specified metrics, which the frontend then displays on the webpage for user interaction.

YouTube Video Link

Video Link: https://www.youtube.com/watch?v=e6OHvGunuxU

Screenshots

Home Page


User Login


Repository commits


Contributors Contributions


Repository Contributors

Testing

Manual Testing

Passing test cases:

  • After entering the username and repository name, the user can retrieve the User Login details.
  • After entering the user ID, the repository commits and repository contributors are rendered on their respective webpages.
  • The webpages respond dynamically, fetching real-time data.
  • When no parameters are given, no information is displayed and the webpages do not crash.

Failing test cases:

  • If the repository is a fork, its contributors must be traced back to the origin repository. The repositoryContributors query cannot fetch them, so the result fails to render and the webpage crashes.
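One possible mitigation for this kind of crash, offered only as a sketch and not as the team's fix, is to read nested response data defensively so that a missing level yields an empty result instead of an exception. The response shapes and key names below are hypothetical, as is the `dig` helper.

```python
def dig(data, *keys, default=None):
    """Walk nested dicts, returning `default` if any level is missing or null."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or current.get(key) is None:
            return default
        current = current[key]
    return current


# Hypothetical response shapes; the real key names may differ.
forked = {"data": {"repository": None}}
normal = {"data": {"repository": {"contributors": [{"login": "alice"}]}}}

print(dig(forked, "data", "repository", "contributors", default=[]))
print(dig(normal, "data", "repository", "contributors", default=[]))
```

With an empty list as the fallback, the page could show "no contributors found" rather than failing to render.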

Team

Mentor

  • Jialin Cui

Team Members

  • Chandana Mallu (cmallu)
  • Tanishq Virendrabhai Todkar (ttodkar)
  • Tripurashree Mysore Manjunatha (tmysore)