CSC/ECE 517 Fall 2019 - M1952. Missing DOM features project

From Expertiza_Wiki

Servo is a modern, high-performance browser engine designed for both application and embedded use. The current version of Servo lacks two DOM features. First, it cannot parse the srcdoc attribute of an iframe tag. Second, HTMLFormElement has no named getter for referencing form elements by their id. The goal of this project is to implement both features in the current version of Servo.

Introduction

Servo

Servo is an experimental browser engine that seeks to create a highly parallel environment, in which components such as rendering, layout, HTML parsing, image decoding, etc. are handled by fine-grained, isolated tasks. It leverages the memory safety properties and concurrency features of the Rust programming language.

Rust

Rust is a multi-paradigm systems programming language developed primarily with a focus on safety and concurrency, originally to make browser engines safe and able to run work concurrently. Rust has been the "most loved programming language" in the Stack Overflow Developer Survey every year since 2016.

DOM

DOM, short for Document Object Model, is the interface through which programs work with web pages. It represents a page as a structured tree so that programs can read and manipulate the page's content, structure, and style. When an HTML page is parsed, the browser builds what is called a DOM tree: every HTML tag becomes a node, and the nesting of the nodes preserves the scope in which each tag was defined.

bitsofcode has an excellent article on the basics of the DOM (reference [9]), and here is a quick snapshot from it: [Left - HTML page content; Right - DOM tree]
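To make the idea of a DOM tree concrete, here is a minimal sketch in Rust. The `Node` type and the helper functions are hypothetical names chosen for illustration; Servo's real node types are far richer.

```rust
/// A node in a tiny, illustrative DOM tree (not Servo's real node types).
#[derive(Debug)]
enum Node {
    /// An element such as `<body>` together with its child nodes.
    Element { tag: String, children: Vec<Node> },
    /// A run of text inside an element.
    Text(String),
}

/// Convenience constructor for an element node.
fn element(tag: &str, children: Vec<Node>) -> Node {
    Node::Element { tag: tag.to_string(), children }
}

/// Convenience constructor for a text node.
fn text(content: &str) -> Node {
    Node::Text(content.to_string())
}

/// Collects tag names in document order (depth-first, parents before children).
fn tag_names(node: &Node, out: &mut Vec<String>) {
    if let Node::Element { tag, children } = node {
        out.push(tag.clone());
        for child in children {
            tag_names(child, out);
        }
    }
}

/// Builds the tree for `<html><body><h1>Title</h1><p>Text</p></body></html>`.
fn sample_tree() -> Node {
    element(
        "html",
        vec![element(
            "body",
            vec![
                element("h1", vec![text("Title")]),
                element("p", vec![text("Text")]),
            ],
        )],
    )
}
```

Walking `sample_tree()` with `tag_names` visits html, body, h1, and p in that order, which mirrors how the nesting of tags in the page becomes parent-child links in the tree.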



Problem Statement

The issue we worked on for our OSS project, listed as the initial steps on the project page, is the srcdoc iframe issue. In HTML, the <iframe> tag lets you embed one web page inside another. This tag has attributes such as src and srcdoc, both of which can be used to embed content, but they are used differently.

To embed a web page using the src attribute, we provide the URL of the web page to be embedded.

To embed a web page using the srcdoc attribute, we provide the HTML content itself, and it works even without <html> and <body> tags. The current Servo version has no mechanism to process the srcdoc attribute, so a page cannot currently be embedded through the srcdoc attribute of an <iframe> tag. This is the issue our project addresses.

Setup

We need to compile and build Servo on our local machines to work on the code and check whether the tests pass. Servo's GitHub page has an excellent guide to setting up the environment (reference [8]). It also lists the additional dependencies that must be installed for each operating system.

Scope

Since the project deals with two issues in Servo, the Servo team has laid out a two-step process to help streamline our work. The srcdoc iframe issue forms the initial steps, while the named getter issue forms the subsequent steps. The project page is listed as reference [7].

The initial steps, for the srcdoc iframe issue, listed on the project page are as follows:

  • Uncomment the srcdoc WebIDL attribute and implement the attribute getter.
  • Add a field to structure LoadData for storing the srcdoc contents when loading a srcdoc iframe.
  • Add a new method to script_thread.rs which loads the special about:srcdoc URL per the specification.
  • Call this new method from handle_new_layout when it's detected that a srcdoc iframe is being loaded.
  • In process_the_iframe_attributes, implement the srcdoc specification so that LoadData initiates a srcdoc load.
  • In attribute_mutated, ensure that changing the srcdoc attribute of an iframe element follows the specification.

Design Pattern

Design patterns are not applicable because our task consists of implementing methods and modifying existing files. However, the Implementation section below explains why certain steps were implemented the way they were.

Implementation

We have worked on the initial steps mentioned on the project page here.

Step 1: Uncomment srcdoc WebIDL attribute and implement the attribute getter

The srcdoc attribute was already declared. We simply uncommented those lines in the file HTMLIFrameElement.webidl.

We implemented the attribute getter in the file htmliframeelement.rs. It obtains the Element, which stores the srcdoc string as one of its attributes, and returns that value. In Rust, a final expression without a trailing semicolon is the function's return value, which is why the last line of the getter has no semicolon.
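As a minimal illustration of a getter that relies on Rust's implicit return, consider the sketch below. The `Element` struct, its `attributes` map, and the `element_with` helper are hypothetical stand-ins and not Servo's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a DOM element; Servo's real `Element` is far richer.
struct Element {
    attributes: HashMap<String, String>,
}

impl Element {
    /// Returns the value of `srcdoc`, or an empty string when the attribute is absent.
    fn srcdoc(&self) -> String {
        // No trailing semicolon: this expression is the function's return value.
        self.attributes.get("srcdoc").cloned().unwrap_or_default()
    }
}

/// Illustrative helper that builds an element from (name, value) pairs.
fn element_with(attrs: &[(&str, &str)]) -> Element {
    Element {
        attributes: attrs
            .iter()
            .map(|(name, value)| (name.to_string(), value.to_string()))
            .collect(),
    }
}
```

Adding a semicolon to the last line of `srcdoc` would turn the expression into a statement, and the function would no longer compile because it would return `()` instead of a `String`.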

Since this attribute getter is called in only one place in the entire codebase, the process_the_iframe_attributes() function, it was suggested that we inline it. We made that change in lines 245 and 246 of our latest commit.

Step 2: Add a field to LoadData for storing the srcdoc contents when loading a srcdoc iframe

We added a public field srcdoc of type String at line 170 of the file lib.rs. We declared srcdoc with type DOMString in the WebIDL file, and we map it to this field on the Rust side. The DOMString type is a wrapper around a Rust String, as can be seen here.
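The shape of this change can be sketched as follows. This is a simplified, hypothetical version of the structure: the real LoadData has many more fields, and the two constructor functions are assumptions added only for illustration.

```rust
/// Simplified, hypothetical load description; the real `LoadData` in Servo
/// carries many more fields.
#[derive(Clone, Debug, PartialEq)]
pub struct LoadData {
    /// The URL to load, kept as a plain string in this sketch.
    pub url: String,
    /// The markup to render when loading a srcdoc iframe; empty otherwise.
    pub srcdoc: String,
}

impl LoadData {
    /// Describes an ordinary URL load with no inline markup (assumed helper).
    pub fn for_url(url: &str) -> LoadData {
        LoadData {
            url: url.to_string(),
            srcdoc: String::new(),
        }
    }

    /// Describes a srcdoc load, which uses the special `about:srcdoc` URL
    /// (assumed helper).
    pub fn for_srcdoc(markup: &str) -> LoadData {
        LoadData {
            url: "about:srcdoc".to_string(),
            srcdoc: markup.to_string(),
        }
    }
}
```

Keeping the markup in the same structure as the URL means that any code that already receives the load description also receives the srcdoc contents, without a separate channel.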

Step 3: Add a new method to script_thread.rs which loads the special about:srcdoc URL per the specification

We defined a method page_load_about_srcdoc in the file script_thread.rs. It is based on the existing method start_page_load_about_blank and handles the loading of an iframe tag that has the srcdoc attribute.

Effectively, we parse the about:srcdoc URL and set it as the URL in the context of the response that we load. Response bodies are delivered to the parser in chunks, which is why we pass the srcdoc content (an HTML string) as a chunk of the response.
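The chunk-based hand-off can be sketched as below. The `ParserContext` type and its methods are hypothetical and only imitate the general shape of a chunked response listener; the body of `page_load_about_srcdoc` here is an illustration, not the code from our pull request.

```rust
/// Hypothetical parser context that accumulates the bytes it is fed.
struct ParserContext {
    /// URL associated with the response being parsed.
    url: String,
    /// Bytes received so far.
    buffer: Vec<u8>,
    /// Set once the end of the response has been signalled.
    finished: bool,
}

impl ParserContext {
    fn new(url: &str) -> ParserContext {
        ParserContext {
            url: url.to_string(),
            buffer: Vec::new(),
            finished: false,
        }
    }

    /// Receives one chunk of the response body.
    fn process_response_chunk(&mut self, chunk: Vec<u8>) {
        self.buffer.extend(chunk);
    }

    /// Signals that no more chunks will arrive.
    fn process_response_eof(&mut self) {
        self.finished = true;
    }

    /// Returns everything received so far as text.
    fn document_text(&self) -> String {
        String::from_utf8_lossy(&self.buffer).into_owned()
    }
}

/// Sketch of a srcdoc load: the whole markup is delivered as a single chunk.
fn page_load_about_srcdoc(srcdoc: &str) -> ParserContext {
    let mut context = ParserContext::new("about:srcdoc");
    context.process_response_chunk(srcdoc.as_bytes().to_vec());
    context.process_response_eof();
    context
}
```

Because the srcdoc markup is already fully available in memory, one chunk followed by an end-of-response signal is enough; a network load would call the chunk method many times instead.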

Step 4: Call this new method from handle_new_layout when it's detected that a srcdoc iframe is being loaded

We defined the method page_load_about_srcdoc in the previous step. The function handle_new_layout is responsible for starting a new load and routing it to the relevant function based on the URL. If the LoadData structure has about:srcdoc in its url field, we call page_load_about_srcdoc with the new load and the srcdoc string stored in LoadData.
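The routing decision described above can be sketched as a match on the URL. The `LoadKind` enum and the `choose_load` function are hypothetical names introduced for this example, and the URL is kept as a plain string for simplicity.

```rust
/// Hypothetical classification of a load, used only in this sketch.
#[derive(Debug, PartialEq)]
enum LoadKind {
    /// The empty `about:blank` document.
    AboutBlank,
    /// A srcdoc document, carrying the markup to parse.
    AboutSrcdoc(String),
    /// Anything else is fetched normally from the given URL.
    Network(String),
}

/// Simplified load description with only the two fields this sketch needs.
struct LoadData {
    url: String,
    srcdoc: String,
}

/// Picks the loading path based on the URL stored in `LoadData`.
fn choose_load(load_data: &LoadData) -> LoadKind {
    match load_data.url.as_str() {
        "about:blank" => LoadKind::AboutBlank,
        // The srcdoc markup travels with the load so it can be parsed later.
        "about:srcdoc" => LoadKind::AboutSrcdoc(load_data.srcdoc.clone()),
        other => LoadKind::Network(other.to_string()),
    }
}
```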

Step 5: In process_the_iframe_attributes, implement the srcdoc specification so that LoadData initiates a srcdoc load

We added the processing of the srcdoc attribute to the process_the_iframe_attributes() function in the file htmliframeelement.rs by following the specification, with help from Josh.

We first check whether the HTML element has the srcdoc attribute. Here the element is the iframe, and self.upcast::<Element>() gives us the iframe viewed as a generic Element so that we can query its attributes. We fetch the document to be shown in the window and store the ID of the load that is still in progress. This is required because the browser's components run in parallel. Next, we create a new LoadData instance and set its srcdoc field to the value fetched by the attribute getter we implemented in Step 1. We then update the browsing context with the new values.
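The decision made in this step can be sketched as follows, under the assumption that attributes are available as a simple map. The `load_data_for_iframe` function is a hypothetical name, and the fallback to about:blank when src is missing or empty reflects our reading of the specification rather than code copied from Servo.

```rust
use std::collections::HashMap;

/// Simplified load description with only the two fields this sketch needs.
struct LoadData {
    url: String,
    srcdoc: String,
}

/// Chooses what an iframe should load from its attributes.
/// `srcdoc` takes priority over `src`, even when both are present.
fn load_data_for_iframe(attributes: &HashMap<String, String>) -> LoadData {
    if let Some(markup) = attributes.get("srcdoc") {
        return LoadData {
            url: "about:srcdoc".to_string(),
            srcdoc: markup.clone(),
        };
    }
    // Without srcdoc, use a non-empty src, otherwise fall back to about:blank.
    let url = attributes
        .get("src")
        .filter(|value| !value.is_empty())
        .cloned()
        .unwrap_or_else(|| "about:blank".to_string());
    LoadData {
        url,
        srcdoc: String::new(),
    }
}
```

Checking srcdoc before src is the important design point: a page that sets both attributes should render the inline markup, with src acting only as a fallback for engines that do not support srcdoc.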

Step 6: In attribute_mutated, ensure that changing the srcdoc attribute of an iframe element follows the specification

We added code in the file htmliframeelement.rs that calls the process_the_iframe_attributes method whenever the srcdoc attribute of an iframe element changes.
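A minimal sketch of this hook is shown below. The `IFrame` struct, its fields, and the method signatures are hypothetical; the counter exists only so that the example can show when reprocessing happens.

```rust
/// Hypothetical iframe model used only for this sketch.
struct IFrame {
    /// Current value of the `srcdoc` attribute, if present.
    srcdoc: Option<String>,
    /// How many times the attributes have been reprocessed.
    reprocess_count: u32,
}

impl IFrame {
    fn new() -> IFrame {
        IFrame {
            srcdoc: None,
            reprocess_count: 0,
        }
    }

    /// Stand-in for the real attribute processing; here it only counts calls.
    fn process_the_iframe_attributes(&mut self) {
        self.reprocess_count += 1;
    }

    /// Called when an attribute is set, changed, or removed (`None`).
    fn attribute_mutated(&mut self, name: &str, new_value: Option<&str>) {
        match name {
            "srcdoc" => {
                self.srcdoc = new_value.map(|value| value.to_string());
                // Any change to srcdoc triggers reprocessing.
                self.process_the_iframe_attributes();
            }
            // Other attributes are ignored in this sketch.
            _ => {}
        }
    }
}
```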

Testing

We have completed all the initial steps. However, our pull request is still pending because of issues in the tests rather than in our implementation, as explained below. We test whether the engine can process an iframe tag with srcdoc using the command ./mach test-wpt tests/wpt/web-platform-tests/html/semantics/embedded-content/the-iframe-element/srcdoc_process_attributes.html and the result that we get can be seen in the image below:

File:Test run result.png

We discussed the failure with Josh. The primary issue is that the test that checks for successful srcdoc processing was not written as per the specification, so a srcdoc iframe's load event is fired before the srcdoc attribute is modified. In other words, the engine tries to load the srcdoc iframe before its attribute has been modified and parsed, which causes the error. The original author of the test fixed this in another PR. However, Josh mentioned here that the test will continue to fail until it is synced with upstream, which has not happened yet. He also left us some refactoring comments, which we have addressed in our latest commit.

 ./mach check
 ./mach test-tidy
 ./mach build --dev --verbose

We executed these commands to check that the code compiles without errors, follows the style guidelines, and builds Servo successfully.

To test the code changes, follow the steps as outlined.

  1. Install the prerequisites for Servo as described in the setup guide (reference [8])
  2. Clone our GitHub repo: git clone https://github.com/jaymodi98/servo
  3. Navigate to servo's directory: cd servo
  4. Checkout the git branch iframe-srcdoc: git checkout iframe-srcdoc
  5. Check if code follows style guidelines: ./mach test-tidy
  6. Check if code has no compilation errors: ./mach check
  7. Check if servo is built successfully: ./mach build --dev --verbose

You will see that the servo build is successful and no errors are reported.

Pull Request

Here is the link to our pull request. We have attached the code snippets for the changes made in files in the PR.

References

[1] https://servo.org/
[2] https://bocoup.com/blog/third-party-javascript-development-future#iframe-srcdoc
[3] https://www.w3schools.com/tags/tag_iframe.asp
[4] https://html.spec.whatwg.org/multipage/forms.html#dom-form-nameditem
[5] https://en.wikipedia.org/wiki/Servo_(software)
[6] https://en.wikipedia.org/wiki/Rust_(programming_language)
[7] https://github.com/servo/servo/wiki/Missing-DOM-features-project
[8] https://github.com/servo/servo/blob/master/README.md#setting-up-your-environment
[9] https://bitsofco.de/what-exactly-is-the-dom/