CSC/ECE 517 Fall 2019 - M1952. Missing DOM features project

From Expertiza_Wiki
Revision as of 04:17, 12 November 2019 by Jmodi3 (talk | contribs) (→‎Scope)

Servo is a modern, high-performance browser engine designed for both application and embedded use. The current version of Servo has a couple of issues. The first issue is the absence of the capability to parse the srcdoc attribute in an iframe tag in the HTML code. The second issue is that Servo does not have a named getter implemented in HTMLFormElement to reference the form elements by their id. The goal of this project is to implement these two functionalities in the current version of Servo.


Introduction

Servo

Servo is an experimental browser engine that seeks to create a highly parallel environment, in which components such as rendering, layout, HTML parsing, image decoding, etc. are handled by fine-grained, isolated tasks. It leverages the memory safety properties and concurrency features of the Rust programming language.

Rust

Rust is a multi-paradigm systems programming language developed primarily with the goal of making software such as browsers safe and concurrent. Rust has been the "most loved programming language" in the Stack Overflow Developer Survey every year since 2016.

DOM

DOM, short for Document Object Model, is an interface that defines how programs treat web pages. It represents a web page as a structured tree so that programs can read and manipulate the page's content, structure, and style. When an HTML page is parsed, the browser builds what is called a DOM tree, which lists all the HTML tags as nodes while preserving the nesting under which these tags are defined.

bitsofcode has an excellent read on the basics of DOM and here is a quick snapshot from the same: [Left - HTML page content; Right - DOM tree]



Problem Statement

OSS Project

We have worked on the initial steps listed on the project page, which cover the srcdoc iframe issue. In HTML, the <iframe> tag allows you to embed a web page into another web page. This tag has attributes like src and srcdoc which can be used to embed web pages. However, the two attributes are used differently.

To embed a web page using src attribute, we need to provide a URL of the web page to be embedded. This works in Servo.

To embed a web page using srcdoc attribute, all we need to provide is just HTML content and it works even without adding <html> and <body> tags. This does not work in Servo. We have worked upon this issue for our OSS project.

Final Project

We are working on the subsequent steps listed on the project page, which cover the named getter issue. Servo is unable to submit forms on web pages since it is not able to fetch the form elements by their ID. In terms of DOM and HTML, HTMLFormElement is the interface to the <form> tag. Hence, we need to implement the named element getter in the HTMLFormElement files.
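To make the goal concrete, here is a minimal sketch of what a named lookup on a form is expected to do. It is an illustration under simplifying assumptions, not Servo's code: Control, NamedItem, and named_getter are hypothetical names, and the fallbacks the specification defines for img elements and for previously used names are left out.

```rust
// Simplified stand-in for a form control; not Servo's real type.
#[derive(Debug, Clone, PartialEq)]
struct Control {
    id: Option<String>,
    name: Option<String>,
}

// What a named lookup on a form can produce (hypothetical type).
#[derive(Debug, PartialEq)]
enum NamedItem<'a> {
    // Exactly one control matched the key.
    Single(&'a Control),
    // Several controls share the key, e.g. a group of radio buttons.
    Many(Vec<&'a Control>),
}

// Collect the controls whose id or name equals `key`, keeping the
// order in which they appear in `controls`.
fn named_getter<'a>(controls: &'a [Control], key: &str) -> Option<NamedItem<'a>> {
    let matches: Vec<&Control> = controls
        .iter()
        .filter(|c| c.id.as_deref() == Some(key) || c.name.as_deref() == Some(key))
        .collect();
    match matches.len() {
        0 => None,
        1 => Some(NamedItem::Single(matches[0])),
        _ => Some(NamedItem::Many(matches)),
    }
}
```

With this shape, a lookup for a unique id yields a single control, while a lookup for a name shared by several controls yields the whole group.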

Note to Reviewers

1. Our Mozilla project consists of 2 issues: srcdoc iframe & named getter. For our OSS project, we were supposed to work only on the srcdoc iframe issue (initial steps), so kindly keep this in mind while reviewing. The second issue named getter (subsequent steps) will be worked on for our final project.

2. You will find that our code doesn't contain many comments. The current maintainer of the project advised us to remove comments that merely restate the code, so we removed them to follow Servo's formatting and style guidelines.

3. The second issue is obviously open as we have not yet started working upon it. The first issue is open since the test cases are faulty on Servo's end. Once those are corrected and our tests pass, our code will be merged onto the master branch and the issue will be closed.

For any other doubts and more information, kindly refer to the comments thread in our PR.

Setup

We need to compile and build Servo on our local machines to work on the code and check whether the tests pass. Servo's GitHub page has an excellent starting guide to set up the environment for Servo here. It also mentions the other dependencies that need to be installed specific to an operating system.

Scope

Since the project deals with solving two issues in Servo, a 2-step process has been listed by the Servo team to help streamline our work. The srcdoc iframe issue is to be done as initial steps while the named getter issue is to be worked upon as subsequent steps. The project page can be found here.

OSS Project

The initial steps, for the srcdoc iframe issue, listed on the project page are as follows:

  • Uncomment the srcdoc WebIDL attribute and implement the attribute getter.
  • Add a field to structure LoadData for storing the srcdoc contents when loading a srcdoc iframe.
  • Add a new method to script_thread.rs which loads the special about:srcdoc URL per the specification.
  • Call this new method from handle_new_layout when it's detected that a srcdoc iframe is being loaded.
  • In process_the_iframe_attributes, implement the srcdoc specification so that LoadData initiates a srcdoc load.
  • In attribute_mutated, ensure that changing the srcdoc attribute of an iframe element follows the specification.

Final Project

The subsequent steps, for the named getter issue, listed on the project page are:

  • Uncomment the named getter from HTMLFormElement.webidl file.
  • Add the missing NamedGetter and SupportedPropertyNames methods to HTMLFormElement.
  • Implement SupportedPropertyNames according to the specification given here:
    • Create an enum to represent the id, name, and past states for the sourced names.
    • Create a vector of (SourcedName, DomRoot<HTMLElement>) by iterating over self.controls and checking the element type and calling methods like HTMLElement::is_listed_element.
    • Sort and filter elements from the vector as described in the spec using Node::CompareDocumentPosition and return a new vector of unique names.
  • Implement a live NodeList for form element collections:
    • Create an enum representing the kind of live RadioNodeList - Listed or Img.
    • Add a FormControls variant to NodeListType which contains a Dom<HTMLFormElement>, the new enum, and a DOMString.
    • Add a method to HTMLFormElement that returns a Ref<Vec<Dom<Element>>> and exposes its self.controls.
    • Implement the NodeList API for the new NodeListType variant using the new HTMLFormElement API to iterate over and filter matching elements from the form element's controls.
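The sort-and-filter part of SupportedPropertyNames can be sketched as follows. This is only an outline under stated assumptions: tree_order is a hypothetical integer standing in for the document position that the real code would obtain through Node::CompareDocumentPosition, the element itself is omitted, and the age-based ordering of past names is not modelled.

```rust
// Where a name came from. The derive order gives Id < Name < Past,
// which is the tie-break used when two entries share an element.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum SourcedNameKind {
    Id,
    Name,
    Past,
}

#[derive(Debug, Clone)]
struct SourcedName {
    name: String,
    // Hypothetical stand-in for the element's position in the document.
    tree_order: usize,
    kind: SourcedNameKind,
}

// Sort by document position, then by source kind, drop empty names,
// and keep only the first occurrence of each name.
fn supported_property_names(mut sourced: Vec<SourcedName>) -> Vec<String> {
    sourced.sort_by_key(|s| (s.tree_order, s.kind));
    let mut names: Vec<String> = Vec::new();
    for s in sourced {
        if !s.name.is_empty() && !names.contains(&s.name) {
            names.push(s.name);
        }
    }
    names
}
```

The linear contains check keeps the sketch short; a form with many controls might prefer a set alongside the vector to track names that were already seen.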

Design Pattern

Design patterns are not applicable as our task involves the implementation of methods and modifying various files. However, the Implementation section below provides details of why certain steps were implemented the way they were.

Implementation

OSS Project

We have worked on the initial steps mentioned on the project page here.

Step 1: Uncomment srcdoc WebIDL attribute and implement the attribute getter

The srcdoc attribute was already declared. We simply uncommented those lines in the file HTMLIFrameElement.webidl.

We implemented the attribute getter in the file htmliframeelement.rs. It reads the srcdoc attribute stored on the element and returns its value as a string. In Rust, when the last expression of a function has no trailing semicolon, the value of that expression is returned from the function.
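As an illustration of the implicit-return rule, here is a small self-contained getter. IframeElement and its attributes map are hypothetical simplifications for this example and do not reflect Servo's actual types.

```rust
use std::collections::HashMap;

// Hypothetical, heavily simplified element: attributes are plain strings.
struct IframeElement {
    attributes: HashMap<String, String>,
}

impl IframeElement {
    // Returns the srcdoc attribute, or an empty string if it is absent.
    fn srcdoc(&self) -> String {
        // No trailing semicolon: this expression is the return value.
        self.attributes.get("srcdoc").cloned().unwrap_or_default()
    }
}
```

Adding a semicolon after the last expression would turn it into a statement, and the function would no longer compile because it would return the unit type instead of a String.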

Since this attribute getter is called in only one place in the entire codebase, the process_the_iframe_attributes() function, it was suggested that we inline it there. We made this change in lines 245 and 246 of our latest commit.

Step 2: Add a field to LoadData for storing the srcdoc contents when loading a srcdoc iframe

We added a public field srcdoc of type String on line 170 of the file lib.rs. We declared srcdoc with type DOMString in the WebIDL file and map it to this field in the Rust file. The data type DOMString is a wrapper around a Rust String, as can be seen here.

Step 3: Add a new method to script_thread.rs which loads the special about:srcdoc URL per the specification

We defined a method page_load_about_srcdoc which is based on the method start_page_load_about_blank in the file script_thread.rs and handles the loading of iframe tag with srcdoc property.

Effectively, we parse the about:srcdoc URL and set it as the URL in the context of the response that we load. Responses are delivered to the parser in chunks, which is why we send the srcdoc content (an HTML string) as a chunk of the response.
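The idea of feeding the srcdoc markup through the same chunk-based path as a network response can be sketched like this. All names here (ParserSink, process_chunk, finish, page_load_about_srcdoc_sketch) are hypothetical and chosen for the example; Servo's real method works with its own context and response types.

```rust
// Hypothetical receiver of response data, standing in for the parser.
#[derive(Default)]
struct ParserSink {
    url: String,
    buffer: Vec<u8>,
    finished: bool,
}

impl ParserSink {
    // Accept one chunk of the response body.
    fn process_chunk(&mut self, chunk: Vec<u8>) {
        self.buffer.extend(chunk);
    }

    // Signal that no more chunks will arrive.
    fn finish(&mut self) {
        self.finished = true;
    }
}

// Treat the srcdoc string as the whole body of a response for about:srcdoc,
// delivered as a single chunk.
fn page_load_about_srcdoc_sketch(srcdoc: &str) -> ParserSink {
    let mut sink = ParserSink {
        url: "about:srcdoc".to_string(),
        ..Default::default()
    };
    sink.process_chunk(srcdoc.as_bytes().to_vec());
    sink.finish();
    sink
}
```

Because the markup is already in memory, one chunk is enough; nothing is fetched over the network.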

Step 4: Call this new method from handle_new_layout when it's detected that a srcdoc iframe is being loaded

We defined the method page_load_about_srcdoc in the step above. The function handle_new_layout is responsible for starting a new load and routing it to the relevant function based on the URL. If the LoadData structure has about:srcdoc in its url field, we call the new method with the new load and the srcdoc string stored in LoadData.
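A minimal sketch of this URL-based dispatch is shown below. LoadData is reduced to two string fields and LoadAction is a hypothetical enum invented for the example; the about:blank branch is an assumption based on the existing start_page_load_about_blank method mentioned in Step 3.

```rust
// Reduced version of LoadData: only the fields relevant to this step.
#[derive(Debug, Clone)]
struct LoadData {
    url: String,
    srcdoc: String,
}

// Hypothetical description of which loading path gets taken.
#[derive(Debug, PartialEq)]
enum LoadAction {
    AboutBlank,
    AboutSrcdoc(String),
    Fetch(String),
}

// Pick the loading path from the URL stored in LoadData.
fn choose_load_action(load_data: &LoadData) -> LoadAction {
    match load_data.url.as_str() {
        "about:blank" => LoadAction::AboutBlank,
        // The srcdoc markup travels with the load so no fetch is needed.
        "about:srcdoc" => LoadAction::AboutSrcdoc(load_data.srcdoc.clone()),
        other => LoadAction::Fetch(other.to_string()),
    }
}
```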

Step 5: In process_the_iframe_attributes, implement the srcdoc specification so that LoadData initiates a srcdoc load

We added the processing of the srcdoc specification to the process_the_iframe_attributes() function in the file htmliframeelement.rs by referring to the specification and with help from Josh.

We first check whether the HTML element has the srcdoc attribute. In our case, we are processing the iframe HTML element, so self.upcast::<Element>() gives us the iframe viewed as its base Element type, on which the attribute can be looked up. We fetch the document to be shown in the window and store the ID of the incomplete load that we are currently processing. This is required since the browser's processes are highly parallel. Next, we define a new LoadData instance and set its srcdoc property to the value fetched by the attribute getter we implemented in Step 1. We then set the browsing context with the new attribute values.
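The decision made in this step can be sketched as a pure function: per the HTML specification, a srcdoc attribute takes precedence over src, and an absent or empty src falls back to about:blank. The function name load_data_for_iframe and the reduced LoadData struct are assumptions made for the example, and URL parsing is skipped entirely.

```rust
// Reduced version of LoadData: only the fields relevant to this step.
#[derive(Debug, PartialEq)]
struct LoadData {
    url: String,
    srcdoc: String,
}

// Decide what an iframe should load from its src and srcdoc attributes.
fn load_data_for_iframe(src: Option<&str>, srcdoc: Option<&str>) -> LoadData {
    match (srcdoc, src) {
        // srcdoc wins whenever it is present, even if src is also set.
        (Some(markup), _) => LoadData {
            url: "about:srcdoc".to_string(),
            srcdoc: markup.to_string(),
        },
        // Otherwise use a non-empty src as the URL to load.
        (None, Some(url)) if !url.is_empty() => LoadData {
            url: url.to_string(),
            srcdoc: String::new(),
        },
        // No usable attribute: load a blank page.
        _ => LoadData {
            url: "about:blank".to_string(),
            srcdoc: String::new(),
        },
    }
}
```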

Step 6: In attribute_mutated, ensure that changing the srcdoc attribute of an iframe element follows the specification

We added code in the file htmliframeelement.rs to call the process_the_iframe_attributes method when the srcdoc attribute of an iframe element is changed.
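The shape of this hook can be sketched as below. IframeSketch and its process_calls counter are hypothetical and exist only to make the behaviour observable; the sketch also assumes that other attributes need no reprocessing, which is a simplification.

```rust
// Hypothetical iframe model that counts how often it is reprocessed.
#[derive(Default)]
struct IframeSketch {
    srcdoc: Option<String>,
    process_calls: u32,
}

impl IframeSketch {
    // Stand-in for the real processing; here it only records the call.
    fn process_the_iframe_attributes(&mut self) {
        self.process_calls += 1;
    }

    // Called whenever an attribute is set, changed, or removed (None).
    fn attribute_mutated(&mut self, name: &str, new_value: Option<&str>) {
        if name == "srcdoc" {
            self.srcdoc = new_value.map(str::to_string);
            // Any change to srcdoc triggers reprocessing of the iframe.
            self.process_the_iframe_attributes();
        }
    }
}
```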

Final Project

Test Plan

OSS Project

To test whether the engine is able to process an iframe tag with srcdoc, run the command: ./mach test-wpt tests/wpt/web-platform-tests/html/semantics/embedded-content/the-iframe-element/srcdoc_process_attributes.html.

The result of the test is:


We have successfully completed all the initial steps. The tests fail, but as explained below, the cause lies in the tests and not in our implementation. Due to this, the current status of our pull request is Pending.

From the test result image above, it can be seen that 3 subtests failed because our output was true while the test asserted false. We discussed the reason for these failures with Josh. The primary issue is that the test which checks for successful srcdoc processing isn't written as per the specification, and hence a srcdoc iframe's load event is fired before the srcdoc attribute is modified. This means that the engine tries to load the srcdoc iframe before its attribute is correctly modified and parsed, which produces the error.


Josh reopened the discussion in this PR where the author wrote these tests. The author and maintainer admitted that the tests were indeed wrong and the issue was solved by the author in this commit. The changes the author made for the test can be seen below:


Based on the changes by the test author above, it can be seen that the changes were made in 3 statements where assert_false() was changed to assert_true(). This should mean that the results where we received true are indeed correct.

Josh mentioned here that the test will continue to fail until the new expected tests are synced with upstream which hasn't happened yet. He left us some refactoring comments which we have addressed in our latest commit.

Based on our communication with Josh, the changes we have made are correct and should pass once the corrected tests are synced from upstream.

Final Project

The tests for named getter issue have already been written. We need to check whether the modifications we make to the code can still pass these tests.

The tests will be run using the mach utility commands:

./mach test-wpt tests/wpt/web-platform-tests/html/semantics/forms/the-form-element/form-elements-nameditem-01.html
./mach test-wpt tests/wpt/web-platform-tests/html/semantics/forms/the-form-element/form-elements-nameditem-02.html
./mach test-wpt tests/wpt/web-platform-tests/html/semantics/forms/the-form-element/form-nameditem.html


To test whether the code change works, follow the steps as outlined.

  1. Install the pre-requisites required for servo as mentioned here
  2. Clone our GitHub repo: git clone https://github.com/jaymodi98/servo
  3. Navigate to servo's directory: cd servo
  4. Checkout the git branch iframe-srcdoc: git checkout iframe-srcdoc
  5. Check if code follows style guidelines: ./mach test-tidy
  6. Check if code has no compilation errors: ./mach check
  7. Check if servo is built successfully: ./mach build --dev --verbose
  8. Check if the test passes, i.e. Servo can process srcdoc iframes: ./mach test-wpt tests/wpt/web-platform-tests/html/semantics/embedded-content/the-iframe-element/srcdoc_process_attributes.html

You will see that the Servo build is successful and that the test fails because of the issue highlighted above.

Pull Requests

OSS Project

Here is the link to our pull request. We have attached the code snippets for the changes made in files in the PR.

Final Project

We have not filed a PR yet since we have not done substantial work. Servo being a global community, our mentor advised us to file a PR once some concrete work has been done from our end.

References

[1] https://servo.org/
[2] https://bocoup.com/blog/third-party-javascript-development-future#iframe-srcdoc
[3] https://www.w3schools.com/tags/tag_iframe.asp
[4] https://html.spec.whatwg.org/multipage/forms.html#dom-form-nameditem
[5] https://en.wikipedia.org/wiki/Servo_(software)
[6] https://en.wikipedia.org/wiki/Rust_(programming_language)
[7] https://github.com/servo/servo/wiki/Missing-DOM-features-project
[8] https://github.com/servo/servo/blob/master/README.md#setting-up-your-environment
[9] https://bitsofco.de/what-exactly-is-the-dom/
[10] https://developer.mozilla.org/en-US/docs/Web/API/HTMLFormElement