CSC/ECE 517 Spring 2018- Project M1803: Implement a web page fuzzer to find rendering mismatches (Part 2)

From Expertiza_Wiki

Revision as of 19:46, 15 April 2018

By Alexander Simpson (adsimps3), Abhay Soni (asoni3), Dileep Badveli (dbadvel) and Jake Batty (jbatty)

Introduction

This Mozilla project was broken into two main parts: the initial steps and the subsequent steps. The initial steps were finished as part of the OSS project, so the goal of this final project is to complete the subsequent steps. As part of the OSS project (explained in more detail below), we created a tool that generates random valid HTML files and a program that automates Servo. In this project, we will extend the program to also control Firefox, compare the resulting screenshots, and expand the page generation tool.

Background

TODO: explain servo

Previous Work (Part of the OSS Project)

As per the project description, we were expected to complete the initial steps as part of the OSS project. The steps are listed below, with our implementation explained under the relevant steps.

1) In a new repository, create a program that can generate a skeleton HTML file with a doctype, head element, and body element, and print the result to stdout
- The repository contains the file code_generation.py, which is used to generate random valid HTML files.
2) Add a module to the program that enables generating random content specific to the <head> element (such as inline CSS content inside of a <style> element) and add it to the generated output
- The file code_generation.py contains the code that generates random content specific to the head element and applies styling to it. After random content is written to the file, CSS rules are applied to that content. We have established lists of commonly used styles, weights, fonts, font_styles, and alignments, from which values are chosen at random. For practical purposes, we limit the number of options.
3) Add a module to the program that enables generating random content specific to the <body> element (such as a <p> block that contains randomly generated text) and add it to the generated output
4) Generate simple random CSS that affects randomly generated content (i.e., if there is an element with an id foo, generate a CSS selector like #foo that applies a style like color to it)
5) Create a program under Servo's etc/ that launches Servo and causes it to take a screenshot of a particular URL - use this to take screenshots of pages randomly generated by the previous program
Sample Screenshot:
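
Steps 1 to 4 above can be illustrated with a minimal sketch. This is not the actual contents of code_generation.py; the option lists, the word list, and the function names (random_text, random_rule, generate_page) are assumptions made for illustration. It builds a skeleton page with a doctype, fills the <style> element with one random rule per generated id, and fills the body with <p> blocks of random text.

```python
import random

# Illustrative option lists (assumed); the real code_generation.py may differ.
COLORS = ["red", "green", "blue", "black"]
FONTS = ["serif", "sans-serif", "monospace"]
WEIGHTS = ["normal", "bold"]
ALIGNMENTS = ["left", "center", "right"]
WORDS = ["servo", "firefox", "render", "layout", "fuzz", "page"]


def random_text(rng, n_words=8):
    """Return a short run of randomly chosen words."""
    return " ".join(rng.choice(WORDS) for _ in range(n_words))


def random_rule(rng, element_id):
    """Build one CSS rule whose selector targets the element with this id."""
    declarations = [
        f"color: {rng.choice(COLORS)};",
        f"font-family: {rng.choice(FONTS)};",
        f"font-weight: {rng.choice(WEIGHTS)};",
        f"text-align: {rng.choice(ALIGNMENTS)};",
    ]
    return f"#{element_id} {{ " + " ".join(declarations) + " }"


def generate_page(rng, n_paragraphs=3):
    """Return a complete HTML document as a string."""
    ids = [f"p{i}" for i in range(n_paragraphs)]
    style = "\n".join(random_rule(rng, i) for i in ids)
    body = "\n".join(f'<p id="{i}">{random_text(rng)}</p>' for i in ids)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n<title>fuzz</title>\n<style>\n"
        + style
        + "\n</style>\n</head>\n<body>\n"
        + body
        + "\n</body>\n</html>"
    )


if __name__ == "__main__":
    # Print one random page to stdout, as step 1 requires.
    print(generate_page(random.Random()))
```

Passing in a random.Random instance, rather than using the module-level functions, means a page that exposes a rendering mismatch can be regenerated later from its seed.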

Work to be done

Below is a list of the tasks to be done as a part of our final project. Below each task we have described what we think it will take to complete the respective task.

1) Extend the program that controls Servo to also control Firefox using geckodriver
Task 1 is relatively simple: it involves downloading geckodriver and running it. geckodriver is an open-source proxy that implements the WebDriver protocol for Gecko-based browsers such as Firefox, which lets a program control the browser. It should allow us to take screenshots of a particular URL, just like step 5 in the previous work section.
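
One possible way to drive Firefox through geckodriver is the Selenium Python bindings. The sketch below assumes Selenium is installed and geckodriver is on the PATH. The helper names (make_firefox_driver, capture) and the 800x600 window size are illustrative choices, not part of the project description.

```python
from pathlib import Path


def make_firefox_driver():
    """Start headless Firefox through geckodriver using Selenium."""
    # Imported here so the module still loads when Selenium is not installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    return webdriver.Firefox(options=options)


def capture(driver, html_path, png_path, width=800, height=600):
    """Load a local HTML file in the browser and save a screenshot of it."""
    url = Path(html_path).resolve().as_uri()  # file:// URL for the page
    driver.set_window_size(width, height)     # fixed size keeps runs comparable
    driver.get(url)
    driver.save_screenshot(str(png_path))
    return url
```

A caller would create the driver with make_firefox_driver(), call capture(driver, "page.html", "firefox.png"), and finish with driver.quit() in a finally block so the browser process is not left running.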
2) Compare the resulting screenshots and report the contents of the generated page if the screenshots differ
This task involves automating Firefox through geckodriver alongside the current Servo program, so that each browser produces a screenshot of the same generated page. If Servo and Firefox render the page differently, we will report that file and mark the differences.
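
The comparison step could look roughly like the sketch below. It assumes both screenshots have already been decoded into rows of (r, g, b) tuples, for example with an imaging library, and that both have the same dimensions. The function names diff_bbox and report_if_different are hypothetical.

```python
def diff_bbox(pixels_a, pixels_b):
    """Return (left, top, right, bottom) around differing pixels, or None.

    Each argument is a list of rows; each row is a list of (r, g, b) tuples.
    """
    same_shape = len(pixels_a) == len(pixels_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(pixels_a, pixels_b)
    )
    if not same_shape:
        raise ValueError("screenshots must have the same dimensions")

    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(pixels_a, pixels_b)):
        for x, (pa, pb) in enumerate(zip(row_a, row_b)):
            if pa != pb:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # the two renderings are pixel-identical
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def report_if_different(pixels_a, pixels_b, html_source):
    """Return a report containing the generated page, or None if they match."""
    bbox = diff_bbox(pixels_a, pixels_b)
    if bbox is None:
        return None
    return (
        f"Rendering mismatch in region {bbox}\n"
        f"--- generated page ---\n{html_source}"
    )
```

Exact pixel equality is the strictest possible test. Two engines may differ slightly in text anti-aliasing, so a real comparison might need a tolerance threshold; that is a design question the sketch leaves open.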
3) Extend the page generation tool with a bunch of additional strategies, such as:
- Generating element trees of arbitrary depth
- Generating sibling elements
- Extending the set of CSS properties that can be generated (display, background, float, padding, margin, border, etc.)
- Extending the set of elements that can be generated (span, div, header elements, table (and associated table contents), etc.)
- Randomly choosing whether to generate a document in quirks mode or not
Task 3 has several parts, but the main goal is to increase the complexity of our randomly generated pages. First, we will allow element trees of arbitrary depth. We will then generate sibling elements and widen the set of CSS properties that can be generated. Finally, we will increase the number of HTML element types that can be generated and randomly choose whether to generate each document in quirks mode.
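
A rough sketch of how these strategies might fit together is shown below. The element lists, CSS value lists, probabilities, and function names are assumptions for illustration, and tables are left out for brevity. Quirks mode is triggered by omitting the doctype, which is how browsers decide to use it.

```python
import random

# Assumed element and property sets; a real tool would cover many more.
CONTAINERS = ["div", "section", "article"]
LEAVES = ["p", "h1", "h2", "span"]
CSS_OPTIONS = {
    "display": ["block", "inline", "inline-block"],
    "background": ["red", "yellow", "lightblue"],
    "float": ["none", "left", "right"],
    "padding": ["0", "4px", "1em"],
    "margin": ["0", "8px", "auto"],
    "border": ["none", "1px solid black", "3px dashed green"],
}


def random_style(rng):
    """Pick one to three CSS properties and a random value for each."""
    props = rng.sample(sorted(CSS_OPTIONS), rng.randint(1, 3))
    return "; ".join(f"{p}: {rng.choice(CSS_OPTIONS[p])}" for p in props)


def random_tree(rng, max_depth, max_siblings=3, depth=0):
    """Return HTML for a run of siblings, nesting at most max_depth levels."""
    parts = []
    for _ in range(rng.randint(1, max_siblings)):
        style = random_style(rng)
        if depth + 1 < max_depth and rng.random() < 0.6:
            tag = rng.choice(CONTAINERS)
            inner = random_tree(rng, max_depth, max_siblings, depth + 1)
        else:
            tag = rng.choice(LEAVES)
            inner = "text"
        parts.append(f'<{tag} style="{style}">{inner}</{tag}>')
    return "".join(parts)


def generate_document(rng, max_depth=4):
    """Return (html, quirks); quirks mode is requested by omitting the doctype."""
    quirks = rng.random() < 0.5
    doctype = "" if quirks else "<!DOCTYPE html>\n"
    body = random_tree(rng, max_depth)
    html = (
        doctype
        + "<html><head><title>fuzz</title></head><body>"
        + body
        + "</body></html>"
    )
    return html, quirks
```

Calling generate_document(random.Random(seed)) with a recorded seed makes a mismatching page reproducible, and max_depth bounds the recursion so that generation always terminates.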

Conclusion

The previously completed work allows us to generate simple HTML documents with a randomized structure, render each page in Servo, and take a screenshot of it. We plan to extend this work by rendering the pages in both Servo and Firefox, taking screenshots in both browsers, and reporting differences between the two. Doing so will allow users to evaluate how closely Servo's rendering of web pages matches that of Firefox. To make this testing even more informative, we plan to increase the complexity of the structure and styling of the randomly generated HTML documents.