CSC/ECE 517 Spring 2015 M1503 EDTS
Extending Developer Tools for Servo
Introduction
Rust
Rust is a general purpose, multi-paradigm, compiled programming language developed by Mozilla Research. It is designed to be a "safe, concurrent, practical language", supporting pure-functional, concurrent-actor, imperative-procedural, and object-oriented styles.<ref>http://en.wikipedia.org/wiki/Rust_%28programming_language%29</ref> Being a modern systems programming language focusing on safety and speed, it accomplishes these goals by being memory safe without using garbage collection.<ref>http://doc.rust-lang.org/nightly/intro.html</ref>
Servo
Servo is an experimental project to build a Web browser engine for a new generation of hardware: mobile devices, multi-core processors and high-performance GPUs. With Servo, we are rethinking the browser at every level of the technology stack — from input parsing to page layout to graphics rendering — to optimize for power efficiency and maximum parallelism. Servo builds on top of Rust to provide a secure and reliable foundation. Memory safety at the core of the platform ensures a high degree of assurance in the browser’s trusted computing base. Rust’s lightweight task mechanism also promises to allow fine-grained isolation between browser components, such as tabs and extensions, without the need for expensive runtime protection schemes, like operating system process isolation.<ref>https://www.mozilla.org/en-US/research/projects/</ref>
Background
Remote Developer Tools
Firefox supports remote developer tools - ie. communicating with an arbitrary server that implements a protocol for exposing information about web content. You can use the Firefox developer tools on your desktop to debug Web sites and Web apps running in other browsers or runtimes. The other browser might be on the same device as the tools themselves or on a different device, such as a phone connected over USB.
Project Description
Servo implements a very basic developer tools server that currently supports executing JS remotely and investigating the DOM tree in the document inspector. We want to expand these capabilities by completing previous work that enables remote logging from web content, and add new capabilities to log HTTP requests and responses to allow for easier debugging of network-related problems in Servo.<ref>https://github.com/servo/servo/wiki/More-developer-tools-student-project</ref>
The initial step included changes for enabling remote logging from the Web Console using 'console.log'. This was completed as part of the OSS project (Pull request here).
In continuation, the objective of the project is to add support for logging HTTP requests and responses in the Web Console.
Requirement Analysis
To configure Firefox for remote debugging, please follow the instructions on Setting up Firefox
Alternatively, to just view the Web Console, follow instructions here.
As mentioned above, Firefox provides the ability to debug a web page running on a remote server with its Developer Tools. The Message Display pane of the Web Console displays various kinds of messages:
- HTTP requests
- JavaScript warnings and errors
- CSS warnings, errors, and reflow events
- Security warnings and errors
- console API calls
- Input/output messages
The Message Display pane looks like this:
Open the Web Console from the Developer menu (or Ctrl+Shift+K) from a version of Firefox with Developer Tools extensions. Navigate to any web page and observe messages on the Web Console (Console tab). You may have to check the Log option under "Net" (the first option in black). The log is cleared on every redirection/reload of a page. To avoid this, you can select the "Enable Persistent Logs" option in Settings.
HTTP Requests are logged with lines that looks like this on the Web Console:
Clicking on the message brings up a new window that gives more details about the HTTP request and the response.
Versions of Firefox in use today use Gecko as the browser engine. The Developer Edition of Firefox includes complete support for the latest Firefox Developer Tools
Servo so far implements partial support for Developer Tools, which does not include the logging of HTTP requests as shown above. The goal of this project is to implement capability for logging HTTP requests and responses on the Web Console. When we run Servo with our changes on a webpage with the --devtools argument, and connect from an instance of Firefox, we should be able to see the HTTP requests and responses from the server appear on the log in the Web Console.
Implementation
The following is a rough list of changes that will be required in the code to enable logging of HTTP messages. It includes adding types for HTTP request and response messages to the enum that represents messages exchanged between Servo and the debugger client. An "actor" is created to store the information that will be sent in these messages.
- Add a NetworkEventMessage variant to the DevtoolsControlMsg enum in devtools_traits/lib.rs, containing fields for the request id and the network event.
- DevtoolsControlMsg is a template type for the messages exchanged between the developer tools server and the actor objects. The developer tools server runs in a loop in the run_server function in devtools/lib.rs (this will be referred to as the "main loop" in run_server). The run_server function takes as arguments a sender and receiver, that can send and receive respectively messages of type <DevtoolsControlMsg>
- A new variant NetworkEventMessage of DevtoolsControlMsg will be used for all network event messages that are sent to the developer tools server. These messages will notify the server of either an HTTP request or an HTTP response.
- NetworkEventMessage will have two fields - a String representing the unique id corresponding to a request-response pair, and an enum NetworkEvent that holds the information about the request or response.
- The HTTRequest variant of the NetworkEvent enum contains fields for the target url, the method, headers, and the request body. It uses the types that are present in the LoadData struct in components/net/resource_task.rs.
- The HTTPResponse variant of the DevtoolsControlMsg enum containing fields for the response headers, status, and body. It uses the same types that are present in the Metadata struct in components/net/resource_task.rs.
 
- Make the HTTP loader's load function take an optional Sender<DevtoolsControlMsg>. Use the cookies_chan argument as a model. Send HTTPRequest and HTTPResponse messages using this sender at the appropriate times.
- The HTTP loader's load function is present in components/net/http_loader.rs. This is the place in Servo from which HTTP requests are sent and responses are received. This means that the developer tools server, if it is running (i.e., Servo is run with the --devtools argument), needs to be notified of network events from this function.
- So the load function now needs a channel to send messages to the developer tools server (the run_server function).
- This is accomplished by adding an optional argument devtools_chan of type Sender<DevtoolsControlMsg> to the HTTP loader's factory. This is then passed to the load function, where it can be used to send messages to the devtools server.
- The devtools_chan parameter is passed to the HTTP loader's factory when a new resource task is created in components/net/resource_task.rs (new_resource_task)
- This devtools_chan is used to send messages of the type DevtoolsControlMsg::NetworkEventMessage from the load function.
- A message containing NetworkEvent::HttpRequest is sent with the information contained in the load_data struct of the load function.
- A new unique ID for this request is created using the uuid crate, and passed in the NetworkEventMessage. This ID is used whenever the devtools server is notified of further updates corresponding to this request or its response. A new ID is generated for every new request.
- A message containing NetworkEvent::HttpResponse is sent with the response information after the load function receives the HTTP response. The values in the fields of the HttpResponse enum come from the information contained in the metadata struct. This response is tagged with the same ID value that was generated for its request earlier.
 
- Create a NetworkEventActor actor in the devtools crate that stores the request and response information transmitted in the new messages. The String field in the NetworkEventMessage contains a unique ID that joins the HTTPRequest and HTTPResponse. Associate the unique IDs with the corresponding NetworkEventActor names via a hashtable.
- A new NetworkEventActor that implements the Actor trait is added to components/devtools/actors/. The NetworkEventActor needs to implement the name() and handle_message(..) functions of the Actor trait. The NetworkEventActor stores information about the HTTP requests and responses.
- A new NetworkEventActor object is created for each network event that is a new HTTP request. The corresponding response and updates should use the same actor object, which makes it possible for the debugger client to send query messages to the actor for more information (headers, cookies, security info, etc).
- This is achieved by tracking the unique request_id of each new request when its actor object is created. The new NetworkEventActor object is assigned a name using the new_name function of the actor registry, which creates a name by appending a number to the "netevent" string that is passed to it. The (request_id, actor_name) pairs are tracked in a HashMap called actor_requests.
- When the devtools server is notified of new network event, the actor_requests hash table is first looked up with the request_id to see if an actor exists for the event. If it does, the name of the actor is retrieved from the table. Otherwise, a new NetworkEventActor is created.
 
- Send the networkEvent message when the HTTPRequest and HTTPResponse messages are received. Use the Firefox devtools code (onNetworkEvent) as a reference.
- When a NetworkEventMessage (HTTPRequest or HTTPResponse) is received, the NetworkEventActor corresponding to it is either created or located as described above.
- On an HTTPRequest, a networkEvent message is sent to each debugger client using the accepted_connections vector of TcpStream objects. The message is in JSON and contains the name of the NetworkEventActor associated with the event.
 
{
  "from": "server1.conn0.consoleActor2",
  "type": "networkEvent",
  "eventActor": {
    "actor": "server1.conn0.netEvent45",
    "startedDateTime": "2015-04-22T20:47:08.545Z",
    "url": "https://aus4.mozilla.org/update/3/Firefox/40.0a1/20150421152828/Darwin_x86_64-gcc3/en-US/default/Darwin%2013.4.0/default/default/update.xml?force=1",
    "method": "GET",
    "isXHR": true,
    "private": false
  }
}
- Similarly, on an HTTPResponse, a networkEventUpdate message is sent to the debugger client(s). The name of the NetworkEventActor passed in this message should be the same as of the actor that was used for the request.
 
{
  "from": "server1.conn0.netEvent45",
  "type": "networkEventUpdate",
  "updateType": "responseStart",
  "response": {
    "httpVersion": "HTTP/1.1",
    "remoteAddress": "63.245.217.43",
    "remotePort": 443,
    "status": "200",
    "statusText": "OK",
    "headersSize": 337,
    "discardResponseBody": true
  }
}
- Implement the getRequestHeaders, getRequestCookies, getRequestPostData, getReponseHeaders, getReponseCookies, and getResponseContent messages for NetworkEventActor.
- Messages can be sent from the debugger clients to the developer tools server via the Actor object.
- Once the client knows the name of the NetworkEventActor that is associated with a request, it can query the devtools server for more information about the request/response by sending messages with the actor name in the "to" field. These messages will be directed to the actor object with that name registered in the actors registry (The network event actors are registered upon creation).
- The handle_message function in the NetworkEventActor will handle messages that are sent to the actor object from the debugger client. It contains a parameter stream of type TcpStream that can be used to send JSON messages in reply.
- The handle_message function should return the appropriate response using the information stored in the fields of the NetworkEventActor. The struct contains private fields of type struct HttpRequest and struct HttpResponse that are populated at the time when requests and responses are received by the main loop in run_server.
 
Sample query message from the client:
DBG-SERVER: Received packet 304: {
  "to": "server1.conn1.child1/netEvent42",
  "type": "getRequestHeaders"
}
Sample response from the actor (truncated):
DBG-SERVER: Received packet 305: {
  "from": "server1.conn1.child1/netEvent42",
  "headers": [
    {
      "name": "Host",
      "value": "sendto.mozilla.org"
    },
    {
      "name": "User-Agent",
      "value": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:40.0) Gecko/20100101 Firefox/40.0"
    },
    {
      "name": "Accept",
      "value": "image/png,image/*;q=0.8,*/*;q=0.5"
    },
    {
      "name": "Accept-Language",
      "value": "en-US,en;q=0.5"
    },
Architecture
The conceptual architecture of Mozilla Firefox running Gecko is represented in the following figure. This is from Grosskurth and Godfrey, A Reference Architecture for Web Browsers <ref>http://grosskurth.ca/papers/browser-refarch.pdf</ref>
Component Diagrams
Servo is designed to be highly parallel, with many components (rendering, layout, HTML parsing, image decoding, etc.) handled by fine-grained, isolated tasks.
The following figures represent the browser architecture with all these components, and the Servo architecture <ref>http://www.joshmatthews.net/fosdemservo/</ref>
Browser:
Servo Architecture:
Design
enum in Rust
Enums are datatypes with several alternate representations. A simple enum defines one or more constants with the same data type. In Rust, an enum can have complex variants though, like a struct. For example, consider an enum 'Shape' with variants 'Circle' and 'Triangle' each of which is a struct.
   enum Shape {
       Circle { center: Point, radius: f64 },
       Triangle{ vert1: Point, vert2: Point, vert3: Point }
   }
A variable of type Shape can be resolved to its appropriate variant by using a 'match'.
   fn area(sh: Shape) -> f64 {
       match sh {
           Circle(_, size) => f64::consts::PI * size * size,
           Rectangle(Point { x, y }, Point { x: x2, y: y2 }) => (x2 - x) * (y2 - y)
       }
   }
The Servo Developers Tools project has a 'DevtoolsControlMsg' enum which is used to instruct the devtools server to update its known actors/state according to changes in the browser. The current project requires the addition of two new variants to the 'DevtoolsControlMsg' enum, namely 'HTTRequest' and 'HTTResponse'.
Sender and Receiver
In Rust, a pipe is used for communication between tasks. A pipe is simply a pair of endpoints: one for sending messages and another for receiving messages(Sender and Receiver). The simplest way to create a pipe is to use the channel function to create a (Sender, Receiver) pair. In Rust parlance, a sender is a sending endpoint of a pipe, and a receiver is the receiving endpoint. A simple channel can be created as follows:
   let (tx, rx) = channel();
   spawn(proc() {
       tx.send(10i);
   });
   assert_eq!(rx.recv(), 10i);
The sending half of the asynchronous channel type is Struct Sender<T> and the receiving half is the struct Receiver<T>. For the purpose of this project, we will be using the channel to send and receive messages. To log the HTTP requests and responses on the console, we can use a variant of the enum 'DevtoolsControlMsg'. This enum is used to send messages to the devtools server to update its actors/state according to changes in the browser. We now add HTTPRequest and HHTPResponse variant to the DevtoolsControlMsg enum. We can now use the HTTP loaders' load function to send the HTTPRequest and HHTPResponse to the main loop in the run_server, where it can be received using a receiver. This is done by adding an argument, Sender<DevtoolsControlMsg> to the load function.
Design Patterns
Actor Model
The actor model<ref>http://en.wikipedia.org/wiki/Actor_model</ref> in computer science is a mathematical model of concurrent computation that treats "actors" as the universal primitives of concurrent computation: in response to a message that it receives, an actor can make local decisions, create more actors, send more messages, and determine how to respond to the next message received.
The Actor model adopts the philosophy that everything is an actor. This is similar to the everything is an object philosophy used by some object-oriented programming languages, but differs in that object-oriented software is typically executed sequentially, while the Actor model is inherently concurrent.
An actor is a computational entity that, in response to a message it receives, can concurrently:
- send a finite number of messages to other actors;
- create a finite number of new actors;
- designate the behavior to be used for the next message it receives.
Servo uses an actor-based devtools server implementation. There are different types of actor objects created to handle different messages. The developer tools server creates a ConsoleActor that is a part of every TabActor. The ConsoleActor is responsible for handling console events and sending messages of type ConsoleMsg to the debugger clients. As part of the initial steps of this project, we extended existing work that made console.log messages appear on the Web Console.
For the logging of Http requests and responses, a new NetworkEventActor that implements the Actor trait is added. This actor sends messages to, and listens for messages from, the debugger clients. We use the NetworkEventActor to store the request and response information that is to be transmitted in these messages. A unique id is used to keep track of a request and its corresponding response (as described in the Implementation section). In the event that a client wants more information about a request/response, it can query the server using this unique id. Following is a code snippet showing the structure of the actor and its fields.
struct HttpRequest {
    url: String,
    method: Method,
    headers: Headers,
    body: Option<Vec<u8>>,
}
struct HttpResponse {
    headers: Option<Headers>,
    status: Option<RawStatus>,
    body: Option<Vec<u8>>
}
pub struct NetworkEventActor {
    pub name: String,
    request: HttpRequest,
    response: HttpResponse,
}
Examples of messages sent from and handled by the NetworkEventActor are shown in the Implementation section. Some other sample JSON messages sent from Servo to the debugger clients can be found on this page
UML Diagrams
The diagrams are an approximation of UML, that are drawn by treating the 'struct's in Rust as classes.
Class Diagram
Traits can be used in a manner similar to interfaces in a language like C++ or Java.
Sequence Diagram
Proposed Test Cases
Testing the feature
The feature added will be tested using the following steps:
- Build Servo successfully with the changes (./mach build)
- Run Servo on a web page with the --devtools argument specifying a port number, say 6000 (./mach run [url] --devtools 6000)
- Open an instance of a recent version Firefox configured with Remote Debugging enabled.
- Connect to Servo running the web page on port 6000.
- Enable persistent logs in Settings. Select logging of Net messages from the Console.
- Navigate to another url from the webpage running Servo.
- The HTTP request and response messages should be logged on the Console.
Regression Testing
We want to ensure that everything is in order after we have completed our changes; more importantly, that we have not broken any existing functionality. Mozilla has an automated workflow in place for running tests on pull requests submitted to Servo. Servo has a test suite which runs on every reviewed pull request, via bors-servo and the buildbot.
After it sees an approval on a reviewed pull requests, bors-servo automatically merges the changes
It starts off running a suite of regression tests, and any failure is reported. It can be issued a retry command once the fixes are in place.
When all tests pass, the changes are successfully merged into master. The following is the set of tests that will be run after the pull request is reviewed:
Unit Tests
The unit tests can be run using the command:
./mach test-unit
Since the code changes involve changing the parameters needed by existing functions new_resource_task(..) and LoadData::new(..), some of the unit tests that used these functions need to be fixed.
Reference
<references/>












