CSC/ECE 517 Fall 2014/oss M1452 jns
Integrating an XML Parser
Servo is an experimental web browser layout engine being developed by Mozilla Research. The prototype seeks to create a highly parallel environment, in which many components (such as rendering, layout, HTML parsing, image decoding, etc.) are handled by fine-grained, isolated tasks. The project has a symbiotic relationship with the Rust programming language, in which it is being developed.
Background Information
Rust
Rust is a systems language for writing high performance applications that are usually written in C or C++ but it was developed to prevent some of the problems related to invalid memory accesses that generate segmentation faults. It covers imperative, functional and object-oriented programming. It's designed to support concurrency and parallelism in building platforms that take full advantage of modern hardware. Its static type system is safe and expressive and it provides strong guarantees about isolation, concurrency execution and memory safety. <ref name=Rust>Overview of Mozilla Research Projects</ref>
Rust combines powerful and flexible modern programming constructs with a clear performance model to make program efficiency predictable and manageable. One important way it achieves this is by allowing fine-grained control over memory allocation through contiguous records and stack allocation. This control is balanced with the absolute requirement of safety: Rust’s type system and runtime guarantee the absence of data races, buffer overflow, stack overflow or access to uninitialized or deallocated memory. <ref name=Rust>Overview of Mozilla Research Projects</ref>
Servo
Servo is an experimental project to build a Web browser engine for a new generation of hardware: mobile devices, multi-core processors and high-performance GPUs. With Servo, we are rethinking the browser at every level of the technology stack — from input parsing to page layout to graphics rendering — to optimize for power efficiency and maximum parallelism.
Servo builds on top of Rust to provide a secure and reliable foundation. Memory safety at the core of the platform ensures a high degree of assurance in the browser’s trusted computing base. Rust’s lightweight task mechanism also promises to allow fine-grained isolation between browser components, such as tabs and extensions, without the need for expensive runtime protection schemes, like operating system process isolation.<ref name=Rust></ref>
Prerequisites
Installing Dependencies for Servo
The list of commands to install the dependencies on various platforms are<ref name = github>Servo Github</ref>
On OS X (homebrew):
brew install automake pkg-config python glfw3 cmake pip install virtualenv
On OS X (MacPorts):
sudo port install python27 py27-virtualenv cmake
On Debian-based Linuxes:
sudo apt-get install curl freeglut3-dev \ libfreetype6-dev libgl1-mesa-dri libglib2.0-dev xorg-dev \ msttcorefonts gperf g++ cmake python-virtualenv \ libssl-dev libglfw-dev
On Fedora:
sudo yum install curl freeglut-devel libtool gcc-c++ libXi-devel \ freetype-devel mesa-libGL-devel glib2-devel libX11-devel libXrandr-devel gperf \ fontconfig-devel cabextract ttmkfdir python python-virtualenv expat-devel \ rpm-build openssl-devel glfw-devel cmake pushd . cd /tmp wget http://corefonts.sourceforge.net/msttcorefonts-2.5-1.spec rpmbuild -bb msttcorefonts-2.5-1.spec sudo yum install $HOME/rpmbuild/RPMS/noarch/msttcorefonts-2.5-1.noarch.rpm popd
On Arch Linux:
sudo pacman -S base-devel git python2 python2-virtualenv mesa glfw ttf-font cmake
Building Servo
Normal Build
git clone https://github.com/servo/servo cd servo ./mach build ./mach run tests/html/about-mozilla.html
Building for Android target
git clone https://github.com/servo/servo cd servo ANDROID_TOOLCHAIN=/path/to/toolchain ANDROID_NDK=/path/to/ndk PATH=$PATH:/path/to/toolchain/bin ./mach build --android cd ports/android ANDROID_SDK=/path/to/sdk make install
Steps of Implementation
→ We created a new file mod.rs in the following directory: servo/components/script/parse/mod.rs
→ We have created a Parser Trait with the parse_chunk method in this mod.rs
pub mod html; pub trait Parser { fn parse_chunk(&self,input: String); fn finish(&self); }
→ Implementation of this Parser trait for the ServoHTML Parser struct is in the servo /components/script/dom/servohtmlparser.rs
impl Parser for ServoHTMLParser{ fn parse_chunk(&self, input: String) { self.tokenizer().borrow_mut().feed(input); } fn finish(&self){ self.tokenizer().borrow_mut().end(); } }
→ We have modified servo/components/script/parse/html.rs
to invoke the parse_chunk method appropriately.
InputString(s) => { parser.parse_chunk(s); }
InputUrl(url) => { let load_response = load_response.unwrap(); match load_response.metadata.content_type { Some((ref t, _)) if t.as_slice().eq_ignore_ascii_case("image") => { let page = format!("<html><body><img src='{:s}' /></body></html>", base_url.as_ref().unwrap().serialize()); parser.parse_chunk(page); }, Payload(data) => { // FIXME: use Vec<u8> (html5ever #34) let data = UTF_8.decode(data.as_slice(), DecodeReplace).unwrap(); parser.parse_chunk(data); }
References
<references></references>