CSC/ECE 517 Fall 2015/oss M1505 MSV
M1505: Add conformance tests to unicode-bidi and fix conformance bugs
This project involved adding conformance tests to the Servo implementation of the Unicode Bidirectional algorithm (unicode-bidi).
Problem Statement
Web browsers are expected to support international text, and Servo is no exception. The unicode-bidi library built into Servo implements the Unicode Bidirectional Algorithm for displaying mixed right-to-left and left-to-right text. This library's conformance with the Unicode Bidirectional Algorithm specification has yet to be comprehensively tested.
The primary objectives of this project involved:
- Adding code to tools/generate.py to download the two specification files that make up the conformance test suite: http://www.unicode.org/Public/UNIDATA/BidiTest.txt and http://www.unicode.org/Public/UNIDATA/BidiCharacterTest.txt
- Converting and extending one or more test cases from the specification files into Rust test cases that can be run automatically.
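To illustrate the second objective, here is a minimal sketch of how a data line from a BidiTest.txt-style file might be parsed before being turned into a Rust test. It assumes the layout described in that file's header: data lines of the form `<input classes>; <bitset>`, where the bitset selects paragraph directions (1 = auto-LTR, 2 = LTR, 4 = RTL), and lines starting with `#` or `@` are comments or directives. The names `TestLine` and `parse_data_line` are hypothetical and are not part of unicode-bidi.

```rust
/// One data line of a BidiTest.txt-style file (hypothetical type):
/// the input Bidi classes plus the paragraph directions it applies to.
#[derive(Debug, PartialEq)]
pub struct TestLine {
    pub classes: Vec<String>,
    pub auto_ltr: bool, // bit 1 of the bitset
    pub ltr: bool,      // bit 2 of the bitset
    pub rtl: bool,      // bit 4 of the bitset
}

/// Parses a data line such as "L LRE R; 7".
/// Returns None for blank lines, "#" comments, "@" directive lines,
/// and anything that does not match the assumed layout.
pub fn parse_data_line(line: &str) -> Option<TestLine> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') || line.starts_with('@') {
        return None;
    }
    // Split into "<input classes>" and "<bitset>" at the first semicolon.
    let mut parts = line.splitn(2, ';');
    let input = parts.next()?;
    let bitset = parts.next()?.trim().parse::<u8>().ok()?;
    let classes: Vec<String> = input.split_whitespace().map(str::to_string).collect();
    if classes.is_empty() {
        return None;
    }
    Some(TestLine {
        classes,
        auto_ltr: (bitset & 1) != 0,
        ltr: (bitset & 2) != 0,
        rtl: (bitset & 4) != 0,
    })
}
```

Each parsed line could then be combined with the most recent `@Levels` and `@Reorder` directives to form one expected result per selected paragraph direction.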
Changes and Implementation
Initial Steps
The following steps were performed in more or less serial order:
- The working directory of the running Python process was changed. When the script is run from the directory containing it, the working directory is /tools/. It was changed to /src/, where the script could check for and modify existing files and download new ones.
- After changing the current directory, the predefined fetch() function was used to download and save the two files that make up the conformance test suite.
- Once the files were fetched, several test cases were added to test the conformance of the unicode-bidi implementation. The added test cases included:
- Several cases of line reordering
- Several cases where the RTL recognition was checked
- Several cases where the LTR recognition was checked
- All cases testing that characters are removed as required by rule X9
- All cases testing that characters not subject to removal under rule X9 are retained
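The X9 cases in the list above can be sketched as follows. Rule X9 of the Unicode Bidirectional Algorithm removes characters of the classes RLE, LRE, RLO, LRO, PDF and BN; everything else, including the isolate controls, is kept. The `BidiClass` enum below is a hypothetical subset defined only for this illustration, and `removed_by_x9` and `apply_x9` are not the functions used in unicode-bidi itself.

```rust
/// A hypothetical subset of Bidi character classes, enough for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BidiClass {
    L, R, AL, EN,
    LRE, RLE, LRO, RLO, PDF, BN,
    LRI, RLI, FSI, PDI,
}

/// Returns true if rule X9 removes characters of this class.
pub fn removed_by_x9(class: BidiClass) -> bool {
    use BidiClass::*;
    matches!(class, LRE | RLE | LRO | RLO | PDF | BN)
}

/// Applies X9 to a sequence of classes, keeping only the survivors.
pub fn apply_x9(classes: &[BidiClass]) -> Vec<BidiClass> {
    classes
        .iter()
        .copied()
        .filter(|c| !removed_by_x9(*c))
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;
    use super::BidiClass::*;

    #[test]
    fn x9_removes_explicit_formatting_and_bn() {
        for c in [LRE, RLE, LRO, RLO, PDF, BN] {
            assert!(removed_by_x9(c));
        }
    }

    #[test]
    fn x9_keeps_other_classes() {
        for c in [L, R, AL, EN, LRI, RLI, FSI, PDI] {
            assert!(!removed_by_x9(c));
        }
        assert_eq!(apply_x9(&[L, RLE, R, PDF, BN, EN]), vec![L, R, EN]);
    }
}
```

The two test functions mirror the two kinds of cases listed: one checks that the removable classes are removed, the other that the remaining classes are left in place.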
Forked Branch
The forked branch for our project can be found here.
Pull Request
The pull request can be found here.
Project Walk-through Video
The demonstration video of our project can be found here.
Mozilla Servo
Servo<ref>https://github.com/servo/servo</ref> is a web browser engine written in the Rust<ref>https://www.rust-lang.org/</ref> programming language. Servo is an experimental project optimized for new generations of hardware, particularly mobile devices, devices with multi-core processors, and those with high-performance GPUs. Its core design principles focus on optimizing power efficiency and maximizing parallelism.<ref>https://en.wikipedia.org/wiki/Servo_(layout_engine)</ref>
Rust
Rust is a multi-paradigm, compiled programming language developed by Mozilla Research. The syntax of Rust is similar to that of C and C++. Rust has a self-hosting compiler, rustc.<ref>https://en.wikipedia.org/wiki/Rust_(programming_language)</ref>
New projects can be created in Rust using Cargo, the package manager for Rust. Cargo also builds Rust code and manages its dependencies.<ref>http://siciarz.net/24-days-rust-cargo-and-cratesio/</ref>
A new Rust project can be created using the command:
$ cargo new project_name
Testing in Rust
Cargo will automatically generate a simple test when you make a new project. This new test function can be found in src/lib.rs. The #[test] attribute indicates that a given function is a test function.<ref>https://doc.rust-lang.org/book/testing.html</ref>
The standard format for test functions is:
#[test] fn test_method() { }
The tests can be run using the following command:
$ cargo test
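As a slightly fuller illustration of the test format, the sketch below shows a function and a test module as they might appear in src/lib.rs. The function `add_two` and the test names are hypothetical examples, not code from the project.

```rust
/// Hypothetical function under test.
pub fn add_two(n: i32) -> i32 {
    n + 2
}

// Compiled only when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn adds_two_to_positive() {
        assert_eq!(add_two(2), 4);
    }

    #[test]
    fn adds_two_to_negative() {
        assert_eq!(add_two(-5), -3);
    }
}
```

A test passes if the function returns without panicking. A failed assert_eq! panics and is reported as a failure by cargo test.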
Unicode Bidirectional Algorithm
The Unicode Standard prescribes a memory representation order for text, known as logical order. The order in which text is displayed can differ from it and is called the visual order.
When text is displayed horizontally, most scripts display characters from left to right. However, in scripts such as Arabic and Hebrew, the ordering is from right to left. These scripts also contain digits that are displayed from left to right, so the text is bidirectional in nature. In addition, such text may have embedded in it letters from scripts that are displayed from left to right.
To remove any ambiguities that may arise, the Unicode Bidirectional Algorithm provides a set of rules that a web browser uses to produce the correct order at display time.
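The step from logical to visual order can be sketched with the algorithm's reordering rule L2: from the highest embedding level down to the lowest odd level, reverse every contiguous run of characters at that level or higher. The sketch below assumes the embedding levels have already been resolved by the earlier rules and covers a single line only. The function names are hypothetical, and uppercase letters stand in for right-to-left characters to keep the example readable.

```rust
/// Applies rule L2 to one line: returns, for each visual position,
/// the logical index of the character shown there.
pub fn reorder_visual(levels: &[u8]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..levels.len()).collect();
    let max = match levels.iter().max() {
        Some(&m) => m,
        None => return order, // empty line
    };
    let min_odd = match levels.iter().filter(|&&l| l % 2 == 1).min() {
        Some(&m) => m,
        None => return order, // no right-to-left runs, nothing to reverse
    };
    // min_odd is at least 1, so `level` never underflows below zero.
    let mut level = max;
    while level >= min_odd {
        let mut i = 0;
        while i < order.len() {
            if levels[order[i]] >= level {
                // Find the end of this run and reverse it in place.
                let start = i;
                while i < order.len() && levels[order[i]] >= level {
                    i += 1;
                }
                order[start..i].reverse();
            } else {
                i += 1;
            }
        }
        level -= 1;
    }
    order
}

/// Rearranges `text` (in logical order) into visual order.
/// Panics if `levels` does not have one entry per character.
pub fn to_visual(text: &str, levels: &[u8]) -> String {
    let chars: Vec<char> = text.chars().collect();
    assert_eq!(chars.len(), levels.len());
    reorder_visual(levels).into_iter().map(|i| chars[i]).collect()
}
```

For example, with the logical text "abCDEf" and levels [0, 0, 1, 1, 1, 0], the run "CDE" is at an odd level and is reversed, so to_visual returns "abEDCf".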
External Links
References
<references/>