CSC/ECE 517 Fall 2015 M1505 Add conformance tests to unicode-bidi and fix conformance bugs

From Expertiza_Wiki
Revision as of 07:51, 13 November 2015 by Ssharm17

Introduction

Web browsers are expected to support international text, and Servo is no exception. This project aims to improve unicode-bidi, an existing library that implements the Unicode Bidirectional Algorithm (UBA) for displaying mixed right-to-left and left-to-right text but has not yet achieved full conformance with the specification<ref name="servo">http://en.wikipedia.org/wiki/Servo_%28layout_engine%29</ref>.

Servo

Servo is an experimental project to build a Web browser engine for a new generation of hardware: mobile devices, multi-core processors and high-performance GPUs. With Servo, we are rethinking the browser at every level of the technology stack — from input parsing to page layout to graphics rendering — to optimize for power efficiency and maximum parallelism. Servo builds on top of Rust to provide a secure and reliable foundation. Memory safety at the core of the platform ensures a high degree of assurance in the browser’s trusted computing base. Rust’s lightweight task mechanism also promises to allow fine-grained isolation between browser components, such as tabs and extensions, without the need for expensive runtime protection schemes, like operating system process isolation<ref name = "servo"/>.

Rust

Rust is a new programming language for developing reliable and efficient systems. It is designed to support concurrency and parallelism in building platforms that take full advantage of modern hardware. Its static type system is safe and expressive, and it provides strong guarantees about isolation, concurrent execution and memory safety. Rust combines powerful and flexible modern programming constructs with a clear performance model to make program efficiency predictable and manageable. One important way it achieves this is by allowing fine-grained control over memory allocation through contiguous records and stack allocation. This control is balanced with the absolute requirement of safety: Rust’s type system and runtime guarantee the absence of data races, buffer overflow, stack overflow or access to uninitialized or deallocated memory<ref>http://www.rust-lang.org/</ref>.

Architecture

  • generate.py - This script is central to the code base: it fetches the test data files BidiTest.txt<ref name="biditest">http://www.unicode.org/Public/UNIDATA/BidiTest.txt</ref> and BidiCharacterTest.txt<ref name="bidichartest">http://www.unicode.org/Public/UNIDATA/BidiCharacterTest.txt</ref>.
  • BidiTest.txt<ref name="biditest"/> - Contains conformance test cases expressed as sequences of bidi character classes.
  • BidiCharacterTest.txt<ref name="bidichartest"/> - Contains conformance test cases expressed as sequences of explicit code points.
  • Cargo - Rust's build tool and package manager, used to run the tests with cargo test.
  • lib.rs - Rust source file that contains the Unicode Bidirectional Algorithm implementation and its tests.
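
To make the shape of the character-level test data concrete, here is a minimal Python sketch that parses one line in the BidiCharacterTest.txt format (semicolon-separated fields: code points, paragraph direction, resolved paragraph level, resolved levels with "x" for characters removed by rule X9, and visual order). The function name parse_character_test_line and the sample line are our own illustrations and are not taken from generate.py or copied from the data file.

```python
def parse_character_test_line(line):
    """Parse one data line; return None for comments and blank lines."""
    # Everything after '#' is a comment in the Unicode data files.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = line.split(";")
    return {
        # Field 0: space-separated hexadecimal code points.
        "code_points": [int(cp, 16) for cp in fields[0].split()],
        # Field 1: paragraph direction (0 = LTR, 1 = RTL, 2 = auto).
        "para_direction": int(fields[1]),
        # Field 2: resolved paragraph embedding level.
        "para_level": int(fields[2]),
        # Field 3: resolved levels; 'x' marks a character removed by X9.
        "levels": [None if lv == "x" else int(lv) for lv in fields[3].split()],
        # Field 4: visual order of the remaining characters, left to right.
        "order": [int(i) for i in fields[4].split()],
    }


# Hypothetical sample line: 'a' followed by an unmatched PDF (U+202C).
case = parse_character_test_line("0061 202C;0;0;0 x;0")
```

Each parsed case carries everything a generated test needs: the input text, the paragraph direction, and the expected levels and ordering.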

Project Description

Project Implementation Flowchart

Here is a flowchart representing the sequence of activities that make up the logical flow of this project.

To bring the library into conformance, the following changes have been planned:

  • Add methods to tools/generate.py that automatically convert the tests in the conformance suite into Rust tests that can be run automatically [1]
  • Implement the missing step L1 from the UBA [2]
  • Implement the missing step N0 from the UBA [3]
  • Solve the conformance problems related to the implementation of steps W1 to W7 [4]
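
As an illustration of one of the planned steps, rule L1 resets segment separators, paragraph separators, and any whitespace or isolate formatting characters that precede them or end the line back to the paragraph embedding level. The following is a minimal sketch in Python (the language of generate.py, not the Rust of lib.rs), assuming the original bidi classes and resolved levels of one line are already available as lists. The name apply_rule_l1 is hypothetical, and the special handling of characters removed by X9 is left out for brevity.

```python
# Classes that rule L1 treats as whitespace-like (assumption: class names
# are given as the short strings used in the Unicode data files).
WHITESPACE_LIKE = {"WS", "FSI", "LRI", "RLI", "PDI"}
# Segment separator (S) and paragraph separator (B).
SEPARATORS = {"S", "B"}


def apply_rule_l1(classes, levels, para_level):
    """Return a copy of `levels` with rule L1 applied to a single line."""
    result = list(levels)
    run_start = 0  # start of the current run of whitespace-like characters
    for i, cls in enumerate(classes):
        if cls in SEPARATORS:
            # Reset the separator and the whitespace run that precedes it.
            for j in range(run_start, i + 1):
                result[j] = para_level
            run_start = i + 1
        elif cls not in WHITESPACE_LIKE:
            # Any other character ends the current whitespace run.
            run_start = i + 1
    # Reset the whitespace run at the end of the line, if any.
    for j in range(run_start, len(result)):
        result[j] = para_level
    return result


# Hypothetical input: R, space, segment separator, R, two trailing spaces.
l1_result = apply_rule_l1(
    ["R", "WS", "S", "R", "WS", "WS"], [1, 1, 1, 1, 1, 1], 0
)
```

The rule looks at the original character classes rather than the classes produced by the weak and neutral resolution steps, which is why the sketch takes them as a separate input.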

Design Principles

Two of our proposed design principles are:

1. Open-Closed principle:

We will be adding code to generate.py to convert the test cases in BidiTest.txt and BidiCharacterTest.txt into Rust test cases. However, we won't be changing existing code in the generate.py file.

2. Single Responsibility Principle:

The code to be implemented in generate.py and lib.rs will have separate, single responsibilities. generate.py deals with fetching files and loading and unloading data, whereas lib.rs deals with actually testing the existing methods and extending the functionality of the Unicode-Bidi algorithm.
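
The conversion of a conformance case into a Rust test case, added alongside the existing generator code, could look roughly like the Python sketch below. This is only an illustration under our own assumptions: rust_test_from_case is a hypothetical function name, and resolve_levels in the emitted Rust source is a placeholder, not the real API of the unicode-bidi crate.

```python
def rust_test_from_case(index, code_points, para_level, expected_levels):
    """Render one conformance case as the source text of a Rust test."""
    # Rust string literals accept \u{...} escapes for any code point.
    text = "".join("\\u{%x}" % cp for cp in code_points)
    # None stands for a character removed by rule X9 ('x' in the data file).
    levels = ", ".join(
        "None" if lv is None else "Some(%d)" % lv for lv in expected_levels
    )
    return (
        "#[test]\n"
        "fn bidi_character_test_%d() {\n"
        "    // `resolve_levels` is a hypothetical helper, not the crate's API.\n"
        "    let levels = resolve_levels(\"%s\", %d);\n"
        "    assert_eq!(levels, vec![%s]);\n"
        "}\n"
    ) % (index, text, para_level, levels)


# Hypothetical case: 'a' followed by HEBREW LETTER ALEF in an LTR paragraph.
rust_source = rust_test_from_case(1, [0x61, 0x5D0], 0, [0, 1])
```

Because the new function only produces text, it can be added to generate.py without modifying the code that fetches and loads the data files.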

Design Pattern

The proposed design pattern for this project is the Command Pattern. The Command Pattern uses an object to encapsulate all information needed to perform an action or trigger an event at a later time. <ref>https://en.wikipedia.org/wiki/Command_pattern</ref>

The five terms associated with this pattern are:

  • Command Object: A command object knows about the receiver and invokes a method of the receiver.
  • Command: Values for the parameters of the receiver method are stored in the command.
  • Receiver: Does the work after receiving the command.
  • Invoker: An invoker object knows how to execute a command, and optionally does bookkeeping about the command execution.
  • Client: The client decides which commands to execute at which points.

In this project the terms given above can be interpreted as:

  • Command Object: generate.py
  • Command: BidiCharacterTest.txt/ BidiTest.txt
  • Receiver: lib.rs
  • Invoker: Cargo Test
  • Client: Cargo
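
The roles above can be illustrated with a generic, minimal Command Pattern sketch in Python. All class names here (Receiver, CheckCommand, Invoker) are hypothetical and chosen only for illustration; they do not correspond to actual code in generate.py or lib.rs.

```python
class Receiver:
    """Does the actual work once a command reaches it."""

    def check(self, text):
        return "checked " + text


class CheckCommand:
    """Stores the receiver and the parameter values for a later call."""

    def __init__(self, receiver, text):
        self.receiver = receiver
        self.text = text

    def execute(self):
        return self.receiver.check(self.text)


class Invoker:
    """Executes commands and keeps a history (the optional bookkeeping)."""

    def __init__(self):
        self.history = []

    def run(self, command):
        result = command.execute()
        self.history.append(result)
        return result


# The client decides which commands to execute and in what order.
receiver = Receiver()
invoker = Invoker()
results = [
    invoker.run(CheckCommand(receiver, name)) for name in ("case 1", "case 2")
]
```

The invoker never needs to know what a command does, only that it can be executed, which is the decoupling the pattern is meant to provide.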

UML Diagrams

Class Diagram

Here is a class diagram representing the different classes involved and their mutual interaction.

Test Cases

The project involved adding test cases generated from BidiCharacterTest.txt and BidiTest.txt to ensure that the implementation of the unicode-bidi algorithm always conforms to the specifications defined in the Unicode Bidirectional Algorithm. However, as part of the initial steps, we added a few manual test cases that check for conformance to some of the major steps. Here are some of those test cases:

  • Check for LTR by passing the level number
  • Check for RTL by passing the level number
  • Check for removal of characters according to the Rule X9 of the algorithm
  • Check for non removal of characters according to the Rule X9 of the algorithm
  • Check for reordering of characters in accordance with the following types of characters:
    • Weak LTR
    • Strong LTR
    • Strong RTL
    • Neutral characters
    • RTL (explicit right-to-left) markers (a failing test case: the steps that handle these markers had not been implemented at that point in time)
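
Two of the checks above can be sketched in Python for illustration; the project's own tests are Rust tests in lib.rs, and the function names remove_x9 and reorder_visual are hypothetical. The first function drops the characters that rule X9 removes (classes RLE, LRE, RLO, LRO, PDF and BN), using the standard library's unicodedata module to look up bidi classes. The second applies rule L2, reversing runs from the highest level down to the lowest odd level, to turn resolved levels into a visual order.

```python
import unicodedata

# Bidi classes removed by rule X9.
X9_REMOVED = {"RLE", "LRE", "RLO", "LRO", "PDF", "BN"}


def remove_x9(text):
    """Return `text` without the characters that rule X9 removes."""
    return "".join(
        ch for ch in text if unicodedata.bidirectional(ch) not in X9_REMOVED
    )


def reorder_visual(levels):
    """Return logical indices in visual (left-to-right) order, per rule L2."""
    order = list(range(len(levels)))
    odd_levels = [lv for lv in levels if lv % 2 == 1]
    if not odd_levels:
        return order  # purely left-to-right: nothing to reverse
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(order):
            if levels[order[i]] >= level:
                # Find the end of the run at this level or higher.
                j = i
                while j < len(order) and levels[order[j]] >= level:
                    j += 1
                order[i:j] = reversed(order[i:j])
                i = j
            else:
                i += 1
    return order


# RLE (U+202B) and PDF (U+202C) are removed; LRM (U+200E) is kept.
stripped = remove_x9("a\u202bb\u202cc")
kept = remove_x9("a\u200eb")
# Three RTL characters embedded in LTR text are reversed in place.
visual = reorder_visual([0, 0, 1, 1, 1, 0])
```

A removal check and a non-removal check together guard against an X9 implementation that strips too much, such as directional marks that belong to the strong classes.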

Video

<TBA>

References

<references/>