Introduction to the DMS Software Reengineering Toolkit

The toolkit lets users define language grammars, from which it generates parsers that automatically construct abstract syntax trees (ASTs) and prettyprinters that convert original or modified ASTs back into compilable source text. The parse trees capture, and the prettyprinters regenerate, complete detail about the original source program, including source position, comments, and the radix and format of numbers, so that regenerated source text is as recognizable to a programmer as the original text, modulo any applied transformations.
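To make the idea of format-preserving regeneration concrete, here is a minimal sketch in Python. It is not DMS code and does not use any DMS API; the names `IntLiteral`, `parse_int_literal`, and `prettyprint` are hypothetical, and the sketch assumes Python-style `0x`/`0o`/`0b` prefixes. It only shows how a tree node might keep the original spelling and position of a number next to its value, so that an unchanged literal is printed back exactly and a changed one keeps its radix.

```python
from dataclasses import dataclass


@dataclass
class IntLiteral:
    """Hypothetical AST leaf: the value plus the lexical detail needed to regenerate it."""
    value: int   # semantic value, as an analysis or transformation would see it
    text: str    # original spelling, e.g. "0x1F"
    line: int    # source position of the literal
    column: int


def parse_int_literal(text: str, line: int, column: int) -> IntLiteral:
    # int(text, 0) infers the radix from a 0x / 0o / 0b prefix (assumed syntax).
    return IntLiteral(int(text, 0), text, line, column)


def prettyprint(node: IntLiteral) -> str:
    # Unchanged literal: reproduce the original spelling exactly.
    if int(node.text, 0) == node.value:
        return node.text
    # Changed literal: keep the radix the programmer originally chose.
    prefix = node.text[:2].lower()
    if prefix == "0x":
        return hex(node.value)
    if prefix == "0o":
        return oct(node.value)
    if prefix == "0b":
        return bin(node.value)
    return str(node.value)


lit = parse_int_literal("0x1F", line=3, column=7)
unchanged = prettyprint(lit)   # original spelling is preserved
lit.value += 1                 # stand-in for a transformation
changed = prettyprint(lit)     # still hexadecimal after the change
```

A real toolkit would record similar detail for comments, whitespace, and other literal formats; the sketch covers integer literals only.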

Many program analysis and transformation tools are limited to ASCII or Western European character sets such as ISO-8859; DMS can handle these as well as UTF-8, UTF-16, EBCDIC, Shift-JIS and a variety of Microsoft character encodings.
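As an illustration of why encoding support matters, the following sketch uses Python's standard codecs (not DMS itself) to show that the same source text has different byte representations in several of the encodings named above. The helper `decode_source` is hypothetical; the codec names `cp037` (one EBCDIC variant), `shift_jis`, `utf-16`, and `cp1252` (one Microsoft code page) are standard Python codec names chosen here as examples.

```python
def decode_source(raw: bytes, encoding: str) -> str:
    """Hypothetical front-end step: turn raw file bytes into text before lexing."""
    return raw.decode(encoding)


ascii_line = "DO 10 I = 1, N"
japanese_comment = "コメント"  # "comment" in katakana

# The same ASCII-range text round-trips through each encoding,
# but the underlying bytes differ (EBCDIC most visibly).
round_trips = {
    codec: decode_source(ascii_line.encode(codec), codec)
    for codec in ("ascii", "latin-1", "utf-8", "utf-16", "cp037", "cp1252")
}

ebcdic_a = "A".encode("cp037")   # a single byte that is not the ASCII code for "A"
sjis_bytes = japanese_comment.encode("shift_jis")
sjis_text = decode_source(sjis_bytes, "shift_jis")
```

A tool that assumes ASCII would misread the EBCDIC and Shift-JIS bytes above, which is the limitation the paragraph describes.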

DMS uses GLR parsing technology, enabling it to handle all practical context-free grammars. Semantic predicates extend this capability to interesting non-context-free constructs. For example, Fortran requires matching multiple DO loops that share a CONTINUE statement by label; GLR with semantic predicates enables the DMS Fortran parser to produce ASTs for correctly nested loops as it parses.
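The Fortran case can be illustrated with a small sketch. This is not GLR parsing and not how DMS implements it; it is a hand-written stack-based pass, in Python, that shows only the label-matching problem: one labelled statement may close several open DO loops at once. The function `nest_do_loops` and the `(label, text)` statement format are assumptions made for the example.

```python
def nest_do_loops(statements):
    """Nest Fortran-style DO loops that may share a terminating label.

    `statements` is a list of (label_or_None, text) pairs (hypothetical format).
    Returns a tree in which each loop is {"do": header, "body": [...]}.
    """
    root = []
    stack = [(None, root)]  # (terminal label of the open loop, its body list)
    for label, text in statements:
        words = text.split()
        if len(words) >= 2 and words[0] == "DO" and words[1].isdigit():
            loop = {"do": text, "body": []}
            stack[-1][1].append(loop)
            stack.append((words[1], loop["body"]))
            continue
        stack[-1][1].append(text)
        # A labelled statement terminates every open loop that names its label.
        while label is not None and stack[-1][0] == label:
            stack.pop()
    return root


program = [
    (None, "DO 10 I = 1, 3"),
    (None, "DO 10 J = 1, 3"),
    (None, "X = X + 1"),
    ("10", "CONTINUE"),
]
tree = nest_do_loops(program)
```

Here both loops end at label 10, so the single CONTINUE closes the inner and the outer loop, and the J loop ends up nested inside the I loop. A plain context-free grammar cannot express "close every loop whose label equals this one", which is why the paragraph above mentions semantic predicates.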

DMS provides attribute grammar evaluators for computing custom analyses over ASTs, such as metrics, with special support for symbol table construction. Other program facts can be extracted by built-in control- and data-flow analysis engines, local and global pointer analysis, whole-program call graph extraction, and symbolic range analysis by abstract interpretation.
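The general idea behind attribute evaluation can be sketched as a bottom-up computation over a tree. The Python below is a simplified illustration under stated assumptions, not the DMS attribute grammar facility: the `Node` class, the node kinds (`"if"`, `"while"`, `"decl"`), and the `evaluate` function are all hypothetical. Each node synthesizes two attributes from its children: a count of decision points (a simple metric) and the set of declared names (a crude stand-in for symbol table construction).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node for the sketch."""
    kind: str
    children: list = field(default_factory=list)
    name: str = ""


def evaluate(node: Node):
    """Return (decision_count, declared_names), synthesized from the children upward."""
    decisions = 1 if node.kind in {"if", "while"} else 0
    declared = {node.name} if node.kind == "decl" else set()
    for child in node.children:
        child_decisions, child_declared = evaluate(child)
        decisions += child_decisions
        declared |= child_declared
    return decisions, declared


ast = Node("block", [
    Node("decl", name="x"),
    Node("if", [
        Node("while", [Node("decl", name="y")]),
    ]),
])
decision_count, declared_names = evaluate(ast)
```

A full attribute grammar would also support inherited attributes (values passed down from parents, such as the enclosing scope), which this sketch omits.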