Programming languages have grammars, just like
ordinary languages, in which ambiguities can occur that make a computer
program - written in that language - do something that you do not want, causing mistakes. This type of error is very difficult to find and very expensive to
fix. Bas Basten, PhD student from the Centrum Wiskunde & Informatica (CWI) in
Amsterdam, designed new tools and techniques to find such double meanings. On 15
December he defends his PhD thesis ‘Ambiguity Detection for Programming Language Grammars’ at the University of Amsterdam. With the new CWI techniques grammars for
programming languages can be checked faster for ambiguities – sometimes even
thousands of times faster. Results from this study are useful for software
engineers, for example in the design of domain specific languages (DSLs).
PEMDAS,
or Mr. Van Dalen
In the past, children in the Netherlands learned
the (now obsolete) mnemonic ‘Mr. Van Dalen waits for an answer’ to remember the
calculation rule, which is in the USA commonly known as PEMDAS: Parentheses come before Exponents, which
precede Multiplication and Division, who in turn have priority
over Addition and Subtraction. The sum 1+1*3 equals, according to
the calculation rule, 4 but for someone who does not know these rules the
answer could also be (1+1)*3=6. If it is not clear for a computer
which operation it should handle first, confusion may occur between
programmer and computer, causing errors in the software. Therefore it is
important that all ambiguities are resolved before a language is put to use. Since
an infinite amount of combinations of operators such as + and * exist, it
frequently happens that authors of new programming languages overlook an
ambiguous combination in the design.
In his research Bas Basten first examined existing methods to detect ambiguities in grammars of programming languages. Then he developed his own, improved techniques. He designed a method and corresponding tool ‘AmbiDexter’, which first filters out parts of grammars that cannot induce ambiguities. This allows the remainder to be searched faster for double meanings. When an ambiguity is found, it is often very difficult to see what exactly is wrong. Therefore Basten and his colleagues also developed the expert system ‘Dr. Ambiguity’ to analyse the cause of the double meaning. Both tools are integrated in Rascal – the meta-programming language of CWI. They are freely available for users (open source). In a case study, several ambiguities were found in grammars of programming languages. AmbiDexter is one of the first useful detection systems for ambiguities in software, and at the moment it is the fastest. Benchmark studies have shown that the filtering method can be up to thousands of times faster than existing methods. This research is a good example of ‘Software as a Service’ – an important research area for CWI.
More information: http://homepages.cwi.nl/~basten/ambiguity/ and http://www.rascal-mpl.org/
Supervisor: Prof. Dr. Paul Klint (CWI and UvA), co-supervisor: Dr Jurgen J. Vinju (CWI).
PhD defence: 15 December 2011, Agnietenkapel, Oudezijds Voorburgwal 231, Amsterdam, 14:00h.
Picture: Ambigram, by Bas Basten (CWI).