In a previous post, I looked at how I build Ramble and an example of a build a calculator. Since then, I have changed how the parser works and I thought it would be a good opportunity to demonstrate something you could do as a weekend hack in Ramble. Here I will take you through how to write a parser for a subset for a PL/0 language. I would probably need to spend a little more time to get the whole language working.

Ramble library decisions

Compared with the previous post there are some functionality which have been remove or refactored. There has been many functions renamed to text variants, rather than the use of symbols. Those changes were rather cosmetic and will not be covered here. The two notable ones are below.

Removal of do function

The do function was removed. This is because it was cumbersome and restrictive in the R language (though it definitely made sense in Haskell). The do function was also not required in implementing the expression parser, hence it only seemed to confuse people rather than assist in building readable parsers.

Changing the %then% function

The Unlist function, which is very similar to the R base unlist function has been added to the then function.

Unlist <- function(a.list) {
  hasLowerLevel = TRUE
  while(hasLowerLevel) {
    a.list1 <- unlist(a.list, recursive=FALSE, use.names=FALSE)
    if (is(a.list1, 'list')) {
      a.list <- a.list1
    }
    else {
      hasLowerLevel = FALSE
      return(a.list)
    }
  }
  warning("loop should have returned a list!")
  return(a.list)
}

How this function works is that it continues to unlist until the object is no longer a list. This function has not been optimised, and further work could be undertaken to improve it. The then function now looks like this:

then <- function(p1, p2) {
  return(function(string) {
    result <- p1 (string)
    if (length(result) == 0) {
      return (list())
    }
    else {
      result_ <- p2 (result$leftover)
      if (length(result_$leftover) == 0 || is.null(result_$leftover)) {return(list())}
      return(list(result=Unlist(append(list(result$result), list(result_$result))), leftover=result_$leftover))
    }
  })
}

This allows iterating over $results to be much easier when using the %using% function.

What we are implementing

The PL/0 language is fairly simple. Wikipedia even has the model languaged defined in EBNF for us already:

program = block "." .

block = [ "const" ident "=" number {"," ident "=" number} ";"]
        [ "var" ident {"," ident} ";"]
        { "procedure" ident ";" block ";" } statement .

statement = [ ident ":=" expression | "call" ident 
              | "?" ident | "!" expression 
              | "begin" statement {";" statement } "end" 
              | "if" condition "then" statement 
              | "while" condition "do" statement ].

condition = "odd" expression |
            expression ("="|"#"|"<"|"<="|">"|">=") expression .

expression = [ "+"|"-"] term { ("+"|"-") term}.

term = factor {("*"|"/") factor}.

factor = ident | number | "(" expression ")".

In my attempt, I have implemented everything except block, call and while functionality.

Firstly, the expression grammar and lower has already been completed in my previous post, so we will only consider how condition and statements are implemented.

condition

condition is implemented in a similar way, note that we can use %using% and then use R code to develop the actual condition.

condition <- (expr %then% (token(String("<="))
                           %alt% token(String(">="))
                           %alt% symbol("=") 
                           %alt% symbol("<")
                           %alt% symbol(">")) 
                   %then% expr
                   %using% function(bool) {
                     if (bool[[2]] == "<") {
                       try(bool1 <- as.numeric(bool[[1]]) < as.numeric(bool[3]), silent=TRUE)
                       return(if (is.na(bool1)) bool else bool1)
                     }
                     else if (bool[[2]] == ">") {
                       try(bool1 <- as.numeric(bool[[1]]) > as.numeric(bool[3]), silent=TRUE)
                       return(if (is.na(bool1)) bool else bool1)
                     }
                     else if (bool[[2]] == "=") {
                       try(bool1 <- as.numeric(bool[[1]]) == as.numeric(bool[3]), silent=TRUE)
                       return(if (is.na(bool1)) bool else bool1)
                     }
                     else if (bool[[2]] == "<=") {
                       try(bool1 <- as.numeric(bool[[1]]) <= as.numeric(bool[3]), silent=TRUE)
                       return(if (is.na(bool1)) bool else bool1)
                     }
                     else if (bool[[2]] == ">=") {
                       try(bool1 <- as.numeric(bool[[1]]) >= as.numeric(bool[3]), silent=TRUE)
                       return(if (is.na(bool1)) bool else bool1)
                     }
                     else {
                       warning("boolean symbol should have matched, please check PL\\0 implementation")
                       return(bool)
                     }
                   })

We can use try here in a similar way to try, except in Python. Which is helpful if the expression cannot be evaluated. We can also make using of warning which can raise helpful information in the console when running the parser.

statement

The statement parser is created in much the same way. The only twist is the use of environment to assign and get the variables. We can use environments to “save” procedures once it gets implemented.

statement <- (((identifier() %then% token(String(":=")) %then% expr)
               %using% function(stateVar) {
                 if (stateVar[[2]] == ":=") {
                   assign(stateVar[[1]], stateVar[[3]], env=PL0)
                 }
                 return(stateVar)
               })
            %alt% (symbol("!") %then% identifier() 
                    %using% function(stateVar) {
                      # this calls a defined function (procedure)
                      print(get(stateVar[[2]], envir = PL0))
                      return(stateVar)
                    })
            %alt% (token(String("if")) %then% condition %then% token(String("then")) 
                    %then% statement %using% function(x) {
                      if(x[[2]]) {
                        return(x[[4]])
                      }
                      else {
                        return(x)
                      }
                    })
             %alt% (token(String("begin")) %then% (statement %then% many(symbol(";") %then% statement))
                    %then% token(String("end")))
             # call not implemented
             # while loop not implemented
            )

From its usage we will be able to understand better how this actually works.

For example, to assign 1 to x, increment x by 2, and print x we can write:

> invisible(statement("begin x := 1; x := x+2; ! x end"))
[1] 3