Thursday, 4 March 2010

Live Maths Two

-->

Live Maths Two

/ LiveMaths

What Next?
Parser
Syntax
Simple Formulas
Logo-ish
Function Definitions
Sum and Product expressions
Subscripted Variables
Overloading of Operators
Interlude

What Next?

The choice is between the interesting and complicated things like ranges and subscripts on the one hand and the dull but important things like formatting on the other.

The dull might be a better choice because both it and the more interesting things both require improvements to the parser.

Parser

The parser has several purposes:

  • Converts the formula into executable source code to be interpreted by OpenOfficeBasic (or BeanShell or JavaScript).
  • Discover the type of the equation: definition or output so that we know what to do with the value,
  • Preserve formatting so that the user doesn't complain,
  • Track dimensionality.

Syntax

What is the syntax? Is it regular and can we take advantage of that regularity in our parser? How much conversion do we really need to do? How do we report to the user that the formula is invalid or simply unsupported?

The help for OpenOfficeMath is not very good and doesn't give a clear and thorough description of the syntax so a web search and inspection will have to be used.

We will have to define the syntax so that we can create a simple hand coded parser. See LiveMathsSyntax.

Simple Formulas

A simple formula using only operators and function calls that have direct one to one correspondence with operators and functions in OpenOfficeBasic can be used without rearrangement by making the following replacements:

  • cdot, odot, times must be replaced an asterisk,
  • over, odivide, div by a slash,
  • size n by an empty string (n is any number).
  • nitalic, nbold, newline, color colour-name by an empty string.
  • ominus by dash,
  • neg by not,
  • neq by <>,
  • geslant by >=,
  • leslant by <=,
  • func e^ by exp,
  • [ and { with (,
  • ] and } with ),
  • left( with (,
  • right) with ).

In addition we must strip brackets from the left hand side and from the expression as a whole. Superfluous brackets may remain on the right hand side without causing any harm.

Logo-ish

The functions in OpenOfficeMath look a little like Logo in that they have fixed arity and hence do not need the argument list to be enclosed in brackets. In fact a bracketed group would be seen as a single argument.

The curly brackets enclose text that is to be treated as a single entity by the functions or operators.

This suggests that we could have a list of arities and a simple tokenizer. We read the text one token at a time and when we encounter a token that represents a function that we can handle we can then insert a round left bracket, count forward as many tokens as expected (treating curly bracketed groups as a single token) inserting commas as we go and then insert the closing bracket.

But unfortunately not all have fixed arity.

This:

 sum from {i=1} to {i=10} x_{i}  
and this:
 sum x_{i}  

are both legal.

We can console ourselves by saying that the second is not legal for us because we cannot execute it even if we did parse it; that is, we insist that the arity of the sum function is three. Yes, three not five because from and to are also functions, each of one argument.

In practice we might need to treat many such things specially.

Function Definitions

Function definitions compile to the function stream.

For instance: f(x) := x+1

Compiles as:

private function f(x_) as variant

dim x_local x_local = x ' copy global so we can shadow it x = x_ ' make argument global so that RHS can see it even if out of lexical scope.

f = x+1 ' reinstate global x = x_local

end function

Sum and Product expressions

Sum and product compile like this:

sum from {i=1} to{i=10} {i^3}

becomes simply sum_1 in the eval stream.

And this function is defined on the function stream.

private function sum_1 as variant

for i = 1 to 10 sum_1 = sum_1 + i^3 next i

end function

Product will be the same except for the names and operator.

Note that the digit after the function name is auto-generated to ensure that multiple sum expressions generate distinct functions.

Note also that there is a potential problem. Examine these formulas:

z := 2 f(x) := x+i a := sum from {i=1} to{i=10} {f(z) times i^3}

Which i does the function f refer to?

As described so far it will be the global i but that doesn't actually exist.

We could fix it by doing the same shadowing trick inside sum as we did inside the function definition:

private function sum_1 as variant
  dim i_save
  i_save = global.i ' or whatever the syntax is
  for i = 1 to 10
    sum_1 = sum_1 + i^3
  next i
  global.i = i_save
end function

Note that this time we cannot rename i because we don't know what refers to it. Ideally we would like to rename all variables and functions to avoid name clashes with other globals. As all this is supposed to be in a single module (if it fits in 64K) we don't really need globalscope but there appears to not be an equivalent way of referring to the module scope. To fix this we can generate subs that do it for us:

private function getmodule_i
  getmodule_i = i
end function

private sub setmodule_i(i_)
  i = i_
end function

Then the sum function looks like this:

private function sum_1 as variant
  dim i_save
  i_save = getmodule_i
  dim i as long
  for i = 1 to 10
    sum_1 = sum_1 + i^3
  next i
  setmodule_i i_save
end function

Subscripted Variables

Subscripted variables on right hand side compile to array dereferences:

x_i

compiles to:

x(i)

on the eval stream.

x_{i+1}

compiles to:

x(i+1)

On the left hand side it's more complicated.

Overloading of Operators

An unmentioned problem is that we might write formulas like this:

i := 1 .. 10
a := 2
x := i times 2

In this case i is a vector and a is a scalar. The usual interpretation is that x is also a vector produced by multiplying each element of i by a.

The obvious way to do this is to overload the operators but as we are using OpenOfficeBasic we don't have that luxury. An alternative would be to call a function instead so that we can check the data types at run time and then call the appropriate routine.

The problem with this is that the parser must be more sophisticated because we must reorder the expression.

We cannot avoid dealing with this because none of the target languages support operator overloading so we must create generic functions to be used instead of the operators. This means that we must re-order the source from infix to functional form.

Interlude

Time passes ...

After many small iterations I have a working version that parses quite a useful subset of OpenOfficeMath syntax:

  • matrices,
  • sum and product expressions,
  • subscripted variables (both array and symbolic).

It generates OpenOfficeBasic code for all of the above but cannot yet refer to array elements.

See LiveMathsJan2008.

No comments:

Post a Comment

Blog Archive

Followers