Assignment 3 2024

Create a program that reduces lambda expressions to ground form.

There is an important note at the end of this rather long description.

I have provided a skeleton program so that you will not have to spend a lot of time on the less educational parts. It takes too much time to put together something that will read and write lambda expressions correctly and provide a bit of help with debugging. It is in Python and can be found here. You do not need to go through all the details, but you will need to look a bit in order to see how things are represented.

The program in this form is runnable, and will do everything except reductions. If you type in a lambda expression it will simply print it back out for you along with a reminder of the base method you need to implement. That method is in there, but only as a stub. It does nothing but give you that reminder. Everything is implemented as classes, the most public-facing one is called everything. If the constructor isn't given any parameters it will just read things from standard input and process them appropriately. If it is given a parameter, it must be a file name, and it will read and process that file, switching back to stgandard input at the end. The syntax used for lambda expressions is exactly what we used in class. It even accepts proper unicode λ lambdas. Those are not convenient to type, so you can use # instead. On output lambdas appear in their proper Greek form. Idle handles that without any trouble, but if it doesn't work for you just search for \u03BB in the program (it appears four times) and replace it with something else (anything other than # is liable to cause confusion).

By default the program just accepts expressions you type, and reduces them to ground form. After every beta reduction the entire expression is printed.

Variable names must begin with a lower case letter and contain only letters and decimal digits.

Comments are only accepted in the // whatever C++ style.

You can define macros, their names must begin with capital letters and the syntax is e.g.
        Inc = #n.#g.#y.((n g) (g y))
The point of the program accepting files is to let you fill a file with definitions to use in testing. It automatically checks to make sure there are no circular definitions, it is simply a matter of requiring that every macro must be defined before it appears anywhere. If you change the definition of a macro, any expression already using that macro will stay exactly as it is, with the old definition embedded in it.

There are a few very basic commands to help a little bit, their names begin with a colon: The program understands numbers. For example N17 will be automatically replaced by the standard lambda expression representation of the number 17. Numbers are also recognised in final results. If the result has the form of a number you are told which number it is. False is indistinguishable from 0, so don't let that trick you.

A very small sample session:
      import lamredinc as p
      e = p.everything()
      e.run()
      > (#a.(b a) (x x))
      reduce (λa.(b a) (x x))
      You need to implement reduce_leftmost_reducible
      -> (λa.(b a) (x x))
      > N3
      reduce λf.λx.(f (f (f x)))
      You need to implement reduce_leftmost_reducible
      -> λf.λx.(f (f (f x)))
      [which represents 3]
      > :exit


The normal prompt is "> ", but if any brackets have not been matched when you press enter, the pronpt will change to ": " to make it clear that more input is expected.

Some of classes in the Python code that I have supplied are things that you are not going to need to do anything with, but knowing what they are up to will help you to get things done. Two of the classes will be vital, but are not that complicated.

char_reader is very low level stuff that handles the drudge work involved in reading typed lambda expressions: ignoring whitespace and comments, getting the characters one at a time without having to worry about line breaks, being able to un-read a character, etc. read_lexeme reads a single symbol (possibly subscripted variabe name, lambda or # sign, dots and brackets, etc) and returns it as a string. Commands that need more information than their own names, such as :limit and :show, use this method, as does expr_reader which comes next.

expr_reader does what the name says. It makes use of char_reader to read entire lambda expressions and build very simple syntax trees that represent them in a way that the rest of the program can use conveniently. It is the read method that makes all that happen. You don't even need to look at read, but of course you can if you want. Changing anything in it (except perhaps for replacing the lambda symbols) would probably be a mistake. The read_expr_or_def method might be worth a slight mention as it is the only one that the rest of the program uses. When it is called, it will read a whole "thing" from the input. If it is a lambda expression it returns the tree representation as an object of type expr. If it is a command it returns a one-tuple containing the command string. If it is a macro definition it returns a list containing just the macro name and the lambda expression that it will represent. The off mix of types just makes it really easy to see what you've got.

expr is the most important to understand as it represents the lambda expressions that you will be working on. Every expr has a "type" attribute, it is 'V' for just a variable, 'A' for an application (X Y), and 'F' for a so-called function, λx.Y. The constructor is a bit odd looking but really tells you all about the structure. Calling the constructor expr(string v, int n) returns a V-type expr to represent variable v with substript n, they go in the "name" and "vers" attributes (vers for version). The constructor expr(expr A, expr B) returns an A-type expr, A and B are the already created expr objects for the function (attribute "func") and parameter (attribute "param") of an application. expr("#", expr V, expr B) creates an F-type expr representing λv.B, V must be a V-type expr for the bound variable (attribute "var") and B can be any expr object representing the body of the function (attribute "body". The expression λx.λy.(x (y λk.(x k))) would be built like this:
        
The method copy returns a deep copy of the expression, and the method change_to completely overwrites one expr with another. Be careful when accessing an expr, different types have different attributes, e.g. only a type F has a "var"attribute.

debug_printer is for when something has gone wrong. If you have constructed an expr incorrectly, the normal printing method won't be able to give you the information needed to see exactly what you have built. When that happens, just create a debug_printer object with the expr as its constructor's parameter, it will immediately print the whole expr tree with every detail displayed, using indentation to show the structure. It will detect any circular structures (an expr that through pointers to other exprs eventually leads back to itself) and other problems and tell you about them. A debug_printer object has an attribute called bad. After printing bad will be True if any problems were detected, and False if not. This is what the output looks like when the debug_printer is created from the input expr after a normal print(inp) immediately after inp has been read:
(#y.(y (y y)) #x.(x a))

expr = [0001], type A: 
   func =  [0002], type F: 
      var =  [0003], type V: y/0
      body = [0004], type A: 
         func =  [0005], type V: y/0
         param = [0006], type A: 
            func =  [0007], type V: y/0
            param = [0008], type V: y/0
   param = [0009], type F: 
      var =  [000A], type V: x/0
      body = [000B], type A: 
         func =  [000C], type V: x/0
         param = [000D], type V: a/0
Every expr is printed on its own line. A variable's name and version are on the same line, but any sub-exprs are on their own lines. The hexadecimal number in square brakcets is a unique identifier for each expr object so you can see when/if there is any structure sharing.

everything is where the tricky stuff happens, and where you will be adding youe reduce_leftmost_reducible method. The attribute "defs" is a dictionary mapping macro names to their exprs, and "versions" is a dictionary mapping variable names to the maximum subscript that variable has appeared with so far.
     note_variable_version takes a type V expr and makes sure that its version is correctly recorded in the versions dictionary.
     note_variable_versions takes a whole expr and appllies note_variable_version to every variable that appears anywhere within it. This is also useful because it shows you exactly how to run through a whole expr tree recursively if you haven't already worked that out. The test of name[0].isupper() is to stop macro names from being taken as variables.
     next_variable_version takes a type V expr and returns a new type V expr for a variable with the same name but a guaranteed unique version number.
     make_number, same_var, and recopgnise_number do not need to be looked at, they just convert e.g. N4 to λf.λx.(f (f (f (f x)))) and recognies λf.λx.(f (f (f (f x)))) as the number 4.
     replace_macros is another handy one. You won't need it but it shows a slightly more complex run through a whole expr tree than the previous one. It directly replaces every macro name is an expr with its defining expr. It is applied to every expr that appears in a macro definition and every expr that is ever reduced, so you will never see a macro when reducing.
     reduce_leftmost_reducible is your bit. It must search through an expr tree to find the leftmost β-reducible type 'A' expr and perform that reduction on it. You must make a considered decision: do you actually overwrite the 'A' expr with the result of the reduction using expr.change_to, or do you rebuild all parts of the tree that led to that reduction (as in the AFL function that inserts a value into a binary tree)? The decision has consequences. Each macro's defining expr is shared by all exprs that use that macro. The don't get a copy of it, they just get a pointer to it. If a reduction results in something that came from a macro being changed, and it will, that will result in other unrelated exprs being changed too, and that is wrong. You could change that and have macro substitutions make deep copies instead, but that alone is not enough, a βreduction can also result in the "parameter" appearing multiple times in the result, and automatically copying everything will use up a lot of memory.

     I'm sure nobody needs to be told that run is just the interactive loop. It obeys commands, records macro definitions, and reduces lambda expressions. Note that its call of reduce_leftmost_reducible forces one implementation decision on you: it must return the whole (possibly modified) input expr or None if the expression is in ground form. Right at the end of run, the call self.rdr.clear() is what tells the reading system that an expression has been processed so the next prompt will be "> " instead of ": ".

Here is the important note. In writing this I noticed that there is one tiny little thing that I'm pretty sure I never mentioned during our coverage of lambda expressions. The exactly correct, good-for-all-occasions rule for performing β-reductions. Nothing that we encountered had any need of the slightly different method, but such things do come up and need to be considered. And fortunately the exact method is easier to implement than the one we used.
     I have updated the "Lambda Expressions" pdf link in class 12 on the web site. The new bit starts near the end of the third page with "Those rules are good enough ...", and ends almost at the bottom of the fourth page with "instead of ns".
     It simplifies things for you because there are no restrictions on when a β-reduction can be performed, the need to do α-reductions if completely handled by the new definition of the rewriting operator {...}[.../...], and also explains the need for (and helpfulness of) variables having version numbers or subscripts.