Skip to content

Instantly share code, notes, and snippets.

@kmicinski
Last active March 18, 2025 19:49
Show Gist options
  • Save kmicinski/301f552d917840d286f15d2269b25b88 to your computer and use it in GitHub Desktop.
Save kmicinski/301f552d917840d286f15d2269b25b88 to your computer and use it in GitHub Desktop.
#lang racket
;; CIS 352 -- Spring '25
;; March 18, 2025
;; This week in class:
;; [ ] λ-calculus refresher
;; [ ] λ-calculus β reduction motivation
;; [ ] Extending the λ-calculus with numbers
;; [ ] Building an interpreter with closures
;; and builtins.
;; [ ] Normal form and reduction to normal form
;; [ ] Reduction strategies (CBN, CBV)
;; [ ] Church-Rosser
;;
;; Part 0: λ-calculus refresher
;;
;; Before the break, we introduced the λ-calculus
;;
;; The idea in the λ-calculus is to work with a tiny
;; core language consisting only of functions. The
;; language has the three forms:
;; e ::= x -- Variable
;; | (λ (x) e) -- λ-abstraction
;; | (e e) -- application
;;
;; In elementary mathematics we have a notion of "plug
;; and chug." So if you define a function
;; f(x) = x^2 + 3*x
;; Then if you see the formula f(z+2) you could ascertain
;;
;; f(z+2) -- start with this
;; = (z+2)^2 + 3*(z+2) -- apply definition of f
;; = z^2 + 4*z + 4 + 3*z + 6 -- expand / distribute
;; = z^2 + 7*z + 10 -- simplify to polynomial
;;
;; We can see the λ-calculus as forcing us to abstract
;; this operation into a λ-function:
(define f (lambda (x) (+ (* x x) (* 3 x))))
;; Exercise:
;; Let's say f(x,y) = x+y * g(y,x)
;; What is f(a+b)?
;; λ-calculus β reduction motivation
;;
;; We all intuitively know how to operate with terms
;; like f(z+2)--we even know how to do it symbolically.
;; The λ-calculus asks: what if we take this "plugging
;; in" operation (calling a function) as *the* singular
;; primitive operation in the language.
;;
;; The Church-Turing thesis is that any computable
;; function over the naturals is computable iff there
;; exists some Turing machine *or* λ-calculus term
;; encoding it.
;; Definition: convertability relations
;;
;; In the λ-calculus, the "meaning" of a term--or
;; the "behavior" of a term (its semantics), is
;; specified by all of the ways in which we could
;; *convert* the term
;;
;; The standard λ-calculus includes three types of
;; convertability relations, which we can think
;; of as "rewrite rules:" they tell us how to
;; make progress on the term.
;;
;; There are three standard conversions we discuss
;; in the λ-calculus:
;; [ ] β-reduction with capture-avoiding substitution
;; [ ] α-conversion (we'll cover today)
;; [ ] η-reduction/expansion (we'll cover soon)
;;
;; So now the observation is this: in our language up
;; until now, we've been a bit intuitive about the
;; definition of things like *, +, 3, etc... But
;; the λ-calculus asks the question: what if we focus
;; on the aspects of the computation that are relevant
;; only to *calling functions*?
;;
;; The thesis of the λ-calculus is that function calls
;; should be the focus of our computation. In fact,
;; in the pure λ-calculus, we ignore numbers!
;; For example, the following is *not* a valid
;; λ-calculus term:
(λ (x) 5)
;; nor are...
(λ (x) (+ x x))
(λ (x) (λ (y) (x (y +))))
;; However, Racket is not just a λ-calculus: it is an
;; *extension* of the λ-calculus to a real fully-featured
;; programming languages with macros, matching, lists, etc.
;;
;; The essence of the λ-calculus is to try to reduce
;; the number of features to an absolutely minimal
;; number. In this setting, even operations over *numbers*
;; are eschewed as secondary. We will see that it is possible,
;; however, to encode numbers as λ-calculus terms.
;;
;; For example, the number 5 is represented as the "Church
;; numeral:"
(λ (f) (λ (x) (f (f (f (f (f x)))))))
;; we will see this next week. In this encoding, the idea
;; is that a Church-encoded number is a function that
;; accepts another function (here f) and "does f five times"
;; to some argument x.
;;
;; For example, let's say you did this:
((λ (f) (λ (x) (f (f (f (f (f x))))))) add1)
;; By β-reduction, we can conclude that the above piece
;; of code is equal to:
(λ (x) (add1 (add1 (add1 (add1 (add1 x))))))
;; Now, what is this function, intuitively? Well, if
;; we think a little bit about the behavior of add1,
;; we can conclude that it's also equal to:
(λ (x) (+ x 5))
;; However, recognizing that required us to understand
;; the meaning of add1 (and its relationship to +).
;; We can't do that with the λ-calculus alone. We will
;; put this aside from now but will come back to it
;; shortly.
;;
;; For now, we want to concern ourselves with some
;; details of substitution.
;;
;; Capture-Avoiding Substitution (CAS) and α-conversion
;;
;; From now on, when we use β reduction, we'll assume
;; that we're using a "capture avoiding" version of
;; the substitution function.
;;
;; What is capturing / capture avoiding substitution?
;;
;; A β-reduction is an opportunity to reduce a single
;; syntactic callsite (of a λ) by "plugging in" the
;; argument. The idea of "plugging in" is made formal
;; by substitution:
;; β: ((λ (x) e-b) e-a) → e-b[x↦e-a]
;; In the definition e-b[x↦e-a], we're implicitly
;; assuming an ability to *substitute* ("plug in") the
;; value e-a in the term e-b.
;;
;; QUICK EXERCISE:
;; Do the following substitutions:
;; (x y) [x ↦ (λ (z) z)]
;; ((x z) (λ (z) x)) [x ↦ k]
;;
;; However, there are some issues with "naive"
;; substitution. The issue comes when we try to
;; do substitution of terms like these:
;; (λ (x) (z (λ (z) z))) [ z ↦ k ]
;; We might be tempted to think the answer is the
;; following, but it's not:
;; (λ (x) (k (λ (z) k)))
;; Why not? The answer is that the inner definition
;; of the (λ (z) z) really forms a *new* binding
;; for z, meaning that the z under the λ is really
;; a *different* z than the one above the λ.
;; Thus, the answer needs to be this:
;; (λ (x) (z (λ (z) z))) [ z ↦ k ]
;; = (λ (x) (k (λ (z) z))) (correct!)
;;
;; A different way to see it:
;;
;; Put differently, let's ask what would happen
;; if we allowed this, and see what goes wrong:
;; 1. Start with (λ (x) (z (λ (z) z)))
;; 2. Intuitively, this is the function that
;; accepts an argument x and returns z applied
;; to the identity function.
;; 3. Now, instantiate z with any value other
;; than the identity function, e.g.:
;; (z (λ (z) z)) [z ↦ (λ (x) (λ (y) x))]
;; Now, if we said this was:
;; ((λ (x) (λ (y) x)) (λ (z) (λ (x) (λ (y) x))))
;; Now: rather than applying z to the identity
;; function, we are applying it to this
;; other function! So, we've changed the term's
;; meaning! (And that's a bad thing!)
;;
;; In general, when we do substitution, we need
;; to be careful to avoid *capturing* a variable.
;;
;; A variable is "captured" by a substitution
;; whenever it changes from free to bound (or
;; vice-versa). Variables that are free need to
;; *stay* free, since *capturing* them will
;; change their meaning.
;; For example, let's look at this term:
(λ (x) (λ (x) x))
;; Let's say we take that and apply it to `add1`
;; It should produce:
;; β using CAS:
;; ((λ (x) (λ (x) x)) 6) → (λ (x) x)
;; because CAS understands that:
;; 1. Once you get to a new λ for the variable,
;; you stop substitution.
(((λ (x) (λ (x) x)) 6) 5) ;; 5
;; By the way, here we're using a kind of
;; "contextual equality:" two terms are equal when
;; you can substitute one for the other and get
;; the same set of possible derivations.
;;
;; If we do naive substitution and forget the rule,
;; we might come to this WRONG conclusion:
;; ((λ (x) (λ (x) x)) 6) → (λ (x) 6) WRONG
;; This shows a different behavior!
((λ (x) 6) 5) ;; 6!
;; In general, we need to be careful with the
;; substitution:
;; e [ x ↦ e' ]
;; Because we might have changed a variable from
;; being free to being captured. Whenever this
;; would happen, we disallow capture-avoiding
;; substitution, and instead require that we
;; instead perform an α conversion.
;;
;; Let's look at the bad case we need to avoid:
;; (λ (y) e0) [ x ↦ e1 ] and y ∈ FV(e0)
;; As a specific example:
;; (λ (add1) (x add1)) [ x ↦ (add1 z) ]
;; If we do *naive* substitition, then we end up
;; with:
;; (λ (add1) ((add1 z) add1))
;; But, in doing so, we've changed the *meaning*
;; of add1 here!
;;
;; Previously, add1 was a free variable (e.g.,
;; referring to Racket's add1). Now, because we've
;; added it under a λ, we've changed its meaning
;; to instead refer to the *lambda's* add1.
;;
;; This is a subtle point, so it may take a few
;; times reading through the above, but the
;; upshot is that:
;; 2. Whenever we want to perform CAS on a λ,
;; we need to be careful to ensure that no
;; free variables in the thing being substituted
;; would be captured by the λ.
;; These two points (1) and (2) give us two rules
;; we have to follow to define capture-avoiding
;; substitution.
;;
;; So now, we'll define it fully, and here it is:
;;
;; x [x↦e] = e
;; y [x↦e] = y
;; (e0 e1) [x↦e] = (e0[x↦e] e1[x↦e])
;; (λ (x) e-b) [x↦e] = (λ (x) e-b)
;; (λ (y) e-b) [x↦e] = (λ (y) e-b[x↦e]) WHEN y ∉ FV(e)
;; Instead, you must use α-conversion first to get
;; rid of the problematic variable y.
;;
;; The last two cases get at the tricky parts (rules
;; (1) and (2)). Whenever we see a *new* occurence of
;; x, we just stop (we don't want to redefine something
;; that belongs to a lambda!). But whenever we would
;; capture a variable (y), we disallow that.
;;
;; α-conversion
;;
;; Here's another thing thing to keep in mind:
;; you could always be doing a β reduction *in*
;; a context:
(λ (y) ((λ (x) (x y)) (λ (y) y)))
;; In this way, we can always imagine that free
;; variables are just variables that we have not
;; yet *closed* over. Indeed, this is exactly what
;; we're doing when we build closures in our closure-
;; creating interpreter from e3/p3.
;;
;; When I personally visualize terms with free
;; variables, I use Racket builtin functions like add1
(λ (y) (y add1))
;; We can imagine that at the very top level of Racket,
;; we are actually defining these:
(λ (add1)
;; other builtins...
;; your program
(λ (y) (y add1)))
;; This is not literally what is happening in Racket,
;; but it is not a terrible mental model either, and we
;; will do something akin to this in some of our
;; projects. Also, it helps us come up with good practical
;; examples for free variables: if you see a free variable,
;; imagine it was a builtin like add1/+/*. You don't want
;; to change the behavior of a builtin, so you know that
;; In class, we will often work with a superset of the
;; λ-calculus, extending the λ-calculus with things
;; like numbers:
(((λ (x) (λ (y) (y (x 5))))
(λ (z) 6))
(λ (a) a))
;; It is important to recognize that this is *not* the
;; λ-calculus in the strictest of syntactic terms.
;; However, it turns out that adding numbers adds
;; no expressive power per-se, and we can easily imagine
;; intuitively what it means to extend the λ-calculus
;; with basic numbers. What I mean is this:
(define (λ-number? e)
(match e
[(? number? n) #t]
[(? symbol? x) #t]
[`(λ (,(symbol? x)) ,(? λ-number? e-b)) #t]
[`(,(? λ-number? e0) ,(? λ-number? e1)) #t]
[_ #f]))
;; Notice that this means the following are λ-number?
(λ-number? '5)
(λ-number? '(λ (x) 5))
(λ-number? '(λ (x) (λ (y) (x (y 5)))))
;; This is *NOT* allowed.
(λ-number? '(λ (x) (+ 5 x))) ;; #f
;; See that just because we added numbers as atomic
;; constants in the language, we haven't added
;; any way to compute with them. In other words,
;; all we can do is introduce a numeric constant.
;; But we can't use its value to make decisions
;; and produce new values or drive control-flow
;; at runtime.
;;
;; SEMANTICS -- what does a λ-calculus program *MEAN*?
;;
;; The syntax of the λ-calculus is easy. The semantics
;; of the λ-calculus is harder. In the plainest
;; terms, we can see the λ-calculus as a rewrite system
;; where we use some (non-deterministic) application
;; of the β rule to "make progress" and evaluate a
;; specific callsite.
;;
;; We call each of
;;
;; We will stick with this for a while, to focus on
;; some technicalities of β-reduction.
;;
;;
;; EXERCISE: adding builtins and the interpreter
;;
;; It is not too challenging to add "builtins"
;; to our language, which operate over numbers.
;; We can do this by adding several "builtin" forms
;; for application. Let's do that now:
(define (λ-arith? e)
(define (op? x) (match x ['+ #t] ['* #t]))
(match e
[(? number? n) #t]
[(? symbol? x) #t]
[`(λ (,(symbol? x)) ,(? λ-arith? e-b)) #t]
[`(,(? λ-arith? e0) ,(? λ-arith? e1)) #t]
;; *two*-argument application to builtin ops
[`(,(? op? op) ,(? λ-arith? e0) ,(? λ-arith? e1)) #t]))
;; Interpreter for e3
;;
;; As an interlude, I will show how to build an
;; interpreter for this language in the style of
;; exercise e3. In that exercise, we built an
;; interpreter which created *closures* to perform
;; substitution lazily.
;;
;; Now, we can write some rules for this language
;; using natural deduction.
;;
;; The rules read: "if the thing on the top
;; is true, then the thing on the bottom may be
;; concluded."
;;
;; In this case, the rules define the way
;; to build a judgement: "e, ρ ⇓ v", which
;; is read: "the expression e in the environment
;; ρ evaluates to the value v."
;;
;; Here, our values are either closures or they
;; are numbers.
(define (value? v)
(match v
;; I will write env instead of ρ in code
[`(clo (λ (,x) ,e) ,env) #t]
[(? number? n) #t]
[_ #f]))
;; Now, we define the rules:
;;
;; "A lambda expression immediately evaluates to
;; a closure."
;;
;; [Lam] ---------------------------------
;; (λ (x) e), ρ ⇓ clo((λ (x) e), ρ)
;;
;; "To evaluate an application, evaluate the function
;; position to a closure, then evaluate the argument
;; and build a new environment to evaluate the body."
;;
;;
;; e0, ρ ⇓ clo((λ (x) e-b), ρ')
;; e1, ρ ⇓ v
;; [App] -----------------------------------
;; (e0 e1), ρ ⇓ v'
;; We will finish the interpreter in the next class,
;; we did not have time to discuss much today, and it is
;; tied to evaluation order--which is a top of
;; this Thursday's class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment