kmicinski · March 18, 2025 19:49
diff --git a/03-18.rkt b/03-18.rkt
 #lang racket

 ;; CIS 352 -- Spring '25
 ;; March 18, 2025

 ;; This week in class:
 ;; [ ] λ-calculus refresher
 ;; [ ] λ-calculus β reduction motivation
 ;; [ ] Extending the λ-calculus with numbers
 ;; [ ] Building an interpreter with closures
 ;;     and builtins.
 ;; [ ] Normal form and reduction to normal form
 ;; [ ] Reduction strategies (CBN, CBV)
 ;; [ ] Church-Rosser

 ;;
 ;; Part 0: λ-calculus refresher
 ;;

 ;; Before the break, we introduced the λ-calculus
 ;;
 ;; The idea in the λ-calculus is to work with a tiny
 ;; core language consisting only of functions. The
 ;; language has the three forms:
 ;;     e ::= x          -- Variable 
 ;;         | (λ (x) e)  -- λ-abstraction
 ;;         | (e e)      -- application
 ;;

 ;; In elementary mathematics we have a notion of "plug
 ;; and chug." So if you define a function
 ;;    f(x) = x^2 + 3*x
 ;; Then if you see the formula f(z+2) you could ascertain
 ;; 
 ;; f(z+2)                     -- start with this 
 ;; = (z+2)^2 + 3*(z+2)        -- apply definition of f
 ;; = z^2 + 4*z + 4 + 3*z + 6  -- expand / distribute
 ;; = z^2 + 7*z + 10           -- simplify to polynomial
 ;;
 ;; We can see the λ-calculus as forcing us to abstract
 ;; this operation into a λ-function:
 (define f (lambda (x) (+ (* x x) (* 3 x))))

 ;; Exercise:
 ;;  Let's say f(x,y) = x+y * g(y,x)
 ;;  What is f(a+b)?

 ;; λ-calculus β reduction motivation
 ;;

 ;; We all intuitively know how to operate with terms
 ;; like f(z+2)--we even know how to do it symbolically.
 ;; The λ-calculus asks: what if we take this "plugging
 ;; in" operation (calling a function) as *the* singular
 ;; primitive operation in the language.
 ;;
 ;; The Church-Turing thesis is that any computable
 ;; function over the naturals is computable iff there
 ;; exists some Turing machine *or* λ-calculus term
 ;; encoding it.

 ;; Definition: convertability relations
 ;;
 ;; In the λ-calculus, the "meaning" of a term--or
 ;; the "behavior" of a term (its semantics), is
 ;; specified by all of the ways in which we could
 ;; *convert* the term 
 ;;
 ;; The standard λ-calculus includes three types of
 ;; convertability relations, which we can think
 ;; of as "rewrite rules:" they tell us how to
 ;; make progress on the term.
 ;; 
 ;; There are three standard conversions we discuss
 ;; in the λ-calculus:
 ;;  [ ] β-reduction with capture-avoiding substitution
 ;;  [ ] α-conversion (we'll cover today)
 ;;  [ ] η-reduction/expansion (we'll cover soon)

 ;; 

 ;; So now the observation is this: in our language up
 ;; until now, we've been a bit intuitive about the
 ;; definition of things like *, +, 3, etc... But
 ;; the λ-calculus asks the question: what if we focus
 ;; on the aspects of the computation that are relevant
 ;; only to *calling functions*?
 ;;
 ;; The thesis of the λ-calculus is that function calls
 ;; should be the focus of our computation. In fact,
 ;; in the pure λ-calculus, we ignore numbers!

 ;; For example, the following is *not* a valid
 ;; λ-calculus term:
 (λ (x) 5)
 ;; nor are...
 (λ (x) (+ x x))
 (λ (x) (λ (y) (x (y +))))

 ;; However, Racket is not just a λ-calculus: it is an
 ;; *extension* of the λ-calculus to a real fully-featured
 ;; programming languages with macros, matching, lists, etc.
 ;;
 ;; The essence of the λ-calculus is to try to reduce
 ;; the number of features to an absolutely minimal
 ;; number. In this setting, even operations over *numbers*
 ;; are eschewed as secondary. We will see that it is possible,
 ;; however, to encode numbers as λ-calculus terms.
 ;; 
 ;; For example, the number 5 is represented as the "Church
 ;; numeral:"
 (λ (f) (λ (x) (f (f (f (f (f x)))))))

 ;; we will see this next week. In this encoding, the idea
 ;; is that a Church-encoded number is a function that
 ;; accepts another function (here f) and "does f five times"
 ;; to some argument x.
 ;;
 ;; For example, let's say you did this:
 ((λ (f) (λ (x) (f (f (f (f (f x))))))) add1)

 ;; By β-reduction, we can conclude that the above piece
 ;; of code is equal to:
 (λ (x) (add1 (add1 (add1 (add1 (add1 x))))))

 ;; Now, what is this function, intuitively? Well, if
 ;; we think a little bit about the behavior of add1,
 ;; we can conclude that it's also equal to:
 (λ (x) (+ x 5))

 ;; However, recognizing that required us to understand
 ;; the meaning of add1 (and its relationship to +).
 ;; We can't do that with the λ-calculus alone. We will
 ;; put this aside from now but will come back to it
 ;; shortly.
 ;; 
 ;; For now, we want to concern ourselves with some
 ;; details of substitution.

 ;;
 ;; Capture-Avoiding Substitution (CAS) and α-conversion
 ;;

 ;; From now on, when we use β reduction, we'll assume
 ;; that we're using a "capture avoiding" version of
 ;; the substitution function.
 ;;
 ;; What is capturing / capture avoiding substitution?
 ;;
 ;; A β-reduction is an opportunity to reduce a single
 ;; syntactic callsite (of a λ) by "plugging in" the
 ;; argument. The idea of "plugging in" is made formal
 ;; by substitution:
 ;;     β:  ((λ (x) e-b) e-a) → e-b[x↦e-a]
 ;; In the definition e-b[x↦e-a], we're implicitly 
 ;; assuming an ability to *substitute* ("plug in") the
 ;; value e-a in the term e-b.
 ;; 
 ;; QUICK EXERCISE:
 ;; Do the following substitutions:
 ;; (x y) [x ↦ (λ (z) z)]
 ;; ((x z) (λ (z) x)) [x ↦ k]
 ;;
 ;; However, there are some issues with "naive"
 ;; substitution. The issue comes when we try to
 ;; do substitution of terms like these:
 ;; (λ (x) (z (λ (z) z))) [ z ↦ k ] 
 ;; We might be tempted to think the answer is the
 ;; following, but it's not:
 ;;   (λ (x) (k (λ (z) k)))
 ;; Why not? The answer is that the inner definition
 ;; of the (λ (z) z) really forms a *new* binding
 ;; for z, meaning that the z under the λ is really
 ;; a *different* z than the one above the λ.
 ;; Thus, the answer needs to be this:
 ;;   (λ (x) (z (λ (z) z))) [ z ↦ k ]
 ;; = (λ (x) (k (λ (z) z))) (correct!)
 ;;
 ;; A different way to see it:
 ;; 
 ;; Put differently, let's ask what would happen
 ;; if we allowed this, and see what goes wrong:
 ;; 1. Start with (λ (x) (z (λ (z) z)))
 ;; 2. Intuitively, this is the function that
 ;; accepts an argument x and returns z applied
 ;; to the identity function.
 ;; 3. Now, instantiate z with any value other
 ;; than the identity function, e.g.:
 ;; (z (λ (z) z)) [z ↦ (λ (x) (λ (y) x))]
 ;; Now, if we said this was:
 ;; ((λ (x) (λ (y) x)) (λ (z) (λ (x) (λ (y) x))))
 ;; Now: rather than applying z to the identity
 ;; function, we are applying it to this
 ;; other function! So, we've changed the term's
 ;; meaning! (And that's a bad thing!)
 ;;
 ;; In general, when we do substitution, we need
 ;; to be careful to avoid *capturing* a variable.
 ;;
 ;; A variable is "captured" by a substitution
 ;; whenever it changes from free to bound (or
 ;; vice-versa). Variables that are free need to
 ;; *stay* free, since *capturing* them will
 ;; change their meaning.

 ;; For example, let's look at this term:
 (λ (x) (λ (x) x))

 ;; Let's say we take that and apply it to `add1`
 ;; It should produce:
 ;; β using CAS:
 ;;     ((λ (x) (λ (x) x)) 6) → (λ (x) x)
 ;; because CAS understands that:
 ;;  1. Once you get to a new λ for the variable,
 ;;     you stop substitution.
 (((λ (x) (λ (x) x)) 6) 5) ;; 5

 ;; By the way, here we're using a kind of
 ;; "contextual equality:" two terms are equal when
 ;; you can substitute one for the other and get
 ;; the same set of possible derivations.
 ;;
 ;; If we do naive substitution and forget the rule,
 ;; we might come to this WRONG conclusion:
 ;; ((λ (x) (λ (x) x)) 6) → (λ (x) 6) WRONG
 ;; This shows a different behavior!
 ((λ (x) 6) 5) ;; 6!

 ;; In general, we need to be careful with the
 ;; substitution:
 ;;   e [ x ↦ e' ]
 ;; Because we might have changed a variable from
 ;; being free to being captured. Whenever this
 ;; would happen, we disallow capture-avoiding
 ;; substitution, and instead require that we
 ;; instead perform an α conversion.
 ;;
 ;; Let's look at the bad case we need to avoid:
 ;; (λ (y) e0) [ x ↦ e1 ] and y ∈ FV(e0)
 ;; As a specific example:
 ;; (λ (add1) (x add1)) [ x ↦ (add1 z) ]
 ;; If we do *naive* substitition, then we end up
 ;; with:
 ;; (λ (add1) ((add1 z) add1))
 ;; But, in doing so, we've changed the *meaning*
 ;; of add1 here!
 ;;
 ;; Previously, add1 was a free variable (e.g.,
 ;; referring to Racket's add1). Now, because we've
 ;; added it under a λ, we've changed its meaning
 ;; to instead refer to the *lambda's* add1.
 ;;
 ;; This is a subtle point, so it may take a few
 ;; times reading through the above, but the
 ;; upshot is that:
 ;; 2. Whenever we want to perform CAS on a λ,
 ;;    we need to be careful to ensure that no
 ;;    free variables in the thing being substituted
 ;;    would be captured by the λ.

 ;; These two points (1) and (2) give us two rules
 ;; we have to follow to define capture-avoiding
 ;; substitution.
 ;;
 ;; So now, we'll define it fully, and here it is:
 ;;
 ;;     x [x↦e]           = e
 ;;     y [x↦e]           = y
 ;;     (e0 e1) [x↦e]     = (e0[x↦e] e1[x↦e])
 ;;     (λ (x) e-b) [x↦e] = (λ (x) e-b)
 ;;     (λ (y) e-b) [x↦e] = (λ (y) e-b[x↦e]) WHEN y ∉ FV(e)
 ;;     Instead, you must use α-conversion first to get 
 ;;     rid of the problematic variable y.
 ;; 
 ;; The last two cases get at the tricky parts (rules 
 ;; (1) and (2)). Whenever we see a *new* occurence of 
 ;; x, we just stop (we don't want to redefine something 
 ;; that belongs to a lambda!). But whenever we would 
 ;; capture a variable (y), we disallow that. 


 ;;
 ;; α-conversion
 ;;

 ;; Here's another thing thing to keep in mind:
 ;; you could always be doing a β reduction *in*
 ;; a context:
 (λ (y) ((λ (x) (x y)) (λ (y) y)))
 ;; In this way, we can always imagine that free
 ;; variables are just variables that we have not
 ;; yet *closed* over. Indeed, this is exactly what
 ;; we're doing when we build closures in our closure-
 ;; creating interpreter from e3/p3.
 ;;
 ;; When I personally visualize terms with free
 ;; variables, I use Racket builtin functions like add1
 (λ (y) (y add1))
 ;; We can imagine that at the very top level of Racket,
 ;; we are actually defining these:
 (λ (add1)
  ;; other builtins...
  ;; your program
  (λ (y) (y add1)))
 ;; This is not literally what is happening in Racket,
 ;; but it is not a terrible mental model either, and we
 ;; will do something akin to this in some of our
 ;; projects. Also, it helps us come up with good practical
 ;; examples for free variables: if you see a free variable,
 ;; imagine it was a builtin like add1/+/*. You don't want
 ;; to change the behavior of a builtin, so you know that

 ;; In class, we will often work with a superset of the
 ;; λ-calculus, extending the λ-calculus with things
 ;; like numbers:
 (((λ (x) (λ (y) (y (x 5))))
  (λ (z) 6))
 (λ (a) a))

 ;; It is important to recognize that this is *not* the
 ;; λ-calculus in the strictest of syntactic terms.
 ;; However, it turns out that adding numbers adds
 ;; no expressive power per-se, and we can easily imagine
 ;; intuitively what it means to extend the λ-calculus
 ;; with basic numbers. What I mean is this:
 (define (λ-number? e)
  (match e
    [(? number? n) #t]
    [(? symbol? x) #t]
    [`(λ (,(symbol? x)) ,(? λ-number? e-b)) #t]
    [`(,(? λ-number? e0)  ,(? λ-number? e1)) #t]
    [_ #f]))

 ;; Notice that this means the following are λ-number?
 (λ-number? '5)
 (λ-number? '(λ (x) 5))
 (λ-number? '(λ (x) (λ (y) (x (y 5)))))

 ;; This is *NOT* allowed. 
 (λ-number? '(λ (x) (+ 5 x))) ;; #f 

 ;; See that just because we added numbers as atomic
 ;; constants in the language, we haven't added
 ;; any way to compute with them. In other words,
 ;; all we can do is introduce a numeric constant.
 ;; But we can't use its value to make decisions
 ;; and produce new values or drive control-flow
 ;; at runtime.

 ;;
 ;; SEMANTICS -- what does a λ-calculus program *MEAN*?
 ;;

 ;; The syntax of the λ-calculus is easy. The semantics
 ;; of the λ-calculus is harder. In the plainest
 ;; terms, we can see the λ-calculus as a rewrite system
 ;; where we use some (non-deterministic) application
 ;; of the β rule to "make progress" and evaluate a
 ;; specific callsite. 
 ;; 
 ;; We call each of 

 ;; 
 ;; We will stick with this for a while, to focus on
 ;; some technicalities of β-reduction.
 ;;

 ;;
 ;; EXERCISE: adding builtins and the interpreter
 ;;

 ;; It is not too challenging to add "builtins"
 ;; to our language, which operate over numbers.
 ;; We can do this by adding several "builtin" forms
 ;; for application. Let's do that now:
 (define (λ-arith? e)
  (define (op? x) (match x ['+ #t] ['* #t]))
  (match e
    [(? number? n) #t]
    [(? symbol? x) #t]
    [`(λ (,(symbol? x)) ,(? λ-arith? e-b)) #t]
    [`(,(? λ-arith? e0) ,(? λ-arith? e1)) #t]
    ;; *two*-argument application to builtin ops
    [`(,(? op? op) ,(? λ-arith? e0) ,(? λ-arith? e1)) #t]))

 ;; Interpreter for e3
 ;; 
 ;; As an interlude, I will show how to build an
 ;; interpreter for this language in the style of
 ;; exercise e3. In that exercise, we built an
 ;; interpreter which created *closures* to perform
 ;; substitution lazily.
 ;; 
 ;; Now, we can write some rules for this language
 ;; using natural deduction.
 ;;
 ;; The rules read: "if the thing on the top
 ;; is true, then the thing on the bottom may be
 ;; concluded."
 ;;
 ;; In this case, the rules define the way
 ;; to build a judgement: "e, ρ ⇓ v", which
 ;; is read: "the expression e in the environment
 ;; ρ evaluates to the value v."
 ;; 
 ;; Here, our values are either closures or they
 ;; are numbers.
 (define (value? v)
  (match v
    ;; I will write env instead of ρ in code 
    [`(clo (λ (,x) ,e) ,env) #t]
    [(? number? n) #t]
    [_ #f]))

 ;; Now, we define the rules:
 ;;
 ;; "A lambda expression immediately evaluates to
 ;; a closure."
 ;; 
 ;; [Lam] ---------------------------------
 ;;       (λ (x) e), ρ ⇓ clo((λ (x) e), ρ)
 ;;
 ;; "To evaluate an application, evaluate the function
 ;; position to a closure, then evaluate the argument
 ;; and build a new environment to evaluate the body."
 ;;
 ;;      
 ;;              e0, ρ ⇓ clo((λ (x) e-b), ρ')
 ;;              e1, ρ ⇓ v 
 ;; [App] -----------------------------------
 ;;               (e0 e1), ρ ⇓ v' 

 ;; We will finish the interpreter in the next class, 
 ;; we did not have time to discuss much today, and it is 
 ;; tied to evaluation order--which is a top of 
 ;; this Thursday's class.
	#lang racket

	;; CIS 352 -- Spring '25
	;; March 18, 2025

	;; This week in class:
	;; [ ] λ-calculus refresher
	;; [ ] λ-calculus β reduction motivation
	;; [ ] Extending the λ-calculus with numbers
	;; [ ] Building an interpreter with closures
	;; and builtins.
	;; [ ] Normal form and reduction to normal form
	;; [ ] Reduction strategies (CBN, CBV)
	;; [ ] Church-Rosser

	;;
	;; Part 0: λ-calculus refresher
	;;

	;; Before the break, we introduced the λ-calculus
	;;
	;; The idea in the λ-calculus is to work with a tiny
	;; core language consisting only of functions. The
	;; language has the three forms:
	;; e ::= x -- Variable
	;; \| (λ (x) e) -- λ-abstraction
	;; \| (e e) -- application
	;;

	;; In elementary mathematics we have a notion of "plug
	;; and chug." So if you define a function
	;; f(x) = x^2 + 3*x
	;; Then if you see the formula f(z+2) you could ascertain
	;;
	;; f(z+2) -- start with this
	;; = (z+2)^2 + 3*(z+2) -- apply definition of f
	;; = z^2 + 4z + 4 + 3z + 6 -- expand / distribute
	;; = z^2 + 7*z + 10 -- simplify to polynomial
	;;
	;; We can see the λ-calculus as forcing us to abstract
	;; this operation into a λ-function:
	(define f (lambda (x) (+ (* x x) (* 3 x))))

	;; Exercise:
	;; Let's say f(x,y) = x+y * g(y,x)
	;; What is f(a+b)?

	;; λ-calculus β reduction motivation
	;;

	;; We all intuitively know how to operate with terms
	;; like f(z+2)--we even know how to do it symbolically.
	;; The λ-calculus asks: what if we take this "plugging
	;; in" operation (calling a function) as the singular
	;; primitive operation in the language.
	;;
	;; The Church-Turing thesis is that any computable
	;; function over the naturals is computable iff there
	;; exists some Turing machine or λ-calculus term
	;; encoding it.

	;; Definition: convertability relations
	;;
	;; In the λ-calculus, the "meaning" of a term--or
	;; the "behavior" of a term (its semantics), is
	;; specified by all of the ways in which we could
	;; convert the term
	;;
	;; The standard λ-calculus includes three types of
	;; convertability relations, which we can think
	;; of as "rewrite rules:" they tell us how to
	;; make progress on the term.
	;;
	;; There are three standard conversions we discuss
	;; in the λ-calculus:
	;; [ ] β-reduction with capture-avoiding substitution
	;; [ ] α-conversion (we'll cover today)
	;; [ ] η-reduction/expansion (we'll cover soon)

	;;

	;; So now the observation is this: in our language up
	;; until now, we've been a bit intuitive about the
	;; definition of things like *, +, 3, etc... But
	;; the λ-calculus asks the question: what if we focus
	;; on the aspects of the computation that are relevant
	;; only to calling functions?
	;;
	;; The thesis of the λ-calculus is that function calls
	;; should be the focus of our computation. In fact,
	;; in the pure λ-calculus, we ignore numbers!

	;; For example, the following is not a valid
	;; λ-calculus term:
	(λ (x) 5)
	;; nor are...
	(λ (x) (+ x x))
	(λ (x) (λ (y) (x (y +))))

	;; However, Racket is not just a λ-calculus: it is an
	;; extension of the λ-calculus to a real fully-featured
	;; programming languages with macros, matching, lists, etc.
	;;
	;; The essence of the λ-calculus is to try to reduce
	;; the number of features to an absolutely minimal
	;; number. In this setting, even operations over numbers
	;; are eschewed as secondary. We will see that it is possible,
	;; however, to encode numbers as λ-calculus terms.
	;;
	;; For example, the number 5 is represented as the "Church
	;; numeral:"
	(λ (f) (λ (x) (f (f (f (f (f x)))))))

	;; we will see this next week. In this encoding, the idea
	;; is that a Church-encoded number is a function that
	;; accepts another function (here f) and "does f five times"
	;; to some argument x.
	;;
	;; For example, let's say you did this:
	((λ (f) (λ (x) (f (f (f (f (f x))))))) add1)

	;; By β-reduction, we can conclude that the above piece
	;; of code is equal to:
	(λ (x) (add1 (add1 (add1 (add1 (add1 x))))))

	;; Now, what is this function, intuitively? Well, if
	;; we think a little bit about the behavior of add1,
	;; we can conclude that it's also equal to:
	(λ (x) (+ x 5))

	;; However, recognizing that required us to understand
	;; the meaning of add1 (and its relationship to +).
	;; We can't do that with the λ-calculus alone. We will
	;; put this aside from now but will come back to it
	;; shortly.
	;;
	;; For now, we want to concern ourselves with some
	;; details of substitution.

	;;
	;; Capture-Avoiding Substitution (CAS) and α-conversion
	;;

	;; From now on, when we use β reduction, we'll assume
	;; that we're using a "capture avoiding" version of
	;; the substitution function.
	;;
	;; What is capturing / capture avoiding substitution?
	;;
	;; A β-reduction is an opportunity to reduce a single
	;; syntactic callsite (of a λ) by "plugging in" the
	;; argument. The idea of "plugging in" is made formal
	;; by substitution:
	;; β: ((λ (x) e-b) e-a) → e-b[x↦e-a]
	;; In the definition e-b[x↦e-a], we're implicitly
	;; assuming an ability to substitute ("plug in") the
	;; value e-a in the term e-b.
	;;
	;; QUICK EXERCISE:
	;; Do the following substitutions:
	;; (x y) [x ↦ (λ (z) z)]
	;; ((x z) (λ (z) x)) [x ↦ k]
	;;
	;; However, there are some issues with "naive"
	;; substitution. The issue comes when we try to
	;; do substitution of terms like these:
	;; (λ (x) (z (λ (z) z))) [ z ↦ k ]
	;; We might be tempted to think the answer is the
	;; following, but it's not:
	;; (λ (x) (k (λ (z) k)))
	;; Why not? The answer is that the inner definition
	;; of the (λ (z) z) really forms a new binding
	;; for z, meaning that the z under the λ is really
	;; a different z than the one above the λ.
	;; Thus, the answer needs to be this:
	;; (λ (x) (z (λ (z) z))) [ z ↦ k ]
	;; = (λ (x) (k (λ (z) z))) (correct!)
	;;
	;; A different way to see it:
	;;
	;; Put differently, let's ask what would happen
	;; if we allowed this, and see what goes wrong:
	;; 1. Start with (λ (x) (z (λ (z) z)))
	;; 2. Intuitively, this is the function that
	;; accepts an argument x and returns z applied
	;; to the identity function.
	;; 3. Now, instantiate z with any value other
	;; than the identity function, e.g.:
	;; (z (λ (z) z)) [z ↦ (λ (x) (λ (y) x))]
	;; Now, if we said this was:
	;; ((λ (x) (λ (y) x)) (λ (z) (λ (x) (λ (y) x))))
	;; Now: rather than applying z to the identity
	;; function, we are applying it to this
	;; other function! So, we've changed the term's
	;; meaning! (And that's a bad thing!)
	;;
	;; In general, when we do substitution, we need
	;; to be careful to avoid capturing a variable.
	;;
	;; A variable is "captured" by a substitution
	;; whenever it changes from free to bound (or
	;; vice-versa). Variables that are free need to
	;; stay free, since capturing them will
	;; change their meaning.

	;; For example, let's look at this term:
	(λ (x) (λ (x) x))

	;; Let's say we take that and apply it to `add1`
	;; It should produce:
	;; β using CAS:
	;; ((λ (x) (λ (x) x)) 6) → (λ (x) x)
	;; because CAS understands that:
	;; 1. Once you get to a new λ for the variable,
	;; you stop substitution.
	(((λ (x) (λ (x) x)) 6) 5) ;; 5

	;; By the way, here we're using a kind of
	;; "contextual equality:" two terms are equal when
	;; you can substitute one for the other and get
	;; the same set of possible derivations.
	;;
	;; If we do naive substitution and forget the rule,
	;; we might come to this WRONG conclusion:
	;; ((λ (x) (λ (x) x)) 6) → (λ (x) 6) WRONG
	;; This shows a different behavior!
	((λ (x) 6) 5) ;; 6!

	;; In general, we need to be careful with the
	;; substitution:
	;; e [ x ↦ e' ]
	;; Because we might have changed a variable from
	;; being free to being captured. Whenever this
	;; would happen, we disallow capture-avoiding
	;; substitution, and instead require that we
	;; instead perform an α conversion.
	;;
	;; Let's look at the bad case we need to avoid:
	;; (λ (y) e0) [ x ↦ e1 ] and y ∈ FV(e0)
	;; As a specific example:
	;; (λ (add1) (x add1)) [ x ↦ (add1 z) ]
	;; If we do naive substitition, then we end up
	;; with:
	;; (λ (add1) ((add1 z) add1))
	;; But, in doing so, we've changed the meaning
	;; of add1 here!
	;;
	;; Previously, add1 was a free variable (e.g.,
	;; referring to Racket's add1). Now, because we've
	;; added it under a λ, we've changed its meaning
	;; to instead refer to the lambda's add1.
	;;
	;; This is a subtle point, so it may take a few
	;; times reading through the above, but the
	;; upshot is that:
	;; 2. Whenever we want to perform CAS on a λ,
	;; we need to be careful to ensure that no
	;; free variables in the thing being substituted
	;; would be captured by the λ.

	;; These two points (1) and (2) give us two rules
	;; we have to follow to define capture-avoiding
	;; substitution.
	;;
	;; So now, we'll define it fully, and here it is:
	;;
	;; x [x↦e] = e
	;; y [x↦e] = y
	;; (e0 e1) [x↦e] = (e0[x↦e] e1[x↦e])
	;; (λ (x) e-b) [x↦e] = (λ (x) e-b)
	;; (λ (y) e-b) [x↦e] = (λ (y) e-b[x↦e]) WHEN y ∉ FV(e)
	;; Instead, you must use α-conversion first to get
	;; rid of the problematic variable y.
	;;
	;; The last two cases get at the tricky parts (rules
	;; (1) and (2)). Whenever we see a new occurence of
	;; x, we just stop (we don't want to redefine something
	;; that belongs to a lambda!). But whenever we would
	;; capture a variable (y), we disallow that.


	;;
	;; α-conversion
	;;

	;; Here's another thing thing to keep in mind:
	;; you could always be doing a β reduction in
	;; a context:
	(λ (y) ((λ (x) (x y)) (λ (y) y)))
	;; In this way, we can always imagine that free
	;; variables are just variables that we have not
	;; yet closed over. Indeed, this is exactly what
	;; we're doing when we build closures in our closure-
	;; creating interpreter from e3/p3.
	;;
	;; When I personally visualize terms with free
	;; variables, I use Racket builtin functions like add1
	(λ (y) (y add1))
	;; We can imagine that at the very top level of Racket,
	;; we are actually defining these:
	(λ (add1)
	;; other builtins...
	;; your program
	(λ (y) (y add1)))
	;; This is not literally what is happening in Racket,
	;; but it is not a terrible mental model either, and we
	;; will do something akin to this in some of our
	;; projects. Also, it helps us come up with good practical
	;; examples for free variables: if you see a free variable,
	;; imagine it was a builtin like add1/+/*. You don't want
	;; to change the behavior of a builtin, so you know that

	;; In class, we will often work with a superset of the
	;; λ-calculus, extending the λ-calculus with things
	;; like numbers:
	(((λ (x) (λ (y) (y (x 5))))
	(λ (z) 6))
	(λ (a) a))

	;; It is important to recognize that this is not the
	;; λ-calculus in the strictest of syntactic terms.
	;; However, it turns out that adding numbers adds
	;; no expressive power per-se, and we can easily imagine
	;; intuitively what it means to extend the λ-calculus
	;; with basic numbers. What I mean is this:
	(define (λ-number? e)
	(match e
	[(? number? n) #t]
	[(? symbol? x) #t]
	[`(λ (,(symbol? x)) ,(? λ-number? e-b)) #t]
	[`(,(? λ-number? e0) ,(? λ-number? e1)) #t]
	[_ #f]))

	;; Notice that this means the following are λ-number?
	(λ-number? '5)
	(λ-number? '(λ (x) 5))
	(λ-number? '(λ (x) (λ (y) (x (y 5)))))

	;; This is NOT allowed.
	(λ-number? '(λ (x) (+ 5 x))) ;; #f

	;; See that just because we added numbers as atomic
	;; constants in the language, we haven't added
	;; any way to compute with them. In other words,
	;; all we can do is introduce a numeric constant.
	;; But we can't use its value to make decisions
	;; and produce new values or drive control-flow
	;; at runtime.

	;;
	;; SEMANTICS -- what does a λ-calculus program MEAN?
	;;

	;; The syntax of the λ-calculus is easy. The semantics
	;; of the λ-calculus is harder. In the plainest
	;; terms, we can see the λ-calculus as a rewrite system
	;; where we use some (non-deterministic) application
	;; of the β rule to "make progress" and evaluate a
	;; specific callsite.
	;;
	;; We call each of

	;;
	;; We will stick with this for a while, to focus on
	;; some technicalities of β-reduction.
	;;

	;;
	;; EXERCISE: adding builtins and the interpreter
	;;

	;; It is not too challenging to add "builtins"
	;; to our language, which operate over numbers.
	;; We can do this by adding several "builtin" forms
	;; for application. Let's do that now:
	(define (λ-arith? e)
	(define (op? x) (match x ['+ #t] ['* #t]))
	(match e
	[(? number? n) #t]
	[(? symbol? x) #t]
	[`(λ (,(symbol? x)) ,(? λ-arith? e-b)) #t]
	[`(,(? λ-arith? e0) ,(? λ-arith? e1)) #t]
	;; two-argument application to builtin ops
	[`(,(? op? op) ,(? λ-arith? e0) ,(? λ-arith? e1)) #t]))

	;; Interpreter for e3
	;;
	;; As an interlude, I will show how to build an
	;; interpreter for this language in the style of
	;; exercise e3. In that exercise, we built an
	;; interpreter which created closures to perform
	;; substitution lazily.
	;;
	;; Now, we can write some rules for this language
	;; using natural deduction.
	;;
	;; The rules read: "if the thing on the top
	;; is true, then the thing on the bottom may be
	;; concluded."
	;;
	;; In this case, the rules define the way
	;; to build a judgement: "e, ρ ⇓ v", which
	;; is read: "the expression e in the environment
	;; ρ evaluates to the value v."
	;;
	;; Here, our values are either closures or they
	;; are numbers.
	(define (value? v)
	(match v
	;; I will write env instead of ρ in code
	[`(clo (λ (,x) ,e) ,env) #t]
	[(? number? n) #t]
	[_ #f]))

	;; Now, we define the rules:
	;;
	;; "A lambda expression immediately evaluates to
	;; a closure."
	;;
	;; [Lam] ---------------------------------
	;; (λ (x) e), ρ ⇓ clo((λ (x) e), ρ)
	;;
	;; "To evaluate an application, evaluate the function
	;; position to a closure, then evaluate the argument
	;; and build a new environment to evaluate the body."
	;;
	;;
	;; e0, ρ ⇓ clo((λ (x) e-b), ρ')
	;; e1, ρ ⇓ v
	;; [App] -----------------------------------
	;; (e0 e1), ρ ⇓ v'

	;; We will finish the interpreter in the next class,
	;; we did not have time to discuss much today, and it is
	;; tied to evaluation order--which is a top of
	;; this Thursday's class.