CS349 - Principles of Programming Languages
Lambda Calculus
Lambda calculus consists of terms of the form $M = x | (\lambda x.M) | (M M)$ where $M$ is a term and $x$ is a variable. These denote variables, abstraction and application respectively. Application is left-associative, and application binds more tightly than abstraction. $\lambda x.(\lambda y.M)$ can be abbreviated as $\lambda x\;y.M$.
The lambda symbol binds the variable which comes after it. The scope of the binding is the expression that follows the dot.
Free variables are not bound. For example, in $(\lambda x.\lambda y.(xz)(yw)w)$ the free variables are $w$ and $z$ and the bound variables are $x$ and $y$.
The free variables in an expression can be denoted
- $FV(x) = \{x\}$
- $FV(\lambda x.M) = FV(M) - \{x\}$
- $FV(M_1 M_2) = FV(M_1) \cup FV(M_2)$
A substitution in the term $M$ of the free variable $x$ by the term $N$ is denoted $M[x \leftarrow N]$.
- $x[x \leftarrow N] = N$
- $x[y \leftarrow N] = x$ when $x \neq y$
- $(M_1M_2)[y\leftarrow N] = (M_1[y \leftarrow N] M_2[y \leftarrow N])$
- $(\lambda x.M)[y \leftarrow N] = \lambda x.M$ if $x = y$
- $(\lambda x.M)[y \leftarrow N] = \lambda x.(M[y \leftarrow N])$ when $x \neq y$ and $x$ is not a free variable in $N$
- $(\lambda x.M)[y \leftarrow N] = \lambda z.((M[x \leftarrow z])[y\leftarrow N])$ when $x \neq y$ and $x$ is a free variable in $N$, where $z \neq x, y$ and $z$ is not a free variable in $M$ or $N$.
- Capture avoidance - substitution must not cause free variables of $N$ to become bound
- Freshness - the renaming variable $z$ must be fresh, i.e. not free in $M$ or $N$
- Substitution is a purely syntactic operation on symbols
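The substitution cases above translate directly into code. A minimal Python sketch, assuming terms are represented as nested tuples `('var', x)`, `('lam', x, M)`, `('app', M, N)` (a representation choice, not part of the calculus), and with a hypothetical `fresh` helper for generating new names:

```python
# Capture-avoiding substitution m[y <- n], following the cases above.
def free_vars(t):
    tag = t[0]
    if tag == 'var':
        return {t[1]}
    if tag == 'lam':
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    # Pick a variable name not occurring in `avoid` (freshness condition).
    i = 0
    while f"v{i}" in avoid:
        i += 1
    return f"v{i}"

def subst(m, y, n):
    tag = m[0]
    if tag == 'var':
        return n if m[1] == y else m
    if tag == 'app':
        return ('app', subst(m[1], y, n), subst(m[2], y, n))
    x, body = m[1], m[2]
    if x == y:                      # the binder shadows y
        return m
    if x not in free_vars(n):       # no capture possible
        return ('lam', x, subst(body, y, n))
    # x is free in n: rename the binder to a fresh z before substituting
    z = fresh(free_vars(body) | free_vars(n) | {y})
    return ('lam', z, subst(subst(body, x, ('var', z)), y, n))
```

For example, `subst(('lam', 'x', ('var', 'y')), 'y', ('var', 'x'))` renames the binder rather than capturing the substituted `x`.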
Alpha equivalence captures the notion that the variable name being bound doesn't matter so the expressions $(\lambda x y z.xxyy)$ and $(\lambda a b c.aabb)$ are equivalent. One expression can be $\alpha$-converted to another using substitution. Expressions that can be $\alpha$-converted to each other are $\alpha$-equivalent.
Beta reduction substitutes an argument into an abstraction, $(\lambda x.M)N \rightarrow M[x \leftarrow N]$, where the left expression is called a beta redex.
- A series of reductions is called a reduction sequence.
- A normal form is where no more reductions are possible.
- An evaluation is $A \Rightarrow B \equiv A \rightarrow^\star B$ where $B$ is a normal form.
- A program is said to diverge when it doesn't terminate.
Eta reduction is a method of simplifying expressions: $(\lambda x.Mx) \rightarrow_\eta M$, provided $x$ is not free in $M$. For example, $(\lambda y. \lambda x. yx)$ is $\eta$-equivalent to $(\lambda y. y)$.
Lambda Calculus is Turing complete.
A normal form is when no further reductions are possible.
- Normal order evaluation reduces the leftmost outermost redex first. If a normal form exists, normal order will find it.
- Applicative order evaluation reduces the leftmost innermost redex first, so arguments are evaluated before being substituted. It can diverge even when a normal form exists.
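The difference can be seen on a term whose argument diverges. A sketch using the same tuple representation, with a naive substitution that ignores capture (safe here only because the demo terms reuse no names):

```python
# One-step normal-order (leftmost-outermost) reduction over tuple terms.
def naive_subst(t, x, v):
    if t[0] == 'var':
        return v if t[1] == x else t
    if t[0] == 'lam':
        return t if t[1] == x else ('lam', t[1], naive_subst(t[2], x, v))
    return ('app', naive_subst(t[1], x, v), naive_subst(t[2], x, v))

def normal_step(t):
    """Reduce the leftmost outermost redex; None if t is in normal form."""
    if t[0] == 'app':
        f, a = t[1], t[2]
        if f[0] == 'lam':                       # outermost redex first
            return naive_subst(f[2], f[1], a)
        for i in (1, 2):                        # otherwise leftmost subterm
            s = normal_step(t[i])
            if s is not None:
                return ('app', s, t[2]) if i == 1 else ('app', t[1], s)
    if t[0] == 'lam':
        s = normal_step(t[2])
        if s is not None:
            return ('lam', t[1], s)
    return None

# (\x. y) Omega: normal order discards the diverging argument in one step;
# applicative order would reduce Omega forever.
w = ('lam', 'x', ('app', ('var', 'x'), ('var', 'x')))
omega = ('app', w, w)
print(normal_step(('app', ('lam', 'x', ('var', 'y')), omega)))   # ('var', 'y')
```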
Terms are $\beta$-equivalent if they can be reduced to the same term (in particular, to the same normal form when one exists). By Church's theorem, $\beta$-equivalence is undecidable: no algorithm can determine, for every pair of terms, whether they are $\beta$-equivalent.
Head normal form is reached when no further reductions are possible at the head of the term; the arguments may still contain redexes. A weak form does not reduce under $\lambda$, so the body of an abstraction is left unevaluated.
Using $e = x | \lambda x. e | e e$, the normal forms ($E$) can be denoted as
- Normal form $E = \lambda x.E | xE_1 \dots E_n$
- Head normal form $E = \lambda x.E | xe_1 \dots e_n$
- Weak normal form $E = \lambda x.e | xE_1 \dots E_n$
- Weak head normal form $E = \lambda x.e | xe_1 \dots e_n$
Any variable or abstraction is in WHNF. An application is in WHNF if its head is a variable, so that it is not a redex.
The de Bruijn representation replaces variable names with natural numbers: each occurrence is the number of binders between it and the $\lambda$ that binds it. For example, $\lambda x.\lambda y.x$ becomes $\lambda.\lambda.2$ (with 1-based indices). Under this representation $\alpha$-equivalent terms are syntactically identical, so no renaming is needed during substitution.
Any variable is in normal form. An abstraction is in NF if the body of the abstraction is in NF. An application is in NF if it is not a redex and there is no redex in its sub-expressions.
Some useful combinatory logic lambda expressions are given one letter names
- $I = \lambda x.x$ (Identity)
- $K = \lambda x.\lambda y.x$ (Konstant)
- $Y = \lambda f.(\lambda x.f(xx))(\lambda x.f(xx))$ (Y-combinator)
- $S = \lambda x.\lambda y.\lambda z.xz(yz)$ (Substitution)
- $B = \lambda x.\lambda y.\lambda z.x(yz)$ (Composition)
- $T = \lambda x. \lambda y. yx$ (Transposition)
Any combinator can be encoded using only $S$ and $K$.
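The combinators translate directly into Python closures, which lets the claim that the identity is expressible as $SKK$ (and composition as $S(KS)K$) be checked on concrete values:

```python
# S and K as closures; the Python names are a representation choice.
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x

I = S(K)(K)          # S K K x = K x (K x) = x, so this is the identity
B = S(K(S))(K)       # composition: B f g x = f (g x)

print(I(42))         # 42
```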
Church encoding uses the following lambda expressions
- $T = \lambda xy.x$ (True)
- $F = \lambda xy.y$ (False)
- $I = \lambda pxy.pxy$ (If then)
- $A = \lambda pq. pqp$ (And)
- $O = \lambda pq.ppq$ (Or)
- $N = \lambda pab.pba$ (Not)
These lambda functions work like binary operators. For example not true, $N(T)$, evaluates as follows:
- $(\lambda pab.pba)(\lambda xy.x)$
- $\lambda ab . (\lambda xy.x) ba$
- $\lambda ab . b$ which is $\alpha$-equivalent to $F$
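The boolean encodings behave like ordinary operators when written as closures. A sketch where `to_bool` is a hypothetical helper (not part of the encoding) that reads a Church boolean back as a Python `bool`:

```python
# Church booleans as closures.
T = lambda x: lambda y: x            # true:  select the first argument
F = lambda x: lambda y: y            # false: select the second argument
NOT = lambda p: lambda a: lambda b: p(b)(a)
AND = lambda p: lambda q: p(q)(p)
OR  = lambda p: lambda q: p(p)(q)

to_bool = lambda p: p(True)(False)   # apply the boolean to two Python values

print(to_bool(NOT(T)))               # False, i.e. NOT(T) behaves as F
```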
There are also definitions for addition and natural numbers
- $add = \lambda ijfx. (if)(jfx)$, also written $(+)$
- $0 = \lambda fx.x$
- $1 = \lambda fx.fx$
- $2 = \lambda fx.f(fx)$
- in general, $n \in \mathbb{N}$ is encoded as $\lambda fx. f^nx$
For example, $1+1$ is evaluated as follows:
- $(+)(1)(1)$
- $(\lambda ijfx.(if)(jfx))(\lambda fx.fx)(\lambda fx.fx)$
- $\alpha$-equivalent to $(\lambda ijfx.(if)(jfx))(\lambda gy.gy)(\lambda hw.hw)$
- $(\lambda jfx. ((\lambda gy. gy) f)(jfx))(\lambda hw.hw)$
- $(\lambda fx. ((\lambda gy. gy) f)((\lambda hw.hw)fx))$
- $(\lambda fx . (\lambda y. fy)(fx))$
- $(\lambda fx . f(fx))$ which is $2$
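The same evaluation can be replayed with Church numerals as closures. Here `to_int` is a hypothetical helper that applies the numeral to an ordinary successor function to count the applications of $f$:

```python
# Church numerals and addition as closures.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add  = lambda i: lambda j: lambda f: lambda x: i(f)(j(f)(x))   # λijfx.(if)(jfx)

to_int = lambda n: n(lambda k: k + 1)(0)   # count how often f is applied
one = succ(zero)

print(to_int(add(one)(one)))               # 2
```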
Environments are associations of free variables with values. For example, $E = \{T = \lambda xy.x, O=\lambda pq. ppq, \dots\}$. $E \vdash e \rightarrow n$ denotes a judgement and means that in a given environment $E$, expression $e$ evaluates to value $n$.
These can be used to write rules, such as $$ \frac{E \vdash e_1 \rightarrow T}{E \vdash O e_1e_2 \rightarrow T} $$ which states that in an environment where $e_1$ evaluates to $T$, we can conclude that $O e_1e_2$ evaluates to $T$.
Another example is for AND: $$ \frac{E \vdash e_1 \rightarrow T \quad E \vdash e_2 \rightarrow n}{E \vdash A e_1e_2 \rightarrow n} $$
Types can be added to lambda calculus by including the following
- Type environment - List $H = x_1:t_1, \dots x_n:t_n$ of pairs of variables and types, where variables are distinct
- Typing judgement - Triple consisting of a type environment $H$, a term $M$ and a type $t$ such that all the free variables of $M$ appear in $H$ and stating that in the environment $H$, the term $M$ has type $t$, denoted $H \vdash M:t$
- Typing derivations - trees with nodes labelled by typing judgements, giving a demonstration of $H \vdash M:t$
The Simply Typed Lambda Calculus (STLC) consists of
- $x \in$ Variables
- $t ::= o | t \rightarrow t$
- $M ::= x | \lambda x: t.M | MM$
This type system is referred to as System $F_1$.
Dynamic semantics show what a lambda expression evaluates to. Static semantics show what type an expression has.
The natural semantics show the dynamic semantics of a language and are denoted $E \vdash e \Rightarrow v$ where $\Rightarrow$ is the transitive closure of $\beta$-reduction. The rules for application, variables and abstraction are as follows $$ \begin{aligned} &\frac{E \vdash M \Rightarrow \lambda x:t.M' \quad E \vdash M'[x \leftarrow P] \Rightarrow N}{E \vdash MP \Rightarrow N}appl \\ &\frac{}{\{x=v, \dots\} \vdash x \Rightarrow v} {var}\\ &\frac{E \vdash M \Rightarrow N}{E \vdash \lambda x:t.M \Rightarrow \lambda x:t.N} abs \end{aligned} $$
The rules for types (static semantics) are $$ \begin{aligned} &\frac{H \vdash M: s \rightarrow t \quad H \vdash N : s}{H \vdash MN : t}appl \\ &\frac{}{H, x:t \vdash x:t} {id}\\ &\frac{H,x:s \vdash M:t}{H \vdash \lambda x:s.M:s \rightarrow t} abs \end{aligned} $$
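The three static-semantics rules read off as a small checker. A sketch assuming types are the string `'o'` or tuples `('->', s, t)`, and terms are `('var', x)`, `('lam', x, t, body)`, `('app', M, N)` (representation choices, not from the calculus):

```python
# A checker implementing the id, abs and appl typing rules.
def typecheck(h, m):
    if m[0] == 'var':
        return h[m[1]]                      # id: look x up in the environment
    if m[0] == 'lam':
        _, x, s, body = m
        t = typecheck({**h, x: s}, body)    # abs: extend H with x:s
        return ('->', s, t)
    f = typecheck(h, m[1])                  # appl: M must have an arrow type
    a = typecheck(h, m[2])
    if f[0] != '->' or f[1] != a:
        raise TypeError("ill-typed application")
    return f[2]

# \x:o. x  has type  o -> o
print(typecheck({}, ('lam', 'x', 'o', ('var', 'x'))))   # ('->', 'o', 'o')
```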
PCF
The PCF language combines Church encoding and type systems. A reduced subset of PCF, PCF-, is defined as $$ \begin{aligned} t ::= \text{num} &| \text{bool} | t \rightarrow t \\ M ::= 0 &| \text{true} | \text{false} | \mathbb{N} \\ &| \text{succ}(M) | \text{pred}(M) | \text{zero?}(M)\\ &| M+M | M \times M \\ &| \text{if } M \text{ then } M \text{ else } M | x | \lambda x : t.M | MM \end{aligned} $$
This can be extended with pairs and declarations $$ \begin{aligned} t &::= t * t \\ M &::= \left<M, M\right>|\text{fst}(M)|\text{snd}(M) | \text{let } d \text{ in } M\\ d &::= x = M | d \text{ and } d \end{aligned} $$
[missing relation between types and values]
Semantic consistency is given by the subject reduction theorems
- If $E \vdash e \Rightarrow v$, $E \in T$ and $T \vdash e: t$ then $v \in t$
- If $E \vdash d \Rightarrow E'$, $E \in T$ and $T \vdash d: T'$ then $E' \in T'$
The Y-combinator is used to represent recursive functions. $Yx = x(Yx)$. There are several ways of representing $Y$ in lambda calculus including $Y = \lambda x.VV$ where $V = \lambda y.x(yy)$ or $Y = ZZ$ where $Z = \lambda zx.x(zzx)$. The factorial function can then be denoted $\text{Fact} = Y(\lambda f: \text{num} \rightarrow \text{num} . \lambda n: \text{num} . \text{if zero?}(n) \text{ then succ}(n) \text{ else} (n \times f(\text{pred}(n))))$.
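Running $Y$ directly in an eager language loops, so a strict variant $Z = \lambda f.(\lambda x.f(\lambda v.xxv))(\lambda x.f(\lambda v.xxv))$ is used instead. A Python sketch of the factorial built this way:

```python
# Z: an eta-expanded Y that delays the self-application under a lambda,
# so it works under applicative (eager) evaluation.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Fact = Z(\f. \n. if zero?(n) then 1 else n * f(pred(n)))
fact = Z(lambda f: lambda n: 1 if n == 0 else n * f(n - 1))

print(fact(5))   # 120
```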
The shorthand $\mu f. [\dots]$ represents $Y(\lambda f. [\dots])$. PCF also provides other syntactic sugar
- $f = \lambda x.\lambda y. xy$ is written $f(x, y) = xy$
- $\text{let rec } f = M$ is shorthand for $f = \mu f. M$
- $\text{let } x = M \text{ in } N$ is written $\text{let } x = M;; N$
Small step semantics can be added to PCF to determine the order of evaluation. The axioms are as follows: [axioms go here]
- Call by name passes arguments unevaluated: $(\lambda x.M)N \rightarrow M[x \leftarrow N]$ even when $N$ is not a value.
- Call by value evaluates the argument to a value $v$ first: $(\lambda x.M)v \rightarrow M[x \leftarrow v]$.
MiniML
The Curry-Howard isomorphism views propositions as types and proofs as programs. The types of closed expressions can be interpreted as tautologies.
Currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions each with a single argument.
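A minimal sketch of currying in Python, where `curry2` is a hypothetical helper name:

```python
def curry2(f):
    """Turn a two-argument function into a chain of one-argument functions."""
    return lambda x: lambda y: f(x, y)

add = curry2(lambda x, y: x + y)
inc = add(1)            # partial application: the first argument is fixed

print(inc(41))          # 42
```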
MiniML has type polymorphism, type inference and variables to refer to unknown types. MiniML consists of terms of the form $$ \begin{aligned} e := & c_\text{bool} | c_\text{num} | \left<e_1,e_2\right> | \text{fst } e | \text{snd } e\\ & | x | e_1e_2 | \text{fn } x.e | \text{let } d \text{ in } e\\ & | e_1 = e_2 | e_1 + e_2 | \text{if } e_0 \text{ then } e_1 \text{ else } e_2\\ & | \text{nil} | e_1 :: e_2 | \text{hd } e | \text{tl } e \end{aligned} $$ and types $\tau := \text{num} | \text{bool} | \tau_1 * \tau_2 | \tau_1 \rightarrow \tau_2 | \alpha, \beta,... | \tau \text{ list}$, where $\alpha, \beta, \dots$ are type variables.
Polymorphism uses universally quantified types. For a list, nil has type $\forall \alpha.\alpha \text{ list}$ and cons has type $\forall \alpha . \alpha * \alpha \text{ list} \rightarrow \alpha \text{ list}$.
Type inference algorithms
- to each bound variable and to every expression, assign a new type variable
- establish a set of equations
- solve the equations, by replacing variables as needed.
Type expressions are unified (denoted $\sim$) by substituting type variables to obtain two identical type expressions. If the replacements are inconsistent, unification stops with an error.
MiniML type inference works because it is first-order. In order to keep type inference efficient, MiniML keeps types and type schemes separate.
| $\sim$ | $\text{num}$ | $\text{bool}$ | $t_1 \times t_2$ | $t_1 \rightarrow t_2$ | $t_1 \text{ list}$ | $\alpha$ |
|---|---|---|---|---|---|---|
| $\text{num}$ | y | n | n | n | n | $\alpha \leftarrow \text{num}$ |
| $\text{bool}$ | n | y | n | n | n | $\alpha \leftarrow \text{bool}$ |
| $t_3 \times t_4$ | n | n | if and only if $t_1 \sim t_3$ and $t_2 \sim t_4$ | n | n | $\alpha \leftarrow t_3 \times t_4$ |
| $t_3 \rightarrow t_4$ | n | n | n | if and only if $t_1 \sim t_3$ and $t_2 \sim t_4$ | n | $\alpha \leftarrow t_3 \rightarrow t_4$ |
| $t_3 \text{ list}$ | n | n | n | n | if and only if $t_1 \sim t_3$ | $\alpha \leftarrow t_3 \text{ list}$ |
| $\beta$ | $\beta \leftarrow \text{num}$ | $\beta \leftarrow \text{bool}$ | $\beta \leftarrow t_1 \times t_2$ | $\beta \leftarrow t_1 \rightarrow t_2$ | $\beta \leftarrow t_1 \text{ list}$ | $\beta \leftarrow \alpha$ |
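The table is exactly the case analysis of first-order unification. A sketch assuming type variables are strings beginning with `'`, base types are `'num'` and `'bool'`, and constructors are tagged tuples `('*', t1, t2)`, `('->', t1, t2)`, `('list', t)`; the occurs check is omitted for brevity:

```python
def resolve(t, s):
    # Chase a variable through the substitution built so far.
    while isinstance(t, str) and t in s:
        t = s[t]
    return t

def unify(t1, t2, s):
    t1, t2 = resolve(t1, s), resolve(t2, s)
    if t1 == t2:
        return s                                  # identical: nothing to do
    if isinstance(t1, str) and t1.startswith("'"):
        return {**s, t1: t2}                      # alpha <- t
    if isinstance(t2, str) and t2.startswith("'"):
        return {**s, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and t1[0] == t2[0]:
        for a, b in zip(t1[1:], t2[1:]):          # same constructor: recurse
            s = unify(a, b, s)
        return s
    raise TypeError(f"cannot unify {t1} and {t2}")

s = unify(('->', "'a", 'num'), ('->', 'bool', "'b"), {})
print(s)   # {"'a": 'bool', "'b": 'num'}
```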
Hindley-Damas-Milner type inference algorithm
- Build a tree based on structural induction over the term.
- To each occurrence of a constant, ascribe an arbitrary ‘instance’ of its universally quantified type.
- Build a set of constraints based on each typing statement associated with each rule used in building the tree.
- Solve the constraints to find the most general unifier:
  - two type expressions can be unified if they are the same basic type,
  - two type expressions can be unified if one of them is a type variable,
  - two type expressions can be unified if they are built from the same type constructor and their sub-expressions can be unified.
[missing loads of stuff]
An abstract data type (ADT) consists of
- content $\left<3, \text{succ}\right>$
- interface $a * (a \rightarrow \text{num})$
- representation $a = \text{num}$
To check the type of an ADT, substitute the representation into the interface to check content.
Imperative languages
The IMP toy language models an imperative language. It consists of
- Arithmetic expressions $a := n|v|a_1+a_2|a_1-a_2|a_1*a_2$
- Boolean expressions $b := \text{true} | \text{false} | a_1=a_2 | a_1 \leq a_2 | \neg b|b_1 \wedge b_2$
- Commands $c := \text{skip} | v := a | c_1;c_2 | \text{if } b \text{ then } c_1 \text{ else } c_2 | \text{while } b \text{ do } c$
The language uses a store $s$ which is a mapping from variables to values.
Semantic descriptions are functions that define the meaning of expressions. For example, $A[n]s = N[n]$ means that the semantics of a numeral in a store $s$ is its value. $B[a_1 = a_2] = \lambda s.A[a_1]s = A[a_2]s$ means that the semantics of equality is the equality of the values of the operands in the store. $B[b_1 \wedge b_2]s$ is $T$ exactly when both $B[b_1]s$ and $B[b_2]s$ are $T$.
Commands modify stores. Operational semantics map commands and stores to stores, $\left<c,s_1\right> \Rightarrow s_2$. The operational semantics of assignment are $\left<v := a, s\right> \Rightarrow s[v \mapsto A[a]s]$. $\left<\text{skip}, s\right> \Rightarrow s$ means the store is unaffected by skip.
The operational semantics for sequential composition are given by $$ \frac{\left<c_1, s\right> \Rightarrow s_2 \qquad \left<c_2,s_2\right>\Rightarrow s_3}{\left<c_1;c_2, s\right> \Rightarrow s_3} $$
The operational semantics for the if-then-else command are given by $$ \frac{B[b]s \text{ is } T \qquad \left<c_1, s\right> \Rightarrow s_2}{\left<\text{if } b \text{ then } c_1 \text{ else } c_2, s\right> \Rightarrow s_2} $$
and
$$ \frac{B[b]s \text{ is } F \qquad \left<c_2, s\right> \Rightarrow s_2}{\left<\text{if } b \text{ then } c_1 \text{ else } c_2, s\right> \Rightarrow s_2} $$
The operational semantics for the while command are given by $$ \frac{B[b]s \text{ is } T \quad \left<c,s\right> \Rightarrow s' \quad \left<\text{while } b \text{ do } c, s'\right> \Rightarrow s''}{\left<\text{while } b \text{ do } c, s\right> \Rightarrow s''} $$ and $$ \frac{B[b]s \text{ is } F}{\left<\text{while } b \text{ do } c, s\right> \Rightarrow s} $$
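The big-step rules above read off as an interpreter. A sketch where stores are dicts, commands are tagged tuples, and the semantic functions $A[a]$ and $B[b]$ are stood in for by Python callables on the store:

```python
def run(c, s):
    """Big-step execution: <c, s> => s'."""
    if c[0] == 'skip':
        return s                            # skip leaves the store unchanged
    if c[0] == 'assign':
        _, v, a = c
        return {**s, v: a(s)}               # s[v -> A[a]s]
    if c[0] == 'seq':
        return run(c[2], run(c[1], s))      # thread the store through c1; c2
    if c[0] == 'if':
        _, b, c1, c2 = c
        return run(c1 if b(s) else c2, s)
    _, b, body = c                          # while: unfold per the two rules
    return run(c, run(body, s)) if b(s) else s

# x := 5; while x > 0 do x := x - 1
prog = ('seq', ('assign', 'x', lambda s: 5),
               ('while', lambda s: s['x'] > 0,
                         ('assign', 'x', lambda s: s['x'] - 1)))
print(run(prog, {}))   # {'x': 0}
```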
Structural operational semantics corresponds to the workings of an abstract machine in executing the first step of a computation, resulting in either a terminal configuration $s'$ or an intermediate configuration $\left<c', s'\right>$.
The structural operational semantics of the while command are given by $$ \frac{B[b]s \text{ is } T}{\left<\text{while } b \text{ do } c, s\right> \rightarrow \left<c; \text{while } b \text{ do } c, s\right>} $$
and $$ \frac{B[b]s \text{ is } F}{\left<\text{while } b \text{ do } c, s\right> \rightarrow s} $$
The structural operational semantics for sequential composition are given by $$ \frac{\left<c_1, s\right> \rightarrow \left<c_1', s'\right>}{\left<c_1;c_2, s\right> \rightarrow \left<c_1';c_2, s'\right>} $$ and $$ \frac{\left<c_1, s\right> \rightarrow s'}{\left<c_1;c_2, s\right> \rightarrow \left<c_2, s'\right>} $$
The derivation sequence of configurations can either be finite, where the last configuration is either a terminal configuration or a stuck configuration, or an infinite sequence. If the derivation sequence from $\left<c, s\right>$ is finite it can be denoted either $\left<c, s\right> \rightarrow^* s'$ or $\left<c, s\right> \rightarrow^* \left<c', s'\right>$.
Block is an extension to IMP which introduces variables with a scope. An example program is
```
begin var x := 0; var y := 1;
(
x := 1;
begin var x := 2; y := x + 1 end;
x := y + x
)
end
```
The syntax for commands can be extended to give $c := \dots | \text{begin } d_vc \text{ end}$ and $d_v := \text{var } v := a; d_v | \epsilon$.
Another approach to capturing variable declarations is to make explicit the fact that values are stored in locations. There are environments that map variables to locations and stores that map locations to values. The natural semantics of commands must take the environment into consideration, $E_v \vdash \left<c, s\right> \Rightarrow s'$.
Another extension to the Block language is the Proc language. This adds procedures without parameters. It has the abstract syntax $$ \begin{aligned} c &:= \dots | \text{begin } d_vd_pc \text{ end} | \text{call } p\ d_v &:= \text{var } v := a; d_v | \epsilon\ d_p &:= \text{proc } p \text{ is } c; d_p | \epsilon \end{aligned} $$
This introduces the problem of variable scoping.
With dynamic scope, a procedure body is evaluated using the variable and procedure bindings in force at the call site; with static scope, it uses the bindings in force at the point of declaration. Static scope requires recording the declaration-time environment alongside the procedure body.
Proc can be extended to add a parameter to procedures, $d_p := \text{proc } p(v) \text{ is } c$.