CS255 - Artificial Intelligence
Textbooks: Poole & Mackworth, Artificial Intelligence (2nd edition) https://artint.info/2e/html/ArtInt2e.html
Russell & Norvig, Artificial Intelligence: A Modern Approach, 4th edition
Rational agents
An agent is an entity that perceives and acts.
An agent can be viewed as a function from percept histories to actions, $f: P^* \rightarrow A$, where $P^*$ is the set of percept sequences and $A$ the set of actions.
Agents are typically required to exhibit autonomy.
For any given class of environments and tasks we seek the agents with the best performance.
Perfect rationality is usually impractical given computational restraints.
The aim is to design the best agent for a given set of resources.
Artificial intelligence is the synthesis and analysis of computational agents that act intelligently.
An agent acts intelligently if
- its actions are appropriate for its goals and circumstances
- it is flexible to changing environments and goals
- it learns from experience
- it makes appropriate choices given its perceptual and computational limitations.
Goals of artificial intelligence
Scientific goal - understand the principles that make intelligent behaviour possible in natural or artificial systems
- analyse natural and artificial agents
- formulate and test hypotheses about what it takes to construct intelligent agents
- design, build and experiment with computational systems that perform tasks that require intelligence
Engineering goal - design useful, intelligent artifacts.
Inputs to an agent
Abilities - the set of actions it can perform
Goals/preferences - what it wants
Prior knowledge - what it comes into being knowing, what it doesn't get from experience
History of stimuli - what it receives from the environment now or has received in the past
Example - autonomous car
Abilities - steer, accelerate, brake
Goals - safety, reaching the destination
Prior knowledge - what road signs mean, what to stop for
Stimuli - vision, laser, GPS
Past experiences - maps, how braking and steering affects direction
Rational agents
Goals can be specified by a performance measure. A rational action maximises the performance measure given the percept sequence to date.
Rational is not the same as omniscient - an omniscient agent knows the outcome of its actions, whereas a rational agent does its best given the current percepts. Rational agents are not expected to foresee future changes in the environment. Rational action is defined in terms of expected value, so a failed action can still be rational.
Dimensions of complexity
Modularity
- Flat - Agent structure has one layer of abstraction
- Modular - Agent structure has interacting modules that can be understood separately
- Hierarchical - Agent structure has modules that are (recursively) decomposed into modules
Planning horizon
- Static - the world does not change
- Finite stage - the agent reasons about a fixed finite number of time steps
- Indefinite stage - the agent reasons about a finite but not predetermined number of time steps
- Infinite stage - the agent plans for going on forever
Representation
- Explicit states - a state is one way the world could be
- Features or propositions - states can be described using features
- Individuals and relations - There is a feature for each relationship on each tuple of individuals.
Computational limits
- Perfect rationality - The agent can determine the best course of action without exhausting its resources
- Bounded rationality - The agent must make good decisions based on its perceptual, computational and memory limitations
Learning from experience
- Knowledge is given
- Knowledge is learned
Sensing uncertainty
There are two dimensions for uncertainty — sensing and effect. In each dimension an agent can have:
- No uncertainty: the agent knows what is true
- Disjunctive uncertainty: there is a set of states that are possible
- Probabilistic uncertainty: a probability distribution over the worlds
Agents need to act even if they are uncertain. Predictions are needed to decide what to do:
- definitive predictions: you will run out of power
- disjunctions: charge for 30 minutes or you will run out of power
- probabilities: the probability you will run out of power is 0.01 if you charge for 30 minutes and 0.8 otherwise
Acting is gambling: agents who do not use probabilities will lose to those who do. Probabilities can be learned from data and prior knowledge.
For sensing uncertainty, the world can be:
- Fully-observable: the agent can observe the state of the world
- Partially-observable: there can be a number of states that are possible given the agent's stimuli
Effect uncertainty
If an agent knew the initial state and its action, could it predict the resulting state? The dynamics can be:
- Deterministic: the resulting state is determined from the action and the state
- Stochastic: there is uncertainty about the resulting state
Preferences - What does the agent try to achieve?
- Achievement goal is a goal to achieve
- Complex preferences may involve trade-offs between various desiderata, perhaps at different times
- ordinal: only the order matters
- cardinal: absolute values also matter
Number of agents
Are there multiple reasoning agents that need to be taken into account?
- Single agent reasoning: any other agents are part of the environment
- Multiple agent reasoning: an agent reasons strategically about the reasoning of other agents
Agents can have their own goals: cooperative, competitive, or goals can be independent of each other.
Interaction - When does the agent reason to determine what to do?
- reason offline: before acting
- reason online: while interacting with environment.
Representations
Representations need to be
- rich enough to express the knowledge needed to solve the problem
- as close to the problem as possible: compact, natural and maintainable
- amenable to efficient computation
- able to express features of the problem that can be exploited for computational gain
- able to trade off accuracy and computation time and/or space
- able to be acquired from people, data and past experiences
Physical symbol system hypothesis
A symbol is a meaningful physical pattern that can be manipulated. A symbol system creates, copies, modifies and destroys symbols.
Physical symbol system hypothesis
A physical symbol system has the necessary and sufficient means for general intelligent action.
Two levels of abstraction are common among biological and computational entities:
- The knowledge level is in terms of what an agent knows and what its goals are
- The symbol level is a level of description of an agent in terms of what reasoning it is doing
The knowledge level is about the world external to the agent. The symbol level is about what symbols an agent uses to implement the knowledge level.
Agent system architecture
An agent comprises a body and a controller. An agent interacts with the environment through its body. The body is made up of sensors that interpret stimuli and actuators that carry out actions. The controller receives percepts from the body and sends commands to the body.
Let $T$ be the set of time points. A percept trace is a sequence of all past, present and future percepts received by the controller. A command trace is a sequence of all past, present and future commands output by the controller. A transduction function is a function from percept traces into command traces. A transduction is causal if the command trace up to time $t$ depends only on percepts up to $t$. A controller is an implementation of a causal transduction.
Belief states
An agent does not have access to its entire history; it only has access to what it remembers. The memory or belief state of an agent at time $t$ encodes all of the agent's history that it has access to. The belief state of an agent encapsulates the information about its past that it can use for current and future actions.
At every time, a controller has to decide what it should do and what it should remember as a function of its percepts and its memory.
For discrete time, a controller implements
- belief state function: remember(belief_state, percept) returns the next belief state
- command function: do(memory, percept) returns the command for the agent
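A minimal Python sketch of this controller loop (the `remember`/`do_command` names, the running-average memory and the numeric percepts are illustrative assumptions, not from the notes):

```python
# Sketch of a discrete-time controller: a belief state function plus a command function.

def remember(belief_state, percept):
    # belief state function: here the memory is a running average of sensed values
    avg, n = belief_state
    return ((avg * n + percept) / (n + 1), n + 1)

def do_command(belief_state, percept):
    # command function: choose a command from the memory and the current percept
    avg, _ = belief_state
    return "act_high" if percept > avg else "act_low"

belief_state = (0.0, 0)          # initial memory
for percept in [1.0, 2.0, 0.5]:  # percepts arriving from the body
    command = do_command(belief_state, percept)
    belief_state = remember(belief_state, percept)
    print(command, belief_state)
```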
Hierarchical controllers
Each controller sees the controllers below it as a virtual body from which it gets percepts and sends commands. The lower level controllers can run much faster, react to the world more quickly and deliver a simpler view of the world to the higher level controllers.
A purely reactive agent does not have a belief state.
A dead reckoning agent does not perceive the world.
An ideal rational agent does whatever action is expected to maximise its performance measure, on the basis of the percept sequence and built-in knowledge.
Types of agent
- Simple reflex agents
- Model-based reflex agents
- Goal-based agents
- Utility-based agents
- Learning agents
Problem solving and uninformed search
A problem solving agent is a goal-based agent that will determine sequences of actions that lead to desirable states.
There are four steps a problem-solving agent must take:
- Goal formulation - identify goal given current situation
- Problem formulation - identify permissible actions and states to consider
- Search - find sequence of actions to achieve goal
- Execution - perform the actions in the solution
There are two types of problem solving:
- offline - complete knowledge of problem and solution
- online - involves acting without complete knowledge of problem and solution
Simple problem solving agent
function simpleProblemSolvingAgent(p) returns an action
    inputs: p, a percept
    static: s, an action sequence, initially empty
            state, a description of the current world state
            g, a goal
            problem, a problem formulation

    state <- updateState(state, p)
    if s is empty then
        g <- formulateGoal(state)
        problem <- formulateProblem(state, g)
        s <- search(problem)
    action <- recommendation(s, state)
    s <- remainder(s, state)
    return action
Dimensions of state space search
- Modularity - flat
- Representation - states
- Planning horizon - indefinite stage
- Sensing uncertainty - fully observable
- Effect uncertainty - deterministic
- Preference - goals
- Number of agents - single agent
- Learning - knowledge is given
- Computational limits - perfect rationality
- Interaction - offline
Problem types
- Deterministic, fully observable - Single state problem
- sensors tell agent current state and it knows exactly what its actions do
- means that it knows exactly what state it will be in after any action
- Deterministic, partially observable - multiple-state problem
- limited access to state, but knows what its actions do
- can determine set of states resulting from an action
- instead of single states, agent must manipulate sets of states
- Stochastic, partially observable - contingency problem
- don’t know current state, and don’t know what state will result from action
- must use sensors during execution
- solution is a tree with branches for contingencies
- often interleave search and execution
- Unknown state space, i.e. knowledge is learned - exploration problem (online)
- agent doesn’t know what its actions will do
- example: don’t have a map, get from Coventry to London
State-space problems
A state-space problem consists of
- a set of states
- a subset of states called the start states
- a set of actions
- an action function
- a set of goal states
- a criterion that specifies the quality of an acceptable solution
A general formulation of a problem-solving task is a state-space graph. This is a directed graph comprising a set $N$ of nodes and a set $A$ of ordered pairs of nodes called arcs or edges. Node $n_2$ is a neighbour of $n_1$ if there is an arc from $n_1$ to $n_2$. A path is a sequence of nodes such that every pair of adjacent nodes are neighbours. Given a set of start nodes and a set of goal nodes, a solution is a path from a start node to a goal node.
Tree search is the basic approach to problem solving. It is an offline, simulated exploration of the state space. It works by starting with a start state and building a search tree, expanding one of the generated but not yet expanded states to produce its successors.
Uninformed tree search
Uninformed search only uses information from problem definition. There is no measure of the best node to expand.
Two simple types of uninformed tree search are breadth-first and depth-first search.
Breadth-first search is complete: it will always find a solution when the branching factor $b$ is finite. It has time and space complexity $O(b^d)$, where $b$ is the maximum number of children of each node and $d$ is the depth of the shallowest solution.
Depth-first search is not complete: it fails in infinite-depth spaces or those with loops. It has time complexity $O(b^m)$ and space complexity $O(bm)$, where $m$ is the maximum depth reached, so its space requirement is considerably better than breadth-first.
Graph search
With tree search, state spaces with loops give rise to repeated states that cause inefficiency. Graph search is a practical way of exploring the state space to account for loops.
input: a graph, a set of start nodes, boolean procedure goal(n)
let frontier = {<s> : s is a start node}
while frontier is not empty:
    select and remove path <n_0, ..., n_k> from frontier
    if goal(n_k):
        return <n_0, ..., n_k>
    for every neighbour n of n_k:
        add <n_0, ..., n_k, n> to frontier
This is similar to tree search, but stores paths instead of nodes.
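A rough Python rendering of this generic procedure (the dict-of-neighbours graph representation and helper names are assumptions for illustration); the choice of which path to remove from the frontier determines the search strategy:

```python
from collections import deque

def generic_search(graph, start_nodes, is_goal, depth_first=False):
    """Generic graph search over paths, as described above.
    graph: dict mapping each node to a list of neighbours (assumed representation).
    Removing from the right of the frontier gives depth-first search (stack);
    removing from the left gives breadth-first search (queue)."""
    frontier = deque([(s,) for s in start_nodes])    # frontier of paths
    while frontier:
        path = frontier.pop() if depth_first else frontier.popleft()
        node = path[-1]
        if is_goal(node):
            return path
        for neighbour in graph.get(node, []):
            frontier.append(path + (neighbour,))     # extend the path by one arc
    return None  # no solution

# tiny illustrative graph
g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(generic_search(g, ["a"], lambda n: n == "d"))  # ('a', 'b', 'd')
```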
For breadth first graph search, the time and space complexity is $O(b^n)$ where $b$ is the branching factor and $n$ is the length of the path.
Depth first graph search has time complexity $O(b^n)$ and linear space complexity.
Lowest cost first search
Sometimes there are costs associated with arcs. The cost of a path is the sum of the costs of its arcs. An optimal solution is one with minimum cost. At each stage, lowest cost first search selects a path on the frontier with lowest cost. The frontier is a priority queue ordered by path cost. The first path to a goal is a least cost path to a goal node. When arc costs are equal this is the same as breadth first search.
Informed search
Uninformed search is generally very inefficient. Informed search makes use of problem specific knowledge.
Examples of informed search methods:
- Best-first search
- A* search
- Pruning search
- Depth bounded search
- Branch and bound search
- Heuristics
Heuristic search
Heuristics are extra knowledge that can be used to guide a search. Let $h(n)$ be an estimate of the cost of the shortest path from node $n$ to a goal node. $h(n)$ needs to be efficient to compute. $h(n)$ is an underestimate if there is no path from $n$ to a goal with cost strictly less than $h(n)$. An admissible heuristic is a non-negative heuristic function that is an underestimate of the actual cost of a path to a goal.
Best first search
The heuristic function can be used to order the frontier. Heuristic depth-first search orders a node's neighbours so that the best neighbour is expanded first. Greedy best-first search selects the path on the frontier with the lowest heuristic value; it treats the frontier as a priority queue ordered by $h$.
The space complexity of greedy best-first search is $b^n$ where $b$ is the branching factor and $n$ is the path length; time complexity is also $b^n$. It is not guaranteed to find a solution, even if one exists, and it does not always find the shortest path.
A* Search
A* search uses both path cost and heuristic values. $cost(p)$ is the cost of a path $p$ and $h(p)$ estimates the cost from the end of $p$ to a goal.
Let $f(p) = cost(p) + h(p)$. $f(p)$ estimates the total path cost of going from a start node to a goal via $p$.
A* is a mix of lowest cost first and best first search. It treats the frontier as a priority queue ordered by $f(p)$. It always selects the path on the frontier with the lowest estimated total cost from the start to a goal node, constrained to go via that path.
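A minimal Python sketch of A* over the same kind of graph-as-dict used above, now with arc costs (the weights and heuristic are assumptions for illustration; with $h \equiv 0$ this reduces to lowest cost first search):

```python
import heapq

def a_star(graph, start, is_goal, h):
    """A* search. graph maps node -> list of (neighbour, arc_cost) pairs (assumed
    representation); h(n) should be an admissible heuristic. The frontier is a
    priority queue of (f, cost, path) ordered by f = cost(p) + h(end of p)."""
    frontier = [(h(start), 0.0, (start,))]
    while frontier:
        f, cost, path = heapq.heappop(frontier)
        node = path[-1]
        if is_goal(node):
            return path, cost
        for neighbour, arc_cost in graph.get(node, []):
            new_cost = cost + arc_cost
            heapq.heappush(frontier,
                           (new_cost + h(neighbour), new_cost, path + (neighbour,)))
    return None

# illustrative weighted graph; the zero heuristic makes this lowest cost first
g = {"a": [("b", 1), ("c", 4)], "b": [("c", 1), ("d", 5)], "c": [("d", 1)], "d": []}
print(a_star(g, "a", lambda n: n == "d", lambda n: 0))  # (('a', 'b', 'c', 'd'), 3.0)
```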
A search algorithm is admissible if whenever a solution exists it finds the optimal one. A* is admissible if
- the branching factor is finite
- arc costs are bounded above zero
- $h(n)$ is admissible - it is non-negative and an underestimate of the cost of the shortest path from $n$ to a goal node
If a path $p$ to a goal is selected from the frontier, it must be a shortest path to the goal. Consider a path $p'$ on the frontier. Because $p$ was selected before $p'$ and $h(p)=0$, we have $cost(p) \leq cost(p') + h(p')$. Because $h$ is an underestimate, $cost(p') + h(p')$ is at most the cost of any path to a goal that extends $p'$, so there is no better path than $p$ to the goal.
A* will always find a solution if there is one: the frontier always contains the initial part of a path to a goal before the goal is selected, and A* halts, since the costs of the paths on the frontier keep increasing and will eventually exceed any finite number. Admissibility ensures that the first solution found is optimal, even in a graph with cycles. Paths with cycles can be pruned.
Heuristic function $h$ satisfies the monotone restriction if $h(m) - h(n) \leq cost(m, n)$ for every arc $\left<m, n\right>$ in the state space. If $h$ satisfies the monotone restriction it is consistent, meaning $h(n) \leq cost(n, n') + h(n')$ for any two nodes $n$ and $n'$.
A* with a consistent heuristic and multiple path pruning always finds the shortest path to a goal. This is a strengthening of the admissibility criterion.
Search complexity is $b^n$. A* can either use the forward (arcs out of a node) or backward (arcs into a node) branching factor, so to minimise the complexity the smaller branching factor should be used. This may not be possible if the graph is constructed dynamically as the backward branching factor might not be known.
Bidirectional search
Search backwards from the goal and forwards from the start simultaneously. This is more effective as $2b^{k/2} < b^k$ so can result in an exponential saving in time and space where $k$ is the depth of the goal. The main problem is ensuring the frontiers meet.
Island-driven search uses a set of islands, which are places the forwards and backwards searches might meet, between $s$ and $g$. There are now $m$ smaller problems which is effective as $mb^{k/m} < b^k$, improving both space and time complexity.
Dynamic programming
For statically stored graphs, build a table $dist(n)$ of the actual distance from node $n$ to a goal. This can be built backwards from the goal: $$ dist(n) = \begin{cases} 0 & \text{if } n \text{ is a goal} \\ \min_{\left<n,m\right>\in A}\left(\left|\left<n,m\right>\right| + dist(m)\right) & \text{otherwise} \end{cases} $$ where $\left|\left<n,m\right>\right|$ is the cost of arc $\left<n,m\right>$. This can be used locally to determine what to do, defining a policy of which arc to take from a given node. There are two main problems:
- It requires enough space to store the graph
- The $dist$ function needs to be recomputed for each goal
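A small Python sketch of building the $dist$ table by repeated relaxation over a statically stored weighted graph (the representation and names are assumptions; a priority-queue version would be more efficient):

```python
import math

def build_dist(graph, goals):
    """Compute dist(n): the least cost from each node to a goal.
    graph maps node -> list of (neighbour, arc_cost) pairs (assumed representation).
    Repeatedly relax every arc until no estimate improves (a simple fixed point)."""
    dist = {n: (0.0 if n in goals else math.inf) for n in graph}
    changed = True
    while changed:
        changed = False
        for n, arcs in graph.items():
            for m, cost in arcs:
                if cost + dist[m] < dist[n]:
                    dist[n] = cost + dist[m]
                    changed = True
    return dist

g = {"a": [("b", 1), ("c", 4)], "b": [("c", 1), ("d", 5)], "c": [("d", 1)], "d": []}
print(build_dist(g, {"d"}))  # {'a': 3.0, 'b': 2.0, 'c': 1.0, 'd': 0.0}
# The policy at a node: take the arc minimising arc cost + dist(neighbour).
```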
Iterative deepening search
Start with a depth bound of 0. Do a depth-bounded depth-first search with that bound. If a solution is found, return it; otherwise increment the bound and repeat. This will find the same first solution as breadth first search.
Iterative deepening has an asymptotic overhead of $(\frac{b}{b-1})$ times the cost of expanding the nodes at depth $k$ using breadth first search, where $b$ is the branching factor. As $b$ gets larger, the overhead reduces.
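A compact Python sketch of iterative deepening over the unweighted graph-as-dict representation assumed earlier (the `max_bound` cut-off is an illustrative assumption):

```python
def depth_bounded_search(graph, path, is_goal, bound):
    """Depth-first search from the end of `path`, limited to `bound` more arcs."""
    node = path[-1]
    if is_goal(node):
        return path
    if bound == 0:
        return None
    for neighbour in graph.get(node, []):
        result = depth_bounded_search(graph, path + (neighbour,), is_goal, bound - 1)
        if result is not None:
            return result
    return None

def iterative_deepening(graph, start, is_goal, max_bound=50):
    # Increase the depth bound until a solution is found (or give up at max_bound).
    for bound in range(max_bound + 1):
        result = depth_bounded_search(graph, (start,), is_goal, bound)
        if result is not None:
            return result
    return None

g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(iterative_deepening(g, "a", lambda n: n == "d"))  # ('a', 'b', 'd')
```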
Depth first branch and bound
- Combines depth-first search with heuristic information
- Finds optimal solution
- Most useful when there are multiple solutions, and we want an optimal one
- Uses the space of depth-first search
Constraint satisfaction problems
A CSP is characterised by
- a set of variables $V_1, ..., V_n$
- each variable $V_i$ has an associated domain $D_{V_i}$ of possible values
- there are hard constraints on various subsets of the variables which specify legal combinations of values for these variables
- a solution to the CSP is an assignment of a value to each variable that satisfies all the constraints
Types of CSP
- Discrete variables
- finite domains: with $n$ variables of domain size $d$ there are $O(d^n)$ complete assignments
- infinite domains (integers, strings) - can't enumerate all possible assignments; linear constraints are solvable, non-linear constraints are undecidable
- Continuous variables (time) - Domain is continuous, linear constraints solvable in polynomial time by linear programming
Types of constraints
- Unary constraints involve a single variable, $X \neq \text{green}$, for example.
- Binary constraints involve pairs of variables, $X \neq Y$, for example
- Higher order constraints involve 3 or more variables
- Preferences or soft constraints
Backtracking search
- Systematically explore $D$ by instantiating the variables one at a time
- Evaluate each constraint predicate as soon as all its variables are bound
- Any partial assignment that does not satisfy the constraint can be pruned
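A small recursive backtracking sketch for a binary CSP (the dict-based representation of domains and constraints is an assumption for illustration):

```python
def backtracking_search(domains, constraints, assignment=None):
    """domains: dict variable -> list of values.
    constraints: dict (X, Y) -> predicate(x, y) over pairs of variables (assumed
    binary-CSP representation). Returns a satisfying assignment or None."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(domains):
        return assignment
    var = next(v for v in domains if v not in assignment)   # pick an unassigned variable
    for value in domains[var]:
        # evaluate each constraint as soon as both of its variables are bound
        ok = all(pred(assignment[x] if x != var else value,
                      assignment[y] if y != var else value)
                 for (x, y), pred in constraints.items()
                 if (x == var or x in assignment) and (y == var or y in assignment))
        if ok:
            result = backtracking_search(domains, constraints,
                                         {**assignment, var: value})
            if result is not None:
                return result
    return None   # no value works for var: backtrack

# 3-colouring of a triangle plus one extra node
doms = {v: ["r", "g", "b"] for v in "ABCD"}
cons = {(x, y): (lambda a, b: a != b)
        for (x, y) in [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]}
print(backtracking_search(doms, cons))
```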
Backtracking search heuristics
- Minimum remaining values (MRV) - choose the variable with the smallest domain, also called fail-first, as it picks the variable most likely to fail first
- Degree heuristic - tie-breaker for MRV variables, chooses the variable with the most constraints on remaining variables, attempts to reduce branching factor of future choices
- Least constraining value (LCV) - given a variable, choose the least constraining value - the one that rules out the fewest values in the remaining variables
CSP as graph search problem
A CSP can be solved by a graph search
- a node is an assignment of values to some of the variables
- suppose $N$ is the assignment $X_1 = v_1, ..., X_k = v_k$; to find the neighbours of $N$, select a variable $Y$ that is not assigned in $N$; for each value $y_i$ in the domain of $Y$, the assignment $X_1 = v_1, ..., X_k = v_k, Y = y_i$ is a neighbour of $N$
- The start node is the empty assignment
- A goal node is a total assignment that satisfies the constraints.
Consistency algorithms
Prune the domains as much as possible before selecting values from them. If constraint $c$ has scope $\{X\}$ then arc $\left<X, c\right>$ is domain consistent if every value of $X$ satisfies $c$. More generally, an arc $\left<X, c\right>$ is arc consistent if for every value of $X$ there are values for the other variables in the scope of $c$ that satisfy $c$; a network is arc consistent if all of its arcs are arc consistent. Making one arc consistent (by removing unsupported values of $X$) can make other arcs inconsistent, so they must be rechecked.
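A sketch of arc consistency (AC-3 style) for binary constraints, reusing the assumed dict representation from the backtracking example:

```python
from collections import deque

def arc_consistency(domains, constraints):
    """Prune domains until every arc is consistent.
    constraints: dict (X, Y) -> predicate(x, y); each binary constraint is
    checked in both directions. Returns the pruned domains (a copy)."""
    domains = {v: list(vals) for v, vals in domains.items()}
    # an arc (X, Y, pred) means: every value of X needs a supporting value of Y
    arcs = [(x, y, pred) for (x, y), pred in constraints.items()]
    arcs += [(y, x, (lambda p: (lambda a, b: p(b, a)))(pred))
             for (x, y), pred in constraints.items()]
    queue = deque(arcs)
    while queue:
        x, y, pred = queue.popleft()
        supported = [vx for vx in domains[x] if any(pred(vx, vy) for vy in domains[y])]
        if len(supported) < len(domains[x]):
            domains[x] = supported
            # x's domain shrank: recheck arcs whose support set is x
            queue.extend(a for a in arcs if a[1] == x)
    return domains

doms = {"A": [1, 2, 3], "B": [1, 2, 3]}
cons = {("A", "B"): (lambda a, b: a < b)}
print(arc_consistency(doms, cons))  # {'A': [1, 2], 'B': [2, 3]}
```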
Finding solutions when arc consistency finishes
- If some domains have more than one element then search.
- Split a domain and then recursively solve each half - domain splitting or case analysis
- It is often best to split a domain in half
Problem structure
Suppose the constraint graph can be decomposed into independent sub-problems, each with $c$ variables out of a total of $n$ variables. There are $n/c$ sub-problems, each of which takes at most $d^c$ work to solve. The worst-case solution cost is therefore $n/c \times d^c$, which is linear in $n$.
Tree structured CSPs
If the constraint graph is a tree the CSP can be solved in $O(nd^2)$ time, as opposed to $O(d^n)$ for general CSPs.
The algorithm for tree CSPs is as follows:
- Choose a variable as the root, order variables from root to leaves such that every node's parent precedes it in the ordering.
- For $j$ from $n$ down to 2, remove inconsistent domain elements for $\left<Parent(X_j), X_j\right>$.
- For $j$ from 1 to $n$, assign $X_j$ consistently with $Parent(X_j)$.
Nearly tree-structured CSPs
- Conditioning - instantiate a variable, prune its neighbours' domains so the remainder is a tree
- Cut-set conditioning - instantiate a set of variables such that the remaining constraint graph is a tree. Cut-set size $c$ has runtime $O(d^c \cdot (n-c)d^2)$ which is very fast for small $c$.
Variable elimination
Eliminate the variables one by one, passing their constraints to their neighbours.
- If there is only one variable, return the intersection of the constraints that contain it
- Select a variable $X$
- Join the constraints in which $X$ appears, forming constraint $R_1$
- Project $R_1$ onto its variables other than $X$, forming $R_2$
- Replace all of the constraints in which $X$ appears by $R_2$
- Recursively solve the simplified problem, forming $R_3$
- Return $R_1$ joined with $R_3$
When there is a single variable remaining, if it has no values, the network was inconsistent. The variables are eliminated according to some elimination ordering. Different elimination orderings result in different size intermediate constraints.
Local search
In some problems the path is irrelevant, the goal state is the solution. Local search uses a single current state and moves to neighbours of state. It isn't systematic, but it has low memory usage and can find reasonable solutions in continuous spaces. It is useful for optimisation problems including CSPs.
The state space is the set of complete configurations and the goal is a particular configuration that meets some criterion.
Hill climbing
Hill climbing (or greedy local search) always tries to improve state (or reduce cost if evaluation function is cost).
- Each iteration, move in the direction of increasing value (or decreasing value if the evaluation function is a cost; this variant is called greedy descent)
- No search tree, just keep current state and its cost
- If several alternatives with equal value, choose one at random.
Hill climbing is not ideal when the problem has local maxima, ridges or plateaux.
Greedy descent
Greedy descent is a local search method for CSPs.
- Maintain an assignment of a value to each variable
- Repeatedly select a variable to change, then select a new value for that variable until a satisfying assignment is found.
- The goal is an assignment with no conflicts
- The heuristic function to be minimised is the number of conflicts
There are several options for choosing a variable to change
- Find a variable-value pair that minimises the number of conflicts
- Select a variable that participates in the most conflicts and a value that minimises the number of conflicts
- some others
When domains are small or unordered, the neighbours of an assignment can correspond to choosing another value for one of the variables. When the domains are large and ordered, the neighbours of an assignment are the adjacent values for one of the variables. If the domains are continuous, gradient descent changes each variable proportionally to the gradient of the heuristic function in that direction.
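A sketch of greedy descent with conflict minimisation (min-conflicts style), reusing the assumed binary-CSP representation from the earlier examples; the random tie-breaking and step limit are illustrative choices:

```python
import random

def conflicts(assignment, constraints):
    """Number of violated constraints under a total assignment."""
    return sum(1 for (x, y), pred in constraints.items()
               if not pred(assignment[x], assignment[y]))

def greedy_descent(domains, constraints, max_steps=1000):
    # start from a random total assignment
    assignment = {v: random.choice(vals) for v, vals in domains.items()}
    for _ in range(max_steps):
        if conflicts(assignment, constraints) == 0:
            return assignment
        # pick a variable involved in some violated constraint
        violated = [(x, y) for (x, y), pred in constraints.items()
                    if not pred(assignment[x], assignment[y])]
        var = random.choice(random.choice(violated))
        # choose the value for var that minimises the number of conflicts
        assignment[var] = min(domains[var],
                              key=lambda val: conflicts({**assignment, var: val}, constraints))
    return None  # no satisfying assignment found within the step limit

doms = {v: ["r", "g", "b"] for v in "ABCD"}
cons = {(x, y): (lambda a, b: a != b)
        for (x, y) in [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]}
print(greedy_descent(doms, cons))
```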
Randomised greedy descent makes use of
- random steps - move to a random neighbour
- random restart - reassign random values to all variables
This is better for some types of search space.
Stochastic local search
Stochastic local search is a mix of:
- Greedy descent: move to a lowest neighbour
- Random walk: taking some random steps
- Random restart: reassigning values to all variables.
Simulated annealing
- Pick a variable at random and a new value at random
- If it is an improvement, adopt it
- If it is not an improvement, adopt it probabilistically depending on a temperature parameter $T$ which can be reduced over time
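A generic simulated annealing sketch (the geometric cooling schedule and the neighbour/cost interfaces are assumptions, not from the notes):

```python
import math
import random

def simulated_annealing(initial, neighbour, cost, t0=10.0, cooling=0.99, steps=10000):
    """neighbour(state) returns a random neighbouring state; cost(state) is to be
    minimised. Worse moves are accepted with probability exp(-delta / T)."""
    current, t = initial, t0
    for _ in range(steps):
        candidate = neighbour(current)
        delta = cost(candidate) - cost(current)
        if delta <= 0 or random.random() < math.exp(-delta / t):
            current = candidate        # adopt improvements always, worse moves sometimes
        t *= cooling                   # reduce the temperature over time
    return current

# toy example: minimise (x - 3)^2 over integers by random +/-1 steps
result = simulated_annealing(0,
                             neighbour=lambda x: x + random.choice([-1, 1]),
                             cost=lambda x: (x - 3) ** 2)
print(result)  # typically close to 3
```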
To prevent cycling we can maintain a tabu list of the $k$ last assignments and not allow an assignment that is already on the tabu list. If $k = 1$, we do not allow an assignment of the same value to the variable chosen. The list can be implemented more efficiently than as a list of complete assignments, e.g. as a list of recently changed variables, or by storing for each variable the step at which it got its current value. It can be expensive if $k$ is large.
Parallel search
A total assignment is called an individual. Maintain a population of $k$ individuals instead of 1. At each stage, update each individual in the population. Whenever an individual is a solution, it can be reported. Like $k$ restarts, but uses $k$ times the minimum number of steps.
Beam search
Like parallel search, but choose the $k$ best out of all the neighbours. When $k=1$ it is greedy descent; when $k=\infty$ it is breadth first search. The value of $k$ can be used to limit space and parallelism.
Stochastic beam search
Probabilistically choose the next $k$ individuals. The probability that a neighbour is chosen is proportional to its heuristic value. This maintains diversity amongst the individuals.
Genetic algorithms
Genetic algorithms are like stochastic beam search, but successors come from two parents. The algorithm starts with a population of $k$ randomly generated individuals. Each individual is represented as a string over a finite alphabet and is evaluated by a fitness function, which should return higher values for better states. Pairs of individuals are chosen with probability proportional to fitness, with individuals below a threshold being culled. For each pair chosen, a random crossover point is chosen and offspring are generated by crossing over the parent strings at that point: the first child gets the first part of its string from the first parent and the remainder from the second, and vice versa for the second child. Finally, each location in the newly created children is subject to a random mutation with a small probability. If two parents are quite different, crossover can produce a state that is very different from both parents, so genetic algorithms take large steps early in the search, then smaller steps as the population converges. The advantage is the ability to cross over large blocks that have evolved independently to perform useful functions. The choice of representation is fundamental, as genetic algorithms work best if the representation corresponds to meaningful components of the solution.
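A compact genetic algorithm sketch for bit-string individuals (the population size, mutation rate, fitness-proportional selection without culling, and the one-max fitness are illustrative assumptions):

```python
import random

def genetic_algorithm(fitness, length, pop_size=20, generations=100, p_mutate=0.05):
    """Individuals are bit strings of the given length; fitness returns higher
    values for better individuals. Parents are chosen with probability
    proportional to fitness; children come from single-point crossover
    followed by random mutation."""
    population = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        weights = [fitness(ind) for ind in population]
        children = []
        while len(children) < pop_size:
            mum, dad = random.choices(population, weights=weights, k=2)
            point = random.randrange(1, length)                   # crossover point
            child = mum[:point] + dad[point:]
            child = [1 - b if random.random() < p_mutate else b for b in child]
            children.append(child)
        population = children
    return max(population, key=fitness)

# toy fitness: number of 1s in the string ("one-max")
best = genetic_algorithm(fitness=sum, length=20)
print(best, sum(best))
```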
Adversarial search
Adversarial agents are those with opposite objectives. The agent must deal with contingencies.
A game has an initial state, a set of operators representing legal moves and resulting states, a terminal test to determine when the game is over and a utility function to give numerical values for terminal states. A game tree can be built based on the initial state and operators for each player.
Consider a zero sum game with two players, max and min. Max moves first, then they take turns. Max could find a sequence of actions to achieve a winning state but min tries to prevent this, so max has to form a strategy that will still win.
Minimax
The minimax algorithm gives the optimal strategy for max. The minimax value of a state is the utility for max of being in that state, assuming both players play optimally from that state until the end of the game. Max prefers maximum values; min prefers minimum values.
Alpha-beta pruning
Alpha-beta pruning prunes branches that cannot influence the final decision. If a move is found showing that the current branch is worse than a choice already available elsewhere, the rest of that branch is pruned.
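A sketch of minimax with alpha-beta pruning over an explicit game tree (the tree-as-nested-lists representation and the example utilities are assumptions for illustration):

```python
import math

def alpha_beta(node, maximising, alpha=-math.inf, beta=math.inf):
    """Minimax value of `node`, with alpha-beta pruning.
    A node is either a number (utility of a terminal state) or a list of child
    nodes (assumed representation). Max and min alternate levels."""
    if isinstance(node, (int, float)):
        return node
    if maximising:
        value = -math.inf
        for child in node:
            value = max(value, alpha_beta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                      # min will never allow this branch
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alpha_beta(child, True, alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break                      # max will never allow this branch
        return value

# 3-ply example: max to move, then min, then terminal utilities
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alpha_beta(tree, maximising=True))   # 3
```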
Games with chance
Some games have elements of chance, so chance nodes need to be used. The expected value of a node can be calculated using its possible outcomes.