Lecture 29: Review

class: center, middle

# Artificial Intelligence

## Review

---

# Planning

Given the actions, initial state, and goal below, find a plan that solves this planning problem.

```Lisp
(:action exchange :parameters (?a ?b ?w)
  :precondition (and (at ?a ?w) (at ?b ?w))
  :effect (and (when (has ?a money) 
                  (and (not (has ?a money)) (has ?b money)))
               (when (has ?b money) 
                  (and (not (has ?b money)) (has ?a money)))))

(:action move :parameters (?who ?fr ?to)
  :precondition (at ?who ?fr)
  :effect (and (not (at ?who ?fr)) (at ?who ?to)))

(:init 
   (at carl house)
   (at dieter yard)
   (has carl money))

(:goal (and (at carl street) (has dieter money)))
```
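
One way to check a candidate plan is to simulate the two actions over a set of ground facts. The sketch below is a minimal Python rendering of the PDDL above, not a general planner. The names `move`, `exchange`, `run_plan`, and the `candidate` plan are illustrative assumptions. Both conditional effects of `exchange` are evaluated against the state before the action is applied.

```Python
def move(state, who, fr, to):
    """Apply the move action; return None if the precondition fails."""
    if ("at", who, fr) not in state:
        return None
    return (state - {("at", who, fr)}) | {("at", who, to)}


def exchange(state, a, b, w):
    """Apply exchange; both 'when' clauses read the pre-action state."""
    if ("at", a, w) not in state or ("at", b, w) not in state:
        return None
    add, delete = set(), set()
    if ("has", a, "money") in state:
        delete.add(("has", a, "money"))
        add.add(("has", b, "money"))
    if ("has", b, "money") in state:
        delete.add(("has", b, "money"))
        add.add(("has", a, "money"))
    return (state - delete) | add


ACTIONS = {"move": move, "exchange": exchange}


def run_plan(state, plan):
    """Apply each step in order; return None as soon as a step is inapplicable."""
    for name, *args in plan:
        state = ACTIONS[name](state, *args)
        if state is None:
            return None
    return state


init = frozenset({("at", "carl", "house"),
                  ("at", "dieter", "yard"),
                  ("has", "carl", "money")})
goal = {("at", "carl", "street"), ("has", "dieter", "money")}

# One candidate plan (an assumption for illustration; other plans also work).
candidate = [
    ("move", "carl", "house", "yard"),
    ("exchange", "carl", "dieter", "yard"),
    ("move", "carl", "yard", "street"),
]

final = run_plan(init, candidate)
print(final is not None and goal <= final)  # True
```

Swapping in a different step list for `candidate` checks any other proposed plan the same way.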

---

# Unsupervised Learning

Briefly explain Lloyd's algorithm (the "k-means algorithm").

You are running Lloyd's algorithm with k=2, and are currently in the state shown below. Draw (approximately) where the cluster centers will be in the next step, and explain why.

<a href="/CI-0129/assets/img/clusterproblemclear.png">Clear version</a>
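
As a reminder of the two alternating steps, here is a minimal pure-Python sketch of Lloyd's algorithm for points given as tuples. The function names, the toy data, and the choice to leave an empty cluster's center in place are assumptions for illustration; the toy points are not the data from the figure.

```Python
import math


def assign(points, centers):
    """Assignment step: index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def update(points, labels, centers):
    """Update step: move each center to the mean of its assigned points."""
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            new_centers.append(old)  # assumption: an empty cluster keeps its center
            continue
        new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centers


def lloyd(points, centers, max_iter=100):
    """Alternate assignment and update until the centers stop moving."""
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical toy data: two well-separated pairs of points, k=2.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers, labels = lloyd(points, [(1, 1), (2, 1)])
print(centers)  # [(0.0, 1.0), (10.0, 1.0)]
print(labels)   # [0, 0, 1, 1]
```

The "next step" asked about in the exercise corresponds to one call of `assign` followed by one call of `update`.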

---

# Unsupervised Learning

For the data given below, show a potential clustering Lloyd's algorithm could produce for k=3. Is this a good clustering? Why/why not?

---

# Q-Learning

Explain each of the terms in the Q update expression:

$$
Q(s,a) \leftarrow (1-\alpha) \cdot Q(s,a) + \alpha \cdot  (R(s) + \gamma \max_{a'} Q(T(s,a),a'))
$$
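
The update can also be read as a few lines of code. The sketch below assumes the Q-table is stored as a dictionary of dictionaries, `Q[state][action]`. The function name `q_update` and its parameter names are illustrative: `reward` plays the role of $R(s)$ and `s_next` the role of $T(s,a)$.

```Python
def q_update(Q, s, a, reward, s_next, alpha, gamma):
    """Blend the old estimate with the new target and store the result."""
    best_next = max(Q[s_next].values())            # max over a' of Q(T(s,a), a')
    target = reward + gamma * best_next            # R(s) + gamma * max ...
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * target
    return Q[s][a]


# Tiny hypothetical table with a single action per state.
Q = {0: {"x": 0.0}, 1: {"x": 1.0}}
print(q_update(Q, 0, "x", reward=1.0, s_next=1, alpha=0.5, gamma=0.5))  # 0.75
```

With `alpha = 0` the table never changes; with `alpha = 1` the old estimate is replaced entirely by the target.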

---

# Q-Learning

Given the Q-table below, which action would the policy defined by this table select in state 2?

State | Walk left | Walk right | Jump
------|-----------|------------|--------
1     |   2.313   | 1.337      | 6.1
2     |   1.5     | -2.8       | 0.24
3     |  -4.1     | 2.4        | 0.0
4     |  -2.6     | 3.4        | -1
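
Reading a policy off a Q-table amounts to taking the action with the largest value in the row for the current state. The sketch below encodes the table above as a dictionary; the name `greedy_action` is an illustrative assumption.

```Python
# The Q-table from the slide, as Q[state][action].
Q = {
    1: {"walk left": 2.313, "walk right": 1.337, "jump": 6.1},
    2: {"walk left": 1.5, "walk right": -2.8, "jump": 0.24},
    3: {"walk left": -4.1, "walk right": 2.4, "jump": 0.0},
    4: {"walk left": -2.6, "walk right": 3.4, "jump": -1.0},
}


def greedy_action(Q, state):
    """Return the action with the highest Q-value in the given state."""
    return max(Q[state], key=Q[state].get)


for state in Q:
    print(state, greedy_action(Q, state))
```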

---

# Q-Learning

Given the Q-table below, your agent is in state 2 and performs the action "walk left", which yields a reward of 0.7 and leads to state 3. How does the Q-table change under the Q-update rule with a learning rate (alpha) of 0.5 and a discount factor (gamma) of 0.75?

$$
Q(s,a) \leftarrow (1-\alpha) \cdot Q(s,a) + \alpha \cdot (R(s) + \gamma \cdot \max_{a'} Q(T(s,a),a'))
$$

State | Walk left | Walk right | Jump
------|-----------|------------|--------
1     |   2.313   | 1.23       | 6.1
2     |   1.5     | -2.8       | 0.24
3     |  -4.1     | 2.4        | 0.0
4     |  -2.6     | 3.4        | -1
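
To check a hand calculation, the update can be broken into its intermediate terms. The sketch below plugs in the numbers from this slide; the variable names are illustrative assumptions, and the values are printed rounded because of floating-point arithmetic.

```Python
alpha, gamma = 0.5, 0.75
reward = 0.7

old_value = 1.5  # Q(2, walk left) before the update
q_row_state3 = {"walk left": -4.1, "walk right": 2.4, "jump": 0.0}

best_next = max(q_row_state3.values())   # best value reachable from state 3
target = reward + gamma * best_next      # R(s) + gamma * max_a' Q(3, a')
new_value = (1 - alpha) * old_value + alpha * target

print(f"best next value: {best_next}")
print(f"target: {target:.3f}")
print(f"new Q(2, walk left): {new_value:.3f}")
```

Only the entry for state 2 and "walk left" is overwritten; every other cell of the table stays as it is.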

---

# Supervised Learning

* What is Regression?

* What is Classification?

---

# Classification

You are given a set of 100x100-pixel images, some showing spoons and some showing forks. You are tasked with implementing a classifier that distinguishes between these two types of silverware.

* What neural network architecture would you propose for this task (layers, neurons, activation functions)?
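
One consideration when answering is how the number of trainable parameters scales with the 100x100 input. The sketch below compares a fully connected first layer with a small convolutional one. Every concrete number besides the image size is an assumption for illustration: grayscale input (one channel), 128 hidden units, and 16 filters of size 3x3.

```Python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 2D convolution with square kernels."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


pixels = 100 * 100  # flattened grayscale image (assumption: one channel)

print(dense_params(pixels, 128))  # 1280128
print(conv2d_params(1, 16, 3))    # 160
```

The convolutional layer's parameter count does not depend on the image size, which is one argument often made for convolutional architectures on image data.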

---

# PyTorch

Draw the neural network corresponding to this PyTorch code. Clearly note the activation function of each layer!

```Python
import torch


class MysteryNet(torch.nn.Module):
    def __init__(self):
        # super() must name this class; the original referenced TwoLayerNet
        super(MysteryNet, self).__init__()
        self.lin1 = torch.nn.Linear(3, 5)
        self.lin2 = torch.nn.Linear(5, 2)
        
        self.af1 = torch.nn.LeakyReLU()
        self.af2 = torch.nn.Sigmoid()
    def forward(self, x):
        h = self.lin1(x)
        h = self.af1(h)
        h = self.lin2(h)
        return self.af2(h)
```
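
To trace what the forward pass computes without installing PyTorch, the same sequence of operations can be written with plain Python lists. This is a sketch with hypothetical weights (a freshly constructed PyTorch module would have random ones). The helper names are illustrative, and the slope `0.01` is the default `negative_slope` of `torch.nn.LeakyReLU`.

```Python
import math


def linear(x, weights, bias):
    """y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def leaky_relu(v, negative_slope=0.01):
    """Identity for non-negative inputs, a small slope for negative ones."""
    return [z if z >= 0 else negative_slope * z for z in v]


def sigmoid(v):
    """Squash each value into the interval (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in v]


# Hypothetical weights with the shapes used in the slide: 3 -> 5 -> 2.
W1 = [[0.1, -0.2, 0.3] for _ in range(5)]
b1 = [0.0] * 5
W2 = [[0.5] * 5, [-0.5] * 5]
b2 = [0.0, 0.0]


def forward(x):
    h = leaky_relu(linear(x, W1, b1))   # lin1 followed by af1
    return sigmoid(linear(h, W2, b2))   # lin2 followed by af2


out = forward([1.0, 2.0, 3.0])
print(len(out), out)
```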

---

# Ethical Considerations

G. Rind R. is the CEO of a startup that wants to make a dating app targeted at gay people. He offers to pay you 100,000 USD for a machine learning system that uses people's Facebook pictures to predict whether they are gay.

Briefly discuss ethical and legal concerns you have about this assignment.