Class Meeting 03: Robot State Estimation


Today's Class Meeting



What You'll Need for Today's Class



Groups for Today's Class


For today's class exercise, you'll be able to choose your own groups. Please get together in groups of 2-3 students.


Robot State Estimation


Today, we're covering topics related to how a robot estimates its state (\(x_t\)) using measurement data (\(z_t\)) and the knowledge of the actions the robot takes in the environment (\(u_t\)). This problem can be represented as a hidden Markov model or dynamic Bayes network as depicted in the following diagram.

[Figure: dynamic Bayes network (DBN)]

Bayes Filter Algorithm


The following is the Bayes filter algorithm as we discussed during class:


\(\textrm{Bayes_Filter}( bel(x_{t-1}), u_t, z_t):\)
\(\qquad \textrm{for} \: \textrm{all} \: x_t \: \textrm{do} \)
\(\qquad \qquad \overline{bel}(x_t) = \int p(x_t | u_t, x_{t-1}) \: bel(x_{t-1}) \: dx_{t-1}\)
\(\qquad \qquad bel(x_t) = \eta \: p(z_t | x_t) \: \overline{bel}(x_t) \)
\( \qquad \textrm{endfor}\)
\( \qquad \textrm{return} \: bel(x_t) \)
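For a finite state space, the integral in the prediction step becomes a sum, and the algorithm above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function name `bayes_filter`, the dictionary representation of the belief, and the two callables `motion_model` and `sensor_model` are our own assumptions.

```python
from typing import Callable, Dict, Hashable

Belief = Dict[Hashable, float]


def bayes_filter(
    bel_prev: Belief,
    u: Hashable,
    z: Hashable,
    motion_model: Callable[[Hashable, Hashable, Hashable], float],
    sensor_model: Callable[[Hashable, Hashable], float],
) -> Belief:
    """One step of a discrete Bayes filter.

    Assumed (hypothetical) interfaces:
      motion_model(x, u, x_prev) returns p(x | u, x_prev)
      sensor_model(z, x)         returns p(z | x)
    """
    states = list(bel_prev)

    # Prediction: bel_bar(x_t) = sum over x_{t-1} of p(x_t | u_t, x_{t-1}) * bel(x_{t-1})
    bel_bar = {
        x: sum(motion_model(x, u, x_prev) * bel_prev[x_prev] for x_prev in states)
        for x in states
    }

    # Measurement update: bel(x_t) = eta * p(z_t | x_t) * bel_bar(x_t)
    unnormalized = {x: sensor_model(z, x) * bel_bar[x] for x in states}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("measurement has zero probability under the predicted belief")
    eta = 1.0 / total
    return {x: eta * p for x, p in unnormalized.items()}
```

Note that the normalizer \(\eta\) can only be computed after the unnormalized belief is known for every state, which is why the sketch uses two passes over the states rather than a single loop.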


Some useful tips/notes:


Class Exercise: Estimating the State of a Door (Bayes Filter Algorithm)


We'll first go over this example, including the belief calculation for \(t = 1\), together as a class.

[Figure: robot door state estimation]

We'll practice applying the Bayes filter to a situation where a robot estimates the state of a door using a forward-facing camera. We assume that the door can be in one of two states: open (\(\textrm{is_open}\)) or closed (\(\textrm{is_closed}\)). We also assume that the robot has no initial knowledge of the door's state, so we assign a uniform prior over the two states: $$bel(X_0 = \textrm{is_open}) = 0.5$$ $$bel(X_0 = \textrm{is_closed}) = 0.5$$

Let's also assume that the robot's camera sensor is noisy and can be characterized by the following conditional probabilities: $$p(Z_t = \textrm{sense_open} \: | \: X_t = \textrm{is_open}) = 0.6$$ $$p(Z_t = \textrm{sense_closed} \: | \: X_t = \textrm{is_open}) = 0.4$$ $$p(Z_t = \textrm{sense_open} \: | \: X_t = \textrm{is_closed}) = 0.2$$ $$p(Z_t = \textrm{sense_closed} \: | \: X_t = \textrm{is_closed}) = 0.8$$ These probabilities tell us that the robot's sensor is more reliable when the door is closed (error probability 0.2) than when the door is open (error probability 0.4).

Our final set of assumptions concerns the robot's ability to influence the environment. Let's assume that the robot can use its arm to push the door: a closed door opens with probability 0.8, and an open door stays open: $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_open}) = 1$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_open}) = 0$$ $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_closed}) = 0.8$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{push}, X_{t-1} = \textrm{is_closed}) = 0.2$$ If the robot does nothing, the state of the door does not change: $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_open}) = 1$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_open}) = 0$$ $$p(X_t = \textrm{is_open} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_closed}) = 0$$ $$p(X_t = \textrm{is_closed} \: | \: U_t = \textrm{do_nothing}, X_{t-1} = \textrm{is_closed}) = 1$$
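If you'd like to experiment with these models on a computer, one possible way to write them down is as nested Python dictionaries. The names `PRIOR`, `SENSOR`, `MOTION`, and `is_normalized` and the indexing order are our own choices for this sketch; the numbers are the ones given above. The check at the end confirms that every conditional distribution sums to 1, which is a quick way to catch typos in a model.

```python
# Prior belief over the door state at t = 0.
PRIOR = {"is_open": 0.5, "is_closed": 0.5}

# Sensor model, indexed as SENSOR[z][x] = p(Z_t = z | X_t = x).
SENSOR = {
    "sense_open": {"is_open": 0.6, "is_closed": 0.2},
    "sense_closed": {"is_open": 0.4, "is_closed": 0.8},
}

# Motion model, indexed as MOTION[u][x_prev][x] = p(X_t = x | U_t = u, X_{t-1} = x_prev).
MOTION = {
    "push": {
        "is_open": {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.8, "is_closed": 0.2},
    },
    "do_nothing": {
        "is_open": {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.0, "is_closed": 1.0},
    },
}


def is_normalized(dist, tol=1e-9):
    """Return True if the probabilities in dist sum to 1 (within tol)."""
    return abs(sum(dist.values()) - 1.0) < tol


# For each true state x, the probabilities of all possible measurements sum to 1.
sensor_ok = all(is_normalized({z: SENSOR[z][x] for z in SENSOR}) for x in PRIOR)

# For each action u and previous state x_prev, the next-state probabilities sum to 1.
motion_ok = all(is_normalized(MOTION[u][x_prev]) for u in MOTION for x_prev in MOTION[u])

print("prior normalized:", is_normalized(PRIOR))
print("sensor model normalized:", sensor_ok)
print("motion model normalized:", motion_ok)
```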


Belief calculation for \(t = 1\)


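Now, we are going to assume that the robot executes the action \(u_1 = \textrm{do_nothing}\) and receives the measurement \(z_1 = \textrm{sense_open}\) from its camera. We'll walk through this calculation together; afterwards, you'll calculate the belief for the next time step (\(t = 2\)) yourselves. Because the door's state is discrete, the integral in the prediction step becomes a sum over the two possible values of \(x_0\).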

\(\overline{bel}(x_1) = \int p(x_1 | u_1, x_{0}) \: bel(x_{0}) \: dx_{0}\)
\(\qquad \quad = \sum_{x_0} p(x_1 | u_1, x_{0}) \: bel(x_{0}) \)
\(\qquad \quad = p(x_1 | U_1 = \textrm{do_nothing}, X_0 = \textrm{is_open}) \: bel(X_0 = \textrm{is_open}) + \)
\(\qquad \qquad \: p(x_1 | U_1 = \textrm{do_nothing}, X_0 = \textrm{is_closed}) \: bel(X_0 = \textrm{is_closed}) \)

Now, we can calculate \(\overline{bel}(x_1)\) for both \(X_1 = \textrm{is_open}\) and \(X_1 = \textrm{is_closed}\).

\(\overline{bel}(X_1 = \textrm{is_open}) = p(X_1 = \textrm{is_open} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_open}) \: bel(X_0 = \textrm{is_open}) + \)
\(\qquad \qquad \qquad \qquad \quad \: p(X_1 = \textrm{is_open} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_closed}) \: bel(X_0 = \textrm{is_closed}) \)
\(\qquad \qquad \qquad \qquad = 1 \cdot 0.5 + 0 \cdot 0.5 = 0.5 \)

\(\overline{bel}(X_1 = \textrm{is_closed}) = p(X_1 = \textrm{is_closed} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_open}) \: bel(X_0 = \textrm{is_open}) + \)
\(\qquad \qquad \qquad \qquad \quad \: \: \: p(X_1 = \textrm{is_closed} \: | \: U_1 = \textrm{do_nothing}, X_0 = \textrm{is_closed}) \: bel(X_0 = \textrm{is_closed}) \)
\(\qquad \qquad \qquad \qquad \: \: = 0 \cdot 0.5 + 1 \cdot 0.5 = 0.5\)
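The prediction step we just worked through by hand can also be reproduced numerically. The following is a small sketch under our own naming assumptions (`predict`, `bel_0`, `p_do_nothing`); the transition table holds the \(\textrm{do_nothing}\) probabilities from the exercise setup.

```python
# Prior belief bel(x_0).
bel_0 = {"is_open": 0.5, "is_closed": 0.5}

# p(x_1 | U_1 = do_nothing, x_0), indexed as p_do_nothing[x_0][x_1].
p_do_nothing = {
    "is_open": {"is_open": 1.0, "is_closed": 0.0},
    "is_closed": {"is_open": 0.0, "is_closed": 1.0},
}


def predict(bel_prev, transition):
    """Prediction step: bel_bar(x) = sum over x_prev of p(x | u, x_prev) * bel(x_prev)."""
    return {
        x: sum(transition[x_prev][x] * bel_prev[x_prev] for x_prev in bel_prev)
        for x in bel_prev
    }


bel_bar_1 = predict(bel_0, p_do_nothing)
print(bel_bar_1)  # both entries equal 0.5, matching the hand calculation
```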

It should not surprise us that \(bel(X_0) = \overline{bel}(X_1) \), since the robot action \(\textrm{do_nothing}\) does not influence the state of the world. Once we do the measurement update, however, our belief will change. Our belief update takes the form: $$bel(x_1) = \eta \: p(Z_1 = \textrm{sense_open} \: | \: x_1) \: \overline{bel}(x_1)$$

We have two possible resulting states, \(X_1 = \textrm{is_open}\) and \(X_1 = \textrm{is_closed}\):

\(bel(X_1 = \textrm{is_open}) = \eta \: p(Z_1 = \textrm{sense_open} \: | \: X_1 = \textrm{is_open}) \: \overline{bel}(X_1 = \textrm{is_open}) \)
\(\qquad \qquad \qquad \quad \: \: \: \: = \eta \: 0.6 \cdot 0.5 = \eta \cdot 0.3 \)

\(bel(X_1 = \textrm{is_closed}) = \eta \: p(Z_1 = \textrm{sense_open} \: | \: X_1 = \textrm{is_closed}) \: \overline{bel}(X_1 = \textrm{is_closed}) \)
\(\qquad \qquad \qquad \qquad \: \: = \eta \: 0.2 \cdot 0.5 = \eta \cdot 0.1 \)

We can now calculate the normalizer (\(\eta\)) so that \(\sum_{x_1} bel(x_1) = 1\):

\( \eta = (0.3 + 0.1)^{-1} = 2.5\)

Multiplying each unnormalized value by \(\eta\), our belief after time step 1 is:

\(bel(X_1 = \textrm{is_open}) = 2.5 \cdot 0.3 = 0.75 \)
\(bel(X_1 = \textrm{is_closed}) = 2.5 \cdot 0.1 = 0.25 \)
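The measurement update and normalization can be checked the same way. In this sketch the names `update`, `bel_bar_1`, and `p_sense_open` are our own; the likelihood table holds \(p(Z_1 = \textrm{sense_open} \: | \: x_1)\) from the sensor model.

```python
# Predicted belief bel_bar(x_1) from the prediction step.
bel_bar_1 = {"is_open": 0.5, "is_closed": 0.5}

# Likelihood of the received measurement: p(Z_1 = sense_open | x_1).
p_sense_open = {"is_open": 0.6, "is_closed": 0.2}


def update(bel_bar, likelihood):
    """Measurement update: bel(x) = eta * p(z | x) * bel_bar(x).

    Returns the normalized belief together with the normalizer eta.
    """
    unnormalized = {x: likelihood[x] * bel_bar[x] for x in bel_bar}
    eta = 1.0 / sum(unnormalized.values())
    return {x: eta * p for x, p in unnormalized.items()}, eta


bel_1, eta = update(bel_bar_1, p_sense_open)
print("eta =", round(eta, 6))
print({x: round(p, 6) for x, p in bel_1.items()})
```

Up to floating-point rounding, this should agree with the values of \(\eta\) and \(bel(x_1)\) computed by hand.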


Belief calculation for \(t = 2\)


For the belief calculation for \(t = 2\), work in groups of 2-3 students. You will likely find it helpful to use scratch paper or an equivalent.

For time step 2, we are going to assume that the robot executes the action \(u_2 = \textrm{push}\) and receives the measurement \(z_2 = \textrm{sense_open}\) from its camera. In your group, calculate \(\overline{bel}(x_2)\) and \(bel(x_2)\).

Once you finish calculating \(\overline{bel}(x_2)\) and \(bel(x_2)\) you can check your solutions on this door state estimation exercise solutions page.
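If you'd also like to check your hand calculation numerically, here is one possible sketch that applies the filter recursively, feeding the belief from one time step in as the prior for the next. The names `MOTION`, `SENSOR`, `filter_step`, and `run_filter` are our own assumptions; the probabilities are the ones from the exercise setup. We recommend finishing the calculation on paper before running it.

```python
# MOTION[u][x_prev][x] = p(X_t = x | U_t = u, X_{t-1} = x_prev)
MOTION = {
    "push": {
        "is_open": {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.8, "is_closed": 0.2},
    },
    "do_nothing": {
        "is_open": {"is_open": 1.0, "is_closed": 0.0},
        "is_closed": {"is_open": 0.0, "is_closed": 1.0},
    },
}

# SENSOR[z][x] = p(Z_t = z | X_t = x)
SENSOR = {
    "sense_open": {"is_open": 0.6, "is_closed": 0.2},
    "sense_closed": {"is_open": 0.4, "is_closed": 0.8},
}


def filter_step(bel, u, z):
    """Run one prediction + measurement update; return (bel_bar, bel)."""
    bel_bar = {x: sum(MOTION[u][xp][x] * bel[xp] for xp in bel) for x in bel}
    unnormalized = {x: SENSOR[z][x] * bel_bar[x] for x in bel}
    eta = 1.0 / sum(unnormalized.values())
    return bel_bar, {x: eta * p for x, p in unnormalized.items()}


def run_filter(prior, steps):
    """Apply filter_step for each (action, measurement) pair, in order."""
    history = []
    bel = dict(prior)
    for u, z in steps:
        bel_bar, bel = filter_step(bel, u, z)
        history.append((bel_bar, bel))
    return history


history = run_filter(
    {"is_open": 0.5, "is_closed": 0.5},
    [("do_nothing", "sense_open"), ("push", "sense_open")],
)
for t, (bel_bar, bel) in enumerate(history, start=1):
    print(f"t={t}: bel_bar={bel_bar}, bel={bel}")
```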


Acknowledgments


The content and exercises for today's class were informed by Probabilistic Robotics by Sebastian Thrun, Wolfram Burgard, and Dieter Fox.