Critical Summary: Thrun’s “MINERVA: A Second-Generation Museum Tour-Guide Robot”

Thrun’s paper discusses the successes of an interactive tour-guide robot, Minerva. The approach taken by this second-generation robot addresses issues such as safe navigation in unmodified, dynamic environments and short-term human-robot interaction. In essence, the robot’s job was to guide people through a museum, explaining what they saw along the way. The intriguing aspect of this paper is that many separate problems must be solved for the robot to operate effectively, and it is interesting to see how the different solutions fit into Minerva’s general architecture.

Minerva succeeds in six specific areas where the previous generation, Rhino, failed. First, Minerva learns its maps. Second, Minerva uses ceiling mosaics for localization. The main issue with localization is that people’s legs block the laser scanners, and that people intentionally block the ceiling camera to confuse the robot. Minerva uses a probabilistic algorithm along with filters to address this issue. Third, Minerva’s path planner takes the robot’s uncertainty into account and therefore avoids open, featureless spaces. Fourth, high-level control is performed using RPL, which uses learning to compose tours on the fly. Fifth, Minerva possesses a much richer interactive repertoire. Finally, Minerva possesses a much improved Web interface.
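To make the idea of probabilistic localization with filtering concrete, here is a minimal sketch of a discrete Bayes filter on a toy one-dimensional corridor. This is not Minerva's actual algorithm: the corridor, the "light"/"dark" landmarks, the noise values, and the rule of simply skipping a blocked reading are all assumptions chosen for illustration.

```python
def normalize(belief):
    """Rescale a list of non-negative weights so that it sums to 1."""
    total = sum(belief)
    return [b / total for b in belief]


def predict(belief, step, p_move=0.8):
    """Motion update on a circular corridor.

    The robot moves `step` cells with probability p_move (assumed value),
    otherwise it stays where it is.
    """
    n = len(belief)
    return [
        p_move * belief[(i - step) % n] + (1 - p_move) * belief[i]
        for i in range(n)
    ]


def correct(belief, world, reading, p_hit=0.9):
    """Measurement update.

    A reading of None stands for a sensor blocked by a visitor; in this
    sketch the filter ignores it and keeps the current belief.
    """
    if reading is None:
        return list(belief)
    weighted = [
        b * (p_hit if cell == reading else 1 - p_hit)
        for b, cell in zip(belief, world)
    ]
    return normalize(weighted)


# Hypothetical map: what the robot would see at each corridor cell.
world = ["light", "dark", "dark", "light", "dark"]
belief = [1 / len(world)] * len(world)    # start fully uncertain

belief = correct(belief, world, "light")  # sees a light patch
belief = predict(belief, 1)               # drives one cell forward
belief = correct(belief, world, None)     # sensor blocked: belief unchanged
belief = correct(belief, world, "dark")   # sees a dark patch

print([round(b, 3) for b in belief])
```

The point of the sketch is that a blocked reading does not destroy the position estimate: the belief carries over from the previous step and is sharpened again once a usable reading arrives.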

One area that stands out as interesting has to do with the emotional states of Minerva in the short-term interaction component of the layered software architecture. Minerva uses its face, head direction and voice to interact with people. Its moods range from happy to angry according to the persistence of the people blocking the robot’s path. Typically, one would not associate the “angry” emotional state, with its elevated voice and facial features, with a “socially interactive” robot. It is amazing how effective this emotional state is for moving people out of the way. I would have thought, based on Mori’s “Uncanny Valley,” that people would not be able to relate to the robot since it is distinctly not human, but apparently the emotion of anger is universal. The elevated tone of voice and the facial expression on the robot may also trigger a deep, instinctual response in all of us, regardless of whether it comes from another human or a robot.
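The mood behavior described above can be pictured as a small state machine. The sketch below is only my reading of it: the summary gives the two endpoints (happy and angry), so the intermediate moods, the one-step escalation, and the reset to happy once the path clears are all assumptions.

```python
# Ordered from least to most irritated. "neutral" and "sad" are
# hypothetical intermediate moods added for this sketch.
MOODS = ["happy", "neutral", "sad", "angry"]


def next_mood(mood, path_blocked):
    """Escalate one step while the path stays blocked; reset when it clears."""
    if not path_blocked:
        return MOODS[0]
    index = MOODS.index(mood)
    return MOODS[min(index + 1, len(MOODS) - 1)]


# A visitor blocks the robot for four control cycles, then steps aside.
mood = "happy"
for blocked in [True, True, True, True, False]:
    mood = next_mood(mood, blocked)
    print(blocked, mood)
```

Tying the escalation to how long the path stays blocked is what lets persistence, rather than a single obstruction, drive the robot toward the angry state.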

Another interesting part of the short-term interaction component is that, in order to attract people, Minerva used a memory-based reinforcement learning approach. Typical approaches to social interaction would have the robot act according to hand-coded logic for each state. It does not seem common to have a robot experiment with human interaction, guided by a reward derived from the responses it receives. The interesting variable in this interaction is the emotion the robot expresses, which draws varying responses from people. It is no surprise that people responded best when the robot was friendly.
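As a rough illustration of what a memory-based, reward-driven approach might look like, the sketch below stores every observed reward per behavior and prefers the behavior with the best average so far, while occasionally trying something else. The class name, the behavior labels, the reward numbers, and the exploration rate are all hypothetical; the paper's actual state, action, and reward definitions are not reproduced here.

```python
import random
from collections import defaultdict


class MemoryBasedLearner:
    """Keeps every (behavior, reward) experience and acts on the averages."""

    def __init__(self, actions, explore=0.1, seed=0):
        self.actions = list(actions)
        self.explore = explore            # chance of trying a random behavior
        self.memory = defaultdict(list)   # behavior -> list of observed rewards
        self.rng = random.Random(seed)

    def value(self, action):
        """Average remembered reward for a behavior (0.0 if never tried)."""
        rewards = self.memory[action]
        return sum(rewards) / len(rewards) if rewards else 0.0

    def choose(self):
        """Try each behavior once, then mostly pick the best average."""
        untried = [a for a in self.actions if not self.memory[a]]
        if untried:
            return untried[0]
        if self.rng.random() < self.explore:
            return self.rng.choice(self.actions)
        return max(self.actions, key=self.value)

    def record(self, action, reward):
        """Store the reward observed after performing a behavior."""
        self.memory[action].append(reward)


# Hypothetical crowd model: mean change in how close people come.
MEAN_RESPONSE = {"friendly": 1.0, "neutral": 0.2, "angry": -0.5}

crowd = random.Random(1)
robot = MemoryBasedLearner(MEAN_RESPONSE)
for _ in range(200):
    behavior = robot.choose()
    reward = MEAN_RESPONSE[behavior] + crowd.gauss(0, 0.3)  # noisy response
    robot.record(behavior, reward)

print({a: round(robot.value(a), 2) for a in robot.actions})
```

Nothing in the learner says that friendliness works; any preference it ends up with comes only from the remembered responses, which is the contrast with hand-coded interaction logic drawn above.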

Reference:

Thrun, Sebastian. “MINERVA: A Second-Generation Museum Tour-Guide Robot.” <http://www.cs.cmu.edu/~thrun/papers/thrun.icra_minerva.pdf>

Patrick Hoey's Blog