Author: Tzung-Cheng Tsai (2007-06-18);
recommendation: Yeh-Liang Hsu (2007-06-18).
Note: This article is Chapter 2 of Tzung-Cheng Tsai’s
PhD thesis “Developing a telepresence robot for interpersonal communication
with the elderly in a home environment.”
Chapter 2. Design elements in telepresence systems
This chapter surveys the application-oriented telepresence literature describing the development of telepresence systems. The design elements emphasized in these studies are extracted and summarized in Table 2-1. A discussion of these design elements, as they fit into the framework of projection-immersion and observer-dialogist illustrated in Chapter 1, is given below.
Table 2-1. Design elements and related technological keywords for telepresence

Design element                    Related technological keywords
Data transmission                 RF and Internet transmission, time-delay improvement
Teleoperation and supersensory    simultaneous operation, robotic design
Anthropomorphic elements          humanoid mechanism and expression
Stereoscopic elements             binocular and panoramic vision, image processing
Stereophonic elements             head-related transfer function, stereo audio
Eye contact                       camera and screen with specific placement
Autonomous behavior               environmental map establishment, self-maintenance
Data transmission
Data transmission, the transmission of control commands and sensory feedback, is a basic design element connecting the user and the remote telepresence robot or system. Wireless radio frequency and the Internet are used in most telepresence applications, while dedicated lines are used in specific applications (such as operation in space and the deep sea).
From the user's view, the timing of data transmission is important. Time delays degrade telepresence performance in both the projection and immersion of the user. From the participant's view, time delays also affect the participant's impression as an observer and interactive capability as a dialogist. Therefore, past telepresence research on data transmission focused on developing control schemes that deal with time delays and improve performance [Tzafestas & Prokopiou, 1997; Daniel & McAree, 1998].
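Such control schemes are surveyed rather than developed here, but the underlying idea of masking the transmission delay by predicting the robot's state can be illustrated with a minimal sketch. The differential-drive motion model, function name and command format below are illustrative assumptions, not the controllers proposed in the cited works:

```python
import math

def predict_pose(x, y, heading, pending_cmds, delay_s):
    """Dead-reckon the robot pose forward across the transmission delay.

    pending_cmds: (linear_speed_m_s, angular_speed_rad_s, duration_s)
    tuples already sent to the robot but not yet reflected in feedback.
    """
    t_left = delay_s
    for v, w, dt in pending_cmds:
        step = min(dt, t_left)
        heading += w * step                 # coarse Euler update
        x += v * math.cos(heading) * step
        y += v * math.sin(heading) * step
        t_left -= step
        if t_left <= 0:
            break
    return x, y, heading

# Predict where the robot is after a 1 s delay, given a pending
# command of 0.5 m/s straight ahead for 2 s:
x, y, heading = predict_pose(0.0, 0.0, 0.0, [(0.5, 0.0, 2.0)], 1.0)
```

A user interface can display this predicted pose immediately, so the operator is not steering against stale feedback.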
Teleoperation
Many studies in telepresence emphasize enabling the user to modify the remote environment [Stoker et al., 1995; Engelberger, 2001; Spudis, 2001], that is, projecting the user to the teleoperator. A teleoperator is a machine that extends the user's sensing and/or manipulating capability to a location remote from the user. Teleoperation refers to direct and continuous human control of the teleoperator. The "Full-Immersion Telepresence Testbed (FITT)" developed by NASA, which combines a wearable interface integrating human perception, cognition and eye-hand coordination skills with a robot's physical abilities, as shown in Figure 2-1, is a recent example of research in teleoperation [Rehnmark et al., 2005]. The teleoperated master-slave system "Robonaut" allows an intuitive, one-to-one mapping between master and slave motions. The operator uses the FITT wearable interface to remotely control Robonaut to follow the operator's motion in simultaneous operation, performing complex tasks on the International Space Station.
Figure 2-1. Full-Immersion Telepresence Testbed
(FITT) and Robonaut [Rehnmark et al., 2005]
Supersensory
Supersensory refers to an advanced capability to modify the remote environment, provided by a dexterous robot or a precise telepresence system. From the user's view, the user's manipulative efficiency for special tasks is enhanced when projecting onto a telepresence robot with supersensory capability. Green et al. [1995] developed a telepresence surgery system integrating vision, hearing and manipulation. It consists of two main modules, as shown in Figure 2-2: a surgeon's console and a remote surgical unit located at the surgical table. The remote unit provides scaled motion, force reflection and minimized friction for the surgeon to carry out complex tasks with quick, precise motions. Satava, Schurr et al., Ballantyne, and the da Vinci® Surgical System (shown in Figure 2-3) have also applied supersensory capabilities in telepresence surgery.
Figure 2-2. A surgeon’s console and a remote
surgical unit (RSU) located at the surgical table [Green et al. 1995]
Figure 2-3. da Vinci® Surgical System [http://www.intuitivesurgical.com/index.aspx]
Supersensory elements can also provide the user with a novel feeling of immersion in a remote environment. For example, the user can control the zoom function of the camera on a telepresence robot to observe small details of the remote environment which the user would not normally see with the naked eye.
Anthropomorphic elements
In many applications, non-anthropomorphic telepresence robots are designed to perform specific tasks that do not involve interacting with humans. Anthropomorphic elements, however, are of great importance for robots involved in human-robot interaction. Many researchers have added anthropomorphic elements to their telepresence robots in order to improve the interaction between users and participants.
To facilitate interaction with the participants, many telepresence robots incorporate an LCD screen displaying the user's face. Dr. Robot and the telepresence system PEBBLES, described in the first chapter of this thesis, use an LCD screen to display the user's face, as shown in Figures 2-4 and 2-5. This lets participants recognize whom the telepresence robot represents.
Figure 2-4(a). A patient is consulting the doctor through Dr. Robot [http://www.intouch-health.com/]
Figure 2-4(b). Dr. Robot in Show-Chwan Memorial Hospital
Figure 2-5. Telepresence system PEBBLES [http://www.ryerson.ca/pebbles/]
The product "Giraffe", a remote-controlled mobile video conferencing platform, is also a telepresence robot application. As shown in Figure 2-6, Giraffe is composed of two subsystems: the client application and the Giraffe robot itself. The Giraffe robot carries a video screen and camera mounted on a height-adjustable robotic base. The user can move the Giraffe robot from afar using the client application. Running on a standard PC with a webcam, the client application connects the user to the distant Giraffe robot through the Internet.
Figure 2-6. Giraffe is a remote-controlled mobile
video conferencing platform [http://www.headthere.com/products.html]
…et al. addressed that the appearance and behaviors of a robot are essential in human-robot interaction. A robot's appearance influences subjects' impressions and is an important factor in evaluating the interaction. A humanlike appearance can be deceiving, convincing users that robots can understand and do much more than they actually can. Observable behaviors include gaze, posture, movement patterns and linguistic interactions. Appearance and behavior together shape how people engage with a robot.
It is arguable whether the LCD display is an anthropomorphic element. An LCD display may even turn the user's impression of the telepresence robot into that of a "movable teleconference system", such as Giraffe, rather than a humanoid-type robot. There are many other solutions for anthropomorphic elements [Burgard et al., 1999; Burgard et al., 2003; Fong et al., 2003; Schulz et al., 2000; Trahanias et al., 2005]. For example, Burgard et al. installed mechanical facial expressions and a touch-screen interface on their tour-guide robots to attract on-site visitors' reactions.
Fukuda et al. introduced their robotic head system, the "Character Robot Face (CRF)", developed as a human-robot communication interface with natural modalities. The CRF uses facial expressions for natural user interaction. Facial expressiveness in humanoid-type robots has received a lot of attention because it is a key component in developing personal attachment with human users. From a psychological point of view, using facial expressions is an effective method of building personal attachment when communicating with a human user.
In summary, anthropomorphic
elements enhance the impression of the telepresence robot as a true
representation of the remote user. The friendly interface and characteristics
of the anthropomorphic telepresence robot also increase the interactive
capability of the participant as a dialogist. Mechanical facial expressions can
also be used to increase the humanoid characteristics of the telepresence robot
to further encourage people to interact and communicate with the user.
Stereoscopic and stereophonic elements
In telepresence research, stereoscopic and stereophonic design elements are often emphasized to create a telepresence illusion of the remote environment or people, aiming to increase the user's feeling of immersion. For example, binocular vision lets the user judge the distance between an object and the telepresence robot [Brooker et al., 1999], and the head-related transfer function (HRTF) for stereophonic effect enables the user to identify the location and direction of a sound [Hawksford, 2002].
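A full HRTF model is beyond the scope of a sketch, but the simpler binaural cue it builds on, the interaural time difference (ITD), can be illustrated. The function name, sampling rate, microphone spacing and far-field geometry below are illustrative assumptions, not the method of the cited work:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, at room temperature

def itd_azimuth(left, right, fs, mic_spacing):
    """Estimate the azimuth (degrees) of a sound source from the
    interaural time difference between two microphone signals."""
    max_lag = int(mic_spacing / SPEED_OF_SOUND * fs) + 1
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # cross-correlation score at this candidate lag
        score = sum(left[i] * right[i - lag]
                    for i in range(max(lag, 0),
                                   min(len(left), len(right) + lag)))
        if score > best_score:
            best_score, best_lag = score, lag
    itd = best_lag / fs                      # seconds; the sign gives the side
    s = itd * SPEED_OF_SOUND / mic_spacing   # far-field model: sin(azimuth)
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))
```

The same cue, applied in reverse (delaying one channel), is the crudest way to synthesize a stereophonic direction for the user's headphones.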
Telepresence videoconferencing is an important application of stereoscopic and stereophonic elements [Izquierdo, 1997; Ohm et al., 1998; Xu et al., 1999]. It enables users and participants to communicate more efficiently; in other words, the interactive capability of the participant as a dialogist is enhanced. Lei et al. proposed a representation and reconstruction module for an image-based telepresence system, using a viewpoint-adaptation scheme and an image-based rendering technique. This system provides life-size views and 3-D perception of participants and viewers in videoconferencing. The purpose of this research is to provide the feeling of a virtual-reality presence, in which realistic 3-D views of the user are perceived by the participant in real time and with the correct perspective.
Rhee et al. [2007] presented a low-cost method for visual communication and telepresence in a CAVE™-like environment (the CAVE is a multi-person, room-sized, high-resolution 3D video and audio environment invented at EVL in 1991 [The Electronic Visualization Laboratory, 1991]), relying on 2D stereo-based video avatars. The system combines a selection of proven, efficient algorithms and approximations in a unique way, resulting in a convincing stereoscopic real-time representation of a remote user acquired in a spatially immersive display. Figure 2-7 shows demonstrations of the system.
Figure 2-7. Visual communication and telepresence
in a CAVE™-like environment [Rhee et al., 2007]
Eye contact
Eye contact is an important element of human-to-human communication. It is a well-known cue for gaining attention and attracting interest. In human-robot interaction, a robot that makes eye contact is more familiar and comfortable for humans to interact with. Yamato et al. focused on the effect that recommendations made by an agent or robot have on user decisions, and designed a "color name selection task" to determine the key factors in designing interactively communicating robots. They used two robots as the robot/agent for comparison. From the experiments, eye contact and attention sharing were found to be important features of communication that display and recognize the attention of the communication partner.
In psychology, "joint attention" means that people who are communicating with each other frequently focus on the same object. Joint attention is a mental state in which two people not only pay attention to the same information but also notice the other's attention to it. Imai et al. investigated situated utterance generation in human-robot interaction. In their study, a person establishes joint attention with a robot to identify the object indicated by a situated utterance generated by the robot, named Robovie. A psychological experiment was conducted to verify the effect of eye contact on achieving joint attention. The experiment divided 20 subjects into two equal groups; one group interacted with Robovie with eye contact and the other with Robovie without eye contact. The experimental results made it evident that a relationship developed through eye contact has a more fundamental effect on communication than logical reasoning or knowledge processing.
In telepresence applications, eye contact can increase the user's feeling of immersion and the interactive capability of the participant as a dialogist. It is very difficult, however, to achieve eye contact during interpersonal communication between the user and the participant through a telepresence robot when the face of the user is displayed on an LCD screen: the camera on a telepresence robot is usually placed on top of the LCD screen, which hinders direct eye contact between the user and the participant through the telepresence robot.
Hopf [2000] proposed an implementation of an auto-stereoscopic desktop display suitable for computer and communication applications, as shown in Figure 2-8. The goal of this research was to combine a collimation optic with an auto-stereoscopic display unit to provide natural face-to-face and eye-contact communication without causing eyestrain.
Figure 2-8. Auto-stereoscopic display unit [Hopf, 2000]
Autonomous behavior
In principle, a telepresence robot is operated by a remote user and does not possess autonomous behaviors. However, the telepresence robot should be able to deal with possible hazardous situations autonomously when the remote user is not aware of the hazard, cannot control the telepresence robot properly, or when the data transmission is lost. From the user's view, autonomous behavior increases the user's capability of projection, allowing the telepresence robot to be operated safely and reliably in a dynamic environment. From the participant's view, autonomous behavior also increases the interactive capability of the participant as a dialogist. For example, a telepresence robot with the autonomous behavior of identifying the direction of the speaking participant can assist the remote user in responding more quickly and appropriately.
An interactive museum tour-guide robot, as shown in Figure 2-9, was developed by two research projects, TOURBOT and WebFAIR, funded by the European Union [Burgard et al., 1999; Schulz et al., 2000; Trahanias et al., 2005]. Thousands of users around the world have controlled this robot through the web to visit a museum. The developers created a modular and distributed software architecture integrating localization, mapping, collision avoidance, planning, and various modules concerned with user interaction and web-based telepresence. With these autonomous features, the user can operate the robot to move quickly and safely in a museum crowded with visitors.
Figure 2-9. An interactive museum tour-guide robot, pleasing the crowd [Burgard et al., 1999; Schulz et al., 2000; Trahanias et al., 2005]
Basic data transmission
structure and design elements of TRIC
The telepresence robot TRIC developed in this research aims to be a low-cost, lightweight robot that can be easily implemented in the home environment. Therefore, the primary decision was to use ADSL and Wireless Local Area Network (WLAN), which are commonly found in the home environment, as the channels of data transmission. Two-way audio and one-way video communication are transmitted through a network Internet Protocol (IP) camera, which is also a common tool for home monitoring.
The cores of most telepresence robots are PC-based. Dr. Robot, PEBBLES and Giraffe use video conferencing technology for data transmission, which requires specific software and interfaces running on the user's computer. The channel between a user's computer and the telepresence robot is a peer-to-peer connection. The advantage is that the remote user's face can be displayed on the LCD mounted on the telepresence robot's head. However, it is difficult for multiple users to log in to the telepresence robot at the same time. The core of the interactive museum tour-guide robot, by contrast, is a PC-based web server, which allows thousands of users around the world to log in to the robot through the web to visit a museum.
Instead of using
a PC, a “Mobile Data Server (MDS)” was developed as the core of TRIC. Figure 2-10 shows a picture of the
laboratory prototype of the MDS, which consists of a PIC server mounted on a peripheral
application board. The PIC server integrates a PIC microcontroller (PIC18F6722, Microchip), EEPROM (24LC1025,
Microchip) and a networking IC (RTL8019AS, Realtek). It provides networking
capability and can be used as a web server. The peripheral application board
(as well as the program in the PIC microcontroller) can be easily customized to
adapt to different sensors and applications. The dimensions of the MDS
prototype are 40 mm × 85 mm × 15 mm. The Internet and a serial interface (RS-232) are the primary communication interfaces between the MDS and client PCs and other devices. The MDS also receives external signals (e.g., sensor signals) through dedicated analogue and digital I/O ports, and provides inter-integrated circuit (I2C) communication for connecting external modules. A Multi-Media Card (MMC) in the MDS can be used to store data in FAT16 file format. Compared to a PC, the MDS is low-cost, has smaller dimensions, consumes less energy (and thus can be powered by batteries), is not affected by viruses, and is safer and more reliable.
Figure 2-10. A picture of the laboratory prototype
of the MDS
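This chapter does not specify the MDS's wire protocol, so the frame layout sketched below, with a hypothetical start byte, module and command identifiers and a one-byte checksum, is purely illustrative of how compact command frames are typically exchanged with a microcontroller over RS-232:

```python
import struct

# Hypothetical frame layout for a module command sent to the MDS over
# RS-232: start byte, module id, command id, 16-bit argument, checksum.
START = 0x7E

def pack_command(module_id, command_id, arg):
    """Build a 6-byte command frame with a simple additive checksum."""
    body = struct.pack(">BBH", module_id, command_id, arg)
    checksum = (START + sum(body)) & 0xFF
    return bytes([START]) + body + bytes([checksum])

def unpack_command(frame):
    """Validate the start byte and checksum, then decode the fields."""
    if frame[0] != START or (sum(frame[:-1]) & 0xFF) != frame[-1]:
        raise ValueError("corrupt frame")
    return struct.unpack(">BBH", frame[1:-1])
```

The checksum lets the PIC microcontroller discard frames corrupted on the serial line instead of acting on them.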
Figure 2-11 shows the basic data transmission structure of TRIC. The user projects herself/himself to TRIC in the remote environment by sending control commands through the Internet gateway, and immerses in the remote environment through the sensory feedback transmitted back through the same gateway. TRIC connects to the Internet through the WLAN in the home environment. The MDS takes charge of receiving commands from the user and dispatching them to specific modules, which coordinate with each other to perform specific tasks. Finally, the user can have physical interaction and verbal communication with the participant by controlling TRIC as his/her physical extension in the remote environment.
Figure 2-11. The data transmission structure of TRIC
Under this basic structure, Table 2-2 lists the design elements currently planned for TRIC. The implementation of "teleoperation" in TRIC is quite fundamental: teleoperation allows the user to move TRIC through the environment while controlling the pan and tilt of the IP camera from a remote client PC, letting the user be in two places at once. Supersensory ability is reflected in the zooming capability of the IP camera and the sensing capability of the various sensors installed for environment detection.
Table 2-2. Design elements included in TRIC

Design element              Corresponding technological strategies
Data transmission           use the MDS as the core of the system
Teleoperation               design of the mobility platform
Supersensory                provide zoom of the IP camera; implement various sensors for environment detection
Anthropomorphic elements    design of humanoid appearance and interactive behaviors
Eye contact                 control TRIC to gaze at the participant
Autonomous behavior         share control authority with the participant and the environment
TRIC is not intended to be only a communication medium, such as the "movable teleconference system" Giraffe. One important goal of TRIC is to give the participant the impression that the remote user with whom he/she is communicating is actually present in the local environment. Anthropomorphic elements enhance the impression of TRIC as a true representation of the remote user, and the design of a humanoid appearance and interactive behaviors for TRIC can facilitate interaction with participants.
For this reason, we also decided not to use an LCD to display the user's face, which would give the impression that the user is in a remote location. In most telepresence applications utilizing an LCD display, the camera is mounted on top of the LCD screen, which hinders direct eye contact between the user and the participant. Instead, the camera on TRIC is packaged into a "head" with humanoid expression, which also facilitates the design of "eye contact" because the camera is indeed the "eye" of TRIC. Sophisticated stereoscopic and stereophonic elements have been omitted to keep TRIC a low-cost, affordable homecare robot.
Autonomous behavior is the design element that received the most attention during the planning of TRIC. In principle, a telepresence robot is operated by a remote user who possesses complete control authority. However, a major emphasis of this research is to implement key autonomous behaviors in TRIC in order to increase the user's operating capability and reduce the user's workload during operation. By doing so, the aim is also to increase the interactive capability of elderly people as reciprocal communicators.
Implementing autonomous behaviors implies that the control authority of the telepresence robot is shared with the participant or the environment it is interacting with. Several possible features for sharing control authority with the remote participants are discussed below.
"Look at that!"
People engaged in a face-to-face conversation often share the same view by pointing to an object under discussion. However, through the telepresence robot, it is difficult for the user either to point to a certain object or to find the object the remote participant is pointing at. A two-degree-of-freedom robot arm equipped with a laser pointer is therefore used as a joint attention device to realize the "Look at that!" function: the remote participant can direct the view of the telepresence robot by pointing the laser pointer at the object in question.
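The geometry behind this joint attention device can be sketched simply: the arm's pan and tilt define a pointing direction, and the camera head is servoed toward the same direction. The function names and angle conventions below are illustrative assumptions, not TRIC's actual kinematics:

```python
import math

def pointer_direction(pan_deg, tilt_deg):
    """Unit vector along a 2-DOF (pan/tilt) laser pointer."""
    pan, tilt = math.radians(pan_deg), math.radians(tilt_deg)
    return (math.cos(tilt) * math.cos(pan),
            math.cos(tilt) * math.sin(pan),
            math.sin(tilt))

def gaze_angles(direction):
    """Pan/tilt angles (degrees) that aim the camera along `direction`."""
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    return (math.degrees(math.atan2(y, x)),
            math.degrees(math.asin(z / norm)))
```

If the camera head and the arm share the same base frame, the gaze angles equal the arm angles; otherwise a fixed offset between the two frames must be added.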
"Where is the speaker?"
It is not easy for the user to locate the source of a sound in 3D space through the telepresence robot. When interacting with the remote participant, "Where is the speaker?" enables the telepresence robot to automatically locate and track speakers without control from the user. With this feature, the participant steers the telepresence robot using her/his own voice.
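One possible control step for such voice-driven tracking, assuming a sound-localization module already supplies the speaker's azimuth, is a proportional turn with wrap-around error handling and a turn-rate cap. The gains and limits below are illustrative assumptions:

```python
def track_speaker(current_heading_deg, speaker_azimuth_deg,
                  gain=0.5, max_turn=20.0):
    """One control step turning the robot toward the speaker.

    Returns the new heading after a single bounded proportional turn.
    """
    # wrap the error into (-180, 180] so the robot turns the short way
    error = (speaker_azimuth_deg - current_heading_deg + 180.0) % 360.0 - 180.0
    turn = max(-max_turn, min(max_turn, gain * error))
    return current_heading_deg + turn
```

Repeating this step at each localization update makes the heading converge on the speaker without overshooting on large errors.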
"Come here!" and "Follow me!"
With "Where is the speaker?", the telepresence robot can locate the source of a sound. The "Come here!" feature therefore allows the user to command the telepresence robot to go to the source of the sound. "Follow me!" is another interactive behavior that is common in interpersonal communication. Passive infrared motion sensors combined with ultrasonic range-finding sensors are used to implement a low-cost and reliable "Follow me!" function in which TRIC continuously follows the intended participant.
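One way such a sensor combination could drive the behavior is to let the two PIR sensors choose the turn direction while the ultrasonic range holds the following distance. The thresholds, speeds and sensor layout below are illustrative assumptions, not TRIC's actual parameters:

```python
def follow_step(pir_left, pir_right, range_m,
                target_m=1.0, deadband_m=0.2, speed=0.3, turn=15.0):
    """One 'Follow me!' control step.

    pir_left / pir_right: booleans from the passive infrared sensors.
    range_m: ultrasonic distance to the person ahead.
    Returns (linear_velocity_m_s, turn_deg).
    """
    heading = 0.0
    if pir_left and not pir_right:
        heading = turn            # person drifting left: turn left
    elif pir_right and not pir_left:
        heading = -turn           # person drifting right: turn right
    if range_m > target_m + deadband_m:
        v = speed                 # too far behind: close the gap
    elif range_m < target_m - deadband_m:
        v = -speed                # too close: back off
    else:
        v = 0.0                   # inside the comfort band: hold position
    return v, heading
```

The deadband keeps the robot from oscillating back and forth when the person stands still.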
Two modes of sharing control authority with the remote environment are discussed below. It is difficult for the user to identify environmental information from the robot's limited viewing angles, so automatic obstacle avoidance is necessary. When an obstacle is detected within a specific distance of the robot, the obstacle avoidance algorithm is activated and the robot deviates from the movement direction commanded by the user in order to avoid the obstacle.
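A minimal form of such an override logic can be sketched as follows; the sensor layout (bearing-indexed range readings), safety distance and deviation angle are illustrative assumptions:

```python
def avoid_obstacle(cmd_turn_deg, ranges, safe_m=0.5):
    """Override the user's turn command when an obstacle is too close.

    cmd_turn_deg: turn currently commanded by the remote user.
    ranges: {bearing_deg: distance_m} from the range sensors
            (negative bearings to the left, positive to the right).
    """
    bearing, dist = min(ranges.items(), key=lambda kv: kv[1])
    if dist >= safe_m:
        return cmd_turn_deg                     # path clear: obey the user
    return -45.0 if bearing >= 0 else 45.0      # deviate away from obstacle
```

Control returns to the user as soon as every reading is outside the safety distance again, so the override is temporary rather than a takeover.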
A fundamental self-maintenance function is the ability of TRIC to automatically recharge its battery when needed. This includes the ability to detect the remaining energy capacity, self-positioning to locate and move to the charging station, and automatic parking control to dock the robot at the charging station.
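These three steps can be organized as a small state machine; the state names, threshold and sensor inputs below are illustrative assumptions, not TRIC's actual implementation:

```python
# States of a minimal self-recharging behavior, matching the three steps
# in the text: monitor capacity, navigate to the station, dock, charge.
IDLE, GO_TO_STATION, DOCKING, CHARGING = range(4)

def recharge_step(state, battery_pct, at_station, docked, low_pct=20.0):
    """Advance the self-recharging state machine by one step."""
    if state == IDLE and battery_pct < low_pct:
        return GO_TO_STATION            # capacity low: head for the charger
    if state == GO_TO_STATION and at_station:
        return DOCKING                  # arrived: start parking control
    if state == DOCKING and docked:
        return CHARGING                 # docked: draw power
    if state == CHARGING and battery_pct >= 100.0:
        return IDLE                     # full: resume normal operation
    return state                        # otherwise keep the current state
```

Keeping the behavior as an explicit state machine makes it easy to report the current self-maintenance phase back to the remote user.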
The hardware and software design of TRIC to achieve these functions will be described in detail in the following chapters.