drascic@ie.utoronto.ca
Figure 1.2: RMI Monoscopic Viewing Aids
Figure 2.1: Dynamic Stereoscopic Camera Mount
Figure 3.1: SV Dependence Spectrum
Figure 3.2: Position on the SV-Dependence Spectrum of Experiments 1 and 2.
Figure 4.1: Driving Task for Experiment One
Figure 4.2: Pointing Device for Experiment One
Figure 4.3: Results of Experiment One
Figure 5.1: Fitts' Law Task for Experiment Two
Figure 5.2: Experiment Two: Training
Figure 5.3: Average Trial Times of Experiment Two,
Figure 5.4: Average Trial Times of Experiment Two
Figure 5.5: Experiment Two Part A Day 1 Trends
Figure 5.6: Experiment Two Part A Day 1: Number of Trials with Errors Made while completing Four Error-Free Trials, versus Trial Number and Index of Difficulty
Figure 5.7: Experiment Two Part A Day 2 Trends
Figure 5.8: Experiment Two Part A Day 2: Number of Trials with Errors Made while completing Four Error-Free Trials, versus Trial Number and Index of Difficulty
Figure 5.9: Results of Experiment Two Part B
Figure 5.10: Experiment Two Part B Number of Trials with Error made while completing Eight Error-Free Trials
Figure 5.12: Expt 2 Part B Performance Graphs, where # Errors refers to the number of errors made while completing the 8 successful runs of Experiment 2 Part B.
Figure 5.13: Expt 2 Part A Runs 9-16 Performance Graphs, where # Errors refers to the number of errors made while completing the last 8 successful runs of Experiment 2 Part A.
Table 4.2: ANOVA of Trial Times for the First Video Condition
Table 4.3: ANOVA of Trial Times for the Second Video Condition
Table 5.1: Relationship between Target Width and Index of Difficulty
Table 5.2: Analysis of variance of the number of trials needed to complete the training procedure.
Table 5.3: ANOVA of Experiment 2 Part A Trial Times, as a function of Order (MV First or SV First), Video System (MV and SV), Difficulty (8, 16, 32, 64 cm), and Trial Number (16 trials per condition).
Table 5.4: Analysis of Variance of Trial Times for Experiment Two Part A Day 1, grouping the 16 trials into four sets of four ("Learn4") to elucidate any trends.
Table 5.5: Analysis of Variance of Trial Times for Experiment Two Part A Day 2, grouping the 16 trials into four sets of four ("Learn4") to elucidate any trends.
Table 5.6: Analysis of Variance for Errors for Experiment 2 Part A Day 1, grouping the 16 trials into four sets of four ("Learn4") to elucidate any trends.
Table 5.7: Analysis of Variance for Errors for Experiment 2 Part A Day 2, grouping the 16 trials into four sets of four ("Learn4") to elucidate any trends.
Table 5.8: ANOVA of Experiment 2 Part B Trial Times
Table 5.9: ANOVA of Experiment 2 Part B Percent Errors
None of this work would have been possible without the generous contributions of Dr. Julius Grodski, of the Defence and Civil Institute of Environmental Medicine, and the considerable resources he provided. His comments and suggestions regarding this work are gratefully acknowledged.
The valiant labours of Prof. Mike Carter in protecting me from raging bureaucracy ensured my continued existence as a graduate student and are much appreciated. His contributions to and criticisms of this work were very instructive.
Several other people assisted in the formulation of the ideas discussed herein, including Brian Fitzsimmons, Phil Roberts, Dianna Drascic, and my fellow students in the Department of Industrial Engineering. Suzanne Rochford's contribution as laboratory assistant and guinea pig is greatly appreciated.
The comments and advice from the EOD staff and team at DCIEM were critical to this work, and are greatly appreciated.
The invaluable participation and input of those who devoted many hours of their time to serve as subjects in these experiments are also greatly appreciated. These people are Robert Drascic, Angela Gaudio, Laura Logan (twice), Drew van Camp (twice), Farshad Jajarmi, Parminder Kalirai, Yan Xiao (twice), Rhea Plosker, Yi Kou, Prabir Sarkar, Antonella Arcaro, Sue Clouse-Jensen (twice), Paul McInerney, Randy Sollenberger, Hao Zhao, Dave Colter, Zoltan Leskowsky, John Ovcauk, Deniz Ulguray, and Margaret Campell.
The work described herein was carried out under contract W7711-7-7009/01-SE with Supply and Services Canada for the Defence and Civil Institute of Environmental Medicine. Considerable personal financial assistance and support was provided by my parents, Janet and Savino Drascic, without which this work would have been impossible. At least for me.
To that end, a practical Stereoscopic Video (SV) system was developed that is compatible with standard video display and recording equipment. Two experiments were conducted to examine the potential benefits of SV for teleoperation, with particular emphasis on the effect of experience.
The first experiment examined the issue of whether it was easier to learn how to interpret a SV display than a standard monoscopic video (MV) display. Using a task that had very little demand for binocular depth cues (i.e. was SV-independent), it was found that there was a benefit in performance due to SV that diminished as the operators learned how to use the monocular cues of the MV display. Furthermore, the first experiment provided evidence to suggest that SV can be used effectively with little or no training, while MV requires a period of adjustment and learning.
The first experiment also revealed an interesting transient effect that changing from one video condition to another can have on performance. Those who change from an SV to a MV display show a temporary but dramatic drop in performance, while those who change from a MV to an SV display show a large improvement in performance. The results of the experiment and the literature suggest that the differing appearances of "reality" of the two displays may affect the confidence of the operators in their abilities to perform the task, and therefore affect their performance.
The second experiment examined the second issue, that of how the transience of the benefits of SV is a function of the difficulty of the task and the dependence on binocular depth cues. It showed that the benefits of SV, even after a great deal of practice, will still be apparent for difficult tasks, long after the benefits have faded for easier tasks.
Most of these telerobots must be controlled manually by a human operator because computer intelligence systems are not yet smart enough to control the robots autonomously, except under very restricted circumstances. The operators must make all of the high level decisions, about where to go and what to do, as well as all of the low level decisions involved in carrying out these high level decisions, using the limited amount of information available from the sensors on the telerobotic device.
The goal of this work is to make the job of the operators of these telerobots easier. In this paper we report some efforts to improve the display part of the human-machine interface.
As Don Norman says in The Psychology of Everyday Things, "Nothing succeeds like a good display." (Norman, 1988) Giving necessary information in a natural form facilitates all human-machine interactions. The majority of teleoperation devices employed around the world are equipped with one or more standard video displays for feedback. (Meieran, 1988) This means that one very important type of feedback information missing from the standard display is the immediate and compelling binocular coding of depth, which is thwarted through the use of a monoscopic video system, making the operator more dependent on the variety of other depth cues. This is unfortunate, since most telemanipulation requires the operators to have a good sense of the relative locations of objects in the remote world. This depth information must be obtained through such cues as size constancy, interposition, shadows, and so on. While this is possible, and is a skill that can be mastered (Clapp, 1986), it is not the most direct way of conveying this information to the operators. By presenting the depth information in a natural manner, through stereoscopic perception, the entire task is made simpler.
The work discussed herein focuses on one particular telerobotic task, namely Explosive Ordnance Disposal (EOD), but the research and its implications apply to a much broader range of problems.
In general, EOD means rendering suspected or known bombs inoperative with as much regard for human safety as possible. Before the advent of suitably advanced telerobots, most of this work was done manually, at great risk to the safety of the bomb disposal expert. It is now possible in many situations to use the telerobot to disable the bomb from a safe distance.

It is possible to learn how to control the RMI without too much difficulty, although there are some violations of stereotypes with respect to the direction of operation of the toggle switches which control the movement of the arm. The actual driving of the device is relatively straightforward: in the course of this work, it seemed that most novices had only to master the technique of inching forward; the steering of the RMI came naturally (when moving forward, at least).
The feedback information provided by the RMI Control Station typically consists of a single black and white video monitor and a bi-directional sound system.
........1 paragraph and diagram deleted: operating procedure
A variety of tools can be attached to the RMI, and they are used in the following manner:
* x-ray unit: this consists of an x-ray projector mounted on the RMI base, and an x-ray film plate mounted on the end of the arm. The film plate, roughly 30 cm high and 50 cm wide, is mounted on a hinge so that it can swing freely, and thus remain vertical regardless of arm orientation. There is approximately a one metre gap between the x-ray projector and the film plate.
The plate is usually lowered behind the suspected parcel and slowly brought forward until it is as close to the parcel as possible. The closer the plate is to the parcel, the better the x-ray image. On the other hand, many bombs have anti-tampering mechanisms, and so it is important that the plate does not touch the parcel. Positioning the plate is a delicate task that must be done with a great deal of caution and that requires much training. It is highly dependent on such depth cues as are provided by lighting and shadows.
Once the x-ray has been taken, the RMI is returned to the control station where the film is developed to determine whether the parcel is indeed a bomb, and if so, where in the parcel the mechanism lies.
* claw: this is a simple hinged claw, driven by an electric motor.
........2 paragraphs deleted: claw limitations
* shotgun: this is mounted onto the "fore-arm" of the RMI.
........2 paragraphs deleted: shotgun use
* disruptor (centaur):
........1 paragraph deleted: disruptor description and use
The disruptor has an effective range of approximately 30 cm, with a spread of about 30 degrees. It must be positioned from 6 to 10 cm from the target, and should be aligned along the longest axis of the parcel to achieve greatest effectiveness.
In order to ensure that the disruptor is placed at the proper distance without disturbing the target in any way, the operators attach a loop of tape around the end of the disruptor, so that it sticks out about 8 cm. The operators then inch the RMI closer and closer to the target, until a movement in the tape observed on the remote monitor indicates that the RMI is at the proper distance. The tape is flexible enough that there is little risk of it triggering the explosive.
This task requires a very good perception of the relative distances of the target and the disruptor. In unknown environments, size cues are insufficient, and the operators must rely on lighting and shadows, and particularly on the loop of tape.
* horseshoe: The horseshoe is used to blast the cap off a pipe bomb and scatter its contents.
........2 paragraphs deleted: horseshoe description and use
* pipe-carrier:
........1 paragraph deleted: pipe-carrier description and use
The pipe carrier can only be used for pipe-bombs that are known not to be motion sensitive. Even for such bombs, great care must be taken, since there is usually explosive powder in the threads of the screw-on caps, and a jar to the pipe may be enough to cause it to explode.
Even when it is possible to make judgements about the relative position of items in the remote world, it usually takes a long time, and may require that the telerobot be driven around for a while in order to provide motion parallax information and a variety of views.
Additions to the MV display, such as the marks made by the operators on the monitor as described above (Section 1.1.2) can improve this situation, but such techniques are very limited in their scope and applicability.
........1 paragraph deleted
The literature shows that using colour video displays can significantly improve obstacle recognition and course planning for terrestrial teleoperation (McGovern, 1987 b), and they are consistently rated higher on subjective scales of satisfaction than comparable monochrome displays (Miller, 1988). Changing to a colour display would be an obvious improvement with comparably little extra expense. While this question was not pursued further, being beyond the scope of this study, it is mentioned here to explain why colour displays are used for the experimentation described herein.
The second category of depth cues is binocular depth cues. These cues result from seeing with two eyes from slightly different viewpoints, and include (i) the convergence angle of the eyes, and (ii) retinal disparity, that is, the differences between the retinal images of the left eye and the right eye. (For a detailed explanation of binocular depth cues, see Arditi, 1986.)
It is the sum of all available monocular and binocular cues which results in the perception of depth.
A further advantage of binocular viewing is binocular parallax, which while not a depth cue per se, provides additional visual information: since each eye views the world from a slightly different position, it is possible to see "around corners" (that is, what may be occluded from view with one eye may be visible to the other).
Telescopes were another technology, effectively bringing close what was far away. But both pictures and telescopes were a limited representation of the remote view, in that they were monoscopic, or "2-D", and made it hard to perceive the spatial relationships of objects in the remote view. Thus both were refined to permit stereoscopic, or "3-D", viewing: telescopes became binoculars, and pictures were presented as stereoscopic pairs for use in such display devices as the "stereoscope". By presenting both eyes of the observer with a slightly different viewpoint, binocular depth coding could be used, and a higher fidelity presentation of the remote world was possible.
Seeing at a distance has progressed from static, low-fidelity monoscopic images to dynamic, high fidelity stereoscopic images. Future innovations already being developed include holography (Frey, 1986), head-tracking displays (Merritt, 1987), graphic aids (Kim et al, 1987, Drascic et al, 1991 b), computer-assisted perception (Grodski et al, 1991, Milgram et al, 1991), and interaction.
"2D displays would provide all essential information for the formation of distance perception and estimation, and that 3D displays, adding no important distance cues, yet providing all 2D cues, would yield similar estimations." [page 325]What they failed to take into account was that binocular disparity provides a very powerful perception of relative distance, and so they completely ignored the most fundamental advantage of stereoscopic displays. While it is true that eye convergence is not effective as a depth cue beyond a distance of approximately 2 metres, retinal disparity is effective to a much greater distance. Human resolution of retinal disparity is generally considered to be approximately 10 seconds of arc, although some individuals have been able to detect disparities as small as 4 seconds of arc under ideal conditions (Arditi, 1986). Assuming an estimate of 10 arc seconds for stereoacuity, this means that a difference in depth of one metre can be detected at 37 metres.
It therefore stands to reason that the benefits of stereoscopic display will apply to a variety of teleoperation tasks, involving both near work, such as the precise placement of tools, and far work, such as in reconnaissance or driving.
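The 37 metre figure quoted above can be checked with a short calculation. The following Python sketch uses the standard small-angle approximation, in which the smallest detectable depth difference at distance d is roughly d squared times the stereoacuity (in radians) divided by the eye separation. The eye separation of 65 mm is an assumption made here for illustration (the text does not state the value used), and the function name is hypothetical.

```python
import math

# Conversion factor from seconds of arc to radians.
ARCSEC_TO_RAD = math.pi / (180 * 3600)


def depth_threshold(distance_m, stereoacuity_arcsec=10.0, eye_separation_m=0.065):
    """Approximate smallest detectable depth difference (metres).

    Small-angle approximation: delta_d ~= d**2 * eta / I, where eta is the
    stereoacuity in radians and I is the eye separation.
    The 0.065 m default for I is an assumed typical value, not from the text.
    """
    eta = stereoacuity_arcsec * ARCSEC_TO_RAD
    return distance_m ** 2 * eta / eye_separation_m


if __name__ == "__main__":
    for d in (2, 10, 37):
        print(f"at {d:>2} m: about {depth_threshold(d):.3f} m")
```

Under these assumptions the threshold at 37 metres comes out at roughly one metre, in line with the estimate in the text, and it shrinks to a few millimetres at 2 metres because the threshold grows with the square of the distance.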
This system could be improved by making the superimposed monoscopic graphics dynamic. If they were to be generated on-line by a computer, they could be drawn to correspond to any distance, not just one, and with the RMI in any configuration, not just the "home" position. In that way, they could be used for absolute depth estimation and absolute size estimation whenever the floor is flat and level and the target is clearly located above a visible spot on the floor. Unfortunately, if the target is located at some place where the floor is not visible, such as on a step, the operator will be required to guess.
Kim, Takeda, and Stark (1988) report a system which superimposes a horizontal grid and vertical reference lines for objects in the remote world on top of a monoscopic view. They found that this enhanced monoscopic view enabled performance similar to that of two perpendicular monoscopic views or a single stereoscopic view. Unfortunately, this system requires that the operator individually identify all items of interest in the remote world, and enter the three dimensional position of those items into a computer database (using a relatively simple graphical procedure). It also requires that the telerobot be able to accurately report its current position relative to a known reference point. Refitting the RMI to permit this would be prohibitively expensive.
* Enhanced image interpretation, especially in unfamiliar or complex environments.
* Visual noise filtering. Random noise of any sort, whether through poor transmission of the signal, or through sediment underwater, will obscure different parts of each eye's view. Using the two eyes' views for stereoscopic perception, the brain automatically suppresses the uncorrelated noise, enhancing the observer's ability to identify objects in the scene.
* Enhanced effective image quality. Stereoscopic displays can cause improvements in effective image quality which is degraded due to low resolution, lack of focus, motion blur, and so on.
* Wider effective field of view.
* Enhanced slope and depression detection. In situations where edges, textures and sizes vary randomly, such as when driving a telerobot on a field of grass, a change in slope may not be detectable with monocular cues, which may cause the telerobot to overbalance and fall.
* More accurate sense of depth. This will perhaps lead to reduced error rates for casual movements (critical movements are generally done with such care that there is little room for improvement), and faster execution of the task.
* Faster perception of spatial relationships in the remote world. This is due to a more salient presentation of the depth information; no search for non-salient monocular cues is needed. This should result in a faster execution of the task, as well as decreasing the amount of time needed to learn how to use the monoscopic display.
* More complete information about spatial relationships in the remote world. This should result in a more quickly perceived and more accurate mental image of the remote environment. This could reduce complex task execution time, and would certainly reduce execution time of a repetitive task.
In addition to these benefits, a further unproven benefit appeared likely prior to this research, and was in fact supported by the results of the experiments discussed herein:
Reduced Training and Practice Time. All operators must undergo extensive training in order to be qualified to conduct EOD operations, and all are required to partake in regular practice sessions with the RMI. Without these practice sessions, the operators' skill quickly deteriorates. It was our expectation, based on experience with the RMI and the reports of trained EOD specialists, that this deterioration of EOD operational skill is due primarily to the binocular deficiencies of the monoscopic display, and only to a lesser extent due to forgetting of the motor control skills.
It has been shown that when using a modified direct view, such as with a prism arrangement to artificially exaggerate eye separation, or magnifying lenses, monocular cues need to be learned, or recalibrated, a process which takes time (McGovern 1987 a). Furthermore, it is known that binocular depth cues play a fundamental role in the calibration of the monocular depth cues, and that binocular disparity is perceived more quickly than any other visual cue (Clapp 1986, 1987).
Even after having learned how to interpret monocular cues for a considerable time, it remains a fairly weak depth cue, and is easily dominated by other cues such as perspective and occlusion. (Wickens et al, 1990)
On the other hand, as Baker notes: "Because the accommodation/convergence differs in stereoscopy and the physical world, the ability to see binocular depth on a CRT must be learned. On first occasion, many people adapt in a few seconds, while others may take several minutes to see the image comfortably." (Baker, 1987) This short time period, however, is insignificant compared with the time needed to master interpretation of monocular cues.
These facts imply that it will take a novice longer to become proficient in teleoperation using a monoscopic display than a stereoscopic display. Since the view from the video monitor is very different from a direct view of the real world with respect to the relationship between the monocular depth cues and the binocular depth cues, it is reasonable to expect that without constant practice, the temporary voluntary recalibration (or learning) of the depth cues of the monoscopic display will fade.
These considerations suggest a benefit of stereoscopic displays not greatly discussed in the literature: they are easier for novices to use, and will thus require less time for training and practice than monoscopic displays. There has been little rigorous study of this suggestion, and so two experiments were designed to look at learning behaviour in monoscopic and stereoscopic displays.
Although considerable time was spent developing this technology, this was not the main point of this thesis. There are various other ways of achieving the same goals (although they are almost universally more expensive). Because of this, what follows includes only a short description of the implementation developed. What is important to this thesis is the Human Factors research conducted. That will be described in considerably more detail in the following chapters.
Dual optical path systems can produce high fidelity stereoscopic images, but in virtually all implementations can only be used by one observer at a time, since typically only one set of optics is provided. Furthermore, the need for lenses, mirrors and prisms is often expensive and inconvenient. Since the operator must be looking directly into the optics of the system, use of such systems can interfere with the performance of other tasks.
One version of this type of system uses what is known as a chromatic anaglyph, where the left image is presented in one colour, say green, and the right in another colour, say red. The observer wears one red and one green filter in front of the left and right eyes respectively. The left eye, with the red filter, will not be able to distinguish the red image from the filtered white background, and yet the green will stand out as being considerably darker than the surrounding region. Similarly, the right eye, with its green filter, will not be able to see the green left image, but the red right eye image will stand out considerably darker than the background.
Chromatic anaglyphs are relatively simple and inexpensive to implement, and can be used to produce stereoscopic images from a printed page (for example, many books on drafting use this technique). Unfortunately, chromatic anaglyphs are generally not acceptable for extensive use, resulting in too much eye strain and user dissatisfaction. (Lane, 1982)
A similar technique using polarised light has replaced the chromatic anaglyphs in most applications, where the left eye image is presented using light polarised in one direction, and the right eye image is presented using light polarised in a perpendicular direction. Viewing the combined images directly, the observer would see both the left and right images superimposed. If the images were viewed through a polarising filter, however, only the light polarised in the direction of the filter can pass through it, and so the observer will see either the left or the right eye view, depending on the orientation of the filter. For stereoscopic viewing, the observer wears spectacles with appropriately polarised lenses, so that the combined left and right eye images are separated for the viewer.
When used for viewing stereoscopic slides and motion pictures, the typical implementation uses separate projectors or projector optics for the left and right eye views, each set of optics fitted with a polarising filter. The two images are simultaneously projected onto the same screen (which must be capable of preserving the direction of polarisation). When used for video applications, the left and right images are time-multiplexed, where the image on the monitor switches rapidly between the left and right camera views, and a liquid crystal polarising filter, capable of rapidly changing the direction of polarisation, is fitted over the surface of the monitor. When the left image is presented, the filter is polarised in one direction, and when the right image is presented, the filter is polarised in a perpendicular direction. When the alternating rate is fast enough, the illusion of continuous presentation can be achieved.
Polarised filter separation techniques have the advantage of being full colour, and are generally more acceptable to users than colour filter separation techniques (chromatic anaglyphs). Linear polarisers place restrictions on head tilt; circular polarisers do not. Both types have some problem with light transmission and crosstalk. And while relatively inexpensive to implement for slide and movie projection (since simple, small polarising filters can be used), polarised filter separation techniques are expensive to implement for a video system, since the polarising filter must be large enough to cover the entire display surface, and must be dynamic, able to change polarising direction many times per second.
A third variation of the filter separation technique uses alternating dual images, with active shutters that alternately occlude one or the other eye placed in front of the eyes of the observer. The presentation of the time-multiplexed left and right images is synchronised with the shutters, so that each eye is presented with only its corresponding image. A variety of shutters have been used, including mechanical, PLZT, and liquid crystal (Lipton & Meyer 1984, Lipton 1987, Milgram & van der Horst 1986). These systems, especially when using liquid crystal shutters, can give high quality stereoscopic images without complicated equipment. Some shutters suffer from low transmission problems and crosstalk. Depending on the configuration of the system, the stereoscopic image may be viewable by more than one observer. Unfortunately, observers with an oblique view will see a distorted image.
The intended advantage of such systems is that no filters or shutters are necessary, and that observers with oblique views will not see a distorted image. Unfortunately, many holograms and lenticular displays can be seen only when the observer's head is in a particular location, limiting the number of viewers. Systems using dynamic mirrors or lenses usually have a restricted field of view. Furthermore, even though holographic technology is continually improving, it cannot yet be used as a live, interactive display, and the enormous bandwidth that such a display would require makes one extremely unlikely in the near future. Although some work has been done to use lenticular lenses for video display purposes (Butterfield, 1979), it has proven difficult and expensive to implement and use.
Those systems using dynamic mirrors or lenses have been successfully implemented for the display of simple computer generated images, but no implementation of such a system for a video display has been reported. The nature of this type of system imposes dramatic bandwidth limitations on such a display.
This can most easily be achieved by using carefully timed projection images or lasers to illuminate spots on an opaque surface that is sweeping through a particular volume (Lane, 1982, Williams & Garcia, 1988). Other techniques which have been investigated include using intersecting laser beams to excite small "spots" of gases in a display "tank", and sweeping a grid of light sources, such as LEDs, through a volume at a rapid rate (Fajans, 1979).
Such systems have the advantage of being viewable by many observers simultaneously, and suffer from no distortion due to oblique viewing angles. Unfortunately, stereoscopic real image displays are very costly and difficult to implement, and are still in their infancy. Some simple computer-generated images have been demonstrated, but there is considerable doubt that such systems could ever be used for the display of live video.
This is consistent with the demands of this particular teleoperation task, and with teleoperation in general: under most circumstances only a single operator need view the display, so there is no viewing angle distortion, and properly designed shuttering spectacles need not interfere with the operation of the telerobot and associated equipment.
The NTSC video standard has been described in considerable detail elsewhere (see Harshbarger, 1984, for a good summary). In brief, due to early limitations in broadcasting bandwidth, the NTSC system has 525 horizontal scan lines for each image, or frame, with an update rate of 30 frames per second. Since such a slow update rate would result in a visible flicker under normal viewing conditions, the lines are not drawn on the display in numerical order from top to bottom. Instead, the 263 odd-numbered lines are drawn from top to bottom, presenting the odd field, and then the 262 even-numbered lines are drawn from top to bottom, presenting the even field. Together, the odd and even fields constitute one frame. In this way the conflicting requirements of bandwidth limitations, sufficiently high image refresh rate to avoid visible flicker, desire for a high-quality image, and need for a frequent image-update rate to give the illusion of continuous motion have all been balanced fairly equitably.
In order to create a stereoscopic image with off-the-shelf video technology, it is necessary to alternate rapidly between the left and right images. When using the NTSC standard, it is most convenient to create the stereoscopic image by taking the odd field from one video camera, and the even field from another. This results in a standard video signal that can be displayed and recorded on standard video equipment. (For more details on field interlace techniques, see Milgram & van der Horst, 1986, Lipton & Meyer, 1984.)
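As a rough illustration of the field-interlace idea, the following Python sketch builds a single frame from two camera frames by taking the odd-numbered scan lines from one and the even-numbered scan lines from the other. Frames are modelled simply as lists of scan lines rather than as real video signals, the function names are hypothetical, and the choice of the left camera for the odd field is an arbitrary assumption, since the text does not say which camera supplies which field.

```python
def interlace_stereo(left_frame, right_frame):
    """Combine two frames into one field-interlaced stereoscopic frame.

    Scan lines are numbered from 1, as in the text. Odd-numbered lines come
    from the left camera and even-numbered lines from the right camera
    (an assumed assignment; the reverse would work equally well).
    """
    if len(left_frame) != len(right_frame):
        raise ValueError("frames must have the same number of scan lines")
    return [
        left_line if line_no % 2 == 1 else right_line
        for line_no, (left_line, right_line)
        in enumerate(zip(left_frame, right_frame), start=1)
    ]


def split_fields(frame):
    """Return (odd_field, even_field) as they would be drawn in turn."""
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


if __name__ == "__main__":
    lines_per_frame = 525
    left = [("left", n) for n in range(1, lines_per_frame + 1)]
    right = [("right", n) for n in range(1, lines_per_frame + 1)]
    odd, even = split_fields(interlace_stereo(left, right))
    print(len(odd), "odd-field lines,", len(even), "even-field lines")
```

Because the combined frame has the same line count and ordering as any other frame, it can be treated as an ordinary video signal. Each eye's view is carried by only one field per frame, so at 30 frames per second each eye is refreshed 30 times per second.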
The lenses used were 8 mm automatic iris TV Camera lenses manufactured by Cosmicar, model number C814DEX2.
The first technology used for shuttering glasses is based on cholesteric liquid crystal. In the OFF state, the liquid crystal lens has a milky white appearance which scatters the incident light, resulting in an extremely low contrast image. In the ON state, the liquid crystal lens becomes almost transparent, with a high transmission rate, resulting in a high contrast image. (See Milgram & van der Horst 1986 for a discussion of the merits of these shutters.) In order to reduce the apparent flicker when using a bright display, a neutral-density filter was added to the original shutters for use in the work described below.
The second technology used is based on twisted nematic liquid crystals. When in the OFF state, these liquid crystals cause incident polarised light to be rotated ninety degrees. When in the ON state, they simply transmit the incident light. Each liquid crystal cell is sandwiched between two aligned polarising filters. When the liquid crystal is in the ON state, incident light is polarised by the first filter, is transmitted through the liquid crystal cell, and is then transmitted (and attenuated further) by the second polariser. When the liquid crystal is in the OFF state, the liquid crystal rotates the polarised light from the first filter by ninety degrees, so that it cannot pass through the second filter; the lens effectively goes black.
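The ON and OFF behaviour of the twisted nematic shutter can be sketched numerically using Malus's law, which gives the fraction of polarised light passed by a second polariser as the squared cosine of the angle between the light's polarisation and the filter axis. The sketch assumes ideal, lossless polarisers and an exact ninety-degree rotation, neither of which is claimed for the actual glasses; the function name is hypothetical.

```python
import math


def analyser_transmission(rotation_degrees):
    """Fraction of already-polarised light passed by the second (aligned)
    polariser after the liquid crystal cell has rotated the polarisation
    by the given angle. Ideal polarisers are assumed (Malus's law)."""
    return math.cos(math.radians(rotation_degrees)) ** 2


on_state = analyser_transmission(0.0)    # cell ON: no rotation, light passes
off_state = analyser_transmission(90.0)  # cell OFF: 90 degree rotation, light blocked
print(on_state, off_state)
```

In practice both polarisers attenuate the light even in the ON state, so the open shutter is noticeably darker than a clear lens.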
The cholesteric liquid crystal glasses were constructed by Dr. Paul Milgram. The twisted nematic liquid crystal glasses are commercial units made for the Amiga computer by Haitex, with the trade name X-Specs.
The mount was intended to be able to symmetrically adjust the convergence angle and separation of the two cameras. This is accomplished by using a dual roman-screw design (see Figure 2.1).
The optional stepper motors and computer interface permit dynamic remote
control. For the experimental work described herein, however, the mount was
adjusted manually. Further research using this device is under way in this
department to investigate the utility of dynamically adjustable camera
configurations.

Human Information Requirements for Telerobotics:
1) Knowledge & Rules (e.g. what procedures to follow, which techniques are most effective for a particular situation, special considerations)
2) Skills, which can be divided into the following categories:
i) Interpreting the Display (e.g. how to translate the information displayed into a suitably accurate internal model of the remote world)
ii) Controlling the Telerobot (e.g. familiarity with the control dynamics and behaviour of the telerobot, how to use the buttons, knobs, and switches to operate the telerobot, and knowing how the robot responds to a certain control action or to a certain external influence, such as a steep grade, etc.)
3) State Information, that is, information about the current:
i) Robot Configuration (e.g. whether the arm is extended or not, the claw open or not)
ii) Spatial Layout of the Remote Environment (e.g. what the objects in the remote environment are, and where they are in relation to each other)
iii) Location of the Telerobot within the Remote Environment
Regarding (2ii) Skill in Controlling the Telerobot: while binocular cues are very important for conveying depth information, they are even more critical for conveying information about motion in depth (Clapp, 1987). An SV display system can be expected to provide much better visual feedback of the motion of the telerobot to the operator, so it can be hypothesised that using SV can facilitate the acquisition of some of the skills needed to control the robot. There is little discussion in the literature about this point, and what there is is contradictory. Pepper & Hightower (1984) report that for a simple target positioning task, comparing the task execution time when using SV and MV display systems, the performance advantage of SV is greater for experienced operators than for novices, implying that the relative benefits of SV increase with experience. On the other hand, Smith et al (1979) found that the relative benefits of SV decrease with experience. In order to investigate these issues further, two experiments were designed and conducted, as discussed below.
Merritt states: "Certain visual tasks are trivially easy with 3D vision, but extremely difficult with only 2D vision. In some cases, the monocular cues do not carry the required depth information; in other cases, the monocular cues normally present are degraded by poor visibility or image quality." (Merritt, 1984) Extending this idea, teleoperation tasks can be seen as existing in a spectrum between the two extremes of not requiring any depth information (i.e. SV-independent), and being impossible without stereoscopic displays (i.e. SV-dependent). (See Figure 3.1.) Driving a telerobot along a clear path would fall towards the SV-independent end of the spectrum: sufficient information to accomplish the task could be readily acquired without SV. Although there may be an initial advantage to using SV, as discussed above, it is reasonable to expect that for most such tasks, this initial advantage will fade with experience (e.g. Pepper et al, 1981, commenting on Pesch, 1967).
At the SV-dependent end of the spectrum exist tasks that depend entirely on relative and absolute positions in depth, information that may only be available with SV displays, such as the precise placement task with an edge view that is shown in Figure 3.1. Depending on the exact location of the task on the spectrum, it may or may not be possible to learn how to use the monocular depth cues to accomplish the task. For tasks at the extreme SV-dependent end of the spectrum, it is not unreasonable to expect that the initial performance advantage of SV will actually increase with time.
Between these two extremes exist the bulk of teleoperator tasks. Since most
telerobots in use today are equipped with MV system(s) (Meieran, 1988), the
only tasks possible have been those near the SV-independent end of the
spectrum. As SV becomes more commonly used, the variety of telerobotic tasks
will increase dramatically.

Hypothesis 1: Operators will learn how to interpret SV displays faster than MV displays.
Hypothesis 2: Operators will be able to perceive the static and dynamic spatial relationships of the remote scene faster and more accurately using an SV display, and will therefore show a performance advantage for SV displays as a function of the SV-dependence of the task.
Hypothesis 3: The advantage of SV over MV will change as a function of experience; the more SV-dependent the task, the longer the SV advantage will last.
These hypotheses are particularly important for EOD operations, since actual operations are relatively rare in Canada, and considerable time and expense are devoted to maintaining the high skill levels of EOD experts. If it were shown that skill acquisition occurs at a faster rate when using SV, the practical implications would be obvious.
In order to isolate this particular type of learning, all other learning of the Human Information Requirements for Telerobotics listed above had to be controlled. The Knowledge and Rule based information was controlled automatically, since all subjects received exactly the same training. The Skill in Controlling the Telerobot was controlled by having the subjects train with the robot under direct viewing conditions, and by using an extremely simple task. The State Information of the Configuration of the Telerobot was controlled by presetting the robot into a fixed position. The Spatial Layout of the Remote Environment was simplified and presented to the subjects using direct viewing. The Location of the Telerobot within the Remote Environment was restricted to a very small range. By using this approach, it was hoped that any learning effects would be attributable to learning how to interpret the SV and MV displays.
The second experiment was designed to examine whether the different display systems affected the learning of the Control of the Telerobot, and the role of different levels of SV-dependence. The other learning factors were controlled using the techniques above, with the addition that the learning of how to Interpret the Display was controlled with training. Four different levels of SV-dependence were used.
If we consider the SV-dependence spectrum, the two experiments could be
characterised as shown in Figure 3.2.


The EOD disruptor task was explained to the subjects, and they were told that they had to position the pointer 7 cm from the target. The loop of paper was 7 cm from the front of the pointer, and so they were to stop the RMI as soon as they observed any movement of the paper.
The subjects were warned that if they drove too far forward, the "disruptor"
(pointer) would touch the "bomb" (target), causing it to "explode" (i.e. the
buzzer sounded, and the image on the viewscreen was temporarily disrupted).

Since the experience level of the subjects would be considerably different for their second set of trials than for their first, the order in which the subjects performed their trials needs to be considered another factor. This was a between-groups factor.
When the training was complete, the subjects were placed in a darkened booth, with a single colour monitor for viewing and the control panel for the RMI. No effort was made to restrict viewing distances or head positions. Since we were interested in examining the learning of how to interpret the display, the subjects received no training using the remote view.
Each subject, considered a "novice" at this point of the experiment, then performed sixteen consecutive trials of the driving task, half the subjects using MV (the MF "mono first" Group), the other half using SV (the SF "stereo first" Group). The RMI was positioned at a fixed starting distance three metres from the target. The subject was given a three second countdown, and on the word "Go" would drive the RMI forward, aimed at the centre line of the target, until a small movement of the paper on the pointer indicated that the pointer had indeed touched the target. The subjects would indicate that they were satisfied with the position of the RMI by calling out "Stop". Trial times were recorded by the experimenter with a stop-watch.
When all sixteen runs under the first viewing condition (either MV or SV) were complete, the subjects had a five minute break, followed by an additional sixteen runs under the other viewing condition.
The entire procedure lasted approximately two hours.
Pilot studies conducted in preparation for this experiment indicated that several simple SV-independent tasks, similar to the one used in this experiment, could be performed as quickly using MV as with SV once the operators had sufficient experience. It was therefore expected that as experience increased in this experiment, the difference between the MV and SV performance would decrease.
In the following discussion we use two different approaches to analysing this experiment, the first based on the information being presented to the subjects, the second based on the effects that this presentation of information may have on the confidence level of the subjects. From these two approaches are developed two sets of hypotheses. The results of the experiment will then be compared with these hypotheses to provide some understanding of the various performance issues involved.
Once the interpretation of the display is mastered, any performance differences between those using MV and those using SV will depend on the quantity and quality of visual information, as well as on the time required to process that information. With this particular task being SV-independent by design, the richness of the monocular cues should be similar in quantity and quality to the combined monocular and binocular cues of the SV display. The binocular cues are unnecessary and to a certain extent redundant. Therefore little or no long-term performance advantage due to SV is expected.
All of the subjects can be expected to have learned how to perform the telerobotic driving task to some extent during the first part of the experiment, and so task execution times should generally be somewhat shorter in the second half of the experiment than the first. On the other hand, changing from one display system to another should involve a certain amount of "overhead", since the subjects must learn how to interpret the new display. The MF Group ("Mono First") of subjects, who switch from MV to SV, must learn how to extract the information they are already experienced with from the new display, and must decide how to integrate the new binocular depth cues. This could theoretically increase task completion time, although given the natural ability to interpret SV displays that has been suggested above, it is more likely that performance will not change significantly. The SF Group ("Stereo First") of subjects, who switch from SV to MV, must learn how to do without the binocular depth cues they have presumably been relying upon, and must learn how to accomplish the task using only monocular cues. Given the experience they already have, however, this transition should not be very dramatic, and the task execution time of the SF Group using MV in the second set of trials should generally be shorter than that of the MF Group using MV in the first set of trials.
When the subjects begin the first trial, they are performing an unfamiliar task under unfamiliar circumstances. Under such conditions, it would be reasonable to expect the subjects to approach the task with some degree of caution, and thus perform the task relatively slowly. As they repeat the task, they learn how to interpret the visual cues and how to control the telerobot more accurately, and can therefore perform the task more quickly. Furthermore, it is reasonable to expect that they will be more confident in their abilities, and that this too could contribute to a reduction in trial time.
The literature shows that using SV rather than MV increases subject confidence in ability and performance, even when no performance benefit can be seen (Chavand et al 1986, Lippert et al 1982). This difference in confidence may be particularly relevant for the first few trials of the experiment.
When beginning the second set of trials, the subjects must adapt to using a new display system. The MF Group are given an SV display, with more information and a greater sense of reality (Chavand et al 1986), and will likely experience an increase of confidence in their ability to accomplish a somewhat familiar task. This could serve to decrease task execution time below what was obtained in the first set of trials. The SF subjects, on the other hand, are suddenly deprived of information and are faced with a loss of reality. This contrast could serve to shake their confidence, and their task execution times may increase correspondingly.
Effects of Information Presentation:
1A. Subjects using MV will perform slower than subjects using SV initially, due to extra processing demands in interpreting monocular cues.
2A. Subjects using MV will show significant improvement in performance due to learning how to interpret the display. Subjects using SV will show either a rapid improvement in performance, or very little improvement in performance; that is, they will learn how to interpret the SV display quickly, or they will already know how to interpret it and will show very little learning.
3A. Subjects switching from MV in the first set to SV in the second set will show no change or a small increase in task execution time, due to the overhead associated with re-interpreting the display. Task completion times should then continue to decrease.
4A. Subjects switching from SV in the first set to MV in the second set should show an increase in task execution time, since they must learn how to interpret the MV display. The slower task execution should still be faster than the MV results in the first set, however, since the subjects already have some skill in remote teleoperation and in using video displays for feedback.
Effects of Confidence
1B. Subjects using SV should perform faster than subjects using MV initially, due to greater confidence in their ability to perceive spatial relationships in the remote world. (Similar to 1A.)
2B. (Same as 2A.)
3B. Subjects switching from MV in the first set to SV in the second set should show a decrease in task execution time, since little or no learning of the display is needed, and since the subjects will have increased confidence in their ability to perceive spatial relationships in the remote world. (Different from 3A.)
4B. Subjects switching from SV in the first set to MV in the second set should show a large increase in task execution time due to the need to learn how to interpret the MV display, and a decreased confidence in their ability to perceive spatial relationships in the remote world. (Similar to, but slightly different from 4A.)
Figure 4.3 shows the mean trial times of each
of the two groups of subjects for all 16 trials of both video
conditions. The left half of the figure shows the results of subjects
completely naive to telemanipulation. Although they have had some
practice using the telerobot with direct view (from an "outside-in"
perspective), they have not yet had any experience using the telerobot
with remote viewing (an "inside-out" perspective). The right half of
the figure shows the mean trial times of the same two groups, now
experienced in one video condition, performing the same task using the
other video condition. In other words, the five subjects who started
with the SV continued using the MV system (grey hollow boxes), while
the four subjects who started with MV continued using the SV system
(black solid circles).

Notice that the mean trial times for the first trial for both groups of subjects are approximately the same (18.5 seconds), but that those subjects using SV appear to show a considerable reduction in trial time by the third trial, while those subjects using MV do not show the same improvement until the fifth trial. This tends to support hypotheses 1 and 2 (A and B) as stated in section 4.3.3. This difference is significant at the 10% level (F(1,7)=3.884, p=.089).
An analysis of variance (anova) on the full factorial balanced design, using
Subject as a blocking factor, Order of video system used (MF or SF) as a
between-groups factor, Video system used (MV or SV) as a within-groups factor,
and Trial Number ("Learn16") as a within-groups factor, is given in Table
4.1.
FACTOR:  Subject  Order    Video   Learn16  Time
LEVELS:  9        2        2       16       288
TYPE:    RANDOM   BETWEEN  WITHIN  WITHIN   DATA

SOURCE            SS   df          MS        F      p
=========================================================
mean      60581.0078    1  60581.0078  607.614  0.000 ***
S/O         697.9219    7     99.7031
Order       339.7891    1    339.7891    3.408  0.107
S/O         697.9219    7     99.7031
Video       477.9180    1    477.9180   12.176  0.010 *
VS/O        274.7500    7     39.2500
OV          252.9258    1    252.9258    6.444  0.039 *
VS/O        274.7500    7     39.2500
Learn16     528.1641   15     35.2109    1.866  0.035 *
LS/O       1981.0000  105     18.8667
OL          475.6172   15     31.7078    1.681  0.066
LS/O       1981.0000  105     18.8667
VL          599.9102   15     39.9940    1.908  0.030 *
VLS/O      2201.1719  105     20.9635
OVL         248.8203   15     16.5880    0.791  0.685
VLS/O      2201.1719  105     20.9635

The marked difference between the two different sets of trials (as seen in Figure 4.3) and the results of anova suggest it would be appropriate to divide the data into the two separate sets for further analysis.
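As a reading aid for Table 4.1, each F ratio is the mean square of an effect divided by the mean square of the error term listed directly beneath it, where each mean square is the sum of squares divided by its degrees of freedom. The following sketch recomputes the F ratios from the SS and df values printed in the table; the function and variable names are hypothetical, and p-values are not recomputed here.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# (SS effect, df effect, SS error, df error, F as reported in Table 4.1)
TABLE_4_1 = {
    "Order":   (339.7891, 1, 697.9219, 7, 3.408),
    "Video":   (477.9180, 1, 274.7500, 7, 12.176),
    "OV":      (252.9258, 1, 274.7500, 7, 6.444),
    "Learn16": (528.1641, 15, 1981.0000, 105, 1.866),
    "OL":      (475.6172, 15, 1981.0000, 105, 1.681),
    "VL":      (599.9102, 15, 2201.1719, 105, 1.908),
    "OVL":     (248.8203, 15, 2201.1719, 105, 0.791),
}

for source, (ss_e, df_e, ss_err, df_err, reported) in TABLE_4_1.items():
    computed = f_ratio(ss_e, df_e, ss_err, df_err)
    print(f"{source:8s} computed F = {computed:.3f}  reported F = {reported:.3f}")
```

Note that the between-groups factor (Order) is tested against the subjects-within-order term (S/O), while each within-groups effect is tested against its own interaction with subjects.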
An analysis of variance on the trial times for the first set, using subject as
a blocking factor, and using Trial Number (Learn16) and Video system used as
treatments, is presented in Table 4.2.
FACTOR:  Subject  VidPart1  Learn16  TimePart1
LEVELS:  9        2         16       144
TYPE:    RANDOM   BETWEEN   WITHIN   DATA

SOURCE            SS   df          MS        F      p
=========================================================
mean      33672.2500    1  33672.2500  374.315  0.000 ***
S/V         629.6992    7     89.9570
VidPart      12.8008    1     12.8008    0.142  0.717
S/V         629.6992    7     89.9570
Learn16     445.7500   15     29.7167    1.095  0.370
LS/V       2848.8984  105     27.1324
VL          458.6016   15     30.5734    1.127  0.342
LS/V       2848.8984  105     27.1324

Considering the results of the first set of 16 trials in Figure 4.3, where the subjects have a little previous experience in controlling the telerobot, but none in using the display system for telerobotic control, we find no significant difference in the overall mean trial times of the MV and the SV conditions, nor any evidence of significant learning effects. This is contrary to our expectations (1A and 1B above). One possible explanation is that the change between the "outside-in" training and the "inside-out" performance of the first set was much more significant than was expected, and resulted in a great deal of variability. A second possibility is that the subjects were being very cautious during their first set of trials, in order to avoid making any errors, since the importance of avoiding errors was stressed at the start of the experiment. If this were the case, then there need not be much difference in trial time at all. This issue will be discussed further in the next chapter.
Considering the results of the second set of trials, where the subjects are no
longer totally naive with telemanipulation, but are naive with that
particular display type, we find the following results:
FACTOR:  Subject  VidPart2  Learn16  TimePart2
LEVELS:  9        2         16       144
TYPE:    RANDOM   BETWEEN   WITHIN   DATA

SOURCE            SS   df          MS        F      p
=========================================================
mean      27087.6738    1  27087.6738  552.854  0.000 ***
S/V         342.9727    7     48.9961
VidPart     878.9160    1    878.9160   17.938  0.004 **
S/V         342.9727    7     48.9961
Learn16     351.9961   15     23.4664    1.848  0.037 *
LS/V       1333.2754  105     12.6979
VL          596.1660   15     39.7444    3.130  0.000 ***
LS/V       1333.2754  105     12.6979

Considering these results regarding the second set of trials in Figure 4.3, we find that there is a significant difference in the overall mean trial time between the two video conditions, and that each video condition shows very different kinds of learning. There is considerably less variability, and the trends are much more obvious. The initial difference in performance and the different learning rates suggested in Hypotheses 1 and 2 (A and B) are apparent.
Considering the possible role that subject confidence might play, we see that the MF Group, who switched from MV to SV, shows a significant decrease in task completion time: the final trial of the first set was completed in a mean time of 15.25 seconds, while the first run of the second SV set was completed in a mean time of only 7.75 seconds; i.e. mean trial time was cut almost in half. This clearly supports expectation 3B and belies expectation 3A, lending credence to the argument of the important role which operator confidence plays in these results. Furthermore, the small upward trend in the SV results in the second set of trials may be due to the confidence level of the MF Group subjects, which is perhaps exaggerated as they switch to an SV display, returning to a more appropriate level.
(Examining the error data provides little insight into this issue, since the error rates were low, and no significant factors were found. On the other hand, this suggests that although the MF Group experienced an increase in confidence when switching from the MV to the SV, they were not over-confident, that is, they did not decrease their task completion times at the expense of increased errors.)
The SF Group, who switched from SV to MV, shows a large increase in task completion time: the final trial of the first SV set was completed in a mean time of 11.6 seconds, while the first trial of the second MV set was completed in a mean time of 23 seconds. The mean trial time was almost doubled, and was more than 5 seconds slower than the corresponding first MV trial in the first set (for the MF Group). (This difference in the first trial times of the two different MV sets is not significant, however: F(1,7)=1.337, p=.286.)
Learning of two kinds was expected: (i) learning how to control the RMI, and (ii) learning how to use the display. The first kind of learning is a global type of learning, and can be expected to carry over from the first set of trials to the second set. That such learning did take place is suggested by the decreased variability of the results in the second set of trials. In light of this, the fact that for the first few trials the MV performance in the second set is at least as poor as the MV performance in the first set is contrary to Hypothesis 4A, but in agreement with Hypothesis 4B. This suggests that, in addition to having to learn how to use the MV display, the SF Group did indeed suffer from a loss of confidence in their abilities, which temporarily made their performance even worse than it should have been.
Note that this effect of confidence (if indeed real, as the data suggest) in this situation is small and transient, and is in addition to the need to learn how to use the MV display.
The results of this experiment support Hypothesis 2 (A and B), that subjects using SV will rapidly settle down to a steady-state performance, with or without an initial learning phase, and that subjects using MV will show a much more gradual improvement before reaching a steady-state performance. In the first set of trials, the SV results show a very quick improvement, while in the second, they appear close to a steady-state performance from the beginning. The MV results in both sets show a much more gradual approach to a steady-state performance.
The results provide support for Hypothesis 3B, which suggests that subjects switching from MV to SV will show a marked improvement in performance due to an increase in confidence, and belie Hypothesis 3A, which suggests that performance should not change, since the added binocular cues are largely redundant and not necessary for the task at hand.
The results also provide support for Hypothesis 4B, which says that subjects switching from SV to MV will show a large increase in task execution time due both to a need to re-interpret the display and decreased confidence in their perception of the remote world. The results are somewhat in contradiction to Hypothesis 4A, which suggested that task execution times in the second set of trials should be shorter than the task execution times in the first set of trials due to a learning of the factors involved in controlling the telerobot.
In general, then, this experiment confirmed the hypothesis that SV can be used with little or no training, while considerably more training is necessary to use MV displays. The experiment also demonstrated the important role subject confidence can play in performance.
In particular, the second experiment was designed to examine the issue of skill acquisition within the context of a highly repeatable task. Furthermore, in order to see if the benefits of SV were dependent on the difficulty of the task, the highly repeatable task was designed so as to have well-calibrated different difficulty levels.
In order to fulfil the desired characteristics, the task was based on the procedure for using the X-Ray Unit for EOD (see section 1.1.3), using a Fitts' Law approach (Wickens, 1984) to control and calibrate the level of difficulty.
The task was to drive the RMI a distance of 3 metres forward, and to lower the mock "X-Ray plate" between two "bombs", set a particular distance apart. The X-Ray plate was simulated by using the pointer from Experiment 1, but hanging suspended from the end of the RMI's forearm, so that it could swing freely. The two "bombs" were flat black briefcases.
The operators began each condition with the forearm of the RMI pointed upward, so that the target was not visible on the monitor screen. The operators had to lower the arm until the target was visible, drive forward until the hanging pointer was between the two briefcases, and finally lower the forearm until the buzzer on the pointer sounded, indicating the end of the trial. (See Figure 5.1)
The separation of the suitcases for the training session was 24 cm. Employing a Fitts' Law paradigm, the separation between the suitcases was varied to control the difficulty of the task. The separations used for the experiment were 8 cm, 16 cm, 32 cm, and 64 cm. By the log2() relationship between the separation and the Fitts' Law Index of Difficulty, each separation is one log unit more difficult than the next larger one, and trial completion time should be a linear function of the Index of Difficulty.
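The one-log-unit spacing of the four separations can be checked with a short sketch. It assumes the common formulation of the Index of Difficulty, ID = log2(2A/W), with the suitcase separation taken as the target width W. The movement amplitude A of 300 cm is an illustrative assumption based on the 3 metre drive, not a value taken from this experiment; the absolute ID values therefore carry no authority, but the difference between adjacent separations does not depend on A.

```python
import math

ASSUMED_AMPLITUDE_CM = 300.0  # assumption for illustration only
SEPARATIONS_CM = (8, 16, 32, 64)  # separations used in Experiment Two


def index_of_difficulty(amplitude_cm, width_cm):
    """Fitts' Law Index of Difficulty in bits, ID = log2(2A / W)."""
    return math.log2(2.0 * amplitude_cm / width_cm)


ids = [index_of_difficulty(ASSUMED_AMPLITUDE_CM, w) for w in SEPARATIONS_CM]
for width, difficulty in zip(SEPARATIONS_CM, ids):
    print(f"separation {width:2d} cm -> ID = {difficulty:.3f} bits")

# Halving the separation adds exactly one bit, whatever amplitude is assumed.
steps = [narrow - wide for narrow, wide in zip(ids, ids[1:])]
print(steps)
```

Because the separations double from one level to the next, a linear relationship between trial time and Index of Difficulty would appear as equal time increments between adjacent separations.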
The subjects were told that the "bombs" were touch sensitive, so that touching either suitcase would be counted as an error. This was done because pilot studies revealed that errors such as accidentally touching either suitcase would very often require the operator to make complex recovery actions. More importantly, the design of the control panel for the RMI was ergonomically poor: the toggle switch controlling the movement of the forearm of the RMI was upside down with respect to stereotypes of control-response compatibility of most of the subjects. When most subjects in the pilot studies and in this experiment made an error such as lowering the arm onto one of the suitcases rather than between them, thus sounding the buzzer prematurely, their quick, instinctive response was to pull the toggle switch towards themselves, in the hope of raising the forearm of the robot. Unfortunately, this control action causes the RMI's forearm to lower even further, resulting in a disruption of the experimental apparatus. Because of the massive interference in task execution times caused by errors, they had to be considered separately from the successful runs. Therefore, every time a subject made an error, an additional run was added so that the total number of successful error-free runs was constant for all subjects.
Note that this poorly designed interface did not cause significantly
more errors; rather, it interfered with recovery from other errors.
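The rule of adding a run after every error, so that each subject completes a fixed number of error-free runs, can be sketched as follows. The function name and the example outcome sequence are hypothetical and are not data from this experiment.

```python
def trials_until_quota(outcomes, quota):
    """Return the total number of trials attempted by the time `quota`
    error-free trials have been completed.

    `outcomes` is a sequence of booleans in the order the trials were run:
    True for an error-free trial, False for a trial with an error."""
    successes = 0
    for attempted, error_free in enumerate(outcomes, start=1):
        if error_free:
            successes += 1
            if successes == quota:
                return attempted
    raise ValueError("the quota of error-free trials was not reached")


# Hypothetical subject: two errors while completing four error-free trials.
example_outcomes = [True, False, True, True, False, True]
total_trials = trials_until_quota(example_outcomes, quota=4)
error_trials = total_trials - 4
print(total_trials, error_trials)
```

The number of error trials obtained in this way is the quantity plotted as "number of trials with errors made while completing" a fixed number of error-free trials.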

In Part A, the factors being examined were (1) task learning, by having the subjects repeat the same task 16 times consecutively; (2) video system, either SV or MV; and (3) task difficulty and SV-dependence, with the briefcases separated by one of four distances (8 cm, 16 cm, 32 cm, 64 cm). A full-factorial design was used, so that each subject performed 16 * 2 * 4 error-free trials, plus an unknown (at the design stage of the experiment) number of trials with errors. This meant that subjects with a high error rate performed more runs in total than those subjects with low error rates. Although this may influence the results somewhat, since some subjects therefore had more experience and practice than the others, it was felt that the additional experience from making errors would not contribute a great deal to the performance of the task, since errors were so disruptive.
In Part B, the factors being examined were (1) video system, either SV or MV; and (2) task difficulty and SV-dependence, with one of the four separations. Each of the 2 * 4 conditions was repeated 8 times, so each subject performed 2 * 4 * 8 error-free trials in Part B of the experiment. Again, those subjects with high error rates performed more trials than those with low error rates.
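The number of error-free trials per subject implied by the two full-factorial designs can be enumerated directly. This is a minimal sketch; the variable names are hypothetical, and trials with errors are excluded because their number was not fixed in advance.

```python
from itertools import product

VIDEO_SYSTEMS = ("MV", "SV")
SEPARATIONS_CM = (8, 16, 32, 64)

# Part A: every video system x separation combination, repeated 16 times.
part_a_trials = list(product(VIDEO_SYSTEMS, SEPARATIONS_CM, range(1, 17)))

# Part B: every video system x separation combination, repeated 8 times.
part_b_trials = list(product(VIDEO_SYSTEMS, SEPARATIONS_CM, range(1, 9)))

print(len(part_a_trials))  # 16 * 2 * 4 error-free trials
print(len(part_b_trials))  # 2 * 4 * 8 error-free trials
```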
It took between two and three hours to complete both Part A and B using a single video condition, so each subject participated in the experiment on two separate non-consecutive days, performing both Part A and then Part B on each day, using the same video condition (MV or SV).
At the beginning of the experiment, each subject received a certain amount of training in order to ensure that all started with approximately the same ability. Throughout the course of Part A of the experiment, the subjects were practising the task, and were most likely considerably more skilled for Part B of the experiment. On the second day of the experiment, they were presumably even more skilled.
By using a balanced design, it was expected that the grand effect of experience throughout the experiment could be averaged out. Within each group this is a reasonable expectation. However, given the results of Experiment 1, it is necessary to consider the two different groups of subjects as distinct, and their results should not be pooled without first establishing whether or not transfer effects are relevant.
Part A of Experiment Two had the following factors:
* Order (2 levels: MF ("Mono First") and SF): between groups
* Video (2 levels: MV ("Mono Video") and SV): within groups
* Difficulty (4 levels): within groups
* Learning (16 levels): within groups
Part B of Experiment Two had the following factors:
* Order (2 levels: MF and SF): between groups
* Video (2 levels: MV and SV): within groups
* Difficulty (4 levels): within groups
* Replication (8 repetitions in a randomised order)
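As an illustration of the Part B presentation within a single session, the following sketch builds one randomised order of the four separations with eight repetitions each. How the randomisation was actually generated is not stated here, so the unconstrained shuffle, the function name, and the seed are all assumptions.

```python
import random

SEPARATIONS_CM = (8, 16, 32, 64)  # the four briefcase separations
REPETITIONS = 8                   # error-free repetitions per separation in Part B


def part_b_order(seed=None):
    """Return one randomised presentation order for a single Part B session.

    Assumption: a plain shuffle of all 4 x 8 trials with no further constraints.
    """
    rng = random.Random(seed)
    order = [sep for sep in SEPARATIONS_CM for _ in range(REPETITIONS)]
    rng.shuffle(order)
    return order


order = part_b_order(seed=1)
print(len(order), order[:6])
```

Trials with errors had to be repeated, so a real session would contain more than these 32 presentations.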
This was followed by a familiarisation period with the controls of the RMI. The subjects were given an opportunity to drive the RMI around the laboratory for several minutes, until they were able to manoeuvre the telerobot comfortably. The RMI was then placed in the standard starting position, and the briefcases were set to the training separation of 24 cm. Using direct view, the experimental task was demonstrated to the subjects by the experimenter. Still using direct view, the subjects were then made to practice the experimental task, with coaching on technique by the experimenter, until they were able to perform four consecutive trials in under six seconds.
Each trial was timed by the experimenter using a stopwatch, and began with a 3 second countdown. On the word "Go", the subjects would begin lowering the forearm of the RMI, and driving the robot towards the briefcases. As they approached the briefcases, the subjects would slow down, and adjust the height of the forearm to avoid accidentally touching them. When satisfied that the pointer was hanging between the two briefcases, the subjects then lowered the robot arm until a buzzer sounded. The pointer was spring loaded, and would sound the buzzer when sufficiently compressed. Suspended between the two briefcases, out of view of the subjects, was a flat board for the pointer to touch. Without this board, the arc described by the lowering of the robot arm would cause the pointer to accidentally touch a briefcase when the 8 cm separation was used, resulting in an error, unless the position of the RMI was continually adjusted. In order to avoid this problem, the arc angle was reduced sufficiently to minimise the unwanted displacement of the pointer.
When the subjects were able to perform four consecutive error-free trials in under six seconds using direct view, they repeated the above familiarisation period and training procedure using the remote view with the training separation of 24 cm, using the same training criteria: four consecutive error-free trials with trial times less than six seconds.
Using a three second countdown to mark the start of the trial, the subjects would drive the RMI forward and lower the pointer, hopefully between the briefcases, until the buzzer sounded. The subjects repeated the task until a total of 16 successful trials for each of the four separations were complete. They had a short break of approximately one minute between sets of trials.
When the subjects completed all four separations, that is, 64 successful trials, they would be given a longer break, of approximately five minutes, to rest their eyes and their hands. A few subjects complained of tension and eyestrain, and they were given longer breaks.
Index of Difficulty = ID = log2(2 * distance / width),
where distance means the distance from the starting point to the middle of the target, and width refers to the width of the target.
Fitts said that the Movement Time (MT) is a linear function of the Index of Difficulty:
Movement Time = MT = a + b * ID
Much research has been done to improve and extend Fitts' work, but his original
formulation has proven robust and applicable to a wide variety of tasks. To
that end, the separation of the targets is translated into Index of Difficulty
bits:
Width    Distance    Index of Difficulty (ID)
8 cm     3 m         6.2 bits
16 cm    3 m         5.2 bits
32 cm    3 m         4.2 bits
64 cm    3 m         3.2 bits
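The tabulated values can be reproduced directly, assuming the index of difficulty is log2(2 * distance / width) with the 3 m approach distance and the four target widths listed above. The function names below are illustrative, and the coefficients a and b of the movement-time relation are left as parameters because no fitted values are given in this section.

```python
import math

DISTANCE_CM = 300.0            # 3 m from the starting point to the target
WIDTHS_CM = (8, 16, 32, 64)    # the four target separations


def index_of_difficulty(distance_cm, width_cm):
    """Fitts' Index of Difficulty in bits: log2(2 * distance / width)."""
    return math.log2(2.0 * distance_cm / width_cm)


def movement_time(a, b, id_bits):
    """Fitts' linear relation MT = a + b * ID; a and b must be fitted to data."""
    return a + b * id_bits


ids = {w: round(index_of_difficulty(DISTANCE_CM, w), 1) for w in WIDTHS_CM}
for width, bits in ids.items():
    print(f"{width:>2} cm  {bits} bits")
```

Because the distance is fixed, each doubling of the target width lowers the index of difficulty by exactly one bit.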
It is important to stress that in this experiment both the trial execution time and the number of faulty trials are relevant measures of performance. This is a task subject to a speed-accuracy trade-off. In order to avoid making the subjects very cautious, the importance of speed was emphasized, while on the other hand the subjects were cautioned about the "dangers" of making errors (not the least of which was to have to repeat the trial). Because it is impossible to know exactly where on the speed-accuracy trade-off curve the subjects are at any given moment, it is important to examine both sets of results.
A previous study investigating the effects of degraded image quality on a telerobotic task using MV and SV found that, for the easiest condition, those subjects using MV showed considerable learning, while those using SV displayed no clear learning effect (Pepper et al, 1981). The authors argue that since the task was designed to have very strong monocular cues (i.e. to be near the SV-independent end of the spectrum), it is not unexpected that SV shows no improvement; the main improvement for the MV condition is likely through acquiring skill in using the MV display, a skill the subjects already had for SV displays.
Regarding the important role that confidence played in the first experiment, it was felt that performing the two experimental video conditions on two separate days would diminish this effect, and that any remaining confidence effects, seen to be fairly transient and not very large in Experiment One, would likely be "washed out" during the second day's training period.
And finally, since in Part A of the experiment the subjects would know the target separation for all trials save the first, owing to the consecutive presentation, it was felt that performance would be somewhat better than in Part B of the experiment, where the subjects would have to work out the target separation on every trial.
Based on these ideas and those explored in previous chapters, the hypotheses posed for this experiment were:
Experiment Two Hypotheses
1. Subjects using SV will show an initial performance advantage over those using MV.
2. The performance difference between MV and SV will decrease as the subjects become more experienced, more so for the low difficulty task conditions than the high difficulty task conditions.
3. Performance will be better during the repetitive trials of Part A than the random trials of Part B.
An analysis of variance of these data is shown in
Table 5.2. The three-way interaction of Order, Day and Video is
significant at the 10% level (F(1,6)=4.402, p=.081). As we can see
from Figure 5.2, the only difference between
the two groups of subjects occurs on the first day, when the subjects
are first learning how to use the telerobotic device remotely. Here we
see that those subjects using SV take an average of 16 trials to pass
the training criteria, while those using MV take an average of 28.
This is clear support for Hypothesis 1, and suggests that SV can be of
great aid for novices to telerobotics.

FACTOR:   Subject   Order     Day       View      Num
LEVELS:   8         2         2         2         32
TYPE  :   RANDOM    BETWEEN   WITHIN    WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean       7657.0313    1    7657.0313   84.813  0.000 ***
S/O         541.6875    6      90.2813
Order        52.5313    1      52.5313    0.582  0.474
S/O         541.6875    6      90.2813
Day         935.2813    1     935.2813    6.044  0.049 *
DS/O        928.4375    6     154.7396
OD           16.5313    1      16.5313    0.107  0.755
DS/O        928.4375    6     154.7396
View         52.5313    1      52.5313    0.734  0.424
VS/O        429.1875    6      71.5313
OV           94.5313    1      94.5313    1.322  0.294
VS/O        429.1875    6      71.5313
DV            0.7813    1       0.7813    0.017  0.901
DVS/O       279.4375    6      46.5729
ODV         205.0313    1     205.0313    4.402  0.081
DVS/O       279.4375    6      46.5729
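In the analysis of variance table above, each effect is listed together with its error term, and the F statistic is the ratio of the two mean squares. As a check on how the table is read, the sketch below recomputes the F value for the Order by Day by View interaction from the sums of squares and degrees of freedom given for ODV and DVS/O. The helper names are illustrative, and the p value is not recomputed because that would require an F distribution routine outside the standard library.

```python
def mean_square(sum_of_squares, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df


def f_ratio(ms_effect, ms_error):
    """F statistic = effect mean square divided by error mean square."""
    return ms_effect / ms_error


# Values copied from the ODV row and its DVS/O error row.
ms_odv = mean_square(205.0313, 1)
ms_err = mean_square(279.4375, 6)
f_odv = f_ratio(ms_odv, ms_err)
print(round(ms_err, 4), round(f_odv, 3))
```

The same calculation applies to every effect and error pair in the tables of this chapter.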


FACTOR:   Subject   Order     Video     Sep       Trial     Time
LEVELS:   8         2         2         4         16        1024
TYPE  :   RANDOM    BETWEEN   WITHIN    WITHIN    WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean      22391.0937    1   22391.0937  623.746  0.000 ***
S/O         215.3867    6      35.8978
Order        79.3242    1      79.3242    2.210  0.188
S/O         215.3867    6      35.8978
Video       195.6973    1     195.6973   31.930  0.001 **
VS/O         36.7734    6       6.1289
OV          140.9453    1     140.9453   22.997  0.003 **
VS/O         36.7734    6       6.1289
Sep        1035.9531    3     345.3177   22.580  0.000 ***
SS/O        275.2695   18      15.2928
OS            3.8711    3       1.2904    0.084  0.968
SS/O        275.2695   18      15.2928
VS           46.0098    3      15.3366    5.640  0.007 **
VSS/O        48.9473   18       2.7193
OVS           1.0469    3       0.3490    0.128  0.942
VSS/O        48.9473   18       2.7193
Trial        47.6445   15       3.1763    2.038  0.021 *
TS/O        140.2969   90       1.5589
OT           22.5586   15       1.5039    0.965  0.498
TS/O        140.2969   90       1.5589
VT           17.4219   15       1.1615    0.954  0.509
VTS/O       109.5293   90       1.2170
OVT          22.4160   15       1.4944    1.228  0.266
VTS/O       109.5293   90       1.2170
ST           54.6797   45       1.2151    1.107  0.306
STS/O       296.2344  270       1.0972
OST          36.6465   45       0.8144    0.742  0.886
STS/O       296.2344  270       1.0972
VST          48.6582   45       1.0813    0.920  0.620
VSTS/O      317.2793  270       1.1751
OVST         58.1074   45       1.2913    1.099  0.318
VSTS/O      317.2793  270       1.1751

Considering for a moment just the Trial Times of Experiment Two Part A, we find that on the first day of the experiment (Figure 5.3), those subjects using SV performed consistently better than those using MV, at all difficulty levels (target sizes). This again confirms Hypothesis 1. On the second day of the experiment (Figure 5.4) there is no apparent difference between MV and SV except at the most difficult level.
In order to observe any trends, these "noisy" data are grouped into four sets
of four trials, as with Experiment 1, and presented below in Figures 5.5 and
5.7. Furthermore, since subjects are able to trade speed for accuracy, the
corresponding error rates are shown in Figures 5.6 and 5.8.
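The grouping step described here is a simple block average: the 16 consecutive trials of a condition are split into four groups of four, and each group is summarised by its mean. The sketch below illustrates this on made-up trial times, which are purely hypothetical and not data from the experiment; the function name is also an assumption.

```python
def group_means(trial_times, group_size=4):
    """Average consecutive trials in fixed-size groups (16 trials -> 4 means)."""
    if len(trial_times) % group_size != 0:
        raise ValueError("number of trials must be a multiple of the group size")
    return [
        sum(trial_times[i:i + group_size]) / group_size
        for i in range(0, len(trial_times), group_size)
    ]


# Hypothetical, steadily improving trial times in seconds (not real data).
example_times = [float(t) for t in range(16, 0, -1)]
learn4 = group_means(example_times)
print(learn4)
```

Averaging over four trials smooths trial-to-trial noise while still leaving four points per condition from which a trend can be read.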


Looking at the easiest condition, Index of Difficulty of 3.2, in Figures 5.5 and 5.6, we see that the trial times for SV are considerably shorter than those for MV. Furthermore, there is a clear downward trend in the MV times. The error rates are relatively low and approximately equal for MV and SV. The advantage of SV is decreasing throughout the set of 16 trials.
At ID = 4.2, the next harder condition, we see that the trial times for both MV and SV are slower than for the previous condition, as expected. MV shows a decreasing trial time throughout the 16 trials, with a consistent error rate. SV shows a slightly decreasing trial time, but has an increasing error rate, suggesting that the subjects are exploring the speed-accuracy trade-off more than exhibiting signs of learning. Again, the advantage of SV appears to decrease throughout the set of 16 trials.
At the Index of Difficulty level of 5.2, there is some indication of learning with MV. The second group of four trials has a decreased error rate while trial time remains the same; the third and fourth groups show fewer errors still but slower trial times, which might suggest a simple speed-accuracy trade-off, or might indicate continued improvement in performance. Those using SV show no particular learning trend: decreasing error rates are matched by slower trial times, suggesting a speed-accuracy trade-off. The advantage of SV does not appear to decrease throughout this set of 16 trials, unlike the previous two conditions.
At the Index of Difficulty level of 6.2, both MV and SV show a strong learning trend in the error rate. The MV time drops for the second group of four trials, and then rises for the last two groups, while the error rate continues to drop, suggesting some exploration of the speed-accuracy trade-off. SV shows a small decrease in trial time with a large drop in error rate at first, followed by constant times and an increasing error rate. This could be an indication of fatigue, or simply a statistical artifact. Again, the SV advantage does not appear to decrease throughout this set of 16 trials.
In summary, then, we find that for the easier conditions there is considerable improvement in performance (as measured by both time and error rate) for the MV condition, with very little change in the SV condition. At the higher levels of difficulty the learning trends are less obvious for both MV and SV, and there appears to be more active exploration of the speed-accuracy trade-off.
Furthermore, the advantage of SV decreases as the subjects become more experienced, more so for the easier conditions. This is consistent with hypothesis 2.
We now consider the results of the second day of the experiment. The subjects
are very much more experienced with the task on this day, albeit with the other
type of video.


At the Index of Difficulty level of 3.2, there is little difference in the times, but those using SV had a lower error rate for the first few trials. Those using MV showed some learning in both time and error rate, so that there was very little difference in performance by the end of the set of 16 trials.
At the Index of Difficulty level of 4.2, those using MV showed little change. Those using SV appear to get worse briefly, then slightly better. The small SV advantage at the beginning vanishes by the end of the 16 trials.
At the Index of Difficulty level of 5.2, those using MV curiously perform considerably better during the first set of four trials than the rest, getting considerably worse and then a little better. This suggests that the initial set of trials was unusually good, a statistical anomaly. Those using SV get consistently better, trading off time for errors a little. Although the performance of MV is better than SV at first, this is quickly reversed to the expected situation, with SV performance being consistently better than MV. As on day 1, the SV advantage does not appear to decrease with experience.
At the Index of Difficulty (ID) level of 6.2, those using MV show consistent learning, predominantly in error rates. Those using SV performed better at first than in the rest of the set of trials, similar to the MV performance at ID = 5.2. Here the SV advantage does decrease, but does not vanish, with experience.
FACTOR:   Subject   Video     Difficulty   Learn4    Day1Times
LEVELS:   8         2         4            4         128
TYPE  :   RANDOM    BETWEEN   WITHIN       WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean       3260.7100    1    3260.7100  442.966  0.000 ***
S/V          44.1665    6       7.3611
Video        65.5120    1      65.5120    8.900  0.025 *
S/V          44.1665    6       7.3611
Diff        123.6892    3      41.2297   13.903  0.000 ***
SS/V         53.3777   18       2.9654
VD            3.3704    3       1.1235    0.379  0.769
SS/V         53.3777   18       2.9654
Learn4        7.9934    3       2.6645    3.746  0.030 *
LS/V         12.8037   18       0.7113
VL            2.2695    3       0.7565    1.064  0.389
LS/V         12.8037   18       0.7113
DL            4.4180    9       0.4909    0.927  0.510
SLS/V        28.5981   54       0.5296
VDL           3.0137    9       0.3349    0.632  0.764
SLS/V        28.5981   54       0.5296
FACTOR:   Subject   Video     Difficulty   Learn4    Day2Times
LEVELS:   8         2         4            4         128
TYPE  :   RANDOM    BETWEEN   WITHIN       WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean       2372.3132    1    2372.3132  754.210  0.000 ***
S/V          18.8726    6       3.1454
Video         3.2312    1       3.2312    1.027  0.350
S/V          18.8726    6       3.1454
Sep         135.5498    3      45.1833   29.387  0.000 ***
SS/V         27.6753   18       1.5375
VS            9.1101    3       3.0367    1.975  0.154
SS/V         27.6753   18       1.5375
Learn4        2.7700    3       0.9233    2.103  0.136
LS/V          7.9026   18       0.4390
VL            1.5083    3       0.5028    1.145  0.358
LS/V          7.9026   18       0.4390
SL            2.0579    9       0.2287    1.022  0.435
SLS/V        12.0869   54       0.2238
VSL           1.8606    9       0.2067    0.924  0.512
SLS/V        12.0869   54       0.2238

These tables confirm the statistical significance of our observations. There is indeed a consistent benefit from SV on Day 1 at all difficulty levels, seen in reduced trial times, but there is no indication of a similar benefit in the trial times on Day 2.
FACTOR:   Subject   Video     Difficulty   Learn4    #ErrDay1
LEVELS:   8         2         4            4         128
TYPE  :   RANDOM    BETWEEN   WITHIN       WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean        634.5703    1     634.5703   44.313  0.001 ***
S/V          85.9219    6      14.3203
Video         2.8203    1       2.8203    0.197  0.673
S/V          85.9219    6      14.3203
Difficu     432.5234    3     144.1745   18.685  0.000 ***
DS/V        138.8906   18       7.7161
VD           25.5234    3       8.5078    1.103  0.374
DS/V        138.8906   18       7.7161
Learn4       40.3984    3      13.4661    3.493  0.037 *
LS/V         69.3906   18       3.8550
VL            3.1484    3       1.0495    0.272  0.845
LS/V         69.3906   18       3.8550
DL           54.3828    9       6.0425    1.721  0.107
DLS/V       189.5469   54       3.5101
VDL          17.8828    9       1.9870    0.566  0.819
DLS/V       189.5469   54       3.5101
FACTOR:   Subject   Video     Difficulty   Learn4    #ErrDay2
LEVELS:   8         2         4            4         128
TYPE  :   RANDOM    BETWEEN   WITHIN       WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean        381.5703    1     381.5703   96.971  0.000 ***
S/V          23.6094    6       3.9349
Video        13.1328    1      13.1328    3.338  0.117
S/V          23.6094    6       3.9349
Difficu     227.8984    3      75.9661   13.330  0.000 ***
DS/V        102.5781   18       5.6988
VD           19.9609    3       6.6536    1.168  0.350
DS/V        102.5781   18       5.6988
Learn4       18.3359    3       6.1120    2.806  0.069
LS/V         39.2031   18       2.1780
VL           13.8984    3       4.6328    2.127  0.132
LS/V         39.2031   18       2.1780
DL           24.8203    9       2.7578    0.847  0.577
DLS/V       175.8594   54       3.2567
VDL          94.1328    9      10.4592    3.212  0.003 **
DLS/V       175.8594   54       3.2567

As expected from the figures, there is no consistent effect due to video, but there is a consistent difference due to the task difficulty: the harder the task, the more errors are made. There is also a significant trend, or "learning", reported for Day 1. For Day 2, the same effect is significant at the 10% level. Given the strange behaviour of MV at the ID=5.2 level and SV at the ID=6.2 level, this lowering of significance is not surprising.
In general, then, the results of Part A of the experiment strongly support Hypotheses 1 and 2.
Figure 5.9 shows the trial times of Part B of the experiment
as a function of the Index of Difficulty. Figure 5.10 shows
the corresponding error rates, where the error rate is defined
as the number of trials with errors made in the course of
completing the 8 successful trials.


Considering the second day of the experiment, we find that performance for both MV and SV is much improved, thanks to the large amount of experience received on the first day. There is no significant difference between the MV and SV conditions with regard to trial times, although those using MV appear to be somewhat faster, at the expense of higher error rates. The only significant difference between the MV and SV performance is at the highest level of difficulty, and is found in the error rate. The SV advantage decreases with experience, though less so at the higher difficulty levels.
An analysis of variance on these results gives the statistics listed in Tables 5.8 and 5.9. Again these analyses confirm the observations made from the graphs above.
In order to consider the differences in performance between Part A of the experiment and Part B of the experiment, it would be useful to combine the trial time and error data into a single graph. Furthermore, since the results of Part A exhibit changes within each set of 16 trials, it is important to consider only the "steady-state" results. For the purpose of this comparison, the last 8 trials of each set are being used.
Figures 5.12 and 5.13 show percent errors on the vertical axes, and trial time
on the horizontal axes. This way the performance of each particular
experimental condition is represented by a single point. For example, Figure
5.12 shows the performance of the subjects on both days of the experiment. The
performance, consisting of both trial time and percent errors, gets worse as it
moves further from the origin.
FACTOR:   Subject   Order     Video     Sep       BTime
LEVELS:   8         2         2         4         64
TYPE  :   RANDOM    BETWEEN   WITHIN    WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean       1461.4976    1    1461.4976  291.827  0.000 ***
S/O          30.0486    6       5.0081
Order         2.4093    1       2.4093    0.481  0.514
S/O          30.0486    6       5.0081
Video        27.4589    1      27.4589   30.033  0.002 **
VS/O          5.4857    6       0.9143
OV            0.2848    1       0.2848    0.311  0.597
VS/O          5.4857    6       0.9143
Sep          28.0372    3       9.3457   28.084  0.000 ***
SS/O          5.9901   18       0.3328
OS            0.7675    3       0.2558    0.769  0.526
SS/O          5.9901   18       0.3328
VS            0.3687    3       0.1229    1.252  0.321
VSS/O         1.7673   18       0.0982
OVS           0.8909    3       0.2970    3.024  0.057
VSS/O         1.7673   18       0.0982
FACTOR:   Subject   Order     Video     Sep       %ErrB
LEVELS:   8         2         2         4         64
TYPE  :   RANDOM    BETWEEN   WITHIN    WITHIN    DATA

SOURCE            SS   df           MS        F      p
===========================================================
mean     202500.0000    1  202500.0000   47.578  0.000 ***
S/O       25537.1093    6    4256.1851
Order       244.1406    1     244.1406    0.057  0.819
S/O       25537.1093    6    4256.1851
Video     20664.0625    1   20664.0625    7.960  0.030 *
VS/O      15576.1718    6    2596.0286
OV         3525.3906    1    3525.3906    1.358  0.288
VS/O      15576.1718    6    2596.0286
Sep      227441.4370    3   75813.8125   23.147  0.000 ***
SS/O      58955.0625   18    3275.2813
OS          322.2188    3     107.4063    0.033  0.992
SS/O      58955.0625   18    3275.2813
VS        32871.0625    3   10957.0205    4.420  0.017 *
VSS/O     44619.1250   18    2478.8403
OVS        1806.6875    3     602.2292    0.243  0.865
VSS/O     44619.1250   18    2478.8403

The hollow circles indicate the MV performance on the first day of the experiment. The hollow boxes represent the SV performance. For the three easy conditions, we see that the difference in performance is predominantly in trial time, with MV trial times being longer than SV trial times. For the difficult condition (ID=6.2), however, the large performance difference is almost entirely in the percent errors. On day 2 of the experiment, we find that the performance difference for the three easier conditions is much smaller than on the previous day, and consistently better. The difference at the most difficult level again is seen almost entirely in the error rate.
This is the same as we saw above. If we now compare these results to those shown in Figure 5.13, and look at the first day of the experiment, we find that the performance in Part A Runs 9-16 is slightly better than that in Part B except for the MV performance at the ID = 5.2 level, which is considerably worse for Part A than Part B.
For the second day of the experiment, we find that performance for the two
easiest conditions is very similar for both parts of the experiment.
Performance at the ID = 5.2 level shows a similar result to the first day,
where MV is worse for Part A, but SV is very similar on both days. Finally, at
the ID = 6.2 level, the SV is again very similar in performance, but Part A has
a marked advantage.


The anomalous behaviour at the ID=5.2 level for Part A of the experiment might possibly be due to an indecision observed in several subjects regarding how to treat the condition. Observation by the author of the subjects' behaviour during the course of the experiment suggested that many could not decide whether the 16 cm separation was easy, and should therefore be approached quickly, or difficult, and should therefore be approached slowly. As a result of trying to accomplish the task quickly, their error rate rose considerably. When approaching the same condition in Part B of the experiment, however, the subjects seemed much less uncertain about their approach, and used a fairly conservative, and much more successful, technique. Unfortunately, there is no way to verify these speculations.
Experiment Two Hypotheses
1. Subjects using SV will show an initial performance advantage over those using MV.
2. The performance difference between MV and SV will decrease as the subjects become more experienced, more so for the low difficulty task conditions than the high difficulty task conditions.
3. Performance will be better during the repetitive trials of Part A than the random trials of Part B.
The results of this experiment gave clear supporting evidence for the first two of these hypotheses. The results of the first day of trials supported Hypothesis 3 fairly consistently. However, the easy conditions of the experiment showed little difference on the second day of the experiment. This could be due to the fact that the performance of the subjects in the easy condition for both Part A and Part B was approaching the limit of the RMI, and so differences in difficulty were insignificant. At the highest level of difficulty, however, performance was consistently better during the repetitive trials of Part A than the randomly presented trials of Part B.
The first experiment conducted examined the first issue. Using a task that had very little demand for binocular depth cues (i.e. was SV-independent), it was found that there was a short-lived benefit in performance for SV that quickly vanished as the operators learned how to use the monocular cues of the MV display. Furthermore, the first experiment provided evidence to suggest that SV can be used effectively with little or no training, while MV requires a period of adjustment and learning.
The first experiment also revealed an interesting transient effect that changing from one video condition to another can have on performance. Those who change from an SV to an MV display show a temporary but dramatic drop in performance, while those who change from an MV to an SV display show a large improvement in performance. The results of the experiment and the literature suggest that the differing appearances of "reality" of the two displays may affect the confidence of the operators in their abilities to perform the task, and therefore affect their performance.
The second experiment examined the second issue, that of how the transience of the benefits of SV is a function of the difficulty of the task and the dependence on binocular depth cues. It showed that the benefits of SV, even after a great deal of practice, will still be apparent for difficult tasks, long after the benefits have faded for easier tasks.
The implications for telerobotics and EOD are obvious. Given the nature of most telerobotics applications and all EOD tasks, operators have only a very few chances to accomplish the task correctly. The performance benefits of SV, even though they fade with practice for highly repeatable tasks, should be very strongly evident in these single-attempt situations. Furthermore, given that operators can learn to use an SV display much more quickly than an MV display, operators should require less initial training and less constant practice in order to maintain their skills at a suitable level.
Aries Arditi, "Binocular Vision", Chapter 23 of Handbook of Human Perception and Performance, edited by Kenneth R Boff, Lloyd Kaufman, James P Thomas; John Wiley & Sons, New York, 1986
John Baker, "Generating images for a time-multiplexed stereoscopic computer graphics system", SPIE Vol 761 True 3D Imaging Techniques and Display Technologies, 44-52, 1987
K R Boff, and J E Lincoln, Engineering Data Compendium: Human Perception and Performance, AAMRL, Wright-Patterson AFB, Ohio, 1988
James F Butterfield "Autostereoscopy delivers what holography promised", SPIE Vol 199 Advances in Display Technology, 42-46, 1979
F Chavand, E Colle, JP Gaillard, A Mallem, JP Stomboni "Visual assistance to the operator in teleoperation and supervision situations", Proc Int Symp Teleoperation and Control, 237-248, July 1988
Robert E Clapp, "Stereoscopic Displays and the human dual visual system", SPIE Vol 624 Advances in Display Technology VI, 41-52, 1986
Robert E Clapp, "Stereoscopic Perception", SPIE Vol 761 True 3D Imaging Techniques and Display Technologies, 79-87, 1987
David Drascic, "Skill Acquisition and Task Performance in Teleoperation Using Monoscopic and Stereoscopic Video Remote Viewing", Human Factors Society 35th Annual Meeting, 1991a
David Drascic, Paul Milgram "Positioning Accuracy of a virtual stereographic pointer in a real stereoscopic video world", SPIE Vol 1457: Stereoscopic Displays and Applications II, 1991b
A A Dumbreck, C W Smith, S P Murphy "The Development & Evaluation of a Stereoscopic Television System for Use in the Nuclear Industry", Int'l Workshop on Nuclear Robotic Technologies and Applications, University of Lancaster, June/July 1987
Joel Fajans "Three-dimensional display", SPIE Vol 199 Advances in Display Technology, 23-28, 1979
S S Fisher, M McGreevy, J Humphries, W Robinett "Virtual Environment Display System", ACM 1986 Workshop on Interactive 3D Graphics, Chapel Hill, North Carolina, 1986
Allan H Frey, "An evaluation of holograms in training and as job performance aids", SPIE Vol 615 Practical Holography, 57-63, 1986
Julius J. Grodski, Paul Milgram, David Drascic "Real and virtual world stereoscopic displays for teleoperation", NATO Defence Research Group Seminar: Robotics in the Battlefield, 6-8 March 1991
John H Harshbarger "Structure of the interlaced television raster", SPIE Vol 457 Advances in Display Technology IV, 80-84, 1984
Stephen J Hart, Michael N Dalton "Display holography for medical tomography", SPIE Vol 1212 Practical Holography IV, 116-135, 1990
Edwin R Jones Jr., A Porter McLaurin, LeConte Cathey "VISIDEP (TM): visual image depth enhancement by parallax induction", SPIE Vol 457 Advances in Display Technology IV, 16-19, 1984
Won S Kim, Munehisa Takeda, L W Stark "On-the-screen visual enhancements for a telerobotic vision system", Proceedings IEEE Systems, Man, and Cybernetics Conference, 126-130, 1988
Won S Kim, F Tendick, L W Stark "Visual Enhancements in Pick-and-Place Tasks: Human Operators Controlling a Simulated Cylindrical Manipulator", IEEE Journal of Robotics and Automation, v RA-3, no 5, pp 418-425, 1987
Bruce Lane "Stereoscopic displays", SPIE Vol 367 Processing and Display of Three-Dimensional Data, 20-32, 1982
Thomas M Lippert, David L Post, Robert J Beaton "A study of direct distance estimations to familiar objects in real-space, two-dimensional, and stereographic displays", Proceedings of the Human Factors Society 26th Annual Meeting, 324-328, 1982
Lenny Lipton "Factors affecting `ghosting' in time-multiplexed plano-stereoscopic CRT display systems", SPIE Vol 761 True 3D Imaging Techniques and Display Technologies, 75-78, 1987
Lenny Lipton, Lhary Meyer "A Flicker-Free Field-Sequential Stereoscopic Video System", SMPTE Journal, v 93, n 11, 1047-1051, 1984
Colin Macilwain "Remote control robots seen through 3D spectacles", THE ENGINEER, 35, 8 June 1989
Douglas E McGovern, "Current developments needs in the control of teleoperated vehicles", SANDIA National Laboratories Report SAND87-0646 UC-15, Albuquerque, New Mexico, August 1987a
Douglas E McGovern, "Experiences in Teleoperation of Land Vehicles", SANDIA National Laboratories Report SAND87-1908 UC-15, Albuquerque, New Mexico, October 1987b
H B Meieran "Robotics and Teleoperator-Controlled Devices", Health Physics, v 55 n 2, 215-222, 1988
John O Merritt, "Visual-motor realism in 3D teleoperator display systems", SPIE Vol 761 True 3D Imaging Techniques and Display Technologies, 88-93, 1987
John O Merritt, "Visual tasks requiring 3-D stereoscopic displays", SPIE Vol 462 Optics in Entertainment II, 56-59, 1984
John O Merritt, "Often-overlooked advantages of 3-D displays", SPIE Vol 902 Three-Dimensional Imaging and Remote Sensing Imaging, 46-47, 1988
Paul Milgram, David Drascic, Julius Grodski "Enhancement of 3-D video displays by means of superimposed stereo-graphics", Human Factors Society 35th Annual Meeting, 1991
Paul Milgram, David Drascic, Julius Grodski "A Virtual Stereographic Pointer for a Real Three Dimensional World", Interact `90, Third IFIP Conference on Human-Computer Interaction, Cambridge, UK, August 1990
Paul Milgram, David Drascic, Julius Grodski "Stereoscopic Video + Superimposed Computer Stereographics: Applications in Teleoperation", Proc. Second Canadian Workshop on Military Robotic Applications, Kingston, Ontario, Aug 1989.
Paul Milgram, R van der Horst "Alternating-field stereoscopic displays using light-scattering liquid crystal spectacles", Displays: Technology & Applications, v 7, n 2, 67-72, April 1986
Dwight P Miller, "Evaluation of vision systems for teleoperated land vehicles", IEEE Control Systems Magazine, p37-41, June 1988
Donald A Norman, The Psychology of Everyday Things, Basic Books Inc, New York, 1988
Ross L Pepper, "Human Factors in Remote Vehicle Control", 30th Annual Meeting of the Human Factors Society, Dayton, Ohio, Sep-Oct 1986
Ross L Pepper, J D Hightower "Research Issues in Teleoperator Systems", Proceedings of the Human Factors Society 28th Annual Meetings, 803-807, 1984
Ross L Pepper, David C Smith, Robert E Cole "Stereo TV improves operator performance under degraded visibility conditions", Optical Engineering, v 20, n 4, 579-585, July/Aug 1981
J Rasmussen, Human Information Processing, 1986
M Robinson "Remote control vehicle guidance using stereoscopic displays", Proceedings of the HFS 28th Annual Meeting, 809, 1984
D C Smith, R E Cole, J O Merritt, R L Pepper "Remote Operator Performance Comparing Mono and Stereo TV Displays: the Effects of Visibility, Learning, and Task Factors", Naval Ocean Systems Center Technical Report No. 380, Feb 1979
Edward H Spain, A Psychophysical Investigation of the Perception of Depth with Stereoscopic Television Displays, PhD Dissertation, University of Hawaii, May 1984
Christopher D Wickens, "Three-dimensional stereoscopic display implementation: Guidelines derived from human visual capabilities", SPIE Vol 1256 Stereoscopic Displays and Applications, 2-10, 1990
Christopher D Wickens, Engineering Psychology and Human Performance, Charles E Merrill Publishing Company, Toronto, 1984
Rodney Don Williams, Felix Garcia Jr. "A Real-Time Autostereoscopic Multiplanar 3D Display System", Proceedings of the Society for Information Display, Anaheim, California, 1988