5.3.1.1 Experimental Platform
MITS, the same experimental platform as in the previous experiments,
was used in this experiment. Due to the more complex graphics
rendering, the experiment was carried out on a more powerful computer,
a SGI IRIS 4D Crimson/VGX graphics workstation, in order to maintain
a 15 Hz update rate.
5.3.1.2 Experimental Task
Since depth display is the primary concern in this chapter, a 3 DOF (instead of 6 DOF) positioning task, comprising both perception and manipulation in 3-space, was carried out both with and without the semi-transparency effect. The task is a modification of the tracking task in Experiment 3.
In each trial of the experiment, a graphically rendered angel
fish "swam" around (translating along X, Y and Z) randomly
within a 3D virtual environment (Figure 5.2). Subjects were asked
to control a 3D volume cursor (Figure 5.3) to chase the
fish, envelop it, and "grasp" it when the fish was
perceived to be completely inside the cursor. Subjects wore the
glove (Figure 2.5) designed in Experiment 1, working in isotonic
position control mode with a Control/Display ratio of 1:1. Grasping
was done simply by closing the hand naturally. If the fish was
entirely inside the cursor volume, the trial was successful and
the fish stayed "caught" within. The time score of each
trial was displayed to the subjects, along with a short beep.
If the fish was not completely inside the cursor when grasped,
the fish disappeared. In this case, which was considered a "miss", a long beep was sounded and error magnitudes in each of the x,
y, and z dimensions were displayed, along with the message "Missed!".
Each new trial was activated when the subjects pressed the spacebar
on the keyboard.
Figure 5.2 Experiment 5 set-up
Figure 5.3 The volume silk cursor. Use of a "silk" covering over a rectangular volume cursor in order to obtain occlusion-based depth cues. An object at point A is seen through two layers of "silk", and thus is perceived to be behind the volume cursor. An object at point B is seen through one layer, and thus is perceived as inside the cursor's volume. An object at point C is not occluded by the silk at all, and so is seen to be in front of the volume cursor.
Although presented as a game (which incidentally was greatly enjoyed
by the subjects), the 'virtual fishing' task is essentially
a 3D dynamic target acquisition task. Note that simply to select,
or designate, a target in a 3D space does not necessarily require
much depth information. That is, using a conventional 2D mouse
cursor, any target, be it 2D or 3D, can be easily selected
by clicking on its projection on the 2D screen plane. The purpose
of the fishing task was to require the subject to locate the
target precisely in 3-space, a capability essential for many 3D
interaction tasks.
5.3.1.2. The Targets and Their Motion
Each of the targets ('angel fish') used in this experiment had a flat body, except for two fins and two eyes protruding from the body (see Figure 5.1 and Figures 5.4 to 5.7). The angle between any fin and the body was 30 degrees. The size and colour of the fish changed from trial to trial, in order to eliminate size constancy cues in the experiment. The x (from lips to tail), y (vertical) and z (from left fin tip to right fin tip) dimensions of the largest ('adult') fish were 10 cm, 15 cm and 1.3 cm respectively. The smallest ('baby') fish was 30 percent of the size of the largest adult fish.
Figure 5.4 A fish and the wireframe cursor
Figure 5.5 A fish in front of the silk cursor
Figure 5.6 A fish behind the silk cursor
Figure 5.7 A fish completely inside of the cursor
The fish movements were driven by independent forcing functions in the x, y and z dimensions. In this experiment, the particular forcing functions applied to the fish motions were:
[equation missing]
where t was the time from the beginning of each test (see section
5.3.3 on experimental design and procedure for the definition
of a test), A = 4.55 cm, p = 2, and fo = 0.02 Hz. The phase terms
(i = 0, 1, ..., 5) were pseudo-random numbers, ranging uniformly
between 0 and 2π. This design resulted in
fish motions which were sufficiently unpredictable to the subjects
and different from trial to trial, but repeatable for each test
and between experimental conditions.
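The forcing-function equation itself is not legible in this copy, so the following is only a sketch of how such a motion generator could look. It assumes a sum-of-sinusoids form in which term i has amplitude A·p^(-i) and frequency fo·p^i; that form, the function names and the seeding scheme are all assumptions, and only A, p, fo, the index range and the uniform phase distribution come from the text.

```python
import math
import random

A = 4.55      # cm, amplitude constant given in the text
P = 2.0       # ratio constant p given in the text
F0 = 0.02     # Hz, base frequency fo given in the text
N_TERMS = 6   # index i = 0, 1, ..., 5 as in the text


def make_phases(seed):
    """Draw phases uniformly between 0 and 2*pi for each axis and term.

    A fixed seed makes the motion repeatable for a given test
    (hypothetical seeding scheme, not taken from the source).
    """
    rng = random.Random(seed)
    return [[rng.uniform(0.0, 2.0 * math.pi) for _ in range(N_TERMS)]
            for _ in range(3)]


def forcing(t, axis_phases):
    """ASSUMED form: sum over i of A * p**(-i) * sin(2*pi*fo*p**i*t + phase_i)."""
    return sum(
        A * P ** (-i) * math.sin(2.0 * math.pi * F0 * P ** i * t + axis_phases[i])
        for i in range(N_TERMS)
    )


def fish_position(t, phases):
    """Independent x, y, z displacements (cm) at time t (s) into a test."""
    return tuple(forcing(t, axis_phases) for axis_phases in phases)
```

Reusing the same seed for a given test across the four conditions would reproduce the property described above: motion that looks unpredictable within a test but is identical between conditions.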
5.3.1.3 The Cursor and the Input
The cursor used to capture the fish was a rectangular box of size 11.3 cm, 16.3 cm and 2.6 cm in x, y and z dimensions respectively (Figure 5.3). Two versions of the cursor were used in the experiment. One was a wireframe cursor, as shown in Figure 5.4. In order to test the semi-transparent effect, the second version of the cursor was designed to be a silk cursor (Figure 5.1, 5.5 - 5.7). The silk cursor had exactly the same geometry as the wireframe cursor but its surfaces were semi-transparent. The intensity, I, of the semi-transparent surface was rendered by interpolating the cursor colour (source) intensity, Is, with the destination colour intensity, Id, according to (Foley, et al., 1990):
$$ I = \alpha I_s + (1 - \alpha) I_d $$
where α is the blending coefficient of the surface.
Although Is was chosen to be white (RGB values were set to 255, 255, 255) in this experiment, different colour compositions may be more suitable for other particular applications.
If α = 1, the cursor is totally opaque and
therefore completely occludes objects behind it. If
α = 0, the cursor is totally transparent and no partial occlusion
cues are available. The wireframe cursor (Figure 5.4) therefore
effectively corresponds to a silk cursor with
α = 0. On the basis of pilot experiments, we determined a suitable
coefficient of α = 0.38 for all surfaces
of the cursor, except for the back surface, which was set at
α = 0.6. These values resulted in partial occlusion states (i.e.,
in front of and between two layers of the silk surface) which
were judged to be satisfactorily distinguishable. The transparency
interpolation was realised by means of blendfunction(sfactr, dfactr)
in the GL library. Note that the actual sequencing of rendering
commands is critical to the transparency effect. Polygons further
away from the user's viewpoint must be drawn before polygons closer
to the user.
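To make the layering concrete, the following minimal sketch applies the standard interpolation I = αIs + (1 - α)Id back to front along a single line of sight. It is not the GL rendering code used in the experiment: the function names are hypothetical, a single grey-level intensity stands in for RGB, and only the front and back faces of the cursor are considered.

```python
WHITE = 255.0        # cursor (source) intensity, white as in the text
ALPHA_FRONT = 0.38   # coefficient for all surfaces except the back
ALPHA_BACK = 0.6     # coefficient for the back surface


def blend(alpha, i_source, i_dest):
    """Interpolate source over destination: alpha*Is + (1 - alpha)*Id."""
    return alpha * i_source + (1.0 - alpha) * i_dest


def seen_intensity(fish_intensity, fish_location):
    """Intensity of a fish pixel as seen through the silk cursor.

    fish_location is 'behind', 'inside' or 'in_front' of the cursor.
    Silk layers between the fish and the viewer are applied far to near,
    mirroring the requirement that distant polygons be drawn first.
    """
    layers = {
        "behind": [ALPHA_BACK, ALPHA_FRONT],  # seen through two layers
        "inside": [ALPHA_FRONT],              # seen through one layer
        "in_front": [],                       # not occluded at all
    }[fish_location]
    intensity = fish_intensity
    for alpha in layers:
        intensity = blend(alpha, WHITE, intensity)
    return intensity
```

For a fish darker than the white silk, each additional layer washes the pixel further towards white, which is what lets the three depth states be told apart.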
In the experiment, the "home" position of the glove corresponded to a cursor location of (0, 0, 0) and was calibrated to make the subject most comfortable when using the glove. Since only translations were needed in the fishing task, rotational signals from the glove were disabled for this experiment.
5.3.1.4 The Display
Two modes of display were used in the experiment: stereoscopic and monoscopic. In the stereoscopic case, subjects wore 120 Hz flicker-free stereoscopic CrystalEyes™ viewing glasses (Model No. CE-1), manufactured by StereoGraphics Inc.
5.3.2 Experimental Conditions and Hypotheses
The primary goal of this experiment was to evaluate the effectiveness of semi-transparent surfaces as an interactive medium for displaying users' input actions in depth. Since stereoscopic viewing is widely recognised as one of the most effective 3D interface techniques (Wickens, et al. 1989; Yeh and Silverstein 1992; McAllister 1993), the stereo display condition was used as the standard of comparison for the semi-transparency effect. Two display modes (monoscopic versus stereoscopic) and two types of cursor (silk cursor versus wire frame cursor) were included in the experiment. Thus, the experiment had four conditions: silk cursor with stereo display (SilkStereo); wire frame cursor with stereo display (WireframeStereo); silk cursor with mono display (SilkMono); and wire frame cursor with mono display (WireframeMono).
The reason for including the WireframeMono case was to provide a baseline standard of comparison for judging potential interactions between the stereoscopic and semi-transparency cues (e.g. Sollenberger and Milgram 1993). In the WireframeMono case the subjects had to rely on occlusions between the edge of the cursor and the fish. They tended to move the cursor so that the fish first was apparently located between the edges of the cursor in the z dimension (Figure 5.4) and then slightly adjust the cursor in the x and y dimensions to bring the fish into the centre of the cursor before grasping.
In the WireframeStereo case, subjects no longer had to depend on edge occlusion. Because the stereoscopic cue gave them a strong 3D sensation, they could judge the depth dimension directly and simultaneously with their judgement along the x and y dimensions.
In the SilkMono case, portions of the target appeared with different contrast ratios when located in front of (Figure 5.5), behind (Figure 5.6) or inside the cursor (Figure 5.7). The subjects tended to use the semi-transparency cue interactively, by moving the silk cursor first through the target to observe the continuous change of target appearance (Figure 5.1) and then grasping immediately after the front surface of the silk cursor moved in front of the fish fin.
In the SilkStereo case, subjects had the advantage of both the stereo cue and the semi-transparency cue. SilkStereo was expected to be the most efficient case and WireframeMono to be the least efficient. What was of particular interest to us, however, was whether the SilkMono case (semi-transparency cue alone) would generate superior, or in any case comparable, performance scores relative to the case of WireframeStereo (stereo cue alone), which would confirm the potentially powerful advantages of the semi-transparency cue on its own.
Stated formally, the hypotheses for this particular class of tasks were:
1. Semi-transparent surfaces improve performance over simple wireframes;
2. Stereoscopic displays improve performance over monoscopic displays;
3. The effect of the semi-transparency cue is superior to, or in any case comparable with, that of the stereo cue;
4. The use of semi-transparency will further improve users' interaction performance in addition to the benefit of stereoscopic displays and therefore performance is best when both cues are present.
5.3.3 Experimental Design and Procedure
Eleven male and one female paid volunteers served as subjects in this experiment. The subjects were screened using the Bausch and Lomb Orthorator visual acuity and stereopsis tests. Subjects' ages ranged from 18 to 36, with the majority in their early and mid-20s. One of the 12 subjects was left handed and the rest were right handed, as determined by the Edinburgh inventory (Oldfield 1971). Subjects were asked to wear the input glove on their dominant hand.
A balanced within-subjects design was used. The 12 subjects were randomly assigned to a unique order of the four conditions (SilkStereo, WireframeStereo, SilkMono, WireframeMono) using a hyper-Graeco-Latin square pattern, which resulted in every condition being presented an equal number of times as first, second, third and final condition.
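The counterbalancing property described here (12 unique orders, each condition appearing equally often in every serial position) can be illustrated with a simpler construction than the hyper-Graeco-Latin square actually used. The sketch below stacks three cyclic 4x4 Latin squares built from different base orders; this substitute technique, the chosen base orders and the function names are assumptions made for illustration only.

```python
CONDITIONS = ["SilkStereo", "WireframeStereo", "SilkMono", "WireframeMono"]


def cyclic_square(base):
    """Return the Latin square whose rows are the cyclic shifts of base."""
    n = len(base)
    return [[base[(row + col) % n] for col in range(n)] for row in range(n)]


def condition_orders():
    """Build 12 distinct condition orders from three cyclic Latin squares.

    The three base orders belong to different cyclic families, so no
    row of one square can repeat a row of another.
    """
    bases = [[0, 1, 2, 3], [0, 2, 1, 3], [0, 1, 3, 2]]
    rows = [row for base in bases for row in cyclic_square(base)]
    return [[CONDITIONS[i] for i in row] for row in rows]
```

Each of the 12 rows would then be assigned at random to one subject, so that every condition is presented three times in each of the four serial positions.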
Following a 2 minute demonstration of all four experimental conditions, the experiment with each subject was divided into four sessions, with one experimental condition in each session. There was a 1 minute rest period between consecutive sessions. Each session comprised 5 tests. Each test consisted of 15 trials of fish catching. Test 1 started when the subject had no experience with the particular experimental condition. Tests 2, 3, 4 and 5 started after the subjects had 3, 6, 9 and 12 minutes' worth of experience respectively. Practice trials filled the gap between the end of one test and the start of the next, so that each test always started when the subject had a fixed amount of practice with the particular experimental condition (e.g. 6 minutes for Test 3). At the end of each test, the number of fish caught and missed (as both an absolute number and a relative percentage) and mean trial completion time were displayed to the subject.
At the end of the experiment, a short questionnaire was administered to assess users' subjective preferences for all experimental conditions.
Task performance was measured by trial completion time, error rate and error magnitude. Trial completion time was defined as the time duration from the beginning of the trial to the moment when the subject grasped the target. Error rate was defined as the percentage of fish missed in a test (15 trials). Whenever a fish was missed, the error magnitude was defined as the Euclidean summation of errors (portions of the body outside of the cursor) in the x, y, z dimensions:
$$ E = \sqrt{e_x^2 + e_y^2 + e_z^2} $$
where e_x, e_y and e_z denote the errors in the x, y and z dimensions respectively.
Note that the error magnitude is not a primary measure for two reasons. First, the subjects' task was to capture the fish as quickly as possible. Error magnitude was not an explicit requirement. Second, error magnitude is relevant only when the subject missed the fish. It was included, however, to gather a complete set of performance measures.
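As an illustration of how the per-axis errors and their Euclidean summation might be computed, the sketch below treats both the fish and the cursor as axis-aligned boxes. The box representation, the function names and the treatment of the fish as its bounding box are assumptions; the source only states that the error is the portion of the body outside the cursor in each dimension.

```python
import math


def axis_error(fish_min, fish_max, cursor_min, cursor_max):
    """Length of the fish extent lying outside the cursor along one axis."""
    return max(0.0, cursor_min - fish_min) + max(0.0, fish_max - cursor_max)


def error_magnitude(fish_box, cursor_box):
    """Euclidean summation of the per-axis errors.

    Each box is ((x_min, x_max), (y_min, y_max), (z_min, z_max)) in cm.
    """
    errors = [axis_error(f[0], f[1], c[0], c[1])
              for f, c in zip(fish_box, cursor_box)]
    return math.sqrt(sum(e * e for e in errors))


def is_caught(fish_box, cursor_box):
    """A grasp succeeds only if no part of the fish is outside the cursor."""
    return error_magnitude(fish_box, cursor_box) == 0.0
```

For example, a fish protruding 3 cm in x and 4 cm in y, with no protrusion in z, would score an error magnitude of 5 cm.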
3600 experimental trials (i.e., 12 (subjects) x 2 (cursor types) x 2 (display modes) x 5 (tests) x 15 (trials per test)) of data were collected during the experiment. Repeated measure analyses of variance were conducted through the multivariate approach to test the statistical significance of the individual effects and their interactions under each of the three performance measures. As in earlier experiments, the data on trial completion times, error rates and error magnitudes collected here were not normally distributed, but rather skewed towards lower values. In order to increase the validity of the statistical analysis (Howell 1992), logarithmic transformations were applied to the trial completion time and error magnitude data and a square root transformation was applied to the error rate data. These transformations made the data meet the variance analysis assumptions of normality and homogeneity of variance. The following are the primary results of the statistical analysis.
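The trial count and the transformations applied before the variance analyses can be sketched as follows. The function names are hypothetical, and the base of the logarithm is not stated in the text, so the natural logarithm used here is an assumption.

```python
import math


def n_trials(subjects=12, cursors=2, displays=2, tests=5, trials_per_test=15):
    """Total number of experimental trials in the full factorial design."""
    return subjects * cursors * displays * tests * trials_per_test


def transform_time_or_magnitude(value):
    """Log transform for completion times (s) and error magnitudes (cm).

    Natural log is assumed. Error magnitudes exist only for missed fish,
    so the value is always positive.
    """
    return math.log(value)


def transform_error_rate(percent_missed):
    """Square root transform for error rates (percentage of fish missed)."""
    return math.sqrt(percent_missed)
```

Both transformations compress the long upper tail of data that are skewed towards lower values, which is why they help the data approach normality.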
5.3.5.1 Trial Completion Time
Variance analysis (Table A3.5.1, Appendix 3) indicated that cursor type (silk vs. wireframe cursor: F(1,11) = 66.47, p<.0001), display mode (stereo vs. mono display: F(1,11) = 15.0, p < .005), experimental phase (F(4,44) = 21.59, p<.0001), trial number (different fish size and 3D location: F(14,154) = 12.55, p<.0001), cursor x display interaction (F(1,11) = 6.68, p < .05), and cursor x display x phase interaction (F(4, 44) = 4.0, p <.01) all significantly affected trial completion time.
Figure 5.8 illustrates the effect of cursor type and display mode
on trial completion time. Multiple contrast tests (Table A3.5.2,
Appendix 3) showed that the silk cursor produced significantly
shorter completion times than the wireframe cursor, for both monoscopic
and stereoscopic displays. With regard to the magnitude of the
differences, the mean completion time with the silk cursor was
48.4% shorter than that of the wireframe cursor in monoscopic
display mode and 28.1% shorter in stereoscopic display mode. Finally,
the mean completion time for SilkMono (semi-transparency cue alone)
was 18.1% shorter than for WireframeStereo (stereo cue alone),
even though this difference was not statistically significant
(p = .28), due to the limited power of the test. These results
suggest that, for tasks like the one presented here, semi-transparency
is indeed a more effective cue than stereopsis.
5.3.5.2 Error Rate
As illustrated in Figure 5.9, the pattern of the error rate data as a function of cursor type and display mode is very similar to that of the trial completion time data. Repeated measure ANOVA (Table A3.5.3, Appendix 3) showed that the statistically significant factors affecting error rate were cursor type (F(1,11) = 92.16, p<.0001), display mode (F(1,11) = 14.48, p < .005), and cursor type x display mode interaction (F(1,11) = 7.47, p < .05). Neither experimental phase nor any interactions between experimental phase and other factors were significant.
Multiple contrast tests (Table A3.5.4, Appendix 3) showed that the silk cursor produced significantly fewer errors than the wireframe cursor, both for monoscopic displays and for stereoscopic displays. Regarding the actual differences in magnitude, for monoscopic displays the mean error rate of the silk cursor was 59% less than that of the wireframe cursor. For stereoscopic displays the mean error rate with the silk cursor condition was 36.7% less than for the wireframe cursor. For the semi-transparency cue alone (SilkMono) the mean error rate was 19.5% lower than for the stereo cue alone (WireframeStereo), although this difference was not statistically significant (p = .21), once again due to the low power of the test. As with the trial completion time data, therefore, the error rate data also suggested that the semi-transparency cue was more effective than the stereo cue.
5.3.5.3 Error Magnitude
The effects of cursor type and display mode on error magnitude are shown in Figure 5.10. When examining the error magnitude data, it should be noted that error magnitude was defined only when an error was made (i.e., a target was missed), and that fewer errors occurred in some conditions than in others. The variance analysis (Table A3.5.5, Appendix 3) showed that error magnitude was significantly affected by cursor type (F(1,11) = 11.37, p < .01), display mode (F(1,11) = 18.19, p < .005), and experimental phase (F(4,44) = 3.97, p < 0.01). No significant between-factor interactions of any order were found.
Multiple contrast tests (Table A3.5.6) showed that the silk cursor produced significantly lower error magnitudes than the wireframe cursor, both for the monoscopic displays and for the stereoscopic displays. For the monoscopic displays the mean error magnitude of the silk cursor was 15.1% lower than that of the wireframe cursor. For the stereoscopic displays the mean error magnitude of the silk cursor condition was 41.5% smaller than that of the wireframe cursor.
In conclusion, therefore, in contrast to the trial completion times and error rate data, it appears that when an error did occur, the stereo cue was more effective than the semi-transparency cue in reducing the error magnitude. The SilkMono mode (semi-transparency cue alone) produced a larger mean error magnitude than the WireframeStereo mode (stereo cue alone); however, this difference was not statistically significant (p > 0.5).

5.3.5.4 Temporal Effects and Results in Final Phase
As indicated in the variance analyses above, experimental phase was a significant factor for trial completion time and error magnitude, but not error rate. It also interacted significantly with cursor display combinations, as measured by trial completion time. This subsection describes the performance changes as learning apparently progressed, and the results in the final phase of the experiment are examined.
Figure 5.11 shows the trial completion time data for each technique as a function of the experimental phase. It shows clearly that the relative scores between the different conditions were ordinally consistent over all experimental phases. Subjects improved their time scores for the SilkStereo, SilkMono and WireframeStereo modes as they gained more experience, and presumably more confidence. Little improvement in completion time was evident with the WireframeMono condition, however.
Variance analysis (Table A3.5.7, Appendix 3) was conducted on the trial completion time data in the final experimental phase (Test 5 in Figure 5.11). The statistical conclusions were the same as those drawn from the overall data above (section 5.3.5.1): cursor type (F(1,11) = 90.8, p < .0001), display mode (F(1,11) = 21.5, p < .001), cursor type and display mode interaction (F(1,11) = 17.3, p < .005), trial number (F(14, 154) = 6.4, p < .0001) all significantly affected trial completion times. Results of the multiple contrast comparisons for the final phase completion time data also agreed with the results from the overall data: SilkStereo vs. SilkMono (p = .27) and SilkMono vs. WireframeStereo (p = .32) were not significantly different. All other pair comparisons were significant (p < 0.05). Mean trial completion time reductions due to the semi-transparent effect in the final phase are as follows. For the mono displays, SilkMono (mean 2.064 sec.) was 52.8 % less than WireframeMono (mean 4.376 sec.). For the stereo displays, SilkStereo (mean 1.850 sec.) was 20.6% less than WireframeStereo (mean 2.329 sec.).
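The percentage reductions quoted for the final phase follow directly from the reported means. The short sketch below shows the arithmetic; the function name is hypothetical.

```python
def percent_reduction(baseline, value):
    """Reduction of value relative to baseline, as a percentage."""
    return 100.0 * (baseline - value) / baseline


# Final-phase mean completion times (seconds) reported in the text.
mono = percent_reduction(baseline=4.376, value=2.064)    # SilkMono vs WireframeMono
stereo = percent_reduction(baseline=2.329, value=1.850)  # SilkStereo vs WireframeStereo
print(round(mono, 1), round(stereo, 1))  # prints 52.8 20.6
```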


Figure 5.12 presents the error rate data as a function of experimental phase. Again, the relative rank of each mode was consistent across all five phases of the experiment. Interestingly however, in contrast to the completion time data (Figure 5.11), the error rates for the WireframeMono condition showed the most obvious improvement over the experiment. A small amount of improvement was also found in the SilkMono condition, but essentially none in the SilkStereo and WireframeStereo modes. Variance analysis (Table A3.5.8, Appendix 3) for the final (test 5) phase error rate data showed that cursor type (F(1,11) = 26.6, p < .0005) and display mode (F(1,11) = 6.05, p < .05) were both significant factors, but the cursor type x display mode interaction (F(1, 11) = 1.53, p = .24) was not significant. Multiple contrast comparisons showed that final phase error rate with WireframeMono was significantly higher than the other three cases (p <0.05). Other contrasts were not significant, however. Mean error rate reductions resulting from the semi-transparent effect in the final phase are as follows. For the mono display, error rate with SilkMono (mean 13.9%) was 60.8% lower than WireframeMono (mean 35.0%). For the stereo display, SilkStereo (mean 13.9%) was 26.5% lower than WireframeStereo (mean 18.9%).
Comparing Figure 5.11 with Figure 5.12 reveals important information about speed-accuracy trade-off patterns with respect to learning. For the WireframeMono mode, subjects had more than a 50% error rate at the beginning of the experiment, which apparently caused them to focus on improving the accuracy aspect of the task at the expense of completion time performance. In the other three cases (SilkStereo, SilkMono, and WireframeStereo), subjects already had less than a 25% error rate and it appears that they were relatively satisfied with this level of accuracy, and thus decided to devote more effort to reducing their trial completion times.
The error magnitude data were not suitable for statistical analysis
as a function of each experimental phase, since very few errors
occurred for some of the phase and technique combinations.
5.3.5.5 Subjective Preferences
Figure 5.13 shows the mean scores for the subjective evaluation data collected after the experiment. On average, the SilkStereo condition was the most preferred and WireframeMono was least preferred, with SilkMono ranked higher than WireframeStereo. Statistically, significantly different preference scores were found across conditions through repeated measure variance analysis (F(3,33) = 54.36, p<0.0001, see Table A3.5.9 in Appendix 3 for detail). Multiple contrast tests (Table A3.5.10, Appendix 3) show that subjects' preferences between every pair of techniques were significantly different (including WireframeStereo vs. SilkMono). The subjective evaluation data in this experiment were consistent with the acquired performance measures (completion time and error rate) in terms of ordinal rankings but were more sensitive in detecting differences between conditions.

5.3.5.6 Summary of Results
The experiment largely confirmed the initial hypotheses. In terms
of all three measures of performance, trial completion time, error
rate and error magnitude, both stereopsis through binocular disparity
and partial occlusion through semi-transparency were significantly
beneficial to the manual 3D localisation task. The semi-transparency
cue was effectively utilised by the subjects in both monoscopic
and stereoscopic displays. Comparing the two cues, semi-transparency
appeared to be slightly more effective than binocular disparity
for successful 3D target acquisition. Subjects' performance with
each of the techniques improved with learning but the relative
rank of the techniques remained unchanged throughout. Subjective
evaluations supported the conclusions drawn from performance measures.