To appear in Proceedings of INTERACT97: The Sixth IFIP Conference on Human-Computer Interaction, Sydney, Australia, July 14-18.

Improving Browsing Performance:

A study of four input devices for scrolling and pointing tasks

Shumin Zhai - Barton A. Smith - Ted Selker

IBM Research - Almaden

650 Harry Road, NWE-B2, San Jose, California 95120, USA

{zhai, smith, selker}@almaden.ibm.com

ABSTRACT Navigating through online documents has become an increasingly common HCI task. This paper investigates alternative methods to improve user performance for browsing the World Wide Web and other documents. In a task that involved both scrolling and pointing, we compared three input methods against the status-quo. The results showed that a mouse with a finger wheel did not improve users' performance; two other methods, namely a mouse with an isometric rate-control joystick operated by the same hand and a two handed system that put a mouse in the dominant hand and a joystick in the other, both significantly improved users' performance. A human factors analysis of each of the three input methods is also presented.

KEYWORDS Input Devices, Interaction Techniques, Web Browsing, Scrolling, Mouse, Isometric vs. Isotonic Devices, Joystick, Wheel Mouse, IntelliMouse™, TrackPoint III™, Two-handed Input, Bimanual Interaction.

1. INTRODUCTION

Today's mainstream interaction style (WIMP - window, icon, menu and pointer), despite its long history (Smith, Irby, Kimball, Verplank & Harslem, 1982), is still gaining a wider range of applications and a larger user population. The rapidly developing World Wide Web (WWW) makes the use of this style of interaction even more frequent and intense. As a result, the limitations of existing WIMP features have also become more severe and obvious. There have been numerous interface inventions and studies since the basic WIMP style was developed (e.g., Buxton 1986), but they have been largely restricted to the research literature and isolated demonstrations. The unavailability of commercial hardware and software and an incomplete understanding of human factors have both contributed to the lack of major improvements in mainstream interfaces.

One basic feature of the existing mainstream WIMP interfaces is that the user communicates with the computer system via a single stream of spatial input, physically driven by a 2 degree-of-freedom input device, typically a mouse, and graphically displayed as a cursor. The universal cursor travels around the entire interface, switching its function among pointing, selecting, drawing, scrolling, opening and jumping, according to which virtual device (widget), such as the main document/window, a menu, a scroll bar, an icon or a hyperlink, has been acquired and engaged. Such single stream operation offers users many advantages, such as the ease of understanding and learning the interaction mechanism. The disadvantages, however, are the limited communication bandwidth (Buxton 1986) and the costs in time and cognitive effort of acquiring widgets and control points (Buxton and Myers 1986, Leganchuk, Zhai and Buxton 1996). A particular case in point is document browsing, one of the most frequent tasks in interacting with computers. A document, such as a text file, a spreadsheet, a folder, and most importantly, a WWW page, is often larger than the viewing window, which competes for space with multiple other windows on the same limited computer screen. When working on such a document, the user's point of interest often moves outside of the viewing window, forcing the user to move (scroll) the document. With the traditional scroll bar method, there are at least the following three limitations:

1. It takes a certain amount of time, T1, to acquire the scroll bar. According to the well studied Fitts' law (Fitts 1954), T1 grows with the logarithm of the ratio of A to W, where A is the distance the cursor has to travel and W is the size of the widget to be acquired. In the extreme case (travel across the entire screen to acquire the arrow widget at the end of a scroll bar), the Fitts index of difficulty can be up to 8 bits, which may take more than 2 seconds to complete.

2. There are three methods of using a scroll bar, each with some limitations. First, the user can acquire the moving handle and drag it. The advantage of this method is that the user can scroll the document at a controlled speed that is suitable for the particular task. The disadvantage is that dragging, which requires maintaining pressure on a button while moving the input device, is more difficult and takes more time than pointing at the same Fitts' index of difficulty (MacKenzie, Sellen and Buxton 1991). The second method is to use the cursor to press the arrow buttons at the ends of the scroll bar, causing the document to scroll at a speed that is not adjustable by the user. This is binary control: either move at a fixed speed or stop. The speed could be too slow when the user wants to move very far, or too fast when the user wants to visually track the document. The third method is to click on the rest of the space on the scroll bar, causing the document to "jump" at a speed faster than the second method. The location that the window jumps to is often unpredictable to the user, causing visual discontinuity or "losing track".

3. Perhaps most importantly, when the user has to go to the scroll bar to move a document, even by just one line, it takes the perceptual, cognitive and motor resources away from the target that the user focused attention on, breaking the work flow.
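The Fitts' law estimate in the first limitation can be illustrated with a small calculation. The sketch below uses the original log2(2A/W) form of the index of difficulty; the travel distance, widget size and regression coefficients are hypothetical values chosen only to reproduce an 8 bit, roughly 2 second acquisition, and are not measurements from this study.

```python
import math

def index_of_difficulty(amplitude, width):
    """Fitts' index of difficulty in bits, using the log2(2A/W) form."""
    return math.log2(2.0 * amplitude / width)

def acquisition_time(amplitude, width, a, b):
    """Predicted time T1 = a + b * ID; a (s) and b (s/bit) are device-specific fits."""
    return a + b * index_of_difficulty(amplitude, width)

# Hypothetical geometry: a 320 mm cursor trip to a 2.5 mm scroll arrow.
id_bits = index_of_difficulty(320.0, 2.5)

# Hypothetical coefficients, for illustration only.
t1 = acquisition_time(320.0, 2.5, a=0.0, b=0.26)

print(f"ID = {id_bits:.1f} bits, T1 = {t1:.2f} s")
```

With these assumed values the index of difficulty is 8 bits and the predicted acquisition time is just over 2 seconds. Halving the distance or doubling the widget size removes one bit.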

The above analysis shows that the standard, single input stream WIMP interface is inadequate for browsing, one of today's most common interaction tasks. This study looks into three alternative methods for browsing. We will conduct a human factors analysis on each of the three techniques and we will then present a formal experiment that compared these methods against the standard single stream method.

2. THREE ALTERNATIVE METHODS FOR BROWSING

2.1. Mouse with isometric joystick

As shown in Figure 1, the first alternative device we studied is the JSMouse, a two button mouse with a miniature joystick mounted between the two buttons. The mouse retains all the usual functions of a standard mouse. The miniature isometric joystick, an IBM TrackPoint III™, is a rate controlled input device (Rutledge and Selker 1990, Barrett et al. 1995). Each of the two devices can function as an independent normal 2 degree-of-freedom input device. In the current study we assigned the mouse to pointing and the miniature joystick to scrolling. We hypothesized that the isometric joystick is particularly suitable for scrolling tasks based on the following reasoning.

Figure 1 The JoystickMouse (JSMouse) is a mouse with a miniature joystick mounted between the two buttons. Users may use the index or middle finger to manipulate the joystick.

First, let us briefly review the two basic types of transfer functions in input: position and rate control (see Poulton 1974 for a detailed review). Position control, also referred to as zero order control, maps the user input variable to the cursor displacement according to a constant or variable gain. Rate control, also called first order control, maps the user input variable to cursor velocity. As shown in recent six degree-of-freedom input control studies (Zhai and Milgram 1993, Zhai, Milgram and Drascic 1993, Zhai 1995), position control is better conducted with isotonic, free moving devices, such as the mouse, and rate control is better conducted with isometric or elastic devices. The key factor in this compatibility issue is the self-centering effect in isometric or elastic devices. With self-centering, rate control can be easily done. Without it, rate control requires conscious effort. Either position control or rate control can give users the ability to control all aspects of movement, including displacement, movement speed or higher order derivatives, but each mode corresponds to only one aspect directly: displacement or speed.
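The difference between the two transfer functions can be made concrete with a minimal sketch. The function names, the constant gain and the fixed time step below are illustrative assumptions, not part of any device driver discussed in this paper.

```python
def position_control(inputs, gain):
    """Zero order control: each input sample maps directly to a displacement."""
    return [gain * u for u in inputs]

def rate_control(inputs, gain, dt):
    """First order control: each input sample sets a velocity that is
    integrated over time to give the displacement."""
    position = 0.0
    trace = []
    for u in inputs:
        position += gain * u * dt
        trace.append(position)
    return trace

# The same input signal: hold a deflection for three samples, then let go.
signal = [1.0, 1.0, 1.0, 0.0, 0.0]

pos_trace = position_control(signal, gain=2.0)
rate_trace = rate_control(signal, gain=2.0, dt=0.5)

print(pos_trace)   # displacement follows the input and returns to zero
print(rate_trace)  # displacement keeps growing while the input is held
```

Under position control the output returns to zero as soon as the input does, whereas under rate control the output stops and stays where it is. This is why a self-centering device, whose input returns to zero when released, is a natural match for rate control.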

Scrolling, or navigating through a document, requires the user not only to control the final displacement of the document to make the target appear in the viewing window (a rather easy requirement by Fitts' index of difficulty because of the large effective "width" - the difference between the viewing window size and the target area), but also to control the speed of the movement so that the user can comfortably scan the document to look for the target. An isometric rate control device apparently meets these requirements. On the other hand, if we use an isotonic position control device, such as the mouse, the user may not be able to control the speed of movement continuously. In particular, due to physical constraint (of either the human arm or the mouse pad), position control allows the user to move only within a certain distance at one stroke. The user has to release (by lifting the mouse) and re-engage the position control device repeatedly in order to scroll over a longer distance.

When using the JSMouse, the user can either use the index finger or the middle finger to operate the joystick. When using the index finger, the user has to switch the same finger between the left button and the joystick. Due to the close proximity, the user can rely on kinesthetic memory to locate the stick without looking at it.

2.2. Mouse with a track wheel

The idea of adding an additional sensor to a mouse is not new. As described by Venolia (1993), a thumb wheel can be mounted onto a standard mouse to provide an additional degree of freedom in 3D interfaces. More recently, the Microsoft IntelliMouse™ has provided a finger wheel on the top of a mouse. The latter, which we call the WheelMouse, is the second device we included in the current study.

Figure 2 The WheelMouse used in the study was a Microsoft IntelliMouse™. It works in three modes: wheel rolling; press (the wheel) and move (the mouse) to do rate control; and click and move.

The track wheel in the IntelliMouse is largely free moving (isotonic) but with a detent mechanism. Each detent step corresponds to one line of scrolling. The wheel works in position control mode. Position control requires repeated release and re-engagement for long movements. In the case of the track wheel, the user can quickly repeat the stroking of the wheel. The IntelliMouse provides two additional modes of scrolling; both turn the mouse itself into a rate control device. As analyzed earlier, an isotonic device lacks the self-centering effect that is desirable in rate control. In the first mode, the user presses down the wheel, which is also a button, to engage rate control scrolling. The further the mouse is moved from where the wheel was pressed down, the faster the document scrolls. When the user releases the wheel, scrolling stops. In the second mode, the user presses and releases the wheel (click) to start the rate control scrolling. Any following click, either on the wheel or on another button, stops the scrolling. In both cases, a visual anchor is left on the screen to indicate where the rate control scrolling started. This may help compensate for the lack of a centering effect in the mouse for rate control, but such centering feedback comes from the visual channel, not the haptic feel.
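The wheel rolling mode and the press-and-move mode described above can be sketched as follows. This is only an illustration of the behaviour as described in the text: the linear mapping from anchor distance to scrolling speed, the gain and the sampling interval are assumptions, and the actual mapping function used by the IntelliMouse driver is not known to us.

```python
def wheel_roll_lines(detent_steps):
    """Position control: one detent step scrolls one line."""
    return detent_steps

def press_and_move_scroll(samples, gain, dt):
    """Rate control while the wheel button is held down.

    samples: sequence of (wheel_pressed, mouse_y) pairs taken every dt seconds.
    The anchor is the mouse position at the moment the wheel is pressed;
    scrolling speed is assumed proportional to the distance from the anchor.
    Returns the total number of lines scrolled.
    """
    anchor = None
    scrolled = 0.0
    for pressed, mouse_y in samples:
        if pressed and anchor is None:
            anchor = mouse_y            # drop the visual anchor here
        elif not pressed:
            anchor = None               # releasing the wheel stops scrolling
        if anchor is not None:
            scrolled += gain * (mouse_y - anchor) * dt
    return scrolled

# Hypothetical trace: press at y=10, drag to y=30, then release.
trace = [(False, 0), (True, 10), (True, 20), (True, 30), (False, 30)]
lines = press_and_move_scroll(trace, gain=0.5, dt=1.0)
print(wheel_roll_lines(3), lines)
```

Note that the mouse does not return to the anchor by itself. To stop or slow down without releasing the wheel, the user must steer back to the anchor under visual guidance, which is the missing self-centering effect discussed above.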

As with the JSMouse, the user can use either the index finger, or the middle finger to roll the track wheel for scrolling. Pointing is done by normal mouse movement.

2.3. Two handed joystick and mouse

The third method we studied was a two-handed input method. A keyboard with a TrackPoint III™ (between the G, H and B keys, as in IBM ThinkPad computers) and a standard mouse were used in this method (Figure 3). The user operates the joystick with the non-dominant hand to do scrolling and manipulates the mouse with the dominant hand to do pointing.

The idea of using the non-dominant hand for a scrolling task has been advocated by researchers such as Buxton (1986) for over a decade. Scrolling was also one of the first scenarios in which two handed input was experimentally demonstrated to be superior to the standard one handed input. Equipping subjects' non-dominant hand with two strips of touch-sensitive tablet and their dominant hand with a puck on a graphics tablet, Buxton and Myers (1986) studied users' performance in a text document navigation (jump or scroll) and selection (pointing) task. In that experiment the subjects used their non-dominant hand to jump (one strip that was absolute position sensitive) or scroll (another strip that was relative movement sensitive) the document and used the dominant hand to select targets. With such a two-handed set-up, 15% (for expert users) to 25% (for novice) performance improvement was measured.

The present two-handed system studied here differs from that of Buxton and Myers, and from any other published two-handed techniques to our knowledge such as (Kabbash, Sellen and Buxton, 1994; Leganchuk, Zhai and Buxton, 1996), in terms of the physical devices used in two-handed interaction. One of the two devices in the system is an isometric rate control joystick. There are four potential advantages to including an isometric joystick in a two-handed system.

First, there will always be some individual preference for a certain type of device. Some users may prefer one type over another. Having one joystick and one mouse in the system gives users a choice when they need only one device. Second, device performance is task dependent. A unique advantage of in-keyboard isometric joysticks is that the user's fingers do not have to leave the keyboard, making mixed typing and pointing tasks much faster (Rutledge and Selker, 1990). Including an isometric joystick in the dual device system gives the user this choice when it is needed for a particular task.

Figure 3 In-keyboard isometric joystick (top), operated by the non-dominant hand (the left hand for this user) for scrolling while the dominant hand moves a mouse for pointing (bottom). Note that the system can easily be set up for a left-handed user.

Third, an isometric joystick requires less space, or "footprint", than any other device (mouse, tablet or trackball). This is important not only for portable computing, but also for a two-handed desktop environment where a keyboard with a mouse has already crowded the workspace. Fourth, as pointed out earlier, a rate control technique that is compatible with isometric devices can be particularly suitable for scrolling tasks, because it avoids the repetitive release and re-engagement required by position control techniques.

However, the joystick-mouse two handed system also poses an unanswered theoretical question. In the two handed systems that have previously been demonstrated to be advantageous, both hands were engaged in isotonic position control (consistent or similar motor action across the two hands). In the current system, the two hands are engaged in different motor control mechanisms: one in isotonic position control and the other in isometric rate control. Is such a combination still superior to the standard one handed input system?

What is also conceptually interesting is the contrast between the two-handed system (Figure 3) and the Joystick Mouse (Figure 1) both included in the current study. Identical transducers were used in the two input methods. The difference was entirely the location of the joystick. In the case of JSMouse, the joystick was on the mouse and was manipulated by the same hand that operates the mouse. In the current case, the joystick was in the keyboard and was manipulated by a different hand. In other words, we are distributing two streams of input in two ways: one puts both streams into one hand and the other separates them to two hands. It is interesting to find out how user performance differs between the two methods.

We should briefly mention a theory of human bimanual action: the Kinematic Chain (KC) model (Guiard 1987). The KC model strongly suggests that the two human hands work in a cooperative but asymmetric manner. The non-dominant hand, like a base link in a chain, tends to take precedence (act first), work on a larger but coarser scale, and set the frame of reference. The dominant hand, like a terminal link in a chain, tends to act later, work on a smaller but finer scale and operate within that frame of reference. The current two-handed system matches these characteristics very well: the non-dominant hand acts first (scroll first), sets the frame of reference, and moves over a larger distance (rate control). The dominant hand acts later and operates within that frame on a smaller scale. This is also what we do in everyday life: hold and move a document with our non-dominant hand and write within the page with our dominant hand.

3. THE EXPERIMENT

3.1. Experiment Design

We chose to model our experimental task after one of the most frequent interaction tasks that today's computer users perform: Web browsing. A web page, stored as a local file to avoid transmission delay, was presented to the subject. The document contained text from an IBM computing terminology dictionary (Figure 4). A hyperlink was embedded at an unpredictable location in each page. The subject's task was to scroll the document until he/she found the target hyperlink (Figure 4, bottom). Clicking on the target word "Next" would bring the subject to the beginning of the next web page. Each test of the experiment consisted of 10 pages of browsing (scroll and point). The size of the web pages was set such that the scroll handle was 1.3 cm (so it was not too difficult to acquire in the standard mouse condition, see Figure 4, bottom). The web browser viewing area was 24 cm wide and 15 cm long on a CRT display. Subjects were allowed to adjust the positions of the mouse, keyboard, and display on the desk to suit their own preferences.

Four interaction methods were tested in the experiment: Standard Mouse (Mouse), Mouse with a track wheel (WheelMouse), Mouse with Joystick (JSMouse), and Mouse with in-keyboard joystick (2Hand). Note that the pointing mechanism is the same in all four methods: mouse movement by the dominant hand. A total of 12 volunteer subjects participated in the experiment. An order balanced within subject design was used. Each subject performed the tests with all four methods in a pre-assigned order. With each method, the subjects were first given one practice run, during which they were asked to explore all modes (in the cases of Mouse and WheelMouse) and strategies (aggressive or careful). They could take as much time as they liked to finish these 10 pages of browsing. The subjects were then asked to perform two consecutive tests (10 pages each) as quickly as possible. The same 10 web pages were used for all tests.
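One common way to obtain an order balanced assignment for four conditions and 12 subjects is a balanced Latin square repeated three times. The sketch below illustrates that general technique; it is an assumption made for illustration and not necessarily the exact assignment used in this experiment. The function names are hypothetical.

```python
METHODS = ["Mouse", "WheelMouse", "JSMouse", "2Hand"]

def balanced_latin_square(n):
    """Balanced Latin square for an even number of conditions n.

    Each condition appears once in every serial position, and each
    condition immediately precedes every other condition exactly once.
    """
    if n % 2 != 0:
        raise ValueError("this construction requires an even n")
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    take_low = True
    while len(first) < n:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low
    # Every later row shifts the first row by one condition.
    return [[(c + r) % n for c in first] for r in range(n)]

def assign_orders(num_subjects, conditions):
    """Cycle through the rows of the square, one row per subject."""
    square = balanced_latin_square(len(conditions))
    return [[conditions[i] for i in square[s % len(square)]]
            for s in range(num_subjects)]

orders = assign_orders(12, METHODS)
for subject, order in enumerate(orders, start=1):
    print(subject, order)
```

With 12 subjects and four rows, each order is used by three subjects, so every method is tested equally often in each serial position.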

Figure 4 Web page browsing was used as the experimental task. The subjects had to scroll and point at a hyperlink to proceed in the task. Shown here are the beginning (top picture) and the middle (bottom picture) of page 5.

Of the 12 subjects, all had extensive experience with using a mouse and five had much experience with using the in-keyboard isometric joystick. All but one had no experience with the three alternative methods; the remaining subject had a little experience with them.

Trademarks on the devices were covered in order not to bias subjects' opinions of each of the methods. After completing all four methods, subjects were asked to rate each of the four methods on a -3 (terrible) to +3 (great) scale based on their experience.

3.2. Results

Figure 5 shows the mean completion time and 95% confidence bars in each of the two consecutive tests. A repeated-measures analysis of variance showed that subjects' completion time was significantly affected by input method (F(3, 11) = 20.3, p < .0001). Although Test 2 was significantly faster than Test 1 (F(1, 11) = 12.4, p < .01), this improvement did not alter the relative performance pattern of the input methods (the Method × Test interaction was not significant: F(3, 11) = 1.1, p = .37).

Figure 5 Completion time of web browsing task

Taking the Mouse condition as the reference, the JSMouse and 2Hand conditions were 22.4 and 25.5 percent faster respectively, and the WheelMouse condition was 8.7 percent slower. Statistically, the difference between the Mouse and WheelMouse conditions (p = .086) and the difference between the JSMouse and 2Hand conditions (p = .57) were not significant. All other pairwise comparisons were significant (p < .0001, t-test).

Subjects' subjective ratings based on their experience were similar to the performance measurements (Figure 6), except for the difference between Mouse and WheelMouse. Subjects gave the WheelMouse a significantly lower rating than the standard mouse (p < .05, t-test). The JSMouse and 2Hand conditions were rated significantly higher than the other two methods (p values from .01 to .0001), but the difference between the two was not significant (p = .86).

Figure 6 Mean subjective ratings, with 95% confidence error bars, on the four input methods: 3 = great, 2 = very good, 1 = good, 0 = OK, -1 = poor, -2 = very poor, -3 = terrible.

4. DISCUSSION

WheelMouse. Surprisingly, although it offered dual-stream input, the WheelMouse did not outperform the standard mouse, even though with a single stream mouse one has to switch between target selection and acquiring the scroll bar. Three subjects commented that it was tedious and tiring to repeatedly roll the wheel to scroll a long distance, although this was an intuitive mode. Although all subjects were encouraged to explore all three modes in the practice phase, only 6 subjects used the two rate control modes in addition to wheel rolling in the real tests. It was felt that the rate control mapping functions in the IntelliMouse could be improved. However, we believe the lack of self-centering in the isotonic device (mouse) places it at a fundamental disadvantage for effective rate control (Zhai and Milgram 1993, Zhai et al. 1993, Zhai 1995). Alternatively, if the mouse functioned in position control mode when the button was pressed, users' performance might have been much better. The low performance of the WheelMouse in this task shows that a dual-stream solution is not guaranteed to outperform the status-quo single stream input.

JSMouse. Supporting our analyses in the introduction, this dual-stream input device outperformed the standard single stream input significantly. Subjective ratings also verified its advantages. Comparatively, although both the JSMouse and the WheelMouse used one hand to handle two streams of input (even with the same fingers), the JSMouse significantly outperformed the WheelMouse, by a mean magnitude of 29 percent.
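The 29 percent figure is consistent with the percentages reported in the Results section, as the short check below shows. It assumes that "percent faster" and "percent slower" refer to completion time relative to the Mouse condition, and that the JSMouse advantage is expressed relative to the WheelMouse completion time; the variable names are illustrative.

```python
# Completion times normalised to the standard Mouse condition (= 1.0).
mouse = 1.0
jsmouse = mouse * (1 - 0.224)      # 22.4 percent faster than Mouse
wheelmouse = mouse * (1 + 0.087)   # 8.7 percent slower than Mouse

# Time saved by the JSMouse, as a fraction of the WheelMouse time.
js_vs_wheel = (wheelmouse - jsmouse) / wheelmouse

print(f"JSMouse vs WheelMouse: {100 * js_vs_wheel:.1f} percent")
```

Under these assumptions the difference comes out at about 28.6 percent, which rounds to the 29 percent stated above.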

2Hand. Interestingly, no significant performance or rating difference was found between the two handed system and the JSMouse, even though the two streams of input were assigned very differently. Nonetheless, the results showed that an asymmetric two handed design, with one hand in isometric rate control and the other in isotonic position control, worked well, outperforming the status-quo by 25 percent for the browsing task. Concerns had been raised about whether such a two handed system would work at all and whether users would confuse the functions of the two hands. Clearly this was not the case. For more demanding tasks, such as a graphical mail sorting task in which the user needs to drag a mail icon into a folder window, scroll the window while holding the dragged object and then drop it into the intended folder, we have observed a greater advantage for the two handed system over the one handed dual-stream solution, which might be overloaded. Furthermore, it is extremely difficult, if not impossible, to use the one handed solutions in tasks that require parallel actions, such as scaling, translating, and rotating a 2D geometry by controlling two vertices (Leganchuk, Zhai and Buxton, 1996).

5. CONCLUSIONS

Three dual-stream input systems, two single handed and one two handed, were analyzed and compared in a web browsing task that required scrolling and pointing. Results showed that a mouse with a joystick, both controlled by one hand, or a mouse in one hand and a joystick in the other, significantly outperformed the current standard single stream mouse input. However, the mouse with a track wheel did not perform any better than the standard mouse. In order to take advantage of additional input streams, the types of input devices must be appropriately matched to the tasks being performed. In addition to much evidence in the literature, this study indicates that it is time to add multi-stream input to mainstream commercial systems, although each new design step has to be guided by thorough human factors research to avoid likely mistakes.

Acknowledgment

We sincerely thank our colleagues who made significant contributions to the project that this study is based on, in particular Bobby Lee, Ron Barber and Bob Olyha for software development, Kim May for hardware development and Satoru Yamada for commenting on the experimental design.

6. REFERENCES

Barrett, R.C., Selker, E.J., Rutledge, J.D. and Olyha, R.S. (1995) The Negative Inertia: A dynamic pointing function, in CHI'95 conference companion: Human Factors in Computing Systems, 316-317.

Buxton, W. (1986) There is more to interaction than meets the eye: some issues in manual input, in Norman, D.A and Draper, S.W. (Eds) User Centered System Design, Lawrence Erlbaum Associates, 319-337.

Buxton, W. and Myers, B. (1986) A study of two-handed input, in Proc. of CHI86: ACM Conference on Human Factors in Computing Systems, 321-326.

Fitts, P. (1954) The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 381-391.

Guiard, Y. (1987) Asymmetric division of labor in human skilled bimanual action: The kinematic chain as a model. Journal of Motor Behavior, 19(4) 486-517.

Kabbash, P., Buxton, W. and Sellen, A. (1994) Two-handed input in a compound task, in Proc. of CHI94: ACM Conference on Human Factors in Computing Systems, 417-423.

Leganchuk, A., Zhai, S. and Buxton, W. (1996) Manual and Cognitive factors in two-handed input: an experimental study. submitted for publication.

MacKenzie, I. S., Sellen, A. and Buxton, W. (1991) A comparison of input devices in elemental pointing and dragging tasks, in Proc. of CHI'91: ACM Conference on Human Factors in Computing Systems, New Orleans, Louisiana, 161-166.

Poulton, E.C. (1974) Tracking skill and manual control. New York, Academic Press.

Rutledge, J. and Selker, T. (1990) Force-to-motion function for pointing, in Proceedings of Interact90: The IFIP Conference on Human Computer Interaction, 701-705.

Smith, D.C., Irby, C., Kimball, R., Verplank, W. & Harslem, E. (1982) Designing the Star user interface. Byte, 7(4), 242-282.

Venolia, D. (1993). Facile 3D direct manipulation. In Proc. of INTERCHI'93: ACM Conference on Human Factors in Computing Systems, Amsterdam, The Netherlands, 31-36.

Zhai, S. (1995) Human Performance in Six Degree-of-Freedom Input Control, Ph.D. Thesis, University of Toronto. http://vered.rose.toronto.edu/people/shumin _dir/publications.html

Zhai, S. and Milgram, P. (1993) Human performance in evaluation of manipulation schemes in virtual environments, in Proc. of VRAIS'93: IEEE Virtual Reality Annual International Symposium, Seattle, USA, 155-161.

Zhai, S., Milgram, P. and Drascic, D. (1993) An evaluation of four 6 degree-of-freedom input techniques, in Adjunct Proc. of INTERCHI'93: The IFIP Conference on Human Computer Interaction, Amsterdam, The Netherlands, 155-161.