The purpose of this study is to evaluate different presentation techniques that assist users with the extraction of usable information from two incongruent maps.
For our study, we chose to use a political map of Great Britain and a schematic diagram of its waterways. The political map features a list of cities and regions. The waterway map includes a representation of waterway paths, junctions and intersections. Often, each region in a nation is represented by its own administration, with its own bylaws and policies governing its waterways. As a result, citizens need a method to easily determine which waterways pass through a certain region, and which region contains a given waterway.
Four presentation techniques are evaluated, ranging from a side-by-side view to a more complex technique, map morphing, that involves smooth animations and transitions between the two maps. Each technique is evaluated to find the presentation style that best reduces errors and shortens the time users need to find the waterway or region they are looking for.
Map Morphing is an interactive morph between two maps covering approximately the same region. Using both blending and distortion, the maps appear to merge into one another. This allows the user to directly visualize how the two maps diverge from one another in their presentation. This is particularly useful when one or both maps provide views that aren't spatially accurate (such as the typical "schematic" subway map). - D. Reilly
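The blending and distortion described in the quotation can be sketched in a few lines of Python. This is an illustration only, not the study software: it assumes the two maps share a set of matched control points and that both the distortion and the blend are linear in a morph position t between 0 and 1. The names lerp, morph_points and blend_weights are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def morph_points(src: List[Point], dst: List[Point], t: float) -> List[Point]:
    """Distortion: move each control point from its position on the first
    map toward its matching position on the second map (assumed linear)."""
    if len(src) != len(dst):
        raise ValueError("control point lists must have the same length")
    t = max(0.0, min(1.0, t))  # clamp so the morph never overshoots
    return [
        (lerp(xa, xb, t), lerp(ya, yb, t))
        for (xa, ya), (xb, yb) in zip(src, dst)
    ]


def blend_weights(t: float) -> Tuple[float, float]:
    """Blending: opacity of the first and second map at morph position t."""
    t = max(0.0, min(1.0, t))
    return (1.0 - t, t)
```

At t = 0 only the first map is visible and undistorted; at t = 1 only the second map is. Intermediate values give the appearance of one map merging into the other.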
The participants selected for the study were 18 to 25 years old and were familiar with touch screen devices. They began with a simple pre-test evaluation that examined their past experience with maps in general (mobile or paper) and with any form of map matching. The general criterion was that participants could read maps proficiently but did not often use incongruent maps.
At the beginning of the study, the four different techniques were demonstrated to the participants. The four presentation techniques are listed below.
Since each technique has different controls, the participants were allowed 10 seconds to run through each of the four demos. The demos of the controls were meant to eliminate the “startup” time required to learn the operation during the trials. During the introduction of the controls, we asked the participants to think aloud about their interactions with each technique. This allowed us to record feedback on each presentation technique before diving into the tasks.
Each task has two conditions, with the second condition being more difficult to complete correctly. Because of the large number of presentation techniques, this study focuses on only two main tasks. Each task originates on one map and requires a “switch” to the other map. These tasks were chosen as representative region/waterway lookups in order to determine the best of the four techniques.
Each participant was given two tasks. In the first task the user finds a region with jurisdiction over a given set of waterways; in the second task the user finds a waterway passing through a given set of regions.
Initially there was concern that since it is possible for a task to contain more than one correct answer, the cognitive load might be too high for the user to decide between alternatives. We were worried that this might confound our results. However, we discovered that participants selected the first correct answer they located on the map. Interviews indicate that none of the participants in the pilot felt indecisive between multiple answers owing to the use of the word “any” in the task description. This led us to conclude that our tasks were sound and did not suffer from confounding in this respect.
While the participants perform the tasks under each condition, the program logs the key variables used to determine usability and accuracy.
We split the 4 conditions into blocks of 10 trials; each condition has 3 blocks, or 30 trials in total. To reduce bias, the testing program randomly shuffles the blocks that are presented to the user. All participants completed the same number of tasks, but the blocks were presented in a different order. Furthermore, the shuffling algorithm checked that the two tasks were interleaved to avoid “short-term” association of waterways or maps, so that one task does not appear too often in a row. Participants were advised to take a rest after every 3 blocks.
At the end, participants were asked to complete the post-test questionnaire and take part in an unstructured interview about their experiences and feelings regarding the four techniques. An exit interview is a good balance to the impersonal test and allows us to talk to the participants without a set agenda that might make it difficult for them to answer after completing the intensive trials.
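One way to shuffle blocks while preventing long runs of the same task is rejection sampling: shuffle, check the longest run, and retry if it is too long. The Python sketch below is an assumption about how such a check could work, not the study's actual algorithm. The block labels, the max_run limit of 2, and the function names are all hypothetical.

```python
import random
from typing import List, Optional, Tuple

Block = Tuple[str, str]  # (task, condition); labels are illustrative


def longest_run(blocks: List[Block]) -> int:
    """Length of the longest streak of consecutive blocks sharing a task."""
    longest = current = 0
    previous = None
    for task, _ in blocks:
        current = current + 1 if task == previous else 1
        longest = max(longest, current)
        previous = task
    return longest


def shuffle_blocks(
    blocks: List[Block],
    max_run: int = 2,  # assumed limit; the source does not state one
    rng: Optional[random.Random] = None,
    max_attempts: int = 10_000,
) -> List[Block]:
    """Shuffle blocks, retrying until no task repeats more than max_run times."""
    rng = rng or random.Random()
    order = list(blocks)
    for _ in range(max_attempts):
        rng.shuffle(order)
        if longest_run(order) <= max_run:
            return order
    raise RuntimeError("no valid order found; relax max_run")
```

Every participant still sees the same set of blocks; only the order differs, and passing a seeded random.Random makes a given participant's order reproducible.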
For the purposes of the study, we evaluate the effect of the four techniques on the user’s speed and error frequency for the two tasks. To arrive at our hypothesis, we categorized the four techniques by the two relational methods used to bridge the incongruent maps.
These relational methods are:
Below is a table comparing relational methods used by each incongruent map presentation technique.
|Technique|Spatial Reference|Point of Reference|
Based on the table, we are able to compose our hypothesis for the experiment.
The results we obtained can be separated into three main categories, one qualitative and two quantitative. Our qualitative data includes experimenter observations during trials and analysis of questionnaire answers. The quantitative data analysis includes the effects of techniques on performance, and the effects of within-subjects design.
For this section, the following will apply:
After completing the experiment, each participant filled in a questionnaire and took part in an unstructured interview. The value of the unstructured interview lies in the details of what participants experienced while completing the tasks. In the questionnaire, participants rated each technique on how useful its assistance was while completing the task. Participants gave high ratings to techniques that offered some level of assistance and, in all four trials, rated Technique #1 as the least useful.
It is interesting to note that Technique #2 and Technique #4 are very similar in nature, with Technique #4 having additional assistance in the form of a cursor. Participants recognized this similarity and added in their responses that Technique #4 helped them to better identify the regions when compared to Technique #2. The decrease in time spent on each task is also reflected in the reduced number of “flips” between the two maps. The difference between Techniques #2 and #4 is visible in the test data: participants spent less time and made fewer errors with Technique #4, even though the two interfaces are very similar. The ability to pin a map point allowed the participants to test “uncertainties” when trying to find the boundaries of an area.
Participants noted that the first technique was useless, as it provided no benefit over the normal paper maps that are distributed in subway stations. This dissatisfaction with Technique #1 is visible in the test results: its average completion time for both tasks was the slowest of all the techniques, and it produced the most errors.
Additionally, having maps side-by-side (as opposed to morphing one into the other) was difficult for the participants because they repeatedly had to find a point on one map, memorize it, change focus to the other map, and work out the relation of the two maps in their heads.
From interviews and the questionnaires, participants noted that Technique #3 was the most useful. The method allowed users to quickly understand the relation of the two maps by moving a cursor and seeing a duplicated cursor on the second map. Participants noted that this behaviour is natural to them from other applications, such as map tracing and editing of word documents. As analyzed further below, Technique #3 was one of the most effective techniques in terms of completion time and low error frequency.
During the trials, we logged the time required for a participant to complete a task, as well as the number of errors made before completing a trial. We compare these measures across the presentation techniques to determine whether a technique has an effect on the participant’s performance with incongruent maps.
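As an illustration of how the logged measures could be stored and averaged per task and technique, the following Python sketch defines a trial record and a summary function. The field names, the TrialRecord type and the summarise function are hypothetical; the source does not describe the logging format.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class TrialRecord:
    participant: str
    task: str        # e.g. "find region given waterway"
    technique: str   # e.g. "synchronized cursor"
    seconds: float   # time taken to complete the trial
    errors: int      # wrong selections made before the correct one


def summarise(
    records: Iterable[TrialRecord],
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Mean completion time and mean error count per (task, technique)."""
    groups: Dict[Tuple[str, str], List[TrialRecord]] = defaultdict(list)
    for record in records:
        groups[(record.task, record.technique)].append(record)
    return {
        key: (mean(r.seconds for r in rs), mean(r.errors for r in rs))
        for key, rs in groups.items()
    }
```

Grouping by both task and technique, rather than by technique alone, keeps the two tasks separate, which matters when the ranking of techniques differs between tasks.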
The average time required to complete a trial using each of the four techniques can be seen in Figure 1. Interestingly, the results differ between the two tasks. For the task “find region given waterway”, the results are similar to our predictions: synchronized cursor is fastest, followed by pinned animation, smooth animation, and side-by-side. However, for the second task, “find waterway given region”, the places of synchronized cursor and pinned animation are reversed, with pinned animation taking a clear lead.
This unexpected result on the second task could be due to the nature of the task. The task requires the participants to find a waterway that lies in a specific region. The defined regions are large but differ in color, which makes them recognizable. As participants approach this task, they immediately pin the “center” of the region and perform the animation. When the animation completes, they choose the waterway closest to the pin that was placed before the transition. Similar behaviour was observed on the more difficult condition, where participants placed a pin closer to the edges of the region and performed the animated transition. After the transition, the pin indicates the waterways that run close to the edge of the region.
These results are supported by our ANOVA. Our null hypothesis is that there is no significant difference in mean completion time between the four techniques. However, in our two-way within-subject ANOVA, we found significant main effects of Task, F(1,1904)=37.3, p<0.01, and Technique, F(3,1904)=14.7, p<0.01, on completion time. We also found a significant interaction of Task and Technique, F(3,1904)=3.86, p<0.01. This allows us to reject the null hypothesis and conclude that Task, Technique, and the interaction of Task and Technique all have a significant effect on completion time.
Post hoc testing with Tukey’s pairwise comparison revealed significant differences between the two tasks (p<0.01) as well as between all combinations of the four techniques (p<0.01).
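The numerator degrees of freedom in the reported F statistics follow directly from the number of levels of each factor: two tasks and four techniques. The Python sketch below works this out for a plain two-factor layout. It is a simplification stated as an assumption: a repeated-measures analysis may partition the error term differently, and the observation count of 1912 used in the usage note is back-calculated from the reported error degrees of freedom, not stated in the study. The function name two_way_dfs is hypothetical.

```python
from typing import Tuple


def two_way_dfs(
    levels_a: int, levels_b: int, n_observations: int
) -> Tuple[int, int, int, int]:
    """Degrees of freedom for factor A, factor B, their interaction, and
    the error term in a simple two-factor layout (assumed, see lead-in)."""
    df_a = levels_a - 1                       # e.g. 2 tasks -> 1
    df_b = levels_b - 1                       # e.g. 4 techniques -> 3
    df_ab = df_a * df_b                       # interaction -> 1 * 3 = 3
    df_error = n_observations - levels_a * levels_b
    return df_a, df_b, df_ab, df_error
```

Calling two_way_dfs(2, 4, 1912) gives 1, 3, 3 and 1904, which matches the pattern F(1,1904), F(3,1904) and F(3,1904) reported for Task, Technique and their interaction.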
The average number of errors prior to the completion of a trial using each of the four techniques can be seen in Figure 2. Once again, the results are not exactly as predicted. In the “find region given waterway” task, synchronized cursor is the most accurate technique with the fewest errors, followed by pinned animation, smooth animation, and side-by-side. The result of the “find waterway given region” task mirrors that of the completion time measurements, with pinned animation taking first place, followed by a close tie between synchronized cursor and smooth animation, and side-by-side in last place.
This result could be explained once again by the nature of the tasks. Finding a region given the waterway produces fewer errors with synchronized cursor because the cursor tells the user exactly which region the waterway falls in. By comparison, finding a waterway in a region is more reliable with pinned animation because the animation gives the user a much better idea of the surrounding waterways that exist in the area. For example, when checking for one waterway, participants placed the pin in the center of the region, whereas when searching for more than one waterway they offset the pin towards the edges of the region.
These results are supported by our ANOVA. Our null hypothesis is that there is no significant difference in mean error frequency between the four techniques.
In our two-way within-subject ANOVA, we found significant main effects of Task, F(1,1904)=26.1, p<0.01, and Technique, F(3,1904)=5.7, p<0.01, on the frequency of error. We also found a significant interaction of Task and Technique, F(3,1904)=6.0, p<0.01. This allows us to reject the null hypothesis and conclude that Task, Technique, and the interaction of Task and Technique all have a significant effect on error frequency.
Additional post hoc testing with Tukey’s pairwise comparison revealed significant differences between the two tasks (p<0.01) and between all pairs of the four techniques (p<0.01), with two exceptions: the difference between Techniques 2 and 1 was significant only at a weaker level (p<0.03), and the difference between Techniques 4 and 3 was not significant (p=0.94).
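With four techniques there are six pairwise comparisons. The short Python sketch below enumerates them and attaches the significance levels reported above for error frequency. It only restates the reported values and does not perform the Tukey test itself; the function name pairwise_labels and the dictionary layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, Tuple

Pair = Tuple[int, int]

# Exceptions as reported in the text; every other pair is p<0.01.
REPORTED_EXCEPTIONS: Dict[Pair, str] = {
    (1, 2): "p<0.03",
    (3, 4): "p=0.94, not significant",
}


def pairwise_labels(
    levels: Iterable[int], exceptions: Dict[Pair, str], default: str
) -> Dict[Pair, str]:
    """Label every unordered pair of levels with its reported p-value."""
    return {
        pair: exceptions.get(pair, default)
        for pair in combinations(sorted(levels), 2)
    }


labels = pairwise_labels([1, 2, 3, 4], REPORTED_EXCEPTIONS, "p<0.01")
for pair, label in labels.items():
    print(f"Technique {pair[0]} vs {pair[1]}: {label}")
```

Listing the pairs this way makes it clear that four of the six comparisons reach p<0.01, one reaches only p<0.03, and one shows no significant difference.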
The experimental design for this study is a within-subjects design, where all participants are tested on every task and condition. This differs from the alternative between-subjects design, where individual subjects are tested on only a single condition or task. The most obvious effect this design has on our results is participant learning, where the participant becomes more familiar with the task the more he or she is exposed to it. This commonly shows up in the form of increased performance on later trials.
To compensate for the effects of learning, we conducted our trials in three successive super-blocks, each of which includes 10 trials of each condition, task, and technique. This is to minimize the learning effect by making sure that each block (other than perhaps the first) is conducted at roughly the same level of experience. The results of the participants’ aggregated performance over the three blocks are graphed in Figures 3 and 4.
Notably, we can see that both the error frequency and the required time dropped on the second block for all participants. This is an indication of the participants becoming more familiar with the task. Observations during the study indicated that most participants had begun to memorize the map at this point. Paradoxically, however, the participants’ performance actually decreased during the third block, with increases in both completion time and error frequency.
After examination, we were able to attribute this drop to fatigue. By the start of the third block, the participants had already completed 360 trials. In spite of breaks after every 30 trials, it is possible that the participants had not adequately recovered at this point to continue performing optimally. Further effects of learning and fatigue could be investigated in future studies where the turnaround between blocks is quicker.
We predicted at the beginning of the study that the synchronized cursor technique would be the fastest, and the pinned animation technique would be the most accurate. Through the qualitative and quantitative analysis of our experimental data, we discovered that while we correctly hypothesized the effectiveness of the side-by-side and smooth animation techniques, the difference in the nature of our tasks means that we only correctly predicted one task for the two remaining techniques. The accuracy of the synchronized cursor enables speedy and accurate selection of large targets, while the ballpark estimates allowed by the pinned animation method let users rapidly narrow their estimates to find the correct smaller targets. This discovery was supported by our statistical analysis, which indicated that the differences between tasks were significant, as were the differences between techniques. Through this study we conclude that it is difficult to find one best technique that can be applied to all tasks, because the differences between techniques make them suitable for different tasks.