Saturday, October 29, 2011

Deficiencies in the Internet Mass Media: Visualization of U.S. Election Results

1. Introduction
The Internet news media is an important source of information for many people in the information age.
Information visualization is not in widespread use in the news media, whether in print or on the web. One notable exception is the reporting of election results in the United States. Not only is visualization shown by the media during the coverage of the elections, it is also widely noted and studied by consumers of the mass media. The image of blue and red states after the 2000 Bush-Gore election and the 2004 Bush-Kerry election has significantly influenced the way Americans view politics and society in their country.
In the 2000 U.S. presidential election, for the first time, all major media outlets used blue to represent Democrats and red to represent Republicans, and the terms "blue states" and "red states" became ubiquitous in mainstream political discussions after the 2004 presidential election, indicating the tremendous power and influence of this visualization.
Another major reason for the popularity of election visualization on the Internet is the real-time coverage it provides. People cannot wait till morning to read the print news, which by that time is outdated by several hours. Because races in recent years have been decided by razor-thin margins, people want to get the latest results.
Visualization of election results is very challenging because of the complexity of the data. The data types that need to be visualized include: candidate (name, party affiliation, and state), electorate (party affiliation and vote), geographical distribution, balance of power, change in balance of power, and margin of victory.
The effectiveness of the visualization of election results is therefore very important, since it is the one visualization that people pay strong attention to. On one hand, rather than have users pore through tables of data, an effective visualization of the election results will quickly give users a clear understanding of the situation. On the other hand, a poorly designed visualization may add more confusion than clarity, and force users to turn to competitors' websites for information. Even worse, some visualizations may lead users to draw incorrect conclusions.
We conducted a study on the effectiveness of two of the most popular online visualizations of the November 2006 mid-term elections in the United States from Internet news sources. In these elections, the electorate chose governors, and members of the House and the Senate, in many but not all the states. We studied how easy or difficult it is for users to get information from the visualizations, and whether users were able to obtain accurate information. Furthermore, we created our own alternative visualizations, and tested whether they are more effective than the two Internet mass media visualizations we studied.
The results show that these visualizations are alarmingly misleading and difficult to interpret. A large proportion of users were unable to obtain basic and pertinent information from the visualizations. In contrast, users performed much better on the alternative visualizations that we designed. Such a severe deficiency in the Internet mass media is of grave concern because of the widespread dependency of the population on the Internet for checking election results, and the financial impact on the various players in this competitive business.

2. Related Works
There are multiple ways of mapping the information in the election results to visual properties. A good mapping is able to convey facts to the user. However, a poorly designed visualization may confuse the user, obscure data, or even cause the user to draw false conclusions about the data. As Van Wijk [5] points out, visualization is often subjective, and he warns of the danger that visualizations can cause the user to gain negative knowledge.
Sutcliffe et al. [4] performed user tests on an integrated visual thesaurus and results browser system.
They found that although users liked using the system, the system did not actually improve performance. Their work shows an actual instance of the limitations of visualization, as the study found that users were confused by a visual metaphor used in the system. In our study, we are also interested in investigating whether the visual metaphor used in election visualization leads to positive knowledge.
Plaisant [3] points out that information visualization techniques are being adopted in mainstream applications but users sometimes still struggle in using them, and stresses the need for studies to guide the more effective deployment of visualization functions. The goal of this paper is to study users' experience of using visualization in mainstream media, with the hope that it will lead to improvements.
The effectiveness of using red and blue states on the U.S. map to show election results has been studied. For example, Gastner et al. [2] show alternatives that distort the map based on population.

3. User Studies on Election Visualization
We conduct two between-subjects studies comparing the performance of users using Internet mass media election visualization with the alternative visualization that we designed. We selected webpages from CNN.com and MSNBC.com, since they are the top and third most popular online news sites (excluding the Weather Channel) according to a HitWise study in May 2005.
The subjects in our studies were all undergraduate students at Anonymous University majoring in Computer Science, ranging in age from 19 to 35. They had varying familiarity with the American political system. All subjects were asked whether they knew beforehand the number of seats in the U.S. Senate and which color represented which party, because this knowledge directly affects their ability to answer some of the questions. The same users participated in both studies.

4. Study I
In Study I, we investigate the effectiveness of Internet mass media visualizations in conveying information about the overall Senate election. There are a total of 100 seats in the U.S. Senate. Democrats needed to control 51 seats to get a majority, while Republicans needed to control only 50 seats (because the Vice-President, who is not a member of the Senate but can cast the deciding tie-breaker vote, was Republican at the time of the elections). Not all the seats were up for election in November 2006, because senators whose terms had not yet expired continued to serve out the rest of their terms. The preceding facts are potentially confusing to people unfamiliar with U.S. politics, and make the visualization challenging. There are a few major issues that people reading the election coverage are concerned about: (1) Who will gain control of the Senate? (2) How many seats switched hands between the parties? (3) Among the seats that were contested in November 2006, how many did each party win?
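The majority rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the single-letter party codes are hypothetical, and the sketch ignores complications such as Independents who caucus with one of the parties.

```python
def seats_needed(party, vp_party, total_seats=100):
    """Seats a party needs to control the chamber.

    The Vice-President's party wins a tie, so it needs only half the
    seats; the other party needs half plus one.
    """
    half = total_seats // 2
    return half if party == vp_party else half + 1


def senate_control(dem_seats, rep_seats, vp_party="R", total_seats=100):
    """Return "D" or "R" if control is settled, or None if it is still open."""
    if dem_seats >= seats_needed("D", vp_party, total_seats):
        return "D"
    if rep_seats >= seats_needed("R", vp_party, total_seats):
        return "R"
    return None  # not enough decided seats to call the chamber


# With a Republican Vice-President, a 50-50 split goes to the Republicans.
print(senate_control(50, 50))   # R
print(senate_control(49, 49))   # None
```

A reader who does not know the total number of seats cannot evaluate either threshold, which is why Question 1 of the study matters for answering Question 4.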

In this study, subjects were divided into three groups: 9 subjects in Group A, 9 in Group B, and 10 in Group C.
Group A viewed the display taken from MSNBC.com showing the Senate balance, Group B viewed a table from CNN.com of the same data, and Group C was shown a bar chart visualization that we created. These three different images are shown in Figures 1 through 3. Image A does not show how many of the seats were up for election, nor how many seats changed hands. Image C does not show how many seats changed hands. Only Image B has all the information. Our test questions are intended to measure how informative the visualizations are.
We asked each subject the following questions: 1. How many Senate seats are there altogether? 2. At this time, how many have been decided? 3. Among those decided, how many are in Democrats' hands? 4. At this time, is it conclusive who will have majority in the Senate? We selected these questions based on what we believe is the intention of the Internet visualizations tested in Study I; their intention is to convey to the user the status of the Senate race. These visualizations were meant to tell readers whether the Senate race has been decided, if so who won, and if not what the status is.






Figure 1: Study I: Visualization of overall Senate race
(Group A - MSNBC.com). Users were told to look at the
area marked "look here", which is intended to inform the
users of the status of the Senate race (Has anyone won? If
so, who won? If not, who is ahead, and by how much?).
Because the bar chart lacks vital information, users had
difficulty interpreting it.

Figure 2: Study I: Visualization of overall Senate race
(Group B - CNN.com). Users were told to look at the area
marked "look here", which is intended to inform the users
of the status of the Senate race (Has anyone won? If so, who
won? If not, who is ahead, and by how much?). Users had
difficulty locating relevant data to make conclusions about
the Senate race.

Figure 3: Study I: Visualization of overall Senate race
(Group C - Our alternate visualization). This visualization
is proposed as an alternate to the visualizations marked
"look here" in Figures 1 and 2. Users were better able to
obtain important information about the status of the Senate
race than users of those two Internet media visualizations.


4.1 Results
The average scores over all questions for Group A (MSNBC.com), Group B (CNN.com) and Group C (our alternative) were 0.7, 0.22 and 0.98 respectively. The average times taken were 160, 189 and 93 respectively. The Chi Square analysis results are shown in Table 1. Group C users performed significantly better than Group B on Q1, Q2 and Q3, and better than Group A on Q1. Group A users performed significantly better than Group B on Q2 and Q3.

Table 1: Chi Square Statistical Analysis of Study I. The threshold at p=0.01 is 6.63.

Users of the MSNBC.com (Figure 1 for Group A) visualization had difficulty answering Question 1: only 30% of them answered that question correctly. Otherwise, the users of the MSNBC.com visualization were able to accurately obtain the other important information asked in the other questions.
The CNN.com (Figure 2 for Group B) visualization fared much more poorly. Although all the information is present in the display, users were unable to understand it. 90% of the subjects failed to answer the first three questions correctly. In other words, they were unable to tell the total number of Senate seats, how many had been decided, and how many the Democrats had won. They did better on the last question, but even then only 60% of the subjects answered it correctly.
Every subject also answered a survey in which they were asked if they had known beforehand the total number of seats in the U.S. Senate. Among those who did not know the answer or got the wrong answer, none of the users of the MSNBC.com or CNN.com visualizations answered Question 1 correctly, while all the users of our alternative visualization got that question right. This result is expected for the MSNBC.com visualization because it failed to present this important piece of information. It is more alarming that users of CNN.com also failed to answer this question, because the information is actually available in the CNN.com visualization. This suggests that the CNN.com visualization is too confusing for users to obtain this basic information.
In contrast, the new alternative visualization that we came up with (Figure 3 for Group C) was much more effective. Every subject answered Questions 1 through 3 correctly, and only one subject did not answer Question 4 correctly. On average, subjects also answered the questions significantly faster than users of the other two visualizations, taking about two-thirds the time of MSNBC.com, and half the time of CNN.com.
Note that we did not test the number of seats that changed hands, which is shown in the CNN.com visualization, but not in the other two. The CNN.com visualization would probably have performed better on this question, but we did not test it, because we believed that this information is less crucial than the information covered by the questions that we asked.

4.2 Statistical Analysis
We performed the Chi Square Test of significance for each pair of groups for each of Questions 1 through 4.
The results are shown in Table 1. They show that users of our alternative visualization performed significantly better (p<0.01) than the users of the CNN.com visualization on Questions 1, 2 and 3, and significantly better than the users of MSNBC.com on Question 1. The users of MSNBC.com performed significantly better than the users of CNN.com on Questions 2 and 3. Performing ANOVA (Analysis of Variance) on the time taken by users of different groups to answer the questions, we found that the difference observed is statistically significant (p<0.001). Users of our alternative visualization took the shortest time to answer the questions, followed by MSNBC.com users, and CNN.com users were the slowest.
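For readers who want to see how a pairwise comparison of this kind works, the following is a minimal sketch of a Pearson Chi Square test on a 2x2 table (group by correct/incorrect), written with the Python standard library only. The function name is hypothetical, the counts in the usage lines are made-up illustrative numbers and not the study's data, and no continuity correction is applied; with samples this small, an exact test is often preferred in practice.

```python
import math


def chi_square_2x2(correct_a, wrong_a, correct_b, wrong_b):
    """Pearson Chi Square statistic and p-value for a 2x2 table.

    Rows are the two groups; columns are correct / incorrect answers.
    With one degree of freedom the p-value is erfc(sqrt(stat / 2)).
    """
    n = correct_a + wrong_a + correct_b + wrong_b
    row_a = correct_a + wrong_a
    row_b = correct_b + wrong_b
    col_correct = correct_a + correct_b
    col_wrong = wrong_a + wrong_b
    denom = row_a * row_b * col_correct * col_wrong
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    stat = n * (correct_a * wrong_b - wrong_a * correct_b) ** 2 / denom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only: 10 of 10 correct in one group,
# 1 of 9 correct in the other.
stat, p = chi_square_2x2(10, 0, 1, 8)
print(stat > 6.63, p < 0.01)
```

A statistic above 6.63 corresponds to p<0.01 at one degree of freedom, which is the threshold quoted in the caption of Table 1.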



Figure 4: Bar chart comparing the performance of subjects
in Study I. On average, users of our alternative
visualization performed the best and most quickly, while
users of the CNN.com visualization performed the worst
and took the longest time. ANOVA analysis shows that the
difference in time taken is statistically significant, at p =
0.001.


4.3 Problems with the MSNBC.com Visualization
The flaws of the MSNBC.com visualization (Figure 1 for Group A) are obvious. It does not show the total number of seats in the Senate. Therefore, users performed poorly on Question 1, which asks, “How many Senate seats are there altogether?” That is a crucial question because the total determines who will have control over the Senate, which is the question foremost in many readers' minds. The visualization only shows the balance, and a cryptic number “98.” It is not clear whether 98 represents the number of seats in the Senate or the number of seats that have been decided. Without knowing this, readers do not know the aggregate outcome of the elections, or whether the outcome is already known.

4.4 Problems with the CNN.com Visualization
The table shown in the CNN.com visualization actually contains a lot more information. However, the information is presented in a confusing and misleading manner. One column is labeled “Total.” It is unclear what “total” means; it could refer to (1) the total number of seats won by the party in this election, or (2) the total number of seats controlled by the party (including those not contested in this election). Furthermore, it is not clear how many seats are still undecided. For that, the reader would have to read the fine print above the table, and find out that actually 2 seats were still undecided at that time. Consequently, 90% of users answered Questions 1, 2 and 3 wrongly, which is an unacceptably poor result. Not only that, but users also had to try a lot harder to understand the table, taking twice as much time as those using our alternative visualization.

5. Study II
In Study II, we investigate the map of the United States showing which states voted for which party. This is an important visualization because it shows the geographic distribution of the support for each party. The color-coded map visualization used to depict the results of the 2000 presidential election was very influential in shaping the understanding of the U.S. socio-political landscape because it showed a clear divide between the “blue” states (those that voted for the Democratic candidate) on the west coast and in the northeast, and the “red” states (those that voted for the Republican candidate) in the rest of the country.


Figure 5: Study II Group A was shown this image:
Visualization of Senate election results from CNN.com.
A misleading and ambiguous legend led to some
misunderstanding and misinterpretation of results among
users.


In the November 2006 elections, the color-coded map was once again used to show the results of the elections. However, the information to be shown is much more complex. First, the terms of the Senate seats of some states were not over, and so those seats were not contested. Therefore, they had to be shown in a different color. Second, some Independent candidates won seats, whereas in the presidential elections no state voted for an Independent candidate. Consequently, another color had to be used to show Independent candidates. Third, the media would like to show which states switched parties; in other words, which states elected a candidate from a party different from the incumbent's. Finally, at the time of reporting, the results of some states were still unknown, and that had to be conveyed in the map.
The visualization provided by CNN.com is shown in Figure 5 and our alternative visualization is shown in Figure 6. Each subject was asked the following sequence of questions: 1. Was the Senate seat of Indiana (IN) contested? 2. If yes, which party won? 3. Was the Senate seat of Montana (MT) contested? 4. If yes, which party won? 5. Was the Senate seat of Ohio (OH) contested? 6. If yes, which party won? 7. Was the Senate seat of Oregon (OR) contested? 8. If yes, which party won? 9. Was the Senate seat of Connecticut (CT) contested? 10. If yes, which party won? 11. Was the Senate seat of Hawaii (HI) contested? 12. If yes, which party won? 13. Which states switched parties?

Figure 6: Study II Group B was shown this image: Our
alternative visualization of Senate election results. We use a
yellow star symbol to represent states which changed
parties, and we made the legend more informative.

The above questions all test basic information that users should be able to obtain from the figure. The states mentioned in the questions were all located for all the users so that the time taken by users unfamiliar with the geography of the United States to locate the states would not be a variable in the test. We selected the above six states to question the users because they represent the different categories: Republican (IN), Processing (MT), Democrat/party switch (OH), not contested (OR), Independent (CT) and Democrat (HI). Finally, since this visualization attempts to convey the information about which states switched parties, we test this in Question 13.
5.1 Results
The results of Study II are given in Figure 7. For each question, a wrong answer receives 0, and a correct answer receives 1. Questions 1 through 12 can only either be correct or wrong. For question 13, getting all the states correct receives 1, getting none of the states receives 0, while getting some of the states receives the corresponding fraction of the points.
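The partial-credit rule for Question 13 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the answer key of four states is taken only from the Republican-to-Democrat switches named in Section 5.3 and may not be the study's full key, and incorrectly named states are simply ignored because the text does not say whether they were penalized.

```python
def score_switch_question(answer, switched):
    """Fraction of the truly switched states that the subject named.

    Assumption: extra, incorrect states in the answer carry no penalty.
    """
    key = {state.strip().upper() for state in switched}
    if not key:
        raise ValueError("the answer key must contain at least one state")
    named = {state.strip().upper() for state in answer}
    return len(named & key) / len(key)


# Hypothetical answer key for illustration.
KEY = ["MO", "OH", "PA", "RI"]

print(score_switch_question(["OH", "PA"], KEY))              # 0.5
print(score_switch_question(["mo", "oh", "pa", "ri"], KEY))  # 1.0
print(score_switch_question([], KEY))                        # 0.0
```

Because this score is a fraction rather than a correct/incorrect outcome, it is treated as a numerical value in the statistical analysis.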
Users of our alternative visualization answered the questions more accurately than users of the CNN.com visualization, except for Question 2 (users of both visualizations performed equally) and Question 9 (users of CNN.com performed slightly better). Users of our alternative visualization also took a shorter time answering the questions.

5.2 Statistical Analysis
We performed the Chi Square Test of significance on Questions 1 through 12, since the answers are correct/incorrect and therefore nominal. We performed ANOVA on Question 13, since each participant's answer is given a numerical correctness score. We found that Questions 6 and 13 show significant differences (at the p<0.02 and p<0.01 levels respectively) in the performance of the users between the two groups.
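As a rough illustration of the ANOVA step, the sketch below computes the one-way F statistic for numerical scores from several groups using only the Python standard library. The function name and the sample scores are hypothetical and are not the study's data; converting F to a p-value requires the F distribution, which a statistics package would normally supply.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical Question 13 scores for two groups, for illustration only.
group_a = [0.25, 0.5, 0.5, 0.25, 0.75]
group_b = [1.0, 1.0, 0.75, 1.0, 1.0]
f_stat, df_b, df_w = one_way_anova_f([group_a, group_b])
print(round(f_stat, 2), df_b, df_w)
```

With only two groups, the one-way ANOVA is equivalent to a two-sample t-test with pooled variance (F equals t squared).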


Figure 7: Bar chart comparing the performance of subjects
in Study II. Users of our alternative visualization performed
significantly better on Questions 6 and 13. Although the
average time spent on our alternative visualization was less
than for CNN.com, the difference is not statistically
significant.

In Question 6, many users of the CNN.com map were confused about the bright blue color. They did not interpret the bright blue color to mean that Ohio was contested, and that the Democrats won it from the incumbent Republicans. As for Question 13, many users of CNN.com once again did not understand the bright blue and yellow colors to mean change of control.
Although the average time spent on our alternative visualization was less than for CNN.com, ANOVA shows that the difference is not statistically significant.

5.3 Problems with the CNN.com Visualization
The CNN.com visualization is confusing because of its lack of a clear legend and its poor choice of color scheme. In the legend, the Democrat (DEM) and Republican (GOP) colors are shown as a lighted blue and red sphere respectively. The problem with lighting the sphere is that the color is non-uniform, and therefore does not match with the actual color in the map. The next problem with the legend is that the symbol for states that are still ``processing'' the votes is shown as a saturated dark green circle with double circular arrows. However, this does not match with the actual color of the states shown on the map (MT and VA), which is a faded light green.
Furthermore, on the map of the actual states still processing the results, no symbol containing the double circular arrows is shown: the legend does not match the actual visual image used, which is a very serious error.
The decision of CNN.com to indicate party switch with brighter colors causes even more confusion. In this visualization, the states of MO, OH, PA and RI are shown in brighter blue because they are states that have switched from Republican to Democrat in these elections. However, this is confusing because this bright blue color matches with the color of “Voting” shown in the legend. This can mislead users into thinking that these states are actually still voting.
The results of our user study indicate that this choice of a misleading visual mapping indeed causes a significant problem. In Question 13, which asks which states switched parties, subjects using the CNN.com visualization scored 0.43 on average. On our alternative visualization, which indicates a party switch with a star symbol, users scored 0.98 on average. (Note: the star symbol was also used by other mainstream print media to show states that switched parties.)
There is also no entry in the legend for Independent candidates. VT and CT were both won by Independent candidates. Because CT was also a party switch, it is shown in a brighter color. CNN.com has chosen to use Yellow to be a brighter version of Beige. This is not an intuitive choice of colors, and it is not universally accepted that Yellow is a brighter version of Beige. Once again, this leads to confusion.
There is also no entry in the legend for the states whose Senate seats are not contested. On the map, they are shown in grey, which is a sensible choice. However, because of the poor legend, grey could represent many different things besides being uncontested; for example, it could mean “processing,” “too close to call,” or “voting.”

6. Questionnaire
All subjects filled out a questionnaire with questions that control for confounding variables in the tests. Subjects were asked if they knew beforehand what the total number of seats in the Senate was. If they claimed that they knew it, they were told to write down that number. This knowledge would enable a subject to answer Question 1 of Study I correctly regardless of the visualization used. The responses of the subjects were recorded and are reported in Section 4.1, where we report the results of Study I. Subjects were also tested on their familiarity with U.S. politics, for example, the conventional use of red to represent the Republicans and blue to represent the Democrats. All visualizations implicitly assume such knowledge. Subjects were also asked if they were affiliated in any way with CNN.com or MSNBC.com, since such affiliation may bias their answers. All subjects were sufficiently familiar with U.S. government to know about Republicans, Democrats and their respective colors, and none of the subjects were affiliated with the companies whose visualizations we tested.

7. Conclusions
Through our user studies, we conclude that the two examples we studied of Internet mass media visualizations of the 2006 elections were ineffective and misleading. We showed subjects screenshots of visualizations provided by MSNBC.com and CNN.com, two of the most widely used Internet news sites. The subjects performed poorly on many of the questions asking about basic information. We have provided a detailed analysis of the users' performance on each question and a discussion of why the users did not perform well. In summary, we found that users were unable to answer basic questions about the data because of (1) poor choice of visual cues (colors, symbols) in the visualization, (2) lack of a proper legend to explain the colors and symbols used, (3) misleading, ambiguous or wrong legends, and (4) lack of crucial information (such as the total number of seats contested). Users of the Internet visualizations took a longer time and were less accurate in their responses compared to users of our alternative visualization.
Our study thus also shows that it is possible for a well-designed visualization to convey the same information. This means that the failure of the Internet mass media visualizations is due to poor design, and not to the intrinsic complexity of the underlying data or any intrinsic limitations [1] of using a 2D display to present the data.
We believe that Internet visualization of election results is very important because many people depend on the Internet for real-time updates of the election results. Many people check the results frequently on the Internet while the various states report their returns and exit polls. Unlike most news stories, where readers prefer to read textual descriptions and view videos and images, for elections users like to look at tables, charts, maps and other visualizations to analyze the results. It is therefore crucial that the Internet mass media provide visualizations that are accurate, user-friendly, and clear. This makes the results of our study particularly alarming, because our study shows that users experience great difficulty in using the visualizations, often drawing wrong conclusions. Furthermore, we have also designed some simple alternative visualizations that are able to convey the same information much more clearly. We hope that the Internet mass media will improve on their election visualizations.

8. References
[1] C. Freitas, P. Luzzardi, R. Cava, M. Winckler, M. Pimenta, and L. Nedel. “Evaluating usability of information visualization techniques.” In Proceedings of the 5th Symposium on Human Factors in Computer Systems (IHC 2002), 2002.
[2] M. Gastner, C. Shalizi, and M. Newman. “Maps and cartograms of the 2004 US presidential election results.” Advances in Complex Systems, 8:117-123, 2005.
[3] C. Plaisant. “The challenge of information visualization evaluation.” In Proceedings of the Working Conference on Advanced Visual Interfaces (AVI’04), pages 109-116, 2004.
[4] A. Sutcliffe, M. Ennis, and J. Hu. “Evaluating the effectiveness of visual user interfaces for information retrieval.” International Journal of Human-Computer Studies, 53(5):741-763, 2000.
[5] J.J. van Wijk. “The value of visualization.” In Proceedings of IEEE Visualization (VIS ’05), pages 78-86, 2005.
