Inform Visualization CS 7450
This 0 page Class Notes was uploaded by Alayna Veum on Monday November 2, 2015. The Class Notes belongs to CS 7450 at Georgia Institute of Technology - Main Campus taught by Staff in Fall. Since its upload, it has received 17 views. For similar materials see /class/234149/cs-7450-georgia-institute-of-technology-main-campus in ComputerScienence at Georgia Institute of Technology - Main Campus.
Date Created: 11/02/15
Rebecca Shillingburg CS 7450 Information Visualization Spring 2005

Multivariate Visualization Systems

Goals of Visualization Tool Analysis

The goal of visualization tools should be to enable the user to comprehend data in new and different ways. A key property is the organization of the data so that the user can see things that would otherwise not be obvious. The usefulness of a tool depends upon the user's task. A good visualization tool ideally would aid as many different tasks as possible, but for some problems specialized views may be preferable. Some of the more common tasks are searching, finding patterns or trends, and solving problems.

Design decisions for visualizations must take into account what the users want to accomplish when viewing the data. For example, if the goal is calculating sums and averages, then a simple spreadsheet may be adequate. However, if the goal is finding trends in consumer buying, a user needs a tool that enables finding patterns and correlations within the data set. The power of a visualization tool lies not in the ability to present colorful 3D displays but in the ability to help users perform tasks specific to their data set. To evaluate these tools, I looked at each tool's ability to help the user become familiar with the domain, discover unknown things within the data, identify patterns and trends, and make decisions.

The Data

To critique the visualization systems, I examined the mutual funds and cereal data sets using the four commercial tools provided. After looking at the attributes available for each case, I formulated the following questions to help me analyze the tools.

Mutual Funds

As a user, I would want to determine a strategy for long-term investment. Looking at the available data, I formulated the following questions:

What funds were rated 5 stars by Morning Star?
What were the funds with the highest yield?
What were the funds with the lowest expense ratio?
What funds gave the highest return relative to expense?
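
Each of the mutual fund questions above corresponds to a simple filter or sort over the records. The following is only a minimal sketch of that correspondence: the field names (rating, yield_pct, expense_ratio, return_pct) and the four placeholder funds are assumptions made up for illustration and do not come from the original data set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fund:
    # All field names are hypothetical; the real data set may use others.
    name: str
    rating: Optional[int]   # star rating, or None when the fund is unrated
    yield_pct: float
    expense_ratio: float
    return_pct: float


# Made-up placeholder records, for illustration only.
FUNDS = [
    Fund("Fund A", 5, 2.0, 0.5, 10.0),
    Fund("Fund B", 4, 3.0, 1.0, 12.0),
    Fund("Fund C", 5, 1.0, 0.25, 8.0),
    Fund("Fund D", None, 4.0, 2.0, 6.0),
]


def five_star(funds):
    """Names of funds whose rating is exactly 5 (unrated funds are skipped)."""
    return [f.name for f in funds if f.rating == 5]


def highest_yield(funds, n=1):
    """Names of the n funds with the largest yield."""
    return [f.name for f in sorted(funds, key=lambda f: f.yield_pct, reverse=True)[:n]]


def lowest_expense(funds, n=1):
    """Names of the n funds with the smallest expense ratio."""
    return [f.name for f in sorted(funds, key=lambda f: f.expense_ratio)[:n]]


def best_return_per_expense(funds, n=1):
    """Names of the n funds with the largest return divided by expense ratio."""
    ranked = sorted(funds, key=lambda f: f.return_pct / f.expense_ratio, reverse=True)
    return [f.name for f in ranked[:n]]
```

With the placeholder records, five_star(FUNDS) keeps Fund A and Fund C and drops the unrated Fund D. The last question is the only one that needs a derived value (return divided by expense), which is the kind of computed variable a tool has to support for that question to be answerable.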

Cereals

As a user, I would want to know which cereals were healthy. For some this might mean high protein or low carbohydrates. I decided to look at the levels of fiber, sugar, and sodium, since these three things play an important role in current popular diet trends.

Which cereals had high fiber?
Which high fiber cereals were low in sugar and low in sodium?
Which cereals ranked high in all three factors?

As I worked with the data, I realized that my strategy for both data sets was finding one or more items that fit a specific goal, either an investment goal or a health goal. To expand my experience with the tools' capabilities, I decided to look at the grocery survey to see what could be found just by exploring the data. The questions I used for this browsing activity involved looking at one type of occupation to find patterns of behavior. To narrow the focus, I picked the military as the occupation and examined that data. Here my questions reflect looking for trends rather than specific items.

How do people in the military pay when they shop?
Does the level of income impact how they pay?

Commercial Visualization Tools

Four commercial tools were provided for analysis: Spotfire Pro from Spotfire, SeeIt from Visual Insights, InfoZoom by Siemens, and Eureka from Inxight. These tools provided different types of visualizations, some more dynamic than others.

Spotfire

Spotfire follows Shneiderman's mantra of "Overview first, then zoom and filter, then details on demand," making it a good tool for dynamic queries. Query parameters are set with controls that are easy to adjust, which makes the query flexible. If too many data points disappear from the visualization, the user can adjust the query by changing the parameters for any of the attributes of the cases. This makes it easy to find near matches when nothing in the data set is an exact match.

Opening a data set in Spotfire creates a colorful 2D scatter plot which is visually
appealing. A large data set can be overwhelming, but the controls in Spotfire are user friendly. Variables on both axes are easy to adjust, and other controls allow filtering of the data. Clicking on an individual item provides details on demand while retaining the overview.

Spotfire is designed so that users can take control easily: they can adjust the sliders or select categories to manipulate the data in many ways. The controls are initially on the side of the view but can be moved or resized so that the user can maximize the screen area that works best for a particular data set. Multiple views can be opened on the screen, allowing the user to compare different visualizations. This flexibility enables the user to customize the tool for different tasks and data sets. New variables can be created using simple expressions, which increases the exploration capabilities of Spotfire.

Spotfire supports brushing: changes in filters or data highlighted in one visualization are reflected in the others. The sliders allow the user to incrementally adjust variables and see immediate results of any adjustments. One type of filtering can be achieved by sliding the ends of the sliders, though this method is limited to adjustments by the size of the value. Users immediately see changes in the visualization as data points are added or removed based on adjustments to the controls.

The data controls showed a surprising amount of data. Extreme points can be seen on the controls, which helped in making decisions about how to change the control. Color was used to show information about the records, such as those marked, deselected, empty, and visible. The impact of adjusting variables other than those on the axes can also be seen in the display. If an adjustment makes too large a change, it is easy to slide it back. Also, new variables can be created, and each is given a slider so that it can be adjusted. This helps
the user explore the data without losing what has already been discovered.

When the user clicks on a specific data case, the details are presented in the lower right box beneath the controls. Here I discovered that AllBran had the highest fiber with the lowest sugar. In the mutual fund data I discovered that Janus Mercury is a fund that is ranked 5 stars by Morning Star and has one of the highest returns in the last 3 months. However, this finding could have been misleading, because creating a new view automatically shifts the variables on the axes without a corresponding shift in the adjustments to other variables. So the main discovery here was to always watch the variables on the axes. A bolder font would help prevent new users from making mistakes. Also, the grey data points show the funds without a Morning Star rating, and there was no option to filter out "no rating."

[Figure: screenshot, not recoverable from the source]

[...] a view actually changing what you have on the screen. In the View Tip you can see a scatter plot of [...] all different options, because the current pair is still on the screen. The user has [...] to look carefully [...]. A better [...] grouped by the ordering of the first variable. Here, using the View Tip, I discovered that the higher the fiber in the cereal, the lower the carbohydrates [...] cereal [...] have higher levels and cluster toward the middle or higher end [...]. [...] careful not to assume that the current variables on other views will be in the new view [...] appear to be the same. Here further experience with the tool would enable the user to [...] pairing [...].

Overall, Spotfire would be useful for both search and discovery tasks. The scatter plots [...] the View Tip tool. The ability to easily manipulate the data with the controls aided
discovery. The pie chart was helpful in the grocery survey data, showing a breakdown of categories.

SeeIt

SeeIt provides a powerful visualization of the data in a 3-D landscape format. The details [...] data case. The [...] along each axis so that the user can easily see what each bucket holds. The buckets can be ordered, collapsed, and split. One or more bars in the bucket can be selected [...], giving the user the option to experiment with different ways to view the same presentation.

While 3-D visualizations can be powerful, the data can be hard to interpret due to occlusion and to the difficulty of correlating the individual data cases with the buckets. To overcome the [...] well as [...]. The visualization can be turned and panned, and the user can zoom in on specific areas. The data can even be viewed from above, giving a 2-D view [...] which can be highlighted, as well as rows, which changes the color of the background for those rows [...] another view. The bounds of the filter could be adjusted for different variables from either end, or by using a slider to see the result.

[Figure: screenshot, not recoverable from the source]

The water level also added a way to view filtered data in the context of the overview. This provides the user with a way to interpret the differences in height in the 3-D format [...] walls. The projections could be as simple as the count or could be simple calculations such as the average.

While the documentation available was not as detailed as for other tools, the SeeIt documentation did provide the overview and details in one picture. This helps the user connect specific actions with what changes will be made in the visualization. Presumably more detailed documentation was available in [...].

In the cereal data, clicking a specific bar showed the range of calories and of fiber. The height, which here was the count, is also displayed. Then, below the summary data, the specific data points are listed with the number of calories and grams of fiber. These data points are brushed in the corresponding spreadsheet
presentation. Here the user can see that Cheerios, one of the most common snacks for babies learning to eat finger food, has 120 calories and no fiber, perhaps a good thing for babies. But then, is that really the data for Cheerios? Further examination of the data shows that the items listed in the popup box are labeled sequentially. This at first appears to correspond to the data in the spreadsheet, since so many of the upper data points are highlighted. However, data point 2 is not highlighted. This leaves a question as to whether the popup box is misleading in how the data points are listed. There are 32 in total, and they appear to be numbered from 1 to 32. These numbers do not correspond with the numbers in the spreadsheet and therefore could cause unnecessary confusion.

[Figure: screenshot, not recoverable from the source]

The use of the highlighted rows in the financial data helped to group the data by placing visual boundaries to see where most of the data fell in the visualization. Data had already been filtered to Morning Star's 5 star mutual funds. The visualizations show that funds with low expense ratios have also had lower yields. This raises questions about what the expense ratio comprises and whether this reflects the recent volatile market. This would change the task from searching for specific data cases to an exploration of information that is included in the data set.

[Figure: screenshot, not recoverable from the source]

In the grocery survey, categories could be selected for further examination, but I was not able to do a comparison of two subcategories. This may have been due to time constraints.

[Figure: screenshot, not recoverable from the source]

Overall, SeeIt is a good visualization tool. Looking at individual data points was not as easy as with Spotfire. SeeIt may be better suited to some data and tasks than others. It was helpful in categorizing data with the buckets but not as strong for locating specific cases.

Eureka

Eureka appears to be designed to explore
data patterns and correlations between different variables. Because the data are shown as bars, patterns can be seen by sorting on different attributes of the data cases. Eureka allows the user to focus on specific data while keeping it in the context of the big picture.

Clicking a specific column sorts the data by the values in that column, and the order can be switched between ascending and descending. This allows a user to visually determine whether there are obvious correlations between different attributes of the data.

The use of the table lens allows further exploration of specific data points. Clicking on a specific data point or group of points displays the data. These data points stay in the table lens as the data is sorted using different attributes. This highlighting allows the user to see how a subset of data falls in different orderings. For example, the top ten items in one sorting can be selected and then viewed in another sorting.

When a variable is moved to the left edge of the table, Eureka groups or clusters the data in the main table by that variable. Multiple variables can be placed to the left, creating subtables based on these attributes, with the order maintained in the sorting. To handle the occlusion, the user can use span control to expand sections so that the data can be seen in these subtables.

Clustering first on fiber, by moving it to the left edge, enabled the selection of cereals with more than 5 grams of fiber. Focusing on these cereals revealed details in all columns, helping determine what other factors should be examined. While in other tools the pairing used had been fiber and sugar, here the visualization provided other alternatives that merited exploration. A sort on sodium revealed that not all fiber is equal in that category: AllBran, which has the highest fiber, also has the highest sodium.

Eureka works well
when the data set contains obvious categories for sorting. The grocery survey contained several categories, such as occupation and purchase type, which could be pulled to the left edge, allowing the user to sort other attributes within these categories and subcategories. By using the span controls, data can be filtered to view a single occupation, as is shown below. Further filtering can be achieved by adjusting to control other [...].

The table lens capability is helpful for zooming in on specific data. This feature allows the user to categorize and filter to see details that fit specific criteria. The financial data could be sorted by the Morning Star rating and then by investment category. After sorting by yield, items that had a yield greater than 10 could then be viewed in detail, sorted on the expense ratio. Eureka's ability to categorize, organize, and filter allows a user to zoom in on specific data within the context of the whole data set or without it.

Eureka makes it possible to look for obvious patterns and trends, but some data could follow the same trends and look related, and yet have no relationship. One example is time and prices: both continue to go up, but time is not a good indicator of prices. However, a sophisticated user would know not to just look for visually similar patterns and would go beyond the visual. Eureka would not be the best tool for all data sets given these limitations.

InfoZoom

InfoZoom appears to be a spreadsheet designed to hold a variety of data types. Initially it does not have the instant appeal of the other 3 tools, and it appears to be suited more for databases that include a visual attribute such as pictures. The query tool would be helpful for this type of data set but requires user knowledge of how to formulate the queries. The race car driver data set in the tutorial appears to be the type of data set that would work well for searches based on visual recognition of drivers. The
documentation provided shows what can be done but not how. Without the class lecture on these tools, the user would have tossed this tool in the recycle bin.

Clicking on the Morning Star ratings and double clicking on a 5 star value filtered the data so that the 5 star rated funds could be viewed. Double clicking selected values of a second attribute filtered the data again, which then allowed the user to sort by another value. This led to the discovery of Gabelli EquityIncome, which has a 5 star rating by Morning Star, is among the highest yields, and has a low expense ratio. While I was unimpressed with this tool, it did provide a combination of filtering and sorting capabilities that was helpful.

[Figure: InfoZoom screenshot of the mutual fund data; details not recoverable from the source]

By creating a new analysis group, users can make comparisons between groups. Using the maximum values for fiber, the group showed its components, revealing that AllBran with Extra Fiber had the lowest sugar and relatively low sodium.

In the overview mode, the user could select one data point and zoom in, which enlarged the view of the selected data points. The user could also exclude all data points with that attribute value. By continuing this process, the user could drill down through layers of data, making this a helpful tool for data mining and other query activities. Also, zooming in on an age range provided
a way to filter for further examination of spending patterns for a particular group. Further drilling could reveal spending differences or trends. By continuing to select various attributes, the data set could be culled down to the items that fit the specifics the users defined. By selecting occupations, then payment method, then family size, patterns could presumably be found in larger data sets. The multiple ways to drill down can be seen in the next two screen shots. In the first, females in the military with a family size of three tended to use store cards, a helpful discovery for marketing.
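
The drill-down described above, selecting one attribute value after another or excluding a value, amounts to a chain of filters followed by a count. The following is only a rough sketch of that idea, not of how InfoZoom works internally: the attribute names and the five survey rows are made up for illustration and are not the actual grocery survey data.

```python
from collections import Counter

# Made-up rows with hypothetical attribute names, for illustration only.
ROWS = [
    {"occupation": "military", "gender": "F", "family_size": 3, "payment": "store card"},
    {"occupation": "military", "gender": "F", "family_size": 3, "payment": "store card"},
    {"occupation": "military", "gender": "F", "family_size": 3, "payment": "cash"},
    {"occupation": "military", "gender": "M", "family_size": 2, "payment": "check"},
    {"occupation": "teacher", "gender": "F", "family_size": 3, "payment": "cash"},
]


def drill_down(rows, **selected):
    """Keep only the rows that match every selected attribute value."""
    return [r for r in rows if all(r.get(k) == v for k, v in selected.items())]


def exclude(rows, attribute, value):
    """Drop every row whose attribute equals the given value."""
    return [r for r in rows if r.get(attribute) != value]


def payment_counts(rows):
    """Count how often each payment method occurs in the remaining rows."""
    return Counter(r["payment"] for r in rows)


# Drill down one layer at a time, as a user would by double clicking values.
military = drill_down(ROWS, occupation="military")
military_women_of_three = drill_down(military, gender="F", family_size=3)
counts = payment_counts(military_women_of_three)
```

Each step works on the result of the previous one, so the order of selection (occupation, then gender and family size) can be changed freely without changing the final subset.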