Visualization Blog

Ideas, Papers and Thoughts on the field of Visualization

Visualizing disease incidence in the context of socioeconomic factors

Disease incidence is usually connected to biological and lifestyle factors such as genetics, eating habits, exercise and so on. But are there socioeconomic factors that influence disease incidence as well? This TED talk from Bill Davenhall inspired us to explore socioeconomic factors that may influence disease incidence.

To explore the connections of socioeconomic factors such as education levels, regional population, income level of the area where you live and the air pollution in terms of toxic levels, we developed a visualization tool called DiseaseTrends. DiseaseTrends allows the exploration of possible correlations between those socioeconomic factors with diabetes prevalence and cancer incidence rates across counties throughout the United States. A user can interactively explore these factors at a county, regional (user defined cluster of counties), state or national level.

When a user explicitly selects a county, we display 5 similar counties based on their socioeconomic factors. The motivation behind this feature is to allow users to identify similar counties that may have varying disease incidence rates, which may in turn lead to further exploration.
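The post does not say how DiseaseTrends measures similarity, so the following is only a minimal sketch of one plausible approach: standardize each factor and rank the other counties by Euclidean distance. The function names, county labels and factor values are all hypothetical.

```python
import math

def standardize(rows):
    """Z-score each column so no single factor dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def similar_counties(factors, selected, k=5):
    """Return the k counties closest to `selected` in standardized factor space."""
    names = list(factors)
    scaled = dict(zip(names, standardize([factors[n] for n in names])))
    target = scaled[selected]
    others = [n for n in names if n != selected]
    return sorted(others, key=lambda n: math.dist(scaled[n], target))[:k]

# Hypothetical values: (education %, population, median income, toxicity index)
factors = {
    "A": (30.0, 100_000, 50_000, 2.0),
    "B": (31.0, 105_000, 51_000, 2.1),
    "C": (15.0, 900_000, 30_000, 8.0),
    "D": (29.0, 95_000, 49_000, 1.9),
}
print(similar_counties(factors, "A", k=2))
```

Standardizing first matters because population is measured in the hundreds of thousands while the other factors are small numbers; without it, population alone would decide which counties look similar.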

 

As mentioned above, a user can manually specify regions that cross state boundaries. A user-defined circular cluster can be specified by holding Ctrl on a PC (Cmd on a Mac) and then clicking and dragging. Here the user has specified four regions. The maximum prevalence (in red) and the minimum (in green) across the selected region are highlighted in the panels below.
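As a rough illustration of the circular selection, the sketch below assumes each county is represented by a single centroid in screen coordinates and is selected when that centroid falls inside the dragged circle. This is an assumption made for the example, not a description of how DiseaseTrends is implemented; the names and numbers are hypothetical.

```python
import math

def counties_in_circle(centroids, center, radius):
    """Names of counties whose centroid lies inside the dragged circle."""
    return [name for name, point in centroids.items()
            if math.dist(point, center) <= radius]

def extremes(prevalence, selected):
    """Return (county with the highest prevalence, county with the lowest)."""
    highest = max(selected, key=lambda name: prevalence[name])
    lowest = min(selected, key=lambda name: prevalence[name])
    return highest, lowest

# Hypothetical screen-space centroids and prevalence values (percent).
centroids = {"A": (0.0, 0.0), "B": (1.0, 1.0), "C": (5.0, 5.0)}
prevalence = {"A": 8.5, "B": 11.2, "C": 6.0}

region = counties_in_circle(centroids, center=(0.5, 0.5), radius=1.0)
print(region, extremes(prevalence, region))
```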

We use bullet graphs (pdf link) as introduced by Stephen Few to display the quartile distribution of a factor as well as the corresponding state and national average (faint and dark vertical bars).
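The numbers behind one such bullet graph could be computed roughly as follows. This is a sketch under stated assumptions: the national average is taken as the unweighted mean of all county values and the quartiles use the "inclusive" method, neither of which is confirmed by the post. The function name `bullet_summary` is hypothetical.

```python
from statistics import mean, quantiles

def bullet_summary(county_values, state_values):
    """Quartile cut points plus the state and national averages for one factor."""
    q1, median, q3 = quantiles(county_values, n=4, method="inclusive")
    return {
        "quartiles": (q1, median, q3),
        "state_average": mean(state_values),
        # Assumption: unweighted mean over all counties in the country.
        "national_average": mean(county_values),
    }

# Hypothetical factor values for all counties and for one state's counties.
summary = bullet_summary([1, 2, 3, 4, 5], [2, 4])
print(summary)
```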

Through this tool, we can easily see the now popular diabetes belt, as shown here

Diabetes belt

 

High incidence rates can also be seen in counties with Native American reservations, such as Navajo County, Sioux County, Rolette County and Big Horn County.

We would like to mention that DiseaseTrends does not imply any causation and can merely hint at possible associations. It is completely up to a researcher in the field of public policy / public health to further investigate the findings.

More details about DiseaseTrends can be found in our paper.

Visweek 2011 – Perception and Cognition for Visualization, Visual Data Analysis & Computer Graphics

I had mentioned in an earlier post that this tutorial promises to be an amazing one and so it was. Here is an outline of the workshop.

Bernice E. Rogowitz covered fundamentals in human perception and cognition, and discussed how they apply to visualization. She covered a huge array of topics, ranging from the pupil being partially responsible for our depth perception, all the way to color theory and how it relates directly to the biology of the human eye.

The presentation had a great flow, starting at a very high level to give everyone an idea of what questions they would be able to answer at the end. As the talk progressed, she covered the biology of the human eye in detail, and then moved on to the intersection of perceptual issues and computer science.

In the biological portion, we learned that there are five layers of cells in the retina, each responsible for different tasks. Much of the interesting stuff happens at the very beginning (photoreceptor distribution) and then further into the process at the ganglion cells. She went over how lateral inhibition is caused by the spatial distribution of the photoreceptors connected to a single ganglion cell, and how this is the reason for several of the optical illusions we perceive. She did a great job of explaining the connections between biology and perceptual issues.

Cultural differences were also addressed. Our eye movements are actually learned when we learn how to read. Cultures with different reading directions have substantially different eye movements.

The section on the Striate Cortex was especially interesting. This is the first time in the visual system that images from each eye are merged (the point where depth perception occurs). This section sends output to 60% of the brain! This is a huge amount, and makes the visual system incredibly important to the decision making process.

This tutorial had a huge quantity of useful information and was really well put together! She concluded with a great summary of four things to remember:

  1. There are different response rates for different stimuli, so ask how well you want to convey magnitude information.
  2. Color and luminance mechanisms have different spatial sensitivities.
  3. Certain visual information is perceived “pre-attentively” such as color.
  4. How the world is perceived depends on what the user is trying to accomplish.

These notes were transcribed by Lane Harrison (@laneharrison) and Drew Skau (@seeingstructure). They are both graduate students at UNCC. Thanks guys!! This is almost as good as being there.

Visweek 2011 – VAST Challenge

The IEEE VAST Challenge this year consisted of three mini challenges and one “grand challenge”. For the uninitiated, the challenge datasets are designed to reflect real-world analytical challenges. This year’s challenges involved microblog+geospatial data, cyber security data, and a corpus of plain-text news documents.

While there is no “winner”, the challenge entries are given awards based on their respective strengths. Participants are also given detailed feedback from both visualization experts and actual analysts who work with such data. This feedback is one of the most rewarding aspects of the VAST Challenge. As we all know, it’s hard to get access to analysts.

One highlight is a geo-text visualization approach from the Universität Stuttgart. ContentLens is a novel integration of tag-cloud methods and geo-visualization and interaction techniques. Perhaps if you ask nicely, they’ll give another demo.

Another interesting approach comes from Penn State, who leverage their GeoViz toolkit to analyze a large (10 GB+) set of cyber security data. In fact, the security data took center stage in a panel of security and/or visualization practitioners.

Some key points in the panel suggested a need for smart collaboration in large-scale security analysis. Analysts can often unnecessarily repeat work or miss threats that have already been addressed by other analysts in their organization. Maybe we need an “Amazon Recommendations” component for security analysis tools.

Finally, workshop participants heard about and discussed the possible future VAST Challenges. Can visual analytics handle a million-node network? Can this community design tools that enable consumers to make discoveries in the vast sea of data that surrounds them? Well, that’s up to you.

Link to detailed challenge descriptions: http://hcil.cs.umd.edu/localphp/hcil/vast11/index.php/taskdesc/index

These notes were transcribed by Lane Harrison (@laneharrison) and Drew Skau (@seeingstructure). They are both graduate students at UNCC. Thank you guys!!

Written by alark

October 25, 2011 at 9:00 am

Visweek 2011 – Telling stories with data

Workshop: Telling Stories With Data 

The Telling Stories With Data session was a continuation of last year’s session, largely inspired by the Edward Segel and Jeffrey Heer paper, “Narrative Visualization: Telling Stories with Data.”

Steve Drucker from Microsoft Research talked about his rich interactive narrative player work. The tool uses XML to set up a queue of keyframes, and it supports audio narration and multimedia embedding.

Wesley Willett talked about supporting ad-hoc storytelling in social media. He raised several technical issues, including the need to encode interactive visualization state in URLs. This would allow people to link to a specific state in the visualization, and share that state over social media.
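To make the idea of state encoding in URLs concrete, here is a minimal sketch of one way to do it: serialize the view state as JSON and carry it in a query parameter. The parameter name `state`, the example URL and the fields in the state dictionary are all assumptions made for illustration, not something presented in the talk.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

def state_to_url(base_url, state):
    """Serialize the view state as compact JSON in a query parameter."""
    payload = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return f"{base_url}?{urlencode({'state': payload})}"

def url_to_state(url):
    """Recover the view state from a shared link."""
    query = parse_qs(urlsplit(url).query)
    return json.loads(query["state"][0])

# Hypothetical view state: selected year, highlighted items and zoom level.
state = {"year": 2008, "selected": ["NC", "RI"], "zoom": 4}
link = state_to_url("https://example.org/vis", state)
print(link)
print(url_to_state(link) == state)
```

Sorting the keys means the same state always produces the same link, which helps when links are compared or cached. For large states, a short identifier that points to state stored on a server may be more practical than putting everything in the URL.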

Jerome Cukier talked about adding personal connections to visualizations to help audience engagement. He also discussed the importance of providing interaction that lets users feel like they are exploring the dataset and coming to their own conclusions. A significant takeaway quote from Jerome was “Trust is not really an exact science.”

Sunah Suh framed a discussion about the impact of culture on visualization and visualization on culture. She discussed the multiple forms of literacy beyond just natural language, addressing issues of statistical literacy as well as visual. She raised the point that visualization does not just reflect and rely on societal norms, it also reinforces them. One example she gave was the pink and blue color coding in Baby Name Voyager reinforcing a binary gender concept.

The takeaways from the first half of the session dealt with technical issues of platforms and tools to support narratives, as well as some of the social issues going on in narrative visualizations. Both of these are important issues as narrative visualization finds its place in culture. The necessary tools must be developed to help narrative visualization become ubiquitous, but also there are social issues to be addressed as people come to terms with the new media. 

These notes were transcribed by Lane Harrison (@laneharrison) and Drew Skau (@seeingstructure). They are both graduate students at UNCC. Thanks guys!! There were other excellent speakers who presented, but Drew and Lane were presenting in an adjoining session.

Written by alark

October 24, 2011 at 11:40 pm

Visweek 2011 is upon us!

The annual IEEE Visualization, IEEE Information Visualization and IEEE Visual Analytics Science and Technology conferences, together known as IEEE Visweek, will be held in Providence, RI from October 23rd to October 28th. The detailed conference program is spectacular and can be downloaded here.

Some of the new events this year are under the Professional’s Compass category. It includes a Blind date lunch (where one can meet a researcher they have never met and learn about each other’s research), Meet the Editors (where one can meet editors from the top graphics and visualization journals), Lunch with the Leaders session (an opportunity to meet famous researchers in the field) and Meet the faculty/postdoc candidates (especially geared towards individuals looking for a postdoctoral position or a faculty position). I think this is an excellent idea and hope that the event is a hit at the conference.

I am also eagerly looking forward to the two collocated symposia – IEEE Biological Data Visualization (popularly known as biovis) and IEEE LDAV (Large data analysis and visualization). Their excellent programs are out and I’d encourage you to take a look at them.

The tutorials this year look great and I am particularly looking forward to the tutorial on Perception and Cognition for Visualization, Visual Data Analysis and Computer Graphics by Bernice Rogowitz. Here is an outline for the tutorial that can be found on her website. She was one of the first people to recommend that people STOP using the rainbow color map.

The telling stories with data workshop too looks great and will be a continuation of the great tutorial held by the same group last year. I am eagerly looking forward to it.

Apart from this are the excellent papers that will be presented at the conference. I shall write another post about the ones I am particularly looking forward to. With so many exciting events going on, it almost seems like a crime to have all of them happening in the span of a few days.

I shall definitely be blogging about the event as much as I can. You can also follow me on Twitter, which will have more real-time updates than the blog, where I will distill a day’s worth of information into a post.

Let me know if you are going to be around and I’ll be happy to talk to you.

Written by alark

October 17, 2011 at 9:12 pm

Visualization for Fraud Detection

Data visualization is being used for detecting fraud, especially with respect to wire and credit card transactions. Work done at the Charlotte Visualization Center at UNC Charlotte provides some interesting insights into fraud detection. This work was conducted in collaboration with the Bank of America. In the following paper they highlight four visualization techniques that allow for fraud detection.

Scalable and Interactive Visual Analysis of Financial Wire Transactions for Fraud Detection, Remco Chang, Alvin Lee, Mohammad Ghoniem, Robert Kosara, William Ribarsky, Jing Yang, Evan Suma, Caroline Ziemkiewicz, Daniel Kern, Agus Sudjianto, Journal of Information Visualization (IVS).

Heatmap: A heatmap depicting the relationship between accounts and transactions.

Search by example: Find accounts with transactions/activity similar to the current account being monitored.

Strings and beads: A line graph based visualization that shows critical events as ‘beads’ on the graph. The use of a log scale for the y-axis is a neat idea and probably allows for improved exploration.

Keyword graph: A graph visualization showing keyword similarity.

This paper was based on previous work done by the same group titled Wirevis. I would encourage interested readers to read the original paper as well as the previous paper (Wirevis).
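The post does not describe how the paper's "search by example" technique computes similarity. As a minimal sketch of the general idea, the code below ranks accounts by cosine similarity between per-account activity vectors. Cosine similarity is a stand-in chosen for this example, and the account names and counts are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two activity vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search_by_example(activity, example, k=3):
    """Return the k accounts whose activity pattern is most like `example`."""
    target = activity[example]
    others = [account for account in activity if account != example]
    return sorted(others, key=lambda a: cosine(activity[a], target), reverse=True)[:k]

# Hypothetical transaction counts per keyword category for each account.
activity = {
    "acct-1": (10, 0, 2),
    "acct-2": (20, 0, 4),  # same pattern as acct-1 at twice the volume
    "acct-3": (0, 9, 0),
}
print(search_by_example(activity, "acct-1", k=1))
```

Cosine similarity compares the shape of the activity rather than its volume, so an account doing the same kind of thing at a larger scale still ranks as similar.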

Centrifuge Systems, a Virginia-based company, has developed data visualization software for fraud analysis.

  • It is a web-based solution that allows interactive exploration of data for fraud detection.
  • It can read a wide variety of file formats (such as Excel files and Access databases).
  • Allows interaction with visualizations such as node-link diagrams, bar charts etc.

You can check out a 10-minute video on their website at http://www.centrifugesystems.com/shadowbox/libraries/mediaplayer/Centrifuge-1.8-for-Banking-Fraud-Analysis.flv. As per the company website, it has been used to detect a fraud scheme in Bulgaria called the “Bulgarian Money Mule ring”. Seems like a step in the right direction. It would be interesting to see if they could save and share workspaces for collaborative exploration of data. With their web-based framework, investigators in different locations could immediately access and interact with the current state of the visualization.

Any other companies, products, research papers that you may have heard of that I missed?

Data Visualization Talks Online

Lately, I have been collecting links to videos of talks related to Data Visualization. I found multiple talks for some people and so have categorized them accordingly. I have also tried to provide some context to the individual/group.

I think the first TED talk by Hans Rosling (@hansrosling) got a lot of media attention and made people sit up and appreciate the power of ‘narrative visualization’. He almost makes it look like a sport, with himself in the role of commentator. The title on TED’s website for the talk is “the best stats you’ve ever seen”. I am not sure about that, but it is a very entertaining talk.

It was followed up by an interesting study by information visualization researchers George Robertson, Roland Fernandez, Danyel Fisher, Bongshin Lee and John Stasko in the Infovis 2008 paper titled “Effectiveness of Animation in Trend Visualization.” Here is an interesting excerpt from the abstract of the paper:

Results indicate that trend animation can be challenging to use even for presentations; while it is the fastest technique for presentation and participants find it enjoyable and exciting, it does lead to many participant errors. Animation is the least effective form for analysis; both static depictions of trends are significantly faster than animation, and the small multiples display is more accurate.

Hans Rosling was back again at TED 2007 with “New Insights on poverty” and has spoken many times since at TED.

Fernanda Viégas and Martin Wattenberg (@wattenberg) (previously at IBM Research) have brought visualization to the masses through IBM Many Eyes. They have recently started a new venture called FlowingMedia. Here are some links to their talks:

Manuel Lima (@mslima) of visualcomplexity.com gave an interesting talk at Made by Many. His talk titled Network Visualization in an Age of Interconnectedness was not only an excellent talk, but ended up starting quite a passionate debate which led to Manuel writing a post titled Information Visualization Manifesto. I urge you to read the post and look at the interesting perspectives that infovis experts had on Manuel’s manifesto. Manuel gave another interesting talk at Creativity and Technology (CaT) 2009: Information Visualization.

Aaron Koblin (@aaronkoblin) has been involved with creating innovative and evocative data visualization pieces such as the New York Talk Exchange, Radiohead’s House of Cards music video (You can see Aaron in the “Making of House of Cards” video), the very entertaining ‘Bicycle built for 2000‘ project and many others.

Making of House of Cards

Links to a couple of Aaron’s talks are below:

Tom Wujec is a fellow at Autodesk. His talk on 3 ways the brain creates meaning provides an amazing insight into our brain. He addresses issues related to why data visualization works and how the brain visualizes data.

Jeff Heer has developed information visualization tools that can be used by developers around the world for creating interactive visualizations of their own data. He is the author of Prefuse, Flare (check out the excellent demos) and most recently, Protovis (many great examples online). Lately, he has published an informative article in the ACM Queue titled A tour through the visualization zoo – Jeffrey Heer, Michael Bostock, Vadim Ogievetsky. He does a great job interviewing Fernanda Viégas and Martin Wattenberg in the ACM Queue. A talk by him at the Stanford HCI seminar can be found here (html link, wmv).

Alex Lundry (@alexlundry) presents a very interesting point of view in his talk – “How visualization changes everything“.

Nicholas Christakis presents a fascinating talk in which he uses social data visualization to explore the influence of social networks – “The hidden influence of social networks.” In his talk he says that the spread of obesity is due to your social network. Smoking and even divorce can be linked to the company you keep.

Sebastian Wernicke presents a lighthearted talk in which he analyzes TED talks and presents some statistics based on the analysis – Lies, damned lies and statistics (about TEDTalks).

Please let me know if I have missed any interesting data visualization talks that are available online and I will be happy to update the post.

Written by alark

July 1, 2010 at 2:34 pm

Interactive visualization at your fingertips

With today’s release of Tableau Public, Tableau Software has opened up infinite possibilities for researchers, corporations and enthusiasts alike to interact, explore and play with their data. More importantly, with Tableau Public one can now have ‘interactive’ visualizations online as opposed to static images. This is a step in the right direction for Data Visualization software, since increasingly one hears from domain experts who want to ‘use’ software and not have to write programs (however small or easy those programs may seem to the developer of the software). Tableau now allows researchers to explore their data and collaborate more effectively instead of having to share static ‘screenshots’ via email.

Such uses of visualization software have already been explored and shown to be hugely successful by the ManyEyes team in their CHI ’08 paper, but the capabilities and strengths of the two products are in somewhat disjoint areas. For example, Tableau focuses on the Business Intelligence community and lacks certain visualizations such as Treemaps or Text visualizations (which ManyEyes seems to do really well). Other interesting and inspiring uses of Tableau Public can be found in their Gallery at http://www.tableausoftware.com/public/gallery. Don’t forget to check out the NYC Graffiti workbook that they have online. Detailed training videos can be found at http://www.tableausoftware.com/public/training

ManyEyes – Readers of this blog already know my fondness for IBM’s Many Eyes. ManyEyes has been a pioneer in this field of online visualization software that facilitates data visualization without the need for programming. Research papers from the Many Eyes team detailing user interactions and unexpected uses of the visualization software can be found at http://bewitched.com/manyeyes.html

Verifiable is another such website that allows online visualization of data. So far the data visualizations that are possible are limited to bar charts, scatter plots and line charts but the trend is definitely promising and I hope they continue to improve the excellent service. A video can be found online at http://verifiable.com/screencast

Swivel is similar to Verifiable, in that one can upload data and create online interactive visualizations. Videos for all the features in Swivel can be found at http://www.swivel.com/features. Unfortunately, they offer only a 15-day free trial, which restricts the widespread use of their tools.

As I interact with experts and students from domains as wide as political science, biology, economics and so on, I am pleased to see how aware they are of effective visualization, but I am sometimes disheartened to have to tell them to learn programming in order to use some of our nifty tools. Tableau Public, IBM Many Eyes and others are exceptional in the service that they provide. I envision more research groups, corporate websites and so on posting interactive visualizations with a ‘Powered by Tableau’ icon or something similar in the bottom right corner.

Redesigned Visualizations

Lately, we have been seeing a high number of ‘bad’ visualizations in the media. Over at Infosthetics, they even had a contest to identify the ‘Most Ugly and Useless Infographic’. It was worth a few chuckles but it definitely made one realize the importance of effective data visualization. It is unfortunate that some people have to make decisions based on such visual representations.

Beyond just looking at bad visual representations, there seems to be an increasingly constructive trend of redesigning graphs/visualizations that get very popular in the media. I wholeheartedly support this endeavor and hope to see more. It is naturally easy to criticize other visualizations, but redesigning them to ‘put your visualization where your mouth is’ takes courage. Here are a few examples.

Original Visualization – Infamous Fox News Pie chart
Redesigned Visualization – Jon Peltier’s take on the same data
Original Visualization – Health Care Reform Chart
Redesigned Health Care Visualization
Original Visualization – GE’s The cost of getting sick
Robert Kosara’s Redesigned Visualization
Original Visualization – National Geographic’s The Cost of Care
Jon Peltier’s Redesigned Visualization

If you have seen any other interesting visualization critiques, please send them my way and I shall be happy to update this post.

Big car, Small car – Which one is safer?

As a new parent, I have always felt guilty about driving a compact car when everyone around me keeps telling me that even though SUVs are bad for the environment they are so much safer in case of an accident. I cringed a bit at every such discussion but thought that maybe they had a point.

But then I thought why not use data visualization to get to the bottom of this and find out what the truth is. Let me preface this by saying that this is my first attempt at visualizing the data I could find for free and any visualization suggestions or data sources that you are aware of will be greatly appreciated.
[Note: No fancy visualizations here 🙂 Only good old bar graphs]

Step 1 – Type of the car vs Fatalities

I first wanted to find out the breakdown of car crashes by type of car. I found that there is extensive data (see data sources below) about car crashes and fatalities. I decided to use fatalities as a measure of how ‘safe’ the car is and so this graph shows the type of car as compared to the fatalities in 2008. I was sad to see that ‘Passenger cars’ were ranked first but happy to see that ‘Light trucks’ were pretty high up too. Minivans, Compact utility and Large Utility vehicles had far fewer fatalities and I was worrying whether my worst fears (SUV/Minivan = safer) were coming true.

Step 2 – Sales for each type of car

But then I thought that the number of accidents obviously is very dependent on the number of cars that get sold per year and if more passenger cars were getting sold, then more of them would be in a fatal accident thus giving it a higher number. So I found out what the car sale numbers were for 2008 (see data source below) and decided to plot that.

Step 3 – Comparing the Fatalities/Sales ratio

Then the next obvious thing to do was to compute the ratio of the number of fatal accidents per type of car to the number of cars sold for that type in a year. On computing the ratio, I found something very interesting. Sorting the graph based on this ratio, I found that Compact Utility vehicles had the highest ratio of fatal accidents to sales. If you look at the first graph, you will see that the compact utility vehicles do not have a large number of fatal accidents to begin with, but then when that number is divided by the total number of compact utility vehicles sold, we find an interesting insight (much to my relief and joy).

Passenger cars have a lower ratio than Compact utility vehicles, Large utility vehicles and Light trucks. 🙂
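For readers who want to repeat the calculation outside Tableau, the ratio and sort from Step 3 can be sketched in a few lines. The vehicle types and counts below are placeholders made up for illustration, not the FARS or WSJ figures, and the function name `fatality_ratios` is hypothetical.

```python
def fatality_ratios(fatalities, sales):
    """Fatal accidents per vehicle sold, sorted from highest to lowest ratio."""
    ratios = {kind: fatalities[kind] / sales[kind] for kind in fatalities}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)

# Placeholder counts for illustration only; substitute the real 2008 numbers.
fatalities = {"Type X": 100, "Type Y": 400}
sales = {"Type X": 1_000, "Type Y": 10_000}

for kind, ratio in fatality_ratios(fatalities, sales):
    print(f"{kind}: {ratio:.3f}")
```

In this made-up example, Type Y has four times as many fatal accidents as Type X but ten times the sales, so it ends up with the lower ratio. That is the same effect described above: raw fatality counts and fatalities per car sold can rank vehicle types differently.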

Anyone who has used Tableau has probably already guessed that all these visualizations were created using Tableau Software and so I visualized the Ratio, Fatal Accidents, Sales all in one image. It shows clearly how compact utility vehicles have a high ratio even though trucks and passenger cars have higher fatalities and more cars of those types were sold.

My current data sources are (Please let me know if you are aware of better ones):

Fatality analysis reporting system – http://www-fars.nhtsa.dot.gov/States/StatesCrashesAndAllVictims.aspx

WSJ – Car sales for the year so far – http://online.wsj.com/mdc/public/page/2_3022-autosales.html