Big car, Small car – Which one is safer?

As a new parent, I have always felt guilty about driving a compact car when everyone around me keeps telling me that, even though SUVs are bad for the environment, they are so much safer in case of an accident. I cringed a bit at every such discussion but thought that maybe they had a point.

But then I thought, why not use data visualization to get to the bottom of this and find out the truth? Let me preface this by saying that this is my first attempt at visualizing data I could find for free, and any visualization suggestions or pointers to data sources that you are aware of will be greatly appreciated.
[Note: No fancy visualizations here :) Only good old bar graphs]

Step 1 – Type of the car vs Fatalities

I first wanted to find out how car crashes break down by type of car. I found that there is extensive data (see data sources below) about car crashes and fatalities. I decided to use fatalities as a measure of how ‘safe’ a car is, so this graph shows fatalities in 2008 by type of car. I was sad to see that ‘Passenger cars’ ranked first but happy to see that ‘Light trucks’ were pretty high up too. Minivans, Compact utility and Large utility vehicles had far fewer fatalities, and I started to worry that my worst fears (SUV/Minivan = safer) were coming true.

Step 2 – Sales for each type of car

But then I thought that the number of accidents obviously depends heavily on the number of cars sold per year: if more passenger cars were being sold, then more of them would be involved in fatal accidents, giving that type a higher count. So I found the car sales numbers for 2008 (see data source below) and decided to plot them.

Step 3 – Comparing the Fatalities/Sales ratio

The next obvious thing to do was to compute, for each type of car, the ratio of the number of fatal accidents to the number of cars of that type sold in a year. On computing the ratio, I found something very interesting. Sorting the graph by this ratio, I found that Compact utility vehicles had the highest ratio of fatal accidents to sales. If you look at the first graph, you will see that Compact utility vehicles do not have a large number of fatal accidents to begin with, but when that number is divided by the total number of Compact utility vehicles sold, we find an interesting insight (much to my relief and joy).

Passenger cars have a lower ratio than Compact utility vehicles, Large utility vehicles and Light trucks. :)
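For anyone who wants to repeat the ratio step outside Tableau, here is a minimal sketch in Python. The function name fatality_ratios, the labels "Type A" to "Type C" and all of the numbers are made up for illustration; they are not the 2008 figures from my data sources. The sketch only shows the mechanics: divide fatalities by sales for each type, then sort from highest to lowest ratio.

```python
def fatality_ratios(fatalities, sales):
    """Return (vehicle_type, fatalities / sales) pairs, highest ratio first."""
    ratios = {}
    for vehicle_type, deaths in fatalities.items():
        sold = sales.get(vehicle_type)
        if not sold:  # skip types with missing or zero sales
            continue
        ratios[vehicle_type] = deaths / sold
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical placeholder counts, NOT real fatality or sales data.
example_fatalities = {"Type A": 120, "Type B": 30, "Type C": 45}
example_sales = {"Type A": 60000, "Type B": 5000, "Type C": 15000}

for vehicle_type, ratio in fatality_ratios(example_fatalities, example_sales):
    print(f"{vehicle_type}: {ratio * 1000:.1f} fatalities per 1,000 sold")
```

With these placeholder numbers, "Type B" has the fewest fatalities but the highest ratio, which is the same kind of reversal I saw in the graph.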

Anyone who has used Tableau has probably already guessed that all these visualizations were created using Tableau Software and so I visualized the Ratio, Fatal Accidents, Sales all in one image. It shows clearly how compact utility vehicles have a high ratio even though trucks and passenger cars have higher fatalities and more cars of those types were sold.

My current data sources are (Please let me know if you are aware of better ones):

Fatality analysis reporting system –

WSJ – Car sales for the year so far -

Visualization tools and their use

In the past, I have discussed visualization tools and a few companies that make them. They are used by a wide variety of professionals such as the Business Intelligence community, Scientific Data explorers, Financial data analysts and many more users.

I feel, though, that more often than not such tools are an afterthought in a company’s think tank, and that they require a significant amount of training and learning, as well as senior management overcoming a mental block. Companies such as Tableau Software, Spotfire, and many others must have excellent sales teams which, on identifying companies that may be able to benefit from their really stellar software, then have to pitch it to them. Even if the software blows the company’s management away, their reluctance to adopt it must surely be a problem. Additionally, I am sure that even if top management thinks it’s a great idea, the real users (lower-level management/marketing/sales professionals) might not have the time or willingness to invest in using the software.

I know that the people behind these visualization companies are brilliant researchers who are not only innovating in the field of visualization but also taking the extra effort to improve on proven visualization techniques in order to make them easy to use. 

I feel very strongly about this matter and wonder if a few things can be done to avoid this situation in the future. Clearly, just as training radiologists and other medical software users to use 3D volume rendering software is an uphill task, training business analysts to use visualization tools must be a difficult task.

1 – Provide an educational version of the software that is available and used only through academic institutions. It could be fully featured or have only some evaluation features, but let your software be one of the first tools that students use when analyzing their data. That will ensure that at least a few of them are already trained in the tool when they go on to join a company, and that they root for your visualization software there. If you think about it, companies like Microsoft allow free downloads of the Express edition of Visual Studio for students, which builds a familiarity with the software that developers then take with them to the companies they join. In my experience, students are far more ready to learn new software and technologies than senior management in a company. They also have more time on their hands and can devote more of it to learning the spiffy new features in your visualization software.

2 – Develop a course or two on data analysis in conjunction with a professor at a university – Merely providing educational versions of the software is of limited value. An interesting data analysis course that teaches the use of your software, or of two or three similar software tools, along with the basics of data analysis and an overview of visualization techniques might be another interesting way to approach the problem.

3 – Provide easy-to-use learning material – Take the time to make sure you have tutorials and plenty of examples on your website that users can work through to improve over time. Having a free PDF book or a step-by-step tutorial can vastly benefit the user and take some of the training load off your hands. Tableau Software does an amazing job in the training realm, as can be seen from the examples they provide. You can even download a ‘workbook’ for each example and play around with it in your copy of the software.

4 – Provide free training in the form of webcasts – Recently, NVIDIA had a few webcasts focused on CUDA and its applicability to general-purpose computing on the GPU. The webcast consisted of a few NVIDIA developers giving a presentation and answering some questions at the end. The webcast was free and was a great way to win over a few more researchers to CUDA, for which they would have to buy NVIDIA graphics cards. I thought it was a great idea which could be taken even further when applied to visualization software. If your users already have the software running, then publishing sample datasets and walking users through them can be even more compelling and interactive than reading an online tutorial or a book chapter.

5 – Provide training at a conference or a workshop – This might be another way to get users to download your evaluation version and play with some data. Google has been doing similar ‘training’ at conferences like SIGGRAPH and IEEE Visualization for the last couple of years now. This helps you get new users and also allows professionals attending the conference to learn something they might convey to their company when they go back to work, which could translate into acceptance and added sales for the company.

If you have any other suggestions, please feel free to let me know. Sorry there are no pretty visualizations in this post :)

IEEE VAST 2008 – Christian Chabot Keynote

I think one of my favorite events of the entire Visweek 2008 was the VAST 2008 keynote by the CEO of Tableau Software, Christian Chabot. [Apparently, he blogs too.]

He motivated the audience by making a very strong case for why there is a need to use visual analytics software. He basically said that there is a much wider customer base than one would imagine for quality visual analytics tools. What was interesting to me was that he defined a successful visual analytics tool by how widely adopted it is. I personally believe that wide adoption is the best way to make sure that users have the power of visual analytics at their fingertips.

He basically said that we haven’t maximized the impact of visual analytics until we help users with their own data. The demo of Tableau was my favorite part: he interacted with simple datasets to show how easy it was to gain insight or just learn more about the data.

I think my favorite quote from the talk was ‘Visual analytics can help people test their hunches even when they lead to nowhere.’ :) This was great, since this is exactly the purpose of visualization: interacting with your data to learn more, but also just to confirm what you already know.

He then showed an amazing demo of presidential donations data from New York City. Comparisons of Obama and McCain showed some wonderful, interesting patterns in parts of New York, like the Upper East Side and so on. Some of those patterns were expected for those who know the demographics of New York City.

Some of the highlights of the talk were: 

- Most analytics tasks don’t result in ‘Aha’ discoveries.

- People don’t like to admit they need outside help to make discoveries about their own data.

- Visualization and visual analytics help people think of what questions to ask. More importantly, they help them enter the visual analysis cycle of interact, explore, visualize, obtain insight – rinse and repeat :)

His most important statement of the keynote speech was that “The number one reason people buy visual analytics software was to save time.”

I think it was the kind of keynote that makes you think and shakes you up a bit. I agree that we needed to hear some of those words. I particularly enjoyed the talk, since I had mentioned Tableau Software in one of my previous blog posts, ‘Visualization companies leading the way.’ After listening to the talk, I feel more confident that Tableau will make a big difference in the visual analytics and business analytics community. If you have anything to say about the keynote, please feel free to add a comment.

Visualization companies leading the way

As part of the visualization community, I have always believed that it is not sufficient to develop new techniques and present/publish them at the Visualization conferences. The strength of the techniques has to reach out to the people with the data. Companies that polish those techniques and take the effort to showcase their abilities to a large audience are the true leaders who reveal the power of visualization to a common user. 

Here are a few companies that I have always been impressed with and have been following for some years:

Tableau Software – With an exceptional group like theirs, what else could you expect? Tableau Software came out of Stanford University and was formed with the guidance of Pat Hanrahan and the vision of Chris Stolte. They now have a phenomenal team of researchers, academics, managers and software developers who are doing some great work. It is particularly exciting to see Jock Mackinlay on their board. He has always been an ardent believer in the power of visualization as a whole. [His blog and some of his papers]. I am sure you can find much more information on their website.

I was really happy to hear that they secured $10 million in Series B funding from New Enterprise Associates, Inc. (NEA). I read a few blogs to find out more about Tableau, and came across this blog entry: “Is Tableau the next Google?” I sure hope so :)


Stamen Design - The group at Stamen seem to come up with the most effective, visually appealing visual representations of data. They seem to just ‘get it’. Their projects page shows a small sampling of the talent that the company houses. Recently, I was excited to see that MSNBC was using their visualization to show the progressions of hurricanes through the Atlantic. Click here for MSNBC’s storm center developed by Stamen


Spotfire/TIBCO – Spotfire is another great company, acquired by TIBCO in summer 2007. Spotfire has been developing some great software for business professionals and is putting a lot of effort into getting it into the hands of users. Their wide range of products, fine-tuned by industry or application, makes their software versatile and accessible. Their acquisition by TIBCO points to the need to incorporate data visualization into the toolbox of business professionals.


Kitware – How can one ever forget Kitware in a list of visualization companies leading the way? Under the excellent and brave leadership of Will Schroeder, this company, which started out as an open-source visualization toolkit company, has grown by leaps and bounds into a huge scientific visualization (and now infovis) powerhouse. They have continued the development and support of the Visualization Toolkit (VTK) and the Insight Toolkit (ITK), which many of us have used widely to get our feet wet in the field of visualization and image analysis. What is most notable and praiseworthy about the company is their commitment to the open-source community. They have been contributing to, helping and supporting open-source projects in fields as diverse as oil exploration, medical imaging, geographic data visualization and many more. They have a wide variety of products that are well supported and extremely useful. And in case any of you are looking for a job, they are hiring at the moment too.

If you know of other visualization companies that are out there working hard to get visualization techniques to the masses, please feel free to add their names in the comments section.

