Practical Guide: How to See Through the Lies of Data Visualization

From: Flowingdata; By: Nathan Yau; exclusive compilation by Sina New Media Lab;

This article is reposted from: Sina.com; original link: http://news.sina.com.cn/2017-02-15/doc-ifyarmcu6008746.shtml;

    

In the past, when we saw a badly made chart or a data visualization that was off, we tended to laugh and move on. But sometimes, especially this past year, it has become harder to tell whether a visualization is simply poor work or was deliberately created to spread biased, false information.

  

Of course, lying with data is nothing new, but charts now circulate more widely than ever across the Internet, and many of them convey falsehoods. You may only glance at one, but even a simple message can take root in your mind. Before you know it, Leonardo DiCaprio is spinning a top on the table, and no one cares whether it stops or keeps spinning.

  

So we need a way to tell quickly whether a chart is lying, and this graphic is your handy guide.

  

1) Truncated axis


The y-axis of the chart on the left starts at 10, which is misleading. The y-axis on the right starts at 0, which is fine.

  

Length is the visual cue in a bar chart, so when someone shortens the bars by truncating the value axis, the differences across the chart look more dramatic. These people want to show changes that are more drastic than they actually are. I talk about this in detail in another article.
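
To put a number on the distortion, here is a minimal sketch that compares the ratio of two bar lengths as drawn with the ratio of the underlying values. The function name `apparent_ratio` and the sample values are made up for illustration and are not taken from the figure.

```python
def apparent_ratio(a: float, b: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar lengths for values a and b
    when the value axis starts at `baseline` instead of 0."""
    if a <= baseline or b <= baseline:
        raise ValueError("both values must be above the baseline")
    return (b - baseline) / (a - baseline)


# Hypothetical values for two bars.
small, large = 12.0, 14.0

# Axis starting at 0: the bars reflect the true ratio (about 1.17).
print(apparent_ratio(small, large, baseline=0))

# Axis starting at 10: the larger bar is drawn twice as long.
print(apparent_ratio(small, large, baseline=10))
```

With these numbers, a difference of roughly 17% is drawn as a 100% difference once the axis starts at 10.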

  

2) Dual axes

It uses two very different scales, probably in order to force a causal relationship.

  

With dual axes, each series gets its own scale, so the magnitude of either one can be shrunk or stretched independently. People often use this to suggest correlation and causation. "Because of this, that other thing happened. Look, it's very clear."

  

Tyler Vigen's Spurious Correlations project is an excellent example.
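
As a rough illustration of how separate scales can manufacture agreement, the sketch below maps two made-up series onto axes of their own. The helper `axis_position` and all numbers are hypothetical and exist only for this example.

```python
def axis_position(values, axis_min, axis_max):
    """Map each value to its fractional height on an axis
    (0.0 = bottom of the axis, 1.0 = top)."""
    span = axis_max - axis_min
    return [(v - axis_min) / span for v in values]


# Two hypothetical series with unrelated magnitudes.
series_a = [1000, 2000, 3000, 4000]  # would go on the left axis
series_b = [2, 3, 4, 5]              # would go on the right axis

# On one shared axis (0 to 4000), series_b is a flat line near the bottom.
shared_b = axis_position(series_b, 0, 4000)

# On separate axes, each fitted tightly to its own series,
# the two lines land on exactly the same heights.
own_a = axis_position(series_a, 1000, 4000)
own_b = axis_position(series_b, 2, 5)

print(shared_b)
print(own_a)
print(own_b)
```

The two lines overlap only because each axis range was chosen to fit its own series; the overlap says nothing about whether the series are related.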

  

3) The sum is wrong

The proportions of all parts of the pie chart add up to more than 100%.

  

Some charts show the parts of a whole, and when the parts add up to more than the whole, there is a problem. For example, a pie chart represents a total of 100%, so what does it mean if its slices add up to more than 100%? Something is off.

  

Check out this funny example.
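
A check like this is easy to automate. The following is a minimal sketch, assuming the slice percentages are available as plain numbers; the function name `shares_sum_to_whole`, the rounding tolerance, and the sample values are assumptions made for illustration.

```python
import math


def shares_sum_to_whole(shares, total=100.0, tol=0.5):
    """Return True if the shares add up to `total`,
    allowing `tol` of slack for rounded labels."""
    return math.isclose(sum(shares), total, abs_tol=tol)


# Hypothetical pie charts.
print(shares_sum_to_whole([45, 40, 35]))        # slices sum to 120
print(shares_sum_to_whole([33.3, 33.3, 33.3]))  # slices sum to 99.9 after rounding
```

The tolerance is there because published percentages are usually rounded, so a total of 99.9 or 100.1 should not be treated as an error.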

  

4) Looking only at absolute values

This is really just a population map. When you compare different places, categories, or groups, you must consider relative values to make a fair comparison.

  

Everything is relative. You can't say that the first town is more dangerous just because it had two robberies and the other town had one. What if the population of the first town is a thousand times that of the second? It is often more informative to compare percentages and rates than absolute counts and totals.

  

This xkcd cartoon shows the impact of absolute population numbers very bluntly.
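
The robbery example works out as follows. This is a small sketch with made-up populations (the article gives only the ratio of a thousand to one); `rate_per_100k` is a hypothetical helper name.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Number of events per 100,000 residents."""
    return count * 100_000 / population


# Hypothetical towns: A has a thousand times the population of B.
town_a = {"robberies": 2, "population": 1_000_000}
town_b = {"robberies": 1, "population": 1_000}

rate_a = rate_per_100k(town_a["robberies"], town_a["population"])
rate_b = rate_per_100k(town_b["robberies"], town_b["population"])

print(rate_a)  # 0.2 robberies per 100,000 residents
print(rate_b)  # 100.0 robberies per 100,000 residents
```

By absolute count, town A looks twice as dangerous; by rate, town B has 500 times as many robberies per resident.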

  

5) Limited scope

The chart on the left looks like a large increase, but the chart on the right shows that this is the norm and that the increase in the selected time period is actually not significant.

  

People tend to cherry-pick dates and time periods to fit a particular narrative, so take into account historical context, recurring events, and reasonable baselines for comparison.

  

You may discover interesting things when you look at the bigger picture.

  

6) Odd binning

The chart on the left has only two classes. What does the "greater than 1" class include? It may be hiding something. The chart on the right is better, showing more variation.

  

Some visualizations oversimplify a complex pattern instead of showing the full range of variation in the original data. Doing so easily turns a continuous variable into a categorical one.

  

Broad binning is useful in some cases, but the complexity is often the point. Beware of oversimplification.
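
The effect of coarse classes can be sketched in a few lines. The values, cut points, and labels below are invented for illustration, and `bin_label` is a hypothetical helper built on Python's standard `bisect` module.

```python
from bisect import bisect_left


def bin_label(value, edges, labels):
    """Return the label of the class that `value` falls in.

    `edges` are ascending cut points; a value equal to a cut point
    goes to the lower class. `labels` needs len(edges) + 1 entries.
    """
    return labels[bisect_left(edges, value)]


values = [1.2, 5, 40, 300]  # hypothetical data

# Two classes: every value lands in the same "> 1" class.
coarse = [bin_label(v, [1], ["<= 1", "> 1"]) for v in values]

# Four classes: the spread inside "> 1" becomes visible.
fine_labels = ["<= 1", "1-10", "10-100", "> 100"]
fine = [bin_label(v, [1, 10, 100], fine_labels) for v in values]

print(coarse)
print(fine)
```

With two classes, 1.2 and 300 are drawn identically; with four, they end up three classes apart.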

  

7) Wrong area scaling

30 is three times as large as 10, but, perhaps for added drama, the largest rectangle in the chart is more than three times the size of the smallest.

  

If values are encoded by area, the areas of the shapes should be in proportion to the values. Some people, however, scale the side lengths by the values instead, which exaggerates the contrast in size, just to grab attention.

  

Sometimes such mistakes are made by accident, so stay alert.
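
The arithmetic behind this is short. The sketch below assumes squares for simplicity and uses the values 10 and 30 from the caption; both function names are hypothetical.

```python
import math


def side_for_area_encoding(value, base_value, base_side=1.0):
    """Side length of a square whose AREA is proportional to `value`,
    given that `base_value` is drawn with side `base_side`."""
    return base_side * math.sqrt(value / base_value)


def area_ratio_if_side_scaled(value, base_value):
    """Area ratio that results when the SIDE, not the area,
    is scaled by value / base_value."""
    return (value / base_value) ** 2


# Correct: to show 30 against 10, the side grows by sqrt(3), about 1.73.
print(side_for_area_encoding(30, 10))

# Wrong: tripling the side makes the drawn area 9 times larger.
print(area_ratio_if_side_scaled(30, 10))
```

Scaling the side by the value ratio squares the visual difference, so a threefold value reads as a ninefold area.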

  

8) Manipulated area dimensions

The upper and lower figures have the same area, but they look very different.

  

Someone may know how to encode values by area and still deliberately produce something like the picture above. I haven't seen such an exaggerated example yet, but maybe there will be one in the future. I bet it will even show up in pictograms. Just wait and see.

  

9) 3D for 3D's sake

Never. When you see a chart that forces in 3D for no good reason, question its data, the chart itself, its author, and anything derived from the chart.

  

Key point:

  

If a visualization has any of the above problems, it does not necessarily mean that it is lying. As Darrell Huff put it in How to Lie with Statistics:

  

"The title of this book and some of the content in it may seem to say that all such work is the product of deception. The president of a chapter of the American Statistical Association once criticized me for this. He felt that most of the time it was not so much deception as incompetence."

  

Of course, that does not excuse it; it is still wrong. But keep this in mind and think twice before calling someone a liar.

  

My rule of thumb is to scrutinize charts that are shocking or more dramatic than you would expect.

  

Charts can't make false information true, and neither can data. They bend to the person making the chart and can be made to show more than the information itself supports. So keep your eyes open.

  

[This article is exclusively compiled by Sina New Media Lab, please indicate the source for reprinting];
