This is my episode on the data quality dimension of integrity.
The latest episode of my data management series is here:
I am writing this companion piece to go along with my Tableau Public visualization of COVID Cases per Million (North America Region). I created this visualization because I was curious about the state of the outbreak and how it varies from jurisdiction to jurisdiction. At first blush we can see that New York state is the hardest hit by the pandemic, with neighbouring states also affected. It is interesting to see how little some states and regions are affected, at least as far as the number of cases goes.
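For readers curious about the arithmetic behind a "cases per million" figure, here is a minimal sketch in Python. The function name and the sample numbers are hypothetical, chosen only for illustration; they are not taken from the visualization or its sources.

```python
def cases_per_million(cases: int, population: int) -> float:
    """Scale a raw case count to a rate per one million residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return cases * 1_000_000 / population


# Hypothetical jurisdiction: 500 cases among 2,000,000 residents.
rate = cases_per_million(cases=500, population=2_000_000)
print(rate)
```

Normalizing by population is what lets a small jurisdiction and a large one share the same colour scale, though it does nothing to correct for differences in how each jurisdiction counts cases in the first place.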
But herein lies a problem. There are challenges in using this data, which I have drawn from public sources. Each jurisdiction has handled this crisis a little bit differently, so we may not be comparing apples to apples here. This may call into question the integrity and the quality of the data, but sometimes you can only make do with what you have, especially in a crisis.
I trust we will learn many lessons from this crisis in terms of public health, government effectiveness, and also in terms of data quality. This situation continues to unfold, and I shall continue to follow it closely.