Initial Data Organization
The Data
I have imported the dataset into Mathematica. I used the “Dataset” import element so that Mathematica reads the file in as a Dataset, and I set the HeaderLines option to indicate that the first row contains the column headers. This makes it easy to index each column by name when I need to store specific values later.
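A minimal sketch of this kind of import is shown here. The file name, the column names (yr, slat, slon, elat, elon), and the two sample rows are all made up for illustration; the sketch writes a tiny CSV to a temporary folder so that it runs on its own, and the real file and headers may differ.

```wolfram
(* Hypothetical stand-in for the real file: a tiny CSV with made-up rows. *)
(* The column names yr, slat, slon, elat, elon are assumptions. *)
sampleFile = FileNameJoin[{$TemporaryDirectory, "tornado_sample.csv"}];
Export[sampleFile, {
   {"yr", "slat", "slon", "elat", "elon"},
   {1950, 38.77, -90.22, 38.83, -90.03},
   {1950, 39.10, -89.30, 39.12, -89.23}
   }, "CSV"];

(* Import as a Dataset, treating the first line as column headers. *)
tornadoes = Import[sampleFile, {"CSV", "Dataset"}, HeaderLines -> 1]

(* Once the headers are attached, a column can be pulled out by name. *)
tornadoes[All, "slat"]
```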
I first chose to extract the starting and ending coordinate pairs of each tornado in the dataset. I can easily do this by indexing the specific headers.
Since these extract as datasets that still carry the corresponding headers, I needed to convert each one to a matrix of coordinate pairs. Applying [All, Values] // Normal strips the headers from each row and returns an ordinary list of {latitude, longitude} pairs.
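The extraction and conversion steps can be sketched on a small stand-in dataset. The header names slat, slon, elat, and elon and the sample values are assumptions made for this example, not necessarily the names in the actual file.

```wolfram
(* Made-up sample rows; the header names are assumptions. *)
tornadoes = Dataset[{
    <|"slat" -> 38.77, "slon" -> -90.22, "elat" -> 38.83, "elon" -> -90.03|>,
    <|"slat" -> 39.10, "slon" -> -89.30, "elat" -> 39.12, "elon" -> -89.23|>
    }];

(* Index the headers to get the start and end columns; each result is still a Dataset. *)
startDS = tornadoes[All, {"slat", "slon"}];
endDS = tornadoes[All, {"elat", "elon"}];

(* Values drops the keys from each row; Normal converts the Dataset to a plain list. *)
startPairs = Normal[startDS[All, Values]];
endPairs = Normal[endDS[All, Values]];

(* Each result is an n-by-2 matrix of {latitude, longitude} pairs. *)
Dimensions[startPairs]
```

Keeping the start and end points in separate matrices means either set can be passed to the geographic functions on its own.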
Using the GeoPosition function, I can now interpret each coordinate pair as a geographic location. This gives 64,825 points for analysis.
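As a small illustration of this step, the sketch below wraps a short list of made-up coordinate pairs in GeoPosition. The variable names and values are placeholders for this example only.

```wolfram
(* Made-up {latitude, longitude} pairs standing in for the real coordinates. *)
startPairs = {{38.77, -90.22}, {39.10, -89.30}, {40.88, -84.58}};

(* Map GeoPosition over the pairs to get one geographic location per pair. *)
startPositions = GeoPosition /@ startPairs;

(* The number of locations equals the number of input pairs. *)
Length[startPositions]
```

GeoPosition also accepts the whole list of pairs at once, which produces a single object holding an array of positions instead of a list of separate ones.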
Finally, I used the GeoHistogram function, together with the United States entity class from Wolfram, to plot the bins. I used a bin size of 100 to show clearly where the tornadoes are concentrated. A smaller bin size should give more accurate results in future analysis.
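A hedged sketch of a plot along these lines follows. The sample points are made up, the unit of the bin size (miles) is an assumption because the unit is not stated above, and Entity["Country", "UnitedStates"] is only one possible way to restrict the map to the United States. Evaluating it requires access to the Wolfram geographic data servers.

```wolfram
(* Made-up tornado start points; the real analysis uses the full list of positions. *)
startPositions = GeoPosition /@ {
    {35.47, -97.52}, {35.22, -97.44}, {36.15, -95.99},
    {37.69, -97.34}, {27.95, -82.46}, {28.54, -81.38}
    };

(* Bin the points geographically. The 100-mile bin size is an assumed unit; *)
(* a smaller value gives finer bins. GeoRange limits the map to the country. *)
GeoHistogram[startPositions, Quantity[100, "Miles"],
 GeoRange -> Entity["Country", "UnitedStates"]]
```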
As shown, the maximum number of tornadoes in a single bin is around 175. From this display I was surprised to notice that Florida has a significant number of tornadoes. It is clear that the central United States is the region most affected by tornadoes. However, tornadoes have occurred almost everywhere else in the United States at some time from 1950 to 2018.
Looking Forward
I want to start focusing on specific states for further analysis. I can also now use the Dataset methods to organize the data to my liking, such as finding the maximum amount of damage from any one tornado. I have also added another question to my list:
- How has tornado frequency changed over time?
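Both the maximum-damage lookup and the frequency question can be approached with ordinary Dataset queries. The sketch below uses made-up rows and assumes a year column named yr and a damage column named loss; the real headers and values may differ.

```wolfram
(* Made-up rows; the column names yr and loss are assumptions. *)
tornadoes = Dataset[{
    <|"yr" -> 1950, "loss" -> 5|>,
    <|"yr" -> 1950, "loss" -> 3|>,
    <|"yr" -> 1951, "loss" -> 4|>
    }];

(* Largest damage value recorded for any single tornado. *)
maxLoss = tornadoes[Max, "loss"]

(* Number of tornadoes recorded in each year. *)
countsByYear = tornadoes[Counts, "yr"]

(* Convert the counts to {year, count} pairs and plot them over time. *)
yearCountPairs = KeyValueMap[List, Normal[countsByYear]];
ListLinePlot[yearCountPairs, AxesLabel -> {"Year", "Tornadoes"}]
```

Restricting to one state first would only need an extra Select step on whichever column holds the state, before the same queries are applied.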