Clustering – Oklahoma

Oklahoma

This post is mainly about finding a way to plot the starting and ending latitude and longitude points in the tornado dataset. To keep this first analysis simple, I am only going to look at the data points associated with Oklahoma. I chose Oklahoma because the state sits in one of the main hotspots of the overall tornado dataset. Below are the starting and ending datasets derived from the main dataset.

From these sets I then used the GeoHistogram function to better visualize the tornado density and, hopefully, the direction.

Something of note here is that the ending dataset has fewer points than the starting one. This is because the ending latitude and longitude were not recorded for some tornados (they show up as 0). I am going to disregard those cases during the analysis.

Clustering

The above histograms don’t give a good depiction of where each tornado starts and ends, so I need a way to effectively visualize each tornado path. I first selected the rows that are from Oklahoma and whose four coordinates (starting latitude, starting longitude, ending latitude, ending longitude) are all nonzero. I then created four lists, one for each of those coordinates.
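The selection step can be sketched in Python (the plots in this post were made with Wolfram Language functions, so this is only an analogue). The field names `state`, `slat`, `slon`, `elat`, `elon` and the sample rows are made up for illustration and are not taken from the actual dataset.

```python
# Hypothetical records: field names and values are assumptions for illustration.
tornadoes = [
    {"state": "OK", "slat": 35.20, "slon": -97.40, "elat": 35.30, "elon": -97.20},
    {"state": "OK", "slat": 36.10, "slon": -95.90, "elat": 0.0, "elon": 0.0},  # end not recorded
    {"state": "TX", "slat": 33.50, "slon": -101.80, "elat": 33.60, "elon": -101.70},
    {"state": "OK", "slat": 34.60, "slon": -98.40, "elat": 34.55, "elon": -98.10},
]

COORDS = ("slat", "slon", "elat", "elon")


def oklahoma_paths(records):
    """Keep Oklahoma rows whose four coordinates are all nonzero."""
    return [
        r for r in records
        if r["state"] == "OK" and all(r[c] != 0 for c in COORDS)
    ]


paths = oklahoma_paths(tornadoes)

# One list per coordinate, all in the same row order.
start_lats = [r["slat"] for r in paths]
start_lons = [r["slon"] for r in paths]
end_lats = [r["elat"] for r in paths]
end_lons = [r["elon"] for r in paths]
```

Keeping the four lists in the same row order means index `i` in each list always refers to the same tornado, which is what a path plot needs.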

Before clustering, I figured that drawing a line for each tornado path would give a good picture of the tornado’s activity. Using the GeoPath and Table functions, I was able to plot each of the tornados on the Oklahoma map.

From this visualization I noticed a pattern in the majority of the tornados: most seem to travel in a northeast direction. To test this, I separated the data into three clusters: tornados whose path has a positive slope, a negative slope, and a zero or undefined slope.

To figure out the slopes of each tornado path, I created a function:

This function takes the starting and ending coordinates of a path and calculates its slope. If x2-x1 = 0, the slope is undefined because a division by zero would occur. From the results I created the three clusters.
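A minimal Python sketch of such a slope function and the three-way split follows. It is not the function from the post. It assumes x is longitude and y is latitude, and the names `slope`, `cluster`, and `heads_northeast` are hypothetical. Note that a positive slope alone fits both northeast-bound and southwest-bound paths, so the sketch also includes a direct check on the signs of the coordinate changes.

```python
def slope(x1, y1, x2, y2):
    """Slope of the path from (x1, y1) to (x2, y2); None when x2 - x1 == 0."""
    dx = x2 - x1
    if dx == 0:
        return None  # vertical path: slope is undefined
    return (y2 - y1) / dx


def cluster(x1, y1, x2, y2):
    """Label a path 'positive', 'negative', or 'none' (zero or undefined slope)."""
    m = slope(x1, y1, x2, y2)
    if m is None or m == 0:
        return "none"
    return "positive" if m > 0 else "negative"


def heads_northeast(x1, y1, x2, y2):
    """True when the end point lies both east and north of the start point.

    Assumes x is longitude and y is latitude, with longitude increasing eastward.
    """
    return x2 > x1 and y2 > y1
```

For example, `cluster(-97.4, 35.2, -97.2, 35.3)` lands in the positive cluster, and `heads_northeast` confirms that this particular path ends east and north of where it starts.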

Finally, I passed these clusters to the GeoPath function to check whether the northeast observation above was correct.

From here I can safely say that this northeast trend exists.

For the Future

I want to plot the lengths of the tornado paths on a graph to see if they follow any specific distribution. I also want to run the same analysis on two other states to see if the northeast theory holds up.
