Now it's time to start exploring some data and see what you can find. The dataset covers earthquakes observed in an area around Fiji since 1964 and was obtained from the Harvard University Department of Geophysics. The variables are summarised below:

column     details                                        type
lat        Latitude of earthquake                         numeric
long       Longitude of earthquake                        numeric
depth      Depth (km)                                     numeric
mag        Richter Scale Magnitude                        numeric
stations   Number of stations reporting the earthquake    numeric
order      Order in which the earthquakes occurred        numeric

You can access the raw data here: https://raw.githubusercontent.com/stats4sd/explore_earthquakes/refs/heads/dev/quakes.csv
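If you prefer to work with the raw file outside the app, a minimal Python sketch using only the standard library might look like the following. The function names `parse_quakes` and `load_quakes` are illustrative, and the sketch assumes the file has a header row whose column names match the table above; check the actual header before relying on that.

```python
import csv
import io
import urllib.request

QUAKES_URL = (
    "https://raw.githubusercontent.com/stats4sd/explore_earthquakes/"
    "refs/heads/dev/quakes.csv"
)


def parse_quakes(csv_text):
    """Parse CSV text into a list of dicts, converting values to float where possible."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for name, value in record.items():
            try:
                row[name] = float(value)
            except (TypeError, ValueError):
                row[name] = value  # leave non-numeric values untouched
        rows.append(row)
    return rows


def load_quakes(url=QUAKES_URL):
    """Download the raw CSV and parse it (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return parse_quakes(response.read().decode("utf-8"))
```

Calling `load_quakes()` returns one dictionary per earthquake; `len(...)` of the result and a look at the first few entries are a quick way to check the structure against the description above.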

This interactive tool offers a limited set of functions that cover some of the key tasks you might need:

  • View the data
  • Produce summary statistics for each variable, either overall or split by categories created in the 'Categorise' menu
  • Visualise the data in various different ways
  • Filter the data to produce different subsets. The filter is then applied in each of the other menus
  • Categorise variables into new variables, e.g. create groups based on latitude, longitude or order to allow exploration in different ways within the other menus. The 'fixed' option lets you specify the groups exactly, or you can experiment with any of the other built-in methods. Note that you can only create one categorised variable at a time.
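The idea behind a 'fixed' categorisation can be sketched in a few lines of Python. This is only an illustration of grouping by explicit cut points, not the app's own implementation: the `categorise` helper, the cut points and the labels are all hypothetical, and the app may treat values that fall exactly on a boundary differently.

```python
from bisect import bisect_right


def categorise(value, breaks, labels):
    """Return the label of the interval that contains value.

    breaks are the interior cut points in ascending order, so
    len(labels) must equal len(breaks) + 1. A value equal to a
    cut point falls into the upper group.
    """
    if len(labels) != len(breaks) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(breaks, value)]


# Hypothetical cut points for depth in km, chosen only for illustration.
DEPTH_BREAKS = [100, 400]
DEPTH_LABELS = ["shallow", "intermediate", "deep"]

depths = [50, 100, 250, 640]  # made-up example values, not taken from the dataset
groups = [categorise(d, DEPTH_BREAKS, DEPTH_LABELS) for d in depths]
print(groups)  # prints ['shallow', 'intermediate', 'intermediate', 'deep']
```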

Try to recreate what you produced with the interactive app, and then keep exploring beyond those pre-built capabilities!

You are welcome to explore the data in whatever way you see fit, depending on what you find interesting or what patterns you start to uncover.

But to help guide your explorations, here are some questions that a researcher might have when getting started with a project based on this data:

  • Does the structure of the data make sense, and does it match what would be expected from the description above?
  • Are there certain properties of the data that are unexpected or might be problematic?
  • How could the distribution of the 'magnitude' and 'depth' variables be described?
  • How common are earthquakes that would be classified as “moderate” (magnitude 5 up to 6 on the Richter scale) or “strong” (6 and above)?
  • Is the depth or the number of stations correlated with the magnitude of the earthquakes? How would you describe these relationships?
  • Are there any trends in the magnitude, the depth, or the number of stations reporting each earthquake over time?
  • Are there any patterns in the locations of the earthquakes over space, and does the magnitude, depth or number of stations reporting vary according to the locations?
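Several of these questions come down to a handful of small calculations. The sketch below shows one possible way to approach three of them in plain Python: summarising a variable, counting magnitude classes, and measuring a linear correlation. The helper names are illustrative, the label "below moderate" is our own, and the sketch assumes a magnitude of exactly 6 counts as "strong" (following "6 and above"). The sample values are made up so that the code runs without the dataset.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, stdev


def describe(values):
    """Basic summary statistics for one numeric variable."""
    return {
        "n": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
        "sd": stdev(values),  # sample standard deviation
    }


def magnitude_class(mag):
    """Classify a magnitude; 6 and above is 'strong', 5 up to 6 is 'moderate'."""
    if mag >= 6:
        return "strong"
    if mag >= 5:
        return "moderate"
    return "below moderate"


def class_shares(mags):
    """Proportion of earthquakes falling into each magnitude class."""
    counts = Counter(magnitude_class(m) for m in mags)
    return {label: count / len(mags) for label, count in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up values for illustration only; replace with columns from the real data.
example_mag = [4.2, 4.8, 5.1, 5.6, 6.0]
example_stations = [12, 25, 40, 66, 110]

print(describe(example_mag))
print(class_shares(example_mag))
print(pearson(example_mag, example_stations))
```

A correlation coefficient only captures linear association, so it is worth looking at a scatter plot alongside it before describing a relationship.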
