Friday, February 12, 2021

Laboratory earthquake forecasting: A machine learning competition

Laboratory earthquake forecasting: A machine learning competition. Paul A. Johnson et al., Proceedings of the National Academy of Sciences, February 2, 2021, 118 (5) e2011362118; https://doi.org/10.1073/pnas.2011362118

Abstract: Earthquake prediction, the long-sought holy grail of earthquake science, continues to confound Earth scientists. Could we make advances by crowdsourcing, drawing from the vast knowledge and creativity of the machine learning (ML) community? We used Google’s ML competition platform, Kaggle, to engage the worldwide ML community with a competition to develop and improve data analysis approaches on a forecasting problem that uses laboratory earthquake data. The competitors were tasked with predicting the time remaining before the next earthquake of successive laboratory quake events, based on only a small portion of the laboratory seismic data. The more than 4,500 participating teams created and shared more than 400 computer programs in openly accessible notebooks. Complementing the now well-known features of seismic data that map to fault criticality in the laboratory, the winning teams employed unexpected strategies based on rescaling failure times as a fraction of the seismic cycle and comparing input distribution of training and testing data. In addition to yielding scientific insights into fault processes in the laboratory and their relation with the evolution of the statistical properties of the associated seismic data, the competition serves as a pedagogical tool for teaching ML in geophysics. The approach may provide a model for other competitions in geosciences or other domains of study to help engage the ML community on problems of significance.

Keywords: machine learning competition, laboratory earthquakes, earthquake prediction, physics of faulting

What Did We Learn from the Kaggle Competition?

Previous work on seismic data from Earth (3) suggests that the underlying physics may scale from a laboratory fault to large fault systems in Earth. If this is indeed the case, improvements in our ability to predict earthquakes in the laboratory could lead to significant progress in time-dependent earthquake hazard characterization. The ultimate goal of the earthquake prediction challenge was to identify promising ML approaches for seismic data analysis that may enable improved estimates of fault failure in the Earth. In the following, we will discuss shortcomings of the competition but also key innovations that improved laboratory quake predictions and may be transposed to Earth studies.

The approaches employed by the winning teams included several innovations that differ considerably from our initial work on laboratory quake prediction (1). Team Zoo added synthetic noise to the input seismic data before computing features and training models, making their models more robust to noise and more likely to generalize.
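The general idea of noise augmentation can be illustrated with a minimal sketch. The function name, the noise level, and the scaling against the signal's standard deviation are illustrative assumptions, not Team Zoo's actual code.

```python
import numpy as np

def augment_with_noise(signal, noise_std_fraction=0.05, rng=None):
    """Return a copy of the signal with added zero-mean Gaussian noise,
    scaled to a fraction of the signal's own standard deviation."""
    rng = np.random.default_rng() if rng is None else rng
    noise_std = noise_std_fraction * np.std(signal)
    return signal + rng.normal(0.0, noise_std, size=signal.shape)

# Illustrative usage: build a noisier training copy of one acoustic segment
# before extracting features from it.
segment = np.random.default_rng(0).normal(size=150_000)  # stand-in for a data segment
augmented = augment_with_noise(segment)
```

Training on both the original and the augmented segments exposes the model to measurement noise it will encounter at prediction time, which is the sense in which the augmentation improves robustness.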

Team Zoo, JunKoda, and GloryorDeath only considered features that exhibited similar distributions between the training and testing data, thereby ensuring that nonstationary features could not be used in the learning phase and, again, improving model generalization. We note that employing the distribution of the testing set input is a form of data snooping that effectively turned the test set into a validation set. However, the idea of employing only features with distributions that do not evolve over time is insightful and could be used for scientific purposes, for example by comparing feature distributions between portions of the training data.
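One common way to implement this kind of distribution check is a two-sample Kolmogorov-Smirnov test between the training and comparison sets of each candidate feature. The sketch below is an assumption about how such a filter might look, not the teams' code; the function name, the dict-of-arrays interface, and the p-value threshold are illustrative.

```python
import numpy as np
from scipy.stats import ks_2samp

def stable_feature_names(train_features, test_features, p_threshold=0.05):
    """Return names of features whose train/test distributions are not
    significantly different under a two-sample Kolmogorov-Smirnov test.

    `train_features` and `test_features` map feature name -> 1-D numpy array.
    """
    keep = []
    for name, train_values in train_features.items():
        _, p_value = ks_2samp(train_values, test_features[name])
        if p_value > p_threshold:  # cannot reject "same distribution"
            keep.append(name)
    return keep

# Illustrative usage with synthetic features: the shifted one gets dropped.
rng = np.random.default_rng(0)
train = {"var": rng.normal(0, 1, 5000), "mean": rng.normal(0, 1, 5000)}
test = {"var": rng.normal(0, 1, 5000), "mean": rng.normal(2, 1, 5000)}
print(stable_feature_names(train, test))
```

As noted above, the same filter applied between early and late portions of the training data avoids the data-snooping concern while still screening out nonstationary features.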

Perhaps most interestingly from a physical standpoint, the fifth team, Team Reza, changed the target to be predicted and endeavored to predict the seismic cycle fraction remaining instead of time remaining before failure. Because they did not employ the approach of comparing input distribution between training and testing sets as done by the first, second, and fourth teams, the performance impact from the prediction of normalized time to failure (seismic cycle fraction) was significant.
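The target rescaling can be sketched as follows, assuming a label series that decreases within each seismic cycle and resets upward at each labquake, as in the competition's training data. The helper name and the way each cycle's duration is estimated (the label value just after a reset) are illustrative assumptions.

```python
import numpy as np

def cycle_fraction_remaining(time_to_failure):
    """Convert a time-to-failure series into the fraction of the current
    seismic cycle remaining, bounded in (0, 1].

    A cycle boundary is where time_to_failure jumps back up; each cycle's
    duration is approximated by its first (largest) time-to-failure value.
    """
    ttf = np.asarray(time_to_failure, dtype=float)
    boundaries = np.where(np.diff(ttf) > 0)[0] + 1
    starts = np.concatenate(([0], boundaries))
    ends = np.concatenate((boundaries, [len(ttf)]))
    fraction = np.empty_like(ttf)
    for s, e in zip(starts, ends):
        cycle_duration = ttf[s]
        fraction[s:e] = ttf[s:e] / cycle_duration
    return fraction
```

Predicting a normalized target bounded in (0, 1] removes the variability in absolute cycle length from the regression problem, which is one plausible reason the rescaling helped.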

As in most statistical learning problems, more data are generally better and can improve model performance. Thus, had the competitors been given more training data, scores might in principle have improved. At the same time, there is an element of nonstationarity in the experiment because the fault gouge layer thins as the experiment progresses; therefore, even an extremely large dataset would not lead to a perfect prediction. In addition, Kaggle structures the public/private test set split so as not to reward overfitting. No matter how large the dataset, a model that iterates on it enough times will not translate well to "the real world," and the competition structure was designed to remove that opportunity.

It is worth noting that the ML metric should be chosen carefully. In Earth, it will be important to predict the next quake accurately as it approaches, but MAE treats each time step equally with respect to the absolute error, which makes this challenging.
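To make the point concrete, the sketch below contrasts plain MAE with a hypothetical alternative that up-weights errors made near failure. The weighted metric, its exponential weighting, and the `scale` parameter are illustrative assumptions and were not used in the competition, which scored on plain MAE.

```python
import numpy as np

def mae(y_true, y_pred):
    """Standard mean absolute error: every time step counts equally."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

def near_failure_weighted_mae(y_true, y_pred, scale=1.0):
    """Hypothetical variant that up-weights errors made at small
    time-to-failure, so accuracy late in the cycle matters more."""
    y_true = np.asarray(y_true, dtype=float)
    weights = np.exp(-y_true / scale)  # weight approaches 1 as failure nears
    errors = np.abs(y_true - np.asarray(y_pred))
    return np.sum(weights * errors) / np.sum(weights)
```

A metric of this kind would reward models that are sharp just before failure, even if their early-cycle estimates are looser.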

Individuals participate on the Kaggle platform for many reasons; the most common are the ability to participate in interesting and challenging projects in many different domains, the ability to learn and practice ML and data science skills, the ability to interact with others who are seeking the same, and of course, cash prizes. The astounding intellectual diversity the Kaggle platform attracted for this competition, with team representations from cartoon publishers, insurance agents, and hotel managers, is especially notable. In fact, none of the competition winners came from geophysics. Teams exhibit collective interaction, evidenced by the step changes in the MAE through time (Fig. 6), likely precipitated by communication through the discussion board and shared code.

The competition contributed to an accelerating increase in ML applications in the geosciences, has become an introductory problem for the geoscience community to learn different ML approaches, and is used for ML classes in geoscience departments. Students and researchers have used the top five approaches to compare the nuances of competing ML methods, as well as to try to adapt and improve the approaches for other applications.
