On a recording produced by a seismic shot, one can observe the different modes of wave propagation in the ground (Figure 1). The first break corresponds to the travel time of the direct or refracted seismic wave from the source to a geophone, whichever arrives first. These first breaks provide invaluable information about the characteristics of the near surface. They can be used, among other things, to analyze refractions, build a near-surface velocity model, and estimate the depth of the bedrock. In addition, they serve as a complement in the processing of more complex methods such as seismic reflection.
Traditionally, first breaks are identified manually by an expert, a time-consuming and subjective processing step that often yields results differing from one analyst to another. To overcome this problem, various methods have been developed to automate the procedure, such as the STA/LTA (short-term average over long-term average) method and cross-correlation methods that compare the seismic signal with known waveform templates. However, the accuracy of all these methods decreases considerably when the data is noisy (Figure 2).
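To make the STA/LTA idea concrete, here is a minimal numpy sketch (not the implementation used in any particular processing package): the trace's energy is averaged over a short trailing window and a long trailing window, and the first sample where their ratio exceeds a threshold is taken as the pick. All parameter values below are illustrative defaults, not recommended field settings.

```python
import numpy as np

def sta_lta_pick(trace, dt, sta_win=0.01, lta_win=0.1, threshold=4.0):
    """Pick the first break as the first sample where the ratio of the
    short-term average (STA) to the long-term average (LTA) of the
    signal energy exceeds a threshold. Returns the pick time in
    seconds, or None if the threshold is never reached."""
    n_sta = max(1, int(round(sta_win / dt)))  # short window, in samples
    n_lta = max(1, int(round(lta_win / dt)))  # long window, in samples
    energy = trace.astype(float) ** 2
    csum = np.concatenate([[0.0], np.cumsum(energy)])

    def trailing_mean(n):
        # Mean of the last n energy samples at each position
        # (growing window near the start of the trace).
        m = np.empty(len(energy))
        m[:n] = csum[1:n + 1] / np.arange(1, n + 1)
        m[n:] = (csum[n + 1:] - csum[1:-n]) / n
        return m

    ratio = trailing_mean(n_sta) / (trailing_mean(n_lta) + 1e-12)
    above = np.nonzero(ratio > threshold)[0]
    return float(above[0] * dt) if above.size else None

# Synthetic trace: low-amplitude noise, then an arrival at 0.5 s.
rng = np.random.default_rng(0)
dt = 0.001
trace = 0.01 * rng.standard_normal(1000)
trace[500:] += np.sin(2 * np.pi * 30 * dt * np.arange(500))

pick = sta_lta_pick(trace, dt)  # close to 0.5 s on this clean trace
```

On a clean trace like this one the ratio spikes sharply at the arrival; on noisy records the ratio fluctuates and the threshold triggers early or late, which is exactly the weakness described above.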
At Geostack, we have designed tools that use artificial intelligence to automate the identification of first breaks accurately, on both clean data (Figures 3a to 3c) and noisy data (Figures 3d to 3f). This work is part of our effort to reduce the processing time of our combined methods while delivering high-quality results. By leveraging these new tools, we can ensure that projects are delivered to our clients on time and within budget.
How do these tools work?
We first created a seismic dataset comprising raw records of seismic shots for which we manually identified the first breaks. These data were then used to train a Convolutional Neural Network (CNN), a type of artificial intelligence algorithm specialized in visual data processing. Once trained on our initial dataset, the model could already identify first breaks in other seismic records, but with limited accuracy.
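A trained CNN learns filters whose convolved response highlights the features of interest. As a toy illustration of that building block (not Geostack's actual network, and with a handcrafted rather than learned kernel), the sketch below convolves a trace's envelope with a simple onset-detecting filter, applies a ReLU nonlinearity, and reads the pick off as the location of the strongest response:

```python
import numpy as np

def conv1d(x, kernel):
    """'Same'-padded 1-D cross-correlation, as used in CNN layers."""
    k = len(kernel)
    pad = k // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

def toy_first_break_score(trace):
    """One convolutional filter + ReLU. The kernel contrasts the mean
    amplitude just after each sample with the mean just before it, so
    it responds strongly at an onset. A real CNN learns many such
    kernels from labeled first breaks instead of using a fixed one."""
    envelope = np.abs(trace)
    kernel = np.concatenate([-np.ones(10), np.ones(10)]) / 10.0
    return np.maximum(conv1d(envelope, kernel), 0.0)  # ReLU

# Synthetic trace: noise, then an arrival at sample 400.
rng = np.random.default_rng(1)
dt = 0.001
trace = 0.02 * rng.standard_normal(1000)
trace[400:] += np.sin(2 * np.pi * 25 * dt * np.arange(600))

score = toy_first_break_score(trace)
pick_sample = int(np.argmax(score))  # lands near the onset at sample 400
```

The single fixed kernel here is brittle; the point of training on manually picked records is that the network learns a whole bank of such filters, plus deeper layers that combine them, directly from the labeled data.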
To improve the accuracy of results in our projects, we utilize transfer learning, a machine learning technique that uses a model pre-trained on one task as a starting point to accelerate and enhance learning on a new, related task. Specifically, for each project, we first manually identify the first breaks in 5 to 10% of the seismic records. We then use this newly labeled data to retrain an AI model that can accurately identify the first breaks in the remaining seismic records. By leveraging transfer learning, we benefit from the knowledge and features learned from the initial training dataset, and then fine-tune the model to adapt specifically to the characteristics of each new project. This approach helps us achieve a higher level of accuracy in identifying first breaks, even in challenging seismic data scenarios, and significantly reduces the manual effort and time required for data labeling.
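The mechanics of this warm-start-and-fine-tune loop can be sketched with a deliberately tiny model. The toy below stands in for the CNN with a logistic classifier over short windows (all datasets, window sizes, and hyperparameters are synthetic and illustrative): it is first trained on a large "initial" dataset, then fine-tuned, starting from those pretrained weights, on a handful of labeled windows from a noisier "new project".

```python
import numpy as np

rng = np.random.default_rng(42)

def make_windows(n, amp, noise, win=20):
    """Synthetic labeled windows: label 1 if the window contains a
    wavelet onset in its second half, 0 if it is pure noise. A toy
    stand-in for real manually picked seismic records."""
    X = noise * rng.standard_normal((n, win))
    y = rng.integers(0, 2, n)
    t = np.arange(win)
    onset = amp * np.sin(2 * np.pi * t / 8) * (t >= win // 2)
    X[y == 1] += onset
    return X, y.astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(w, X, y):
    p = sigmoid(X @ w)
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def train(w, X, y, lr=0.1, steps=300):
    """Plain gradient descent on the logistic loss, starting from w.
    Starting from pretrained weights (rather than zeros) is the
    warm start at the heart of transfer learning."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * (X.T @ (sigmoid(X @ w) - y) / len(y))
    return w

# Initial dataset: plenty of labeled windows, relatively clean data.
X_a, y_a = make_windows(500, amp=1.0, noise=0.1)
w0 = train(np.zeros(X_a.shape[1]), X_a, y_a)

# New project: only a handful of labels, noisier data.
X_b, y_b = make_windows(40, amp=0.8, noise=0.4)

# Fine-tune from the pretrained weights on the small labeled subset.
loss_before = log_loss(w0, X_b, y_b)
w_ft = train(w0, X_b, y_b, lr=0.05, steps=200)
loss_after = log_loss(w_ft, X_b, y_b)  # fit to the new project improves
```

The same pattern applies to the CNN case: the pretrained weights already encode what first breaks look like in general, and the few freshly labeled records only have to teach the model the specifics of the new project, which is why 5 to 10% of the records suffice.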
For example, Figures 3d to 3f show seismic records from a noisy dataset for which we first manually identified the first breaks on approximately 7% of the records (20 records out of 291). Subsequently, it took us less time than it takes to make a cup of coffee to train a new AI model and accurately identify the first breaks on the remaining 271 records. The workflow we have developed not only significantly reduces processing time, but also minimizes result subjectivity and the risk of human error.
We have integrated this automatic first-break identification tool into our internal software, enabling us to evaluate the selected first breaks and make quick adjustments in case of inaccuracies. An example of how the tool is used within our software is shown in Figure 4. The combination of AI-based automation and real-time evaluation within our software has proven to be a powerful solution, enabling us to provide more reliable and efficient seismic data processing to our clients.
Finally, in line with our commitment to open source, we have made the source code of this new tool freely available on our GitHub page. By open-sourcing the code, we aim to contribute to the broader geophysical community and promote collaboration and innovation in seismic data processing. We encourage researchers, practitioners, and developers to explore and utilize the tool, provide feedback, and even contribute to its further improvement.