Abstract:
Salt boundary interpretation is important for understanding salt tectonics and for building velocity models for seismic migration. Conventional methods consist of computing salt attributes and extracting salt boundaries. We have formulated the problem as 3D image segmentation and evaluated an efficient approach based on deep convolutional neural networks (CNNs) with an encoder-decoder architecture. To train the model, we design a data generator that extracts randomly positioned subvolumes from a large-scale 3D training data set and applies data augmentation. We then feed a large number of subvolumes into the network, using salt/nonsalt binary labels generated by thresholding the velocity model as ground truth. We test the model on validation data sets and compare the blind test predictions with the ground truth. Our results indicate that our method is capable of automatically capturing subtle salt features from the 3D seismic image with little or no need for manual input. We further test the model on a field example to illustrate the generalization of this deep CNN method across different data sets.
Introduction
In seismic interpretation and subsurface modeling, extracting geologic structures such as faults, unconformities, horizons, and salt bodies from 3D seismic data is critical. Conventional methods derive seismic attributes based on geologic, physical, and geometric principles. Interpreting salt from seismic images often involves visual features including steeply dipping events and chaotic signals. Specifically, seismic attributes used for automatically interpreting salt boundaries include discontinuities (Asjad and Mohamed, 2015), textures (Wang et al., 2015), reflection dip or normal vector fields (Haukås et al., 2013), and salt likelihoods (Wu, 2016).
Although automatic methods have been proposed for computing salt attributes and extracting salt boundaries from those attributes (Ramirez et al., 2016; Wu et al., 2018a), salt interpretation remains a labor-intensive and time-consuming task in practice. These attributes are designed with domain expertise and engineering; however, they may not yet fully describe the complex, noise-contaminated seismic data encountered in the real world (Marfurt and Alves, 2015). Recently developed machine-learning techniques enable computers to perform repetitive tasks and to unravel relationships that contain useful patterns (Zhao, 2017). Ross and Cole (2017) review popular facies classification methods based on machine-learning algorithms. Deep neural networks (DNNs) are built on the premise that they can replicate a wide variety of nonlinear operators (universal approximation theorem, Csáji, 2001). Compared with traditional machine-learning algorithms, DNNs have the advantage that they extract useful features automatically via numerous hidden layers. Convolutional neural networks (CNNs) are a specialization of DNNs that replaces matrix multiplications with convolution operators to focus on learning the local and spatial relationships between the input image and the output label. Huang et al. (2017) show that CNNs provide improved results over traditional methods such as support vector machines and logistic regression for identifying geologic faults in 3D seismic data. Araya-Polo et al. (2017) use prestack seismic data to identify faults directly, without first migrating the data. Waldeland and Solberg (2017) train a CNN to perform pixel-by-pixel salt body classification. These experiments show the encouraging accuracy of CNNs in a variety of seismic processing and interpretation tasks.
The University of Texas at Austin, Bureau of Economic Geology, John A. and Katherine G. Jackson School of Geosciences, University Station, Box X, Austin, Texas 78713-8924, USA. E-mail: yzshi08@utexas.edu (corresponding author); xinming.wu@beg.utexas.edu; sergey.fomel@beg.utexas.edu.
Manuscript received by the Editor 11 December 2018; revised manuscript received 15 February 2019; published ahead of production 7 April 2019; published online 28 May 2019. This paper appears in Interpretation, Vol. 7, No. 3 (August 2019); p. SE113–SE122, 10 FIGS., 1 TABLE. http://dx.doi.org/10.1190/INT-2018-0235.1. © 2019 Society of Exploration Geophysicists and American Association of Petroleum Geologists. All rights reserved.
Special section: Machine learning in seismic data analysis
To use the power of CNNs in automatic salt interpretation, patch-based studies classify the seismic image as salt/nonsalt in a patch-by-patch fashion and assign the classification prediction to the central voxel of each patch. Di et al. (2018a) and Waldeland and Solberg (2017) propose a CNN architecture with fully connected layers attached after convolutional layers to predict the classification using a softmax activation layer at the end. Wu et al. (2018b) also use a similar CNN-based pixel-wise classification method to predict fault existence and position in each image patch. However, patch-based methods have inherent disadvantages in geobody interpretation because they were originally designed for object classification problems. Figure 1a demonstrates how patch-based detection methods work: For each pixel, the network takes a window, centered at the point of interest, as its input and classifies the category to which this pixel belongs.
The process repeats by sliding the window across the image until all of the pixels are scanned. The disadvantage of these methods is twofold: First, a local window or cube must slide through the full data set to make a prediction at every pixel or voxel; second, it can be challenging for patch-based classification to delineate the boundary of geobodies with high resolution. For example, Figure 1a shows two window inputs that have similar content but should be classified into different categories. On the other hand, separating a salt body from conformable seismic reflections is naturally an image segmentation task. Figure 1b shows an output example of the segmentation method; in the example, all pixels in the input window are classified simultaneously and output together as a mask. Previous researchers (Lomask et al., 2007; Ramirez et al., 2016) discuss salt boundary extraction as a global image segmentation problem. Treating geobody interpretation as image segmentation addresses both disadvantages of patch-based methods. In computer vision, image segmentation using deep-learning techniques is an actively researched topic with promising progress (Girshick, 2015; Ronneberger et al., 2015; Xie and Tu, 2015; Badrinarayanan et al., 2017; He et al., 2017; Ren et al., 2017). Zhao (2018) and Di et al. (2018b) present encouraging results using 2D encoder-decoder networks to separate different seismic facies, including salt domes, low-coherence, low-amplitude dipping, and high-amplitude deformed facies, and compare the results with the patch-based method. Wang et al. (2018) show with 2D synthetic examples that it is possible to perform salt detection, even from prestack seismic data, via a segmentation network. Wu et al. (2019) show that a segmentation network can be highly effective and efficient for 3D seismic fault interpretation.
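To make the first disadvantage concrete, the following minimal sketch counts network evaluations for the two strategies. The volume size and tile size are hypothetical, and the non-overlapping tiling is a simplifying assumption made only for illustration; it is not a description of the inference scheme used in this paper, and the cost of a single evaluation generally differs between the two networks.

```python
import math

def patch_based_passes(shape):
    """Patch-based classification: one forward pass per voxel,
    because the window is re-centered on every voxel."""
    return math.prod(shape)

def segmentation_passes(shape, tile):
    """Segmentation: one forward pass per tile (assumed non-overlapping),
    because each pass labels every voxel in the tile at once."""
    return math.prod(math.ceil(n / tile) for n in shape)

shape = (256, 256, 256)  # hypothetical seismic volume, in voxels
tile = 64                # hypothetical subvolume edge length

print(patch_based_passes(shape))         # 16777216
print(segmentation_passes(shape, tile))  # 64
```

In practice, segmentation tiles are often overlapped to reduce edge effects, which raises the second count but leaves it far below one pass per voxel.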
Therefore, in this paper, we propose to apply a deep CNN-based segmentation model to tackle 3D seismic salt interpretation automatically. We adopt the network architecture from U-net (Ronneberger et al., 2015) to build a 3D encoder-decoder network with skip connections. The network takes a seismic subvolume of a certain size (the receptive field) as input, and it outputs a salt probability subvolume of the same size. To train and validate the model, we use the migrated image of the SEG Advanced Modeling (SEAM) Phase I synthetic data (Fehler and Keliher, 2011) as the input image, and we extract a binary salt mask from the corresponding velocity model by thresholding and clipping. We split the data volume into a training part and a validation part; the training part is used to optimize the network, and the validation part is used to test the generalization of the trained model via a blind test. During the training process, a data generator randomly crops and rotates a subvolume according to the size of the network’s receptive field. After a sufficient amount of training, we use the network to find the salt probability of all parts of the data, and we compare it with the ground truth salt mask via several quantitative metrics. Furthermore, we apply the model to a field seismic data set, where it produces a reasonable salt model prediction.
Network architecture
The first semantic segmentation method using an encoder-decoder architecture is the fully convolutional network (Long et al., 2015). Like other CNNs, the encoder-decoder network consists of multiple stacked convolutional layers; the difference is that, instead of using a fully connected layer at the output to connect with categorical data, the encoder-decoder uses a convolutional layer to retain all spatial information and connect to multidimensional data. This allows for image-to-image segmentation rather than the image-to-class classification of an ordinary CNN.
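The difference between the two output layers can be sketched with NumPy. This is an illustration only: the feature-map sizes, the random weights, and the sigmoid used to turn scores into probabilities are assumptions, not details taken from the network in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature map from the last hidden layer:
# (channels, depth, height, width)
features = rng.standard_normal((16, 8, 8, 8))

# Classification head: flatten, then apply a fully connected layer.
# The output is one score per class, so the spatial layout is lost.
w_fc = rng.standard_normal((2, features.size))
class_scores = w_fc @ features.ravel()   # shape (2,)

# Segmentation head: a 1x1x1 convolution is a linear map over channels
# applied independently at every voxel, so the spatial layout is kept.
w_conv = rng.standard_normal(16)
logits = np.einsum("c,cdhw->dhw", w_conv, features)
salt_probability = 1.0 / (1.0 + np.exp(-logits))  # shape (8, 8, 8)

print(class_scores.shape, salt_probability.shape)
```

The fully connected head maps the whole subvolume to a single class decision, whereas the convolutional head returns one probability per voxel, which is what image-to-image segmentation requires.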
Another important feature of the encoder-decoder network is the “bottleneck” architecture: The input data are gradually downsampled after passing through the encoder layers, and then they are upsampled layer-by-layer in the decoder section, as shown in Figure 2. The downsampling is achieved by selecting fewer pixels from the image feature according to a certain algorithm, e.g., max pooling or average pooling (Boureau et al., 2010), so that less significant
Figure 1. (a) Demonstration of the patch-based classification methods: For each pixel, the network will take a window, centered at the point of interest, as its input and cl