Interpretation-A Journal of Subsurface Characterization journal website - 刊鹿 paper editing

Attribute selection in seismic facies classification: Application to a Gulf of Mexico 3D seismic survey and the Barnett Shale

Abstract:

Automated seismic facies classification using machine-learning algorithms is becoming more common in the geophysics industry. Seismic attributes are frequently used as input because they may express geologic patterns or depositional environments better than the original seismic amplitude. Selecting appropriate attributes becomes a crucial part of the seismic facies classification analysis. For unsupervised learning, principal component analysis can reduce the dimensions of the data while maintaining the highest variance possible. For supervised learning, the best attribute subset can be built by selecting input attributes that are relevant to the output class and avoiding redundant attributes that are similar to each other. Multiple attributes are tested to classify salt diapirs, mass transport deposits (MTDs), and the conformal reflector “background” for a 3D seismic marine survey acquired on the northern Gulf of Mexico shelf. We have analyzed attribute-to-attribute correlation and the correlation between the input attributes and the output classes to understand which attributes are relevant and which attributes are redundant. We found that amplitude and texture attribute families are able to differentiate salt, MTDs, and conformal reflectors. Our attribute selection workflow is also applied to the Barnett Shale play to differentiate limestone and shale facies. Multivariate analysis using filter, wrapper, and embedded algorithms was used to rank attributes by importance, so that the best attribute subset for classification can be chosen. We find that attribute selection algorithms for supervised learning not only reduce computational cost but also enhance the performance of the classification.

Introduction:

In the exploration and production industry, automated seismic facies classification is gradually being integrated into common workflows. 
Several machine-learning algorithms, such as self-organizing maps (SOMs) and K-means clustering, have been applied to automate seismic facies classification, and they are available in several commercial interpretation software packages. A great number of different seismic attributes can be used as input to machine-learning algorithms for classification and pattern recognition. However, some attributes express geologic or depositional patterns more effectively than others. For instance, the envelope (reflection strength) is sensitive to changes in acoustic impedance and has long been correlated to changes in lithology and porosity (Chopra and Marfurt, 2005). In many cases, the instantaneous frequency enhances interpretation of vertical and lateral variations of layer thickness (Chopra and Marfurt, 2005). Coherence measures lateral changes in the seismic waveform, which in turn can be correlated to lateral changes in structure and stratigraphy (Marfurt et al., 1998). Exploration generates large amounts of seismic data, and many of the attributes generated may be highly redundant. Adding to this problem, the original seismic amplitude data (and therefore the subsequently derived attributes) may contain significant noise (Coléou et al., 2003). Therefore, understanding the nature of seismic attributes is of crucial importance for providing the most reliable classifications. According to the Hughes phenomenon, adding attributes beyond a threshold value causes a classifier’s performance to degrade (Hughes, 1968). Several studies found that dimensionality reduction in machine-learning problems reduces computation time and storage space while still yielding meaningful facies classification results (Coléou et al., 2003; Roy et al., 2010; Roden et al., 2015). 
Principal component analysis (PCA) is one of the most popular methods, reducing a large multidimensional (multiattribute) data set into a lower dimensional data set spanned by composite (linear combinations of the original) attributes, while preserving variation. SOM also creates a lower dimensional representation of high-dimensional data to aid interpretation. PCA and SOM are types of unsupervised learning, in which the goal is to discover the underlying structure of the input data.
The University of Oklahoma, ConocoPhillips School of Geology and Geophysics, Norman, Oklahoma, USA. E-mail: yuji.kim@ou.edu (corresponding author); bob@ou.edu; kmarfurt@ou.edu. Manuscript received by the Editor 17 December 2018; revised manuscript received 14 March 2019; published ahead of production 29 May 2019; published online 23 August 2019. This paper appears in Interpretation, Vol. 7, No. 3 (August 2019); p. SE281–SE297, 16 FIGS., 9 TABLES. http://dx.doi.org/10.1190/INT-2018-0246.1. © 2019 Society of Exploration Geophysicists and American Association of Petroleum Geologists. All rights reserved. Special section: Machine learning in seismic data analysis.
Roden et al. (2015) use PCA to define a framework for multiattribute analysis to understand which seismic attributes are significant for unsupervised learning. In their study, the combination of attributes determined by PCA is used as input to SOM to identify geologic patterns and to define stratigraphy, seismic facies, and direct hydrocarbon indicators. Zhao et al. (2018) build on these ideas and suggest a weight matrix computed from the skewness and kurtosis of attribute histograms to improve SOM learning. 
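The PCA reduction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `pca`, the synthetic three-attribute data set, and the choice to standardize each attribute before the decomposition are not taken from the paper, and NumPy's singular value decomposition stands in for whatever implementation the cited studies used.

```python
import numpy as np

def pca(attributes, n_components):
    """Project samples (rows) onto the leading principal components.

    `attributes` is an (n_samples, n_attributes) array. All names here are
    illustrative and are not taken from the paper.
    """
    # Standardize each attribute so that differing units do not dominate.
    centered = attributes - attributes.mean(axis=0)
    scaled = centered / centered.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(scaled, full_matrices=False)
    variance = singular_values**2 / (scaled.shape[0] - 1)
    explained_ratio = variance / variance.sum()
    # Composite attributes: linear combinations of the original attributes.
    scores = scaled @ vt[:n_components].T
    return scores, explained_ratio

# Synthetic stand-in for three attributes, two of which are nearly redundant.
rng = np.random.default_rng(0)
base = rng.normal(size=500)
synthetic = np.column_stack([
    base,
    base + 0.1 * rng.normal(size=500),  # near-copy of the first attribute
    rng.normal(size=500),               # independent attribute
])
scores, explained_ratio = pca(synthetic, n_components=2)
```

In this toy case two composite attributes carry almost all of the variance of the three inputs, because the second input adds little beyond the first. On real seismic attributes the number of components worth keeping would have to be judged from `explained_ratio`.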
In general, attribute selection in unsupervised learning relies on the data distribution of the input attributes and the correlation between input attributes. Supervised learning maps a relationship between input attributes and output using an interpreter-defined training data set. Several supervised learning studies introduced attribute selection methods, also known as feature selection or variable selection, to reduce dimensionality (Jain and Zongker, 1997; Chandrashekar and Sahin, 2014). We present multiple strategies to select appropriate attributes for seismic facies classification with a case study. Our goals are to provide a good classification model in terms of validation accuracy, to avoid overfitting, and to reduce the computation and memory requirements needed for generating seismic attributes. A desirable attribute subset might be built by detecting relevant attributes and discarding the irrelevant ones (Sánchez-Maroño et al., 2007). Relevant attributes are those that are highly correlated with the output classes, whereas redundant attributes are those that are highly correlated with each other. Barnes (2007) suggests that there are many redundant and useless attributes that breed confusion in seismic interpretation; we argue that these attributes also pose problems in machine-learning classification. To avoid building an unnecessarily complex model, we evaluate several attribute selection algorithms that maximize relevance and minimize redundancy to build an efficient subset of attributes for supervised facies classification analysis. Attribute selection methods can be classified into three groups: (1) a filter method that uses a correlation or dependency measure, (2) a wrapper method that applies a predictive model to evaluate the performance of an attribute subset, and (3) an embedded method, which measures the attribute importance during the training process. 
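As a rough illustration of the first group, the sketch below implements a very simple filter method that trades relevance against redundancy. It is a simplified stand-in, not the selection algorithm evaluated in the paper: the function name `greedy_filter_selection`, the use of absolute Pearson correlation for both relevance and redundancy, and the synthetic data are all assumptions made for this example. Wrapper and embedded methods are not shown.

```python
import numpy as np

def greedy_filter_selection(attributes, labels, n_select):
    """Greedy filter: prefer high relevance and low redundancy.

    Relevance  = |Pearson correlation| between an attribute and the labels.
    Redundancy = mean |Pearson correlation| with attributes already chosen.
    Hypothetical helper for illustration only.
    """
    n_attributes = attributes.shape[1]
    stacked = np.column_stack([attributes, labels])
    corr = np.abs(np.corrcoef(stacked, rowvar=False))
    relevance = corr[:n_attributes, -1]
    redundancy = corr[:n_attributes, :n_attributes]

    # Start from the single most relevant attribute.
    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        remaining = [j for j in range(n_attributes) if j not in selected]
        score = [relevance[j] - redundancy[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(score))])
    return selected

# Synthetic stand-in data (not from the surveys in the paper).
rng = np.random.default_rng(42)
n = 2000
u, v = rng.normal(size=n), rng.normal(size=n)
labels = (u + v > 0).astype(float)      # two-class target driven by u and v
attributes = np.column_stack([
    u + 0.1 * rng.normal(size=n),       # 0: relevant
    u + 0.1 * rng.normal(size=n),       # 1: relevant but redundant with 0
    v + 0.1 * rng.normal(size=n),       # 2: relevant, independent of 0 and 1
    rng.normal(size=n),                 # 3: irrelevant noise
])
selected = greedy_filter_selection(attributes, labels, n_select=2)
```

With these synthetic inputs the selection keeps attribute 2 together with only one of the two near-duplicate attributes 0 and 1, and leaves out the noise attribute. Pearson correlation captures only linear dependence, so a nonlinear measure would likely be a better choice for real attributes.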
Because multiple attributes are analyzed simultaneously in the test, we consider our attribute selection algorithm to be a multivariate algorithm. We compare the three types of attribute selection algorithms to build an efficient subset to differentiate seismic facies in a Gulf of Mexico survey. We generate 20 attributes from amplitude, instantaneous, geometric, texture, and spectral categories. The aim of the case study is to classify the specific facies based on patterns from a labeled training data set. We define the target classes of training data as being the facies corresponding to salt diapirs, MTDs, and conformal reflectors, which are created from manual geologic and stratigraphic interpretation. Correlations between attributes and correlations between attributes and output classes are analyzed using different measures to investigate the relevance and redundancy of each seismic attribute. The selected attributes are tested using a random forest (RF) algorithm, and the classification results are discussed. 
Figure 1. Different types of relationship between variables X and Y and their correlation coefficients and regression scores. Each scatterplot describes a different relationship between X and Y: (a and d) linear, monotonic relationships, (b and e) nonlinear, monotonic relationships, and (c and f) nonlinear, nonmonotonic relationships. Gaussian noise of 10% has been added to variable Y in (d-f). Coefficients are computed using Pearson, rank, mutual information (MI), and distance correlation methods. A regression score is computed for the linear Bayesian, neural network (NN), RF, and support vector machine (SVM) regressor predictive algorithms. The best hyperparameters for each model were obtained using a grid-search algorithm.
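The contrast among the correlation measures named in the Figure 1 caption can be reproduced on toy data. The sketch below is an assumption-laden illustration, not the computation behind the figure: the three test functions, the noise-free setup, and the helper names `pearson`, `spearman`, and `distance_correlation` are chosen here for the example, and mutual information is left out to keep the block short.

```python
import numpy as np

def pearson(x, y):
    """Linear (Pearson) correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Rank correlation: Pearson correlation of the ranks.

    A double argsort gives ranks when there are no ties, which holds for
    the continuous synthetic data used below.
    """
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

def distance_correlation(x, y):
    """Sample distance correlation of two 1D variables."""
    def centered(v):
        # Pairwise distance matrix, double-centered by row and column means.
        d = np.abs(v[:, None] - v[None, :])
        return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()
    a, b = centered(x), centered(y)
    dcov2 = (a * b).mean()
    return float(np.sqrt(dcov2 / np.sqrt((a * a).mean() * (b * b).mean())))

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=500)
relationships = {
    "linear, monotonic": 2.0 * x,
    "nonlinear, monotonic": np.exp(4.0 * x),
    "nonlinear, nonmonotonic": x**2,
}
results = {
    name: {
        "pearson": pearson(x, y),
        "rank": spearman(x, y),
        "distance": distance_correlation(x, y),
    }
    for name, y in relationships.items()
}
```

In this setup the rank coefficient stays at 1 for the monotonic exponential while the Pearson coefficient drops, and for the parabola both are close to zero while the distance correlation remains clearly positive. That is the kind of difference that motivates looking beyond a single linear measure when judging relevance and redundancy.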
We also apply our workflow to the Barnett Shale play in the Fort Worth Basin to differentiate shale and limestone facies using inverted physical properties as input attributes. The output class is labeled based on stratigraphic interpretation aided by adjacent wireline logs. The classification results using different attribute subsets are discussed.

Correlation measures to maximize relevance and minimize redundancy:

Finding an optimal subset can be achieved by maximizing the relevance between attributes and output classes, while minimizing redundancy among attributes (Yu and Liu, 2004; Peng et al., 2005). To maximize relevance, attributes

Author Listing: Yuji Kim; Robert Hardisty; Kurt J. Marfurt

Volume: 7

Pages: SE281–SE297

DOI: 10.1190/INT-2018-0246.1

Language: English

Journal: Interpretation

Interpretation-A Journal of Subsurface Characterization

INTERPRETATION-J SUB

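Impact factor: 1.1

Review journal: No

Open access: No

Warning status: Not on the warning list

Publication start date: -

ISSN: 2324-8858

Publication frequency: -

Indexed in: SCIE/Scopus

Country/region of publication: UNITED STATES

Publisher: Society of Exploration Geophysicists

Journal introduction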

***Jointly published by the American Association of Petroleum Geologists (AAPG) and the Society of Exploration Geophysicists (SEG)*** Interpretation is a new, peer-reviewed journal for advancing the practice of subsurface interpretation.

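Articles per year	81
Articles by authors from China	50
Share of articles by authors from China	61.73%
Self-citation rate	9.1%
Average acceptance rate	-
Average review time	-
Publication fee	-
Main research area	GEOCHEMISTRY & GEOPHYSICS-
Journal website	-
Submission link	-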

Quality indicators

Research article share	OA citation share	Retraction share	Post-publication correction share
98.77%	2.37%	0.00%	0.98%

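Warning list status

Date	Warning status
2024 edition, released February 2024	Not on the warning list
2023 edition, released January 2023	Not on the warning list
2021 edition, released December 2021	Not on the warning list
2020 edition, released December 2020	Not on the warning list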

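JCR quartile, WOS quartile rank: Q3

Edition	By subject	Quartile
WOS journal SCI quartile: the Q1-Q4 rank that Web of Science (the official SCI source) assigns by sorting the journals within each subject by impact factor and dividing them into four equal groups. Q1 denotes the highest quality and is what is commonly called a zone 1 journal. (Latest edition, 2021-2022)
	GEOCHEMISTRY & GEOPHYSICS	Q3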

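About the 2019 Chinese Academy of Sciences (CAS) journal ranking, upgraded edition (trial)

The upgraded edition (trial) of the ranking table aims to resolve the mismatch between how journals are divided into subject categories and how disciplines actually develop and merge. Because interdisciplinary work is increasingly prominent in contemporary research, any fixed subject scheme tends to be contentious. To free journal evaluation from the constraints of a subject scheme, the upgraded approach first builds a topic system at the level of individual papers, then computes each paper's influence within its topic, and finally aggregates the scores of all papers in each journal to obtain a "journal surpassing index" (期刊超越指数), which serves as the basis for the ranking.

The upgraded edition (trial) has two advantages. First, the paper-level topic system reflects interdisciplinary work and precisely reveals the multidisciplinary nature of a journal's content. Second, replacing the impact factor with the journal surpassing index removes the distortion that the impact factor's mathematical shortcomings introduce into evaluation results. Overall, the upgraded edition (trial) overcomes bottlenecks in journal evaluation such as building a subject scheme and choosing evaluation indicators, reveals the influence of academic journals more comprehensively, and offers one way to move research evaluation away from the "four onlys" (去四唯). The underlying research has been recognized by international peers and published in leading international scientometrics journals.

The 2019 Journal Ranking Table of the National Science Library, Chinese Academy of Sciences, upgraded edition (trial), includes Social Sciences Citation Index (SSCI) journals in the ranking for the first time. The upgraded table (trial) defines 18 major subject categories spanning the natural and social sciences. The basic and upgraded (trial) editions will coexist for a three-year transition period, during which universities and research institutes will presumably continue to use the basic edition as their assessment reference. Note: the official CAS ranking WeChat account "fenqubiao" offers lookups for the basic edition only and does not yet carry upgraded-edition data, so take care to distinguish the two.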

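CAS ranking details

Edition	Major subject category	Minor subject category	Top journal	Review journal
	Earth Sciences, Zone 4	GEOCHEMISTRY & GEOPHYSICS, Zone 4	No	No
Basic edition, December 2021	Geosciences, Zone 4	GEOCHEMISTRY & GEOPHYSICS, Zone 4	No	No
Upgraded edition, December 2021	Earth Sciences, Zone 4	GEOCHEMISTRY & GEOPHYSICS, Zone 4	No	No
Previous upgraded edition, December 2020	Earth Sciences, Zone 4	GEOCHEMISTRY & GEOPHYSICS, Zone 4	No	No
Latest upgraded edition, December 2022	Earth Sciences, Zone 4	GEOCHEMISTRY & GEOPHYSICS, Zone 4	No	No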