CorePy: XRF clustering tools to interpret and visualize geological core data


Keywords
geology, machine-learning, python, subsurface, xrf
License
Other
Install
pip install corepytools==0.0.14

CorePy package

CorePy is a data analytics tool designed to integrate core-based geological data for machine learning characterization.
- The primary focus of CorePy is to classify high-resolution X-ray fluorescence (XRF) data into chemofacies
- Unsupervised clustering and supervised classification tools are applied
- Folder structures are developed to simplify working on multiple cores and formations
- Visualizations are used to validate clustering results
- CorePy consists of individual scripts and a pip-installable Python package called corepytools

Installation

1) Additional notes and tips about GitHub, and the steps I use, are here: https://github.com/Totilarson/MyCheatSheet
- Fork the CorePy repo to your GitHub account
- Make a local clone:
 - command line: `git clone https://github.com/Totilarson/CorePy.git`
 - if it is necessary to remove Git tracking from the local clone, use: `rm -rf .git*`
2) Navigate to the local repo //CorePy/ and inspect the folders 'CoreData' and 'CorePycodes'
3) Run `pip install -r requirements.txt` to install all the necessary dependencies

Package Dependencies

Install packages with pip: `pip install -r requirements.txt`

Data examples

- The CoreData folder contains an example of a high-resolution XRF dataset and corebox photographs
- Naming patterns for core box sticker location, wireline depths, and elemental concentrations are shown 

Settings.py

1) In //CorePy/CorePycodes open 'settings.py'
- 'settings.py' contains variables for all the Python scripts
- "CoreOfStudy", "Depth_model", "Formation", and "RockClassification" should match values in the Public_XRF.csv data file
- Machine learning parameters are stored here
- 'chemocolor' is generated here. It defines formation-specific color schemes. If you add a new formation, you have to add its color scheme here
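The README notes elsewhere that the settings script writes a Run_settings.json file that other scripts read. A minimal sketch of that pattern follows, assuming a flat dictionary of settings. The key names come from this README, but every value and both helper functions (`write_run_settings`, `read_run_settings`) are hypothetical placeholders, not the project's actual code.

```python
import json
import tempfile
from pathlib import Path

# Key names are taken from the README; all values are illustrative placeholders.
run_settings = {
    "CoreOfStudy": "Public",
    "Depth_model": "Depth_calculated",
    "Formation": "Public",
    "RockClassification": "Chemofacies_PCA",
    "Elements_plotted": ["Ca", "Si", "Al"],
}


def write_run_settings(settings, folder):
    """Write the settings dictionary to Run_settings.json and return its path."""
    path = Path(folder) / "Run_settings.json"
    with open(path, "w") as f:
        json.dump(settings, f, indent=2)
    return path


def read_run_settings(folder):
    """Load Run_settings.json, as a downstream script might."""
    with open(Path(folder) / "Run_settings.json") as f:
        return json.load(f)


# Round trip through a temporary folder so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    write_run_settings(run_settings, tmp)
    loaded = read_run_settings(tmp)

print(loaded["Depth_model"])
```

Keeping the settings in one JSON file means each script can look up values such as `Run_settings['Depth_model']` without importing the settings module itself.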

CoreBeta file

- <corename>.json files are stored for each core in //CorePy/CoreData/CoreBeta
- These files provide core-specific data that is referenced in each script
- Wireline scripts also write data to the .json file
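Because scripts both read from and write to the per-core file, a read-modify-write helper is one way to picture the workflow. The sketch below is an assumption about how such an update could look. The function `update_core_json` and the starting keys are hypothetical; only the "Attribute_plotted" key name appears in this README.

```python
import json
import tempfile
from pathlib import Path


def update_core_json(corebeta_dir, corename, updates):
    """Read <corename>.json, apply the updates, write it back, and return the result."""
    path = Path(corebeta_dir) / f"{corename}.json"
    with open(path) as f:
        core_data = json.load(f)
    core_data.update(updates)
    with open(path, "w") as f:
        json.dump(core_data, f, indent=2)
    return core_data


with tempfile.TemporaryDirectory() as corebeta_dir:
    # Hypothetical starting file; real CoreBeta files may hold different keys.
    start = {"corename": "Public", "Formation": "Public"}
    core_file = Path(corebeta_dir) / "Public.json"
    core_file.write_text(json.dumps(start))

    updated = update_core_json(corebeta_dir, "Public", {"Attribute_plotted": "TOC"})
    reloaded = json.loads(core_file.read_text())

print(updated)
```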

Attribute_merge.py

- Merges attribute data from //CorePy/CoreData/CoreAttributes/<core name> with the XRF input file
- The files are merged based on the Core-box-inch input from the XRF and attribute files
- Output is a .csv file that merges XRF and attribute data
- If no attribute data is in the folder, the merge is skipped
- Running Attribute_merge.py will build additional output folders
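The merge described above can be sketched with pandas as a left join on the shared location columns. This is an illustration only: the column names "Core", "Box", and "Inch", the sample values, and the function `merge_attributes` are assumptions, and the actual script may handle keys and missing data differently.

```python
import pandas as pd

# Synthetic XRF rows; column names and values are assumed for illustration.
xrf = pd.DataFrame({
    "Core": ["Public", "Public", "Public"],
    "Box": [1, 1, 2],
    "Inch": [2, 4, 2],
    "Ca": [30.1, 28.4, 12.0],
})

# Synthetic attribute rows measured at only some of the XRF locations.
attributes = pd.DataFrame({
    "Core": ["Public", "Public"],
    "Box": [1, 2],
    "Inch": [2, 2],
    "TOC": [1.5, 3.2],
})


def merge_attributes(xrf_df, attribute_df, keys=("Core", "Box", "Inch")):
    """Left-join attributes onto XRF rows so every XRF measurement is kept."""
    return xrf_df.merge(attribute_df, on=list(keys), how="left")


merged = merge_attributes(xrf, attributes)
print(merged)
```

A left join keeps every XRF row and leaves the attribute column empty (NaN) where no attribute was measured at that core-box-inch location.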

PCAexample

- Running PCAexample.py will run PCA-Kmeans
- Output files are in the output folder. The CSV file includes additional columns of data
- Settings.py writes a Run_settings.json file that is accessed by other scripts
- Machine learning parameters for the neural model and XGBoost have been added to settings
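For readers unfamiliar with the PCA-Kmeans approach, the sketch below shows the general technique with scikit-learn on synthetic data: standardize the element columns, reduce them with PCA, then cluster the principal component scores with K-means. It is not the code in PCAexample.py. The function `pca_kmeans`, the parameter values, and the data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for XRF data: two well-separated groups, five "elements".
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(50, 5))
group_b = rng.normal(loc=8.0, scale=1.0, size=(50, 5))
X = np.vstack([group_a, group_b])


def pca_kmeans(data, n_components=2, n_clusters=2, random_state=0):
    """Standardize, project onto principal components, then cluster the scores."""
    scaled = StandardScaler().fit_transform(data)
    scores = PCA(n_components=n_components).fit_transform(scaled)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(scores)
    return scores, labels


scores, labels = pca_kmeans(X)
print(scores.shape, np.unique(labels))
```

In a chemofacies workflow, the cluster labels would be appended to the data table as an additional column, which is consistent with the extra CSV columns mentioned above.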

NN_Build.py and NN_apply.py

- These scripts build and apply results from supervised chemofacies classifications
- An example training dataset is included in //CorePy/CoreData/CoreNeuralModel
- Model parameters are output to _XGB and __NN files in //CorePy/CoreData/CoreNeuralModel
- The output .csv file has additional classification columns

Corebox_Crop.py

- This code takes trial and error to get the bounding parameters correct
- line 38: "corepy.cropCorebox((70, 125, 740, 920)". Those four values are to be adjusted
- line 17: core_depth = 3978. This is adjusted to match core box photos
- Corebox photos are unique and it takes time to get this part correct

CorePy_plotting.py

1) Provides additional elemental plotting
 - elemental cross plots. Elements are selected from Run_settings['Elements_plotted']
 - elements plotted with respect to depth. The depth model is selected by Run_settings['Depth_model']
 - element box plots for major and trace elements
 - pie chart of chemofacies abundance
 - depth-referenced chemostrat column, output to the folder //CorePy/CoreOutput/CrossSection/

Coreimage.py

- Designed to overlay chemofacies results on corebox photographs
- Requires corebox photographs to be converted to 'coretubes'
- Coretubes are created by Corebox_Crop.py
- Coretubes are depth registered and stored in the folder //CorePy/CoreData/CoreTubes/

Core_attribute.py

- This is a plotting function that develops descriptive statistics for each chemofacies based on attributes
- Core_attribute.py is run after Attribute_merge.py
- Output is a .csv file with descriptive statistics
- Box plots and depth plots show attribute results with respect to chemofacies
- It is necessary to add "Attribute_plotted" to the core .json file
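Descriptive statistics per chemofacies can be illustrated with a pandas group-by. The sketch below uses synthetic values and assumed column names ("Chemofacies", "TOC"). It shows the general idea rather than the script's actual output format.

```python
import pandas as pd

# Synthetic merged table; column names and values are assumed for illustration.
df = pd.DataFrame({
    "Chemofacies": [1, 1, 2, 2, 2],
    "TOC": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# One row of summary statistics (count, mean, std, quartiles) per chemofacies.
stats = df.groupby("Chemofacies")["TOC"].describe()
print(stats)
```

The resulting table could be saved with `stats.to_csv(...)` to produce a .csv file of descriptive statistics.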

About the authors

CorePy is being developed by Toti Larson at the University of Texas at Austin, Bureau of Economic Geology, Mudrocks Systems Research Laboratory (MSRL) research consortium.

  1. Toti E. Larson, Ph.D. - Research Associate at the University of Texas at Austin. PI MSRL research consortium

  2. Esben Pedersen, M.S. - Graduate student (graduated 2020) at the University of Texas at Austin.

  3. Priyanka Periwal, Ph.D. - Research Science Associate at the University of Texas at Austin.

  4. J. Evan Sivil - Research Science Associate at the University of Texas at Austin.

  5. Geoforce students - Ana Letícia Batista (2020) - Jackson State University

Package Inventory

Notes

Install corepytools using `pip install corepytools`. Then go to the author's GitHub account to download example Python scripts that use corepytools.

Folder structure

CorePy

|-LICENSE.txt         **MIT**

|-README.md           **edited in markdown**

|-gitignore          

|CoreData

    |CoreAttributes
    |CoreBeta
    |CoreBoxPhotos
    |CoreNeuralModel
    |CoreXRF

|CoreOutput

|CorePycodes