Databricks Inc.

05/12/2025 | News release | Distributed by Public on 05/12/2025 12:26

How Equinor Optimized Seismic Data Pipeline with Databricks

The oil and gas industry relies heavily on seismic data to explore and extract hydrocarbons safely and efficiently. However, processing and analyzing large amounts of seismic data can be a daunting task, requiring significant computational resources and expertise.

Equinor, a leading energy company, has used the Databricks Data Intelligence Platform to optimize one of its exploratory seismic data transformation workflows, achieving significant time and cost savings while improving data observability.

Equinor's goal was to enhance one of its 4D seismic interpretation workflows, focusing on automating and optimizing the detection and classification of reservoir changes over time. This process supports identifying drilling targets, reducing the risk of costly dry wells, and promoting environmentally responsible drilling practices. Key business expectations included:

  • Optimal drilling targets: Improve target identification to support drilling a large number of new wells in the coming decades.
  • Faster, cost-effective analysis: Reduce the time and cost of 4D seismic analysis through automation.
  • Deeper reservoir insights: Integrate more subsurface data to unlock improved interpretations and decision-making.

Understanding Seismic Data

Seismic Cube: 3D models of the subsurface

Seismic data acquisition involves deploying air guns to generate sound waves, which reflect off subsurface structures and are captured by hydrophones. These sensors, located on streamers towed by seismic vessels or placed on the seafloor, collect raw data that is later processed to create detailed 3D images of the subsurface geology.

  • File Format: SEG-Y - an open standard file format for storing seismic data, maintained by the Society of Exploration Geophysicists (SEG), developed in the 1970s and originally designed for tape storage.
  • Data Representation: The processed data is stored as 3D cubes, offering a comprehensive view of subsurface structures.
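
To make the file layout behind such a cube concrete, here is a minimal Python sketch; it is not Equinor's pipeline code. It relies on the published SEG-Y layout (a 3200-byte textual header, a 400-byte binary header, then traces that each carry a 240-byte header) and assumes big-endian data, fixed-length traces, and no extended textual headers. The helper names and the synthetic in-memory buffer are hypothetical and exist only for illustration.

```python
import struct

TEXT_HEADER_BYTES = 3200    # textual file header
BINARY_HEADER_BYTES = 400   # binary file header
TRACE_HEADER_BYTES = 240    # header that precedes every trace

# Bytes per sample for a few common SEG-Y data sample format codes.
BYTES_PER_SAMPLE = {1: 4, 2: 4, 3: 2, 5: 4, 8: 1}


def make_synthetic_segy(n_traces, n_samples, interval_us=4000, format_code=5):
    """Build a tiny SEG-Y-like byte buffer (hypothetical helper, zero-valued traces)."""
    binary = bytearray(BINARY_HEADER_BYTES)
    struct.pack_into(">H", binary, 16, interval_us)   # file bytes 3217-3218
    struct.pack_into(">H", binary, 20, n_samples)     # file bytes 3221-3222
    struct.pack_into(">H", binary, 24, format_code)   # file bytes 3225-3226
    samples = struct.pack(f">{n_samples}f", *([0.0] * n_samples))
    trace = bytes(TRACE_HEADER_BYTES) + samples
    return b" " * TEXT_HEADER_BYTES + bytes(binary) + trace * n_traces


def read_binary_header(buf):
    """Read three fields of the binary file header as big-endian 16-bit integers."""
    start = TEXT_HEADER_BYTES
    (interval_us,) = struct.unpack_from(">H", buf, start + 16)
    (n_samples,) = struct.unpack_from(">H", buf, start + 20)
    (format_code,) = struct.unpack_from(">H", buf, start + 24)
    return {
        "interval_us": interval_us,
        "n_samples": n_samples,
        "format_code": format_code,
    }


def count_traces(buf):
    """Derive the trace count from the buffer size, assuming fixed-length traces."""
    header = read_binary_header(buf)
    sample_bytes = BYTES_PER_SAMPLE[header["format_code"]]
    trace_bytes = TRACE_HEADER_BYTES + header["n_samples"] * sample_bytes
    payload = len(buf) - TEXT_HEADER_BYTES - BINARY_HEADER_BYTES
    return payload // trace_bytes


segy = make_synthetic_segy(n_traces=6, n_samples=4)
print(read_binary_header(segy), count_traces(segy))
```

For a regular survey, the traces counted this way would then be arranged by inline and crossline position to form the 3D cube; that geometry lives in the trace headers, which this sketch does not parse.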

Fig. 1: Seismic survey acquiring seismic data; the raw data is then processed into 3D cubes. Source: Krzysztof Wróbel and Bogumił Łączyński, "Specificity of Geotechnical Measurements and Practice of Polish Offshore Operations", The International Journal on Marine Navigation and Safety of Sea Transportation, volume 9, number 4, December 2015 (retrieved 15-06-2015).

Seismic Horizons: Mapping Geological Boundaries

Seismic horizons are interpretations of seismic data, representing continuous surfaces within the subsurface. These horizons indicate geological boundaries, tied to changes in rock properties or even fluid content. By analyzing the reflections of seismic waves at these boundaries, geologists can identify key subsurface features.

  • File Format: CSV - commonly used for storing interpreted seismic horizon data.
  • Data Representation: Horizons are stored as 2D surfaces.
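
As an illustration of how a CSV horizon maps onto a 2D surface, the following Python sketch parses rows into a grid indexed by inline and crossline. The column names (`inline`, `crossline`, `depth`) and the sample values are assumptions made for this example; actual horizon exports may use different columns, units, or coordinate systems.

```python
import csv
import io

# Hypothetical horizon export: one interpreted depth value per grid position.
SAMPLE_CSV = """inline,crossline,depth
100,200,1510.5
100,201,1512.0
101,200,1509.0
"""


def load_horizon(csv_text):
    """Parse horizon rows into a {(inline, crossline): depth} mapping."""
    surface = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["inline"]), int(row["crossline"]))
        surface[key] = float(row["depth"])
    return surface


def to_grid(surface):
    """Arrange the mapping as a 2D grid; positions with no pick become None."""
    inlines = sorted({il for il, _ in surface})
    crosslines = sorted({xl for _, xl in surface})
    grid = [[surface.get((il, xl)) for xl in crosslines] for il in inlines]
    return inlines, crosslines, grid


inlines, crosslines, grid = to_grid(load_horizon(SAMPLE_CSV))
print(inlines, crosslines, grid)
```

Keeping unpicked positions as `None` makes gaps in the interpretation explicit instead of silently filling them with a default depth.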

Fig. 2: An example of two seismic horizons. From Open Inventor Toolkit, Seismic Horizon (Height Field).

Challenges with the Existing Pipeline

The current seismic data pipeline processes data to generate the following key outputs:

  1. 4D Seismic Difference Cube: Tracks changes over time by comparing two seismic cubes of the same physical area, typically acquired months or years apart.
  2. 4D Seismic Difference Maps: These maps contain attributes or features from the 4D seismic cubes to highlight specific changes in the seismic data, aiding reservoir analysis.
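
The two outputs above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the pipeline's actual implementation: both cubes are assumed to be already aligned on the same grid, the names `base` and `monitor` for the earlier and later surveys are a naming choice made here, and RMS amplitude over a fixed sample window is just one possible map attribute. Nested lists keep the example dependency-free; real cubes would normally be held in array libraries.

```python
import math


def difference_cube(base, monitor):
    """Sample-by-sample difference (monitor minus base) of two aligned cubes.

    Each cube is indexed as cube[inline][crossline][sample].
    """
    return [
        [
            [m - b for b, m in zip(base_trace, monitor_trace)]
            for base_trace, monitor_trace in zip(base_line, monitor_line)
        ]
        for base_line, monitor_line in zip(base, monitor)
    ]


def rms_map(cube, first, last):
    """Collapse samples first..last (inclusive) of every trace to its RMS amplitude."""
    def rms(trace):
        window = trace[first:last + 1]
        return math.sqrt(sum(v * v for v in window) / len(window))

    return [[rms(trace) for trace in line] for line in cube]


# Tiny synthetic example: 1 inline, 2 crosslines, 3 samples per trace.
base = [[[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]]
monitor = [[[1.0, 5.0, 7.0], [0.0, 0.0, 0.0]]]

diff = difference_cube(base, monitor)
change_map = rms_map(diff, first=1, last=2)
print(diff, change_map)
```

In this example the first trace changes between surveys and the second does not, so the resulting 2D map highlights only the first position.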

However, several challenges limit the efficiency and scalability of the existing pipeline:

  • Suboptimal Distributed Processing: Relies on multiple standalone Python jobs running in parallel on single-node clusters, leading to inefficiencies.
  • Limited Resilience: Prone to failures and lacks mechanisms for error tolerance or automated recovery.
  • Lack of Horizontal Scalability: Scales only by using larger nodes, requiring machines with substantial memory (e.g., 112 GB), which drives up costs.
  • High Development and Maintenance Effort: Managing and troubleshooting the pipeline demands significant engineering resources.

Databricks Inc. published this content on May 12, 2025, and is solely responsible for the information contained herein. Distributed via Public Technologies (PUBT), unedited and unaltered, on May 12, 2025 at 18:26 UTC.