Utilities Module

The utilities module contains essential helper functions and common operations used throughout ATMOS-BUD for logging, coordinate transformations, data processing, and atmospheric feature detection.

Main Functions

src.utils.initialize_logging(results_subdirectory, args)[source]

Initializes the logging configuration for the application.

Parameters:
  • results_subdirectory (str) – Directory path to save the log file.

  • args (object) – The argparse object containing the command line arguments.

src.utils.convert_lon(input_data, longitude_indexer)[source]

Converts longitudes from the 0:360 range to the -180:180 range.

Parameters:
  • input_data (xarray.DataArray) – gridded data.

  • longitude_indexer (str) – coordinate indexer used for longitude.

Returns:

xr – gridded data with longitude converted to desired format.

Return type:

xarray.DataArray
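
The conversion itself is a short piece of modular arithmetic followed by a sort. The sketch below illustrates it on a plain Python list rather than on an xarray object; wrap_longitudes is a hypothetical helper that is not part of src.utils, and it assumes the input values lie in the 0 to 360 range.

```python
def wrap_longitudes(lons):
    """Map longitudes from the 0..360 convention to -180..180 and sort them.

    Hypothetical helper for illustration only; the real convert_lon
    operates on xarray objects.
    """
    # Shift by 180, wrap with modulo 360, shift back.
    # Note that 180 maps to -180 under this formula.
    wrapped = [((lon + 180.0) % 360.0) - 180.0 for lon in lons]
    # Sorting restores a monotonically increasing coordinate after wrapping.
    return sorted(wrapped)


example = wrap_longitudes([0.0, 90.0, 180.0, 270.0, 359.0])
print(example)
```

On a real grid the data values would have to be reordered together with the coordinate, which is why the documented function sorts the whole xarray object along the longitude indexer rather than sorting the coordinate alone.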

src.utils.handle_track_file(input_data, times, longitude_indexer, latitude_indexer, app_logger)[source]

Handles the track file by validating its time and spatial limits against the provided dataset.

Parameters:
  • input_data (xr.Dataset) – An xarray Dataset containing the data to compute the energy cycle.

  • times (pd.DatetimeIndex) – The time series of the dataset.

  • longitude_indexer (str) – The name of the longitude coordinate in the dataset.

  • latitude_indexer (str) – The name of the latitude coordinate in the dataset.

  • app_logger (logging.Logger) – Logger for logging messages.

Returns:

DataFrame containing the track information if the track file is valid.

Return type:

pd.DataFrame

Raises:
  • FileNotFoundError – If the track file is not found.

  • ValueError – If the time or spatial limits of the track file do not match the dataset.
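
The kind of check described above can be sketched as follows. This is a minimal illustration that uses plain tuples and datetime objects instead of pandas and xarray; validate_track_limits and its arguments are hypothetical names, and the actual ATMOS-BUD implementation may differ.

```python
from datetime import datetime


def validate_track_limits(track, data_times, lat_bounds, lon_bounds):
    """Check that every track point lies inside the dataset's limits.

    track: list of (time, lat, lon) tuples.
    data_times: sorted list of datetimes covered by the dataset.
    lat_bounds, lon_bounds: (min, max) tuples describing the dataset grid.
    Hypothetical sketch, not the ATMOS-BUD implementation.
    """
    t_min, t_max = data_times[0], data_times[-1]
    for time, lat, lon in track:
        if not t_min <= time <= t_max:
            raise ValueError(f"Track time {time} is outside the dataset period.")
        if not lat_bounds[0] <= lat <= lat_bounds[1]:
            raise ValueError(f"Track latitude {lat} is outside the dataset domain.")
        if not lon_bounds[0] <= lon <= lon_bounds[1]:
            raise ValueError(f"Track longitude {lon} is outside the dataset domain.")
    # The track is returned unchanged when all checks pass.
    return track


dataset_times = [datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 6)]
track_points = [(datetime(2020, 1, 1, 3), -30.0, -45.0)]
validate_track_limits(track_points, dataset_times, (-60.0, 0.0), (-80.0, -20.0))
```

Failing early with a ValueError keeps a mismatched track from silently producing an empty or misplaced domain later in the analysis.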

src.utils.find_extremum_coordinates(ds_data, lat, lon, variable, args)[source]

Finds the indices of the extremum values for a given variable.

Parameters:
  • ds_data (xarray.DataArray) – An xarray DataArray containing the data to compute the energy cycle.

  • lat (xarray.DataArray) – An xarray DataArray containing the latitudes of the data.

  • lon (xarray.DataArray) – An xarray DataArray containing the longitudes of the data.

  • variable (str) – A string containing the name of the variable to find the indices for.

Returns:

A tuple containing the indices of the extremum values for the specified variable.
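
To make the idea concrete, the sketch below locates the minimum or maximum of a small 2D grid and reports the matching latitude and longitude. It is an illustration on nested lists only; find_extremum and its kind argument are hypothetical, and the sketch returns coordinate values, which is an assumption based on how the result is unpacked in the usage examples on this page.

```python
def find_extremum(grid, lats, lons, kind="min"):
    """Return the (lat, lon) of the minimum or maximum value in a 2D grid.

    grid[i][j] is assumed to correspond to lats[i] and lons[j].
    Hypothetical sketch, not the ATMOS-BUD implementation.
    """
    pick = min if kind == "min" else max
    # Enumerate every cell together with its row and column index.
    cells = (
        (grid[i][j], i, j)
        for i in range(len(lats))
        for j in range(len(lons))
    )
    # Compare cells by value only, then look up the coordinates.
    _, i, j = pick(cells, key=lambda cell: cell[0])
    return lats[i], lons[j]


vorticity = [[1.0, -3.0], [2.0, 5.0]]
center = find_extremum(vorticity, [-30.0, -20.0], [-50.0, -40.0], kind="min")
print(center)
```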

src.utils.slice_domain(input_data, args, namelist_df)[source]

Slices the input dataset according to the specified domain. The domain can be defined based on fixed boundaries, track information, or an interactively selected area.

Parameters:
  • input_data (xr.Dataset) – The dataset containing meteorological variables.

  • args (argparse.Namespace) – Parsed command-line arguments indicating the slicing method.

  • namelist_df (pd.DataFrame) – DataFrame mapping variable names to their respective dataset labels.

Returns:

The sliced dataset within the specified domain.

Return type:

xr.Dataset

src.utils.get_domain_extreme_values(itime, args, slices_plevel, track=None)[source]

Retrieves or calculates extreme values (minimum/maximum vorticity, minimum geopotential height, and maximum wind speed) within a specified domain at the chosen pressure level.

Parameters:
  • itime (str or pd.Timestamp) – The specific time step for which to retrieve or calculate extremes.

  • args (argparse.Namespace) – Parsed command-line arguments indicating the slicing method.

  • slices_plevel (tuple) – Tuple containing slices of vorticity, geopotential height, and wind speed at the chosen pressure level.

  • track (pd.DataFrame) – Track data potentially containing extreme values.

Returns:

Tuple containing minimum/maximum vorticity, minimum geopotential height, and maximum wind speed.

Return type:

tuple

Core Functions

Logging and Configuration

initialize_logging() - Comprehensive logging system setup

  • Configurable verbosity levels (DEBUG, INFO, ERROR)

  • Dual output: console and log file

  • Timestamped entries with level identification

  • Application-specific logger separation

  • Log file naming based on input data
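
A dual-output logger of the kind listed above can be built with the standard logging module. The sketch below is one possible arrangement under stated assumptions: setup_dual_logger, the logger name, and the format string are all hypothetical and are not taken from the ATMOS-BUD source.

```python
import logging


def setup_dual_logger(log_path, verbose=False):
    """Return a logger that writes to both the console and a log file.

    Hypothetical sketch; the name and format string are assumptions.
    """
    logger = logging.getLogger("atmos_bud_sketch")
    logger.setLevel(logging.DEBUG if verbose else logging.INFO)
    # Remove handlers left over from earlier calls to avoid duplicate lines.
    logger.handlers.clear()
    # Keep messages out of the root logger (application-specific separation).
    logger.propagate = False

    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    handlers = (logging.StreamHandler(), logging.FileHandler(log_path, mode="w"))
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Using a named logger with propagation disabled keeps the application's messages separate from those of third-party libraries that log through the root logger.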

Data Preprocessing Utilities

convert_lon() - Longitude coordinate standardization

  • Converts longitude from 0°-360° to -180°-180° format

  • Automatic coordinate sorting after conversion

  • Maintains data integrity during transformation

  • Compatible with all xarray datasets

slice_domain() - Spatial domain extraction

  • Supports fixed, tracking, and interactive domain selection

  • Automatic coordinate matching and boundary handling

  • Preserves metadata and coordinate attributes

  • Optimizes memory usage through selective data loading

Track File Management

handle_track_file() - Storm track data validation and processing

  • Time series validation against input datasets

  • Spatial boundary checking for track coverage

  • Automatic reindexing for temporal alignment

  • Comprehensive error handling and logging

  • Support for multiple track file formats

Atmospheric Feature Detection

find_extremum_coordinates() - Meteorological extrema identification

  • Vorticity minimum/maximum detection for storm centers

  • Geopotential height extrema for pressure systems

  • Maximum wind speed identification

  • Configurable detection criteria (min/max selection)

  • Precise coordinate extraction with grid alignment

get_domain_extreme_values() - Domain-specific extreme value extraction

  • Integration with track files for pre-computed values

  • On-demand calculation for missing track data

  • Multi-variable extrema processing (vorticity, height, wind)

  • Pressure-level specific analysis

  • Optimized for time series processing
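
The first two points describe a fallback: use the value stored in the track when it exists, otherwise compute it from the data. A minimal sketch of that decision is shown below, assuming the track row is available as a dictionary; extreme_from_track_or_data and the column name used in the example are hypothetical.

```python
def extreme_from_track_or_data(track_row, column, values, kind):
    """Prefer a pre-computed track value; otherwise compute it from values.

    track_row: dict for one time step, or None when no track is used.
    column: hypothetical name of the track column holding the extreme.
    values: flat list of field values inside the domain.
    kind: "min" or "max".
    """
    value = track_row.get(column) if track_row else None
    if value is not None:
        return value
    # No stored value for this time step, so compute it on demand.
    return min(values) if kind == "min" else max(values)


field = [-1e-4, 3e-5]
print(extreme_from_track_or_data({"min_zeta": -2e-4}, "min_zeta", field, "min"))
print(extreme_from_track_or_data(None, "min_zeta", field, "min"))
```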

Key Features

Coordinate System Management

Longitude Convention Handling:

  • Automatic detection of longitude format (0-360° vs -180-180°)

  • Seamless conversion between conventions

  • Proper dateline crossing management

  • Consistent coordinate ordering

Spatial Domain Processing:

  • Grid-aligned domain boundaries

  • Nearest neighbor coordinate matching

  • Memory-efficient spatial subsetting

  • Coordinate metadata preservation
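
Grid-aligned boundaries with nearest neighbor matching can be illustrated as follows. The sketch snaps each requested boundary to the closest grid point and then subsets a 2D field; nearest_index and slice_box are hypothetical helpers on plain lists, and they assume both coordinate lists are in ascending order.

```python
def nearest_index(coords, value):
    """Index of the grid coordinate closest to the requested value."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - value))


def slice_box(grid, lats, lons, south, north, west, east):
    """Subset a 2D grid to a box whose edges are snapped to grid points.

    grid[i][j] is assumed to correspond to lats[i] and lons[j], with
    both coordinate lists ascending. Hypothetical sketch only.
    """
    i0, i1 = nearest_index(lats, south), nearest_index(lats, north)
    j0, j1 = nearest_index(lons, west), nearest_index(lons, east)
    # Both end indices are inclusive, hence the + 1 in each slice.
    subset = [row[j0:j1 + 1] for row in grid[i0:i1 + 1]]
    return subset, lats[i0:i1 + 1], lons[j0:j1 + 1]


lats = [-40.0, -30.0, -20.0, -10.0]
lons = [-60.0, -50.0, -40.0]
grid = [[i * 10 + j for j in range(3)] for i in range(4)]
subset, sub_lats, sub_lons = slice_box(grid, lats, lons, -31.0, -12.0, -52.0, -41.0)
print(sub_lats, sub_lons)
```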

Logging System

Multi-Level Logging:

  • DEBUG - Detailed processing information

  • INFO - General progress messages

  • ERROR - Critical error reporting

  • Console + File - Dual output streams

Configuration Options:

  • Verbose mode for detailed debugging

  • Quiet mode for production runs

  • Custom log file naming conventions

  • Timestamp formatting for analysis tracking

Track Data Integration

File Format Support:

  • CSV files with temporal indexing

  • Variable column naming flexibility

  • Missing data handling and interpolation

  • Automatic coordinate system detection

Validation Features:

  • Temporal range verification

  • Spatial coverage checking

  • Data quality assessment

  • Error reporting and logging

Storm Analysis Tools

Feature Detection:

  • Vorticity-based storm center identification

  • Pressure system tracking via geopotential height

  • Wind maximum detection for intensity analysis

  • Multi-criteria extrema finding

Track Integration:

  • Pre-computed track value utilization

  • Dynamic calculation for missing data

  • Multi-level analysis (different pressure levels)

  • Time series consistency checking

Usage Examples

Logging Setup

from src.utils import initialize_logging
import argparse

# Configure logging
parser = argparse.ArgumentParser()
parser.add_argument('--verbose', action='store_true', help='Enable verbose logging')
parser.add_argument('--infile', required=True, help='Input data file')
args = parser.parse_args()

# Initialize logging system
logger = initialize_logging('results/', args)

# Use logger throughout application
logger.info('Starting atmospheric budget analysis')
logger.debug('Loading configuration parameters')
logger.error('Critical error encountered')

Coordinate Conversion

from src.utils import convert_lon
import xarray as xr

# Load dataset with 0-360° longitude format
data = xr.open_dataset('data_0to360.nc')

# Convert to -180 to 180° format
data_converted = convert_lon(data, 'longitude')

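# float() turns the 0-d xarray results into plain numbers for formatting
print(f"Original range: {float(data.longitude.min()):.1f} to {float(data.longitude.max()):.1f}")
print(f"Converted range: {float(data_converted.longitude.min()):.1f} to {float(data_converted.longitude.max()):.1f}")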

Domain Slicing

from src.utils import slice_domain
import pandas as pd

# Load namelist configuration
namelist_df = pd.read_csv('inputs/namelist', index_col=0)

# Configure domain selection
args.fixed = True  # or args.track = True, args.choose = True

# Extract spatial domain
sliced_data = slice_domain(full_dataset, args, namelist_df)

# .sizes maps each dimension name to its length
print(f"Original shape: {dict(full_dataset.sizes)}")
print(f"Sliced shape: {dict(sliced_data.sizes)}")

Track File Processing

from src.utils import handle_track_file
import pandas as pd

# Process track file
track_data = handle_track_file(
    input_data=dataset,
    times=time_series,
    longitude_indexer='longitude',
    latitude_indexer='latitude',
    app_logger=logger
)

# Access track information
for timestamp in track_data.index:
    lat = track_data.loc[timestamp, 'Lat']
    lon = track_data.loc[timestamp, 'Lon']
    print(f"{timestamp}: Storm center at {lat:.1f}°N, {lon:.1f}°E")

Feature Detection

from src.utils import find_extremum_coordinates, get_domain_extreme_values

# Find vorticity extremum
lat_center, lon_center = find_extremum_coordinates(
    ds_data=vorticity_slice,
    lat=latitude,
    lon=longitude,
    variable='min_zeta',  # or 'max_zeta', 'max_wind'
    args=args
)

# Get domain extreme values
min_zeta, min_hgt, max_wind = get_domain_extreme_values(
    itime=timestamp,
    args=args,
    slices_plevel=(vorticity_slice, height_slice, wind_slice),
    track=track_data
)

print(f"Storm center: {lat_center:.2f}°N, {lon_center:.2f}°E")
print(f"Min vorticity: {min_zeta:.2e} s⁻¹")
print(f"Max wind: {max_wind:.1f} m/s")

Complete Workflow Integration

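# Assumes args, input_file, time_series, and data_slices are already defined
from src.utils import (
    initialize_logging, convert_lon, handle_track_file,
    slice_domain, get_domain_extreme_values,
)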
import xarray as xr
import pandas as pd

# Setup logging
logger = initialize_logging('results/', args)

# Load and preprocess data
data = xr.open_dataset(input_file)
data = convert_lon(data, 'longitude')

# Load configuration
namelist = pd.read_csv('inputs/namelist', index_col=0)

# Process track file if needed
if args.track:
    track = handle_track_file(data, time_series, 'longitude', 'latitude', logger)

# Slice domain
domain_data = slice_domain(data, args, namelist)

# Process each time step
for timestamp in time_series:
    # Find atmospheric features
    if args.track:
        extrema = get_domain_extreme_values(timestamp, args, data_slices, track)

    logger.info(f"Processed timestamp: {timestamp}")

Error Handling and Validation

The utilities module implements robust error handling:

File Operations:

  • FileNotFoundError for missing track files

  • Validation of file formats and contents

  • Graceful handling of corrupted data

Data Validation:

  • Coordinate system consistency checks

  • Time series alignment verification

  • Spatial boundary validation

  • Missing data detection and reporting

Logging Integration:

  • Comprehensive error logging with context

  • Debug information for troubleshooting

  • User-friendly error messages

  • Stack trace preservation for development