
Input Parameters

SpecFWAT uses a YAML file to define the parameters for forward simulation, adjoint simulation, post-processing, and optimization. This file is typically named fwat_params.yml and is located in the DATA directory of your SpecFWAT project. The sections of fwat_params.yml are described below.

Download a full template of fwat_params.yml here.

NOISE Section

This section defines the noise parameters for the forward simulation and measurements of adjoint sources.

YAML
NOISE:
  MESH_PAR_FILE: DATA/meshfem3D_files/Mesh_Par_file # Mesh file
  RCOMPS: ['Z'] # Components of the receiver
  CH_CODE: BX # Channel code
  NSTEP: 4500 # Number of time steps
  DT: 0.06 # Time step for the noise data
  IMEAS: 5
  SHORT_P: [6, 10, 20] # Short period of filters
  LONG_P: [15, 25, 40] # Long period of filters
  GROUPVEL_MIN: [2.3, 2.3, 2.5] # Approximate minimum group velocity
  GROUPVEL_MAX: [3.2, 3.5, 4.0] # Approximate maximum group velocity
  ADJ_SRC_NORM: False # Set the following to .true. to normalize adjoint sources across different bands
  USE_NEAR_OFFSET: False # Set the following to false if use only data > 1 average wavelength
  SUPPRESS_EGF: False # Set to .false. when the data are cross-correlation functions
  PRECOND_TYPE: 1 # 1: inner product of acceleration
  SIGMA_H: 5000
  SIGMA_V: 5000

Mesh Parameters

  • MESH_PAR_FILE: Path to the mesh parameter file, which has the same format as the Mesh_Par_file of meshfem3D. SpecFWAT uses the internal mesh generator of Specfem3D to create the mesh from the model parameters, but lets users specify the mesh parameter file here rather than always using the fixed Specfem3D file DATA/meshfem3D_files/Mesh_Par_file.

Solver Parameters

  • RCOMPS: Components of the receiver. For tomography of isotropic media, it is usually set to ['Z'] for the vertical component.

  • CH_CODE: Channel code, which is used to identify the channel in the data. It is usually set to BX for broadband data.

  • NSTEP: Number of time steps for the forward/adjoint simulation. It overrides NSTEP in DATA/Par_file, which allows users to specify a different number of time steps for each data type.

  • DT: Time step for the noise data. It overrides DT in DATA/Par_file, which allows users to specify a different time step for each data type.

Adjoint source

  • IMEAS: Type of adjoint source measurement, which determines the objective function. See the measure_adj manual for more details on adjoint source measurements. The default value is 5, which means the adjoint source is based on the cross-correlation time shift.

  • SHORT_P: Short cut-off period of filters for the adjoint source measurements. It is a list of values, which are used to filter the data in different frequency bands.

  • LONG_P: Long cut-off period of filters for the adjoint source measurements. It is a list of values with the same length as SHORT_P.

  • GROUPVEL_MIN: Approximate minimum group velocity to determine the time window for the adjoint source measurements. It is a list of values with the same length as SHORT_P.

  • GROUPVEL_MAX: Approximate maximum group velocity to determine the time window for the adjoint source measurements. It is a list of values with the same length as SHORT_P.

💡

The time window for the adjoint source measurements is determined by the approximate group velocities and the cut-off periods. The time window $[T_{start}, T_{end}]$ is calculated as follows:

$$T_{start} = \frac{\Delta}{U_{max}} - 0.5\,T_{long}$$

$$T_{end} = \frac{\Delta}{U_{min}} + 0.5\,T_{long}$$

Here $\Delta$ is the source-receiver distance, $U_{min}$ and $U_{max}$ are GROUPVEL_MIN and GROUPVEL_MAX, and $T_{long}$ is the LONG_P value of the band.
  • ADJ_SRC_NORM: Set to True to normalize the adjoint sources across different bands.

  • USE_NEAR_OFFSET: Set to False if you want to use only data with distance greater than 0.5 * wavelength in each period band. It is usually set to False in practice.

  • SUPPRESS_EGF: Whether to suppress the differencing step that converts cross-correlation functions into empirical Green’s functions (EGF). Set to False if the data are cross-correlation functions.

💡

For checkerboard tests, note that the data are already EGFs, so SUPPRESS_EGF should be set to True.
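The group-velocity window defined above for GROUPVEL_MIN and GROUPVEL_MAX can be sketched in a few lines of Python. This is only an illustration, not SpecFWAT code: the function name `noise_time_windows` is hypothetical, and it assumes distances in km, group velocities in km/s, and periods in seconds. It also checks that the four per-band lists have the same length, as the parameter descriptions require.

```python
def noise_time_windows(dist_km, noise):
    """Return one (t_start, t_end) pair per period band.

    Assumes dist_km in km, group velocities in km/s, periods in s.
    """
    short_p = noise["SHORT_P"]
    long_p = noise["LONG_P"]
    u_min = noise["GROUPVEL_MIN"]
    u_max = noise["GROUPVEL_MAX"]

    # All four lists must describe the same number of bands.
    if not (len(short_p) == len(long_p) == len(u_min) == len(u_max)):
        raise ValueError("SHORT_P, LONG_P, GROUPVEL_MIN and GROUPVEL_MAX "
                         "must have the same length")

    windows = []
    for t_long, umin, umax in zip(long_p, u_min, u_max):
        t_start = dist_km / umax - 0.5 * t_long  # fastest expected arrival
        t_end = dist_km / umin + 0.5 * t_long    # slowest expected arrival
        windows.append((t_start, t_end))
    return windows


# Values copied from the NOISE example above; the 300 km distance is made up.
noise = {
    "SHORT_P": [6, 10, 20],
    "LONG_P": [15, 25, 40],
    "GROUPVEL_MIN": [2.3, 2.3, 2.5],
    "GROUPVEL_MAX": [3.2, 3.5, 4.0],
}
for band, (t0, t1) in enumerate(noise_time_windows(300.0, noise)):
    print(f"band {band}: window = [{t0:.2f}, {t1:.2f}] s")
```

Longer periods widen the window on both sides, which is why the window depends on the band as well as on the distance.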

Post-processing parameters of noise data

  • PRECOND_TYPE: Type of preconditioner applied to the gradient. Accepted values are 0 ($P_0$) and 1 ($P_1$), both based on the inner product of accelerations. The default value is 1.

$$P_0 = \left| \sum_{i=1}^{N} \int \partial_t^2 u(x, t)\, \partial_t^2 u^{\dagger}(x, T-t)\, dt \right|$$

$$P_1 = \sum_{i=1}^{N} \left| \int \partial_t^2 u(x, t)\, \partial_t^2 u^{\dagger}(x, T-t)\, dt \right|$$

  • SIGMA_H: Horizontal smoothing length of the gradient, in meters.
  • SIGMA_V: Vertical smoothing length of the gradient, in meters.
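The only difference between the two PRECOND_TYPE options for noise data is where the absolute value is taken. The sketch below illustrates this at a single grid point, assuming the index i runs over events and that the time integral for each event has already been evaluated. The function names and the sample numbers are hypothetical.

```python
def precond_p0(per_event_integrals):
    """P0: absolute value of the sum over events."""
    return abs(sum(per_event_integrals))


def precond_p1(per_event_integrals):
    """P1: sum over events of the absolute values."""
    return sum(abs(v) for v in per_event_integrals)


# Made-up values of the acceleration inner product for three events.
per_event = [2.0, -3.0, 0.5]
print(precond_p0(per_event))  # contributions of opposite sign cancel
print(precond_p1(per_event))  # no cancellation between events
```

Because of the triangle inequality, P1 is never smaller than P0 at any point, so P1 cannot vanish where individual events still contribute.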

TELE Section

This section defines the parameters for the teleseismic FWAT.

YAML
TELE:
  MESH_PAR_FILE: DATA/meshfem3D_files/Mesh_Par_file # Mesh file
  TELE_TYPE: 2 # 1: teleseismic data, 2: receiver function, 3: Teleseismic cross-convolution
  RCOMPS: ['Z', 'R'] # Components of the receiver
  CH_CODE: BX # Channel code
  SAVE_FK: True # Save the FK wavefield
  COMPRESS_LEVEL: 0 # Compression level of the saved FK wavefield in hdf5 format
  SUPPRESS_STF: True # Whether to convolve source time function when forward simulating teleseismic data
  NSTEP: 2500 # Number of time steps
  DT: 0.025 # Time step for the teleseismic data
  SHORT_P: [1] # Short period of filters
  LONG_P: [20] # Long period of filters
  TIME_WIN: [-5, 25] # Time window for the teleseismic data
  PRECOND_TYPE: 3 # 2: abs(Z); 3: root squared z
  SIGMA_H: 5000
  SIGMA_V: 5000
  RF:
    F0: [1.5] # Gaussian width for the RF
    MAXIT: 200 # Maximum number of iterations
    MINDERR: 0.001 # Minimum residual error when the RF converges
    TSHIFT: 5.0 # Time shift before P

Adjoint Source

Parameters in the TELE section that share a name with parameters in the NOISE section have the same meaning, but typically take different values. The following parameters are specific to the TELE section:

  • TELE_TYPE: Type of objective function for teleseismic data. It can be
    • 1 for teleseismic data.
    • 2 for receiver function.
    • 3 for teleseismic cross-convolution.

  • SAVE_FK: Set to True to save the FK wavefield for teleseismic data. Before the forward simulation runs, SpecFWAT looks for the FK wavefield in LOCAL_PATH (set in DATA/Par_file). If the FK wavefield is not found, it is generated and saved in the LOCAL_PATH/FK_{event_name} directory; otherwise, it is loaded from the existing FK wavefield directory to save time. This is useful for GPU acceleration.

  • COMPRESS_LEVEL: Compression level of the saved FK wavefield in HDF5 format, with the same meaning as the gzip compression_opts option in h5py. It can be set to an integer from 0 (no compression) through 1 (fastest) to 9 (best compression). The default value is 0.

💡
  • The FK wavefield takes a large amount of disk space. Please check the available disk space before running the forward simulation.
  • With CPU parallelization, the FK simulation is very fast because the work is spread over many processors. It is therefore recommended to set SAVE_FK to False for CPU parallelization.
  • SUPPRESS_STF: Whether to suppress convolution with the source time function (STF) when forward simulating teleseismic data. If set to False, the synthetics are convolved with the STF, and STF files named STF_{event_name}.sac should be prepared in the src_rec directory.
💡

The time 0 in the STF should correspond to the onset time of the earthquake. An example is shown below:

[Figure: Source Time Function. Source: SCARDEC Source Time Functions Database]

  • TIME_WIN: Time window for the teleseismic data. It is a list of two values giving the start and end times relative to the direct P arrival; a negative value is before the arrival and a positive value is after it.

  • PRECOND_TYPE: Type of preconditioner applied to the gradient. Accepted values are 2 ($P_2$) and 3 ($P_3$), both based on Z-preconditioning. The default value is 3.

$$P_2 = |x_z|$$

$$P_3 = |x_z|^{1/2}$$

Receiver function parameters

For receiver function adjoint tomography, the synthetic receiver function is calculated with the iterative deconvolution method. The following parameters control the receiver function inversion:

  • RF.F0: Gaussian width for the receiver function. It is a list of values, which are used to filter the data in different frequency bands.

  • RF.MAXIT: Maximum number of iterations for the receiver function inversion.

  • RF.MINDERR: Minimum residual error when the receiver function converges.

  • RF.TSHIFT: Time shift before P arrival for the receiver function inversion. It is used to align the receiver function with the direct P arrival.
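Receiver function codes based on iterative time-domain deconvolution commonly low-pass the result with a Gaussian filter $G(\omega) = \exp(-\omega^2 / 4a^2)$. The sketch below evaluates that gain under the assumption that RF.F0 plays the role of the width parameter $a$; this page does not confirm that mapping, and the function name `gaussian_gain` is hypothetical.

```python
import math


def gaussian_gain(freq_hz, f0):
    """Gain of the Gaussian low-pass exp(-omega^2 / (4 a^2)) at freq_hz.

    Assumes f0 is the Gaussian width parameter a (an assumption, see text).
    """
    omega = 2.0 * math.pi * freq_hz
    return math.exp(-omega ** 2 / (4.0 * f0 ** 2))


# Gain at a few frequencies for the example value F0 = 1.5.
for f in (0.0, 0.25, 0.5, 1.0):
    print(f"{f:4.2f} Hz -> gain {gaussian_gain(f, 1.5):.4f}")
```

A larger width keeps more high-frequency energy, so listing several F0 values corresponds to measuring receiver functions in several frequency bands.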

ADJOINT_SOURCE Section

SpecFWAT employs ForAdjoint to measure adjoint sources of ambient noise and local earthquakes, which provides built-in measurement methods including:

  • cross-correlation time shift and amplitude ratio
  • waveform difference
  • multi-taper phase shift and amplitude ratio
  • Cross-convolution waveform difference
  • Receiver function difference
  • Exponentiated phase misfit
YAML
ADJOINT_SOURCE:
  ITAPER_TYPE: 1 # 1: Hanning, 2: Hamming, 3: Cosine, 4: Cosine P10
  TAPER_PERCENTAGE: 0.3
  CC:
    TSHIFT_LIM: [-5.0, 5.0]
    DLNA_LIM: [-1.5, 1.5]
    CC_MIN: 0.7
    DT_SIGMA_MIN: 1.0
    DLNA_SIGMA_MIN: 0.5
  MT:
    NUM_TAPER: 5
    MT_NW: 4.0
    PHASE_STEP: 1.5
    TRANSFUNC_WATERLEVEL: 1e-10
    WATER_THRESHOLD: 0.02
    DT_FAC: 2.0
    ERR_FAC: 2.5
    DT_MAX_SCALE: 3.5
    MIN_CYCLE_IN_WINDOW: 3
    USE_MT_ERROR: False
  ENV:
    WTR_ENV: 0.2

Adjoint Source Tapering

  • ITAPER_TYPE: Type of tapering window for the adjoint source measurements. It can be
    • 1 for Hanning window.
    • 2 for Hamming window.
    • 3 for Cosine window.
    • 4 for Cosine P10 window.
  • TAPER_PERCENTAGE: Percentage of the tapering window. It is used to determine the length of the tapering window based on the total length of the time window.

Cross-correlation (CC) Measurement Parameters

  • CC.TSHIFT_LIM: Time shift limits in seconds for cross-correlation measurement.
  • CC.DLNA_LIM: Logarithmic amplitude ratio limits for cross-correlation measurement.
  • CC.CC_MIN: Minimum cross-correlation coefficient to accept a measurement.
  • CC.DT_SIGMA_MIN: Minimum standard deviation of time shift measurement in seconds.
  • CC.DLNA_SIGMA_MIN: Minimum standard deviation of logarithmic amplitude ratio.
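One plausible reading of the CC parameters is sketched below: a measurement is accepted only if its time shift, amplitude ratio, and correlation coefficient pass the limits, and the uncertainty estimates are floored by the two minimum sigmas. How ForAdjoint applies these thresholds internally is not described here, so the function names and the exact logic are assumptions.

```python
def accept_cc_measurement(tshift, dlna, cc_max, cc):
    """Return True if a cross-correlation measurement passes all limits."""
    t_lo, t_hi = cc["TSHIFT_LIM"]
    a_lo, a_hi = cc["DLNA_LIM"]
    return (t_lo <= tshift <= t_hi
            and a_lo <= dlna <= a_hi
            and cc_max >= cc["CC_MIN"])


def floor_sigmas(sigma_dt, sigma_dlna, cc):
    """Keep uncertainties from dropping below the configured minimums."""
    return (max(sigma_dt, cc["DT_SIGMA_MIN"]),
            max(sigma_dlna, cc["DLNA_SIGMA_MIN"]))


# Values copied from the ADJOINT_SOURCE example above.
cc_params = {
    "TSHIFT_LIM": [-5.0, 5.0],
    "DLNA_LIM": [-1.5, 1.5],
    "CC_MIN": 0.7,
    "DT_SIGMA_MIN": 1.0,
    "DLNA_SIGMA_MIN": 0.5,
}
print(accept_cc_measurement(1.2, 0.3, 0.85, cc_params))   # inside all limits
print(accept_cc_measurement(6.0, 0.3, 0.85, cc_params))   # time shift too large
print(floor_sigmas(0.2, 0.8, cc_params))
```

Flooring the sigmas prevents a very small error estimate from giving a single measurement an outsized weight in the misfit.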

Multi-taper (MT) Measurement Parameters

  • MT.NUM_TAPER: Number of tapers to use in multi-taper measurement.
  • MT.MT_NW: Bin width of the multitapers (nw*df is the half bandwidth of the multitapers in the frequency domain; typical values are 2.5, 3.0, 3.5, and 4.0).
  • MT.PHASE_STEP: Maximum step for cycle-skip correction.
  • MT.TRANSFUNC_WATERLEVEL: Waterlevel for the transfer function in multi-taper measurement.
  • MT.WATER_THRESHOLD: The triggering value to stop the search. If the spectrum is larger than 10*water_threshold, the search is triggered again, much like a heating thermostat.
  • MT.DT_FAC: Percentage of the wave period at which the measurement range is too large and MTM reverts to the CCTM misfit.
  • MT.ERR_FAC: Percentage of error at which the error is considered too large.
  • MT.DT_MAX_SCALE: Used to calculate the maximum allowable time shift.
  • MT.MIN_CYCLE_IN_WINDOW: Minimum number of cycles in the time window for multi-taper measurement.
  • MT.USE_MT_ERROR: Whether to use multi-taper error for normalization.

Envelope (ENV) Measurement Parameters

  • ENV.WTR_ENV: Waterlevel for envelope measurement.

MODEL_GRID Section

SpecFWAT updates model parameters on a regular grid, which makes it easy to sum the gradients of different data sets computed on different meshes. The grid spacing should be small enough to preserve the accuracy of the gradient information. We recommend a grid spacing of about half the element size of the mesh or smaller.

YAML
MODEL_GRID:
  REGULAR_GRID_MIN_COORD: [833950, -44274.0, -80000]
  REGULAR_GRID_INTERVAL: [2000, 2000, 1000]
  REGULAR_GRID_SIZE: [104, 48, 81]
  • REGULAR_GRID_MIN_COORD: Minimum coordinates of the regular grid in meters. It is a list of three values, which are the minimum x, y, and z coordinates of the regular grid.

  • REGULAR_GRID_INTERVAL: Interval of the regular grid in meters. It is a list of three values, which are the intervals in the x, y, and z directions.

  • REGULAR_GRID_SIZE: Size of the regular grid. It is a list of three values, which are the numbers of grid points in the x, y, and z directions.

💡

The regular grid must cover the whole mesh region: the minimum coordinate of the regular grid should be less than the minimum coordinate of the mesh, and the maximum coordinate of the regular grid should be greater than the maximum coordinate of the mesh.
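The coverage requirement is easy to check before a run. The sketch below computes the maximum coordinate of the regular grid as min + interval * (size - 1), since REGULAR_GRID_SIZE counts grid points, and compares the grid extent with the mesh extent. The function names and the mesh bounds are hypothetical, and a grid bound that exactly touches the mesh bound is treated as covering it, which is an assumption.

```python
def grid_max_coord(min_coord, interval, size):
    """Maximum x, y, z of a regular grid with `size` points per axis."""
    return [m + d * (n - 1) for m, d, n in zip(min_coord, interval, size)]


def grid_covers_mesh(min_coord, interval, size, mesh_min, mesh_max):
    """True if the regular grid encloses the mesh along every axis."""
    max_coord = grid_max_coord(min_coord, interval, size)
    lower_ok = all(g <= m for g, m in zip(min_coord, mesh_min))
    upper_ok = all(g >= m for g, m in zip(max_coord, mesh_max))
    return lower_ok and upper_ok


# Grid values copied from the MODEL_GRID example above.
grid_min = [833950, -44274.0, -80000]
grid_interval = [2000, 2000, 1000]
grid_size = [104, 48, 81]

# Made-up mesh extent, for illustration only.
mesh_min = [840000.0, -40000.0, -75000.0]
mesh_max = [1030000.0, 45000.0, 0.0]

print(grid_max_coord(grid_min, grid_interval, grid_size))
print(grid_covers_mesh(grid_min, grid_interval, grid_size, mesh_min, mesh_max))
```

With the example values, 81 points at a 1000 m interval starting from -80000 m place the top of the grid at z = 0.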

POSTPROC Section

This section defines the common parameters for the post-processing of the gradient after the adjoint simulation.

YAML
POSTPROC:
  INV_TYPE: [False, True] # Inversion type of noise and teleseismic data
  JOINT_WEIGHT: [0.5, 0.5]
  TAPER_H_SUPPRESS: 5000
  TAPER_H_BUFFER: 10000
  TAPER_V_SUPPRESS: 0
  TAPER_V_BUFFER: 0
  IS_PRECOND: True
  • INV_TYPE: Inversion type of ambient noise and teleseismic data. It is a list of two boolean values, which are used to determine whether to perform inversion for noise and teleseismic data, respectively. The first value is for noise data, and the second value is for teleseismic data.

  • JOINT_WEIGHT: Weight between ambient noise and teleseismic data. It is a list of two values, which are used to weight the gradients of ambient noise and teleseismic data, respectively.

  • TAPER_H_SUPPRESS: Horizontal tapering length in meters to suppress the gradient at the margin of the mesh.

  • TAPER_H_BUFFER: Horizontal tapering length in meters to buffer the gradient at the margin of the mesh.

  • TAPER_V_SUPPRESS: Vertical tapering length in meters to suppress the gradient at the margin of the mesh.

  • TAPER_V_BUFFER: Vertical tapering length in meters to buffer the gradient at the margin of the mesh.

💡

Tapering the gradient is necessary for teleseismic FWAT to avoid updating the margin of the model. It keeps the SEM domain of the updated model coupled with the FK domain across iterations.

  • IS_PRECOND: Whether to apply the preconditioner directly to the gradient. If set to False, the preconditioner is saved and applied in L-BFGS as a rescaling vector (Modrak and Tromp 2016).
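The way INV_TYPE and JOINT_WEIGHT interact can be pictured as a weighted sum of the two gradients on the shared regular grid. The sketch below is only an illustration: it assumes both gradients are already on the same grid and comparably scaled, and that the weights matter only when both data types are enabled. The function name `joint_gradient` is hypothetical.

```python
def joint_gradient(grad_noise, grad_tele, inv_type, joint_weight):
    """Combine noise and teleseismic gradients defined on the same grid."""
    use_noise, use_tele = inv_type
    w_noise, w_tele = joint_weight

    if use_noise and use_tele:
        # Joint inversion: weighted sum, point by point.
        return [w_noise * gn + w_tele * gt
                for gn, gt in zip(grad_noise, grad_tele)]
    if use_noise:
        return list(grad_noise)
    if use_tele:
        return list(grad_tele)
    raise ValueError("INV_TYPE must enable at least one data type")


# Made-up gradient values at two grid points.
g_noise = [1.0, 2.0]
g_tele = [3.0, -2.0]
print(joint_gradient(g_noise, g_tele, [True, True], [0.5, 0.5]))
print(joint_gradient(g_noise, g_tele, [False, True], [0.5, 0.5]))
```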

MODEL_UPDATE Section

This section defines the parameters for the model update based on the gradient and optimization method.

YAML
MODEL_UPDATE:
  INIT_MODEL_PATH: initial_model.h5
  MODEL_TYPE: 1 # 1: vp,vs,rho; 2: L,Gc,Gs
  OPT_METHOD: 2 # Optimization method, 1: SD; 2: LBFGS
  ITER_START: 0
  LBFGS_M_STORE: 5
  MAX_SLEN: 0.02 # Maximum step length
  MAX_SHRINK: 0.618 # Maximum shrink factor
  MAX_SUB_ITER: 6
  DO_LS: True # Do line search
  C1: 0.1
  VPVS_RATIO_RANGE: [1.3, 2.5] # Min and max limitation of Vp/Vs ratio
  • INIT_MODEL_PATH: Path to the initial model file. It should be an HDF5 file with the same format as DATA/tomo_files/tomography_model.h5, but its grid size may differ from that of the MODEL_GRID section. The model in INIT_MODEL_PATH is interpolated onto the MODEL_GRID grid and saved to optimize/model_M00.h5.

  • MODEL_TYPE: Type of model parameters to be updated. It can be

    • 1 for updating $V_p$, $V_s$, and $\rho$.
    • 2 for updating azimuthal anisotropic parameters ($V_p$, $V_s$, $\rho$, $G_c'$, and $G_s'$).
  • OPT_METHOD: Optimization method for model update. It can be

    • 1 for steepest descent (SD).
    • 2 for L-BFGS.
    • 3 for conjugate gradient (CG).
  • ITER_START: Starting iteration number of L-BFGS optimization.

  • LBFGS_M_STORE: Number of previous iterations to store in L-BFGS optimization.

  • MAX_SLEN: Maximum step length for model update.

  • MAX_SHRINK: Maximum shrink factor for model update. It controls how much the step length is reduced when the model update does not decrease the objective function.

  • MAX_SUB_ITER: Maximum number of sub-iterations for model update.

  • DO_LS: Whether to perform line search for model update. If set to True, the line search will be performed to find the optimal step length for model update, otherwise, the step length will be fixed to MAX_SLEN.

  • C1: Constant for the Armijo condition in line search. It is used to determine whether the step length is sufficient to decrease the objective function.
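The step-length parameters above fit the pattern of a backtracking line search with an Armijo test. The sketch below shows that general pattern on a toy objective, assuming MAX_SLEN is the first trial step, MAX_SHRINK multiplies the step after each failed trial, and MAX_SUB_ITER bounds the number of trials. It is not SpecFWAT's implementation, and the function name is hypothetical.

```python
def backtracking_line_search(f, x, grad, direction,
                             max_slen=0.02, max_shrink=0.618,
                             max_sub_iter=6, c1=0.1):
    """Return (step, new_x), or (None, x) if no trial step is accepted."""
    f0 = f(x)
    # Directional derivative; negative for a descent direction.
    slope = sum(g * d for g, d in zip(grad, direction))
    step = max_slen
    for _ in range(max_sub_iter):
        trial = [xi + step * di for xi, di in zip(x, direction)]
        # Armijo condition: require a sufficient decrease of the objective.
        if f(trial) <= f0 + c1 * step * slope:
            return step, trial
        step *= max_shrink  # shrink the step and try again
    return None, x


def misfit(x):
    """Toy objective: sum of squares."""
    return sum(v * v for v in x)


x0 = [1.0, -2.0]
g0 = [2.0 * v for v in x0]   # gradient of the toy objective at x0
d0 = [-g for g in g0]        # steepest-descent direction

print(backtracking_line_search(misfit, x0, g0, d0))                # first step accepted
print(backtracking_line_search(misfit, x0, g0, d0, max_slen=2.0))  # needs two shrinks
```

With DO_LS set to False, the loop would be skipped and the update would simply use the fixed step MAX_SLEN.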

OUTPUT Section

This section controls the optional output files of the SpecFWAT workflow.

YAML
OUTPUT:
  IS_OUTPUT_PREPROC: True # Output preprocessed data
  IS_OUTPUT_ADJ_SRC: False # Output adjoint sources
  IS_OUTPUT_EVENT_KERNEL: True # Output kernels
  IS_OUTPUT_SUM_KERNEL: True # Output sum of kernels
  IS_OUTPUT_HESS_INV: False # Output inverse Hessian when precond_type = 1
  IS_OUTPUT_DIRECTION: False
  • IS_OUTPUT_PREPROC: Whether to output the preprocessed data in SAC format. If set to True, the preprocessed data will be saved in the solver/{model}.{simu_type}/{event_name}/OUTPUT_FILES/ directory.

  • IS_OUTPUT_ADJ_SRC: Whether to output the adjoint sources in SAC format. If set to True, the adjoint sources will be saved in the solver/{model}.{simu_type}/{event_name}/SEM/ directory.

  • IS_OUTPUT_EVENT_KERNEL: Whether to output the kernels for each event. If set to True, the kernels will be kept in the solver/{model}.{simu_type}/{event_name}/EKERNEL/ directory, otherwise, the kernels will be deleted after the post-processing step.

  • IS_OUTPUT_SUM_KERNEL: Whether to output the sum of kernels after post-processing. If set to True, the sum of kernels will be saved in the optimize/SUM_KERNEL_{model}/ directory.

  • IS_OUTPUT_HESS_INV: Whether to output the inverse Hessian. If set to True, the inverse Hessian will be saved in the optimize/SUM_KERNEL_{model}/ directory.

  • IS_OUTPUT_DIRECTION: Whether to output the final descent direction of each iteration. If set to True, the direction will be saved in the optimize/ directory.
