Configuration Reference#

MEGPrep is configured through a Nextflow configuration file, usually nextflow.config. The top-level params block controls input discovery, pipeline stage selection, output locations, and the YAML snippets passed to the Python processing scripts.

Many nested YAML fields are passed directly to MNE-Python, MNE-BIDS, OSL-Ephys, PyPREP, or FreeSurfer/DeepPrep functions. When a field is a direct pass-through, MEGPrep preserves the name and meaning used by the upstream API. Useful upstream references include the MNE-Python documentation for Raw objects, Epochs, ICA, compute_raw_covariance, compute_covariance, and find_bad_channels_lof.

Command Line Mapping#

The Docker entrypoint copies the mounted config to /program/nextflow/run_nextflow.config and then applies selected command-line overrides before launching Nextflow.

Docker option

Config or Nextflow target

Notes

-c, --config

Input config file

Mounted project config. The effective copy is saved to <output_dir>/nextflow.config after the run.

-i, --input

params.dataset_dir

Root directory used by MEG and MRI import steps.

-o, --output

params.output_dir

Output root. params.preproc_dir defaults to ${params.output_dir}/preprocessed.

-s, --steps

Nextflow --steps / params.steps

Overrides the value in the config for this run.

--fs_subjects_dir

params.fs_subjects_dir

FreeSurfer SUBJECTS_DIR used for coregistration, BEM, forward model, and source reconstruction.

--fs_license_file

params.fs_license

Used by DeepPrep/FreeSurfer-related execution in the container.

--t1_dir

params.t1_dir

T1 input root when structural processing is enabled.

--t1_input_type

params.t1_input_type

nifti or dicom for non-BIDS anatomy input.

--t1_dicom_series_glob

params.t1_dicom_series_glob

Optional relative glob for selecting DICOM series directories under each T1 DICOM root, for example *T1* or *mprage*.

--resume

Nextflow -resume

Reuses completed Nextflow work directory tasks where possible.

--static_task_log_mode

params.static_task_log_mode

Controls how much Nextflow .command* log content is copied into the static HTML report. Values are failed, all-command-log, and none.

Docker Output Ownership#

The Docker image is designed to be run without Docker’s --user flag. The entrypoint starts as root only long enough to prepare mounted output permissions. It then drops privileges with gosu and runs Nextflow as the host UID/GID inferred from the mounted /input directory. Report-only runs that only mount /output infer ownership from /output instead.

This means users do not need to pre-create the host output directory. If Docker creates the bind-mounted output path as root:root before the container starts, MEGPrep changes ownership to the inferred host user before launching the pipeline.

If neither /input nor /output is owned by the user who should own the outputs, pass the desired IDs explicitly:

docker run --rm -it \
  -e LOCAL_UID="$(id -u)" -e LOCAL_GID="$(id -g)" \
  -v /data/bids:/input \
  -v /data/out:/output \
  cmrlab/megprep:<version> \
  -i /input -o /output

Pipeline Stage Selection#

params.steps is the primary switch for choosing how much of the workflow to run. It can be set in the config or overridden with --steps.

Value

Behavior

meg_all

Default full MEG workflow using an existing fs_subjects_dir: MEG import, continuous preprocessing, artifact detection, ICA, epochs, covariance, coregistration, forward solution, source reconstruction, and static report.

all

Structural MRI workflow plus full MEG workflow in one run.

anatomy

Structural MRI workflow only.

meg_artifacts

MEG import, continuous preprocessing, artifact detection, then static report.

meg_ica

Through ICA fitting, labeling, ICA application, then static report.

meg_epochs

Through epoch generation, then static report.

report

Rebuild the static HTML report from existing outputs only.

Aliases are supported: meg maps to meg_all, artifacts maps to meg_artifacts, ica maps to meg_ica, and epochs maps to meg_epochs.

Optional modifiers are comma-separated. skip_ica is valid only with meg_epochs and builds epochs from the OSL preprocessed raw files. It is not valid for meg_all or all because the downstream source reconstruction path expects ICA-clean raw data. with_anatomy is valid with meg_artifacts, meg_ica, or meg_epochs and runs structural MRI processing before the selected MEG milestone.

Global Paths and Execution Settings#

Data Import#

MEG input discovery is performed by meg_import_dataset.py. BIDS datasets are discovered through MNE-BIDS entities. Raw datasets are discovered by walking the input directory and selecting files by suffix.

Field

Type

Allowed values

Meaning

dataset_format

string

auto, bids, raw

auto treats a directory containing dataset_description.json as BIDS; otherwise it uses raw-file discovery.

file_suffix

string

Any basename ending, usually .fif, .ds, or c,rfDC

Used only for raw dataset discovery. Matches files and directories, so CTF .ds folders can be imported. Split FIF continuation files such as -1.fif are excluded.

meg_import_config.subject_id

null, string, or list

BIDS subject labels without sub-

Optional subject filter for BIDS MEG input.

meg_import_config.session_id

null, string, or list

BIDS session labels

Optional session filter.

meg_import_config.task

null, string, or list

BIDS task labels

Optional task filter.

meg_import_config.run_id

null, string, or list

BIDS run labels

Optional run filter.

meg_import_config.raw_exclude_keywords

null, string, or list

Case-insensitive substrings

Raw dataset only. Excludes files or directories whose basename contains any listed keyword, for example phantom or emptyroom.

meg_import_config.raw_include_keywords

null, string, or list

Case-insensitive substrings

Raw dataset only. When set, only files or directories whose basename contains at least one listed keyword are imported.

Anatomy Input and Reconstruction#

Structural processing is used by steps=anatomy, steps=all, or selected MEG milestones with with_anatomy. If structural processing is not selected, MEGPrep assumes fs_subjects_dir already contains subject reconstructions.

Field

Type

Allowed values

Meaning

is_bids

boolean

true or false

Selects BIDS anatomy import or non-BIDS T1 handling.

anatomy_preprocess_method

string

freesurfer or deepprep

Backend for anatomical reconstruction when anatomy is run.

anatomy_select_tag

string

Empty string or suffix such as _run-02_T1w

Appended to the MEG-derived subject id when matching anatomy.

mri_import_config.*

YAML filters

subject_id, session_id, task, run_id

BIDS filters used to select T1w images.

t1_dir

path

Any readable path

T1 input root for FreeSurfer and non-BIDS anatomy.

t1_input_type

string

nifti or dicom

Non-BIDS T1 input format.

t1_dicom_series_glob

string

unset

Optional relative glob for selecting DICOM series directories before conversion. When unset, the whole DICOM root is passed to dcm2niix.

fs_subjects_dir

path

FreeSurfer subjects directory

Used for recon outputs, BEM, coregistration, forward model, and source reconstruction.

deepprep_device

string

cpu or device supported by DeepPrep

Device passed to DeepPrep.

t1_bids_dir

path

BIDS directory

T1 BIDS root passed to DeepPrep.

fs_license

path

FreeSurfer license file

Required for FreeSurfer/DeepPrep execution.

bem_config.ico

integer

MNE ico grade

Resolution for BEM surface generation.

bem_config.conductivity

list of floats

For example [0.3]

Conductivity model passed to MNE BEM creation.

Continuous Preprocessing#

preproc_config is an OSL-Ephys preprocessing chain. Steps are executed in the order listed. A common chain is Maxwell/tSSS if needed, filtering, line-noise notch filtering, and resampling.

preproc:
  - maxwell_filter:
      calibration: /path/to/sss_cal.dat
      cross_talk: /path/to/ct_sparse.fif
      st_duration: 10.0
      st_correlation: 0.98
  - filter: {l_freq: 0.5, h_freq: 125, method: iir, iir_params: {order: 5, ftype: butter}}
  - notch_filter: {freqs: 50 100}
  - resample: {sfreq: 250}

The filter and notch_filter fields map to MNE raw filtering methods. resample maps to MNE raw resampling and is the current configurable downsampling mechanism. Use a target sfreq that preserves the frequencies needed for later analyses.

Artifact Detection#

Artifacts are detected after continuous preprocessing and before ICA. The results are saved as sidecar files under preprocessed/artifact_report/<recording>/ and are also loaded by later ICA and epoch steps.

Config path

Method

Description

find_bad_channels.pyprep.deviation

PyPREP deviation

Flags channels whose amplitude distribution deviates from the channel population. deviation_threshold controls sensitivity.

find_bad_channels.pyprep.snr

PyPREP SNR

Flags low signal-to-noise channels.

find_bad_channels.pyprep.nan_flat

PyPREP NaN/flat

Flags channels containing NaNs or flat signals.

find_bad_channels.pyprep.hfnoise

PyPREP high-frequency noise

Flags channels with excessive high-frequency noise.

find_bad_channels.pyprep.ransac

PyPREP RANSAC

Reconstructs channels from neighboring channels and flags channels with poor reconstruction correlation. This can be slow.

find_bad_channels.pyprep.correlation

PyPREP correlation

Flags channels with low correlation to other channels across windows.

find_bad_channels.psd

PSD outlier

Computes per-channel mean PSD and flags channels above mean + std_multiplier * std.

find_bad_channels.osl

OSL detect_badchannels

Runs OSL bad-channel detection for magnetometers and, when available, gradiometers. Common fields include ref_meg and significance_level.

find_bad_channels.mne.find_bad_channels_lof

MNE local outlier factor

Passes fields such as n_neighbors, picks, metric, and threshold to MNE’s LOF detector.

find_bad_segments.osl

OSL detect_badsegments

Marks outlier or zero-valued time windows. segment_len controls the detection window length in samples.

find_bad_segments.mne.annotate_muscle_zscore

MNE muscle z-score

Adds muscle-related annotations using MNE’s z-score based detector.

find_bad_segments.mne.annotate_amplitude

MNE amplitude annotation

Adds annotations for amplitude-based excursions.

find_bad_segments.mne.annotate_break

MNE break annotation

Marks long breaks between events.

interpolate_bads under artifact_config controls whether bad channels are interpolated immediately in the preprocessed raw file. If false, bad channels are retained in raw.info['bads'] for later exclusion or handling. artifact_images_enabled controls waveform and overview image generation for manual review, and meg_vendor selects vendor-specific plotting assumptions.

For ICA rule-based labeling, ic_label_config.ICA_classify.meg_vendor can be set to auto. This is the recommended cohort setting because each dataset may come from a different MEG system. When auto is used, MEGPrep infers the template family from the ICA channel names and applies bundled templates only when they are available. Current ECG/EOG template similarity bundles cover elekta/neuromag, ctf, 4d/bti, and kit. OPM datasets or unknown channel layouts skip template similarity gracefully and continue with the other ICA labeling methods.

If the dataset vendor is known, meg_vendor_by_dataset can override auto on a per-dataset basis. The key is matched against the ICA file path, so cohort dataset directory names are usually sufficient:

ICA_classify:
  meg_vendor: auto
  meg_vendor_by_dataset:
    OPM-Artifacts: opm
    Cam-CAN: neuromag
    My-CTF-Dataset: ctf
    default: auto

Dataset-specific mappings take precedence over meg_vendor. If no mapping matches, MEGPrep falls back to meg_vendor; if that is auto, it uses channel-name inference. For backward-compatible compact configs, ICA_classify.meg_vendor may also be a mapping with the same keys, although meg_vendor_by_dataset is preferred for clarity.

Bad Segment Marking and Exclusion#

MEGPrep separates marking bad time spans from excluding data:

  • Artifact detection writes MNE annotations to *_bad_segments.txt. This is a marking step.

  • ICA fitting loads the annotations and uses reject_by_annotation=True so marked spans are not used for fitting ICA.

  • ICA application saves a cleaned continuous raw file with the bad-channel and bad-segment metadata attached.

  • Epoch exclusion is controlled later by epoch_config. In particular, epochs.reject_by_annotation drops epochs overlapping bad annotations, and epochs.reject applies peak-to-peak rejection thresholds.

Epoching#

epoch_config controls optional segmentation after the continuous preprocessing and ICA stages.

Field

Allowed values

Meaning

task_type

task or resting

resting creates fixed-length events; task uses event triggers or BIDS event files.

resting.fixed_length_duration

float seconds

Duration used by MNE fixed-length event generation.

event_source

find_events or event_file

Selects MNE trigger discovery or BIDS events.tsv parsing.

find_events

MNE find_events kwargs

Fields such as stim_channel, shortest_event, and min_duration.

event_file

YAML mapping

Filters and optionally maps BIDS event labels to integer ids.

exclude_event_id

integer or list of integers

Drops these event ids before epoching. Set epochs.event_id: null to keep all other detected or BIDS events; if epochs.event_id is also set, both the include filter and this exclude filter are applied.

epochs

MNE Epochs kwargs

Fields such as event_id, tmin, tmax, reject_by_annotation, picks, baseline, reject, preload, and detrend.

autoreject

boolean

If true, MEGPrep estimates global rejection thresholds with autoreject.get_rejection_threshold and calls drop_bad.

interpolate_bads

boolean

Interpolates bad channels in the epoch object.

drop_bad_channels

boolean

Drops channels listed in epochs.info['bads'].

Covariance and Empty-Room Style Noise Records#

Noise covariance is controlled by covar_type and covar_config.

covar_type = "epochs" computes covariance from baseline epochs generated from each cleaned recording. covar_config.events and covar_config.epochs define the events and baseline window, and covar_config.covariance is passed to MNE compute_covariance.

covar_type = "raw" computes covariance from a continuous raw recording with MNE compute_raw_covariance. The workflow uses raw_covariance_task_id to find the paired noise or baseline recording: for each non-noise task file, it replaces the task-... entity in the filename with task-${params.raw_covariance_task_id} and uses that file if it exists. This is the current mechanism for empty-room or empty-room-like recordings. For example, set raw_covariance_task_id = "emptyroom" when the dataset contains files named with task-emptyroom. These records are not source-localized as experimental recordings; they are used as covariance input for the paired experimental recording.

Coregistration, Forward Model, and Source Reconstruction#

core_config controls automated MEG-MRI coregistration. It contains pre-cleaning parameters for head-shape points and two ICP stages: icp and finetune_icp. Weights such as nasion_weight, hsp_weight, and hpi_weight control how strongly fiducials, head-shape points, and HPI points influence the fit.

fwd_config controls the forward model. surface selects the cortical surface, and spacing controls source-space spacing such as ico4.

src_type selects whether source reconstruction consumes epochs or raw. src_config.source_methods currently supports methods implemented by source_localization.py, including dSPM and LCMV. The nested dSPM and LCMV blocks are passed to MNE inverse or beamformer functions.

Static Report Thresholds#

The static report uses simple, configurable thresholds for alarms. These are not normative or calibrated quality scores.

Field

Default

Meaning

bad_channel_threshold

30

Warn when detected bad channels exceed this count.

bad_segment_threshold

50

Warn when detected bad segments exceed this count.

coreg_mean_threshold

5.0

Danger alarm when mean coregistration distance exceeds this value in mm.

coreg_max_threshold

10.0

Danger alarm when max coregistration distance exceeds this value in mm.

epoch_reject_rate_threshold

0.30

Warn when rejected epoch fraction exceeds this value.

Static Report Task Logs#

static_task_log_mode controls command-log bundling for the Task Details and Task Failure Details sections in the static report.

Value

Meaning

all-command-log

Default. Copy .command.err, .command.log, and .command.out excerpts for failed or ignored tasks, and also copy .command.log for successful tasks.

failed

Copy command logs only for failed or ignored tasks when a smaller report directory is preferred.

none

Copy no .command* logs. Trace-derived task details remain visible.