Configuration Reference

MEGPrep is configured through a Nextflow configuration file, usually nextflow.config. The top-level params block controls input discovery, pipeline stage selection, output locations, and the YAML snippets passed to the Python processing scripts.

Many nested YAML fields are passed directly to MNE-Python, MNE-BIDS, OSL-Ephys, PyPREP, or FreeSurfer/DeepPrep functions. When a field is a direct pass-through, MEGPrep preserves the name and meaning used by the upstream API. Useful upstream references include the MNE-Python documentation for Raw objects, Epochs, ICA, compute_raw_covariance, compute_covariance, and find_bad_channels_lof.

Command Line Mapping#

The Docker entrypoint copies the mounted config to /program/nextflow/run_nextflow.config and then applies selected command-line overrides before launching Nextflow.

Docker option	Config or Nextflow target	Notes
`-c`, `--config`	Input config file	Mounted project config. The effective copy is saved to `<output_dir>/nextflow.config` after the run.
`-i`, `--input`	`params.dataset_dir`	Root directory used by MEG and MRI import steps.
`-o`, `--output`	`params.output_dir`	Output root. `params.preproc_dir` defaults to `${params.output_dir}/preprocessed`.
`-s`, `--steps`	Nextflow `--steps` / `params.steps`	Overrides the value in the config for this run.
`--fs_subjects_dir`	`params.fs_subjects_dir`	FreeSurfer `SUBJECTS_DIR` used for coregistration, BEM, forward model, and source reconstruction.
`--fs_license_file`	`params.fs_license`	Used by DeepPrep/FreeSurfer-related execution in the container.
`--t1_dir`	`params.t1_dir`	T1 input root when structural processing is enabled.
`--t1_input_type`	`params.t1_input_type`	`nifti` or `dicom` for non-BIDS anatomy input.
`--t1_dicom_series_glob`	`params.t1_dicom_series_glob`	Optional relative glob for selecting DICOM series directories under each T1 DICOM root, for example `T1` or `mprage`.
`--resume`	Nextflow `-resume`	Reuses completed Nextflow work directory tasks where possible.
`--static_task_log_mode`	`params.static_task_log_mode`	Controls how much Nextflow `.command*` log content is copied into the static HTML report. Values are `failed`, `all-command-log`, and `none`.

Docker Output Ownership#

The Docker image is designed to be run without Docker’s --user flag. The entrypoint starts as root only long enough to prepare mounted output permissions, then drops privileges with gosu and runs Nextflow as the host UID/GID inferred from the mounted /input directory. Report-only runs that mount only /output infer ownership from /output instead.

This means users do not need to pre-create the host output directory. If Docker creates the bind-mounted output path as root:root before the container starts, MEGPrep changes ownership to the inferred host user before launching the pipeline.

If neither /input nor /output is owned by the user who should own the outputs, pass the desired IDs explicitly:

docker run --rm -it \
  -e LOCAL_UID="$(id -u)" -e LOCAL_GID="$(id -g)" \
  -v /data/bids:/input \
  -v /data/out:/output \
  cmrlab/megprep:<version> \
  -i /input -o /output
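The ownership decision described above can be sketched in Python. This is only an illustration of the lookup order, not the entrypoint itself. The function name `infer_owner` is hypothetical, and it is an assumption that explicit `LOCAL_UID`/`LOCAL_GID` values take precedence over the owner of the mounted directories.

```python
import os


def infer_owner(environ, candidates=("/input", "/output")):
    """Return (uid, gid) for the output owner, or None if nothing can be inferred."""
    uid, gid = environ.get("LOCAL_UID"), environ.get("LOCAL_GID")
    # Assumption: explicitly passed IDs win over anything inferred from mounts.
    if uid is not None and gid is not None:
        return int(uid), int(gid)
    # Otherwise use the owner of the first mounted directory that exists:
    # /input for normal runs, /output for report-only runs.
    for path in candidates:
        if os.path.isdir(path):
            info = os.stat(path)
            return info.st_uid, info.st_gid
    return None
```

Calling `infer_owner(os.environ)` inside the container would return the pair that a privilege-dropping tool such as gosu could then be given.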

Pipeline Stage Selection#

params.steps is the primary switch for choosing how much of the workflow to run. It can be set in the config or overridden with --steps.

Value	Behavior
`meg_all`	Default full MEG workflow using an existing `fs_subjects_dir`: MEG import, continuous preprocessing, artifact detection, ICA, epochs, covariance, coregistration, forward solution, source reconstruction, and static report.
`all`	Structural MRI workflow plus full MEG workflow in one run.
`anatomy`	Structural MRI workflow only.
`meg_artifacts`	MEG import, continuous preprocessing, artifact detection, then static report.
`meg_ica`	Through ICA fitting, labeling, ICA application, then static report.
`meg_epochs`	Through epoch generation, then static report.
`report`	Rebuild the static HTML report from existing outputs only.

Aliases are supported: meg maps to meg_all, artifacts maps to meg_artifacts, ica maps to meg_ica, and epochs maps to meg_epochs.

Optional modifiers are comma-separated. skip_ica is valid only with meg_epochs and builds epochs directly from the OSL preprocessed raw files. It is not valid with meg_all or all because the downstream source reconstruction path expects ICA-cleaned raw data. with_anatomy is valid with meg_artifacts, meg_ica, or meg_epochs and runs structural MRI processing before the selected MEG milestone.
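The stage, alias, and modifier rules above can be checked with a short Python sketch. It assumes the stage and its modifiers share one comma-separated string such as `epochs,skip_ica`, and that aliases may be combined with modifiers. The function `parse_steps` is hypothetical and is not MEGPrep's own parser.

```python
# Aliases and stage names are taken from the table and alias list above.
ALIASES = {
    "meg": "meg_all",
    "artifacts": "meg_artifacts",
    "ica": "meg_ica",
    "epochs": "meg_epochs",
}
STAGES = {"meg_all", "all", "anatomy", "meg_artifacts", "meg_ica", "meg_epochs", "report"}
# Each modifier lists the stages it may be combined with.
MODIFIER_STAGES = {
    "skip_ica": {"meg_epochs"},
    "with_anatomy": {"meg_artifacts", "meg_ica", "meg_epochs"},
}


def parse_steps(value):
    """Return (stage, modifiers) for a steps string, or raise ValueError."""
    tokens = [tok.strip() for tok in value.split(",") if tok.strip()]
    modifiers = [tok for tok in tokens if tok in MODIFIER_STAGES]
    stages = [ALIASES.get(tok, tok) for tok in tokens if tok not in MODIFIER_STAGES]
    if len(stages) != 1 or stages[0] not in STAGES:
        raise ValueError(f"expected exactly one known stage in {value!r}")
    stage = stages[0]
    for modifier in modifiers:
        if stage not in MODIFIER_STAGES[modifier]:
            raise ValueError(f"{modifier} is not valid with {stage}")
    return stage, modifiers
```

For example, `parse_steps("epochs,skip_ica")` resolves the alias and accepts the modifier, while `parse_steps("meg_all,skip_ica")` raises `ValueError`.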

Global Paths and Execution Settings#

Data Import#

MEG input discovery is performed by meg_import_dataset.py. BIDS datasets are discovered through MNE-BIDS entities. Raw datasets are discovered by walking the input directory and selecting files by suffix.

Field	Type	Allowed values	Meaning
`dataset_format`	string	`auto`, `bids`, `raw`	`auto` treats a directory containing `dataset_description.json` as BIDS; otherwise it uses raw-file discovery.
`file_suffix`	string	Any basename ending, usually `.fif`, `.ds`, or `c,rfDC`	Used only for raw dataset discovery. Matches files and directories, so CTF `.ds` folders can be imported. Split FIF continuation files such as `-1.fif` are excluded.
`meg_import_config.subject_id`	null, string, or list	BIDS subject labels without `sub-`	Optional subject filter for BIDS MEG input.
`meg_import_config.session_id`	null, string, or list	BIDS session labels	Optional session filter.
`meg_import_config.task`	null, string, or list	BIDS task labels	Optional task filter.
`meg_import_config.run_id`	null, string, or list	BIDS run labels	Optional run filter.
`meg_import_config.raw_exclude_keywords`	null, string, or list	Case-insensitive substrings	Raw dataset only. Excludes files or directories whose basename contains any listed keyword, for example `phantom` or `emptyroom`.
`meg_import_config.raw_include_keywords`	null, string, or list	Case-insensitive substrings	Raw dataset only. When set, only files or directories whose basename contains at least one listed keyword are imported.
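The raw-dataset filters in the table can be illustrated with a Python sketch that works on a list of basenames. It is not the logic of meg_import_dataset.py. The function `select_raw_entries` is hypothetical, and the rule for recognizing a split FIF continuation (a `-N.fif` name whose base `.fif` file is also present) is an assumption.

```python
import re

# Assumption: "name-1.fif" is a continuation only when "name.fif" also exists.
_CONTINUATION = re.compile(r"^(?P<stem>.+)-\d+\.fif$")


def _as_list(value):
    """Normalize null, string, or list config values to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def select_raw_entries(basenames, file_suffix, include=None, exclude=None):
    """Return the basenames (files or directories) that would be imported."""
    names = set(basenames)
    include = [k.lower() for k in _as_list(include)]
    exclude = [k.lower() for k in _as_list(exclude)]
    selected = []
    for name in basenames:
        if not name.endswith(file_suffix):
            continue
        match = _CONTINUATION.match(name)
        if match and match.group("stem") + ".fif" in names:
            continue  # split FIF continuation file
        lowered = name.lower()
        if any(k in lowered for k in exclude):
            continue  # case-insensitive exclude keywords
        if include and not any(k in lowered for k in include):
            continue  # when set, at least one include keyword must match
        selected.append(name)
    return selected
```

Because matching is done on basenames, the same function covers CTF `.ds` directories when `file_suffix` is `.ds`.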

Anatomy Input and Reconstruction#

Structural processing runs with steps=anatomy, steps=all, or a MEG milestone combined with with_anatomy. If structural processing is not selected, MEGPrep assumes fs_subjects_dir already contains the subject reconstructions.

Field	Type	Allowed values	Meaning
`is_bids`	boolean	`true` or `false`	Selects BIDS anatomy import or non-BIDS T1 handling.
`anatomy_preprocess_method`	string	`freesurfer` or `deepprep`	Backend for anatomical reconstruction when anatomy is run.
`anatomy_select_tag`	string	Empty string or suffix such as `_run-02_T1w`	Appended to the MEG-derived subject id when matching anatomy.
`mri_import_config.*`	YAML filters	`subject_id`, `session_id`, `task`, `run_id`	BIDS filters used to select T1w images.
`t1_dir`	path	Any readable path	T1 input root for FreeSurfer and non-BIDS anatomy.
`t1_input_type`	string	`nifti` or `dicom`	Non-BIDS T1 input format.
`t1_dicom_series_glob`	string	Relative glob such as `T1` or `mprage`; unset by default	Optional relative glob for selecting DICOM series directories before conversion. When unset, the whole DICOM root is passed to `dcm2niix`.
`fs_subjects_dir`	path	FreeSurfer subjects directory	Used for recon outputs, BEM, coregistration, forward model, and source reconstruction.
`deepprep_device`	string	`cpu` or device supported by DeepPrep	Device passed to DeepPrep.
`t1_bids_dir`	path	BIDS directory	T1 BIDS root passed to DeepPrep.
`fs_license`	path	FreeSurfer license file	Required for FreeSurfer/DeepPrep execution.
`bem_config.ico`	integer	MNE ico grade	Resolution for BEM surface generation.
`bem_config.conductivity`	list of floats	For example `[0.3]`	Conductivity model passed to MNE BEM creation.

Continuous Preprocessing#

preproc_config is an OSL-Ephys preprocessing chain. Steps are executed in the order listed. A common chain is Maxwell/tSSS if needed, filtering, line-noise notch filtering, and resampling.

preproc:
  - maxwell_filter:
      calibration: /path/to/sss_cal.dat
      cross_talk: /path/to/ct_sparse.fif
      st_duration: 10.0
      st_correlation: 0.98
  - filter: {l_freq: 0.5, h_freq: 125, method: iir, iir_params: {order: 5, ftype: butter}}
  - notch_filter: {freqs: 50 100}
  - resample: {sfreq: 250}

The filter and notch_filter fields map to MNE raw filtering methods. resample maps to MNE raw resampling and is the current configurable downsampling mechanism. Use a target sfreq that preserves the frequencies needed for later analyses.
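When choosing the resample target, the relevant arithmetic is the Nyquist limit: a recording sampled at sfreq can represent content only below sfreq / 2. The sketch below uses hypothetical helper names and plain arithmetic; it does not call MNE or OSL-Ephys.

```python
def nyquist_hz(sfreq):
    """Highest representable frequency for a given sampling rate."""
    return sfreq / 2.0


def preserves(sfreq, highest_freq_hz):
    """True if the highest frequency of interest lies strictly below Nyquist."""
    return highest_freq_hz < nyquist_hz(sfreq)
```

With the example chain above, `nyquist_hz(250)` is 125.0, so the 125 Hz low-pass edge coincides with the Nyquist frequency of the resampled data. An analysis that only needs content up to 40 Hz is comfortably covered.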

Artifact Detection#

Artifacts are detected after continuous preprocessing and before ICA. The results are saved as sidecar files under preprocessed/artifact_report/<recording>/ and are also loaded by later ICA and epoch steps.

Config path	Method	Description
`find_bad_channels.pyprep.deviation`	PyPREP deviation	Flags channels whose amplitude distribution deviates from the channel population. `deviation_threshold` controls sensitivity.
`find_bad_channels.pyprep.snr`	PyPREP SNR	Flags low signal-to-noise channels.
`find_bad_channels.pyprep.nan_flat`	PyPREP NaN/flat	Flags channels containing NaNs or flat signals.
`find_bad_channels.pyprep.hfnoise`	PyPREP high-frequency noise	Flags channels with excessive high-frequency noise.
`find_bad_channels.pyprep.ransac`	PyPREP RANSAC	Reconstructs channels from neighboring channels and flags channels with poor reconstruction correlation. This can be slow.
`find_bad_channels.pyprep.correlation`	PyPREP correlation	Flags channels with low correlation to other channels across windows.
`find_bad_channels.psd`	PSD outlier	Computes per-channel mean PSD and flags channels above `mean + std_multiplier * std`.
`find_bad_channels.osl`	OSL `detect_badchannels`	Runs OSL bad-channel detection for magnetometers and, when available, gradiometers. Common fields include `ref_meg` and `significance_level`.
`find_bad_channels.mne.find_bad_channels_lof`	MNE local outlier factor	Passes fields such as `n_neighbors`, `picks`, `metric`, and `threshold` to MNE’s LOF detector.
`find_bad_segments.osl`	OSL `detect_badsegments`	Marks outlier or zero-valued time windows. `segment_len` controls the detection window length in samples.
`find_bad_segments.mne.annotate_muscle_zscore`	MNE muscle z-score	Adds muscle-related annotations using MNE’s z-score based detector.
`find_bad_segments.mne.annotate_amplitude`	MNE amplitude annotation	Adds annotations for amplitude-based excursions.
`find_bad_segments.mne.annotate_break`	MNE break annotation	Marks long breaks between events.
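As an illustration of the PSD outlier rule in the table, the following Python sketch applies `mean + std_multiplier * std` to per-channel mean PSD values. The function name is hypothetical. Using the population standard deviation and leaving the PSD scale (linear or dB) to the caller are assumptions, not statements about MEGPrep's implementation.

```python
from statistics import mean, pstdev


def psd_outlier_channels(mean_psd_by_channel, std_multiplier):
    """Return channel names whose mean PSD exceeds mean + std_multiplier * std."""
    values = list(mean_psd_by_channel.values())
    # Assumption: population standard deviation across channels.
    threshold = mean(values) + std_multiplier * pstdev(values)
    return sorted(ch for ch, v in mean_psd_by_channel.items() if v > threshold)
```

Note that a single extreme channel also inflates the mean and standard deviation, so in small channel sets a large `std_multiplier` may flag nothing.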

interpolate_bads under artifact_config controls whether bad channels are interpolated immediately in the preprocessed raw file. If false, bad channels are retained in raw.info['bads'] for later exclusion or handling. artifact_images_enabled controls waveform and overview image generation for manual review, and meg_vendor selects vendor-specific plotting assumptions.

For ICA rule-based labeling, ic_label_config.ICA_classify.meg_vendor can be set to auto. This is the recommended cohort setting because each dataset may come from a different MEG system. When auto is used, MEGPrep infers the template family from the ICA channel names and applies bundled templates only when they are available. Current ECG/EOG template similarity bundles cover elekta/neuromag, ctf, 4d/bti, and kit. OPM datasets or unknown channel layouts skip template similarity gracefully and continue with the other ICA labeling methods.

If the dataset vendor is known, meg_vendor_by_dataset can override auto on a per-dataset basis. The key is matched against the ICA file path, so cohort dataset directory names are usually sufficient:

ICA_classify:
  meg_vendor: auto
  meg_vendor_by_dataset:
    OPM-Artifacts: opm
    Cam-CAN: neuromag
    My-CTF-Dataset: ctf
    default: auto

Dataset-specific mappings take precedence over meg_vendor. If no mapping matches, MEGPrep falls back to meg_vendor; if that is auto, it uses channel-name inference. For backward-compatible compact configs, ICA_classify.meg_vendor may also be a mapping with the same keys, although meg_vendor_by_dataset is preferred for clarity.
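The precedence described above can be sketched as follows. The function `resolve_meg_vendor` is hypothetical. Case-sensitive substring matching against the ICA file path is an assumption, and because the handling of the `default` key is not spelled out here, the sketch simply does not treat it as a path key.

```python
def resolve_meg_vendor(ica_path, meg_vendor="auto", meg_vendor_by_dataset=None):
    """Pick the vendor for one ICA file: dataset mapping first, then meg_vendor."""
    for key, vendor in (meg_vendor_by_dataset or {}).items():
        if key == "default":
            continue  # assumption: "default" is not matched against paths
        if key in ica_path:
            return vendor  # dataset-specific mapping takes precedence
    # No mapping matched: fall back to meg_vendor. A result of "auto" means
    # the template family would be inferred from channel names downstream.
    return meg_vendor
```

For a path containing `Cam-CAN` and the mapping shown above, the result is `neuromag`; a path that matches no key yields whatever `meg_vendor` is set to.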

Bad Segment Marking and Exclusion#

MEGPrep separates marking bad time spans from excluding data:

Artifact detection writes MNE annotations to *_bad_segments.txt. This is a marking step.
ICA fitting loads the annotations and uses reject_by_annotation=True so marked spans are not used for fitting ICA.
ICA application saves a cleaned continuous raw file with the bad-channel and bad-segment metadata attached.
Epoch exclusion is controlled later by epoch_config. In particular, epochs.reject_by_annotation drops epochs overlapping bad annotations, and epochs.reject applies peak-to-peak rejection thresholds.
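The difference between marking and excluding can be made concrete with a small interval check. This sketch is a simplified stand-in for what `reject_by_annotation` does at the epoch stage. It assumes spans are given in seconds as (onset, duration) pairs and treats intervals as half-open, which may differ from MNE's boundary handling.

```python
def overlaps_bad(start, stop, bad_spans):
    """True if [start, stop) overlaps any (onset, duration) bad span."""
    return any(onset < stop and start < onset + duration
               for onset, duration in bad_spans)


def keep_epochs(epoch_windows, bad_spans, reject_by_annotation=True):
    """Marked spans only remove epochs when reject_by_annotation is enabled."""
    if not reject_by_annotation:
        return list(epoch_windows)
    return [(start, stop) for start, stop in epoch_windows
            if not overlaps_bad(start, stop, bad_spans)]
```

With `reject_by_annotation=False` the annotations remain attached to the data but no epoch is dropped because of them.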

Epoching#

epoch_config controls optional segmentation after the continuous preprocessing and ICA stages.

Field	Allowed values	Meaning
`task_type`	`task` or `resting`	`resting` creates fixed-length events; `task` uses event triggers or BIDS event files.
`resting.fixed_length_duration`	float seconds	Duration used by MNE fixed-length event generation.
`event_source`	`find_events` or `event_file`	Selects MNE trigger discovery or BIDS `events.tsv` parsing.
`find_events`	MNE `find_events` kwargs	Fields such as `stim_channel`, `shortest_event`, and `min_duration`.
`event_file`	YAML mapping	Filters and optionally maps BIDS event labels to integer ids.
`exclude_event_id`	integer or list of integers	Drops these event ids before epoching. Set `epochs.event_id: null` to keep all other detected or BIDS events; if `epochs.event_id` is also set, both the include filter and this exclude filter are applied.
`epochs`	MNE `Epochs` kwargs	Fields such as `event_id`, `tmin`, `tmax`, `reject_by_annotation`, `picks`, `baseline`, `reject`, `preload`, and `detrend`.
`autoreject`	boolean	If true, MEGPrep estimates global rejection thresholds with `autoreject.get_rejection_threshold` and calls `drop_bad`.
`interpolate_bads`	boolean	Interpolates bad channels in the epoch object.
`drop_bad_channels`	boolean	Drops channels listed in `epochs.info['bads']`.
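The interaction between `epochs.event_id` and `exclude_event_id` can be sketched in Python. The function `filter_events` is hypothetical. Events are assumed to be rows of (sample, previous value, event id), as in MNE event arrays, and `event_id` may be null, an integer, a list, or a name-to-id mapping.

```python
def _id_set(value):
    """Normalize null, int, list, or name->id mapping to a set of ids (or None)."""
    if value is None:
        return None
    if isinstance(value, int):
        return {value}
    if isinstance(value, dict):
        return set(value.values())
    return set(value)


def filter_events(events, event_id=None, exclude_event_id=None):
    """Apply the include filter (event_id) and the exclude filter together."""
    include = _id_set(event_id)
    exclude = _id_set(exclude_event_id) or set()
    kept = []
    for row in events:
        code = row[2]
        if code in exclude:
            continue  # dropped before epoching
        if include is not None and code not in include:
            continue  # not selected by event_id
        kept.append(row)
    return kept
```

Setting `event_id=None` keeps every event that is not excluded; supplying both filters keeps only ids that are included and not excluded.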

Covariance and Empty-Room Style Noise Records#

Noise covariance is controlled by covar_type and covar_config.

covar_type = "epochs" computes covariance from baseline epochs generated from each cleaned recording. covar_config.events and covar_config.epochs define the events and baseline window, and covar_config.covariance is passed to MNE compute_covariance.

covar_type = "raw" computes covariance from a continuous raw recording with MNE compute_raw_covariance. The workflow uses raw_covariance_task_id to find the paired noise or baseline recording: for each non-noise task file, it replaces the task-... entity in the filename with task-${params.raw_covariance_task_id} and uses that file if it exists. This is the current mechanism for empty-room or empty-room-like recordings. For example, set raw_covariance_task_id = "emptyroom" when the dataset contains files named with task-emptyroom. These records are not source-localized as experimental recordings; they are used as covariance input for the paired experimental recording.
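The filename substitution used for pairing can be sketched as below. The helper `paired_noise_name` and the example filenames are hypothetical, and the pattern used to recognize the `task-...` entity (letters and digits only) is an assumption. The workflow additionally checks that the resulting file exists, which this sketch leaves to the caller.

```python
import re

# Assumption: a task label consists of letters and digits only.
_TASK_ENTITY = re.compile(r"task-[A-Za-z0-9]+")


def paired_noise_name(filename, raw_covariance_task_id):
    """Return the paired noise filename, or None if there is nothing to pair."""
    replacement = f"task-{raw_covariance_task_id}"
    new_name, count = _TASK_ENTITY.subn(replacement, filename, count=1)
    if count == 0:
        return None  # no task entity in the name
    if new_name == filename:
        return None  # this file is itself the noise recording
    return new_name
```

For instance, with `raw_covariance_task_id = "emptyroom"`, a file named `sub-01_task-rest_meg.fif` would be paired with `sub-01_task-emptyroom_meg.fif` if that file is present.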

Coregistration, Forward Model, and Source Reconstruction#

core_config controls automated MEG-MRI coregistration. It contains pre-cleaning parameters for head-shape points and two ICP stages: icp and finetune_icp. Weights such as nasion_weight, hsp_weight, and hpi_weight control how strongly fiducials, head-shape points, and HPI points influence the fit.

fwd_config controls the forward model. surface selects the cortical surface, and spacing controls source-space spacing such as ico4.

src_type selects whether source reconstruction consumes epochs or raw data. src_config.source_methods currently supports methods implemented by source_localization.py, including dSPM and LCMV. The nested dSPM and LCMV blocks are passed to MNE inverse or beamformer functions.

Static Report Thresholds#

The static report uses simple, configurable thresholds for alarms. These are not normative or calibrated quality scores.

Field	Default	Meaning
`bad_channel_threshold`	`30`	Warn when detected bad channels exceed this count.
`bad_segment_threshold`	`50`	Warn when detected bad segments exceed this count.
`coreg_mean_threshold`	`5.0`	Danger alarm when mean coregistration distance exceeds this value in mm.
`coreg_max_threshold`	`10.0`	Danger alarm when max coregistration distance exceeds this value in mm.
`epoch_reject_rate_threshold`	`0.30`	Warn when rejected epoch fraction exceeds this value.
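How the thresholds translate into alarms can be sketched as a simple comparison. The defaults come from the table above; the metric names (`n_bad_channels`, `coreg_mean_mm`, and so on) and the function `report_alarms` are hypothetical and only serve the illustration.

```python
DEFAULT_THRESHOLDS = {
    "bad_channel_threshold": 30,
    "bad_segment_threshold": 50,
    "coreg_mean_threshold": 5.0,   # mm
    "coreg_max_threshold": 10.0,   # mm
    "epoch_reject_rate_threshold": 0.30,
}

# (hypothetical metric name, threshold field, alarm level)
CHECKS = [
    ("n_bad_channels", "bad_channel_threshold", "warning"),
    ("n_bad_segments", "bad_segment_threshold", "warning"),
    ("coreg_mean_mm", "coreg_mean_threshold", "danger"),
    ("coreg_max_mm", "coreg_max_threshold", "danger"),
    ("epoch_reject_rate", "epoch_reject_rate_threshold", "warning"),
]


def report_alarms(metrics, thresholds=None):
    """Return (level, metric) pairs for every metric that exceeds its threshold."""
    limits = {**DEFAULT_THRESHOLDS, **(thresholds or {})}
    alarms = []
    for metric, field, level in CHECKS:
        value = metrics.get(metric)
        if value is not None and value > limits[field]:
            alarms.append((level, metric))
    return alarms
```

The comparison is strict, so a value equal to its threshold does not raise an alarm in this sketch.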

Static Report Task Logs#

static_task_log_mode controls command-log bundling for the Task Details and Task Failure Details sections in the static report.

Value	Meaning
`all-command-log`	Default. Copy `.command.err`, `.command.log`, and `.command.out` excerpts for failed or ignored tasks, and also copy `.command.log` for successful tasks.
`failed`	Copy command logs only for failed or ignored tasks when a smaller report directory is preferred.
`none`	Copy no `.command*` logs. Trace-derived task details remain visible.
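The three modes can be summarized as a small decision function. This is a sketch only: the function name and the status labels `FAILED` and `IGNORED` are hypothetical, and it is an assumption that `failed` mode copies the same three files for failed or ignored tasks as `all-command-log` does.

```python
FAILURE_LOGS = [".command.err", ".command.log", ".command.out"]
MODES = {"all-command-log", "failed", "none"}


def logs_to_copy(mode, task_status):
    """Return the .command* files bundled into the report for one task."""
    if mode not in MODES:
        raise ValueError(f"unknown static_task_log_mode: {mode!r}")
    if mode == "none":
        return []
    if task_status in {"FAILED", "IGNORED"}:  # hypothetical status labels
        return list(FAILURE_LOGS)
    # Successful task: only all-command-log keeps its .command.log.
    return [".command.log"] if mode == "all-command-log" else []
```

In every mode the trace-derived task details stay in the report; only the copied log files differ.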