Denoising (sphero_vem.denoising)

Self-supervised Noise2Void training and inference via CAREamics.

Functions for denoising images, based on CAREamics.

class sphero_vem.denoising.DenoisingConfig(root_path, src_path, num_images=10, val_split=0.2, random_state=42, batch_size=128, patch_size=64, epochs=100, unet_depth=2, unet_num_channels_init=32, n2v2=False, num_workers=16, wandb_project='denoising', work_root=PosixPath('data/models/n2v'), model_name=None)

Bases: BaseConfig

Configuration for Noise2Void training via CAREamics.

Parameters:
  • root_path (Path) – Path to the root Zarr archive containing the source data.

  • src_path (str) – Path within the Zarr archive to the source image array.

  • num_images (int, optional) – Number of 2-D slices to load for training. Default is 10.

  • val_split (float, optional) – Fraction of slices to use for validation. Default is 0.2.

  • random_state (int, optional) – Random seed for the train/validation split. Default is 42.

  • batch_size (int, optional) – Training mini-batch size. Default is 128.

  • patch_size (int, optional) – Spatial size of square patches extracted from slices. Default is 64.

  • epochs (int, optional) – Number of training epochs. Default is 100.

  • unet_depth (int, optional) – Depth of the U-Net encoder. Default is 2.

  • unet_num_channels_init (int, optional) – Number of feature channels in the first U-Net encoder layer. Default is 32.

  • n2v2 (bool, optional) – If True, use N2V2 blind-spot strategy instead of standard N2V. Default is False.

  • num_workers (int, optional) – Number of data-loader worker processes. Default is 16.

  • wandb_project (str, optional) – Weights & Biases project name for experiment tracking. Default is "denoising".

  • work_root (Path, optional) – Root directory for model checkpoints and configs. Default is Path("data/models/n2v").

  • model_name (str | None, optional) – Unique name for this training run. If None, a timestamp-based name is generated automatically. Default is None.

save_n2v_config(filepath)

Save the N2V configuration to a JSON file at filepath.

Return type:

None

Parameters:

filepath (str | Path)

sphero_vem.denoising.train_n2v(config)

Train a Noise2Void model using the parameters in config.

Loads 2-D slices from a Zarr array, splits them into training and validation sets, saves config files, and runs the CAREamics training loop with Weights & Biases logging.

Parameters:

config (DenoisingConfig) – Training configuration. The Zarr archive at config.root_path must be readable and the array at config.src_path must exist.

Return type:

None
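The train/validation split described above can be illustrated with a minimal sketch. It assumes the split is a seeded shuffle of slice indices; the helper name split_slices and the rounding rule are hypothetical and are not taken from the sphero_vem source, which may use a different routine.

```python
import numpy as np


def split_slices(num_images: int, val_split: float, random_state: int):
    """Shuffle slice indices with a fixed seed and split off a validation set.

    Hypothetical helper for illustration only, not the library's implementation.
    """
    rng = np.random.default_rng(random_state)
    indices = rng.permutation(num_images)
    # Keep at least one validation slice whenever val_split is positive.
    num_val = max(1, int(round(num_images * val_split))) if val_split > 0 else 0
    val_idx = np.sort(indices[:num_val])
    train_idx = np.sort(indices[num_val:])
    return train_idx, val_idx


# Documented defaults: num_images=10, val_split=0.2, random_state=42.
train_idx, val_idx = split_slices(num_images=10, val_split=0.2, random_state=42)
print(len(train_idx), len(val_idx))  # 8 2
```

With the default values, 8 slices go to training and 2 to validation, and the same random_state reproduces the same partition.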

class sphero_vem.denoising.DenoisingStats(global_min=inf, global_max=-inf, residual_counts=<factory>)

Bases: object

Statistics accumulated during the denoising pass.

Tracks global intensity range of the denoised output and the residual histogram (original - denoised) in the original intensity space, before any rescaling.

Parameters:
  • global_min (float) – Running minimum of denoised float32 values across all slices.

  • global_max (float) – Running maximum of denoised float32 values across all slices.

  • residual_counts (np.ndarray) – Histogram counts of shape (511,), covering residuals in [-255, 255].

Notes

Residuals are computed as original (uint8) - denoised (float32), rounded and clipped to [-255, 255]. A well-behaved residual histogram should be approximately zero-mean and Gaussian. A small negative bias is expected due to N2V’s blind-spot averaging.

update(original, denoised)

Update statistics with a new slice.

Parameters:
  • original (np.ndarray) – Original uint8 image.

  • denoised (np.ndarray) – Denoised float32 image, before any rescaling.

Return type:

None
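The residual bookkeeping described in the Notes can be sketched with NumPy alone. This is an illustration under stated assumptions, not the class's actual code: the function name residual_histogram is hypothetical, and the mapping of residual r to bin index r + 255 is an assumption that is consistent with the documented shape (511,).

```python
import numpy as np


def residual_histogram(original: np.ndarray, denoised: np.ndarray) -> np.ndarray:
    """Count rounded, clipped residuals (original - denoised) in 511 integer bins.

    Hypothetical helper: bin index i is assumed to hold residual value i - 255.
    """
    residual = original.astype(np.float32) - denoised.astype(np.float32)
    residual = np.clip(np.rint(residual), -255, 255).astype(np.int64)
    # Shift [-255, 255] to [0, 510] so the values can be used as bin indices.
    return np.bincount(residual.ravel() + 255, minlength=511)


original = np.array([[10, 20], [30, 40]], dtype=np.uint8)
denoised = np.array([[10.4, 18.6], [31.2, 40.0]], dtype=np.float32)
counts = residual_histogram(original, denoised)

# Mean residual recovered from the histogram; values near zero indicate low bias.
bins = np.arange(-255, 256)
mean_residual = float((counts * bins).sum() / counts.sum())
```

Counts from successive slices can be summed element-wise, which is what makes a fixed-size histogram convenient for accumulating statistics over a whole volume.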

sphero_vem.denoising.denoise_image(image, careamist, tile_size, tile_overlap, batch_size, num_workers, rescale=False, stats=None)

Denoise a single YX image.

Parameters:
  • image (np.ndarray) – 2D image to denoise (YX).

  • careamist (CAREamist) – Trained CAREamist model.

  • tile_size (tuple[int, ...]) – Tile size for prediction.

  • tile_overlap (tuple[int, ...]) – Overlap between tiles.

  • batch_size (int) – Number of tiles to process in parallel.

  • num_workers (int) – Number of dataloader workers.

  • rescale (bool) – If True, rescale denoised output to uint8 [0, 255] using per-slice min/max. Default is False.

  • stats (DenoisingStats | None) – If provided, updated in-place with global min/max and residual histogram. Stats are accumulated from the float32 denoised image before rescaling. Default is None.

Returns:

Denoised image. If rescale=True, returns uint8, otherwise float32.

Return type:

np.ndarray
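As a rough illustration of what rescale=True implies, the sketch below linearly maps a float32 slice to uint8 using its own minimum and maximum. The helper name rescale_to_uint8, the rounding mode, and the handling of constant images are assumptions made for this example; the library's exact behaviour may differ.

```python
import numpy as np


def rescale_to_uint8(image: np.ndarray, vmin: float, vmax: float) -> np.ndarray:
    """Linearly map [vmin, vmax] to [0, 255] and cast to uint8.

    Hypothetical helper for illustration only.
    """
    if vmax <= vmin:
        # Constant image: avoid division by zero and return zeros (an assumption).
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image.astype(np.float32) - vmin) / (vmax - vmin) * 255.0
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)


# Per-slice rescaling: the range comes from the slice itself.
denoised = np.array([[-2.0, 0.0], [4.0, 8.0]], dtype=np.float32)
out = rescale_to_uint8(denoised, float(denoised.min()), float(denoised.max()))
print(out.tolist())  # [[0, 51], [153, 255]]
```

Because every slice is stretched to the full [0, 255] range independently, the same float32 value can map to different uint8 values in different slices.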

sphero_vem.denoising.denoise_stack(root_path, src_path, model_name, dst_group='images/denoised', model_root=PosixPath('data/models/n2v'), tile_size=(512, 512), tile_overlap=(48, 48), batch_size=64, num_workers=16, temp_dir=PosixPath('data/tmp'), rescale_mode='per_slice')

Denoise a volume with Noise2Void and rescale the result to uint8.

Performs denoising with either global or per-slice intensity rescaling. Both modes accumulate a residual histogram saved as an npz file alongside the output zarr.

Parameters:
  • root_path (Path) – Path to the zarr root archive.

  • src_path (str) – Path to the source array within the zarr archive.

  • model_name (str) – Name of the trained N2V model. Used to locate the model checkpoint and configuration file under model_root.

  • dst_group (str) – Destination group within the zarr archive. The output array path is constructed as {dst_group}/{dirname_from_spacing(spacing)}. Default is "images/denoised".

  • model_root (Path) – Root directory containing trained models. Each model should be in a subdirectory with a config.json and checkpoint file. Default is Path("data/models/n2v").

  • tile_size (tuple[int, int]) – Tile size in pixels (Y, X) for prediction. Default is (512, 512).

  • tile_overlap (tuple[int, int]) – Overlap in pixels (Y, X) between adjacent tiles to avoid boundary artifacts. Default is (48, 48).

  • batch_size (int) – Number of tiles to process in parallel on the GPU. Default is 64.

  • num_workers (int) – Number of dataloader workers for tile loading. Default is 16.

  • temp_dir (Path | str | None) – Directory for intermediate float32 zarr storage. Only used when rescale_mode='global'. Should be on fast local storage (SSD). Default is Path("data/tmp").

  • rescale_mode (Literal["per_slice", "global"]) – If 'global', use global min/max across all slices for rescaling (two-pass, requires temporary zarr storage). If 'per_slice', rescale each slice independently using per-slice min/max (single-pass, no temporary storage). Default is 'per_slice'.

Return type:

None

Notes

When rescale_mode='global', the intermediate zarr is uncompressed and can be large (4 bytes per voxel). Ensure sufficient disk space in temp_dir. The residual histogram is saved to {root_path}/images/tables/denoised-residual-hist.npz.
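The difference between the two rescaling modes, and the disk cost of the global one, can be sketched as follows. This is an in-memory illustration under assumptions: the names rescale_global and temp_store_bytes are hypothetical, a plain list stands in for the intermediate float32 zarr, and the volume shape in the size estimate is made up for the example.

```python
import numpy as np


def rescale_global(slices: list[np.ndarray]) -> list[np.ndarray]:
    """Two-pass rescaling: find the global range first, then map every slice with it.

    Hypothetical helper; a list of arrays stands in for the temporary zarr store.
    """
    # Pass 1: global min/max over all denoised float32 slices.
    global_min = min(float(s.min()) for s in slices)
    global_max = max(float(s.max()) for s in slices)
    span = global_max - global_min
    if span <= 0:
        return [np.zeros(s.shape, dtype=np.uint8) for s in slices]
    # Pass 2: apply the same linear map to every slice.
    return [
        np.clip(np.rint((s - global_min) / span * 255.0), 0, 255).astype(np.uint8)
        for s in slices
    ]


def temp_store_bytes(shape: tuple[int, ...]) -> int:
    """Size of an uncompressed float32 array of this shape (4 bytes per voxel)."""
    return int(np.prod(shape, dtype=np.int64)) * 4


slices = [
    np.array([[0.0, 2.0]], dtype=np.float32),
    np.array([[4.0, 10.0]], dtype=np.float32),
]
rescaled = rescale_global(slices)
print([s.tolist() for s in rescaled])  # [[[0, 51]], [[102, 255]]]

# Hypothetical volume of 1000 slices of 4096 x 4096 pixels.
print(temp_store_bytes((1000, 4096, 4096)) / 2**30)  # 62.5 (GiB)
```

Global rescaling keeps intensities comparable across slices, at the cost of a second pass and the temporary float32 copy. Per-slice rescaling would instead map both example slices to [[0, 255]].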