Preprocessing (sphero_vem.preprocessing)
Zarr-native resampling, tensor downscaling, and crop utilities.
Functions for preprocessing images.
- sphero_vem.preprocessing.create_pyramid(image, num_levels, factor)
Build a multi-resolution image pyramid.
- Parameters:
image (torch.Tensor) – Input image tensor at full resolution.
num_levels (int) – Total number of pyramid levels, including the full-resolution image.
factor (int) – Downsampling factor between consecutive levels.
- Returns:
List of image tensors ordered from coarsest to finest resolution.
- Return type:
list[torch.Tensor]
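The level ordering can be illustrated with a minimal sketch. It assumes a 2-D (H, W) float tensor and uses average pooling for the downsampling step. `build_pyramid_sketch` is a hypothetical stand-in, not the library's code, and the real implementation may downsample differently (for example via downscale_tensor).

```python
import torch
import torch.nn.functional as F


def build_pyramid_sketch(image: torch.Tensor, num_levels: int, factor: int) -> list[torch.Tensor]:
    """Hypothetical stand-in for create_pyramid (illustration only)."""
    if num_levels < 1:
        raise ValueError("num_levels must be at least 1")
    levels = [image]  # start with the full-resolution image
    for _ in range(num_levels - 1):
        prev = levels[-1]
        # avg_pool2d expects (N, C, H, W), so add two leading axes
        # (assumption: the input is a plain 2-D image)
        pooled = F.avg_pool2d(prev[None, None], kernel_size=factor)
        levels.append(pooled[0, 0])
    # Reverse so the list runs from coarsest to finest
    return levels[::-1]


image = torch.rand(256, 256)
pyramid = build_pyramid_sketch(image, num_levels=3, factor=2)
print([tuple(level.shape) for level in pyramid])
```

With num_levels=3 and factor=2, the list holds a 64×64, a 128×128, and the original 256×256 image, in that order. Note that num_levels counts the full-resolution image, so only num_levels - 1 downsampling steps are performed.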
- sphero_vem.preprocessing.downscale_tensor(image, factor, mode='bilinear')
Downscale a tensor or batch of tensors by an integer factor.
- Parameters:
image (torch.Tensor) – Input tensor of shape (..., H, W). Unsqueezed to 4-D internally if necessary before interpolation.
factor (int) – Integer downsampling factor. Output spatial dimensions are H // factor × W // factor.
mode (str, optional) – Interpolation mode passed to torch.nn.functional.interpolate. Default is "bilinear". Use "nearest" for label maps.
- Returns:
Downscaled tensor with the same number of dimensions as the input.
- Return type:
torch.Tensor
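The unsqueeze, interpolate, and squeeze steps described above can be sketched as follows. This is a hedged re-implementation for illustration, not the library's code: `downscale_sketch` is a hypothetical name, it handles inputs with 2 to 4 dimensions only, and the choice of align_corners=False is an assumption.

```python
import torch
import torch.nn.functional as F


def downscale_sketch(image: torch.Tensor, factor: int, mode: str = "bilinear") -> torch.Tensor:
    """Hypothetical illustration of downscaling by an integer factor."""
    ndim = image.ndim
    x = image
    # F.interpolate needs (N, C, H, W) for 2-D modes: pad leading axes
    while x.ndim < 4:
        x = x.unsqueeze(0)
    h, w = x.shape[-2:]
    # align_corners is only accepted by the linear/cubic modes
    kwargs = {"align_corners": False} if mode in ("bilinear", "bicubic") else {}
    out = F.interpolate(x, size=(h // factor, w // factor), mode=mode, **kwargs)
    # Drop the padding axes so the output has the input's dimensionality
    while out.ndim > ndim:
        out = out.squeeze(0)
    return out


image = torch.rand(3, 100, 130)
small = downscale_sketch(image, factor=2)
print(tuple(small.shape))

# Label maps: nearest-neighbour keeps label values intact (no blending).
# Cast to float here because integer support in interpolate varies by dtype.
labels = torch.randint(0, 5, (64, 64)).float()
small_labels = downscale_sketch(labels, factor=4, mode="nearest")
print(tuple(small_labels.shape))
```

Bilinear interpolation averages neighbouring values, which would invent label IDs that do not exist in a segmentation; nearest-neighbour only ever copies existing values, which is why it is the mode to use for label maps.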
- sphero_vem.preprocessing.resample_array(zarr_path, array_path, target_spacing, order=1, zarr_chunks=(1, 1024, 1024), n_workers=4)
Resample an array in a Zarr archive to the target voxel spacing.
Uses a lazy Gaussian pre-blur followed by affine transform via dask_image, keeping memory usage bounded to chunk size throughout. Anti-aliasing is applied only along downsampled axes, mirroring skimage.transform.resize. Integer label data (integer dtype + order=0) is resampled without anti-aliasing. float16 arrays are promoted to float32 for processing and cast back on output, as scipy.ndimage does not support float16.
- Parameters:
zarr_path (Path) – Path to the Zarr archive.
array_path (str) – Path to the source array within the archive.
target_spacing (tuple[int, int, int]) – Target voxel spacing (Z, Y, X) in nanometers.
order (int) – Spline interpolation order. 0 = nearest neighbour (labels), 1 = linear (images). Default 1.
zarr_chunks (tuple[int, int, int]) – Chunk shape for the output Zarr array.
n_workers (int) – Number of threads for dask’s threaded scheduler. Default 4.
- Return type:
None
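The per-axis anti-aliasing rule can be made concrete with a little arithmetic. The sketch below is not the library's code. It assumes the source spacing is known and passed in explicitly (the hypothetical `src_spacing` argument), assumes the output shape is obtained by rounding, and uses the sigma rule from skimage.transform.resize, sigma = max(0, (factor - 1) / 2), so axes that are not downsampled get sigma 0 and are not blurred.

```python
import numpy as np


def plan_resample(shape, src_spacing, target_spacing):
    """Hypothetical helper: output shape and Gaussian sigma per axis."""
    shape = np.asarray(shape, dtype=float)
    src = np.asarray(src_spacing, dtype=float)
    dst = np.asarray(target_spacing, dtype=float)
    # factor > 1 means the axis is downsampled, < 1 means upsampled
    factors = dst / src
    # Assumption: the output size is rounded to the nearest voxel
    out_shape = np.maximum(1, np.round(shape / factors)).astype(int)
    # skimage.transform.resize rule: blur only where factor > 1
    sigma = np.maximum(0.0, (factors - 1.0) / 2.0)
    return tuple(int(n) for n in out_shape), tuple(float(s) for s in sigma)


# (Z, Y, X) spacing in nanometers: keep Z, halve the resolution in Y and X
out_shape, sigma = plan_resample(
    shape=(100, 2048, 2048),
    src_spacing=(50, 10, 10),
    target_spacing=(50, 20, 20),
)
print(out_shape, sigma)
```

In this example the Z axis is untouched (factor 1, sigma 0), while Y and X are downsampled by 2 and receive a Gaussian pre-blur with sigma 0.5 voxels.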
- sphero_vem.preprocessing.rechunk_array(root, src_array_path, dst_array_path, dst_chunks=(1, 1024, 1024), copy_attributes=True, delete_src=False, verbose=True)
Copy a Zarr array to a new path with a different chunk layout.
- Parameters:
root (zarr.Group) – Root Zarr group containing the source array.
src_array_path (str) – Path to the source array within root.
dst_array_path (str) – Path for the destination array within root. Created or overwritten.
dst_chunks (tuple[int, int, int], optional) – Chunk shape for the output array. Default is (1, 1024, 1024).
copy_attributes (bool, optional) – If True, copy all Zarr attributes from source to destination. Default is True.
delete_src (bool, optional) – If True, delete the source array after copying. Default is False.
verbose (bool, optional) – If True, show a tqdm progress bar. Default is True.
- Returns:
The newly created destination array.
- Return type:
zarr.Array
- Raises:
FileNotFoundError – If src_array_path does not exist within root.
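A chunk-wise copy of this kind can be sketched without Zarr at all. The example below is an illustration under stated assumptions, not the library's implementation: `iter_chunk_slices` is a hypothetical helper, and NumPy arrays stand in for the source and destination Zarr arrays, which support the same slice-based reads and writes.

```python
import itertools

import numpy as np


def iter_chunk_slices(shape, chunks):
    """Hypothetical helper: yield one tuple of slices per destination chunk."""
    ranges = [range(0, n, c) for n, c in zip(shape, chunks)]
    for starts in itertools.product(*ranges):
        # Clip the last chunk along each axis to the array boundary
        yield tuple(
            slice(s, min(s + c, n)) for s, c, n in zip(starts, chunks, shape)
        )


# NumPy stand-ins for the source and destination arrays
src = np.arange(4 * 5 * 6).reshape(4, 5, 6)
dst = np.zeros_like(src)
dst_chunks = (1, 4, 4)

# Copy one destination-chunk-sized block at a time, so only a single
# block needs to be held in memory
for region in iter_chunk_slices(src.shape, dst_chunks):
    dst[region] = src[region]

print(np.array_equal(src, dst))
```

Iterating over regions aligned to the destination chunk grid means each output chunk is written exactly once, which is the reason to align the copy to dst_chunks rather than to the source layout.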
- sphero_vem.preprocessing.crop_to_valid(data, mode='nonzero')
Crop a 3D array to the bounding box of valid data.
- Parameters:
data (np.ndarray) – The 3D input array.
mode (Literal["nonzero", "notnan"], optional) – The validity criterion: "nonzero" (default) or "notnan".
- Returns:
The cropped array.
- Return type:
np.ndarray
- Raises:
ValueError – If mode is not a valid value.
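Bounding-box cropping of this kind can be sketched in plain NumPy. The function below is a hypothetical re-implementation for illustration, not the library's code. Returning an empty array when no voxel is valid is an assumption, since the behaviour for that case is not documented here.

```python
import numpy as np


def crop_to_valid_sketch(data: np.ndarray, mode: str = "nonzero") -> np.ndarray:
    """Hypothetical illustration of cropping to the valid bounding box."""
    if mode == "nonzero":
        valid = data != 0
    elif mode == "notnan":
        valid = ~np.isnan(data)
    else:
        raise ValueError(f"Unknown mode: {mode!r}")

    if not valid.any():
        # Assumption: nothing valid yields an empty array
        return data[:0, :0, :0]

    slices = []
    for axis in range(data.ndim):
        # Collapse all other axes, then find the first and last valid index
        other = tuple(i for i in range(data.ndim) if i != axis)
        idx = np.flatnonzero(valid.any(axis=other))
        slices.append(slice(idx[0], idx[-1] + 1))
    return data[tuple(slices)]


vol = np.zeros((5, 6, 7))
vol[1:3, 2:5, 3:4] = 1.0
cropped = crop_to_valid_sketch(vol)
print(cropped.shape)

vol_nan = np.full((5, 6, 7), np.nan)
vol_nan[0:2, 1:2, 2:6] = 0.0
cropped_nan = crop_to_valid_sketch(vol_nan, mode="notnan")
print(cropped_nan.shape)
```

The second call shows why the two modes differ: the valid region contains zeros, so "nonzero" would treat it as empty while "notnan" keeps it.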