Tag
This paper presents 3D masked autoencoders for volumetric microscopy data. It shows that 3D modeling outperforms 2D max-projection and slice-based variants on downstream single-cell tasks, and that cross-modal alignment to a protein language model further improves performance.