Computational Imaging and Vision Laboratory

Reliability-Aware Restoration Framework for 4D Spectral Photoacoustic Data

Spectral photoacoustic imaging (PAI) is a new technology that non-invasively provides the 3D geometric structure of a target's interior together with 1D wavelength-dependent absorption information. It has potentially broad applications in clinical and medical diagnosis. Unfortunately, the usability of spectral PAI is severely limited by a time-consuming data scanning process and complex noise. In this study, we therefore propose a reliability-aware restoration framework to recover clean 4D data from incomplete and noisy observations. To the best of our knowledge, this is the first attempt at the 4D spectral PA data restoration problem that solves data completion and denoising simultaneously. We first present a sequence of analyses, including modeling of data reliability in the depth and spectral domains, developing an adaptive correlation graph, and analyzing local patch orientation. On the basis of these analyses, we explore global sparsity and local self-similarity for restoration. We demonstrate the effectiveness of our proposed approach through experiments on real data captured from patients, where our approach outperforms state-of-the-art methods in both objective evaluation and subjective assessment.

＞read more (PAMI 2023)

High-Fidelity Event-Radiance Recovery via Transient Event Frequency

High-fidelity radiance recovery plays a crucial role in scene information reconstruction and understanding. Conventional cameras suffer from limited dynamic range, bit depth, and spectral response. In this paper, we propose to use event cameras with bio-inspired silicon sensors, which are sensitive to radiance changes, to recover precise radiance values. We reveal that, under active lighting conditions, the transient frequency at which event signals are triggered linearly reflects the radiance value. We propose an innovative method to convert the high temporal resolution of event signals into precise radiance values. The precise radiance values yield several capabilities in image analysis. We demonstrate the feasibility of recovering radiance values solely from the transient event frequency (TEF) through multiple experiments.

＞read more (CVPR 2023)

Blur Interpolation Transformer for Real-World Motion from Blur

This paper studies the challenging problem of recovering motion from blur, also known as joint deblurring and interpolation or blur temporal super-resolution. The challenges are twofold: 1) current methods still leave considerable room for improvement in visual quality, even on synthetic datasets, and 2) they generalize poorly to real-world data. To this end, we propose a blur interpolation transformer (BiT) to effectively unravel the underlying temporal correlation encoded in blur. Based on multi-scale residual Swin transformer blocks, we introduce dual-end temporal supervision and temporally symmetric ensembling strategies to generate effective features for time-varying motion rendering. In addition, we design a hybrid camera system to collect the first real-world dataset of one-to-many blur-sharp video pairs. Experimental results show that BiT has a significant gain over the state-of-the-art methods on the public dataset Adobe240. Moreover, the proposed real-world dataset effectively helps the model generalize well to real blurry scenarios. Code and data are available at https://github.com/zzh-tech/BiT.

＞read more (CVPR 2023)

Bringing Rolling Shutter Images Alive with Dual Reversed Distortion

Rolling shutter (RS) distortion can be interpreted as the result of picking a row of pixels from instant global shutter (GS) frames over time during the exposure of the RS camera. This means that the information of each instant GS frame is partially, yet sequentially, embedded into the row-dependent distortion. Inspired by this fact, we address the challenging task of reversing this process, i.e., extracting undistorted GS frames from images suffering from RS distortion. However, since RS distortion is coupled with other factors such as readout settings and the relative velocity of scene elements to the camera, models that only exploit the geometric correlation between temporally adjacent images generalize poorly to data with different readout settings and to dynamic scenes with both camera motion and object motion. In this paper, instead of two consecutive frames, we propose to exploit a pair of images captured by dual RS cameras with reversed RS directions for this highly challenging task. Grounded on the symmetric and complementary nature of dual reversed distortion, we develop a novel end-to-end model, IFED, to generate a dual optical flow sequence through iterative learning of the velocity field during the RS time. Extensive experimental results demonstrate that IFED is superior to naive cascade schemes, as well as to the state-of-the-art method that utilizes adjacent RS images. Most importantly, although it is trained on a synthetic dataset, IFED is shown to be effective at retrieving GS frame sequences from real-world RS distorted images of dynamic scenes. Code is available at https://github.com/zzh-tech/Dual-Reversed-RS.
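
The row-picking interpretation of RS distortion described above can be illustrated with a minimal NumPy sketch. It is not the IFED model or the authors' data pipeline; the function name `synthesize_rs`, the linear row-to-frame mapping, and the `(T, H, W)` array layout are assumptions made for illustration only.

```python
import numpy as np


def synthesize_rs(gs_frames: np.ndarray, reverse: bool = False) -> np.ndarray:
    """Build one RS image by taking each row from the GS frame at that row's readout time.

    gs_frames: array of shape (T, H, W), instant GS frames spanning one RS readout.
    reverse:   if True, the readout runs bottom-to-top instead of top-to-bottom.
    """
    t, h, _ = gs_frames.shape
    # Assumed timing model: rows are read out at evenly spaced times.
    row_to_frame = np.round(np.linspace(0, t - 1, h)).astype(int)
    if reverse:
        row_to_frame = row_to_frame[::-1]
    rows = np.arange(h)
    # Row r of the output is row r of frame row_to_frame[r].
    return gs_frames[row_to_frame, rows, :]


# Toy stack: frame k is filled with the value k, so each output row reveals its capture time.
frames = np.stack([np.full((4, 3), k) for k in range(4)])
rs_top_down = synthesize_rs(frames)
rs_bottom_up = synthesize_rs(frames, reverse=True)
```

In this toy stack the two readout directions sample each row at mirrored times, which is the complementary structure a dual reversed RS pair offers.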

＞read more (ECCV 2022)

Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance

We study the challenging problem of recovering detailed motion from a single motion-blurred image. Existing solutions to this problem estimate a single image sequence without considering the motion ambiguity for each region. Therefore, the results tend to converge to the mean of the multi-modal possibilities. In this paper, we explicitly account for such motion ambiguity, allowing us to generate multiple plausible solutions all in sharp detail. The key idea is to introduce a motion guidance representation, which is a compact quantization of 2D optical flow with only four discrete motion directions. Conditioned on the motion guidance, the blur decomposition is led to a specific, unambiguous solution by using a novel two-stage decomposition network. We propose a unified framework for blur decomposition, which supports various interfaces for generating our motion guidance, including human input, motion information from adjacent video frames, and learning from a video dataset. Extensive experiments on synthesized datasets and real-world data show that the proposed framework is qualitatively and quantitatively superior to previous methods, and also offers the merit of producing physically plausible and diverse solutions. Code is available at https://github.com/zzh-tech/Animation-from-Blur.
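
To make the idea of quantizing 2D optical flow into four discrete motion directions concrete, here is a small NumPy sketch. The binning by quadrant, the label order, and the function name `quantize_flow` are assumptions for illustration; the abstract does not specify how the four directions are defined.

```python
import numpy as np


def quantize_flow(flow: np.ndarray) -> np.ndarray:
    """Map each 2D flow vector to one of four direction labels.

    flow: array of shape (H, W, 2) holding (dx, dy) per pixel.
    Returns an int array of shape (H, W) with labels 0..3, one per
    quadrant of the (dx, dy) plane (assumed binning, counter-clockwise).
    """
    angle = np.arctan2(flow[..., 1], flow[..., 0])  # range (-pi, pi]
    angle = np.mod(angle, 2 * np.pi)                # shift to [0, 2*pi)
    # Four 90-degree bins; the final modulo guards against rounding up to 2*pi.
    return np.floor(angle / (np.pi / 2)).astype(int) % 4


# One vector per quadrant.
toy_flow = np.array([[[1.0, 1.0], [-1.0, 1.0]],
                     [[-1.0, -1.0], [1.0, -1.0]]])
labels = quantize_flow(toy_flow)
```

Such a label map discards magnitude and fine direction, which is what makes it compact enough to be drawn by hand or predicted from other cues.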

＞read more (ECCV 2022)

Real-World Video Deblurring: A Benchmark Dataset and an Efficient Recurrent Neural Network

Real-world video deblurring in real time remains a challenging task due to the complexity of spatially and temporally varying blur and the requirement of low computational cost. To improve the network efficiency, we adopt residual dense blocks into RNN cells, so as to efficiently extract the spatial features of the current frame. Furthermore, a global spatio-temporal attention module is proposed to fuse the effective hierarchical features from past and future frames to help better deblur the current frame. Another issue that needs to be addressed urgently is the lack of a real-world benchmark dataset. Thus, we contribute a novel dataset (BSD) to the community, by collecting paired blurry/sharp video clips using a co-axis beam splitter acquisition system. Experimental results show that the proposed method (ESTRNN) achieves better deblurring performance, both quantitatively and qualitatively, at lower computational cost than state-of-the-art video deblurring methods. In addition, cross-validation experiments between datasets illustrate the high generality of BSD over the synthetic datasets. The code and dataset are released at https://github.com/zzh-tech/ESTRNN.

＞read more (IJCV 2023)

Unsupervised Deep Non-rigid Alignment by Low-Rank Loss and Multi-input Attention

We propose a deep low-rank alignment network that can simultaneously perform non-rigid alignment and noise decomposition for multiple images despite severe noise and sparse corruptions. To address this challenging task, we introduce a low-rank loss in deep learning under the assumption that a set of well-aligned, well-denoised images should be linearly correlated, and thus that a matrix consisting of the images should be low-rank. This allows us to remove the noise and corruption from input images in a self-supervised learning manner (i.e., without requiring supervised data). In addition, we introduce multi-input attention modules into Siamese U-nets in order to aggregate the corruption information from the set of images. To the best of our knowledge, this is the first attempt to introduce a low-rank loss for deep learning-based non-rigid alignment. Experiments using both synthetic data and real medical image data demonstrate the effectiveness of the proposed method. The code will be publicly available at https://github.com/asanomitakanori/Unsupervised-Deep-Non-Rigid-Alignment-by-Low-Rank-Loss-and-Multi-Input-Attention.
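
The low-rank assumption above can be illustrated with a short NumPy sketch that scores an image stack by the nuclear norm (sum of singular values) of the matrix whose columns are the flattened images. The nuclear norm is a standard convex surrogate for rank; whether the paper uses exactly this form is an assumption, and the function name `low_rank_loss` is hypothetical.

```python
import numpy as np


def low_rank_loss(images: np.ndarray) -> float:
    """Nuclear norm of the matrix whose columns are the flattened images.

    images: array of shape (N, H, W). Lower values mean the images are
    closer to being linearly correlated (assumed surrogate for rank).
    """
    n = images.shape[0]
    mat = images.reshape(n, -1).T  # shape (H*W, N), one image per column
    singular_values = np.linalg.svd(mat, compute_uv=False)
    return float(singular_values.sum())


# Aligned stack: the same single bright pixel in all three images (rank 1).
aligned = np.zeros((3, 2, 2))
aligned[:, 0, 0] = 1.0

# Misaligned stack: the bright pixel sits at a different location in each image (rank 3).
misaligned = np.zeros((3, 2, 2))
misaligned[0, 0, 0] = misaligned[1, 0, 1] = misaligned[2, 1, 0] = 1.0

loss_aligned = low_rank_loss(aligned)
loss_misaligned = low_rank_loss(misaligned)
```

Both stacks have the same total energy, yet the aligned one scores lower, which is the signal a network could minimize without any ground-truth alignment.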

＞read more (MICCAI 2022)

Graph-Based Compression of Incomplete 3D Photoacoustic Data

Photoacoustic imaging (PAI) is a newly emerging bimodal imaging technology based on the photoacoustic effect; specifically, it uses sound waves caused by light absorption in a material to obtain 3D structure data noninvasively. PAI has attracted attention as a promising measurement technology for comprehensive clinical application and medical diagnosis. Because it requires exhaustively scanning an entire object and recording ultrasonic waves from various locations, it encounters two problems: a long imaging time and a huge data size. To reduce the imaging time, a common solution is to apply compressive sensing (CS) theory. CS can effectively accelerate the imaging process by reducing the number of measurements, but the data size is still large, and efficient compression of such incomplete data remains a problem. In this paper, we present the first attempt at direct compression of incomplete 3D PA observations, which simultaneously reduces the data acquisition time and alleviates the data size issue. Specifically, we first use a graph model to represent the incomplete observations. Then, we propose three coding modes and a reliability-aware rate-distortion optimization (RDO) to adaptively compress the data into sparse coefficients. Finally, we obtain a coded bit stream through entropy coding. We demonstrate the effectiveness of our proposed framework through both objective evaluation and subjective visual checking of real medical PA data captured from patients.

＞read more (MICCAI 2022)

Diffeomorphic Neural Surface Parameterization for 3D and Reflectance Acquisition

This paper proposes a simple method that solves the problem of multi-view 3D reconstruction for objects with unknown and generic surface materials, imaged by a freely moving camera and lit by a freely moving point light source. The object can have arbitrary (diffuse or specular) and spatially-varying surface reflectances. Our solution consists of two small-sized neural networks (dubbed the ‘Shape-Net’ and ‘BRDF-Net’), used to parameterize the unknown shape and material map as functions on a canonical surface (e.g., a unit sphere). Key to our method is a velocity field shape representation that drives the canonical surface to the target shape through time. We show that this parameterization can be implemented as a recurrent residual network that is guaranteed to be diffeomorphic and orientation-preserving. Our method yields an exceptionally clean formulation that can be optimized by standard gradient descent without initialization, and works with both near-field and distant light sources. Synthetic and real experiments demonstrate the reliability and accuracy of our reconstructions, with extensions including novel view synthesis, relighting, and material retouching done with ease. Our source code is available at https://github.com/za-cheng/DNS.

＞read more (SIGGRAPH 2022)

[Extended] Depth from Spectral Defocus Blur

This paper proposes a method for depth estimation from a single multispectral image by using a lens property known as chromatic aberration. Because of chromatic aberration, light passing through the lens is refracted by an amount that depends on its wavelength. The ray angles therefore vary with wavelength, which changes the focal length and produces a different defocus blur at each wavelength. We show that chromatic aberration provides clues to recover depth maps from a single multispectral image if we assume that the defocus blur is Gaussian. The proposed method needs only a multispectral camera and a standard wide-aperture lens, which naturally exhibits chromatic aberration. Moreover, we use a simple yet effective depth of field synthesis method to calculate the derivatives and obtain all-in-focus images necessary to approximate spectral derivatives. We verified the effectiveness of the proposed method on various real-world scenes.

＞read more (ICIP 2019)

＞read more (JOSA 2021)

Spatio-temporal BRDF: Modeling and synthesis

We propose a generalization of example-based texture synthesis to spatio-temporal BRDFs. A key component of our method is a novel representation of time-varying materials using polynomials describing time-varying BRDF parameters. Our representation allows efficient fitting of measured data into a compact spatio-temporal BRDF representation, and it allows an efficient analytical evaluation of distances between spatio-temporal BRDF parameters. We show that even polynomials of low degree are sufficient to represent various time-varying phenomena and provide more accurate results than the previously proposed representation. We are the first to apply example-based texture synthesis to abstract structures such as polynomial functions. We present two applications of synthesizing spatio-temporal BRDFs using our method: material enlargement and transfer of a time-varying phenomenon from an example to a given static material. We evaluated the synthesized BRDFs in the context of realistic rendering and real-time rendering.
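
As a rough illustration of the polynomial representation described above, the sketch below fits one time-varying BRDF parameter with a low-degree polynomial and evaluates a closed-form distance between two such polynomials. The choice of the L2 distance over normalized time t in [0, 1], and the names `fit_parameter` and `poly_distance`, are assumptions; the abstract does not state which distance the method uses.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_parameter(t: np.ndarray, values: np.ndarray, degree: int = 2) -> np.ndarray:
    """Least-squares polynomial fit; returns coefficients in ascending order."""
    return P.polyfit(t, values, degree)


def poly_distance(a: np.ndarray, b: np.ndarray) -> float:
    """L2 distance between two polynomials over t in [0, 1], in closed form.

    a, b: ascending coefficient arrays of equal length.
    Uses: integral of t**i * t**j over [0, 1] = 1 / (i + j + 1).
    """
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    idx = np.arange(d.size)
    gram = 1.0 / (idx[:, None] + idx[None, :] + 1)
    # Clamp at zero to absorb tiny negative values caused by rounding.
    return float(np.sqrt(max(d @ gram @ d, 0.0)))


# Toy measurement: a parameter that grows linearly over time, 1 + 2t.
t = np.linspace(0.0, 1.0, 5)
coeffs = fit_parameter(t, 1.0 + 2.0 * t)

# Distance between p(t) = t and q(t) = 0 is sqrt(1/3).
dist = poly_distance(np.array([0.0, 1.0]), np.array([0.0, 0.0]))
```

Because the distance depends only on the coefficient vectors, no sampling over time is needed when comparing candidate patches during synthesis.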

4D Hyperspectral Photoacoustic Data Restoration With Reliability Analysis

Multi-View 3D Reconstruction of a Texture-Less Smooth Surface of Unknown Generic Reflectance

Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes

Underwater Scene Recovery Using Wavelength-Dependent Refraction of Light

Imaging Scattering Characteristics of Tissue in Transmitted Microscopy

ArtPDGAN: Creating Artistic Pencil Drawing with Key Map Using Generative Adversarial Networks

[Extended] Shape from Water: Bispectral Light Absorption for Depth Recovery

City-Scale Distance Sensing via Bispectral Light Extinction in Bad Weather

A Microfacet-Based Model for Photometric Stereo with General Isotropic Reflectance

[Extended] Wetness and Color from A Single Multispectral Image

Non-Local Intrinsic Decomposition With Near-Infrared Priors

A Data-Driven Approach for Direct and Global Component Separation from a Single Image

Polarimetric Three-View Geometry

Coded Illumination and Imaging for Fluorescence Based Classification

Variable Ring Light Imaging: Capturing Transient Subsurface Scattering with an Ordinary Camera

Deeply Learned Filter Response Functions for Hyperspectral Reconstruction

From RGB to Spectrum for Natural Scenes via Manifold-Based Mapping

A Microfacet-Based Reflectance Model for Photometric Stereo with Highly Specular Surfaces

Separation of Transmitted Light and Scattering Components in Transmitted Microscopy

Visibility enhancement of fluorescent substance under ambient illumination using flash photography

Light transport component decomposition using multi-frequency illumination

Direct and global component separation from a single image using basis representation

Spectral Reflectance Recovery with Interreflection Using a Hyperspectral Image