Computational Imaging
and Vision Laboratory


Reliability-Aware Restoration Framework for 4D Spectral Photoacoustic Data

Spectral photoacoustic imaging (PAI) is an emerging technology that non-invasively provides the 3D geometric structure of a target's interior together with 1D wavelength-dependent absorption information. It has potentially broad applications in clinical and medical diagnosis. Unfortunately, the usability of spectral PAI is severely affected by a time-consuming data scanning process and complex noise. Therefore, in this study, we propose a reliability-aware restoration framework to recover clean 4D data from incomplete and noisy observations. To the best of our knowledge, this is the first attempt at the 4D spectral PA data restoration problem that solves data completion and denoising simultaneously. We first present a sequence of analyses, including modeling of data reliability in the depth and spectral domains, developing an adaptive correlation graph, and analyzing local patch orientation. On the basis of these analyses, we explore global sparsity and local self-similarity for restoration. We demonstrate the effectiveness of our proposed approach through experiments on real data captured from patients, where our approach outperforms state-of-the-art methods in both objective evaluation and subjective assessment.

>read more (PAMI 2023)
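
As a rough illustration of restoring clean data from incomplete, noisy observations, the sketch below runs a generic reliability-weighted singular-value-thresholding iteration on an unfolded data matrix. It is not the paper's algorithm (which combines graph-based global sparsity with local self-similarity); `mask`, `weight`, and `tau` are illustrative names.

```python
import numpy as np

def svt_restore(Y, mask, weight, tau=1.0, n_iters=100):
    """Generic low-rank restoration sketch via singular-value thresholding.

    Y: observed matrix (e.g., a 4D cube unfolded to 2D); mask: 1 where
    observed, 0 where missing; weight: per-entry reliability in [0, 1]
    (e.g., lower for deeper, noisier voxels). Hypothetical, simplified.
    """
    X = np.zeros_like(Y)
    for _ in range(n_iters):
        # Reliability-weighted data-fidelity step on observed entries only.
        Z = X + weight * mask * (Y - X)
        # Shrink singular values to promote a low-rank solution.
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        X = U @ (np.maximum(s - tau, 0.0)[:, None] * Vt)
    return X
```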

High-Fidelity Event-Radiance Recovery via Transient Event Frequency

High-fidelity radiance recovery plays a crucial role in scene information reconstruction and understanding. Conventional cameras suffer from limited sensitivity in dynamic range, bit depth, and spectral response. In this paper, we propose to use event cameras, whose bio-inspired silicon sensors are sensitive to radiance changes, to recover precise radiance values. We reveal that, under active lighting conditions, the transient frequency at which event signals are triggered reflects the radiance value linearly. Based on this, we propose a method to convert the high temporal resolution of event signals into precise radiance values, which in turn enables several capabilities in image analysis. We demonstrate the feasibility of recovering radiance values solely from the transient event frequency (TEF) through multiple experiments.

>read more (CVPR 2023)
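
For intuition, here is a minimal sketch of the core observation: per pixel, the transient event frequency is estimated from inter-event intervals and mapped linearly to radiance. The function and calibration constants `a` and `b` are hypothetical; the paper's pipeline is more involved.

```python
import numpy as np

def radiance_from_events(timestamps, a=1.0, b=0.0):
    """Estimate radiance at one pixel from its event timestamps (seconds).

    Assumes, per the paper's observation, that the frequency at which
    events are triggered under active lighting is linear in radiance.
    a, b are calibration constants (hypothetical).
    """
    t = np.sort(np.asarray(timestamps, dtype=float))
    if t.size < 2:
        return b  # too few events to estimate a frequency
    freq = 1.0 / np.mean(np.diff(t))  # transient event frequency (Hz)
    return a * freq + b
```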

Blur Interpolation Transformer for Real-World Motion from Blur

This paper studies the challenging problem of recovering motion from blur, also known as joint deblurring and interpolation or blur temporal super-resolution. The challenges are twofold: 1) current methods still leave considerable room for improvement in visual quality, even on synthetic datasets, and 2) they generalize poorly to real-world data. To this end, we propose a blur interpolation transformer (BiT) to effectively unravel the underlying temporal correlation encoded in blur. Based on multi-scale residual Swin transformer blocks, we introduce dual-end temporal supervision and temporally symmetric ensembling strategies to generate effective features for time-varying motion rendering. In addition, we design a hybrid camera system to collect the first real-world dataset of one-to-many blur-sharp video pairs. Experimental results show that BiT achieves a significant gain over state-of-the-art methods on the public dataset Adobe240. Moreover, the proposed real-world dataset effectively helps the model generalize to real blurry scenarios. Code and data are available at https://github.com/zzh-tech/BiT.

>read more (CVPR 2023)

Bringing Rolling Shutter Images Alive with Dual Reversed Distortion

Rolling shutter (RS) distortion can be interpreted as the result of picking a row of pixels from instant global shutter (GS) frames over time during the exposure of the RS camera. This means that the information of each instant GS frame is partially, yet sequentially, embedded into the row-dependent distortion. Inspired by this fact, we address the challenging task of reversing this process, i.e., extracting undistorted GS frames from images suffering from RS distortion. However, since RS distortion is coupled with other factors such as readout settings and the relative velocity of scene elements to the camera, models that only exploit the geometric correlation between temporally adjacent images suffer from poor generality in processing data with different readout settings and dynamic scenes with both camera motion and object motion. In this paper, instead of two consecutive frames, we propose to exploit a pair of images captured by dual RS cameras with reversed RS directions for this highly challenging task. Grounded on the symmetric and complementary nature of dual reversed distortion, we develop a novel end-to-end model, IFED, to generate dual optical flow sequences through iterative learning of the velocity field during the RS time. Extensive experimental results demonstrate that IFED is superior to naive cascade schemes, as well as to the state-of-the-art method that utilizes adjacent RS images. Most importantly, although it is trained on a synthetic dataset, IFED is shown to be effective at retrieving GS frame sequences from real-world RS distorted images of dynamic scenes. Code is available at https://github.com/zzh-tech/Dual-Reversed-RS.

>read more (ECCV 2022)
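
The forward model in the first sentence, where each row of an RS image is picked from a different instant GS frame, can be made concrete with a toy simulator. The sketch below (illustrative names, assuming one GS frame per scanline) synthesizes a dual reversed-RS pair from a GS frame stack; it is for intuition only, not the paper's data pipeline.

```python
import numpy as np

def dual_reversed_rs_pair(gs_frames):
    """Synthesize a dual reversed-RS image pair from a GS frame stack.

    gs_frames: (T, H, W, C) array, one instant GS frame per readout slot;
    this toy model assumes T == H. The top-to-bottom image takes row r
    from frame r; the bottom-to-top image takes row H-1-r from frame r.
    """
    T, H, W, C = gs_frames.shape
    assert T == H, "toy model: one GS frame per scanline"
    top_down = np.stack([gs_frames[r, r] for r in range(H)], axis=0)
    # Rows are filled in reversed scan order; flip back to image coordinates.
    bottom_up = np.stack([gs_frames[r, H - 1 - r] for r in range(H)], axis=0)[::-1]
    return top_down, bottom_up
```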

Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance

We study the challenging problem of recovering detailed motion from a single motion-blurred image. Existing solutions to this problem estimate a single image sequence without considering the motion ambiguity in each region. Therefore, the results tend to converge to the mean of the multi-modal possibilities. In this paper, we explicitly account for such motion ambiguity, allowing us to generate multiple plausible solutions, all in sharp detail. The key idea is to introduce a motion guidance representation, which is a compact quantization of 2D optical flow with only four discrete motion directions. Conditioned on the motion guidance, the blur decomposition is guided toward a specific, unambiguous solution by a novel two-stage decomposition network. We propose a unified framework for blur decomposition, which supports various interfaces for generating our motion guidance, including human input, motion information from adjacent video frames, and learning from a video dataset. Extensive experiments on synthesized datasets and real-world data show that the proposed framework is qualitatively and quantitatively superior to previous methods, and also offers the merit of producing physically plausible and diverse solutions. Code is available at https://github.com/zzh-tech/Animation-from-Blur.

>read more (ECCV 2022)
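
A minimal sketch of the motion-guidance representation: a dense 2D flow field quantized into four discrete directions plus a static label. The binning and encoding used in the paper may differ; names and thresholds here are illustrative.

```python
import numpy as np

def quantize_flow(flow, min_mag=0.5):
    """Quantize a (H, W, 2) flow field of (dx, dy) into four directions.

    Returns an (H, W) integer map: 0..3 index the four axis-aligned
    directions (semantics depend on the image coordinate convention),
    and -1 marks near-static pixels (magnitude below min_mag).
    """
    dx, dy = flow[..., 0], flow[..., 1]
    mag = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx)                            # in (-pi, pi]
    labels = np.round(angle / (np.pi / 2)).astype(int) % 4
    labels[mag < min_mag] = -1
    return labels
```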

Real-World Video Deblurring: A Benchmark Dataset and an Efficient Recurrent Neural Network

Real-world video deblurring in real time remains a challenging task due to the complexity of spatially and temporally varying blur and the requirement of low computational cost. To improve network efficiency, we adopt residual dense blocks into RNN cells, so as to efficiently extract the spatial features of the current frame. Furthermore, a global spatio-temporal attention module is proposed to fuse effective hierarchical features from past and future frames to help better deblur the current frame. Another issue that needs to be addressed urgently is the lack of a real-world benchmark dataset. Thus, we contribute a novel dataset (BSD) to the community, collecting paired blurry/sharp video clips using a co-axis beam splitter acquisition system. Experimental results show that the proposed method (ESTRNN) achieves better deblurring performance both quantitatively and qualitatively with lower computational cost compared with state-of-the-art video deblurring methods. In addition, cross-validation experiments between datasets illustrate the high generality of BSD over synthetic datasets. The code and dataset are released at https://github.com/zzh-tech/ESTRNN.

>read more (IJCV 2023)

Unsupervised Deep Non-rigid Alignment by Low-Rank Loss and Multi-input Attention

We propose a deep low-rank alignment network that can simultaneously perform non-rigid alignment and noise decomposition for multiple images despite severe noise and sparse corruptions. To address this challenging task, we introduce a low-rank loss into deep learning under the assumption that a set of well-aligned, well-denoised images should be linearly correlated, and thus that a matrix consisting of the images should be low-rank. This allows us to remove the noise and corruption from input images in a self-supervised manner (i.e., without requiring supervised data). In addition, we introduce multi-input attention modules into Siamese U-nets in order to aggregate the corruption information from the set of images. To the best of our knowledge, this is the first attempt to introduce a low-rank loss for deep learning-based non-rigid alignment. Experiments using both synthetic data and real medical image data demonstrate the effectiveness of the proposed method. The code will be publicly available at https://github.com/asanomitakanori/Unsupervised-Deep-Non-Rigid-Alignment-by-Low-Rank-Loss-and-Multi-Input-Attention.

>read more (MICCAI 2022)
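
The low-rank loss rests on the stated assumption that well-aligned, well-denoised images stacked as rows of a matrix should be linearly correlated. A minimal PyTorch sketch using the nuclear norm, the standard convex surrogate for rank, might look like this; the paper's exact loss may differ in weighting and normalization.

```python
import torch

def low_rank_loss(images):
    """Nuclear-norm loss over a stack of images.

    images: tensor of shape (N, H, W); each image is flattened into one
    row. Minimizing the sum of singular values pushes the stacked matrix
    toward low rank, i.e., toward linearly correlated images.
    """
    mat = images.reshape(images.shape[0], -1)        # (N, H*W)
    return torch.linalg.matrix_norm(mat, ord="nuc")  # sum of singular values
```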

Graph-Based Compression of Incomplete 3D Photoacoustic Data

Photoacoustic imaging (PAI) is an emerging bimodal imaging technology based on the photoacoustic effect; specifically, it uses sound waves caused by light absorption in a material to obtain 3D structure data noninvasively. PAI has attracted attention as a promising measurement technology for comprehensive clinical application and medical diagnosis. Because it requires exhaustively scanning an entire object and recording ultrasonic waves from various locations, it encounters two problems: a long imaging time and a huge data size. To reduce the imaging time, a common solution is to apply compressive sensing (CS) theory. CS can effectively accelerate the imaging process by reducing the number of measurements, but the data size is still large, and efficient compression of such incomplete data remains a problem. In this paper, we present the first attempt at direct compression of incomplete 3D PA observations, which simultaneously reduces the data acquisition time and alleviates the data size issue. Specifically, we first use a graph model to represent the incomplete observations. Then, we propose three coding modes and a reliability-aware rate-distortion optimization (RDO) to adaptively compress the data into sparse coefficients. Finally, we obtain a coded bit stream through entropy coding. We demonstrate the effectiveness of our proposed framework through both objective evaluation and subjective visual inspection of real medical PA data captured from patients.

>read more (MICCAI 2022)
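
For intuition about the graph model, the sketch below projects a signal defined on the observed (incomplete) sample locations onto the eigenbasis of a graph Laplacian, i.e., a graph Fourier transform; smooth signals concentrate their energy in few coefficients, which is what makes them compressible. This is a generic illustration, not the paper's coding modes or reliability-aware RDO.

```python
import numpy as np

def graph_transform_coefficients(signal, W):
    """Graph Fourier transform of a signal on observed sample locations.

    W: symmetric non-negative adjacency matrix over the observed samples;
    signal: values at those samples. Returns the transform coefficients.
    """
    L = np.diag(W.sum(axis=1)) - W   # combinatorial graph Laplacian
    _, U = np.linalg.eigh(L)         # eigenvectors form the graph Fourier basis
    return U.T @ signal              # sparse for smooth signals
```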

Diffeomorphic Neural Surface Parameterization for 3D and Reflectance Acquisition

This paper proposes a simple method that solves the problem of multi-view 3D reconstruction for objects with unknown and generic surface materials, imaged by a freely moving camera and lit by a freely moving point light source. The object can have arbitrary (diffuse or specular) and spatially-varying surface reflectance. Our solution consists of two small-sized neural networks (dubbed the ‘Shape-Net’ and ‘BRDF-Net’), used to parameterize the unknown shape and material map as functions on a canonical surface (e.g., the unit sphere). Key to our method is a velocity field shape representation that drives the canonical surface to the target shape through time. We show this parameterization can be implemented as a recurrent residual network that is guaranteed to be diffeomorphic and orientation-preserving. Our method yields an exceptionally clean formulation that can be optimized by standard gradient descent without initialization, and works with both near-field and distant light sources. Synthetic and real experiments demonstrate the reliability and accuracy of our reconstructions, with extensions including novel view synthesis, relighting, and material retouching done with ease. Our source code is available at https://github.com/za-cheng/DNS.

>read more (SIGGRAPH 2022)

[Extended] Depth from Spectral Defocus Blur

This paper proposes a method for depth estimation from a single multispectral image by using a lens property known as chromatic aberration. Chromatic aberration causes light passing through the lens to be refracted by an amount that depends on its wavelength. Consequently, rays of different wavelengths focus at different focal lengths, which leads to wavelength-dependent defocus blur. We show that chromatic aberration provides cues for recovering depth maps from a single multispectral image under the assumption that the defocus blur is Gaussian. The proposed method needs only a standard wide-aperture lens, which naturally exhibits chromatic aberration, and a multispectral camera. Moreover, we use a simple yet effective depth-of-field synthesis method to obtain the all-in-focus images necessary to approximate the spectral derivatives. We verified the effectiveness of the proposed method on various real-world scenes.

>read more (ICIP 2019)

>read more (JOSA 2021)
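
For intuition, the standard thin-lens relation below (a textbook relation, not taken from the paper; symbols are illustrative) shows how a wavelength-dependent focal length turns depth into wavelength-dependent defocus blur:

```latex
% With aperture A, wavelength-dependent focal length f(\lambda), and the
% lens focused at depth d_f, a point at depth d is imaged with a blur
% circle of diameter
\[
  c(\lambda, d) \;=\; A \,
    \frac{f(\lambda)}{d_f - f(\lambda)} \,
    \frac{\lvert d - d_f \rvert}{d},
\]
% so comparing the (Gaussian-approximated) blur widths of the same edge
% across spectral bands constrains the depth d.
```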

Spatio-temporal BRDF: Modeling and synthesis

We propose a generalization of example-based texture synthesis to spatio-temporal BRDFs. A key component of our method is a novel representation of time-varying materials using polynomials describing time-varying BRDF parameters. Our representation allows efficient fitting of measured data into a compact spatio-temporal BRDF representation, and it allows an efficient analytical evaluation of distances between spatio-temporal BRDF parameters. We show that even polynomials of low degree are sufficient to represent various time-varying phenomena and provide more accurate results than the previously proposed representation. We are the first to apply example-based texture synthesis to abstract structures such as polynomial functions. We present two applications of synthesizing spatio-temporal BRDFs using our method: material enlargement and transfer of a time-varying phenomenon from an example to a given static material. We evaluated the synthesized BRDFs in the context of realistic rendering and real-time rendering.

>read more
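
A minimal sketch of the representation: each time-varying BRDF parameter at a texel is fitted with a low-degree polynomial, so evaluating the polynomial reconstructs the parameter at any time, and distances between texels can be computed analytically from the coefficients. Function and parameter names are illustrative.

```python
import numpy as np

def fit_time_varying_parameter(times, values, degree=3):
    """Fit one time-varying BRDF parameter (e.g., roughness samples over
    time) with a low-degree polynomial and return its coefficients."""
    return np.polyfit(times, values, degree)

# Usage: store coefficients per texel as the compact spatio-temporal BRDF;
# reconstruct with np.polyval(coeffs, t) at any time t.
```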

4D Hyperspectral Photoacoustic Data Restoration With Reliability Analysis

Hyperspectral photoacoustic (HSPA) spectroscopy is an emerging bi-modal imaging technology that is able to show the wavelength-dependent absorption distribution of the interior of a 3D volume. However, HSPA devices have to scan an object exhaustively in the spatial and spectral domains, and the acquired data tend to suffer from complex noise. This time-consuming scanning process and the noise severely affect the usability of HSPA. It is therefore critical to examine the feasibility of 4D HSPA data restoration from an incomplete and noisy observation. In this work, we present a data reliability analysis for the depth and spectral domains. On the basis of this analysis, we explore the inherent data correlations and develop a restoration algorithm to recover 4D HSPA cubes. Experiments on real data verify that the proposed method achieves satisfactory restoration results.

>read more

Multi-View 3D Reconstruction of a Texture-Less Smooth Surface of Unknown Generic Reflectance

Recovering the 3D geometry of a purely texture-less object with generally unknown surface reflectance (e.g., non-Lambertian) is regarded as a challenging task in multi-view reconstruction. The major obstacle revolves around establishing cross-view correspondences where photometric constancy is violated. This paper proposes a simple and practical solution to overcome this challenge based on a co-located camera-light scanner device. Unlike existing solutions, we do not explicitly solve for correspondence. Instead, we argue the problem is generally well-posed by multi-view geometrical and photometric constraints, and can be solved from a small number of input views. We formulate the reconstruction task as a joint energy minimization over the surface geometry and reflectance. Although this energy is highly non-convex, we develop an optimization algorithm that robustly recovers globally optimal shape and reflectance even from a random initialization. Extensive experiments on both simulated and real data have validated our method, and possible future extensions are discussed.

>read more

Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes

Joint rolling shutter correction and deblurring (RSCD) techniques are critical for the prevalent CMOS cameras. However, current approaches are still based on conventional energy optimization and are developed for static scenes. To enable learning-based approaches to address the real-world RSCD problem, we contribute the first dataset, BS-RSCD, which includes both ego-motion and object-motion in dynamic scenes. Real distorted and blurry videos with corresponding ground truth are recorded simultaneously via a beam-splitter-based acquisition system. Since directly applying existing individual rolling shutter correction (RSC) or global shutter deblurring (GSD) methods to RSCD leads to undesirable results due to inherent flaws in their network architectures, we further present the first learning-based model (JCD) for RSCD. The key idea is that we adopt bi-directional warping streams for displacement compensation, while also preserving the non-warped deblurring stream for detail restoration. The experimental results demonstrate that JCD achieves state-of-the-art performance on the realistic RSCD dataset (BS-RSCD) and the synthetic RSC dataset (Fastec-RS). The dataset and code are available at https://github.com/zzh-tech/RSCD.

>read more

Underwater Scene Recovery Using Wavelength-Dependent Refraction of Light

This paper proposes a method of underwater depth estimation from an orthographic multispectral image. In accordance with Snell's law, incoming light is refracted when it enters the water surface, with the refracted direction determined by the refractive index and the normal of the water surface. The refractive index is wavelength-dependent, and this leads to some disparity between images taken at different wavelengths. Given the camera orientation and the refractive index of a medium such as water, our approach can reconstruct the underwater scene with an unknown water surface from the disparity observed in images taken at different wavelengths. We verified the effectiveness of our method through simulations and real experiments on various scenes.

>read more
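
The geometry rests on Snell's law with a wavelength-dependent refractive index; a minimal sketch follows (illustrative names; the index values in the comment are approximate).

```python
import numpy as np

def refracted_angle(theta_i, n_lambda):
    """Snell's law at the water surface: sin(theta_i) = n(lambda) sin(theta_t).

    theta_i: incident angle from the surface normal (radians, in air);
    n_lambda: refractive index of water at the chosen wavelength.
    """
    return np.arcsin(np.sin(theta_i) / n_lambda)

# Water's index is slightly higher for blue than for red light (roughly
# 1.340 at 450 nm vs. 1.331 at 650 nm), so the same incident ray refracts
# to slightly different angles per wavelength, producing the disparity
# between spectral bands that the method exploits.
```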

Imaging Scattering Characteristics of Tissue in Transmitted Microscopy

The scattering property of tissue plays an important role in optical imaging and diagnostic applications, such as the analysis of cancerous processes and the diagnosis of dysplasia or cancer. Existing methods focus on removing scattering components in order to visualize the spatial distribution of reflection and absorption properties. We propose a novel method for estimating the spatial distribution of the scattering property by measuring the intensities of directly scattered light at each angle for each point. Our key contribution is to decompose the captured light into directly scattered light at each angle by varying the spatial frequency of illumination patterns, which controls the range of scattering angles. By applying the method to a spatially inhomogeneous translucent object, we can extract a map of the angular distribution of scattering. To the best of our knowledge, this is the first method that enables visualizing a spatial map of the scattering property using a conventional transmitted-light microscope setup. Experimental results on synthetic data and real complex materials demonstrate the effectiveness of our method for the estimation of scattering distributions.

>read more

ArtPDGAN: Creating Artistic Pencil Drawing with Key Map Using Generative Adversarial Networks

Much research has focused on image transfer using deep learning, especially with generative adversarial networks (GANs). However, no existing method can produce high-quality artistic pencil drawings. First, artists do not transfer every detail of a photo into a drawing; instead, they tend to emphasize some special parts of the subject and tone others down. Second, the elements in artistic drawings may not be located precisely, and the lines may not strictly follow the features of the subject. To address these challenges, we propose ArtPDGAN, a novel GAN-based framework that combines an image-to-image network to generate a key map, which is then used as an important part of the input to generate artistic pencil drawings. The key map highlights the key parts of the subject to guide the generator. We train ArtPDGAN on a paired but unaligned artistic drawing dataset containing high-resolution photos and the corresponding professional artistic pencil drawings. Experimental results show that the proposed framework outperforms existing methods in terms of both similarity to artists' work and user evaluations.

>read more

[Extended] Shape from Water: Bispectral Light Absorption for Depth Recovery

This paper introduces a novel depth recovery method based on light absorption in water. Water absorbs light at almost all wavelengths, with an absorption coefficient that depends on the wavelength. Based on the Beer-Lambert model, we introduce a bispectral depth recovery method that leverages the light absorption difference between two near-infrared wavelengths captured with a distant point source and orthographic cameras. Through extensive analysis, we show that accurate depth can be recovered irrespective of the surface texture and reflectance, and introduce algorithms to correct for non-idealities of a practical implementation, including tilted light source and camera placement and non-ideal bandpass filters. We construct a coaxial bispectral depth imaging system using low-cost off-the-shelf hardware and demonstrate its use for recovering the shapes of complex and dynamic objects in water. Experimental results validate the theory and practical implementation of this novel depth recovery paradigm, which we refer to as shape from water.

>read more (ECCV 2016)

>read more (PAMI 2020)
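
The core relation can be sketched from the Beer-Lambert model: taking the ratio of the two near-infrared observations cancels the unknown reflectance, leaving depth in closed form. The symbols below are illustrative, and the paper additionally corrects for practical non-idealities.

```latex
% Observations at two NIR wavelengths with source intensities L_k,
% reflectance \rho (assumed equal at both wavelengths), absorption
% coefficients \alpha_k, and water path length (depth) d:
%   I_k = L_k \, \rho \, e^{-\alpha_k d}, \qquad k = 1, 2.
% The ratio cancels \rho, giving depth in closed form:
\[
  d \;=\; \frac{1}{\alpha_1 - \alpha_2}
          \ln\!\left( \frac{L_1 \, I_2}{L_2 \, I_1} \right).
\]
```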

City-Scale Distance Sensing via Bispectral Light Extinction in Bad Weather

In this paper, we propose a novel city-scale distance sensing algorithm based on atmospheric optics. Suspended particles, especially in bad weather, attenuate light at almost all wavelengths. Observing this fact and starting from the light scattering mechanism, we derive a bispectral distance sensing algorithm that leverages the difference in extinction coefficient between two specifically selected near-infrared wavelengths. The extinction coefficient of the atmosphere depends on both the wavelength and the meteorological conditions, i.e., the visibility, for example on fog and haze days. To account for different bad weather conditions, we explicitly introduce visibility into our algorithm by incorporating it into the calculation of the extinction coefficient, making our algorithm simple yet effective. To capture the data, we build a bispectral imaging system that takes a pair of images with a monochrome camera and two narrow band-pass filters. We also present a wavelength selection strategy that allows us to accurately sense distance regardless of material reflectance and texture. Specifically, this strategy determines two distinct near-infrared wavelengths by maximizing the extinction coefficient difference while minimizing the influence of variations in building reflectance. Experiments empirically validate our model and its practical performance on distance sensing for city-scale buildings.

>read more

A Microfacet-Based Model for Photometric Stereo with General Isotropic Reflectance

This paper presents a precise, stable, and invertible reflectance model for photometric stereo. This microfacet-based model is applicable to all types of isotropic surface reflectance, covering cases from diffuse to specular reflection. We introduce a single variable to physically quantify the surface smoothness, and by monotonically sliding this variable between 0 and 1, our model enables a versatile representation that can smoothly transform between an ellipsoid of revolution and the equation for Lambertian reflectance. In the inverse domain, this model offers a compact and physically interpretable formulation, for which we introduce a fast and lightweight solver that allows accurate estimation of both surface smoothness and surface shape. Finally, extensive experiments on the appearances of synthesized and real objects show that this model, together with our off-the-shelf solver, achieves state-of-the-art performance.

>read more

[Extended] Wetness and Color from A Single Multispectral Image

Visual recognition of wet surfaces and their degrees of wetness is important for many computer vision applications. It can alert autonomous vehicles to slippery spots on a road, humanoid robots to muddy areas of a trail, and inform us of the freshness of groceries. In the past, monochromatic appearance change, i.e., the fact that surfaces darken when wet, has been modeled to recognize wet surfaces. In this paper, we show that color change, particularly in its spectral behavior, carries rich information about a wet surface. We derive an analytical spectral appearance model of wet surfaces that expresses the characteristic spectral sharpening due to multiple scattering and absorption in the surface. We derive a novel method for estimating key parameters of this spectral appearance model, which enables the recovery of the original surface color and the degree of wetness from a single observation. Applied to a multispectral image, the method estimates the spatial map of wetness together with the dry spectral distribution of the surface. To our knowledge, this work is the first to model and leverage the spectral characteristics of wet surfaces to invert their appearance. We conduct comprehensive experimental validation with a number of wet real surfaces. The results demonstrate the accuracy of our model and the effectiveness of our method for surface wetness and color estimation.

>read more (CVPR 2017)

>read more (PAMI 2019)

Non-Local Intrinsic Decomposition With Near-Infrared Priors

Intrinsic image decomposition is a highly under-constrained problem that has been extensively studied by computer vision researchers. Previous methods impose additional constraints by exploiting either empirical or data-driven priors. In this paper, we revisit intrinsic image decomposition with the aid of near-infrared (NIR) imagery. We show that the NIR band is considerably less sensitive to textures and can be exploited to reduce ambiguity caused by reflectance variation, providing a simple yet powerful prior for shading smoothness. With this observation, we formulate intrinsic decomposition as an energy minimisation problem. Unlike existing methods, our energy formulation decouples reflectance and shading estimation into a convex local shading component based on a NIR-RGB image pair, and a reflectance component that encourages reflectance homogeneity both locally and globally. We further show that the minimisation process can be approached by a series of multi-dimensional kernel convolutions, each within linear time complexity. To validate the proposed algorithm, a NIR-RGB dataset was captured of real-world objects, where our NIR-assisted approach demonstrates clear superiority over RGB methods.

>read more

A Data-Driven Approach for Direct and Global Component Separation from a Single Image

The radiance captured by a camera is often influenced by both direct and global illumination from a complex environment. Though separating them is highly desired, existing methods require strict capture conditions such as modulated active lighting. Here, we propose the first method to infer both components from a single image without any hardware restriction. Our method is a novel generative adversarial network (GAN) that imposes prior physical knowledge to enforce a physically plausible component separation. We also present the first component separation dataset, which comprises 100 scenes with their direct and global components. In the experiments, our method achieved satisfactory performance on our own testing set and on images from public datasets. Finally, we illustrate an interesting application of editing realistic images through the separated components.

>read more

Polarimetric Three-View Geometry

This paper theorizes the connection between polarization and three-view geometry. It presents a ubiquitous polarization-induced constraint that regulates the relative pose of a system of three cameras. We demonstrate that, in a multi-view system, the polarization phase obtained for a surface point is induced from one of two pencils of planes: one by specular reflections with its axis aligned with the incident light, and one by diffusive reflections with its axis aligned with the surface normal. Differing from traditional three-view geometry, we show that this constraint directly encodes camera rotation and projection, and is independent of camera translation. In theory, six polarized diffusive point-point-point correspondences suffice to determine the camera rotations. In practice, a cross-validation mechanism using correspondences of specularities can effectively resolve the ambiguities caused by mixed polarization. Experiments on real-world scenes validate our proposed theory.

>read more (ECCV 2018)

Coded Illumination and Imaging for Fluorescence Based Classification

The quick detection of specific substances in objects such as produce items via non-destructive visual cues is vital to ensuring the quality and safety of consumer products. At the same time, it is well known that the fluorescence excitation-emission characteristics of many organic objects can serve as a kind of “fingerprint” for detecting the presence of specific substances in classification tasks such as determining if something is safe to consume. However, conventional capture of the fluorescence excitation-emission matrix can take on the order of minutes and can only be done for point measurements. In this paper, we propose a coded illumination approach whereby light spectra are learned such that key visual fluorescent features can be easily seen for material classification. We show that under a single coded illuminant, we can capture one RGB image and perform pixel-level classification of materials at high accuracy. This is demonstrated through effective classification of different types of honey and alcohol using real images.

>read more (ECCV 2018)

Variable Ring Light Imaging: Capturing Transient Subsurface Scattering with an Ordinary Camera

Subsurface scattering plays a significant role in determining the appearance of real-world surfaces. A light ray penetrating into the subsurface is repeatedly scattered and absorbed by particles along its path before reemerging from the outer interface, which determines its spectral radiance. We introduce a novel imaging method that enables the decomposition of the appearance of a fronto-parallel real-world surface into images of light with bounded path lengths, i.e., transient subsurface light transport. Our key idea is to observe each surface point under a variable ring light: a circular illumination pattern of increasingly larger radius centered on it. We show that the path length of light captured in each of these observations is naturally lower-bounded by the ring light radius. By taking the difference of ring light images of incrementally larger radii, we compute transient images that encode light with bounded path lengths. Experimental results on synthetic and complex real-world surfaces demonstrate that the recovered transient images reveal the subsurface structure of general translucent inhomogeneous surfaces. We further show that their differences reveal the surface colors at different surface depths. The proposed method is the first to enable the unveiling of dense and continuous subsurface structures from steady-state external appearance using an ordinary camera and illumination.

>read more (ECCV 2018)
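
A minimal sketch of the differencing step: since the path length of captured light is lower-bounded by the ring radius, subtracting the observation at the next-larger radius isolates light within a bounded path-length band. Names are illustrative and the bookkeeping is simplified.

```python
import numpy as np

def transient_images(ring_images, radii):
    """Difference ring-light observations to get bounded-path-length images.

    ring_images[i]: image captured under a ring illuminant of radius
    radii[i] (strictly increasing). The i-th returned image encodes light
    whose path length lies between the bounds set by radii[i] and radii[i+1].
    """
    assert all(r1 < r2 for r1, r2 in zip(radii, radii[1:]))
    return [a - b for a, b in zip(ring_images[:-1], ring_images[1:])]
```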

Deeply Learned Filter Response Functions for Hyperspectral Reconstruction

Hyperspectral reconstruction from RGB imaging has recently achieved significant progress via sparse coding and deep learning. However, a largely ignored fact is that existing RGB cameras are tuned to mimic human trichromatic perception, so their spectral responses are not necessarily optimal for hyperspectral reconstruction. In this paper, rather than using RGB spectral responses, we simultaneously learn optimized camera spectral response functions (to be implemented in hardware) and a mapping for spectral reconstruction by using an end-to-end network. Our core idea is that since camera spectral filters act in effect like a convolution layer, their response functions can be optimized by training standard neural networks. We propose two types of designed filters: a three-chip setup without spatial mosaicing and a single-chip setup with a Bayer-style 2x2 filter array. Numerical simulations verify the advantages of deeply learned spectral responses compared to existing RGB cameras. More interestingly, by considering physical restrictions in the design process, we are able to realize the deeply learned spectral response functions using modern film filter production technologies, and thus construct data-inspired multispectral cameras for snapshot hyperspectral imaging.

>read more (CVPR 2018)
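
The idea that camera spectral filters act in effect like a convolution layer can be sketched in PyTorch as a learnable 1x1 convolution with non-negative weights mapping S spectral bands to three filter responses; the reconstruction network trained jointly with it is omitted. A simplified sketch, not the paper's implementation.

```python
import torch
import torch.nn as nn

class LearnedFilters(nn.Module):
    """Camera spectral responses as a learnable layer (sketch).

    Input: hyperspectral image of shape (B, S, H, W) with S bands; the
    1x1 convolution plays the role of the physical color filters.
    """
    def __init__(self, num_bands=31, num_filters=3):
        super().__init__()
        self.conv = nn.Conv2d(num_bands, num_filters, kernel_size=1, bias=False)

    def forward(self, x):
        with torch.no_grad():
            # Physical filter transmittances cannot be negative.
            self.conv.weight.clamp_(min=0.0)
        return self.conv(x)
```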

From RGB to Spectrum for Natural Scenes via Manifold-Based Mapping

Spectral analysis of natural scenes can provide much more detailed information about the scene than an ordinary RGB camera. The richer information provided by hyperspectral images has been beneficial to numerous applications, such as understanding natural environmental changes and classifying plants and soils in agriculture based on their spectral properties. In this paper, we present an efficient manifold learning based method for accurately reconstructing a hyperspectral image from a single RGB image captured by a commercial camera with known spectral response. By applying a nonlinear dimensionality reduction technique to a large set of natural spectra, we show that the spectra of natural scenes lie on an intrinsically low-dimensional manifold. This allows us to map an RGB vector to its corresponding hyperspectral vector accurately via our proposed novel manifold-based reconstruction pipeline. Experiments using both RGB images synthesized from hyperspectral datasets and real-world data demonstrate that our method outperforms the state-of-the-art.

>read more (ICCV 2017)

A Microfacet-Based Reflectance Model for Photometric Stereo with Highly Specular Surfaces

A precise, stable and invertible model for surface reflectance is the key to the success of photometric stereo with real-world materials. Recent developments in the field have enabled shape recovery techniques for surfaces of various types, but an effective solution to directly estimating the surface normal in the presence of highly specular reflectance remains elusive. In this paper, we derive an analytical isotropic microfacet-based reflectance model, based on which a physically interpretable approximation is tailored for highly specular surfaces. With this approximation, we identify the equivalence between the surface recovery problem and the ellipsoid-of-revolution fitting problem, where the latter can be described as a system of polynomials. Additionally, we devise a fast, non-iterative and globally optimal solver for this problem. Experimental results on both synthetic and real images validate our model and demonstrate that our solution can stably deliver superior performance in its targeted application domain.

>read more (ICCV 2017)

Separation of Transmitted Light and Scattering Components in Transmitted Microscopy

In transmitted light microscopy, a specimen tends to appear unclear. This is because the image sensor captures the sum of scattered light rays that have traveled along different paths. To cope with this problem, we propose a novel computational photography approach for separating directly transmitted light from scattered light in a transmitted light microscope by using high-frequency lighting. We first investigated light paths and clarified which types of light overlap in transmitted light microscopy. The scattered light can be simply represented and removed by using the difference in observations between focused and unfocused conditions, where the high-frequency illumination becomes homogeneous. Our method makes a novel spatial multi-spectral absorption analysis possible, which requires absorption coefficients to be measured in each spectrum at each position. Experiments on real biological tissues demonstrated the effectiveness of our method.

>read more (MICCAI 2017)

Visibility enhancement of fluorescent substance under ambient illumination using flash photography

Many natural and man-made objects contain fluorescent substances. Visualizing the distribution of fluorescence-emitting substances is of great importance for food freshness examination, molecular dynamics analysis, and so on. Unfortunately, the presence of fluorescent substances is usually imperceptible under strong ambient illumination, since fluorescent emission is relatively weak compared with surface reflectance. Even if surface reflectance could somehow be blocked out, the shading effect on fluorescent emission, which relates to surface geometry, would still interfere with the visibility of fluorescent substances in the scene. In this paper, we propose a visibility enhancement method to better visualize the distribution of fluorescent substances under unknown and uncontrolled ambient illumination. Using an image pair captured under UV and visible flash illumination, we obtain a shading-free luminance image that visualizes the distribution of fluorescent emission. We further replace the luminance of the RGB image under ambient illumination with this fluorescent emission luminance, so as to obtain a fully colored image. The effectiveness of our method is verified by visualizing weak fluorescence from bacteria on rotting cheese and meat.

>read more (ICIP 2017)

Light transport component decomposition using multi-frequency illumination

Scene appearance is a mixture of light transport phenomena ranging from direct reflection to complicated effects such as inter-reflection and subsurface scattering. Decomposing scene appearance into meaningful photometric components is very helpful for scene understanding and image editing, but it has proven to be a difficult task. In this paper, we explore the difference between the direct components obtained under multi-frequency illumination for light transport component decomposition. We apply independent vector analysis (IVA) to this task with no fixed constraints. Experimental results verify the effectiveness of our method and its applicability to generic scenes.

>read more (ICIP 2017)

Direct and global component separation from a single image using basis representation

Previous research showed that the separation of direct and global components can be done with a single image by assuming that neighboring scene points have similar direct and global components, but this normally entails a loss of spatial resolution in the results. To tackle this problem, we present a novel approach for separating the direct and global components of a scene at full spatial resolution from a single captured image, which employs a linear basis representation to approximate the direct and global components. Due to the basis dependency of these two components, a high-frequency lighting pattern is utilized to modulate the frequency of the direct components, which effectively resolves the ambiguity between the basis representations of the direct and global components and contributes to robust separation results. The effectiveness of our approach is demonstrated on both simulated and real images captured by a standard off-the-shelf camera and a projector mounted in a coaxial system. Our results show better visual quality and lower error compared with those obtained by the conventional single-shot approach on both still and moving objects.

>read more (ACCV 2016)

Spectral Reflectance Recovery with Interreflection Using a Hyperspectral Image

The capture of scene spectral reflectance (SR) provides a wealth of information about the material properties of objects, and has proven useful for applications including classification, synthetic relighting, medical imaging, and more. Thus, many methods for SR capture have been proposed. While effective, past methods do not consider the effects of indirectly bounced light within the scene, and the SR estimated by traditional techniques is largely affected by interreflection. For example, different lighting directions can cause different SR estimates. On the other hand, past work has shown that accurate interreflection separation in hyperspectral images is possible, but the SR of all surface points needs to be known a priori. Thus, the estimation of SR and interreflection in its current form constitutes a chicken-and-egg dilemma. In this work, we propose the challenging and novel problem of simultaneously performing SR recovery and interreflection removal from a single hyperspectral image, and develop the first strategy to address it. Specifically, we model this problem using a compact sparsity-regularized nonnegative matrix factorization (NMF) formulation, and introduce a scalable optimization algorithm based on the alternating direction method of multipliers (ADMM). Our experiments have demonstrated its effectiveness on scenes with one or two reflectance colors, containing possibly concave surfaces that lead to interreflection.

>read more (ACCV 2016)
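
A generic sketch of the sparsity-regularized NMF at the heart of the model, using simple multiplicative updates rather than the scalable ADMM solver described in the paper; variable names are illustrative.

```python
import numpy as np

def sparse_nmf(V, rank, lam=0.1, n_iters=200, eps=1e-9):
    """Factor V ≈ W @ H with non-negative W, H and an L1 penalty on H.

    V could be a hyperspectral image unfolded to (bands, pixels); W then
    holds the spectral basis and the sparse H the per-pixel abundances.
    """
    m, n = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iters):
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)  # lam enforces sparsity on H
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```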