publications
Publications by category in reverse chronological order. Generated by jekyll-scholar.
2023
- RecolorNeRF: Layer Decomposed Radiance Fields for Efficient Color Editing of 3D Scenes. In Proceedings of the 31st ACM International Conference on Multimedia, 2023
Radiance fields have gradually become a mainstream representation of media. Although their appearance editing has been studied, how to achieve view-consistent recoloring in an efficient manner remains underexplored. We present RecolorNeRF, a novel user-friendly color editing approach for neural radiance fields. Our key idea is to decompose the scene into a set of pure-colored layers that form a palette. In this way, color manipulation can be conducted by directly altering the color components of the palette. To support efficient palette-based editing, the color of each layer needs to be as representative as possible. The problem is ultimately formulated as an optimization in which the layers and their blending weights are jointly optimized with the NeRF itself. Extensive experiments show that our jointly optimized layer decomposition works with multiple backbones and produces photo-realistic recolored novel-view renderings. We demonstrate that RecolorNeRF outperforms baseline methods both quantitatively and qualitatively for color editing, even in complex real-world scenes.
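A minimal sketch of the palette-based editing idea described above, assuming hypothetical arrays `palette` and `weights` (in the paper both are optimized jointly with the radiance field rather than set by hand): each rendered color is a convex combination of pure layer colors, so recoloring reduces to swapping a palette entry while the blending weights stay fixed.

```python
import numpy as np

# Hypothetical palette of K pure RGB colors in [0, 1]; illustrative values only.
palette = np.array([
    [0.85, 0.10, 0.10],   # "red" layer
    [0.10, 0.60, 0.15],   # "green" layer
    [0.95, 0.90, 0.80],   # "background" layer
])

# Per-sample blending weights (N, K); rows sum to one so each output color is
# a convex combination of the layer colors.
weights = np.array([
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
])

def composite(weights, palette):
    """Blend layer weights (N, K) with palette colors (K, 3) into RGB (N, 3)."""
    return weights @ palette

original = composite(weights, palette)

# Recoloring: edit one palette entry and re-composite; geometry and blending
# weights are untouched, which is what keeps the edit view-consistent.
edited_palette = palette.copy()
edited_palette[0] = [0.10, 0.20, 0.90]   # turn the red layer blue
recolored = composite(weights, edited_palette)
print(original, recolored, sep="\n")
```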
- SeamlessNeRF: Stitching Part NeRFs with Gradient Propagation. In SIGGRAPH Asia 2023 Conference Papers, 2023
Neural Radiance Fields (NeRFs) have emerged as promising digital media for 3D objects and scenes, sparking a surge in research to extend the editing capabilities in this domain. The task of seamless editing and merging of multiple NeRFs, resembling “Poisson blending” in 2D image editing, remains a critical operation that is under-explored by existing work. To fill this gap, we propose SeamlessNeRF, a novel approach for seamless appearance blending of multiple NeRFs. Specifically, we aim to optimize the appearance of a target radiance field in order to harmonize its merge with a source field. We propose a well-tailored optimization procedure for blending, constrained by 1) pinning the radiance color in the intersecting boundary area between the source and target fields and 2) maintaining the original gradient of the target. Extensive experiments validate that our approach can effectively propagate the source appearance from the boundary area to the entire target field through the gradients. To the best of our knowledge, SeamlessNeRF is the first work to introduce gradient-guided appearance editing to radiance fields, offering a solution for seamless stitching of 3D objects represented as NeRFs. Our code and more results are available at https://sites.google.com/view/seamlessnerf.
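The boundary-pinning and gradient-preservation constraints can be pictured with a toy 1D analogue (not the actual NeRF optimization; the variables, weights, and step sizes below are assumptions for the sketch): colors along a line are adjusted so the sample at the boundary matches the source color while the finite-difference gradient stays close to the target's original gradient.

```python
import numpy as np

n = 16
target = np.linspace(0.2, 0.8, n)     # original target appearance along a line
source_color = 1.0                    # source color at the shared boundary (index 0)
orig_grad = np.diff(target)           # gradient to preserve

c = target.copy()
lr, lam = 0.1, 1.0
for _ in range(5000):
    g = np.zeros_like(c)
    g[0] += 2.0 * (c[0] - source_color)       # boundary pinning term
    d = np.diff(c) - orig_grad                # gradient-preservation residual
    g[:-1] += -2.0 * lam * d
    g[1:] += 2.0 * lam * d
    c -= lr * g

# The optimized values keep the target's local variation but are shifted so
# the boundary matches the source, i.e. the source appearance propagates inward.
print(np.round(c[:5], 3))
```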
2022
- Structure-Aware Meta-Fusion for Image Super-Resolution. Haoyu Ma, Bingchen Gong, and Yizhou Yu. ACM Trans. Multimedia Comput. Commun. Appl., Feb 2022
There are two main categories of image super-resolution algorithms: distortion-oriented and perception-oriented. Recent evidence shows that reconstruction accuracy and perceptual quality are typically in disagreement with each other. In this article, we present a new image super-resolution framework that is capable of striking a balance between distortion and perception. The core of our framework is a deep fusion network capable of generating a final high-resolution image by fusing a pair of deterministic and stochastic images using spatially varying weights. To make a single fusion model produce images with varying degrees of stochasticity, we further incorporate meta-learning into our fusion network. Once equipped with the kernel produced by a kernel prediction module, our meta-fusion network is able to produce final images at any desired level of stochasticity. Experimental results indicate that our meta-fusion network outperforms existing state-of-the-art SISR algorithms on widely used datasets, including PIRM-val, DIV2K-val, Set5, Set14, Urban100, Manga109, and B100. In addition, it is capable of producing high-resolution images that achieve low distortion and high perceptual quality simultaneously.
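As a rough illustration of the fusion step (the blending arithmetic only; in the paper the weight map comes from the learned fusion network conditioned on the predicted kernel, and all names below are placeholders), the deterministic and stochastic images are combined per pixel with spatially varying weights:

```python
import numpy as np

H, W = 4, 4
deterministic = np.random.rand(H, W, 3)   # stands in for the regression branch output
stochastic = np.random.rand(H, W, 3)      # stands in for the GAN branch output

def fuse(det, sto, w):
    """Per-pixel convex combination; larger w favors the deterministic branch."""
    return w[..., None] * det + (1.0 - w[..., None]) * sto

# A global "stochasticity level" scaling the weight map mimics how a single
# meta-learned model can target different distortion/perception trade-offs.
base_w = np.random.rand(H, W)
for level in (0.0, 0.5, 1.0):
    fused = fuse(deterministic, stochastic, base_w * (1.0 - level))
    print(level, fused.shape)
```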
2021
- ME-PCN: Point Completion Conditioned on Mask Emptiness. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2021
Point completion refers to completing the missing geometries of an object from incomplete observations. Mainstream methods predict the missing shapes by decoding a global feature learned from the input point cloud, which often leads to deficient results in preserving topology consistency and surface details. In this work, we present ME-PCN, a point completion network that leverages ‘emptiness’ in 3D shape space. Given a single depth scan, previous methods often encode only the occupied partial shapes while ignoring the empty regions (e.g. holes) in depth maps. In contrast, we argue that these ‘emptiness’ clues indicate shape boundaries that can be used to improve topology representation and detail granularity on surfaces. Specifically, our ME-PCN encodes both the occupied point cloud and the neighboring ‘empty points’. It estimates coarse-grained but complete and reasonable surface points in the first stage, followed by a refinement stage that produces fine-grained surface details. Comprehensive experiments verify that ME-PCN performs better both qualitatively and quantitatively than the state-of-the-art. In addition, we show that our ‘emptiness’ design is lightweight and easy to embed in existing methods, where it consistently improves CD and EMD scores.
2018
- Image Super-Resolution via Deterministic-Stochastic Synthesis and Local Statistical Rectification. Weifeng Ge, Bingchen Gong, and Yizhou Yu. ACM Trans. Graph., Dec 2018
Single-image super-resolution has been a popular research topic in the last two decades and has recently received a new wave of interest due to deep neural networks. In this paper, we approach this problem from a different perspective. With respect to a downsampled low-resolution image, we model a high-resolution image as a combination of two components: a deterministic component and a stochastic component. The deterministic component can be recovered from the low-frequency signals in the downsampled image. The stochastic component, on the other hand, contains the signals that have little correlation with the low-resolution image. We adopt two complementary methods for generating these two components. While generative adversarial networks are used for the stochastic component, deterministic component reconstruction is formulated as a regression problem solved using deep neural networks. Since the deterministic component exhibits clearer local orientations, we design novel loss functions tailored to such properties for training the deep regression network. These two methods are first applied to the entire input image to produce two distinct high-resolution images. Afterwards, these two images are fused together using another deep neural network that also performs local statistical rectification, which tries to make the local statistics of the fused image match those of the ground-truth image. Quantitative results and a user study indicate that the proposed method outperforms existing state-of-the-art algorithms by a clear margin.
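A rough sketch of a local-statistics matching penalty in the spirit of the local statistical rectification described above (the paper's exact formulation may differ; the window size and weighting here are assumptions): per-patch means and standard deviations of the fused image are compared against those of the ground truth.

```python
import numpy as np

def box_filter(img, k):
    """Local mean over k x k windows (valid region) via an integral image."""
    c = np.cumsum(np.cumsum(img, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))
    s = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return s / (k * k)

def local_stats_loss(fused, gt, k=8):
    """Penalize differences in local means and standard deviations."""
    mu_f, mu_g = box_filter(fused, k), box_filter(gt, k)
    var_f = box_filter(fused ** 2, k) - mu_f ** 2
    var_g = box_filter(gt ** 2, k) - mu_g ** 2
    sd_f = np.sqrt(np.maximum(var_f, 0.0))
    sd_g = np.sqrt(np.maximum(var_g, 0.0))
    return np.mean((mu_f - mu_g) ** 2) + np.mean((sd_f - sd_g) ** 2)

fused = np.random.rand(64, 64)
gt = np.random.rand(64, 64)
print(local_stats_loss(fused, gt))
```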
2016
- Tamp: A Library for Compact Deep Neural Networks with Structured Matrices. In Proceedings of the 24th ACM International Conference on Multimedia, Dec 2016
We introduce Tamp, an open-source C++ library for reducing the space and time costs of deep neural network models. In particular, Tamp implements several recent works that use structured matrices to replace the unstructured matrices that are often bottlenecks in neural networks. Tamp is also designed to serve as a unified development platform with several supported optimization back-ends and abstracted data types. This paper introduces the design and API, and demonstrates the library's effectiveness with experiments on public datasets.
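To see why structured matrices cut both storage and compute, here is a NumPy illustration (not Tamp's C++ API) using a circulant matrix, one classic choice of structure: the whole matrix is determined by a single vector, and its matrix-vector product can be computed in O(n log n) with the FFT instead of O(n^2).

```python
import numpy as np

n = 256
first_col = np.random.randn(n)            # O(n) storage defines the whole matrix
x = np.random.randn(n)

# Fast path: a circulant matvec is a circular convolution, i.e. an elementwise
# product in Fourier space.
y_fast = np.real(np.fft.ifft(np.fft.fft(first_col) * np.fft.fft(x)))

# Reference: materialize the dense circulant matrix (column j = first_col
# rolled down by j) and multiply the usual O(n^2) way.
dense = np.stack([np.roll(first_col, j) for j in range(n)], axis=1)
y_dense = dense @ x

print(np.allclose(y_fast, y_dense))       # True, up to floating-point error
```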
2015
- Audeosynth: Music-Driven Video Montage. Zicheng Liao, Yizhou Yu, Bingchen Gong, and Lechao Cheng. ACM Trans. Graph., Jul 2015
We introduce music-driven video montage, a media format that offers a pleasant way to browse or summarize video clips collected from various occasions, including gatherings and adventures. In music-driven video montage, the music drives the composition of the video content. According to musical movement and beats, video clips are organized to form a montage that visually reflects the experiential properties of the music. However, creating such a montage takes enormous manual work and artistic expertise. In this paper, we develop a framework for automatically generating music-driven video montages. The input is a set of video clips and a piece of background music. By analyzing the music and video content, our system extracts carefully designed temporal features from the input, casts the synthesis problem as an optimization, and solves for its parameters through Markov chain Monte Carlo sampling. The output is a video montage whose visual activity is cut and synchronized to the rhythm of the music, rendering a symphony of audio-visual resonance.
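The optimization step can be pictured with a generic Metropolis-style MCMC skeleton (the energy, features, and proposal below are toy stand-ins, not the paper's actual terms): each music segment is assigned a clip, and random local changes are accepted or rejected according to how much they lower a mismatch energy.

```python
import math
import random

random.seed(0)
num_segments, num_clips = 6, 10
clip_activity = [random.random() for _ in range(num_clips)]         # placeholder visual feature
segment_intensity = [random.random() for _ in range(num_segments)]  # placeholder musical feature

def energy(assignment):
    """Mismatch between each segment's musical intensity and its clip's activity."""
    return sum((segment_intensity[s] - clip_activity[c]) ** 2
               for s, c in enumerate(assignment))

assignment = [random.randrange(num_clips) for _ in range(num_segments)]
temperature = 0.05
for _ in range(5000):
    proposal = list(assignment)
    proposal[random.randrange(num_segments)] = random.randrange(num_clips)  # local move
    delta = energy(proposal) - energy(assignment)
    if delta < 0 or random.random() < math.exp(-delta / temperature):
        assignment = proposal

print(assignment, round(energy(assignment), 4))
```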