Modern 3D content creation heavily relies on procedural assets. In particular, procedural materials are ubiquitous in the industry, but their manipulation remains challenging. Previous work conditionally generates procedural graphs that match a given input image. However, the parameter generation step limits how accurately the generated graph matches the input image, due to a reliance on supervision with scarcely available procedural data. We propose to improve parameter prediction accuracy for image-conditioned procedural material generation by leveraging reinforcement learning (RL) and present the first RL approach for procedural materials. RL circumvents the limited availability of procedural data, the domain gap between real and synthetic materials, and the need for end-to-end differentiable loss functions. Given a target image, we retrieve a procedural material and use an RL-trained transformer model to predict a set of parameters that reconstruct the target image as closely as possible. We show that RL significantly improves parameter prediction accuracy over supervised methods, on both synthetic and real target images.
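As a rough illustration of the kind of RL loop this abstract describes, the sketch below trains a parameter predictor with a simple REINFORCE update; the paper may well use a different RL algorithm, and `PolicyNet`, `render_material`, and `image_loss` are hypothetical stand-ins for the transformer model, the (possibly non-differentiable) procedural renderer, and the image-space reward.

```python
# Hypothetical sketch: REINFORCE-style training of a parameter predictor.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Predicts a Gaussian over procedural parameters from image features."""
    def __init__(self, feat_dim=512, n_params=32):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, n_params)
        self.log_sigma = nn.Linear(256, n_params)

    def forward(self, feats):
        h = self.backbone(feats)
        return self.mu(h), self.log_sigma(h).clamp(-4, 1)

def reinforce_step(policy, opt, feats, target, render_material, image_loss):
    mu, log_sigma = policy(feats)
    dist = torch.distributions.Normal(mu, log_sigma.exp())
    params = dist.sample()                       # sampled node parameters
    # The reward needs no gradient through the renderer, so the renderer
    # itself may be non-differentiable.
    reward = -image_loss(render_material(params), target)
    loss = -(dist.log_prob(params).sum(-1) * reward.detach()).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return reward.mean()
```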
Computational Discovery of Microstructured Composites with Optimal Stiffness-Toughness Trade-Offs
The conflict between stiffness and toughness is a fundamental problem in engineering materials design. However, the systematic discovery of microstructured composites with optimal stiffness-toughness trade-offs has never been demonstrated, hindered by the discrepancies between simulation and reality and the lack of data-efficient exploration of the entire Pareto front. We introduce a generalizable pipeline that integrates physical experiments, numerical simulations, and artificial neural networks to address both challenges. Without any prescribed expert knowledge of material design, our approach implements a nested-loop proposal-validation workflow to bridge the simulation-to-reality gap and find microstructured composites that are stiff and tough with high sample efficiency. Further analysis of Pareto-optimal designs allows us to automatically identify existing toughness enhancement mechanisms, which were previously found through trial and error or biomimicry. On a broader scale, our method provides a blueprint for computational design in various research areas beyond solid mechanics, such as polymer chemistry, fluid dynamics, meteorology, and robotics.
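A minimal sketch of the nested-loop proposal-validation idea follows, assuming hypothetical `propose`, `evaluate_sim`, `evaluate_experiment`, and `retrain` callables; the paper's actual pipeline is far richer, but the skeleton shows how simulated Pareto candidates are physically validated and fed back to the learned simulator.

```python
# Skeleton of the nested proposal-validation loop; all callables are stand-ins.
import numpy as np

def pareto_mask(objs):
    """Boolean mask of non-dominated rows of `objs`; both objectives maximized."""
    n = len(objs)
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(objs[j] >= objs[i]) and np.any(objs[j] > objs[i]):
                keep[i] = False
                break
    return keep

def nested_loop(propose, evaluate_sim, evaluate_experiment, retrain, rounds=10):
    measured = []
    for _ in range(rounds):                       # outer (physical) validation loop
        designs = propose(64)                     # inner loop: candidates vs. simulator
        sim_objs = np.array([evaluate_sim(d) for d in designs])
        front = [d for d, k in zip(designs, pareto_mask(sim_objs)) if k]
        measured += [(d, evaluate_experiment(d)) for d in front]  # physical tests
        retrain(measured)                         # update simulator: close sim-to-real gap
    return measured
```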
2023
End-to-End Procedural Material Capture with Proxy-Free Mixed-Integer Optimization
Node-graph-based procedural materials are vital to 3D content creation within the computer graphics industry. Leveraging the expressive representation of procedural materials, artists can effortlessly generate diverse appearances by altering the graph structure or node parameters. However, manually reproducing a specific appearance is a challenging task that demands extensive domain knowledge and labor. Previous research has sought to automate this process by converting artist-created material graphs into differentiable programs and optimizing node parameters against a photographed material appearance using gradient descent. These methods involve implementing differentiable filter nodes and training differentiable neural proxies for generator nodes to optimize continuous and discrete node parameters jointly. Nevertheless, Neural Proxies exhibits critical limitations, such as long training times, inaccuracies, fixed resolutions, and confined parameter ranges, which hinder its scalability to the broad spectrum of production-grade material graphs. These constraints fundamentally stem from the absence of faithful and efficient implementations of generic noise and pattern generator nodes, both differentiable and non-differentiable. This deficiency prevents the direct optimization of continuous and discrete generator node parameters without relying on surrogate models. We present Diffmat v2, an improved differentiable procedural material library, along with a fully automated, end-to-end procedural material capture framework that combines gradient-based optimization and gradient-free parameter search to match existing production-grade procedural materials against user-taken flash photos. Diffmat v2 expands the range of differentiable material graph nodes in Diffmat by adding generic noise/pattern generator nodes and user-customizable per-pixel filter nodes. This allows for the complete translation and optimization of procedural materials across various categories without the need for external proprietary tools or pre-cached noise patterns. Consequently, our method can capture a considerably broader array of materials, encompassing those with highly regular or stochastic geometries. We demonstrate that our end-to-end approach yields a closer match to the target than MATch and Neural Proxies when starting from initially unmatched continuous and discrete parameters.
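The hybrid optimization described above can be pictured as gradient descent over continuous node parameters nested inside a gradient-free search over discrete ones. Below is a simplified sketch, with `render` and `loss_fn` as stand-ins for the differentiable graph evaluation and appearance loss; the real system uses a smarter parameter search than exhaustive enumeration.

```python
# Simplified hybrid capture loop; `render` and `loss_fn` are stand-ins.
import itertools
import torch

def capture(render, loss_fn, target, cont_init, discrete_choices, steps=200):
    best_loss, best_cont, best_disc = float("inf"), None, None
    for disc in itertools.product(*discrete_choices):   # gradient-free discrete search
        cont = cont_init.clone().requires_grad_(True)
        opt = torch.optim.Adam([cont], lr=1e-2)
        for _ in range(steps):                          # gradient descent, continuous params
            loss = loss_fn(render(cont, disc), target)
            opt.zero_grad(); loss.backward(); opt.step()
        if loss.item() < best_loss:
            best_loss, best_cont, best_disc = loss.item(), cont.detach(), disc
    return best_loss, best_cont, best_disc
```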
AutODEx: Automated Optimal Design of Experiments Platform with Data- and Time-Efficient Multi-Objective Optimization
We introduce AutODEx, an automated machine learning platform for optimal design of experiments to expedite solution discovery with optimal objective trade-offs. We implement state-of-the-art multi-objective Bayesian optimization (MOBO) algorithms in a unified and flexible framework for adaptive experimental design, along with efficient asynchronous batch strategies extended to MOBO to harness experiment parallelization. For users with little or no experience with coding or machine learning, we provide an intuitive graphical user interface (GUI) to help quickly visualize and guide the experiment design. For experienced researchers, our modular code structure serves as a testbed to quickly customize, develop, and evaluate their own MOBO algorithms. Extensive benchmark experiments against other MOBO packages demonstrate AutODEx’s competitive and stable performance. Furthermore, we showcase AutODEx’s real-world utility by autonomously guiding hardware experiments with minimal human involvement.
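For intuition, here is one simple instance of MOBO: a ParEGO-style random-scalarization step with a lower-confidence-bound acquisition. This is not AutODEx's actual algorithm; `objective` is a hypothetical experiment evaluator, and objectives are assumed to be minimized.

```python
# One ParEGO-style MOBO step with an LCB acquisition; objectives are minimized.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def mobo_step(X, Y, objective, bounds, n_cand=1000, rho=0.05):
    w = np.random.dirichlet(np.ones(Y.shape[1]))              # random trade-off weights
    scal = (w * Y).max(axis=1) + rho * (w * Y).sum(axis=1)    # augmented Chebyshev
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, scal)
    cand = np.random.uniform(bounds[0], bounds[1], size=(n_cand, X.shape[1]))
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmin(mu - sigma)]                      # lower confidence bound
    return np.vstack([X, x_next]), np.vstack([Y, objective(x_next)])
```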
2022
Data-Efficient Graph Grammar Learning for Molecular Generation
The problem of molecular generation has received significant attention recently. Existing methods are typically based on deep neural networks and require training on large datasets with tens of thousands of samples. In practice, however, the size of class-specific chemical datasets is usually limited (e.g., dozens of samples) due to labor-intensive experimentation and data collection. Another major challenge is to generate only physically synthesizable molecules. This is a non-trivial task for neural network-based generative models since the relevant chemical knowledge can only be extracted and generalized from the limited training data. In this work, we propose a data-efficient generative model that can be learned from datasets orders of magnitude smaller than common benchmarks. At the heart of this method is a learnable graph grammar that generates molecules from a sequence of production rules. Without any human assistance, these production rules are automatically constructed from training data. Furthermore, additional chemical knowledge can be incorporated into the model by further grammar optimization. Our learned graph grammar yields state-of-the-art results on generating high-quality molecules for three monomer datasets that contain only 20 samples each. Our approach also achieves remarkable performance in a challenging polymer generation task with only 117 training samples and is competitive against existing methods using 81k data points.
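A toy illustration of generating a molecule-like graph from production rules is sketched below; the paper learns its grammar from data, whereas the rules here are hard-coded and purely illustrative.

```python
# Toy graph-grammar generator; rules here are hard-coded, not learned.
import random
import networkx as nx

def generate(max_steps=10, seed=0):
    random.seed(seed)
    g = nx.Graph()
    g.add_node(0, label="X")                    # "X" marks a nonterminal site
    nid = 1
    for _ in range(max_steps):
        sites = [n for n, d in g.nodes(data=True) if d["label"] == "X"]
        if not sites:
            break
        n = random.choice(sites)
        atom, grow = random.choice([("C", True), ("C", False), ("O", True)])
        g.nodes[n]["label"] = atom              # production: rewrite nonterminal to atom
        if grow:                                # ...optionally attaching a new site
            g.add_node(nid, label="X")
            g.add_edge(n, nid)
            nid += 1
    for _, d in g.nodes(data=True):             # terminate any leftover sites
        if d["label"] == "X":
            d["label"] = "C"
    return g
```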
End-to-End Learning of 3D Phase-Only Holograms for Holographic Display
Computer-generated holography (CGH) provides volumetric control of coherent wavefronts and is fundamental to applications such as volumetric 3D displays, lithography, neural photostimulation, and optical/acoustic trapping. Recently, deep learning-based methods emerged as promising computational paradigms for CGH synthesis that overcome the quality-runtime tradeoff in conventional simulation/optimization-based methods. Yet, the quality of the predicted hologram is intrinsically bounded by the dataset’s quality. Here we introduce a new hologram dataset, MIT-CGH-4K-V2, that uses a layered depth image as a data-efficient volumetric 3D input and a two-stage supervised+unsupervised training protocol for direct synthesis of high-quality 3D phase-only holograms. The proposed system also corrects vision aberrations, allowing customization for end-users. We experimentally show photorealistic 3D holographic projections and discuss relevant spatial light modulator calibration procedures. Our method runs in real time on a consumer GPU and at 5 FPS on an iPhone 13 Pro, promising drastically enhanced performance for the applications above.
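Below is a hedged sketch of a two-stage supervised-then-unsupervised protocol of the kind described above; `propagate` stands in for a differentiable wave-propagation operator, and the loader format and loss choices are assumptions rather than the paper's exact recipe.

```python
# Two-stage training skeleton; `propagate` and the loader format are assumptions.
import torch
import torch.nn.functional as F

def train_two_stage(model, loader, propagate, opt, epochs=(10, 5)):
    for _ in range(epochs[0]):                  # stage 1: supervised on dataset holograms
        for rgbd, target_phase in loader:
            loss = F.mse_loss(model(rgbd), target_phase)
            opt.zero_grad(); loss.backward(); opt.step()
    for _ in range(epochs[1]):                  # stage 2: unsupervised refinement
        for rgbd, _ in loader:
            phase = model(rgbd)
            field = torch.exp(1j * phase)       # phase-only wavefront
            recon = propagate(field)            # differentiable wave propagation
            # Match reconstructed intensity to the target image (illustrative;
            # shapes are assumptions).
            loss = F.mse_loss(recon.abs(), rgbd[:, :3])
            opt.zero_grad(); loss.backward(); opt.step()
    return model
```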
2021
Designing Composites with Target Effective Young’s Modulus using Reinforcement Learning
Advancements in additive manufacturing have enabled the design and fabrication of materials and structures not previously realizable. In particular, the design space of composite materials and structures has vastly expanded, and the resulting size and complexity have challenged traditional design methodologies, such as brute-force exploration and one-factor-at-a-time (OFAT) exploration, to find optimum or tailored designs. To address this challenge, supervised machine learning approaches have emerged to model the design space using curated training data; however, the selection of the training data is often determined by the user. In this work, we develop and utilize a reinforcement learning (RL)-based framework for the design of composite structures which avoids the need for user-selected training data. For a 5x5 composite design space composed of soft and compliant blocks of constituent material, we find that using this approach, the model can be trained using 2.78% of the total design space, which consists of 2^25 design possibilities. Additionally, the developed RL-based framework is capable of finding designs at a success rate exceeding 90%. The success of this approach motivates future learning frameworks to utilize RL for the design of composites and other material systems.
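As a much-reduced stand-in for the paper's RL framework, the sketch below runs an epsilon-greedy search over the 5x5 two-material grid toward a target modulus; `effective_modulus` is a hypothetical property evaluator (e.g., an FEA surrogate), and the paper's actual agent and reward design differ.

```python
# Epsilon-greedy design search over a 5x5 binary grid; not the paper's RL agent.
import numpy as np

def design_search(effective_modulus, target, episodes=200, steps=25, eps=0.2, seed=0):
    rng = np.random.default_rng(seed)
    best, best_err = None, np.inf
    for _ in range(episodes):
        grid = rng.integers(0, 2, size=(5, 5))           # random initial layout
        for _ in range(steps):
            if rng.random() < eps:                       # explore: random cell flip
                i, j = rng.integers(0, 5, size=2)
            else:                                        # exploit: best single flip
                errs = np.empty((5, 5))
                for a in range(5):
                    for b in range(5):
                        g = grid.copy(); g[a, b] ^= 1
                        errs[a, b] = abs(effective_modulus(g) - target)
                i, j = np.unravel_index(errs.argmin(), errs.shape)
            grid[i, j] ^= 1
            err = abs(effective_modulus(grid) - target)
            if err < best_err:
                best, best_err = grid.copy(), err
    return best, best_err
```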
Learning Human-Environment Interactions using Conformal Tactile Textiles
Recording, modelling and understanding tactile interactions is important in the study of human behaviour and in the development of applications in healthcare and robotics. However, such studies remain challenging because existing wearable sensory interfaces are limited in terms of performance, flexibility, scalability and cost. Here, we report a textile-based tactile learning platform that can be used to record, monitor and learn human-environment interactions. The tactile textiles are created via digital machine knitting of inexpensive piezoresistive fibres, and can conform to arbitrary three-dimensional geometries. To ensure that our system is robust against variations in individual sensors, we use machine learning techniques for sensing correction and calibration. Using the platform, we capture diverse human-environment interactions (more than a million tactile frames) and show that the artificial-intelligence-powered sensing textiles can classify humans’ sitting poses, motions and other interactions with the environment. We also show that the platform can recover dynamic whole-body poses, reveal environmental spatial information and discover biomechanical signatures.
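A deliberately simple per-sensor calibration sketch (least-squares gain and offset per taxel) follows, standing in for the machine-learned sensing correction the abstract mentions; `raw` and `reference` are hypothetical paired recordings.

```python
# Per-taxel linear calibration by least squares; inputs are hypothetical recordings.
import numpy as np

def calibrate(raw, reference):
    """raw, reference: (n_frames, n_sensors) paired readings. Fit gain/offset per sensor."""
    A = np.stack([raw, np.ones_like(raw)], axis=-1)      # (frames, sensors, 2)
    coef = np.empty((raw.shape[1], 2))
    for s in range(raw.shape[1]):
        coef[s], *_ = np.linalg.lstsq(A[:, s], reference[:, s], rcond=None)
    return coef                                          # corrected = gain * raw + offset
```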
Towards Real-Time Photorealistic 3D Holography with Deep Neural Networks
The ability to present three-dimensional (3D) scenes with continuous depth sensation has a profound impact on virtual and augmented reality (AR/VR), human-computer interaction, education, and training. Computer-generated holography (CGH) enables high spatio-angular resolution 3D projection via numerical simulation of diffraction and interference. Yet, existing physically based methods fail to produce holograms with both per-pixel focal control and accurate occlusion. The computationally taxing Fresnel diffraction simulation further places an explicit trade-off between image quality and runtime, making dynamic holography far from practical. Here, we demonstrate the first deep learning-based CGH pipeline capable of synthesizing a photorealistic color 3D hologram from a single RGB-Depth (RGB-D) image in real time. Our convolutional neural network (CNN) is extremely memory-efficient (below 620 KB) and runs at 60 Hz at 1920x1080 resolution on a single consumer-grade graphics processing unit (GPU). Leveraging low-power on-device artificial intelligence (AI) acceleration chips, our CNN also runs interactively on mobile (iPhone 11 Pro at 1.1 Hz) and edge (Google Edge TPU at 2 Hz) devices, promising real-time performance in future generation AR/VR mobile headsets. We enable this pipeline by introducing the first large-scale CGH dataset (MIT-CGH-4K) with 4,000 pairs of RGB-D images and corresponding 3D holograms. Our CNN is trained with differentiable wave-based loss functions and physically approximates Fresnel diffraction. With an anti-aliasing phase-only encoding method, we experimentally demonstrate speckle-free, natural-looking high-resolution 3D holograms. Our learning-based approach and the first Fresnel hologram dataset will help unlock the full potential of holography and enable new applications in metasurface design, optical and acoustic tweezer-based microscopic manipulation, holographic microscopy, and single-exposure volumetric 3D printing.
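One concrete building block behind such differentiable wave-based losses is free-space propagation. Below is a standard angular-spectrum propagator in NumPy, not the paper's exact implementation; wrapping it in a differentiable framework and comparing propagated intensities to targets yields a wave-based loss.

```python
# Standard angular-spectrum propagation of a complex field; not the paper's code.
import numpy as np

def angular_spectrum(field, wavelength, pitch, distance):
    """Propagate `field` (ny, nx) by `distance`; all lengths in meters."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2            # squared longitudinal frequency
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * distance) * (arg > 0)           # drop evanescent components
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

For example, the intensity a viewer would see at a given focal plane is `np.abs(angular_spectrum(field, 532e-9, 8e-6, 0.01)) ** 2`.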
2020
MATch: Differentiable Material Graphs for Procedural Material Capture
We present MATch, a method to automatically convert photographs of material samples into production-grade procedural material models. At the core of MATch is a new library DiffMat that provides differentiable building blocks for constructing procedural materials, and automatic translation of large-scale procedural models, with hundreds to thousands of node parameters, into differentiable node graphs. Combining these translated node graphs with a rendering layer yields an end-to-end differentiable pipeline that maps node graph parameters to rendered images. This facilitates the use of gradient-based optimization to estimate the parameters such that the resulting material, when rendered, matches the target image appearance, as quantified by a style transfer loss. In addition, we propose a deep neural feature-based graph selection and parameter initialization method that efficiently scales to a large number of procedural graphs. We evaluate our method on both rendered synthetic materials and real materials captured as flash photographs. We demonstrate that MATch can reconstruct more accurate, general, and complex procedural materials compared to the state-of-the-art. Moreover, by producing a procedural output, we unlock capabilities such as constructing arbitrary-resolution material maps and parametrically editing the material appearance.
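The optimization loop described above can be sketched as follows, with `graph_render` and `vgg_features` as hypothetical stand-ins for the differentiable node graph plus rendering layer and a pretrained feature extractor; the Gram-matrix style loss is the standard one used in style transfer.

```python
# Gradient-based parameter capture with a Gram-matrix style loss; stand-ins noted above.
import torch

def gram(feat):                                  # (b, c, h, w) -> (b, c, c)
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def optimize_params(graph_render, vgg_features, target_img, params, iters=500):
    params = params.clone().requires_grad_(True)
    opt = torch.optim.Adam([params], lr=5e-3)
    target_grams = [gram(f) for f in vgg_features(target_img)]
    for _ in range(iters):
        render = graph_render(params)            # node parameters -> rendered image
        loss = sum(torch.nn.functional.mse_loss(gram(f), g)
                   for f, g in zip(vgg_features(render), target_grams))
        opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():
            params.clamp_(0.0, 1.0)              # keep normalized parameters in range
    return params.detach()
```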
Physical Realization of Elastic Cloaking with a Polar Material
An elastic cloak is a coating material that can be applied to an arbitrary inclusion to make it indistinguishable from the background medium. Cloaking against elastic disturbances, in particular, has been demonstrated using several designs and gauges. None, however, tolerate the coexistence of normal and shear stresses due to a shortage of physical realization of transformation-invariant elastic materials. Here, we overcome this limitation to design and fabricate a new class of polar materials with a distribution of body torque that exhibits asymmetric stresses. A static cloak for full two-dimensional elasticity is thus constructed based on the transformation method. The proposed cloak is made of a functionally graded multilayered lattice embedded in an isotropic continuum background. While one layer is tailored to produce a target elastic behavior, the other layers impose a set of kinematic constraints equivalent to a distribution of body torque that breaks the stress symmetry. Experimental testing under static compressive and shear loads demonstrates encouraging cloaking performance in good agreement with our theoretical prediction. The work sets a precedent in the field of transformation elasticity and should find applications in mechanical stress shielding and stealth technologies.
Noise-Resilient Reconstruction of Panoramas and 3D Scenes using Robot-Mounted Unsynchronized Commodity RGB-D Cameras
We present a two-stage approach to first constructing 3D panoramas and then stitching them for noise-resilient reconstruction of large-scale indoor scenes. Our approach requires multiple unsynchronized RGB-D cameras, mounted on a robot platform, which can perform in-place rotations at different locations in a scene. Such cameras rotate on a common (but unknown) axis, which provides a novel perspective for coping with unsynchronized cameras, without requiring sufficient overlap of their Field-of-View (FoV). Based on this key observation, we propose novel algorithms to track these cameras simultaneously. Furthermore, during the integration of raw frames onto an equirectangular panorama, we derive uncertainty estimates from multiple measurements assigned to the same pixels. This enables us to appropriately model the sensing noise and consider its influence, so as to achieve better noise resilience, and improve the geometric quality of each panorama and the accuracy of global inter-panorama registration. We evaluate and demonstrate the performance of our proposed method for enhancing the geometric quality of scene reconstruction from both real-world and synthetic scans.
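One simple way to realize the per-pixel uncertainty modeling mentioned above is inverse-variance fusion of the repeated measurements assigned to each pixel; the sketch below is illustrative and not the paper's exact estimator.

```python
# Inverse-variance fusion of repeated per-pixel depth measurements.
import numpy as np

def fuse(depths, variances):
    """depths, variances: (n_measurements, H, W). Returns fused depth and its variance."""
    w = 1.0 / np.maximum(variances, 1e-9)        # inverse-variance weights
    fused = (w * depths).sum(axis=0) / w.sum(axis=0)
    return fused, 1.0 / w.sum(axis=0)            # fused estimate is more certain
```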
2019
HeteroFusion: Dense Scene Reconstruction Integrating Multi-Sensors
We present a novel approach to integrate data from multiple sensor types for dense 3D reconstruction of indoor scenes in real time. Existing algorithms are mainly based on a single RGB-D camera and thus require continuous scanning of areas with sufficient geometric features. Otherwise, tracking may fail due to unreliable frame registration. Inspired by the fact that the fusion of multiple sensors can combine their strengths towards a more robust and accurate self-localization, we incorporate multiple types of sensors which are prevalent in modern robot systems, including a 2D range sensor, an inertial measurement unit (IMU), and wheel encoders. We fuse their measurements to reinforce the tracking process and to eventually obtain better 3D reconstructions. Specifically, we develop a 2D truncated signed distance field (TSDF) volume representation for the integration and ray-casting of laser frames, leading to a unified cost function in the pose estimation stage. For validation of the estimated poses in the loop-closure optimization process, we train a classifier for the features extracted from heterogeneous sensors during the registration process. To evaluate our method on challenging use case scenarios, we assembled a scanning platform prototype to acquire real-world scans. We further simulated synthetic scans based on high-fidelity synthetic scenes for quantitative evaluation. Extensive experimental evaluation on these two types of scans demonstrates that our system is capable of robustly acquiring dense 3D reconstructions and outperforms state-of-the-art RGB-D and LiDAR systems.
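To make the 2D TSDF idea concrete, here is a minimal single-beam integration step; the grid conventions (cells indexed from the world origin, running weighted average) are assumptions rather than the paper's exact scheme.

```python
# Minimal 2D TSDF update for a single laser beam; grid conventions are assumed.
import numpy as np

def integrate_beam(tsdf, weights, origin, endpoint, res=0.05, trunc=0.3):
    """tsdf, weights: (H, W) grids; origin, endpoint: 2D points in meters."""
    d = np.linalg.norm(endpoint - origin)
    direction = (endpoint - origin) / d
    for t in np.arange(max(0.0, d - trunc), d + trunc, res):  # band around the surface
        j, i = np.floor((origin + t * direction) / res).astype(int)
        if not (0 <= i < tsdf.shape[0] and 0 <= j < tsdf.shape[1]):
            continue
        sdf = np.clip((d - t) / trunc, -1.0, 1.0)         # signed distance, truncated
        weights[i, j] += 1.0
        tsdf[i, j] += (sdf - tsdf[i, j]) / weights[i, j]  # running weighted average
    return tsdf, weights
```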
Learning to Fly: Computational Controller Design for Hybrid UAVs with Reinforcement Learning
Hybrid unmanned aerial vehicles (UAV) combine advantages of multicopters and fixed-wing planes: vertical take-off, landing, and low energy use. However, hybrid UAVs are rarely used because controller design is challenging due to their complex, mixed dynamics. In this paper, we propose a method to automate this design process by training a mode-free, model-agnostic neural network controller for hybrid UAVs. We present a neural network controller design with a novel error convolution input trained by reinforcement learning. Our controller exhibits two key features: First, it does not distinguish among flying modes, and the same controller structure can be used for copters with various dynamics. Second, our controller works for real models without any additional parameter tuning process, closing the gap between virtual simulation and real fabrication. We demonstrate the efficacy of the proposed controller both in simulation and in our custom-built hybrid UAVs. The experiments show that the controller is robust enough to exploit the complex dynamics when both rotors and wings are active in flight tests.
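The "error convolution input" can be pictured as a policy that consumes a short history of tracking errors through 1D convolutions, as in the sketch below; all layer sizes and signal counts are illustrative, not the paper's architecture.

```python
# Illustrative policy consuming a history of tracking errors via 1D convolutions.
import torch
import torch.nn as nn

class ErrorConvController(nn.Module):
    def __init__(self, n_err=6, history=16, n_act=5):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_err, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=4, stride=2), nn.ReLU(),
        )
        out_len = history
        for _ in range(2):                        # conv output length, k=4, s=2
            out_len = (out_len - 4) // 2 + 1
        self.head = nn.Sequential(nn.Linear(32 * out_len, 64), nn.ReLU(),
                                  nn.Linear(64, n_act))

    def forward(self, err_history):               # (batch, n_err, history)
        z = self.conv(err_history).flatten(1)
        return torch.tanh(self.head(z))           # normalized actuator commands
```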
2018
Interactive Co-Design of Form and Function for Legged Robots using the Adjoint Method
Our goal is to make robotics more accessible to casual users by reducing the domain knowledge required in designing and building robots. Towards this goal, we present an interactive computational design system that enables users to design legged robots with desired morphologies and behaviors by specifying higher-level descriptions. The core of our method is a design optimization technique that reasons about the structure and motion of a robot in a coupled manner in order to achieve user-specified robot behavior and performance. We are inspired by recent works that also aim to jointly optimize a robot’s form and function. However, through efficient computation of necessary design changes, our approach enables us to keep the user in the loop for interactive applications. We evaluate our system in simulation by automatically improving robot designs for multiple scenarios. Starting with initial user designs that are physically infeasible or inadequate to perform the user-desired task, we show optimized designs that achieve user specifications, all while ensuring an interactive design flow.
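For readers unfamiliar with the adjoint method named in the title: for an equilibrium constraint K(p)x = f, a single adjoint solve yields the gradient of a design objective with respect to every parameter at once, which is what makes interactive-rate design updates feasible. The sketch below is a generic linear-constraint illustration, not this paper's formulation.

```python
# Generic adjoint gradient for a linear constraint K(p) x = f; not this paper's exact setup.
import numpy as np

def adjoint_gradient(K, dK_dp, f, dL_dx):
    """dK_dp: (n_params, n, n); dL_dx: callable returning dL/dx at the solution."""
    x = np.linalg.solve(K, f)                    # forward (state) solve
    lam = np.linalg.solve(K.T, dL_dx(x))         # single adjoint solve
    return np.array([-lam @ (dKi @ x) for dKi in dK_dp])  # dL/dp_i = -lam^T (dK/dp_i) x
```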