Inter-Surface Maps via Constant-Curvature Metrics

Patrick Schmidt, Marcel Campen, Janis Born, Leif Kobbelt

We propose a novel approach to represent maps between two discrete surfaces of the same genus and to minimize intrinsic mapping distortion. Our maps are well-defined at every surface point and are guaranteed to be continuous bijections (surface homeomorphisms). As a key feature of our approach, only the images of vertices need to be represented explicitly, since the images of all other points (on edges or in faces) are properly defined implicitly. This definition is via unique geodesics in metrics of constant Gaussian curvature. Our method is built upon the fact that such metrics exist on surfaces of arbitrary topology, without the need for any cuts or cones (as asserted by the uniformization theorem). Depending on the surfaces' genus, these metrics exhibit one of the three classical geometries: Euclidean, spherical or hyperbolic. Our formulation handles constructions in all three geometries in a unified way. In addition, by considering not only the vertex images but also the discrete metric as degrees of freedom, our formulation enables us to simultaneously optimize the images of these vertices and images of all other points.

» Show BibTeX

author = {Schmidt, Patrick and Campen, Marcel and Born, Janis and Kobbelt, Leif},
title = {Inter-Surface Maps via Constant-Curvature Metrics},
journal = {ACM Transactions on Graphics},
issue_date = {July 2020},
volume = {39},
number = {4},
month = jul,
year = {2020},
articleno = {119},
url = {https://doi.org/10.1145/3386569.3392399},
doi = {10.1145/3386569.3392399},
publisher = {ACM},
address = {New York, NY, USA},

3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation

Francis Engelmann, Martin Bokeloh, Alireza Fathi, Bastian Leibe, Matthias Nießner
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020

We present 3D-MPA, a method for instance segmentation on 3D point clouds. Given an input point cloud, we propose an object-centric approach where each point votes for its object center. We sample object proposals from the predicted object centers. Then we learn proposal features from grouped point features that voted for the same object center. A graph convolutional network introduces inter-proposal relations, providing higher-level feature learning in addition to the lower-level point features. Each proposal comprises a semantic label, a set of associated points over which we define a foreground-background mask, an objectness score and aggregation features. Previous works usually perform non-maximum-suppression (NMS) over proposals to obtain the final object detections or semantic instances. However, NMS can discard potentially correct predictions. Instead, our approach keeps all proposals and groups them together based on the learned aggregation features. We show that grouping proposals improves over NMS and outperforms previous state-of-the-art methods on the tasks of 3D object detection and semantic instance segmentation on the ScanNetV2 benchmark and the S3DIS dataset.

» Show BibTeX

title = {{3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation}},
author = {Engelmann, Francis and Bokeloh, Martin and Fathi, Alireza and Leibe, Bastian and Nie{\ss}ner, Matthias},
booktitle = {{IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},
year = {2020}

DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes

Jonas Schult*, Francis Engelmann*, Theodora Kontogianni, Bastian Leibe
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020 (Oral)

We propose DualConvMesh-Nets (DCM-Net) a family of deep hierarchical convolutional networks over 3D geometric data that combines two types of convolutions. The first type, geodesic convolutions, defines the kernel weights over mesh surfaces or graphs. That is, the convolutional kernel weights are mapped to the local surface of a given mesh. The second type, Euclidean convolutions, is independent of any underlying mesh structure. The convolutional kernel is applied on a neighborhood obtained from a local affinity representation based on the Euclidean distance between 3D points. Intuitively, geodesic convolutions can easily separate objects that are spatially close but have disconnected surfaces, while Euclidean convolutions can represent interactions between nearby objects better, as they are oblivious to object surfaces. To realize a multi-resolution architecture, we borrow well-established mesh simplification methods from the geometry processing domain and adapt them to define mesh-preserving pooling and unpooling operations. We experimentally show that combining both types of convolutions in our architecture leads to significant performance gains for 3D semantic segmentation, and we report competitive results on three scene segmentation benchmarks.

» Show BibTeX

author = {Jonas Schult* and
Francis Engelmann* and
Theodora Kontogianni and
Bastian Leibe},
title = {{DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes}},
booktitle = {{IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},
year = {2020}

Siam R-CNN: Visual Tracking by Re-Detection

Paul Voigtlaender, Jonathon Luiten, Philip Torr, Bastian Leibe

We present Siam R-CNN, a Siamese re-detection architecture which unleashes the full power of two-stage object detection approaches for visual object tracking. We combine this with a novel tracklet-based dynamic programming algorithm, which takes advantage of re-detections of both the first-frame template and previous-frame predictions, to model the full history of both the object to be tracked and potential distractor objects. This enables our approach to make better tracking decisions, as well as to re-detect tracked objects after long occlusion. Finally, we propose a novel hard example mining strategy to improve Siam RCNN’s robustness to similar looking objects. The proposed tracker achieves the current best performance on ten tracking benchmarks, with especially strong results for long-term tracking.

» Show BibTeX

title={Siam R-CNN: Visual Tracking by Re-Detection},
author={Paul Voigtlaender and Jonathon Luiten and Philip H. S. Torr and Bastian Leibe},

Implicit Frictional Boundary Handling for SPH

Jan Bender, Tassilo Kugelstadt, Marcel Weiler, Dan Koschier
IEEE Transactions on Visualization and Computer Graphics

In this paper, we present a novel method for the robust handling of static and dynamic rigid boundaries in Smoothed Particle Hydrodynamics (SPH) simulations. We build upon the ideas of the density maps approach which has been introduced recently by Koschier and Bender. They precompute the density contributions of solid boundaries and store them on a spatial grid which can be efficiently queried during runtime. This alleviates the problems of commonly used boundary particles, like bumpy surfaces and inaccurate pressure forces near boundaries. Our method is based on a similar concept but we precompute the volume contribution of the boundary geometry. This maintains all benefits of density maps but offers a variety of advantages which are demonstrated in several experiments. Firstly, in contrast to the density maps method we can compute derivatives in the standard SPH manner by differentiating the kernel function. This results in smooth pressure forces, even for lower map resolutions, such that precomputation times and memory requirements are reduced by more than two orders of magnitude compared to density maps. Furthermore, this directly fits into the SPH concept so that volume maps can be seamlessly combined with existing SPH methods. Finally, the kernel function is not baked into the map such that the same volume map can be used with different kernels. This is especially useful when we want to incorporate common surface tension or viscosity methods that use different kernels than the fluid simulation.

» Show BibTeX

author = {Jan Bender and Tassilo Kugelstadt and Marcel Weiler and Dan Koschier },
title = {Implicit Frictional Boundary Handling for SPH},
journal = {IEEE Transactions on Visualization and Computer Graphics},
year = {2020},
publisher = {IEEE},

Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds

Francis Engelmann, Theodora Kontogianni, Bastian Leibe
International Conference on Robotics and Automation (ICRA) 2020

In this work, we propose Dilated Point Convolutions (DPC). In a thorough ablation study, we show that the receptive field size is directly related to the performance of 3D point cloud processing tasks, including semantic segmentation and object classification. Point convolutions are widely used to efficiently process 3D data representations such as point clouds or graphs. However, we observe that the receptive field size of recent point convolutional networks is inherently limited. Our dilated point convolutions alleviate this issue, they significantly increase the receptive field size of point convolutions. Importantly, our dilation mechanism can easily be integrated into most existing point convolutional networks. To evaluate the resulting network architectures, we visualize the receptive field and report competitive scores on popular point cloud benchmarks.

» Show BibTeX

author = {Engelmann, Francis and Kontogianni, Theodora and Leibe, Bastian},
title = {{Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds}},
booktitle = {{International Conference on Robotics and Automation (ICRA)}},
year = {2020}

Track to Reconstruct and Reconstruct to Track

Jonathon Luiten, Tobias Fischer, Bastian Leibe
RA-L 2020 / ICRA 2020

Object tracking and 3D reconstruction are often performed together, with tracking used as input for reconstruction. However, the obtained reconstructions also provide useful information for improving tracking. We propose a novel method that closes this loop, first tracking to reconstruct, and then reconstructing to track. Our approach, MOTSFusion (Multi-Object Tracking, Segmentation and dynamic object Fusion), exploits the 3D motion extracted from dynamic object reconstructions to track objects through long periods of complete occlusion and to recover missing detections. Our approach first builds up short tracklets using 2D optical flow, and then fuses these into dynamic 3D object reconstructions. The precise 3D object motion of these reconstructions is used to merge tracklets through occlusion into long-term tracks, and to locate objects when detections are missing. On KITTI, our reconstruction-based tracking reduces the number of ID switches of the initial tracklets by more than 50%, and outperforms all previous approaches for both bounding box and segmentation tracking.

» Show BibTeX

title={Track to Reconstruct and Reconstruct to Track},
author={Luiten, Jonathon and Fischer, Tobias and Leibe, Bastian},
journal={IEEE Robotics and Automation Letters},

UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking

Jonathon Luiten, Idil Esen Zulfikar, Bastian Leibe
WACV 2020

We address Unsupervised Video Object Segmentation (UVOS), the task of automatically generating accurate pixel masks for salient objects in a video sequence and of tracking these objects consistently through time, without any input about which objects should be tracked. Towards solving this task, we present UnOVOST (Unsupervised Offline Video Object Segmentation and Tracking) as a simple and generic algorithm which is able to track and segment a large variety of objects. This algorithm builds up tracks in a number stages, first grouping segments into short tracklets that are spatio-temporally consistent, before merging these tracklets into long-term consistent object tracks based on their visual similarity. In order to achieve this we introduce a novel tracklet-based Forest Path Cutting data association algorithm which builds up a decision forest of track hypotheses before cutting this forest into paths that form long-term consistent object tracks. When evaluating our approach on the DAVIS 2017 Unsupervised dataset we obtain state-of-the-art performance with a mean J &F score of 67.9% on the val, 58% on the test-dev and 56.4% on the test-challenge benchmarks, obtaining first place in the DAVIS 2019 Unsupervised Video Object Segmentation Challenge. UnOVOST even performs competitively with many semi-supervised video object segmentation algorithms even though it is not given any input as to which objects should be tracked and segmented.

» Show BibTeX

title={UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking},
author={Luiten, Jonathon and Zulfikar, Idil Esen and Leibe, Bastian},
booktitle={Proceedings of the IEEE Winter Conference on Applications in Computer Vision},

Metric-Scale Truncation-Robust Heatmaps for 3D Human Pose Estimation

István Sárándi, Timm Linder, Kai O. Arras, Bastian Leibe
IEEE International Conference on Automatic Face and Gesture Recognition (FG) 2020, to appear

Heatmap representations have formed the basis of 2D human pose estimation systems for many years, but their generalizations for 3D pose have only recently been considered. This includes 2.5D volumetric heatmaps, whose X and Y axes correspond to image space and the Z axis to metric depth around the subject. To obtain metric-scale predictions, these methods must include a separate, explicit post-processing step to resolve scale ambiguity. Further, they cannot encode body joint positions outside of the image boundaries, leading to incomplete pose estimates in case of image truncation. We address these limitations by proposing metric-scale truncation-robust (MeTRo) volumetric heatmaps, whose dimensions are defined in metric 3D space near the subject, instead of being aligned with image space. We train a fully-convolutional network to estimate such heatmaps from monocular RGB in an end-to-end manner. This reinterpretation of the heatmap dimensions allows us to estimate complete metric-scale poses without test-time knowledge of the focal length or person distance and without relying on anthropometric heuristics in post-processing. Furthermore, as the image space is decoupled from the heatmap space, the network can learn to reason about joints beyond the image boundary. Using ResNet-50 without any additional learned layers, we obtain state-of-the-art results on the Human3.6M and MPI-INF-3DHP benchmarks. As our method is simple and fast, it can become a useful component for real-time top-down multi-person pose estimation systems. We make our code publicly available to facilitate further research.

» Show BibTeX

title={Metric-Scale Truncation-Robust Heatmaps for 3{D} Human Pose Estimation},
author={S\'ar\'andi, Istv\'an and Linder, Timm and Arras, Kai O. and Leibe, Bastian},
booktitle={Automatic Face and Gesture Recognition, 2020 IEEE Int. Conf. on},
note={in press}

Fast and Robust QEF Minimization using Probabilistic Quadrics

Philip Trettner, Leif Kobbelt
Computer Graphics Forum (Proc. EUROGRAPHICS 2020)

Error quadrics are a fundamental and powerful building block in many geometry processing algorithms. However, finding the minimizer of a given quadric is in many cases not robust and requires a singular value decomposition or some ad-hoc regularization. While classical error quadrics measure the squared deviation from a set of ground truth planes or polygons, we treat the input data as genuinely uncertain information and embed error quadrics in a probabilistic setting ("probabilistic quadrics") where the optimal point minimizes the expected squared error. We derive closed form solutions for the popular plane and triangle quadrics subject to (spatially varying, anisotropic) Gaussian noise. Probabilistic quadrics can be minimized robustly by solving a simple linear system - 50x faster than SVD. We show that probabilistic quadrics have superior properties in tasks like decimation and isosurface extraction since they favor more uniform triangulations and are more tolerant to noise while still maintaining feature sensitivity. A broad spectrum of applications can directly benefit from our new quadrics as a drop-in replacement which we demonstrate with mesh smoothing via filtered quadrics and non-linear subdivision surfaces.

» Show BibTeX

@article {10.1111:cgf.13933,
journal = {Computer Graphics Forum},
title = {{Fast and Robust QEF Minimization using Probabilistic Quadrics}},
author = {Trettner, Philip and Kobbelt, Leif},
year = {2020},
publisher = {The Eurographics Association and John Wiley & Sons Ltd.},
ISSN = {1467-8659},
DOI = {10.1111/cgf.13933}

High-Fidelity Point-Based Rendering of Large-Scale 3D Scan Datasets

Patric Schmitz, Timothy Blut, Christian Mattes, Leif Kobbelt
IEEE Computer Graphics and Applications

Digitalization of 3D objects and scenes using modern depth sensors and high-resolution RGB cameras enables the preservation of human cultural artifacts at an unprecedented level of detail. Interactive visualization of these large datasets, however, is challenging without degradation in visual fidelity. A common solution is to fit the dataset into available video memory by downsampling and compression. The achievable reproduction accuracy is thereby limited for interactive scenarios, such as immersive exploration in Virtual Reality (VR). This degradation in visual realism ultimately hinders the effective communication of human cultural knowledge. This article presents a method to render 3D scan datasets with minimal loss of visual fidelity. A point-based rendering approach visualizes scan data as a dense splat cloud. For improved surface approximation of thin and sparsely sampled objects, we propose oriented 3D ellipsoids as rendering primitives. To render massive texture datasets, we present a virtual texturing system that dynamically loads required image data. It is paired with a single-pass page prediction method that minimizes visible texturing artifacts. Our system renders a challenging dataset in the order of 70 million points and a texture size of 1.2 terabytes consistently at 90 frames per second in stereoscopic VR.

High-Performance Image Filters via Sparse Approximations

Kersten Schuster, Philip Trettner, Leif Kobbelt
Proceedings of the ACM on Computer Graphics and Interactive Techniques, Vol. 3, No. 2, 2020

We present a numerical optimization method to find highly efficient (sparse) approximations for convolutional image filters. Using a modified parallel tempering approach, we solve a constrained optimization that maximizes approximation quality while strictly staying within a user-prescribed performance budget. The results are multi-pass filters where each pass computes a weighted sum of bilinearly interpolated sparse image samples, exploiting hardware acceleration on the GPU. We systematically decompose the target filter into a series of sparse convolutions, trying to find good trade-offs between approximation quality and performance. Since our sparse filters are linear and translation-invariant, they do not exhibit the aliasing and temporal coherence issues that often appear in filters working on image pyramids. We show several applications, ranging from simple Gaussian or box blurs to the emulation of sophisticated Bokeh effects with user-provided masks. Our filters achieve high performance as well as high quality, often providing significant speed-up at acceptable quality even for separable filters. The optimized filters can be baked into shaders and used as a drop-in replacement for filtering tasks in image processing or rendering pipelines.

Cost Minimizing Local Anisotropic Quad Mesh Refinement

Max Lyon, David Bommes, Leif Kobbelt
Eurographics Symposium on Geometry Processing 2020

Quad meshes as a surface representation have many conceptual advantages over triangle meshes. Their edges can naturally be aligned to principal curvatures of the underlying surface and they have the flexibility to create strongly anisotropic cells without causing excessively small inner angles. While in recent years a lot of progress has been made towards generating high quality uniform quad meshes for arbitrary shapes, their adaptive and anisotropic refinement remains difficult since a single edge split might propagate across the entire surface in order to maintain consistency. In this paper we present a novel refinement technique which finds the optimal trade-off between number of resulting elements and inserted singularities according to a user prescribed weighting. Our algorithm takes as input a quad mesh with those edges tagged that are prescribed to be refined. It then formulates a binary optimization problem that minimizes the number of additional edges which need to be split in order to maintain consistency. Valence 3 and 5 singularities have to be introduced in the transition region between refined and unrefined regions of the mesh. The optimization hence computes the optimal trade-off and places singularities strategically in order to minimize the number of consistency splits — or avoids singularities where this causes only a small number of additional splits. When applying the refinement scheme iteratively, we extend our binary optimization formulation such that previous splits can be undone if this prevents degenerate cells with small inner angles that otherwise might occur in anisotropic regions or in the vicinity of singularities. We demonstrate on a number of challenging examples that the algorithm performs well in practice.

A Three-Level Approach to Texture Mapping and Synthesis on 3D Surfaces

Kersten Schuster, Philip Trettner, Patric Schmitz, Leif Kobbelt
Proceedings of the ACM on Computer Graphics and Interactive Techniques, Vol. 3, No. 1, 2020

We present a method for example-based texturing of triangular 3D meshes. Our algorithm maps a small 2D texture sample onto objects of arbitrary size in a seamless fashion, with no visible repetitions and low overall distortion. It requires minimal user interaction and can be applied to complex, multi-layered input materials that are not required to be tileable. Our framework integrates a patch-based approach with per-pixel compositing. To minimize visual artifacts, we run a three-level optimization that starts with a rigid alignment of texture patches (macro scale), then continues with non-rigid adjustments (meso scale) and finally performs pixel-level texture blending (micro scale). We demonstrate that the relevance of the three levels depends on the texture content and type (stochastic, structured, or anisotropic textures).

» Show BibTeX

author = {Schuster, Kersten and Trettner, Philip and Schmitz, Patric and Kobbelt, Leif},
title = {A Three-Level Approach to Texture Mapping and Synthesis on 3D Surfaces},
year = {2020},
issue_date = {Apr 2020},
publisher = {The Association for Computers in Mathematics and Science Teaching},
address = {USA},
volume = {3},
number = {1},
url = {https://doi.org/10.1145/3384542},
doi = {10.1145/3384542},
journal = {Proc. ACM Comput. Graph. Interact. Tech.},
month = apr,
articleno = {1},
numpages = {19},
keywords = {material blending, surface texture synthesis, texture mapping}

Reposing Humans by Warping 3D Features

Markus Knoche, István Sárándi, Bastian Leibe
Workshop on Towards Human-Centric Image/Video Synthesis -- Conference on Computer Vision and Pattern Recognition (CVPRW'20)

We address the problem of reposing an image of a human into any desired novel pose. This conditional image-generation task requires reasoning about the 3D structure of the human, including self-occluded body parts. Most prior works are either based on 2D representations or require fitting and manipulating an explicit 3D body mesh. Based on the recent success in deep learning-based volumetric representations, we propose to implicitly learn a dense feature volume from human images, which lends itself to simple and intuitive manipulation through explicit geometric warping. Once the latent feature volume is warped according to the desired pose change, the volume is mapped back to RGB space by a convolutional decoder. Our state-of-the-art results on the DeepFashion and the iPER benchmarks indicate that dense volumetric human representations are worth investigating in more detail.

» Show BibTeX

author = {Markus Knoche and Istv\'an S\'ar\'andi and Bastian Leibe},
title = {Reposing Humans by Warping 3{D} Features},
booktitle = {CVPR Workshop on Towards Human-Centric Image/Video Synthesis},
year = {2020}

Calibratio - A Small, Low-Cost, Fully Automated Motion-to-Photon Measurement Device

Sebastian Pape, Marcel Krüger, Jan Müller, Torsten Wolfgang Kuhlen
10th Workshop on Software Engineering and Architectures for Realtime Interactive Systems (SEARIS), 2020

Since the beginning of the design and implementation of virtual environments, these systems have been built to give the users the best possible experience. One detrimental factor for the user experience was shown to be a high end-to-end latency, here measured as motionto-photon latency, of the system. Thus, a lot of research in the past was focused on the measurement and minimization of this latency in virtual environments. Most existing measurement-techniques require either expensive measurement hardware like an oscilloscope, mechanical components like a pendulum or depend on manual evaluation of samples. This paper proposes a concept of an easy to build, low-cost device consisting of a microcontroller, servo motor and a photo diode to measure the motion-to-photon latency in virtual reality environments fully automatically. It is placed or attached to the system, calibrates itself and is controlled/monitored via a web interface. While the general concept is applicable to a variety of VR technologies, this paper focuses on the context of CAVE-like systems.

» Show BibTeX

author = {Sebastian Pape and Marcel Kr\"{u}ger and Jan M\"{u}ller and Torsten W. Kuhlen},
title = {{Calibratio - A Small, Low-Cost, Fully Automated Motion-to-Photon Measurement Device}},
booktitle = {10th Workshop on Software Engineering and Architectures for Realtime Interactive Systems (SEARIS)},
year = {2020},

Joint Dual-Tasking in VR: Outlining the Behavioral Design of Interactive Human Companions Who Walk and Talk with a User

Andrea Bönsch, Torsten Wolfgang Kuhlen
IEEE Virtual Humans and Crowds for Immersive Environments (VHCIE), 2020

To resemble realistic and lively places, virtual environments are increasingly often enriched by virtual populations consisting of computer-controlled, human-like virtual agents. While the applications often provide limited user-agent interaction based on, e.g., collision avoidance or mutual gaze, complex user-agent dynamics such as joint locomotion combined with a secondary task, e.g., conversing, are rarely considered yet. These dual-tasking situations, however, are beneficial for various use-cases: guided tours and social simulations will become more realistic and engaging if a user is able to traverse a scene as a member of a social group, while platforms to study crowd and walking behavior will become more powerful and informative. To this end, this presentation deals with different areas of interaction dynamics, which need to be combined for modeling dual-tasking with virtual agents. Areas covered are kinematic parameters for the navigation behavior, group shapes in static and mobile situations as well as verbal and non-verbal behavior for conversations.

» Show BibTeX

author = {Andrea B\"{o}nsch and Torsten W. Kuhlen},
title = {{Joint Dual-Tasking in VR: Outlining the Behavioral Design of Interactive Human Companions Who Walk and Talk with a User}},
booktitle = {IEEE Virtual Humans and Crowds for Immersive Environments (VHCIE)},
year = {2020},

Towards a Graphical User Interface for Exploring and Fine-Tuning Crowd Simulations

Andrea Bönsch, Marcel Jonda, Jonathan Ehret, Torsten Wolfgang Kuhlen
IEEE Virtual Humans and Crowds for Immersive Environments (VHCIE), 2020

Simulating a realistic navigation of virtual pedestrians through virtual environments is a recurring subject of investigations. The various mathematical approaches used to compute the pedestrians’ paths result, i.a., in different computation-times and varying path characteristics. Customizable parameters, e.g., maximal walking speed or minimal interpersonal distance, add another level of complexity. Thus, choosing the best-fitting approach for a given environment and use-case is non-trivial, especially for novice users.

To facilitate the informed choice of a specific algorithm with a certain parameter set, crowd simulation frameworks such as Menge provide an extendable collection of approaches with a unified interface for usage. However, they often miss an elaborated visualization with high informative value accompanied by visual analysis methods to explore the complete simulation data in more detail – which is yet required for an informed choice. Benchmarking suites such as SteerBench are a helpful approach as they objectively analyze crowd simulations, however they are too tailored to specific behavior details. To this end, we propose a preliminary design of an advanced graphical user interface providing a 2D and 3D visualization of the crowd simulation data as well as features for time navigation and an overall data exploration.

» Show BibTeX

author = {Andrea B\"{o}nsch and Marcel Jonda and Jonathan Ehret and Torsten W. Kuhlen},
title = {{Towards a Graphical User Interface for Exploring and Fine-Tuning Crowd Simulations}},
booktitle = {IEEE Virtual Humans and Crowds for Immersive Environments (VHCIE)},
year = {2020},

Talk: Insite: A Generalized Pipeline for In-transit Visualization and Analysis

Simon Oehrl, Jan Müller, Ali Can Demiralp, Marcel Krüger, Sebastian Spreizer, Benjamin Weyers, Torsten Wolfgang Kuhlen
NEST Conference 2020

Neuronal network simulators are essential to computational neuroscience, enabling the study of the nervous system through in-silico experiments. Through utilization of high-performance computing resources, these simulators are able to simulate increasingly complex and large networks of neurons today. It also creates new challenges for the analysis and visualization of such simulations. In-situ and in-transport strategies are popular approaches in these scenarios. They enable live monitoring of running simulations and parameter adjustment in the case of erroneous configurations which can save valuable compute resources.

This talk will present the current status of our pipeline for in-transport analysis and visualization of neuronal network simulator data. The pipeline is able to couple with NEST along other simulators with data management (querying, filtering and merging) from multiple simulator instances. Finally, the data is passed to end-user applications for visualization and analysis. The goal is to be integrated into third party tools such as the multi-view visual analysis toolkit ViSimpl.

Talk: Proximity in Social VR - Interpersonal Distance between a User and Virtual Agents

Andrea Bönsch
3rd Workshop on "Person-to-Person Interaction: From Analysis to Applications", 2020

Proxemic is a well known social behavioral measure, where the interpersonal distance between interactans is evaluated - either in real or in virtual social encounters. Given the prominent role of emotional expressions in our everyday social interactions, we investigated how emotions of a virtual agent affect proxemic adaptions while taking the aspects spatial constellation between user and agent as well as user’s level of dynamics into account.

Accurately Solving Physical Systems with Graph Learning

Han Shao, Tassilo Kugelstadt, Wojciech Palubicki, Jan Bender, Sören Pirk, Dominik L. Michels

Iterative solvers are widely used to accurately simulate physical systems. These solvers require initial guesses to generate a sequence of improving approximate solutions. In this contribution, we introduce a novel method to accelerate iterative solvers for physical systems with graph networks (GNs) by predicting the initial guesses to reduce the number of iterations. Unlike existing methods that aim to learn physical systems in an end-to-end manner, our approach guarantees long-term stability and therefore leads to more accurate solutions. Furthermore, our method improves the run time performance of traditional iterative solvers. To explore our method we make use of position-based dynamics (PBD) as a common solver for physical systems and evaluate it by simulating the dynamics of elastic rods. Our approach is able to generalize across different initial conditions, discretizations, and realistic material properties. Finally, we demonstrate that our method also performs well when taking discontinuous effects into account such as collisions between individual rods.

» Show BibTeX

title={Accurately Solving Physical Systems with Graph Learning},
author={Han Shao and Tassilo Kugelstadt and Wojciech Pa{\l{}}ubicki and Jan Bender and S{\"o}ren Pirk and Dominik L. Michels},

STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos

Ali Athar, Sabarinath Mahadevan, Aljoša Ošep, Laura Leal-Taixé, Bastian Leibe

Existing methods for instance segmentation in videos typically involve multi-stage pipelines that follow the tracking-by-detection paradigm and model a video clip as a sequence of images. Multiple networks are used to detect objects in individual frames, and then associate these detections over time. Hence, these methods are often non-end-to-end trainable and highly tailored to specific tasks. In this paper, we propose a different approach that is well-suited to a variety of tasks involving instance segmentation in videos. In particular, we model a video clip as a single 3D spatio-temporal volume, and propose a novel approach that segments and tracks instances across space and time in a single stage. Our problem formulation is centered around the idea of spatio-temporal embeddings which are trained to cluster pixels belonging to a specific object instance over an entire video clip. To this end, we introduce (i) novel mixing functions that enhance the feature representation of spatio-temporal embeddings, and (ii) a single-stage, proposal-free network that can reason about temporal context. Our network is trained end-to-end to learn spatio-temporal embeddings as well as parameters required to cluster these embeddings, thus simplifying inference. Our method achieves state-of-the-art results across multiple datasets and tasks.

Single-Shot Panoptic Segmentation

Mark Weber, Jonathon Luiten, Bastian Leibe

We present a novel end-to-end single-shot method that segments countable object instances (things) as well as background regions (stuff) into a non-overlapping panoptic segmentation at almost video frame rate. Current state-of-the-art methods are far from reaching video frame rate and mostly rely on merging instance segmentation with semantic background segmentation. Our approach relaxes this requirement by using an object detector but is still able to resolve inter- and intra-class overlaps to achieve a non-overlapping segmentation. On top of a shared encoder-decoder backbone, we utilize multiple branches for semantic segmentation, object detection, and instance center prediction. Finally, our panoptic head combines all outputs into a panoptic segmentation and can even handle conflicting predictions between branches as well as certain false predictions. Our network achieves 32.6% PQ on MS-COCO at 21.8 FPS, opening up panoptic segmentation to a broader field of applications.

» Show BibTeX

title={Single-Shot Panoptic Segmentation},
author={Weber, Mark and Luiten, Jonathon and Leibe, Bastian},
journal={arXiv preprint arXiv:1911.00764},

Previous Year (2019)
Disclaimer Home Visual Computing institute RWTH Aachen University