Seminars

Drop-out as a regularization method

Speaker: 
Dr. Georgios Anagnostopoulos
Start Time: 
Thursday, November 9, 2017 - 12:30
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

Drop-out, namely the "zeroing-out of signals" in model response functions (a type of multiplicative noise), can be intuitively understood as a type of regularization. Hence, it is no surprise that it has become of interest when training models that are prone to over-fitting, such as the ones encountered in the context of deep learning. In a 2014 JMLR paper by Srivastava et al., titled "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," the authors provide a pedagogical example using the square loss to show the connection between drop-out and regularization. In this flash talk, the example will be briefly showcased. Moreover, a second example, this time pertaining to the hinge loss, will be demonstrated; to the best knowledge of the presenter, this is a new result. The latter could be of practical importance when applying kernel machines to big data.
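The multiplicative-noise view is easy to make concrete. Below is a minimal sketch (written for this summary, not code from the talk or the paper) of "inverted" drop-out, where surviving signals are rescaled so the expected response matches the noiseless one:

```python
import random

def dropout(activations, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each signal with probability p and rescale the
    survivors by 1/(1-p), so the expected output equals the undropped input."""
    if not training:
        return list(activations)
    return [0.0 if rng.random() < p else a / (1.0 - p) for a in activations]
```

Averaging many dropped-out copies recovers the original activations, which is the sense in which drop-out acts as multiplicative noise around the clean response.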

Automated Identification of Cardiac Signals after Blind Source Separation for Camera-Based Photoplethysmography

Speaker: 
Luke Moyou
Start Time: 
Monday, October 30, 2017 - 12:30
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

In the field of camera-based photoplethysmography, the application of blind source separation (BSS) techniques has been extensively stressed as a means to cope with frequently occurring artifacts and noise. Although said techniques can help to extract the cardiac component from a mixture of input sources, the permutation indeterminacy inherent to BSS techniques often introduces inaccuracies or requires manual intervention. The current contribution focuses on methods to automatically select the cardiac component from the output of BSS techniques applied to camera-based photoplethysmograms. To that end, we propose simple Markov models to describe and subsequently identify cardiac components. It is shown that good results can be obtained by combining different simple Markov models.

Link: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7146950

Swish: A Self-gated Activation Function

Speaker: 
Muntaser Syed
Start Time: 
Wednesday, October 25, 2017 - 09:15
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

The choice of activation functions in deep networks has a significant effect on the training dynamics and task performance. Currently, the most successful and widely-used activation function is the Rectified Linear Unit (ReLU). Although various alternatives to ReLU have been proposed, none have managed to replace it due to inconsistent gains. In this work, we propose a new activation function, named Swish, which is simply f(x) = x * sigmoid(x). Our experiments show that Swish tends to work better than ReLU on deeper models across a number of challenging datasets. For example, simply replacing ReLUs with Swish units improves top-1 classification accuracy on ImageNet by 0.9% for Mobile NASNet-A and 0.6% for Inception-ResNet-v2. The simplicity of Swish and its similarity to ReLU make it easy for practitioners to replace ReLUs with Swish units in any neural network.
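The function itself is a one-liner; the sketch below (hand-written for this summary, not the paper's code) implements it next to ReLU for comparison:

```python
import math

def swish(x):
    """Swish: f(x) = x * sigmoid(x); smooth, non-monotonic, unbounded above."""
    return x / (1.0 + math.exp(-x))

def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)
```

Unlike ReLU, Swish passes small negative values through (e.g. swish(-1) is about -0.27), and for large positive inputs it approaches the identity, so it behaves like a smoothed, self-gated ReLU.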


Persistent Homology of Delay Embeddings and its Application to Wheeze Detection

Speaker: 
Kaylen Bryan
Start Time: 
Monday, October 16, 2017 - 17:00
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

We propose a new approach to detect and quantify the periodic structure of dynamical systems using topological methods. We use delay-coordinate embedding to detect the presence of harmonic structures, applying persistent homology for robust analysis of the resulting point clouds. To discover the proper delay, we propose an autocorrelation-like (ACL) function of the signals, and apply the introduced topological approach to analyze breathing sound signals for wheeze detection. Experiments have been carried out to substantiate the capabilities of the proposed method.


Adaptive Influence Maximization in Dynamic Social Networks

Speaker: 
Xi Zhang (CC)
Start Time: 
Thursday, October 12, 2017 - 12:30
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

For the purpose of propagating information and ideas through a social network, a seeding strategy aims to find a small set of seed users able to maximize the spread of influence, which is termed the influence maximization problem. Although a large number of works have studied this problem, the existing seeding strategies are limited to models that cannot fully capture the characteristics of real-world social networks. In fact, due to high-speed data transmission and the large population of participants, the diffusion processes in real-world social networks have many aspects of uncertainty. As shown in the experiments, when taking such uncertainty into account, the state-of-the-art seeding strategies are pessimistic, as they fail to trace the influence diffusion. In this paper, we study strategies that select seed users in an adaptive manner. We first formally define the dynamic independent cascade model and introduce the concept of an adaptive seeding strategy. Then, based on the proposed model, we show that a simple greedy adaptive seeding strategy finds an effective solution with a provable performance guarantee. Besides the greedy algorithm, an efficient heuristic algorithm is provided for better scalability. Extensive experiments have been performed on both real-world networks and synthetic power-law networks. The results demonstrate the superiority of the adaptive seeding strategies over other baseline methods.
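The non-adaptive greedy baseline that adaptive seeding builds on can be sketched in a few lines (a toy Monte-Carlo version with made-up parameters, not the authors' algorithm):

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """One independent-cascade run; returns the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def greedy_seeds(graph, k, p=0.1, runs=200, rng=random.Random(0)):
    """Greedily add the node whose Monte-Carlo estimated spread is largest."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    seeds = []
    for _ in range(k):
        best, best_est = None, -1.0
        for cand in nodes - set(seeds):
            est = sum(simulate_ic(graph, seeds + [cand], p, rng)
                      for _ in range(runs)) / runs
            if est > best_est:
                best, best_est = cand, est
        seeds.append(best)
    return seeds
```

An adaptive strategy, as studied in the paper, would instead choose seeds one at a time and re-plan after observing which activations actually occurred, rather than committing to all k seeds up front.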


Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

Speaker: 
Tatsanee Chaiya
Start Time: 
Thursday, October 5, 2017 - 12:30
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions. Considering that recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) can learn feature representations and model long-term temporal dependencies automatically, we propose an end-to-end fully connected deep LSTM network for skeleton based action recognition. Inspired by the observation that the co-occurrences of the joints intrinsically characterize human actions, we take the skeleton as the input at each time slot and introduce a novel regularization scheme to learn the co-occurrence features of skeleton joints. To train the deep LSTM network effectively, we propose a new dropout algorithm which simultaneously operates on the gates, cells, and output responses of the LSTM neurons. Experimental results on three human action recognition datasets consistently demonstrate the effectiveness of the proposed model.
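The flavor of dropout applied inside the recurrence can be illustrated on a single-unit LSTM step (a hand-rolled toy; the paper's actual scheme, which simultaneously operates on gates, cells, and output responses across a deep network, and its co-occurrence regularizer, are more involved):

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def drop(v, p, rng):
    # Inverted dropout on a single scalar signal.
    return 0.0 if rng.random() < p else v / (1.0 - p)

def lstm_step(x, h, c, W, p=0.2, rng=random.Random(0), training=True):
    """One single-unit LSTM step with dropout on the gate and cell signals."""
    wi, wf, wo, wg = W  # each: (input weight, recurrent weight, bias)
    i = sigmoid(wi[0] * x + wi[1] * h + wi[2])   # input gate
    f = sigmoid(wf[0] * x + wf[1] * h + wf[2])   # forget gate
    o = sigmoid(wo[0] * x + wo[1] * h + wo[2])   # output gate
    g = math.tanh(wg[0] * x + wg[1] * h + wg[2])  # candidate cell update
    if training:
        i, f, o, g = (drop(v, p, rng) for v in (i, f, o, g))
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new
```

With `training=False` this reduces to a standard LSTM step; during training, randomly zeroing gate signals regularizes the recurrent dynamics in the same spirit as ordinary dropout regularizes feed-forward activations.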


Autonomous Vehicle And Real Time Road Lanes Detection And Tracking

Speaker: 
Max Ble
Start Time: 
Thursday, September 28, 2017 - 12:30
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

Advanced Driving Assistance Systems and intelligent and autonomous vehicles are promising solutions for enhancing road safety, easing traffic issues, and improving passengers' comfort. Such applications require advanced computer vision algorithms that demand powerful computers with high-speed processing capabilities. Keeping intelligent vehicles on the road until they reach their destination remains, in some cases, a great challenge, particularly when driving at high speed. The first principal task is robust navigation, which is often based on a vision system that acquires RGB images of the road for more advanced processing. The second task is controlling the vehicle's dynamics according to its position, speed, and direction. This paper presents an accurate and efficient algorithm for detecting road boundaries and painted lines for intelligent and autonomous vehicles. It combines the Hough transform, to initialize the algorithm whenever needed, with the Canny edge detector, the least-squares method, and a Kalman filter to minimize the adaptive region of interest and to predict the future location of road boundaries and line parameters. The scenarios are simulated on the Pro-SiVIC simulator provided by Civitec, a realistic simulator of vehicle dynamics, road infrastructures, and sensor behavior, together with an OPAL-RT product dedicated to real-time processing and parallel computing.
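One small piece of such a pipeline, fitting a lane-boundary line to detected edge points by least squares, looks like this (an isolated sketch, not the paper's code):

```python
def fit_line(points):
    """Least-squares fit of y = m*x + b to candidate lane-edge pixels."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - m * sx) / n
    return m, b
```

In the pipeline described above, fitted line parameters like these would then feed a Kalman filter that predicts the next frame's line position, which in turn shrinks the region of interest the edge detector has to search.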


SLURM: Simple Linux Utility for Resource Management

Speaker: 
Anmar
Start Time: 
Tuesday, September 19, 2017 - 13:45
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

A new cluster resource management system called Simple Linux Utility for Resource Management (SLURM) is described in this paper. SLURM, initially developed for large Linux clusters at the Lawrence Livermore National Laboratory (LLNL), is a simple cluster manager that can scale to thousands of processors. SLURM is designed to be flexible and fault-tolerant and can be ported to other clusters of different size and architecture with minimal effort. We are certain that SLURM will benefit both users and system architects by providing them with a simple, robust, and highly scalable parallel job execution environment for their cluster system.


Self-normalizing Neural Networks

Speaker: 
Mitch Solomon
Start Time: 
Thursday, September 7, 2017 - 12:30
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

Deep Learning has revolutionized vision via convolutional neural networks (CNNs) and natural language processing via recurrent neural networks (RNNs). However, success stories of Deep Learning with standard feed-forward neural networks (FNNs) are rare. FNNs that perform well are typically shallow and therefore cannot exploit many levels of abstract representations. We introduce self-normalizing neural networks (SNNs) to enable high-level abstract representations. While batch normalization requires explicit normalization, neuron activations of SNNs automatically converge towards zero mean and unit variance. The activation function of SNNs is the "scaled exponential linear unit" (SELU), which induces self-normalizing properties. Using the Banach fixed-point theorem, we prove that activations close to zero mean and unit variance that are propagated through many network layers will converge towards zero mean and unit variance, even in the presence of noise and perturbations. This convergence property of SNNs makes it possible to (1) train deep networks with many layers, (2) employ strong regularization schemes, and (3) make learning highly robust. Furthermore, for activations not close to unit variance, we prove an upper and lower bound on the variance; thus, vanishing and exploding gradients are impossible. We compared SNNs on (a) 121 tasks from the UCI machine learning repository, (b) drug discovery benchmarks, and (c) astronomy tasks with standard FNNs and other machine learning methods such as random forests and support vector machines. For FNNs we considered (i) ReLU networks without normalization, (ii) batch normalization, (iii) layer normalization, (iv) weight normalization, (v) highway networks, and (vi) residual networks. SNNs significantly outperformed all competing FNN methods on the 121 UCI tasks, outperformed all competing methods on the Tox21 dataset, and set a new record on an astronomy data set. The winning SNN architectures are often very deep.
Implementations are available at: github.com/bioinf-jku/SNNs.
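The SELU itself, with the paper's published constants, plus a quick numerical check of the self-normalizing behavior (a toy re-implementation written for this summary; the official code is at the repository above):

```python
import math
import random

# Constants from Klambauer et al. (2017), chosen so that N(0, 1) inputs
# are mapped to outputs with zero mean and unit variance.
ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit."""
    return LAMBDA * (x if x > 0 else ALPHA * (math.exp(x) - 1.0))

def propagate(units, depth, rng):
    """Push standard-normal activations through `depth` random dense layers
    (weights ~ N(0, 1/units)) with SELU activations; the activations should
    stay near zero mean and unit variance, layer after layer."""
    x = [rng.gauss(0.0, 1.0) for _ in range(units)]
    scale = 1.0 / math.sqrt(units)
    for _ in range(depth):
        x = [selu(sum(rng.gauss(0.0, scale) * xj for xj in x))
             for _ in range(units)]
    return x
```

The attracting fixed point at zero mean and unit variance is what replaces explicit batch normalization in an SNN.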


GPU-SM: Shared Memory Multi-GPU Programming

Speaker: 
Anmar Salih
Start Time: 
Thursday, March 30, 2017 - 14:00
Location: 
F.W. Olin Engineering Complex, Room 313
Abstract: 

Discrete GPUs in modern multi-GPU systems can transparently access each other's memories through the PCIe interconnect. Future systems will improve this capability by including better GPU interconnects such as NVLink. However, remote memory access across GPUs has gone largely unnoticed among programmers, and multi-GPU systems are still programmed like distributed systems in which each GPU only accesses its own memory. This increases the complexity of the host code, as programmers need to explicitly communicate data across GPU memories. In this paper we present GPU-SM, a set of guidelines to program multi-GPU systems like NUMA shared memory systems with minimal performance overheads. Using GPU-SM, data structures can be decomposed across several GPU memories, and data that resides on a different GPU is accessed remotely through the PCIe interconnect. The programmability benefits of the shared-memory model on GPUs are shown using finite difference and image filtering applications. We also present a detailed performance analysis of the PCIe interconnect and the impact of remote accesses on kernel performance. While PCIe imposes long latency and has limited bandwidth compared to local GPU memory, we show that the highly multithreaded GPU execution model can help reduce its costs. Evaluation of the finite difference and image filtering GPU-SM implementations shows close to linear speedups on a system with 4 GPUs, with much simpler code than the original implementations (e.g., a 40% SLOC reduction in the host code of finite difference).

