Dinu Marius-Constantin

Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

Marius-Constantin Dinu

30. September 2020 0 comments General Credit Assignment Problem, Learning from Demonstrations, Reinforcement Learning, Reward Redistribution, RUDDER
Reinforcement Learning algorithms require a large number of samples to solve complex tasks with sparse and delayed rewards. Complex tasks can often be hierarchically decomposed into sub-tasks. A step in the Q-function can be associated with solving a sub-task, where the expectation of the return increases. RUDDER was introduced to identify these steps and then redistribute reward to them, thus giving reward immediately when sub-tasks are solved. Since the problem of delayed rewards is mitigated, learning is considerably sped up. However, for complex tasks, current exploration strategies as deployed in RUDDER struggle to discover episodes with high rewards. Therefore, we assume that episodes with high rewards are given as demonstrations and do not have to be discovered by exploration. Typically, the number of demonstrations is small, and RUDDER's LSTM model, being a deep learning method, does not learn well from so few examples. Hence, we introduce Align-RUDDER, which is RUDDER with two major modifications. First, Align-RUDDER assumes that episodes with high rewards are given as demonstrations, replacing RUDDER's safe exploration and lessons replay buffer. Second, we replace RUDDER's LSTM model with a profile model that is obtained from multiple sequence alignment of demonstrations. As known from bioinformatics, profile models can be constructed from as few as two demonstrations. Align-RUDDER inherits the concept of reward redistribution, which considerably reduces the delay of rewards, thus speeding up learning. Align-RUDDER outperforms competitors on complex artificial tasks with delayed reward and few demonstrations. On the Minecraft ObtainDiamond task, Align-RUDDER is able to mine a diamond, though not frequently. Code is published on GitHub.
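To make the idea of reward redistribution more concrete, here is a minimal sketch in Python. It is not the published Align-RUDDER code. It assumes that some per-step score is already available (for example a return prediction or an alignment score against a profile model), takes the difference of consecutive scores, and rescales the differences so that they sum to the original episode return. The function name `redistribute_reward` and the example scores are hypothetical.

```python
from typing import List, Sequence


def redistribute_reward(scores: Sequence[float], episode_return: float) -> List[float]:
    """Spread a delayed episode return over the steps where the score increases.

    scores[t] is an assumed estimate of how promising the episode looks after
    step t. How this score is obtained is outside the scope of this sketch.
    """
    if not scores:
        return []

    # Difference of consecutive scores; the score before the first step is taken as 0.
    previous = 0.0
    differences = []
    for score in scores:
        differences.append(score - previous)
        previous = score

    total = sum(differences)  # telescopes to scores[-1]
    if total == 0:
        # No signal in the scores: keep the whole return at the last step.
        return [0.0] * (len(scores) - 1) + [float(episode_return)]

    # Rescale so that the redistributed rewards sum to the original return.
    scale = episode_return / total
    return [d * scale for d in differences]


# Two steps raise the score, so each of them receives half of the return.
scores = [0.0, 0.0, 2.0, 2.0, 4.0]
rewards = redistribute_reward(scores, episode_return=10.0)
print(rewards)  # [0.0, 0.0, 5.0, 0.0, 5.0]
```

In this toy episode the reward that originally arrived only at the end is moved to the third and fifth step, which is where the score jumps, i.e. where a sub-task is assumed to have been solved.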

Overcoming Catastrophic Forgetting with Context-Dependent Activations (XdA) and Synaptic Stabilization

Marius-Constantin Dinu

25. November 2019 0 comments General
Abstract: Overcoming catastrophic forgetting in neural networks is crucial to solving continuous learning problems. Deep Reinforcement Learning uses neural networks to predict actions according to the current state of an environment. In a dynamic environment, robust and adaptive life-long learning algorithms are the cornerstone of success. In this thesis, we examine an extensive subset of algorithms that counter catastrophic forgetting in neural networks and reflect on their strengths and weaknesses. Furthermore, we present an enhanced alternative to promising synaptic stabilization methods, such as Elastic Weight Consolidation or Synaptic Intelligence. Our method uses context-based information to switch between different pathways throughout the neural network, reducing destructive activation interference during the forward pass and destructive weight updates during the backward pass. We call this method Context-Dependent Activations (XdA). We show that XdA-enhanced methods outperform basic synaptic stabilization methods and are a better choice for long task sequences.
Thesis link
GitHub link
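The two ingredients named in the abstract can be illustrated with a small Python sketch. The first function is the quadratic penalty commonly used by Elastic Weight Consolidation, which pulls each weight back towards its value after the previous task in proportion to an importance estimate. The second function is only an assumption of what context-based pathway switching could look like: a fixed binary mask per context that silences some hidden units. The thesis may implement XdA differently; the names `ewc_penalty` and `gate_activations` and all values are hypothetical.

```python
from typing import List, Sequence


def ewc_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    importance: Sequence[float],
    strength: float,
) -> float:
    """EWC-style penalty: (strength / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, importance)
    )


def gate_activations(activations: Sequence[float], context_mask: Sequence[bool]) -> List[float]:
    """Keep only the hidden units assigned to the active context (assumed gating scheme)."""
    return [a if keep else 0.0 for a, keep in zip(activations, context_mask)]


# The first weight moved away from its anchor and is important, so it is penalised.
penalty = ewc_penalty(
    params=[1.0, 2.0],
    anchor_params=[0.0, 2.0],
    importance=[4.0, 1.0],
    strength=2.0,
)
print(penalty)  # 4.0

# Under this context only the first and third hidden unit stay active.
gated = gate_activations([0.5, 1.5, 2.0], [True, False, True])
print(gated)  # [0.5, 0.0, 2.0]
```

Because gated units output zero, they contribute nothing to the forward pass for that context, and in a network trained with backpropagation the weights feeding only silenced units would receive no gradient from it. This is the intuition behind reducing interference in both passes.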

Imagine Kara

Marius-Constantin Dinu

9. May 2018 0 comments General

Imagine Kara was one of the most exciting projects I have been able to work on, and the team worked exceptionally hard and with great dedication to make it a reality! Sadly, as life goes on, some projects come and some go to enable other ideas to flourish.

I want to thank you all for your dedication, hard work and support in all aspects during this amazing experience and I wish you all the best on your new paths.

Until next time!

Yours, Marius

Seamlessly entering the crypto world with Apollon

Marius-Constantin Dinu

25. April 2018 0 comments General

This post provides an overview of the Apollon project and shows how you can get started with masternodes.

How I earned almost one Bitcoin with masternodes in three weeks and how you can do the same!

Marius-Constantin Dinu

20. April 2018 0 comments General

Apollon is an amazing blockchain project that enables everyone to easily host and maintain masternodes. In Google Trends, it is already competing with search terms such as EOS, NEO and Cardano, which are in the top 10 list of coinmarketcap.com.

Uni Swift Project

Marius-Constantin Dinu

15. May 2017 0 comments General

This project is a mobile application for managing your images. Where most applications start to struggle when it comes to finding the right images for the right moments, here you simply find them.

 

Deep Learning Script

Marius-Constantin Dinu

19. April 2017 0 comments General

NVIDIA DIGITS offers great support for experimenting with Deep Learning and integrates well with Caffe Script.

To improve this experience, I developed a DSL for Caffe that eases the prototyping of network architectures by drastically reducing the number of lines of code and simplifying development.

All the results are available on GitHub.

The project offers a Visual Studio Code Extension on the Marketplace, which provides a transpiler from the Deep Learning Script to Caffe Script.

Deep Learning

Marius-Constantin Dinu

26. November 2016 0 comments Education

Hi,

In this post I have added two PDF files with important information and links related to the broad topic “Deep Learning”.

These should give you some guidance on where to start and how to dig deeper.

Good luck and have fun!

Deep Learning Overview

Using docker for Deep Learning

Operation Phrike

Marius-Constantin Dinu

7. September 2016 0 comments Education Kinect, Myo, Oculus Rift, Unreal Engine, Virtual Reality

Operation Phrike was created by students from the University of Applied Sciences Upper-Austria (Bachelor Software Engineering). It is simulation software for military virtual reality combat scenarios. To create an authentic battlefield experience, we use the Oculus Rift in combination with the Unreal Engine. During the simulation, the test subject, usually a military representative, is monitored by several sensors to detect their stress level. The collected data is analyzed to extract information about how a human reacts in certain war-zone situations.

Download

Operation Phrike (Release V1.0)

More

GitHub Repository

System Requirements

User Manual

Presentation

Internship Report

Marius-Constantin Dinu

31. August 2016 0 comments Education Neural Networks, Support Vector Machine, Xamarin

Developing a Xamarin App for Handwritten Character Recognition using a Neural Network

Company: Siemens Corporation Corporate Technology
Institute: University of Applied Sciences Upper-Austria
Field of Study: Software Engineering
Author: Dinu Marius-Constantin

 

Prelude

This internship report gives an overview of my experiences at Siemens Corporation Corporate Technology developing an application that uses a Neural Network for classification. It also compares the results with a Support Vector Machine implementation and summarizes the overall results.

Thesis

Internship Report



© Dinu Marius-Constantin 2021