NRC Research Associate Programs
Fellowships Office
Policy and Global Affairs



Opportunity at Air Force Research Laboratory (AFRL)

Hierarchical Learning for Decision Making and Guidance


Munitions Directorate, RW/Advanced Guidance

RO#:  13.45.03.C0532
Location:  Eglin Air Force Base, FL 32542-6810


Name:  Nivison, Scott Andrew
Phone:  850.883.0103


Reinforcement learning (RL) has recently been widely explored as a way to solve complex problems in which agents must find localized policies that maximize a specified objective in an environment. Although RL has been a well-explored topic for some time, several fundamental issues still prevent its widespread use:

1) Scalability

2) Applying knowledge to new tasks (outside training distribution)

3) Effectiveness in environments with sparse rewards

·         Agents use the rewards gained from their actions to optimize a policy. When there are too few opportunities to gain such rewards, learning may stall, or agents may be prevented from finding the best policy, because there is little difference between the consequences of taking a variety of actions.

4) Long-term credit assignment problems

·         Standard RL algorithms have difficulty working toward a long-term goal while pursuing short-term goals.

·         It is sometimes difficult to determine when a particular action was critical in achieving the goal.

·         This problem is amplified in multi-agent systems, as there are many interacting components with horizon-dependent goals.
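The sparse-reward and credit-assignment issues above can be illustrated with a minimal sketch. It assumes a hypothetical 50-step episode whose only reward arrives on the final step and a discount factor of 0.9; neither value comes from the posting, and the function name is illustrative only.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = sum_k gamma**k * r_{t+k} for every time step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Assumed sparse-reward episode: 49 steps with no reward, then a reward of 1.
rewards = [0.0] * 49 + [1.0]
returns = discounted_returns(rewards, gamma=0.9)

print(f"return seen at the last step:  {returns[-1]:.4f}")
print(f"return seen at the first step: {returns[0]:.4f}")
```

Under these assumptions the return seen at the first step is well under 1% of the final reward, and the return alone does not indicate which of the earlier actions was the critical one.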

Hierarchical reinforcement learning (HRL) is a framework in which a high-level layer optimizes a policy over “goal states” while a low-level layer chooses the primitive actions needed at each time step. This divides the problem into a set of sub-goals and a higher-level objective, which may operate on different time scales. Because both levels are optimized simultaneously, this type of architecture does not ignore the interplay between higher-level and lower-level objectives, and it can potentially reduce the dimensionality of the problem and help with scalability.
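The two-level structure described above can be sketched as follows. This is only an illustration under stated assumptions: the environment is a hypothetical one-dimensional line of integer cells, both policies are hand-coded rather than learned, and every name (`high_level_policy`, `low_level_policy`, `run_episode`, `horizon`) is invented for the example.

```python
def high_level_policy(state, final_goal, horizon):
    """Slow time scale: pick a sub-goal at most `horizon` cells away."""
    step = max(-horizon, min(horizon, final_goal - state))
    return state + step


def low_level_policy(state, subgoal):
    """Fast time scale: pick a primitive action (-1, 0 or +1)."""
    if subgoal > state:
        return 1
    if subgoal < state:
        return -1
    return 0


def run_episode(start, final_goal, horizon=3, max_steps=100):
    state, steps = start, 0
    subgoals, trajectory = [], [start]
    while state != final_goal and steps < max_steps:
        # The high level commits to one sub-goal...
        subgoal = high_level_policy(state, final_goal, horizon)
        subgoals.append(subgoal)
        # ...and the low level takes up to `horizon` primitive steps toward it.
        for _ in range(horizon):
            if state == subgoal or steps >= max_steps:
                break
            state += low_level_policy(state, subgoal)
            trajectory.append(state)
            steps += 1
    return subgoals, trajectory


subgoals, trajectory = run_episode(start=0, final_goal=7)
print(subgoals)    # [3, 6, 7]
print(trajectory)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Here the high level makes three decisions while the low level makes seven, which is the sense in which the two layers can run on different time scales. In an actual HRL method both policies would be learned, which this sketch does not attempt.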

This position will investigate and develop multi-agent algorithms that target some of the RL challenges listed above, using adaptive control or reinforcement learning approaches. The algorithms developed will move toward a hierarchical framework in order to solve multi-faceted optimal control problems without ignoring the interplay between the higher-level and lower-level objectives.

Reinforcement learning; Multi-agent systems; Hierarchical learning; Adaptive control; Machine learning


Citizenship:  Open to U.S. citizens
Level:  Open to Postdoctoral and Senior applicants


Base Stipend:  $76,542.00
Travel Allotment:  $4,000.00
Supplementation:  $3,000 Supplement for Doctorates in Engineering & Computer Science

Experience Supplement:
Postdoctoral and Senior Associates will receive an appropriately higher stipend based on the number of years of experience past their PhD.

Copyright © 2022. National Academy of Sciences. All rights reserved.