Upwork is hiring an Applied mathematics - update agent policy without rewards function

Applied mathematics - update agent policy without rewards function

Upwork  ·  US  ·  $15k/yr - $42k/yr
over 1 year ago

This job involves mathematics and reinforcement learning. We need to update an agent's policy without using a reward function. We are currently working from two papers: one covers updating the policy with a reward function (the reward-rational paper), and the other covers updating it without a reward function (the contrastive preference learning paper, CPL). Our goal is to rewrite the equation for each feedback type in the reward-rational paper in the style of the CPL paper, omitting the reward function. I have already started working on it: in the third attached paper (formalism and feedback), I have stated each feedback type and its equation with the reward function; these equations need to be rewritten without the reward function.
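
To illustrate the kind of rewrite being asked for, here is a minimal sketch for the pairwise-comparison feedback type only. It assumes the maximum-entropy setting used in the CPL paper, where the optimal advantage can be written as alpha * log pi(a|s), so a Boltzmann comparison likelihood over segment returns can be restated over discounted sums of policy log-probabilities. The function names, the default values of alpha and gamma, and the example numbers are hypothetical and are not taken from the attached papers.

```python
import math


def discounted_log_policy_score(log_probs, alpha=0.1, gamma=1.0):
    """Return sum_t gamma^t * alpha * log pi(a_t | s_t) for one segment.

    log_probs holds log pi(a_t | s_t) for each step of the segment.
    alpha (temperature) and gamma (discount) are assumed hyperparameters.
    """
    return sum((gamma ** t) * alpha * lp for t, lp in enumerate(log_probs))


def preference_probability(log_probs_preferred, log_probs_rejected,
                           alpha=0.1, gamma=1.0):
    """Probability that the first segment is preferred over the second.

    Equivalent to exp(s_pos) / (exp(s_pos) + exp(s_neg)), where each score
    is built from the policy alone; no reward function appears.
    """
    s_pos = discounted_log_policy_score(log_probs_preferred, alpha, gamma)
    s_neg = discounted_log_policy_score(log_probs_rejected, alpha, gamma)
    diff = s_pos - s_neg
    # Logistic function of the score difference, written to avoid overflow.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def comparison_loss(log_probs_preferred, log_probs_rejected,
                    alpha=0.1, gamma=1.0):
    """Negative log-likelihood of one observed comparison."""
    p = preference_probability(log_probs_preferred, log_probs_rejected,
                               alpha, gamma)
    return -math.log(p)


# Hypothetical two-step segments: the policy assigns higher probability
# to the actions in the preferred segment than in the rejected one.
preferred = [math.log(0.6), math.log(0.7)]
rejected = [math.log(0.2), math.log(0.1)]
print(preference_probability(preferred, rejected))
print(comparison_loss(preferred, rejected))
```

Minimizing this loss with respect to the policy parameters raises the likelihood of the observed comparisons directly, which is the sense in which the reward function is omitted. The other feedback types in the formalism would each need their own analogous likelihood; this sketch does not cover them.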

Job is closed