WIP: CRR algorithm #407

pilgrimygy · 2021-07-27T03:04:37Z

@findmyway
I've finished a simple OfflinePolicy framework, which lets us run the online algorithms on an offline dataset.
I tested discrete CRR, and it performs well.
Continuous CRR, however, performs poorly, and I don't know why yet. My guess is that the actor_loss_coef calculation is wrong.
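
For reference, here is a minimal sketch of how the actor loss coefficient is usually described for CRR. It is not the code in this PR, and it is written in Python only for illustration. The function name `crr_actor_loss_coef`, its parameters, and the default clip value of 20 are assumptions. The coefficient weights `log pi(a|s)` for a dataset action and is normally treated as a constant, so no gradient should flow through it.

```python
import math


def crr_actor_loss_coef(q_sa, q_samples, mode="exp", beta=1.0, clip=20.0):
    """Hypothetical helper: weight on log pi(a|s) for one dataset transition.

    q_sa      -- critic value Q(s, a) for the action stored in the dataset
    q_samples -- critic values Q(s, a_j) for actions a_j sampled from the
                 current policy at the same state s
    mode      -- "binary" for the indicator weight, "exp" for the
                 exponential weight
    beta      -- temperature of the exponential weight (assumed default)
    clip      -- upper bound on the exponential weight (assumed default)
    """
    # Advantage estimate: dataset-action value minus the mean value of
    # actions sampled from the current policy.
    advantage = q_sa - sum(q_samples) / len(q_samples)

    if mode == "binary":
        # Imitate the dataset action only when it looks better than the policy.
        return 1.0 if advantage > 0 else 0.0
    if mode == "exp":
        # Cap the exponent first so math.exp cannot overflow, then clip.
        exponent = min(advantage / beta, math.log(clip))
        return min(math.exp(exponent), clip)
    raise ValueError(f"unknown mode: {mode}")
```

Two things worth checking against a sketch like this in the continuous case: whether the sampled actions `a_j` come from the current policy rather than from the dataset, and whether the coefficient is excluded from the gradient.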

pilgrimygy · 2021-07-27T03:29:20Z

I also tested the distributional critic, and it performs worse than the normal critic.

pilgrimygy and others added 2 commits July 26, 2021 23:59

update

888db8d

Update

0ca0fe1

Merge branch 'JuliaReinforcementLearning:master' into framework

a6459fa

pilgrimygy closed this Aug 5, 2021
