In this code, I implement PointEnv, a point-mass environment in which an agent receives a reward for reaching a goal circle, and use the cross-entropy method to search for a policy that maximizes reward in a PointEnv. I also evaluate multiple plans in the PointEnv environment, using memoization so that a plan that has already been scored is not simulated again.
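The following is a minimal sketch of how these pieces could fit together. It is not the repository's actual code. The constructor parameters (`start`, `goal`, `goal_radius`, `max_step`), the `evaluate_plan` method, the `cross_entropy_method` function, and all hyperparameter values are assumptions made for illustration. The sketch also assumes that a plan is an open-loop sequence of 2D displacements and that the reward is 1 for every step that ends inside the goal circle.

```python
import numpy as np


class PointEnv:
    """Hypothetical point-mass environment (illustrative, not the repo's class)."""

    def __init__(self, start=(0.0, 0.0), goal=(1.0, 1.0),
                 goal_radius=0.2, max_step=0.25):
        self.start = np.asarray(start, dtype=float)
        self.goal = np.asarray(goal, dtype=float)
        self.goal_radius = goal_radius
        self.max_step = max_step
        self._cache = {}  # memo table: raw bytes of a plan -> total reward

    def step(self, pos, action):
        # Clip each action component so the point moves at most max_step per axis.
        new_pos = pos + np.clip(action, -self.max_step, self.max_step)
        # Assumed reward: 1.0 if the new position lies inside the goal circle.
        inside = np.linalg.norm(new_pos - self.goal) <= self.goal_radius
        return new_pos, float(inside)

    def evaluate_plan(self, plan):
        """Return the total reward of a plan of shape (horizon, 2), memoized."""
        plan = np.ascontiguousarray(plan, dtype=float)
        key = plan.tobytes()  # exact-match key: only identical plans hit the cache
        if key not in self._cache:
            pos, total = self.start.copy(), 0.0
            for action in plan:
                pos, reward = self.step(pos, action)
                total += reward
            self._cache[key] = total
        return self._cache[key]


def cross_entropy_method(env, horizon=10, pop_size=100, n_elite=10,
                         n_iters=20, seed=0):
    """Fit a diagonal Gaussian over plans by repeatedly refitting to the elites."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, 2))
    std = np.full((horizon, 2), 0.5)
    for _ in range(n_iters):
        # Sample a population of candidate plans from the current Gaussian.
        plans = rng.normal(mean, std, size=(pop_size, horizon, 2))
        returns = np.array([env.evaluate_plan(p) for p in plans])
        # Keep the n_elite highest-return plans and refit mean and std to them.
        elite = plans[np.argsort(returns)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-3  # small floor to avoid early collapse
    return mean


if __name__ == "__main__":
    env = PointEnv()
    best_plan = cross_entropy_method(env)
    print("return of the CEM mean plan:", env.evaluate_plan(best_plan))
```

Because the cache key is the exact byte content of the plan, memoization only helps when an identical plan is scored more than once, for example when the final mean plan or a stored elite is re-evaluated. With a sparse reward like the one assumed here, an iteration in which no sampled plan reaches the goal gives the elite selection nothing to rank on, so the result depends on the seed and the hyperparameters.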
Create a policy that maximizes reward in a PointEnv
zsarayloo/Cross-Entropy-optmization-