Conversation
| } | ||
|
|
||
| int nle_done2() { | ||
| pline("You can't quit now, you're having so much fun!"); |
nle/agent/core/vtrace.py
Outdated
| vs_t_plus_1 = torch.cat( | ||
| [vs[1:], broadcasted_bootstrap_values.unsqueeze(0)], dim=0 | ||
| ) | ||
| vs_t_plus_1 = torch.cat([vs[1:], torch.unsqueeze(bootstrap_value, 0)], dim=0) |
There was a problem hiding this comment.
What is the rationale for changing this file? Are you sure the broadcasting isn't needed?
Just wondering as this file is not original, but copied from elsewhere.
There was a problem hiding this comment.
Line 128 in 0424d29
Pulled straight out of our neurips2020release branch
There was a problem hiding this comment.
Let's leave this file as-is.
nle/scripts/play.py
Outdated
| # print('--------') | ||
| # env.render(render_mode) | ||
| # print('--------') | ||
| # print('\033[31A') # Go up 31 lines. |
There was a problem hiding this comment.
Are these lines supposed to be commented out? (also the ones below)
There was a problem hiding this comment.
will adjust these
|
Although our CI is still down (and should be put up again asap) can confirm the code is all blacked, tests all pass, and the only flake8 errors we have are line too long in baseline model... (aren't these fine?) |
|
Changes:
Needless to say I could squash restructure these commits, and maybe should for quick reversion. |
nle/env/base.py
Outdated
| "\x1b[95m", | ||
| "\x1b[96m", | ||
| "\x1b[97m", | ||
| "\033[30m", |
There was a problem hiding this comment.
What is this? Why do we have this constant?
nle/env/base.py
Outdated
| observation, done, exceptions=True | ||
| ) | ||
| if ( | ||
| not self._allow_all_modes or observation[self._program_state_index][0] == 1 |
There was a problem hiding this comment.
What does this do? Doesn't seem very readable to me.
There was a problem hiding this comment.
This checks if death the game is over. will improve clarity
nle/scripts/play.py
Outdated
| print("--------") | ||
| env.render(render_mode) | ||
| print("--------") | ||
| if not print_all_frames: |
There was a problem hiding this comment.
That's a bit of a weird name for what it does (going back to override previous frames)?
There was a problem hiding this comment.
if you are not printing all frames, then you are going back and overwriting instead. i wanted a store_true parameter that would stop the overwriting behaviour. howabout print_frames_separately?
nle/scripts/ttyplay.py
Outdated
| import struct | ||
| import termios | ||
| import time | ||
| from nle.nethack.actions import _ACTIONS_DICT |
There was a problem hiding this comment.
I don't like this as it makes ttyplay.py nethack-only all of a sudden. Better to add another script and import the relevant functions.
There was a problem hiding this comment.
do you mean nle only?
There was a problem hiding this comment.
do you mean now you cant just take the script and play any old ttyrec, away from nle?
nle/tests/test_envs.py
Outdated
| # Hack to quit. | ||
| env.env.step(nethack.M("q")) | ||
| _, reward, done, _ = env.step(env._actions.index(ord("y"))) | ||
| obs, reward, done, _ = env.step(env._actions.index(ord("y"))) |
There was a problem hiding this comment.
i was undoing a previous change to the tests a few commits back, and slipped.
There was a problem hiding this comment.
| obs, reward, done, _ = env.step(env._actions.index(ord("y"))) | |
| _, reward, done, _ = env.step(env._actions.index(ord("y"))) |
heiner
left a comment
There was a problem hiding this comment.
This polybeast version here truly is a beast, in the bad sense of the word. I'm not convinced hydra is adding anything here either, but I understand this is just a copy&paste from the agent repo.
I can see how this might not be the easiest setup to start experimenting.
cdaa283 to
f8d1ebe
Compare
|
I have rebased and restructured into 3 commits. I'm increasingly tempted to cut out the agent and move it to a branch - am happy to do this if we think this is sensible. |
This environment is derived from NetHackScore with some key differences:
* starting character is '@' (random) by default
* `perform_known_steps` is only ever called on end, meaning that
menus, yes no questions and text input are all open again
* the action space is as full as possible
Note in this change is included the slight modification of the actions
enums to prevent dangeours actions being called, and to expand to allow
numbers and +/- signs.
Also saving and option changing have been disabled.
actions to new file ttyplay2.py. Also the play file now overwrites frames instead of printing all frames separately.
f8d1ebe to
50ccf04
Compare
|
All tests passed so rebased to include latest changes. |
I think moving the agent to a branch is a good idea. |
50ccf04 to
cf9aebf
Compare
|
The agent code now lives at #163. When this is merged, I will rebase that one off merge so the versions all make sense (currently the "challenge" environment doesnt exist in that branch. |
* `tty_colours` runs 0-31 (not 0-32) * `inv_strs` runs 0-255 (not 0-127) as you can name your inventory items with meta + keypress.
|
(Also a quick drive by fix for the observation space bug spotted on Friday) |
heiner
left a comment
There was a problem hiding this comment.
Generally looks good to me.
| PREVMSG = C("p") # view recent game messages | ||
|
|
||
|
|
||
| class Command(enum.IntEnum): |
There was a problem hiding this comment.
Previously, this was "all commands" (or something), and the "RL relevant ones" we gathered in their own enum. What's the philosophy now?
There was a problem hiding this comment.
This enum wasn't really all commands because it didnt contain movement etc? I figured if we were splitting out lets at least clearly mark the very dangerous ones.
| no_render, | ||
| render_mode, | ||
| print_frames_separately, | ||
| **kwargs, |
There was a problem hiding this comment.
What do we use the **kwargs for?
There was a problem hiding this comment.
the debug flag is created but does nothing in this function because it has been used previously. however this fn is called with play(**vars(flags)), so it needs a home here. the home i thought would be clearer in **kwargs, but i almost put **_
| if channel == 1: # Input channel. | ||
| os.write( | ||
| 1, b"\033[s\033[26;0f\033[37;1mFrame %d:\033[0m " % frame | ||
| ) # Save Cursor & Jump to L26 |
There was a problem hiding this comment.
could this interfere with other saved cursors from the data stream?
There was a problem hiding this comment.
A quick google suggests, yes you can only save one cursor at a time, so yes this would interfere. On test rendering i've not seen any problems though for our ttyrec2s...
This is the branch that contains the Challenge Environment and the NeurIPS baseline we will release for it.
The key differences from
NetHackChallengetoNetHackScoreare:characeter = '@')PREV_MSGandHISTORYactions are removed (they seemed to be causing crashes)The model itself we submit as baseline is based heavily on the model in Neurips2020 paper, and posted on the branch here (
neurips2020release). Here we have taken the best model and cut out all the other elements of the model, to keep the baseline simple, and to try to keep the model to only one file.The results of this run (for rotating characters and single characters are visible here ).
My suspicion is the model has gotten a bit slower (6000 sps) so there is probably room for a speed up. We can do this after the competition has started.