3 changes: 1 addition & 2 deletions README.md
@@ -91,7 +91,7 @@ Three scripts for gradius are available to try:

- `python -m copain.commands.gradius_random_inputs` will start a visible FCEUX instance, start a 1p game, turn on autofire and then perform random directional inputs. If vic viper dies, the game resets.
- `python -m copain.commands.gradius_brute_force` will start a visible FCEUX instance, start a 1p game and initiate a range of savestates, and perform a naïve brute-force exploration of the gamestates using a die-and-retry strategy, using savestates to unlock a fast exploration of savestates, until it reaches the end of the game. At maximum emulation speed, it takes about 10 minutes to see the end boss (TODO: detection of end game not implemented yet causes an infinite loop after end of game)
-- `python -m copain.commands.gradius_brute_force`: a naïve attempt at applying q-learning to vic viper controls, reading the gamestate directly from the game RAM. Runs seemingly smoothly, but so far I got no signs of learning. Probably lacks many analysis tools, q-learning tricks, and understanding of the search space, before getting interesting results. The previous brute-force command might help creating a bank of states/transitions and savestates that would kickstart the learning process.
+- `python -m copain.commands.gradius_q_learning`: a naïve attempt at applying q-learning to vic viper controls, reading the gamestate directly from the game RAM. Runs seemingly smoothly, but so far I got no signs of learning. Probably lacks many analysis tools, q-learning tricks, and understanding of the search space, before getting interesting results. The previous brute-force command might help creating a bank of states/transitions and savestates that would kickstart the learning process.

A nice milestone would be managing to perfect-score the game using a combination of those 3 approaches.

@@ -111,7 +111,6 @@ A nice milestone would be managing to perfect-score the game using a combination
│ ├── rl.py # classes that setup the RL framework
│ ├── run.lua # lua entrypoint
│ ├── run.py # python high-level classes to start a run and a scripting loop
-│ ├── utils.lua
│ ├── utils.py
│ └── VERSION.txt
├── LICENSE
17 changes: 5 additions & 12 deletions copain/commands/gradius_brute_force.py
@@ -13,10 +13,6 @@

TMP_FOLDER = "/tmp/"

-NUM_RUNNERS = 1
-THREADED_SOCKET = True
-THREADED_REQUESTS = True

FRAME_PER_ACTION = 16
NUMBER_OF_SAVESTATES = 10
ACTIONS_PER_SAVESTATES = 12
@@ -108,7 +104,7 @@ def __init__(
self.nb_preferred_positions = len(preferred_positions)
self.nb_fails_before_position_change = nb_fails_before_position_change

-def __call__(self, handler, run_metadata):
+def __call__(self, handler):
self._register_handler(handler)

# temporally ordered savestates
@@ -326,15 +322,12 @@ def gradius_loop_fn_init():
rom_path=ROM_PATH,
rom_hash=ROM_HASH,
loop_fn_init=gradius_loop_fn_init,
-threaded_socket=THREADED_SOCKET,
-threaded_requests=THREADED_REQUESTS,
-num_runners=NUM_RUNNERS,
enable_game_genie=ENABLE_GAME_GENIE,
display_fceux_gui=DISPLAY_FCEUX_GUI,
-visible_enable_sound=ENABLE_SOUND,
-visible_speedmode=SPEEDMODE,
-visible_render_sprites=RENDER_SPRITES,
-visible_render_background=RENDER_BACKGROUND,
+enable_sound=ENABLE_SOUND,
+speedmode=SPEEDMODE,
+render_sprites=RENDER_SPRITES,
+render_background=RENDER_BACKGROUND,
tmp_folder=TMP_FOLDER,
fceux_executable=FCEUX_EXECUTABLE,
).run()
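
Reviewer note: the README hunk above describes the brute-force command as a die-and-retry exploration driven by savestates. For readers unfamiliar with the idea, here is a minimal sketch of that strategy on a toy game. Everything in it (`ToyGame`, `ToyState`, `brute_force`, the three-lane corridor, the `max_fails` rollback rule) is hypothetical and made up for illustration; it does not use or reproduce the copain or FCEUX APIs, and the real script may well be organised differently.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyState:
    """Stand-in for an emulator savestate: step counter and current lane."""
    step: int
    lane: int


class ToyGame:
    """Hypothetical 3-lane corridor (NOT the copain/FCEUX API)."""
    LANES = 3

    def __init__(self, length, seed=0):
        rng = random.Random(seed)
        # One blocked lane per step; moving into it kills the player.
        self.blocked = [rng.randrange(self.LANES) for _ in range(length)]
        self.length = length
        self.state = ToyState(step=0, lane=1)

    def save(self):
        # ToyState is immutable, so the object itself is a safe snapshot.
        return self.state

    def load(self, snapshot):
        self.state = snapshot

    def act(self, move):
        """Apply a move in {-1, 0, +1}; return False if the player dies."""
        lane = min(self.LANES - 1, max(0, self.state.lane + move))
        alive = lane != self.blocked[self.state.step]
        if alive:
            self.state = ToyState(self.state.step + 1, lane)
        return alive

    def finished(self):
        return self.state.step >= self.length


def brute_force(game, actions_per_savestate=4, max_fails=20, seed=0):
    """Die-and-retry: play random moves, reload the last savestate on death."""
    rng = random.Random(seed)
    savestates = [game.save()]  # temporally ordered, oldest first
    fails = 0
    deaths = 0
    while not game.finished():
        game.load(savestates[-1])
        survived = True
        for _ in range(actions_per_savestate):
            if game.finished():
                break
            if not game.act(rng.choice((-1, 0, 1))):
                survived = False
                break
        if survived:
            # Progress made: checkpoint it and reset the failure counter.
            savestates.append(game.save())
            fails = 0
        else:
            deaths += 1
            fails += 1
            # Assumed rule: too many deaths from one savestate means it is
            # probably a bad starting point, so fall back to the previous one.
            if fails >= max_fails and len(savestates) > 1:
                savestates.pop()
                fails = 0
    return deaths, len(savestates)


if __name__ == "__main__":
    game = ToyGame(length=40)
    deaths, kept = brute_force(game)
    print(f"finished after {deaths} deaths, {kept} savestates kept")
```

The point of the sketch is only that checkpointing after each short survived burst turns one long, unlikely run into many short, cheap retries, which is presumably what makes the roughly 10 minute figure in the README achievable.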