Playing a Game in the Google Research Football Environment

The main purpose of the Google Research Football environment is training agents, but you can also simply play the game.

The Google Research Football Environment (GRF) documentation describes three ways to play.

1: Play vs built-in AI

By default, it starts the base scenario and the left player is controlled by the keyboard. Different types of players are supported (gamepad, external bots, agents…).

For the key bindings, see the "Keyboard mappings" section below.

Let's play the default scenario.

python3 -m gfootball.play_game --action_set=full "keyboard:left_players=1;"

(On WisteriaHill's PC it runs too slowly for me to confirm that it works... I wonder how it behaves on a faster machine?)

For possible options:

python3 -m gfootball.play_game -helpfull

See "Setting up the Google Research Football Environment on Ubuntu 18.04".

2: Play vs pre-trained agent

In particular, one can play against an agent trained with the run_ppo2 script using the following command (notice there is no action_set flag, as the PPO agent uses the default action set):

$YOUR_PATH is the path to a checkpoint you trained yourself.

python3 -m gfootball.play_game --players "keyboard:left_players=1;ppo2_cnn:right_players=1,checkpoint=$YOUR_PATH"
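The --players value is a semicolon-separated list in which each entry is a player type followed by comma-separated key=value options. As a rough sketch, the same command could be assembled in Python. The helper names player_spec and players_flag are hypothetical (they are not part of gfootball), and the checkpoint path is a placeholder.

```python
import shlex


def player_spec(kind, **options):
    """Format one player entry, e.g. 'keyboard:left_players=1'."""
    if not options:
        return kind
    opts = ",".join(f"{key}={value}" for key, value in options.items())
    return f"{kind}:{opts}"


def players_flag(*specs):
    """Join player entries with ';', the separator --players uses."""
    return ";".join(specs)


# Placeholder path (an assumption); substitute the checkpoint you trained.
checkpoint = "/path/to/checkpoint"

flag = players_flag(
    player_spec("keyboard", left_players=1),
    player_spec("ppo2_cnn", right_players=1, checkpoint=checkpoint),
)
argv = ["python3", "-m", "gfootball.play_game", "--players", flag]

# shlex.join quotes the flag value so that ';' is safe to paste into a shell.
print(shlex.join(argv))
```

Passing argv as a list to subprocess.run would avoid shell quoting altogether. The printed form is only there for copying into a terminal.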

For training, see "Setting up the Google Research Football Environment on Ubuntu 18.04".

For training with scenarios, see the "Football Academy & Future Directions" section of the following page:

Introducing Google Research Football: A Novel Reinforcement Learning Environment

Here is an excerpt.

Football Academy & Future Directions

As training agents for the full Football Benchmarks can be challenging, we also provide Football Academy, a diverse set of scenarios of varying difficulty.

This allows researchers to get the ball rolling on new research ideas, allows testing of high-level concepts (such as passing), and provides a foundation to investigate curriculum learning research ideas, where agents learn from progressively harder scenarios.

Examples of the Football Academy scenarios include settings where agents have to learn how to score against the empty goal, where they have to learn how to quickly pass between players, and where they have to learn how to execute a counter-attack. Using a simple API, researchers can further define their own scenarios and train agents to solve them.

The scenario definitions are Python files in the following directory of the Google Research Football Environment you set up, so have a look there:

~/football/gfootball/scenarios

11_vs_11_competition.py
11_vs_11_easy_stochastic.py
11_vs_11_hard_stochastic.py
11_vs_11_kaggle.py
11_vs_11_stochastic.py
1_vs_1_easy.py
5_vs_5.py
academy_3_vs_1_with_keeper.py
academy_corner.py
academy_counterattack_easy.py
academy_counterattack_hard.py
academy_empty_goal.py
academy_empty_goal_close.py
academy_pass_and_shoot_with_keeper.py
academy_run_pass_and_shoot_with_keeper.py
academy_run_to_score.py
academy_run_to_score_with_keeper.py
academy_single_goal_versus_lazy.py
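To check which scenarios your own checkout contains, a small script can list the directory. This is only a sketch: list_scenarios is a hypothetical helper, the path is the one mentioned above, and the file name without .py is assumed here to be the name that --level accepts.

```python
from pathlib import Path


def list_scenarios(scenario_dir):
    """Return (academy, other) scenario names, i.e. file names without .py."""
    names = sorted(
        path.stem
        for path in Path(scenario_dir).expanduser().glob("*.py")
        if path.name != "__init__.py"  # package marker, not a scenario
    )
    academy = [name for name in names if name.startswith("academy_")]
    other = [name for name in names if not name.startswith("academy_")]
    return academy, other


if __name__ == "__main__":
    # Adjust the path if your checkout lives somewhere else.
    academy, other = list_scenarios("~/football/gfootball/scenarios")
    print("Academy scenarios:", academy)
    print("Other scenarios:", other)
```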

3: Trained checkpoints

Google provides trained PPO checkpoints for two scenarios.

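Copy the file you downloaded with the browser into the container using the docker command.

The copy uses the container ID.

For starting a container by its ID and copying files from the host into a container, see here.

Change to the download folder first. Replace CONTAINER_ID with your own value.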

sudo docker cp 11_vs_11_easy_stochastic CONTAINER_ID:/gfootball

In order to see the checkpoints playing:

python3 -m gfootball.play_game --players "ppo2_cnn:left_players=1,policy=gfootball_impala_cnn,checkpoint=$CHECKPOINT" --level=$LEVEL

where $CHECKPOINT is the path to the downloaded checkpoint.

It also runs with $LEVEL left unset. The shell then passes an empty value, which matches the default of --level shown in the option list below (default: '').

Kickoff

Let's start it up following the example above.

sudo xhost +si:localuser:root

sudo docker start -i CONTAINER_ID

python3 -m gfootball.play_game --players "ppo2_cnn:left_players=1,policy=gfootball_impala_cnn,checkpoint=/gfootball/11_vs_11_easy_stochastic" --level=$LEVEL

If you quit partway through, a dump file is saved.

In order to train against a checkpoint, you can pass the 'extra_players' argument to the create_environment function.

For example:

extra_players='ppo2_cnn:right_players=1,policy=gfootball_impala_cnn,checkpoint=$CHECKPOINT'
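Note that Python does not expand $CHECKPOINT inside a string, so the path has to be substituted explicitly. A minimal sketch of how this might look, assuming that create_environment accepts extra_players as a list of player strings (as I understand its docstring) and that gfootball is installed in the container. The helper names checkpoint_player, env_kwargs and make_env are hypothetical.

```python
def checkpoint_player(checkpoint_path, side="right"):
    """Build the player string for a PPO checkpoint opponent."""
    return (
        f"ppo2_cnn:{side}_players=1,"
        f"policy=gfootball_impala_cnn,checkpoint={checkpoint_path}"
    )


def env_kwargs(checkpoint_path, env_name="11_vs_11_easy_stochastic"):
    """Keyword arguments for create_environment.

    Assumption: extra_players is a list of player definition strings.
    """
    return {
        "env_name": env_name,
        "extra_players": [checkpoint_player(checkpoint_path)],
    }


def make_env(checkpoint_path):
    """Create the environment; only works where gfootball is installed."""
    import gfootball.env as football_env  # imported lazily on purpose

    return football_env.create_environment(**env_kwargs(checkpoint_path))


# Usage inside the container, with the checkpoint copied earlier:
# env = make_env("/gfootball/11_vs_11_easy_stochastic")
```

Keeping the string building separate from the environment creation makes it possible to check the player definition without starting the game engine.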

Keyboard mappings

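↑: run up

↓: run down

←: run left

→: run right

S: short pass in attack mode, pressure in defense mode

A: high pass in attack mode, sliding in defense mode

D: shot in attack mode, team pressure in defense mode

W: long pass in attack mode, goalkeeper pressure in defense mode

Q: switch the active player in defense mode

C: dribble in attack mode

E: sprint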
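Because most keys do different things depending on whether your team is attacking or defending, the mapping is easiest to read as a table of (attack, defense) pairs. The following sketch just encodes the list above as data. The names MODE_KEYS, ALWAYS_KEYS and action_for are hypothetical and are not taken from gfootball.

```python
# key -> (action in attack mode, action in defense mode).
# None means the list above gives no action for that mode.
MODE_KEYS = {
    "S": ("short pass", "pressure"),
    "A": ("high pass", "sliding"),
    "D": ("shot", "team pressure"),
    "W": ("long pass", "goalkeeper pressure"),
    "Q": (None, "switch the active player"),
    "C": ("dribble", None),
}

# Keys whose action does not depend on the mode.
ALWAYS_KEYS = {
    "UP": "run up",
    "DOWN": "run down",
    "LEFT": "run left",
    "RIGHT": "run right",
    "E": "sprint",
}


def action_for(key, attacking):
    """Look up what a key does; returns None if nothing is listed."""
    key = key.upper()
    if key in ALWAYS_KEYS:
        return ALWAYS_KEYS[key]
    attack, defense = MODE_KEYS.get(key, (None, None))
    return attack if attacking else defense


print(action_for("s", attacking=True))
print(action_for("s", attacking=False))
```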

For possible options

Script allowing to play the game by multiple players.

These are the options you can use with play_game.

flags:

/gfootball/gfootball/play_game.py:
  --action_set: <default|full>: Action set
    (default: 'default')
  --level: Level to play
    (default: '')
  --players: Semicolon separated list of players, single keyboard player on the
    left by default
    (default: 'keyboard:left_players=1')
  --[no]real_time: If true, environment will slow down so humans can play.
    (default: 'true')
  --[no]render: Whether to do game rendering.
    (default: 'true')

absl.app:
  -?,--[no]help: show this help
    (default: 'false')
  --[no]helpfull: show full help
    (default: 'false')
  --[no]helpshort: show this help
    (default: 'false')
  --[no]helpxml: like --helpfull, but generates XML output
    (default: 'false')
  --[no]only_check_args: Set to true to validate args and exit.
    (default: 'false')
  --[no]pdb: Alias for --pdb_post_mortem.
    (default: 'false')
  --[no]pdb_post_mortem: Set to true to handle uncaught exceptions with PDB post
    mortem.
    (default: 'false')
  --profile_file: Dump profile information to a file (for python -m pstats).
    Implies --run_with_profiling.
  --[no]run_with_pdb: Set to true for PDB debug mode
    (default: 'false')
  --[no]run_with_profiling: Set to true for profiling the script. Execution will
    be slower, and the output format might change over time.
    (default: 'false')
  --[no]use_cprofile_for_profiling: Use cProfile instead of the profile module
    for profiling. This has no effect unless --run_with_profiling is set.
    (default: 'true')

absl.logging:
  --[no]alsologtostderr: also log to stderr?
    (default: 'false')
  --log_dir: directory to write logfiles into
    (default: '')
  --logger_levels: Specify log level of loggers. The format is a CSV list of
    `name:level`. Where `name` is the logger name used with
    `logging.getLogger()`, and `level` is a level name  (INFO, DEBUG, etc). e.g.
    `myapp.foo:INFO,other.logger:DEBUG`
    (default: '')
  --[no]logtostderr: Should only log to stderr?
    (default: 'false')
  --[no]showprefixforinfo: If False, do not prepend prefix to info messages when
    it's logged to stderr, --verbosity is set to INFO level, and python logging
    is used.
    (default: 'true')
  --stderrthreshold: log messages at this level, or more severe, to stderr in
    addition to the logfile.  Possible values are 'debug', 'info', 'warning',
    'error', and 'fatal'.  Obsoletes --alsologtostderr. Using --alsologtostderr
    cancels the effect of this flag. Please also note that this flag is subject
    to --verbosity and requires logfile not be stderr.
    (default: 'fatal')
  -v,--verbosity: Logging verbosity level. Messages logged at this level or
    lower will be included. Set to 1 for debug logging. If the flag was not set
    or supplied, the value will be changed from the default of -1 (warning) to 0
    (info) after flags are parsed.
    (default: '-1')
    (an integer)

absl.flags:
  --flagfile: Insert flag definitions from the given file into the command line.
    (default: '')
  --undefok: comma-separated list of flag names that it is okay to specify on
    the command line even if the program does not define a flag with that name.
    IMPORTANT: flags in this list that have arguments MUST use the --flag=value
    format.
    (default: '')
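The --players description above ("Semicolon separated list of players") together with the commands used earlier suggests the shape of each entry: a player type, a colon, then comma-separated key=value options. The following is a sketch of how such a value could be taken apart. It is not gfootball's actual parser, and parse_players is a hypothetical name.

```python
def parse_players(flag_value):
    """Split a --players value into (player_type, options) tuples."""
    players = []
    for entry in flag_value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate an empty entry such as a trailing ';'
        kind, _, option_text = entry.partition(":")
        options = {}
        for pair in option_text.split(","):
            if pair:
                key, _, value = pair.partition("=")
                options[key] = value  # values are kept as strings
        players.append((kind, options))
    return players


example = "keyboard:left_players=1;ppo2_cnn:right_players=1,checkpoint=/some/path"
for kind, options in parse_players(example):
    print(kind, options)
```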
