From 0ebefaefbb79bbf4f7c7e3343f1ac1f8efbed171 Mon Sep 17 00:00:00 2001 From: Haesun Park Date: Thu, 26 Apr 2018 15:23:26 +0900 Subject: [PATCH] =?UTF-8?q?16=EC=9E=A5=20=EC=8B=A4=ED=96=89?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- 16_reinforcement_learning.ipynb | 967 +++++++++++++++---------------- images/rl/MsPacman.png | Bin 0 -> 12332 bytes images/rl/cart_pole_plot.png | Bin 0 -> 14852 bytes images/rl/preprocessing_plot.png | Bin 0 -> 62244 bytes 4 files changed, 474 insertions(+), 493 deletions(-) create mode 100644 images/rl/MsPacman.png create mode 100644 images/rl/cart_pole_plot.png create mode 100644 images/rl/preprocessing_plot.png diff --git a/16_reinforcement_learning.ipynb b/16_reinforcement_learning.ipynb index 7019c7b..08641ac 100644 --- a/16_reinforcement_learning.ipynb +++ b/16_reinforcement_learning.ipynb @@ -1,54 +1,79 @@ { "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Chapter 16 – Reinforcement Learning**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This notebook contains all the sample code and solutions to the exersices in chapter 16." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Setup" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "First, let's make sure this notebook works well in both python 2 and 3, import a few common modules, ensure MatplotLib plots figures inline and prepare a function to save the figures:" - ] - }, { "cell_type": "code", "execution_count": 1, "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPython 3.5.5\n", + "IPython 6.3.0\n", + "\n", + "numpy 1.14.2\n", + "sklearn 0.19.1\n", + "scipy 1.0.1\n", + "matplotlib 2.2.2\n", + "tensorflow 1.7.0\n" + ] + } + ], + "source": [ + "%load_ext watermark\n", + "%watermark -v -p numpy,sklearn,scipy,matplotlib,tensorflow" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**16장 – 강화 학습**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "_이 노트북은 16장에 있는 모든 샘플 코드와 연습문제 해답을 담고 있습니다._" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 설정" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "파이썬 2와 3을 모두 지원합니다. 
공통 모듈을 임포트하고 맷플롯립 그림이 노트북 안에 포함되도록 설정하고 생성한 그림을 저장하기 위한 함수를 준비합니다:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, "outputs": [], "source": [ - "# To support both python 2 and python 3\n", + "# 파이썬 2와 파이썬 3 지원\n", "from __future__ import division, print_function, unicode_literals\n", "\n", - "# Common imports\n", + "# 공통\n", "import numpy as np\n", "import os\n", "import sys\n", "\n", - "# to make this notebook's output stable across runs\n", + "# 일관된 출력을 위해 유사난수 초기화\n", "def reset_graph(seed=42):\n", " tf.reset_default_graph()\n", " tf.set_random_seed(seed)\n", " np.random.seed(seed)\n", "\n", - "# To plot pretty figures and animations\n", + "# 맷플롯립 설정\n", "%matplotlib nbagg\n", "import matplotlib\n", "import matplotlib.animation as animation\n", @@ -57,13 +82,16 @@ "plt.rcParams['xtick.labelsize'] = 12\n", "plt.rcParams['ytick.labelsize'] = 12\n", "\n", - "# Where to save the figures\n", + "# 한글출력\n", + "plt.rcParams['font.family'] = 'NanumBarunGothic'\n", + "plt.rcParams['axes.unicode_minus'] = False\n", + "\n", + "# 그림을 저장할 폴더\n", "PROJECT_ROOT_DIR = \".\"\n", "CHAPTER_ID = \"rl\"\n", "\n", "def save_fig(fig_id, tight_layout=True):\n", " path = os.path.join(PROJECT_ROOT_DIR, \"images\", CHAPTER_ID, fig_id + \".png\")\n", - " print(\"Saving figure\", fig_id)\n", " if tight_layout:\n", " plt.tight_layout()\n", " plt.savefig(path, format='png', dpi=300)" @@ -73,26 +101,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Note: there may be minor differences between the output of this notebook and the examples shown in the book. You can safely ignore these differences. They are mainly due to the fact that most of the environments provided by OpenAI gym have some randomness." 
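The `reset_graph()` helper above seeds the RNGs so that runs are repeatable. A minimal sketch of just the NumPy half of that idea (the TensorFlow graph-reset and `tf.set_random_seed` calls are omitted; `reset_seed` is a hypothetical helper, not from the notebook):

```python
import numpy as np

# Seeding the RNG makes successive runs produce identical "random" numbers,
# which is what keeps the notebook's output stable across executions.
def reset_seed(seed=42):
    np.random.seed(seed)

reset_seed()
a = np.random.rand(3)
reset_seed()
b = np.random.rand(3)
print(np.allclose(a, b))  # True: same seed, same sequence
```

The same principle extends to any other RNG in play (Python's `random`, TensorFlow), which is why `reset_graph()` seeds all of them at once.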
+ "# OpenAI 짐(gym)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# Introduction to OpenAI gym" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this notebook we will be using [OpenAI gym](https://gym.openai.com/), a great toolkit for developing and comparing Reinforcement Learning algorithms. It provides many environments for your learning *agents* to interact with. Let's start by importing `gym`:" + "이 노트북에서는 강화 학습 알고리즘을 개발하고 비교할 수 있는 훌륭한 도구인 [OpenAI 짐(gym)](https://gym.openai.com/)을 사용합니다. 짐은 *에이전트*가 학습할 수 있는 많은 환경을 제공합니다. `gym`을 임포트해 보죠:" ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -103,12 +124,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Next we will load the MsPacman environment, version 0." + "그다음 MsPacman 환경 버전 0을 로드합니다." ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -119,12 +140,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's initialize the environment by calling is `reset()` method. This returns an observation:" + "`reset()` 메서드를 호출하여 환경을 초기화합니다. 이 메서드는 하나의 관측을 반환합니다:" ] }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ @@ -135,12 +156,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Observations vary depending on the environment. In this case it is an RGB image represented as a 3D NumPy array of shape [width, height, channels] (with 3 channels: Red, Green and Blue). In other environments it may return different objects, as we will see later." + "관측은 환경마다 다릅니다. 여기에서는 [width, height, channels] 크기의 3D 넘파이 배열로 저장되어 있는 RGB 이미지입니다(채널은 3개로 빨강, 초록, 파랑입니다). 잠시 후에 보겠지만 다른 환경에서는 다른 오브젝트가 반환될 수 있습니다." 
] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 6, "metadata": {}, "outputs": [ { @@ -149,7 +170,7 @@ "(210, 160, 3)" ] }, - "execution_count": 5, + "execution_count": 6, "metadata": {}, "output_type": "execute_result" } @@ -162,12 +183,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "An environment can be visualized by calling its `render()` method, and you can pick the rendering mode (the rendering options depend on the environment). In this example we will set `mode=\"rgb_array\"` to get an image of the environment as a NumPy array:" + "환경은 `render()` 메서드를 사용하여 화면에 나타낼 수 있고 렌더링 모드를 고를 수 있습니다(렌더링 옵션은 환경마다 다릅니다). 이 경우에는 `mode=\"rgb_array\"`로 지정해서 넘파이 배열로 환경에 대한 이미지를 받겠습니다:" ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 7, "metadata": {}, "outputs": [], "source": [ @@ -178,15 +199,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's plot this image:" + "이미지를 그려보죠:" ] }, { "cell_type": "code", - "execution_count": 7, - "metadata": { - "scrolled": true - }, + "execution_count": 8, + "metadata": {}, "outputs": [ { "data": { @@ -765,7 +784,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -971,7 +990,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -979,13 +998,6 @@ }, "metadata": {}, "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Saving figure MsPacman\n" - ] } ], "source": [ @@ -1000,19 +1012,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Welcome back to the 1980s! :)" + "1980년대로 돌아오신 걸 환영합니다! 
:)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "In this environment, the rendered image is simply equal to the observation (but in many environments this is not the case):" + "이 환경에서는 렌더링된 이미지가 관측과 동일합니다(하지만 많은 경우에 그렇지 않습니다):" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, - "execution_count": 8, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(img == obs).all()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Let's create a little helper function to plot an environment:" + "환경을 그리기 위한 유틸리티 함수를 만들겠습니다:" ] }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def plot_environment(env, figsize=(5,4)):\n", - " plt.close() # or else nbagg sometimes plots in the previous cell\n", + " plt.close() # 이렇게 하지 않으면 nbagg 백엔드가 이전 그래프를 그릴 때가 있습니다\n", " plt.figure(figsize=figsize)\n", " img = env.render(mode=\"rgb_array\")\n", " plt.imshow(img)\n", @@ -1056,12 +1068,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's see how to interact with an environment. Your agent will need to select an action from an \"action space\" (the set of possible actions). Let's see what this environment's action space looks like:" + "환경을 어떻게 다루는지 보겠습니다. 에이전트는 \"행동 공간\"(가능한 행동의 모음)에서 하나의 행동을 선택합니다. 이 환경의 행동 공간은 다음과 같습니다:" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Discrete(9)" ] }, - "execution_count": 10, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "env.action_space" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
+ "`Discrete(9)`는 가능한 행동이 정수 0에서부터 8까지 있다는 의미입니다. 이는 조이스틱의 9개의 위치(0=중앙, 1=위, 2=오른쪽, 3=왼쪽, 4=아래, 5=오른쪽위, 6=왼쪽위, 7=오른쪽아래, 8=왼쪽아래)에 해당합니다." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Next we need to tell the environment which action to play, and it will compute the next step of the game. Let's go left for 110 steps, then lower left for 40 steps:" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "env.reset()\n", - "for step in range(110):\n", - " env.step(3) #left\n", - "for step in range(40):\n", - " env.step(8) #lower-left" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Where are we now?" + "그다음 환경에게 플레이할 행동을 알려주고 게임의 다음 단계를 진행시킵니다. 왼쪽으로 110번을 진행하고 왼쪽아래로 40번을 진행해 보겠습니다:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, + "outputs": [], + "source": [ + "env.reset()\n", + "for step in range(110):\n", + " env.step(3) #왼쪽\n", + "for step in range(40):\n", + " env.step(8) #왼쪽아래" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "어디에 있을까요?" 
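The `Discrete(9)` action space just described can be sketched in plain Python, without gym; `JOYSTICK` and `sample_action` below are illustrative stand-ins for the joystick mapping and `env.action_space.sample()`, not gym APIs:

```python
import random

# Joystick positions for MsPacman's Discrete(9) action space, as listed above.
JOYSTICK = {
    0: "center", 1: "up", 2: "right", 3: "left", 4: "down",
    5: "upper-right", 6: "upper-left", 7: "lower-right", 8: "lower-left",
}

def sample_action(rng, n_actions=9):
    """Stand-in for env.action_space.sample() on a Discrete(n) space."""
    return rng.randrange(n_actions)

rng = random.Random(42)
action = sample_action(rng)
print(action, JOYSTICK[action])  # an integer in 0..8 and its joystick direction
```

Passing such an integer to `env.step()` is all the environment needs; the mapping to joystick directions is internal to the Atari emulator.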
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, "outputs": [ { "data": { @@ -1695,7 +1707,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -1901,7 +1913,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -1919,12 +1931,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The `step()` function actually returns several important objects:" + "사실 `step()` 함수는 여러 개의 중요한 객체를 반환해 줍니다:" ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 14, "metadata": {}, "outputs": [], "source": [ @@ -1935,12 +1947,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The observation tells the agent what the environment looks like, as discussed earlier. This is a 210x160 RGB image:" + "앞서 본 것처럼 관측은 보이는 환경을 설명합니다. 
여기서는 210x160 RGB 이미지입니다:" ] }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(210, 160, 3)" ] }, - "execution_count": 14, + "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obs.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "The environment also tells the agent how much reward it got during the last step:" + "환경은 마지막 스텝에서 받은 보상을 알려 줍니다:" ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.0" ] }, - "execution_count": 15, + "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reward" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "When the game is over, the environment returns `done=True`:" + "게임이 종료되면 환경은 `done=True`를 반환합니다:" ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, - "execution_count": 16, + "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "done" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Finally, `info` is an environment-specific dictionary that can provide some extra information about the internal state of the environment. This is useful for debugging, but your agent should not use this information for learning (it would be cheating)." + "마지막으로 `info`는 환경의 내부 상태에 관한 추가 정보를 제공하는 환경마다 다른 딕셔너리입니다. 디버깅에는 유용하지만 에이전트는 학습을 위해서 이 정보를 사용하면 안됩니다(학습이 아니고 속이는 셈이므로)." 
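The four values returned by `step()` — observation, reward, done, info — can be summarized with a toy stand-in that mimics the shapes and keys shown above. `ToyEnv` is hypothetical and requires nothing from gym; it only illustrates the return contract:

```python
import numpy as np

class ToyEnv:
    """Illustrative stand-in for a gym environment's step() contract."""
    def __init__(self):
        self.lives = 3

    def step(self, action):
        obs = np.zeros((210, 160, 3), dtype=np.uint8)   # screen-sized observation
        reward = 0.0                      # reward earned during this step
        done = self.lives == 0            # True once the game is over
        info = {"ale.lives": self.lives}  # debug info; don't learn from it
        return obs, reward, done, info

obs, reward, done, info = ToyEnv().step(0)
print(obs.shape, reward, done, info)  # (210, 160, 3) 0.0 False {'ale.lives': 3}
```

Unpacking the tuple as `obs, reward, done, info = env.step(action)` is the idiom used throughout the notebook.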
] }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'ale.lives': 3}" ] }, - "execution_count": 17, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "info" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Let's play one full game (with 3 lives), by moving in random directions for 10 steps at a time, recording each frame:" + "10번의 스텝마다 랜덤한 방향을 선택하는 식으로 전체 게임(3개의 목숨)을 플레이하고 각 프레임을 저장해 보겠습니다:" ] }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "frames = []\n", @@ -2072,12 +2084,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now show the animation (it's a bit jittery within Jupyter):" + "이제 애니메이션으로 한번 보죠(주피터에서는 조금 지글거립니다):" ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "def update_scene(num, frames, patch):\n", " patch.set_data(frames[num])\n", " return patch,\n", "\n", "def plot_animation(frames, repeat=False, interval=40):\n", - " plt.close() # or else nbagg sometimes plots in the previous cell\n", + " plt.close() # 이렇게 하지 않으면 nbagg 백엔드가 이전 그래프를 그릴 때가 있습니다\n", " fig = plt.figure()\n", " patch = plt.imshow(frames[0])\n", " plt.axis('off')\n", @@ -2095,7 +2107,7 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 21, "metadata": {}, "outputs": [ { @@ -2675,7 +2687,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -2881,7 +2893,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "video = plot_animation(frames)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Once you have finished playing with an 
environment, you should close it to free up resources:" + "환경을 더 이상 사용하지 않으면 환경을 종료하여 자원을 반납합니다:" ] }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 22, "metadata": {}, "outputs": [], "source": [ @@ -2916,35 +2928,43 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To code our first learning agent, we will be using a simpler environment: the Cart-Pole. " + "첫 번째 에이전트를 학습시키기 위해 간단한 Cart-Pole 환경을 사용하겠습니다." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# A simple environment: the Cart-Pole" + "# 간단한 Cart-Pole 환경" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "The Cart-Pole is a very simple environment composed of a cart that can move left or right, and pole placed vertically on top of it. The agent must move the cart left or right to keep the pole upright." + "Cart-Pole은 아주 간단한 환경으로 왼쪽이나 오른쪽으로 움직일 수 있는 카트와 카트 위에 수직으로 서 있는 막대로 구성되어 있습니다. 에이전트는 카트를 왼쪽이나 오른쪽으로 움직여서 막대가 넘어지지 않도록 유지시켜야 합니다." ] }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 26, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . 
Please provide explicit dtype.\u001b[0m\n" + ] + } + ], "source": [ "env = gym.make(\"CartPole-v0\")" ] }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "obs = env.reset()" ] }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "array([ 0.0230852 , 0.03446783, -0.04351584, -0.04777627])" + "array([ 0.02573366, -0.02697052, 0.03177656, 0.02587927])" ] }, - "execution_count": 24, + "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "The observation is a 1D NumPy array composed of 4 floats: they represent the cart's horizontal position, its velocity, the angle of the pole (0 = vertical), and the angular velocity. Let's render the environment... unfortunately we need to fix an annoying rendering issue first." + "관측은 4개의 부동소수로 구성된 1D 넘파이 배열입니다. 각각 카트의 수평 위치, 속도, 막대의 각도(0=수직), 각속도를 나타냅니다. 이 환경을 렌더링하려면 먼저 몇 가지 이슈를 해결해야 합니다." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Fixing the rendering issue" + "## 렌더링 이슈 해결하기" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Some environments (including the Cart-Pole) require access to your display, which opens up a separate window, even if you specify the `rgb_array` mode. In general you can safely ignore that window. However, if Jupyter is running on a headless server (ie. without a screen) it will raise an exception. One way to avoid this is to install a fake X server like Xvfb. You can start Jupyter using the `xvfb-run` command:\n", + "일부 환경(Cart-Pole을 포함하여)은 `rgb_array` 모드를 설정하더라도 별도의 창을 띄우기 위해 디스플레이 접근이 필수적입니다. 일반적으로 이 창을 무시하면 됩니다. 주피터가 헤드리스(headless) 서버로 (즉 스크린이 없이) 실행 중이면 예외가 발생합니다. 이를 피하는 한 가지 방법은 Xvfb 같은 가짜 X 서버를 설치하는 것입니다. 
`xvfb-run` 명령을 사용해 주피터를 실행합니다:\n", "\n", " $ xvfb-run -s \"-screen 0 1400x900x24\" jupyter notebook\n", - "\n", - "If Jupyter is running on a headless server but you don't want to worry about Xvfb, then you can just use the following rendering function for the Cart-Pole:" + " \n", + "주피터가 헤드리스 서버로 실행 중이지만 Xvfb를 설치하기 번거롭다면 Cart-Pole에 대해서는 다음 렌더링 함수를 사용할 수 있습니다:" ] }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 29, "metadata": {}, "outputs": [], "source": [ @@ -3006,16 +3026,16 @@ "\n", "try:\n", " from pyglet.gl import gl_info\n", - " openai_cart_pole_rendering = True # no problem, let's use OpenAI gym's rendering function\n", + " openai_cart_pole_rendering = True # 문제없음, OpenAI 짐의 렌더링 함수를 사용합니다\n", "except Exception:\n", - " openai_cart_pole_rendering = False # probably no X server available, let's use our own rendering function\n", + " openai_cart_pole_rendering = False # 가능한 X 서버가 없다면, 자체 렌더링 함수를 사용합니다\n", "\n", "def render_cart_pole(env, obs):\n", " if openai_cart_pole_rendering:\n", - " # use OpenAI gym's rendering function\n", + " # OpenAI 짐의 렌더링 함수를 사용합니다\n", " return env.render(mode=\"rgb_array\")\n", " else:\n", - " # rendering for the cart pole environment (in case OpenAI gym can't do it)\n", + " # Cart-Pole 환경을 위한 렌더링 (OpenAI 짐이 처리할 수 없는 경우)\n", " img_w = 600\n", " img_h = 400\n", " cart_w = img_w // 12\n", @@ -3025,8 +3045,8 @@ " x_width = 2\n", " max_ang = 0.2\n", " bg_col = (255, 255, 255)\n", - " cart_col = 0x000000 # Blue Green Red\n", - " pole_col = 0x669acc # Blue Green Red\n", + " cart_col = 0x000000 # 파랑 초록 빨강\n", + " pole_col = 0x669acc # 파랑 초록 빨강\n", "\n", " pos, vel, ang, ang_vel = obs\n", " img = Image.new('RGB', (img_w, img_h), bg_col)\n", @@ -3041,7 +3061,7 @@ " return np.array(img)\n", "\n", "def plot_cart_pole(env, obs):\n", - " plt.close() # or else nbagg sometimes plots in the previous cell\n", + " plt.close() # 이렇게 하지 않으면 nbagg 백엔드가 이전 그래프를 그릴 때가 있습니다\n", " img = render_cart_pole(env, obs)\n", " 
plt.imshow(img)\n", " plt.axis(\"off\")\n", @@ -3050,7 +3070,16 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 32, + "metadata": {}, + "outputs": [], + "source": [ + "openai_cart_pole_rendering = True" + ] + }, + { + "cell_type": "code", + "execution_count": 33, "metadata": {}, "outputs": [ { @@ -3630,7 +3659,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -3836,7 +3865,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -3854,12 +3883,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now let's look at the action space:" + "행동 공간을 확인해 보겠습니다:" ] }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 34, "metadata": {}, "outputs": [ { @@ -3868,7 +3897,7 @@ "Discrete(2)" ] }, - "execution_count": 27, + "execution_count": 34, "metadata": {}, "output_type": "execute_result" } @@ -3881,12 +3910,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Yep, just two possible actions: accelerate towards the left or towards the right. Let's push the cart left until the pole falls:" + "네 딱 두 개의 행동이 있네요. 왼쪽이나 오른쪽 방향으로 가속합니다. 
막대가 넘어지기 전까지 카트를 왼쪽으로 밀어보죠:" ] }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 35, "metadata": {}, "outputs": [], "source": [ @@ -3899,7 +3928,7 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 36, "metadata": {}, "outputs": [ { @@ -4479,7 +4508,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -4685,7 +4714,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -4693,17 +4722,10 @@ }, "metadata": {}, "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Saving figure cart_pole_plot\n" - ] } ], "source": [ - "plt.close() # or else nbagg sometimes plots in the previous cell\n", + "plt.close() # 이렇게 하지 않으면 nbagg 백엔드가 이전 그래프를 그릴 때가 있습니다\n", "img = render_cart_pole(env, obs)\n", "plt.imshow(img)\n", "plt.axis(\"off\")\n", @@ -4734,12 +4756,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Notice that the game is over when the pole tilts too much, not when it actually falls. Now let's reset the environment and push the cart to right instead:" + "막대가 실제로 넘어지지 않더라도 너무 기울어지면 게임이 끝납니다. 
환경을 다시 초기화하고 이번에는 오른쪽으로 밀어보겠습니다:" ] }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "obs = env.reset()\n", @@ -4752,7 +4774,7 @@ }, { "cell_type": "code", - "execution_count": 32, + "execution_count": 38, "metadata": {}, "outputs": [ { @@ -5332,7 +5354,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -5538,7 +5560,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_cart_pole(env, obs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Looks like it's doing what we're telling it to do. Now how can we make the poll remain upright? We will need to define a _policy_ for that. This is the strategy that the agent will use to select an action at each step. It can use all the past actions and observations to decide what to do." + "지시한 대로 잘 동작하는 것 같습니다. 어떻게 막대가 똑바로 서 있게 만들 수 있을까요? 이를 위한 *정책*을 만들어야 합니다. 이 정책은 에이전트가 각 스텝에서 행동을 선택하기 위해 사용할 전략입니다. 어떤 행동을 할지 결정하기 위해 지난 행동이나 관측을 사용할 수 있습니다." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# A simple hard-coded policy" + "# 하드 코딩 정책" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Let's hard code a simple strategy: if the pole is tilting to the left, then push the cart to the left, and _vice versa_. Let's see if that works:" + "간단한 정책을 하드 코딩해 보겠습니다. 막대가 왼쪽으로 기울어지면 카트를 왼쪽으로 밀고 반대의 경우는 오른쪽으로 밉니다. 
작동이 되는지 확인해 보죠:" ] }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 39, "metadata": {}, "outputs": [], "source": [ @@ -5603,7 +5625,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 40, "metadata": {}, "outputs": [ { @@ -6183,7 +6205,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -6389,7 +6411,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -6408,57 +6430,55 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Nope, the system is unstable and after just a few wobbles, the pole ends up too tilted: game over. We will need to be smarter than that!" + "아니네요, 불안정해서 몇 번 움직이고 막대가 너무 기울어져 게임이 끝났습니다. 더 똑똑한 정책이 필요합니다!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# Neural Network Policies" + "# 신경망 정책" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Let's create a neural network that will take observations as inputs, and output the action to take for each observation. To choose an action, the network will first estimate a probability for each action, then select an action randomly according to the estimated probabilities. In the case of the Cart-Pole environment, there are just two possible actions (left or right), so we only need one output neuron: it will output the probability `p` of the action 0 (left), and of course the probability of action 1 (right) will be `1 - p`." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note: instead of using the `fully_connected()` function from the `tensorflow.contrib.layers` module (as in the book), we now use the `dense()` function from the `tf.layers` module, which did not exist when this chapter was written. This is preferable because anything in contrib may change or be deleted without notice, while `tf.layers` is part of the official API. As you will see, the code is mostly the same.\n", - "\n", - "The main differences relevant to this chapter are:\n", - "* the `_fn` suffix was removed in all the parameters that had it (for example the `activation_fn` parameter was renamed to `activation`).\n", - "* the `weights` parameter was renamed to `kernel`,\n", - "* the default activation is `None` instead of `tf.nn.relu`" + "관측을 입력으로 받고 각 관측에 대해 선택할 행동을 출력하는 신경망을 만들어 보겠습니다. 행동을 선택하기 위해 네트워크는 먼저 각 행동에 대한 확률을 추정하고 그다음 추정된 확률을 기반으로 랜덤하게 행동을 선택합니다. Cart-Pole 환경의 경우에는 두 개의 행동(왼쪽과 오른쪽)이 있으므로 하나의 출력 뉴런만 있으면 됩니다. 행동 0(왼쪽)에 대한 확률 `p`를 출력할 것입니다. 행동 1(오른쪽)에 대한 확률은 `1 - p`가 됩니다." ] }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 41, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "WARNING:tensorflow:From /home/haesun/anaconda3/envs/handson-ml/lib/python3.5/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.\n", + "Instructions for updating:\n", + "Use the retry module or similar alternatives.\n" + ] + } + ], "source": [ "import tensorflow as tf\n", "\n", - "# 1. Specify the network architecture\n", + "# 1. 
네트워크 구조를 설정합니다\n", "n_inputs = 4 # == env.observation_space.shape[0]\n", - "n_hidden = 4 # it's a simple task, we don't need more than this\n", - "n_outputs = 1 # only outputs the probability of accelerating left\n", + "n_hidden = 4 # 간단한 작업이므로 너무 많은 뉴런이 필요하지 않습니다\n", + "n_outputs = 1 # 왼쪽으로 가속할 확률을 출력합니다\n", "initializer = tf.contrib.layers.variance_scaling_initializer()\n", "\n", - "# 2. Build the neural network\n", + "# 2. 네트워크를 만듭니다\n", "X = tf.placeholder(tf.float32, shape=[None, n_inputs])\n", "hidden = tf.layers.dense(X, n_hidden, activation=tf.nn.elu,\n", " kernel_initializer=initializer)\n", "outputs = tf.layers.dense(hidden, n_outputs, activation=tf.nn.sigmoid,\n", " kernel_initializer=initializer)\n", "\n", - "# 3. Select a random action based on the estimated probabilities\n", + "# 3. 추정된 확률을 기반으로 랜덤하게 행동을 선택합니다\n", "p_left_and_right = tf.concat(axis=1, values=[outputs, 1 - outputs])\n", "action = tf.multinomial(tf.log(p_left_and_right), num_samples=1)\n", "\n", @@ -6469,26 +6489,26 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In this particular environment, the past actions and observations can safely be ignored, since each observation contains the environment's full state. If there were some hidden state then you may need to consider past actions and observations in order to try to infer the hidden state of the environment. For example, if the environment only revealed the position of the cart but not its velocity, you would have to consider not only the current observation but also the previous observation in order to estimate the current velocity. Another example is if the observations are noisy: you may want to use the past few observations to estimate the most likely current state. Our problem is thus as simple as can be: the current observation is noise-free and contains the environment's full state." + "이 환경은 각 관측이 환경의 모든 상태를 포함하고 있기 때문에 지난 행동과 관측은 무시해도 괜찮습니다. 숨겨진 상태가 있다면 이 정보를 추측하기 위해 이전 행동과 상태를 고려해야 합니다. 
예를 들어, 속도가 없고 카트의 위치만 있다면 현재 속도를 예측하기 위해 현재의 관측뿐만 아니라 이전 관측도 고려해야 합니다. 관측에 잡음이 있는 경우도 또 다른 예입니다. 현재 상태를 근사하게 추정하기 위해 과거 몇 개의 관측을 사용하는 것이 좋을 것입니다. 이 문제는 아주 간단해서 현재 관측에 잡음이 없고 환경의 모든 상태가 담겨 있습니다." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "You may wonder why we are picking a random action based on the probability given by the policy network, rather than just picking the action with the highest probability. This approach lets the agent find the right balance between _exploring_ new actions and _exploiting_ the actions that are known to work well. Here's an analogy: suppose you go to a restaurant for the first time, and all the dishes look equally appealing so you randomly pick one. If it turns out to be good, you can increase the probability to order it next time, but you shouldn't increase that probability to 100%, or else you will never try out the other dishes, some of which may be even better than the one you tried." + "정책 네트워크에서 만든 확률을 기반으로 가장 높은 확률을 가진 행동을 고르지 않고 왜 랜덤하게 행동을 선택하는지 궁금할 수 있습니다. 이런 방식이 에이전트가 새 행동을 *탐험*하는 것과 잘 동작하는 행동을 *이용*하는 것 사이에 균형을 맞추게 합니다. 어떤 레스토랑에 처음 방문했다고 가정해 보죠. 모든 메뉴에 대한 선호도가 동일하므로 랜덤하게 하나를 고릅니다. 이 메뉴가 맛이 좋았다면 다음에 이를 주문할 가능성을 높일 것입니다. 하지만 100% 확률이 되어서는 안됩니다. 그렇지 않으면 다른 메뉴를 전혀 선택하지 않게 되고 더 좋을 수 있는 메뉴를 시도해 보지 못하게 됩니다." 
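The explore/exploit sampling idea described above can be sketched without TensorFlow: draw the action from the policy's probability instead of always taking the argmax, so the less likely action still gets tried. Here `p_left = 0.7` is an assumed network output, for illustration only:

```python
import random

def sample_action(p_left, rng):
    """Return 0 (left) with probability p_left, else 1 (right)."""
    return 0 if rng.random() < p_left else 1

rng = random.Random(42)
p_left = 0.7   # assumed policy-network output
counts = [0, 0]
for _ in range(10_000):
    counts[sample_action(p_left, rng)] += 1
# Action 0 dominates (~70% of draws), but action 1 keeps being explored.
print(counts)
```

This is exactly what `tf.multinomial(tf.log(p_left_and_right), num_samples=1)` does in the notebook's policy network, just in graph form.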
] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Let's randomly initialize this policy neural network and use it to play one game:" + "정책 신경망을 랜덤하게 초기화하고 게임 하나를 플레이해 보겠습니다:" ] }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 42, "metadata": {}, "outputs": [], "source": [ @@ -6513,12 +6533,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Now let's look at how well this randomly initialized policy network performed:" + "랜덤하게 초기화한 정책 네트워크가 얼마나 잘 동작하는지 확인해 보겠습니다:" ] }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 43, "metadata": {}, "outputs": [ { @@ -7098,7 +7118,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -7304,7 +7324,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -7323,12 +7343,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Yeah... pretty bad. The neural network will have to learn to do better. First let's see if it is capable of learning the basic policy we used earlier: go left if the pole is tilting left, and go right if it is tilting right. The following code defines the same neural network but we add the target probabilities `y`, and the training operations (`cross_entropy`, `optimizer` and `training_op`):" + "음.. 별로 좋지 않네요. 신경망이 더 잘 학습되어야 합니다. 먼저 앞서 사용한 기본 정책을 학습할 수 있는지 확인해 보겠습니다. 막대가 왼쪽으로 기울어지면 왼쪽으로 움직이고 오른쪽으로 기울어지면 오른쪽으로 이동해야 합니다. 
다음 코드는 같은 신경망이지만 타깃 확률 `y`와 훈련 연산(`cross_entropy`, `optimizer`, `training_op`)을 추가했습니다:" ] }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 44, "metadata": {}, "outputs": [], "source": [ @@ -7349,7 +7369,7 @@ "\n", "hidden = tf.layers.dense(X, n_hidden, activation=tf.nn.elu, kernel_initializer=initializer)\n", "logits = tf.layers.dense(hidden, n_outputs)\n", - "outputs = tf.nn.sigmoid(logits) # probability of action 0 (left)\n", + "outputs = tf.nn.sigmoid(logits) # 행동 0(왼쪽)에 대한 확률\n", "p_left_and_right = tf.concat(axis=1, values=[outputs, 1 - outputs])\n", "action = tf.multinomial(tf.log(p_left_and_right), num_samples=1)\n", "\n", @@ -7365,14 +7385,31 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We can make the same net play in 10 different environments in parallel, and train for 1000 iterations. We also reset environments when they are done." + "동일한 네트워크를 동시에 10개의 다른 환경에서 플레이하고 1,000번 반복동안 훈련시키겠습니다. 완료되면 환경을 리셋합니다." ] }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 45, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . 
Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n" + ] + } + ], "source": [ "n_environments = 10\n", "n_iterations = 1000\n", @@ -7383,7 +7420,7 @@ "with tf.Session() as sess:\n", " init.run()\n", " for iteration in range(n_iterations):\n", - " target_probas = np.array([([1.] if obs[2] < 0 else [0.]) for obs in observations]) # if angle<0 we want proba(left)=1., or else proba(left)=0.\n", + " target_probas = np.array([([1.] if obs[2] < 0 else [0.]) for obs in observations]) # angle<0 이면 proba(left)=1. 이 되어야 하고 그렇지 않으면 proba(left)=0. 이 되어야 합니다\n", " action_val, _ = sess.run([action, training_op], feed_dict={X: np.array(observations), y: target_probas})\n", " for env_index, env in enumerate(envs):\n", " obs, reward, done, info = env.step(action_val[env_index][0])\n", @@ -7396,7 +7433,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 46, "metadata": {}, "outputs": [], "source": [ @@ -7419,13 +7456,14 @@ }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . 
Please provide explicit dtype.\u001b[0m\n", "INFO:tensorflow:Restoring parameters from ./my_policy_net_basic.ckpt\n" ] }, @@ -8006,7 +8044,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -8212,7 +8250,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -8232,28 +8270,28 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Looks like it learned the policy correctly. Now let's see if it can learn a better policy on its own." + "정책을 잘 학습한 것 같네요. 이제 스스로 더 나은 정책을 학습할 수 있는지 알아 보겠습니다." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "# Policy Gradients" + "# 정책 그래디언트" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "To train this neural network we will need to define the target probabilities `y`. If an action is good we should increase its probability, and conversely if it is bad we should reduce it. But how do we know whether an action is good or bad? The problem is that most actions have delayed effects, so when you win or lose points in a game, it is not clear which actions contributed to this result: was it just the last action? Or the last 10? Or just one action 50 steps earlier? This is called the _credit assignment problem_.\n", + "신경망을 훈련하기 위해 타깃 확률 `y`를 정의할 필요가 있습니다. 행동이 좋다면 이 확률을 증가시켜야 하고 반대로 나쁘면 이를 감소시켜야 합니다. 하지만 행동이 좋은지 나쁜지 어떻게 알 수 있을까요? 대부분의 행동으로 인한 영향은 뒤늦게 나타나는 것이 문제입니다. 게임에서 이기거나 질 때 어떤 행동이 이런 결과에 영향을 미쳤는지 명확하지 않습니다. 마지막 행동일까요? 아니면 마지막 10개의 행동일까요? 아니면 50번 스텝 앞의 행동일까요? 
이를 *신용 할당 문제*라고 합니다.\n", "\n", - "The _Policy Gradients_ algorithm tackles this problem by first playing multiple games, then making the actions in good games slightly more likely, while actions in bad games are made slightly less likely. First we play, then we go back and think about what we did." + "*정책 그래디언트* 알고리즘은 먼저 여러 번 게임을 플레이하고, 성공한 게임에서의 행동은 선택될 가능성을 조금 더 높이고 실패한 게임에서의 행동은 조금 더 낮추는 식으로 이 문제를 해결합니다. 먼저 게임을 플레이해 보고, 그다음 돌아가서 어떤 행동을 했는지 되짚어 봅니다." ] }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 48, "metadata": {}, "outputs": [], "source": [ @@ -8273,7 +8311,7 @@ "\n", "hidden = tf.layers.dense(X, n_hidden, activation=tf.nn.elu, kernel_initializer=initializer)\n", "logits = tf.layers.dense(hidden, n_outputs)\n", - "outputs = tf.nn.sigmoid(logits) # probability of action 0 (left)\n", + "outputs = tf.nn.sigmoid(logits) # 행동 0(왼쪽)에 대한 확률\n", "p_left_and_right = tf.concat(axis=1, values=[outputs, 1 - outputs])\n", "action = tf.multinomial(tf.log(p_left_and_right), num_samples=1)\n", "\n", @@ -8296,7 +8334,7 @@ }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 49, "metadata": {}, "outputs": [], "source": [ @@ -8318,7 +8356,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 50, "metadata": {}, "outputs": [ { @@ -8327,7 +8365,7 @@ "array([-22., -40., -50.])" ] }, - "execution_count": 44, + "execution_count": 50, "metadata": {}, "output_type": "execute_result" } @@ -8338,17 +8376,17 @@ }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[array([-0.28435071, -0.86597718, -1.18910299]),\n", - " array([ 1.26665318, 1.0727777 ])]" + " array([1.26665318, 1.0727777 ])]" ] }, - "execution_count": 45, + "execution_count": 51, "metadata": {}, "output_type": "execute_result" } @@ -8359,14 +8397,15 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 52, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", 
"text": [ - "Iteration: 249" + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", + "반복: 249" ] } ], @@ -8382,7 +8421,7 @@ "with tf.Session() as sess:\n", " init.run()\n", " for iteration in range(n_iterations):\n", - " print(\"\\rIteration: {}\".format(iteration), end=\"\")\n", + " print(\"\\r반복: {}\".format(iteration), end=\"\")\n", " all_rewards = []\n", " all_gradients = []\n", " for game in range(n_games_per_update):\n", @@ -8413,7 +8452,7 @@ }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 53, "metadata": {}, "outputs": [], "source": [ @@ -8422,13 +8461,14 @@ }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ + "\u001b[33mWARN: gym.spaces.Box autodetected dtype as . Please provide explicit dtype.\u001b[0m\n", "INFO:tensorflow:Restoring parameters from ./my_policy_net_pg.ckpt\n" ] }, @@ -9009,7 +9049,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -9215,7 +9255,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -9235,44 +9275,44 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Markov Chains" + "# 마르코프 연쇄" ] }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 55, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "States: 0 0 3 \n", - "States: 0 1 2 1 2 1 2 1 2 1 3 \n", - "States: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 3 \n", - "States: 0 3 \n", - "States: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 3 \n", - "States: 0 1 3 \n", - "States: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 
1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 ...\n", - "States: 0 0 3 \n", - "States: 0 0 0 1 2 1 2 1 3 \n", - "States: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 3 \n" + "상태: 0 0 3 \n", + "상태: 0 1 2 1 2 1 2 1 2 1 3 \n", + "상태: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 3 \n", + "상태: 0 3 \n", + "상태: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 3 \n", + "상태: 0 1 3 \n", + "상태: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 ...\n", + "상태: 0 0 3 \n", + "상태: 0 0 0 1 2 1 2 1 3 \n", + "상태: 0 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 3 \n" ] } ], "source": [ "transition_probabilities = [\n", - " [0.7, 0.2, 0.0, 0.1], # from s0 to s0, s1, s2, s3\n", - " [0.0, 0.0, 0.9, 0.1], # from s1 to ...\n", - " [0.0, 1.0, 0.0, 0.0], # from s2 to ...\n", - " [0.0, 0.0, 0.0, 1.0], # from s3 to ...\n", + " [0.7, 0.2, 0.0, 0.1], # s0에서 s0, s1, s2, s3으로\n", + " [0.0, 0.0, 0.9, 0.1], # s1에서 ...\n", + " [0.0, 1.0, 0.0, 0.0], # s2에서 ...\n", + " [0.0, 0.0, 0.0, 1.0], # s3에서 ...\n", " ]\n", "\n", "n_max_steps = 50\n", "\n", "def print_sequence(start_state=0):\n", " current_state = start_state\n", - " print(\"States:\", end=\" \")\n", + " print(\"상태:\", end=\" \")\n", " for step in range(n_max_steps):\n", " print(current_state, end=\" \")\n", " if current_state == 3:\n", @@ -9290,12 +9330,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Markov Decision Process" + "# 마르코프 결정 과정" ] }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 56, "metadata": {}, "outputs": [ { @@ -9303,35 +9343,35 @@ "output_type": "stream", "text": [ "policy_fire\n", - "States (+rewards): 0 (10) 0 (10) 0 1 (-50) 2 2 2 (40) 0 (10) 0 (10) 0 (10) ... Total rewards = 210\n", - "States (+rewards): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 1 (-50) 2 2 (40) 0 (10) ... Total rewards = 70\n", - "States (+rewards): 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) ... 
Total rewards = 70\n", - "States (+rewards): 0 1 (-50) 2 1 (-50) 2 (40) 0 (10) 0 1 (-50) 2 (40) 0 ... Total rewards = -10\n", - "States (+rewards): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 (10) ... Total rewards = 290\n", - "Summary: mean=121.1, std=129.333766, min=-330, max=470\n", + "상태 (+보상): 0 (10) 0 (10) 0 1 (-50) 2 2 2 (40) 0 (10) 0 (10) 0 (10) ... 전체 보상 = 210\n", + "상태 (+보상): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 1 (-50) 2 2 (40) 0 (10) ... 전체 보상 = 70\n", + "상태 (+보상): 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) ... 전체 보상 = 70\n", + "상태 (+보상): 0 1 (-50) 2 1 (-50) 2 (40) 0 (10) 0 1 (-50) 2 (40) 0 ... 전체 보상 = -10\n", + "상태 (+보상): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 (10) ... 전체 보상 = 290\n", + "요약: 평균=121.1, 표준 편차=129.333766, 최소=-330, 최대=470\n", "\n", "policy_random\n", - "States (+rewards): 0 1 (-50) 2 1 (-50) 2 (40) 0 1 (-50) 2 2 (40) 0 ... Total rewards = -60\n", - "States (+rewards): 0 (10) 0 0 0 0 0 (10) 0 0 0 (10) 0 ... Total rewards = -30\n", - "States (+rewards): 0 1 1 (-50) 2 (40) 0 0 1 1 1 1 ... Total rewards = 10\n", - "States (+rewards): 0 (10) 0 (10) 0 0 0 0 1 (-50) 2 (40) 0 0 ... Total rewards = 0\n", - "States (+rewards): 0 0 (10) 0 1 (-50) 2 (40) 0 0 0 0 (10) 0 (10) ... Total rewards = 40\n", - "Summary: mean=-22.1, std=88.152740, min=-380, max=200\n", + "상태 (+보상): 0 1 (-50) 2 1 (-50) 2 (40) 0 1 (-50) 2 2 (40) 0 ... 전체 보상 = -60\n", + "상태 (+보상): 0 (10) 0 0 0 0 0 (10) 0 0 0 (10) 0 ... 전체 보상 = -30\n", + "상태 (+보상): 0 1 1 (-50) 2 (40) 0 0 1 1 1 1 ... 전체 보상 = 10\n", + "상태 (+보상): 0 (10) 0 (10) 0 0 0 0 1 (-50) 2 (40) 0 0 ... 전체 보상 = 0\n", + "상태 (+보상): 0 0 (10) 0 1 (-50) 2 (40) 0 0 0 0 (10) 0 (10) ... 전체 보상 = 40\n", + "요약: 평균=-22.1, 표준 편차=88.152740, 최소=-380, 최대=200\n", "\n", "policy_safe\n", - "States (+rewards): 0 1 1 1 1 1 1 1 1 1 ... Total rewards = 0\n", - "States (+rewards): 0 1 1 1 1 1 1 1 1 1 ... Total rewards = 0\n", - "States (+rewards): 0 (10) 0 (10) 0 (10) 0 1 1 1 1 1 1 ... 
Total rewards = 30\n", - "States (+rewards): 0 (10) 0 1 1 1 1 1 1 1 1 ... Total rewards = 10\n", - "States (+rewards): 0 1 1 1 1 1 1 1 1 1 ... Total rewards = 0\n", - "Summary: mean=22.3, std=26.244312, min=0, max=170\n", + "상태 (+보상): 0 1 1 1 1 1 1 1 1 1 ... 전체 보상 = 0\n", + "상태 (+보상): 0 1 1 1 1 1 1 1 1 1 ... 전체 보상 = 0\n", + "상태 (+보상): 0 (10) 0 (10) 0 (10) 0 1 1 1 1 1 1 ... 전체 보상 = 30\n", + "상태 (+보상): 0 (10) 0 1 1 1 1 1 1 1 1 ... 전체 보상 = 10\n", + "상태 (+보상): 0 1 1 1 1 1 1 1 1 1 ... 전체 보상 = 0\n", + "요약: 평균=22.3, 표준 편차=26.244312, 최소=0, 최대=170\n", "\n" ] } ], "source": [ "transition_probabilities = [\n", - " [[0.7, 0.3, 0.0], [1.0, 0.0, 0.0], [0.8, 0.2, 0.0]], # in s0, if action a0 then proba 0.7 to state s0 and 0.3 to state s1, etc.\n", + " [[0.7, 0.3, 0.0], [1.0, 0.0, 0.0], [0.8, 0.2, 0.0]], # s0에서, 행동 a0이 선택되면 0.7의 확률로 상태 s0로 가고 0.3의 확률로 상태 s1로 가는 식입니다.\n", " [[0.0, 1.0, 0.0], None, [0.0, 0.0, 1.0]],\n", " [None, [0.8, 0.1, 0.1], None],\n", " ]\n", @@ -9370,7 +9410,7 @@ "def run_episode(policy, n_steps, start_state=0, display=True):\n", " env = MDPEnvironment()\n", " if display:\n", - " print(\"States (+rewards):\", end=\" \")\n", + " print(\"상태 (+보상):\", end=\" \")\n", " for step in range(n_steps):\n", " if display:\n", " if step == 10:\n", @@ -9383,7 +9423,7 @@ " if reward:\n", " print(\"({})\".format(reward), end=\" \")\n", " if display:\n", - " print(\"Total rewards =\", env.total_rewards)\n", + " print(\"전체 보상 =\", env.total_rewards)\n", " return env.total_rewards\n", "\n", "for policy in (policy_fire, policy_random, policy_safe):\n", @@ -9391,7 +9431,7 @@ " print(policy.__name__)\n", " for episode in range(1000):\n", " all_totals.append(run_episode(policy, n_steps=100, display=(episode<5)))\n", - " print(\"Summary: mean={:.1f}, std={:1f}, min={}, max={}\".format(np.mean(all_totals), np.std(all_totals), np.min(all_totals), np.max(all_totals)))\n", + " print(\"요약: 평균={:.1f}, 표준 편차={:1f}, 최소={}, 최대={}\".format(np.mean(all_totals), np.std(all_totals), 
np.min(all_totals), np.max(all_totals)))\n", " print()" ] }, @@ -9399,19 +9439,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Q-Learning" + "# Q-러닝" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Q-Learning works by watching an agent play (e.g., randomly) and gradually improving its estimates of the Q-Values. Once it has accurate Q-Value estimates (or close enough), then the optimal policy consists in choosing the action that has the highest Q-Value (i.e., the greedy policy)." + "Q-러닝은 에이전트가 플레이하는 것(가령, 랜덤하게)을 보고 점진적으로 Q-가치 추정을 향상시킵니다. 정확한 (또는 충분히 이에 가까운) Q-가치가 추정되면 최적의 정책은 가장 높은 Q-가치(즉, 그리디 정책)를 가진 행동을 선택하는 것이 됩니다." ] }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 57, "metadata": {}, "outputs": [], "source": [ @@ -9430,13 +9470,13 @@ " action = exploration_policy(env.state)\n", " state = env.state\n", " next_state, reward = env.step(action)\n", - " next_value = np.max(q_values[next_state]) # greedy policy\n", + " next_value = np.max(q_values[next_state]) # 그리디한 정책\n", " q_values[state, action] = (1-alpha)*q_values[state, action] + alpha*(reward + gamma * next_value)" ] }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 58, "metadata": {}, "outputs": [], "source": [ @@ -9446,18 +9486,18 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "array([[ 39.13508139, 38.88079412, 35.23025716],\n", - " [ 18.9117071 , -inf, 20.54567816],\n", - " [ -inf, 72.53192111, -inf]])" + "array([[39.13508139, 38.88079412, 35.23025716],\n", + " [18.9117071 , -inf, 20.54567816],\n", + " [ -inf, 72.53192111, -inf]])" ] }, - "execution_count": 53, + "execution_count": 59, "metadata": {}, "output_type": "execute_result" } @@ -9468,19 +9508,19 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 60, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "States (+rewards): 0 
(10) 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 1 (-50) 2 (40) 0 (10) ... Total rewards = 230\n", - "States (+rewards): 0 (10) 0 (10) 0 (10) 0 1 (-50) 2 2 1 (-50) 2 (40) 0 (10) ... Total rewards = 90\n", - "States (+rewards): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) ... Total rewards = 170\n", - "States (+rewards): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) ... Total rewards = 220\n", - "States (+rewards): 0 1 (-50) 2 (40) 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) ... Total rewards = -50\n", - "Summary: mean=125.6, std=127.363464, min=-290, max=500\n", + "상태 (+보상): 0 (10) 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 1 (-50) 2 (40) 0 (10) ... 전체 보상 = 230\n", + "상태 (+보상): 0 (10) 0 (10) 0 (10) 0 1 (-50) 2 2 1 (-50) 2 (40) 0 (10) ... 전체 보상 = 90\n", + "상태 (+보상): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) ... 전체 보상 = 170\n", + "상태 (+보상): 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) 0 (10) ... 전체 보상 = 220\n", + "상태 (+보상): 0 1 (-50) 2 (40) 0 (10) 0 1 (-50) 2 (40) 0 (10) 0 (10) 0 (10) ... 전체 보상 = -50\n", + "요약: 평균=125.6, 표준 편차=127.363464, 최소=-290, 최대=500\n", "\n" ] } @@ -9489,7 +9529,7 @@ "all_totals = []\n", "for episode in range(1000):\n", " all_totals.append(run_episode(optimal_policy, n_steps=100, display=(episode<5)))\n", - "print(\"Summary: mean={:.1f}, std={:1f}, min={}, max={}\".format(np.mean(all_totals), np.std(all_totals), np.min(all_totals), np.max(all_totals)))\n", + "print(\"요약: 평균={:.1f}, 표준 편차={:1f}, 최소={}, 최대={}\".format(np.mean(all_totals), np.std(all_totals), np.min(all_totals), np.max(all_totals)))\n", "print()" ] }, @@ -9497,40 +9537,19 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Learning to Play MsPacman Using the DQN Algorithm" + "# DQN 알고리즘으로 미스팩맨 게임 학습하기" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "**Warning**: Unfortunately, the first version of the book contained two important errors in this section.\n", - "\n", - "1. 
The actor DQN and critic DQN should have been named _online DQN_ and _target DQN_ respectively. Actor-critic algorithms are a distinct class of algorithms.\n", - "2. The online DQN is the one that learns and is copied to the target DQN at regular intervals. The target DQN's only role is to estimate the next state's Q-Values for each possible action. This is needed to compute the target Q-Values for training the online DQN, as shown in this equation:\n", - "\n", - "$y(s,a) = \\text{r} + \\gamma . \\underset{a'}{\\max} \\, Q_\\text{target}(s', a')$\n", - "\n", - "* $y(s,a)$ is the target Q-Value to train the online DQN for the state-action pair $(s, a)$.\n", - "* $r$ is the reward actually collected after playing action $a$ in state $s$.\n", - "* $\\gamma$ is the discount rate.\n", - "* $s'$ is the state actually reached after played action $a$ in state $s$.\n", - "* $a'$ is one of the possible actions in state $s'$.\n", - "* $Q_\\text{target}(s', a')$ is the target DQN's estimate of the Q-Value of playing action $a'$ while in state $s'$.\n", - "\n", - "I hope these errors did not affect you, and if they did, I sincerely apologize." 
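위에 적힌 타깃 Q-가치 식 $y(s,a) = r + \gamma \cdot \underset{a'}{\max} \, Q_\text{target}(s', a')$ 의 계산 과정을 NumPy로 그려 본 스케치입니다. 보상, 종료 플래그, 타깃 DQN의 출력은 모두 설명을 위해 임의로 가정한 값입니다:

```python
import numpy as np

discount_rate = 0.99

# 가상의 미니배치 (설명용으로 가정한 값입니다)
rewards = np.array([[1.0], [0.0]])           # 행동 a를 플레이하고 실제로 받은 보상 r
continues = np.array([[1.0], [0.0]])         # 에피소드가 끝난 전이는 0.0
next_q_values = np.array([[0.5, 2.0, 1.0],   # 타깃 DQN이 추정한 Q(s', a')
                          [3.0, 0.1, 0.2]])

# y = r + gamma * max_a' Q_target(s', a')
max_next_q_values = np.max(next_q_values, axis=1, keepdims=True)
y_val = rewards + continues * discount_rate * max_next_q_values

print(y_val)  # 첫 번째 타깃은 약 2.98, 종료된 전이의 타깃은 보상(0.0)만 남습니다
```

`continues`를 곱하는 덕분에 에피소드가 끝난 전이에서는 미래 가치 항이 사라지고 보상만 타깃이 됩니다.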
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Creating the MsPacman environment" + "## 미스팩맨 환경 만들기" ] }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 61, "metadata": {}, "outputs": [ { @@ -9539,7 +9558,7 @@ "(210, 160, 3)" ] }, - "execution_count": 55, + "execution_count": 61, "metadata": {}, "output_type": "execute_result" } @@ -9552,7 +9571,7 @@ }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 62, "metadata": {}, "outputs": [ { @@ -9561,7 +9580,7 @@ "Discrete(9)" ] }, - "execution_count": 56, + "execution_count": 62, "metadata": {}, "output_type": "execute_result" } @@ -9574,29 +9593,29 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Preprocessing" + "## 전처리" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Preprocessing the images is optional but greatly speeds up training." + "이미지 전처리는 선택 사항이지만 훈련 속도를 크게 높여 줍니다." ] }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 63, "metadata": {}, "outputs": [], "source": [ "mspacman_color = 210 + 164 + 74\n", "\n", "def preprocess_observation(obs):\n", - " img = obs[1:176:2, ::2] # crop and downsize\n", - " img = img.sum(axis=2) # to greyscale\n", - " img[img==mspacman_color] = 0 # Improve contrast\n", - " img = (img // 3 - 128).astype(np.int8) # normalize from -128 to 127\n", + " img = obs[1:176:2, ::2] # 자르고 크기를 줄입니다.\n", + " img = img.sum(axis=2) # 흑백 스케일로 변환합니다.\n", + " img[img==mspacman_color] = 0 # 대비를 높입니다.\n", + " img = (img // 3 - 128).astype(np.int8) # -128~127 사이로 정규화합니다.\n", " return img.reshape(88, 80, 1)\n", "\n", "img = preprocess_observation(obs)" @@ -9606,12 +9625,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Note: the `preprocess_observation()` function is slightly different from the one in the book: instead of representing pixels as 64-bit floats from -1.0 to 1.0, it represents them as signed bytes (from -128 to 127). 
The benefit is that the replay memory will take up roughly 8 times less RAM (about 6.5 GB instead of 52 GB). The reduced precision has no visible impact on training." + "노트: `preprocess_observation()` 함수가 책에 있는 것과 조금 다릅니다. 64비트 부동소수를 -1.0~1.0 사이로 나타내지 않고 부호있는 바이트(-128~127 사이)로 표현합니다. 이렇게 하는 이유는 재생 메모리가 약 8배나 적게 소모되기 때문입니다(52GB에서 6.5GB로). 정밀도를 감소시켜도 훈련에 눈에 띄는 영향은 없습니다." ] }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 66, "metadata": {}, "outputs": [ { @@ -10191,7 +10210,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -10397,7 +10416,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -10405,23 +10424,16 @@ }, "metadata": {}, "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Saving figure preprocessing_plot\n" - ] } ], "source": [ - "plt.figure(figsize=(11, 7))\n", + "plt.figure(figsize=(10, 6))\n", "plt.subplot(121)\n", - "plt.title(\"Original observation (160×210 RGB)\")\n", + "plt.title(\"원본 관측 (160×210 RGB)\")\n", "plt.imshow(obs)\n", "plt.axis(\"off\")\n", "plt.subplot(122)\n", - "plt.title(\"Preprocessed observation (88×80 greyscale)\")\n", + "plt.title(\"전처리된 관측 (88×80 그레이스케일)\")\n", "plt.imshow(img.reshape(88, 80), interpolation=\"nearest\", cmap=\"gray\")\n", "plt.axis(\"off\")\n", "save_fig(\"preprocessing_plot\")\n", @@ -10432,25 +10444,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Build DQN" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note: instead of using `tf.contrib.layers.convolution2d()` or `tf.contrib.layers.conv2d()` (as in the first version of the book), we now use the 
`tf.layers.conv2d()`, which did not exist when this chapter was written. This is preferable because anything in contrib may change or be deleted without notice, while `tf.layers` is part of the official API. As you will see, the code is mostly the same, except that the parameter names have changed slightly:\n", - "* the `num_outputs` parameter was renamed to `filters`,\n", - "* the `stride` parameter was renamed to `strides`,\n", - "* the `_fn` suffix was removed from parameter names that had it (e.g., `activation_fn` was renamed to `activation`),\n", - "* the `weights_initializer` parameter was renamed to `kernel_initializer`,\n", - "* the weights variable was renamed to `\"kernel\"` (instead of `\"weights\"`), and the biases variable was renamed from `\"biases\"` to `\"bias\"`,\n", - "* and the default `activation` is now `None` instead of `tf.nn.relu`." + "## DQN 만들기" ] }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 67, "metadata": {}, "outputs": [], "source": [ @@ -10464,14 +10463,14 @@ "conv_strides = [4, 2, 1]\n", "conv_paddings = [\"SAME\"] * 3 \n", "conv_activation = [tf.nn.relu] * 3\n", - "n_hidden_in = 64 * 11 * 10 # conv3 has 64 maps of 11x10 each\n", + "n_hidden_in = 64 * 11 * 10 # conv3은 11x10 크기의 64개의 맵을 가집니다\n", "n_hidden = 512\n", "hidden_activation = tf.nn.relu\n", - "n_outputs = env.action_space.n # 9 discrete actions are available\n", + "n_outputs = env.action_space.n # 9개의 행동이 가능합니다\n", "initializer = tf.contrib.layers.variance_scaling_initializer()\n", "\n", "def q_network(X_state, name):\n", - " prev_layer = X_state / 128.0 # scale pixel intensities to the [-1.0, 1.0] range.\n", + " prev_layer = X_state / 128.0 # 픽셀 강도를 [-1.0, 1.0] 범위로 스케일 변경합니다.\n", " with tf.variable_scope(name) as scope:\n", " for n_maps, kernel_size, strides, padding, activation in zip(\n", " conv_n_maps, conv_kernel_sizes, conv_strides,\n", @@ -10495,7 +10494,7 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 68, 
"metadata": {}, "outputs": [], "source": [ @@ -10511,7 +10510,7 @@ }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 69, "metadata": {}, "outputs": [ { @@ -10529,7 +10528,7 @@ " '/dense_1/kernel:0': }" ] }, - "execution_count": 61, + "execution_count": 69, "metadata": {}, "output_type": "execute_result" } @@ -10540,9 +10539,19 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 70, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "WARNING:tensorflow:From :8: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.\n", + "Instructions for updating:\n", + "keep_dims is deprecated, use keepdims instead\n" + ] + } + ], "source": [ "learning_rate = 0.001\n", "momentum = 0.95\n", @@ -10551,7 +10560,7 @@ " X_action = tf.placeholder(tf.int32, shape=[None])\n", " y = tf.placeholder(tf.float32, shape=[None, 1])\n", " q_value = tf.reduce_sum(online_q_values * tf.one_hot(X_action, n_outputs),\n", - " axis=1, keep_dims=True)\n", + " axis=1, keepdims=True)\n", " error = tf.abs(y - q_value)\n", " clipped_error = tf.clip_by_value(error, 0.0, 1.0)\n", " linear_error = 2 * (error - clipped_error)\n", @@ -10565,16 +10574,9 @@ "saver = tf.train.Saver()" ] }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note: in the first version of the book, the loss function was simply the squared error between the target Q-Values (`y`) and the estimated Q-Values (`q_value`). However, because the experiences are very noisy, it is better to use a quadratic loss only for small errors (below 1.0) and a linear loss (twice the absolute error) for larger errors, which is what the code above computes. This way large errors don't push the model parameters around as much. 
Note that we also tweaked some hyperparameters (using a smaller learning rate, and using Nesterov Accelerated Gradients rather than Adam optimization, since adaptive gradient algorithms may sometimes be bad, according to this [paper](https://arxiv.org/abs/1705.08292)). We also tweaked a few other hyperparameters below (a larger replay memory, longer decay for the $\\epsilon$-greedy policy, larger discount rate, less frequent copies of the online DQN to the target DQN, etc.)." - ] - }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 71, "metadata": {}, "outputs": [], "source": [ @@ -10585,7 +10587,7 @@ "\n", "def sample_memories(batch_size):\n", " indices = np.random.permutation(len(replay_memory))[:batch_size]\n", - " cols = [[], [], [], [], []] # state, action, reward, next_state, continue\n", + " cols = [[], [], [], [], []] # 상태, 행동, 보상, 다음 상태, 종료 여부\n", " for idx in indices:\n", " memory = replay_memory[idx]\n", " for col, value in zip(cols, memory):\n", @@ -10596,7 +10598,7 @@ }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 72, "metadata": {}, "outputs": [], "source": [ @@ -10607,40 +10609,40 @@ "def epsilon_greedy(q_values, step):\n", " epsilon = max(eps_min, eps_max - (eps_max-eps_min) * step/eps_decay_steps)\n", " if np.random.rand() < epsilon:\n", - " return np.random.randint(n_outputs) # random action\n", + " return np.random.randint(n_outputs) # 랜덤 행동\n", " else:\n", - " return np.argmax(q_values) # optimal action" + " return np.argmax(q_values) # 최적 행동" ] }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 73, "metadata": {}, "outputs": [], "source": [ - "n_steps = 4000000 # total number of training steps\n", - "training_start = 10000 # start training after 10,000 game iterations\n", - "training_interval = 4 # run a training step every 4 game iterations\n", - "save_steps = 1000 # save the model every 1,000 training steps\n", - "copy_steps = 10000 # copy online DQN to target DQN every 10,000 
training steps\n", + "n_steps = 4000000 # 전체 훈련 스텝 횟수\n", + "training_start = 10000 # 10,000번 게임을 반복한 후에 훈련을 시작합니다\n", + "training_interval = 4 # 4번 게임을 반복하고 훈련 스텝을 실행합니다\n", + "save_steps = 1000 # 1,000번 훈련 스텝마다 모델을 저장합니다\n", + "copy_steps = 10000 # 10,000번 훈련 스텝마다 온라인 DQN을 타깃 DQN으로 복사합니다\n", "discount_rate = 0.99\n", - "skip_start = 90 # Skip the start of every game (it's just waiting time).\n", + "skip_start = 90 # 게임의 시작 부분은 스킵합니다 (시간 낭비이므로).\n", "batch_size = 50\n", - "iteration = 0 # game iterations\n", + "iteration = 0 # 게임 반복횟수\n", "checkpoint_path = \"./my_dqn.ckpt\"\n", - "done = True # env needs to be reset" + "done = True # 환경을 리셋해야 합니다" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "A few variables for tracking progress:" + "학습 과정을 트래킹하기 위해 몇 개의 변수가 필요합니다:" ] }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 74, "metadata": {}, "outputs": [], "source": [ @@ -10654,34 +10656,32 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "And now the main training loop!" + "이제 훈련 반복 루프입니다!" 
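아래 훈련 루프가 사용하는 `epsilon_greedy()`의 선형 감쇠 스케줄만 따로 떼어 보면 다음과 같습니다. `eps_min`, `eps_max`, `eps_decay_steps` 값은 설명을 위해 가정한 것입니다:

```python
eps_min = 0.1
eps_max = 1.0
eps_decay_steps = 2_000_000  # 설명용으로 가정한 값

def epsilon_at(step):
    # eps_max에서 시작해 eps_decay_steps 동안 선형으로 줄어든 뒤 eps_min에 고정됩니다
    return max(eps_min, eps_max - (eps_max - eps_min) * step / eps_decay_steps)

print(epsilon_at(0))          # 1.0 (초반에는 거의 랜덤 행동 = 탐험)
print(epsilon_at(1_000_000))  # 약 0.55
print(epsilon_at(4_000_000))  # 0.1 (이후에는 최솟값으로 고정 = 주로 이용)
```

훈련 초반에는 재생 메모리를 다양한 경험으로 채우도록 탐험 비중을 크게 두고, 학습이 진행될수록 학습된 Q-가치를 이용하는 쪽으로 기울어지게 하는 설계입니다.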
] }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 78, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "INFO:tensorflow:Restoring parameters from ./my_dqn.ckpt\n" + "INFO:tensorflow:Restoring parameters from ./my_dqn.ckpt\n", + "반복 2427324\t훈련 스텝 603778/4000000 (15.1)%\t손실 4.245887\t평균 최대-Q 69.582249 " ] }, { - "name": "stderr", - "output_type": "stream", - "text": [ - "[2017-09-25 13:55:15,610] Restoring parameters from ./my_dqn.ckpt\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\r", - "Iteration 1270171\tTraining step 315001/4000000 (7.9)%\tLoss 2.651937\tMean Max-Q 30.964941 " + "ename": "KeyboardInterrupt", + "evalue": "", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 43\u001b[0m \u001b[0;31m# 메모리에서 샘플링하여 타깃 Q-가치를 얻기 위해 타깃 DQN을 사용합니다\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 44\u001b[0m X_state_val, X_action_val, rewards, X_next_state_val, continues = (\n\u001b[0;32m---> 45\u001b[0;31m sample_memories(batch_size))\n\u001b[0m\u001b[1;32m 46\u001b[0m next_q_values = target_q_values.eval(\n\u001b[1;32m 47\u001b[0m feed_dict={X_state: X_next_state_val})\n", + "\u001b[0;32m\u001b[0m in \u001b[0;36msample_memories\u001b[0;34m(batch_size)\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0msample_memories\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbatch_size\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mindices\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandom\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mpermutation\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mreplay_memory\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0mbatch_size\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0mcols\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;31m# 상태, 행동, 보상, 다음 상태, 종료 여부\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0midx\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mindices\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mKeyboardInterrupt\u001b[0m: " ] } ], @@ -10697,27 +10697,27 @@ " if step >= n_steps:\n", " break\n", " iteration += 1\n", - " print(\"\\rIteration {}\\tTraining step {}/{} ({:.1f})%\\tLoss {:5f}\\tMean Max-Q {:5f} \".format(\n", + " print(\"\\r반복 {}\\t훈련 스텝 {}/{} ({:.1f})%\\t손실 {:5f}\\t평균 최대-Q {:5f} \".format(\n", " iteration, step, n_steps, step * 100 / n_steps, loss_val, mean_max_q), end=\"\")\n", - " if done: # game over, start again\n", + " if done: # 게임이 종료되면 다시 시작합니다\n", " obs = env.reset()\n", - " for skip in range(skip_start): # skip the start of each game\n", + " for skip in range(skip_start): # 게임 시작 부분은 스킵합니다\n", " obs, reward, done, info = env.step(0)\n", " state = preprocess_observation(obs)\n", "\n", - " # Online DQN evaluates what to do\n", + " # 온라인 DQN이 해야할 행동을 평가합니다\n", " q_values = online_q_values.eval(feed_dict={X_state: [state]})\n", " action = epsilon_greedy(q_values, 
step)\n", "\n", - " # Online DQN plays\n", + " # 온라인 DQN으로 게임을 플레이합니다\n", " obs, reward, done, info = env.step(action)\n", " next_state = preprocess_observation(obs)\n", "\n", - " # Let's memorize what happened\n", + " # 재생 메모리에 기록합니다\n", " replay_memory.append((state, action, reward, next_state, 1.0 - done))\n", " state = next_state\n", "\n", - " # Compute statistics for tracking progress (not shown in the book)\n", + " # 트래킹을 위해 통계값을 계산합니다 (책에는 없습니다)\n", " total_max_q += q_values.max()\n", " game_length += 1\n", " if done:\n", @@ -10726,9 +10726,9 @@ " game_length = 0\n", "\n", " if iteration < training_start or iteration % training_interval != 0:\n", - " continue # only train after warmup period and at regular intervals\n", + " continue # 워밍업 시간이 지난 후에 일정 간격으로 훈련합니다\n", " \n", - " # Sample memories and use the target DQN to produce the target Q-Value\n", + " # 메모리에서 샘플링하여 타깃 Q-가치를 얻기 위해 타깃 DQN을 사용합니다\n", " X_state_val, X_action_val, rewards, X_next_state_val, continues = (\n", " sample_memories(batch_size))\n", " next_q_values = target_q_values.eval(\n", @@ -10736,15 +10736,15 @@ " max_next_q_values = np.max(next_q_values, axis=1, keepdims=True)\n", " y_val = rewards + continues * discount_rate * max_next_q_values\n", "\n", - " # Train the online DQN\n", + " # 온라인 DQN을 훈련시킵니다\n", " _, loss_val = sess.run([training_op, loss], feed_dict={\n", " X_state: X_state_val, X_action: X_action_val, y: y_val})\n", "\n", - " # Regularly copy the online DQN to the target DQN\n", + " # 온라인 DQN을 타깃 DQN으로 일정 간격마다 복사합니다\n", " if step % copy_steps == 0:\n", " copy_online_to_target.run()\n", "\n", - " # And save regularly\n", + " # 일정 간격으로 저장합니다\n", " if step % save_steps == 0:\n", " saver.save(sess, checkpoint_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "You can interrupt the cell above at any time to test your agent using the cell below. 
You can then run the cell above once again, it will load the last parameters saved and resume training." + "아래 셀에서 에이전트를 테스트하기 위해 언제든지 위의 셀을 중지할 수 있습니다. 그런 다음 다시 위의 셀을 실행하면 마지막으로 저장된 파라미터를 로드하여 훈련을 다시 시작할 것입니다." ] }, { "cell_type": "code", - "execution_count": 68, + "execution_count": 79, "metadata": {}, "outputs": [ { @@ -10767,13 +10767,6 @@ "text": [ "INFO:tensorflow:Restoring parameters from ./my_dqn.ckpt\n" ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "[2017-09-25 13:53:39,307] Restoring parameters from ./my_dqn.ckpt\n" - ] } ], "source": [ " for step in range(n_max_steps):\n", " state = preprocess_observation(obs)\n", "\n", - " # Online DQN evaluates what to do\n", + " # 온라인 DQN이 해야 할 행동을 평가합니다\n", " q_values = online_q_values.eval(feed_dict={X_state: [state]})\n", " action = np.argmax(q_values)\n", "\n", - " # Online DQN plays\n", + " # 온라인 DQN이 게임을 플레이합니다\n", " obs, reward, done, info = env.step(action)\n", "\n", " img = env.render(mode=\"rgb_array\")\n", @@ -10803,10 +10796,8 @@ }, { "cell_type": "code", - "execution_count": 69, - "metadata": { - "scrolled": true - }, + "execution_count": 80, + "metadata": {}, "outputs": [ { "data": { @@ -10890,7 +10881,7 @@ " };\n", "\n", " this.imageObj.onunload = function() {\n", - " this.ws.close();\n", + " fig.ws.close();\n", " }\n", "\n", " this.ws.onmessage = this._make_on_message_function(this);\n", @@ -11385,7 +11376,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -11537,9 +11528,12 @@ " // Check for shift+enter\n", " if (event.shiftKey && event.which == 13) {\n", " this.canvas_div.blur();\n", - " // select the cell after this one\n", - " var 
index = IPython.notebook.find_cell_index(this.cell_info[0]);\n", - " IPython.notebook.select(index + 1);\n", + " event.shiftKey = false;\n", + " // Send a \"J\" for go to next cell\n", + " event.which = 74;\n", + " event.keyCode = 74;\n", + " manager.command_mode();\n", + " manager.handle_keydown(event);\n", " }\n", "}\n", "\n", @@ -11588,7 +11582,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -11600,10 +11594,10 @@ { "data": { "text/plain": [ - "" + "" ] }, - "execution_count": 70, + "execution_count": 80, "metadata": {}, "output_type": "execute_result" } @@ -11616,49 +11610,39 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Extra material" + "# 추가 자료" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## Preprocessing for Breakout" + "## 브레이크아웃(Breakout)을 위한 전처리" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "Here is a preprocessing function you can use to train a DQN for the Breakout-v0 Atari game:" + "다음은 Breakout-v0 아타리 게임을 위한 DQN을 훈련시키기 위해 사용할 수 있는 전처리 함수입니다:" ] }, { "cell_type": "code", - "execution_count": 71, - "metadata": { - "collapsed": true - }, + "execution_count": 81, + "metadata": {}, "outputs": [], "source": [ "def preprocess_observation(obs):\n", - " img = obs[34:194:2, ::2] # crop and downsize\n", + " img = obs[34:194:2, ::2] # 자르고 크기를 줄입니다.\n", " return np.mean(img, axis=2).reshape(80, 80) / 255.0" ] }, { "cell_type": "code", - "execution_count": 72, + "execution_count": 82, "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "[2017-09-25 13:54:27,989] Making new env: Breakout-v0\n" - ] - } - ], + "outputs": [], "source": [ "env = gym.make(\"Breakout-v0\")\n", "obs = env.reset()\n", @@ -11670,7 +11654,7 @@ }, { "cell_type": "code", - "execution_count": 73, + "execution_count": 83, "metadata": {}, "outputs": [ { @@ -11755,7 +11739,7 @@ " };\n", "\n", " this.imageObj.onunload = function() {\n", - " this.ws.close();\n", + " fig.ws.close();\n", " 
}\n", "\n", " this.ws.onmessage = this._make_on_message_function(this);\n", @@ -12250,7 +12234,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -12402,9 +12386,12 @@ " // Check for shift+enter\n", " if (event.shiftKey && event.which == 13) {\n", " this.canvas_div.blur();\n", - " // select the cell after this one\n", - " var index = IPython.notebook.find_cell_index(this.cell_info[0]);\n", - " IPython.notebook.select(index + 1);\n", + " event.shiftKey = false;\n", + " // Send a \"J\" for go to next cell\n", + " event.which = 74;\n", + " event.keyCode = 74;\n", + " manager.command_mode();\n", + " manager.handle_keydown(event);\n", " }\n", "}\n", "\n", @@ -12453,7 +12440,7 @@ { "data": { "text/html": [ - "" + "" ], "text/plain": [ "" @@ -12464,13 +12451,13 @@ } ], "source": [ - "plt.figure(figsize=(11, 7))\n", + "plt.figure(figsize=(10, 6))\n", "plt.subplot(121)\n", - "plt.title(\"Original observation (160×210 RGB)\")\n", + "plt.title(\"원본 관측 (160×210 RGB)\")\n", "plt.imshow(obs)\n", "plt.axis(\"off\")\n", "plt.subplot(122)\n", - "plt.title(\"Preprocessed observation (80×80 grayscale)\")\n", + "plt.title(\"전처리된 관측 (80×80 그레이스케일)\")\n", "plt.imshow(img, interpolation=\"nearest\", cmap=\"gray\")\n", "plt.axis(\"off\")\n", "plt.show()" @@ -12480,12 +12467,12 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As you can see, a single image does not give you the direction and speed of the ball, which are crucial informations for playing this game. For this reason, it is best to actually combine several consecutive observations to create the environment's state representation. 
One way to do that is to create a multi-channel image, with one channel per recent observation. Another is to merge all recent observations into a single-channel image, using `np.max()`. In this case, we need to dim the older images so that the DQN can distinguish the past from the present." + "여기서 볼 수 있듯이 하나의 이미지에는 볼의 방향과 속도에 대한 정보가 없습니다. 이 정보들은 이 게임을 플레이하는 데 아주 중요합니다. 이런 이유로 실제로 몇 개의 연속된 관측을 연결하여 환경의 상태를 표현하는 것이 좋습니다. 한 가지 방법은 관측당 하나의 채널을 할당하여 멀티 채널 이미지를 만드는 것입니다. 다른 방법은 `np.max()` 함수를 사용해 최근의 관측을 모두 싱글 채널 이미지로 합치는 것입니다. 여기에서는 이전 이미지를 흐리게 하여 DQN이 현재와 이전을 구분할 수 있도록 했습니다." ] }, { "cell_type": "code", - "execution_count": 74, + "execution_count": 84, "metadata": {}, "outputs": [], "source": [ @@ -12508,7 +12495,7 @@ }, { "cell_type": "code", - "execution_count": 75, + "execution_count": 85, "metadata": {}, "outputs": [ { @@ -12593,7 +12580,7 @@ " };\n", "\n", " this.imageObj.onunload = function() {\n", - " this.ws.close();\n", + " fig.ws.close();\n", " }\n", "\n", " this.ws.onmessage = this._make_on_message_function(this);\n", @@ -13088,7 +13075,7 @@ " // Register the callback with on_msg.\n", " comm.on_msg(function(msg) {\n", " //console.log('receiving', msg['content']['data'], msg);\n", - " // Pass the mpl event to the overriden (by mpl) onmessage function.\n", + " // Pass the mpl event to the overridden (by mpl) onmessage function.\n", " ws.onmessage(msg['content']['data'])\n", " });\n", " return ws;\n", @@ -13240,9 +13227,12 @@ " // Check for shift+enter\n", " if (event.shiftKey && event.which == 13) {\n", " this.canvas_div.blur();\n", - " // select the cell after this one\n", - " var index = IPython.notebook.find_cell_index(this.cell_info[0]);\n", - " IPython.notebook.select(index + 1);\n", + " event.shiftKey = false;\n", + " // Send a \"J\" for go to next cell\n", + " event.which = 74;\n", + " event.keyCode = 74;\n", + " manager.command_mode();\n", + " manager.handle_keydown(event);\n", + " }\n", "}\n", "\n", @@ -13291,7 +13281,7 @@ { "data": { "text/html": [ - "" + "" ],
"text/plain": [ "" @@ -13305,13 +13295,13 @@ "img1 = combine_observations_multichannel(preprocessed_observations)\n", "img2 = combine_observations_singlechannel(preprocessed_observations)\n", "\n", - "plt.figure(figsize=(11, 7))\n", + "plt.figure(figsize=(10, 6))\n", "plt.subplot(121)\n", - "plt.title(\"Multichannel state\")\n", + "plt.title(\"멀티 채널 상태\")\n", "plt.imshow(img1, interpolation=\"nearest\")\n", "plt.axis(\"off\")\n", "plt.subplot(122)\n", - "plt.title(\"Singlechannel state\")\n", + "plt.title(\"싱글 채널 상태\")\n", "plt.imshow(img2, interpolation=\"nearest\", cmap=\"gray\")\n", "plt.axis(\"off\")\n", "plt.show()" @@ -13321,7 +13311,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Exercise solutions" + "# 연습문제 해답" ] }, { @@ -13330,15 +13320,6 @@ "source": [ "Coming soon..." ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "collapsed": true - }, - "outputs": [], - "source": [] } ], "metadata": { @@ -13357,7 +13338,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.3" + "version": "3.5.5" }, "nav_menu": {}, "toc": { diff --git a/images/rl/MsPacman.png b/images/rl/MsPacman.png new file mode 100644 index 0000000000000000000000000000000000000000..3a1621b4a56d9a0bdc0aa14e59d9dec27f89b157 GIT binary patch literal 12332 zcmeHN3s{op9)B@wW*2KM%a!I?JG=CxRyNZVC}(N4Q=P3^nAe)JrUp~IqCw4dd1kq8 zkV$x&m334s8HH)WYUN6WZbve27*-yWUDXsz~^Jh zeuNkRoM$6{n1oaKIQXImW!Xl`XQXgSd}wqShzg~UBT1CVt(zU=!lGlglD0W6aE8Bg z9lxYd$ctTEe*Fh$QZ&(JSJu68062mbK1E&?X&G-iT3IPg$4QfFG02IdBYzZfT(X4x26Cj+ zr4x}Oz`Xu|S0M)X(;NW3<8nz!ngh%y|9otqZ#ocWu+Xv@aLk(QyWiL+Z=Q@~yW8r0 zw0dBkw(vSz7wVa0$^YeR+Db`>1K@MoV`_+Q+B1fl(R7_O^_4BP9hJ=(z^9ezKMW+r z-w%&C!mV5^DXj3^7|?EsI2PR>~?J1MVncLJivEG z&DhB8;K=1R07xjcS#OCx)9eS0F(^~h}Ixo-W16d@dl7Z_1u#?q7BDljm)t6>Ssv9c>c*y zz|H{wOTz*sv7M6uOO6kw9na@^Msj&j4j5x=vkG|zP2D4aaO!DPjXS%M7r>AwV(rRxkH1cEa)H$qGJeQfbNB=D3|k9%al4}~ zR8_Q8%RUG$4^t<#*CxB+YzyQ~Ep~uaHugmYp1)V)=7@;l$!XqVdOP_c!kCt|6gLHM;JU9>p 
zTsLcxLcPXx6W10alYH;Z*Wm1`a+6<>ARl*`IzMTB#e8S<8BO-pna+uo>` z)2NDNIZdO_a+O9XqH6MDoXSi36&1E7FT9RRkXuL-2L&uG1Ab)%s8_;70ZP#H_gMZqYFzGG+X_?`0zpS}g z^cxk$ICXr2vNBKA5tb2ufgty1BoZ@YIbj)8H*j?Ckd;mXK@w|^wFL6So=3XdvrC>l z!>ABkll=N+T6kagzib{S(To!Oovv+_Tui`20?xInL12fq0L19Q3d}_JKz!05(g^&G z^@6()DwOEUUiBFUNLL0GJ9e>FebUfNEzm2eRTj|cl9>T4ke5#2gEse8m9KT_353w) zHYjur*iX=J0ANvP)px4{zlUcX=1`%~-b1-2+t=|S@6i?$8S@Kg33?k(3(902H0uJj z@kUb#6MTTmSi?UFWZs6nCCu*C2@L8SnogZN%h`~j?W>Kz!`B0LqZSp+PQsUP^xqVL zxDh`&9Tpo46KvK6758;I;Xf<5W)u${w)N}9A3YwId<`LLyDC2Oo3yys_u^9h=1waB+I}1qiSHRk+!IV4I3YJ z&pi4ctrkN~y?yrbXddRR{GIiw=QhljlCSK*OP2rmR+LXs=#{Y(g1${-{GImIxUtNW zHAw}R#5cy*HIe?AJ2k*TKib^1{AtB!_B6wSMzm0kt;6O>ic?B%aMt(R`>M#M0M{6C zJ*kT7))CKLsL%&9iinvNuFv>Sfky9$`y zlM-3VO>u`muAHE%u4t$vsO)5wy^$5jb`vy5YPsc&T+1OC%gJxUv9v5+mjJB}Z=6`0 z*~K<(X<=tDInBdlJ&4%obLzQ4Jr3lPPJSqzgN@Bm;R%(qrMNs{>C;O{GI}J!s zoo%zr=zkC!)ZK>G$4ZalDw2&&H4nJEyy{V(z9wFr?F+LA_#eD5e4N`>b77x@1wEw- zIz=#6ZP!lDpv%aEIX?rGWboQv z?_)qXs&M$t8m=%Cq7KN*M<353Xo|{i+wQzX1h~pZKu4vTluGi`fpL>l3fceWGN|PmUPrE$s(_m*GeX|aTmYMZcVM$K>8J0>yUD34T z{wJ-|-mL>O(G9_&n$>s2!f^xc^0#m~3TS#s=WWL(H*{T6X$1$h7?ydZ#W-y-a}V6# zZ1bQ_`=RuUrE}&TG^->B-UrAjX5rfKkI5rPstjQJ>llC+zf%<kWd-* tb<~F->aT!ahK%j0yTF0%hLruuhW&eYXwrA3rtM8m z(yR!9|CVa@7z`<^LPJG^kmS_ep+}2u&jvsP;^S5)7|%VtTj#@R)ZB$z&oiP=X=CF? 
z6r;PZU(1Bcpeu67>W@c&n?ExQ54E{&-guS7Af?2z|0sRfMp1u1h2w{#8OW*M{-)>vy6$X_T##(?f`J% zuXFs&0QiO=>S_8aLUODPzrxoL+ML;X{OpBg>-}!Pd#2tJXuIem=bH&7hb@7xyYmGG z5M*Vc(`;!gdX-@+f<8P}z|r6_$t5_|&-a!`!BtEjq&Q!Nh%-%cLD8(~G4jy%q&%(2 z3{;O$gKN|V)N2P|I`E2C2UM5oH6G~az_12QUfnFGcQkh zk3qaL2wt6#^vZ4%7*ps%#{5{-oW;w7u9+eaU0!=2X2y+zwl!Bx6Cn0RLS)cMTQbtk z2l}Pj*?#rYr;DXG3FGOu@8Exl5#W2-va|Hf)cDsTo9tp)a!on75;g7ybgg=i~Ek%f3!71Fv*)DDAjYb_Bh?eeI>_>zQ@jFN< zS}*Mv&Jc&UkRdARnv-b~if}HC->v{cbf5SYEcm7Ck<8qhuLZxlUEvo|vyN7xW_nsr zP0Ku7uI!B@Ky-DMRSMZG-f<=Xk#eYO)l**B!=#In<8!AEG}vU9q3ul8X~LFn> zrgd(7r)|G5?ifDMvU*~qY4l3VSoMVCEeLT+sF+85g|&TG^Q7--MeUt^({eh}ZgvhyH+;j|fBwNpQ){ftA8Mu<&QSof_z6!d< z`P0-)q?D7OB*O>m@GN(1bKr8u*)*W}YH19fc%CpEn4L5<@*A`)<|vLmO0~2nwZgCb z0i%TvWkzrRqkn=*vfGP#SMJG1Um*NjbdT@!HT5206+Jp6cw1-@lC3$AZ1$d}{7`uD zpI6LBJL_N?c~#y!aJYUGtqC-`pS9S!0|Dr0zEoRt3YmjyG0+s8EdtUYBGM%#E!`#E zAl>^Ku5WGUTKk-z=hxY+HA5!6@AHml-1mLecpl#n6WOwnW+Md!#TL=a!jcpeo02Ff z*8lzYI(#R}>6s1wvBvy@sPy0Xy7RZDEB?H}{Ufv=-+X*?^`o-ahoA0OyQeFz6`K_|Gn7ZjRH5;sPE|+E6euuXRWTePu6R^2*3E0dSA@n6lpiLv_9s)V2J(lg+sLRbU9xT z4X2(>67!$MJ7@K~C{;RU$2%M2G+bDflHPZ9Y0fPy96xwa_|TQtH%h%ZckS8JXHuqL z7jta?{;aJ}-P~j+dP;}O#v8N)bPL@N)ta`Z*-kYXXYwc&5Bu`l%hK~&oXoxSoF)W+ zB^zuJzxma`OEJ;N&{g3C_1l;@t6@bJ`A=E>b0JqeQ?zNCGneP}r-qy4+;{HW8T|0! 
z8Y;Op6s_fTb#ji5jx_vsUk>*B7-vY_xsz7pdAzry=+DJdC{(mlQc&<|Eldx4v)_L5 z=7hork0au()4YO$f;AB`SJ~3W?x$Gve`s%SSBf-USzc0%*LrR_G&(w3x!B*|@4V06%*H*9F{K-vlX;R9QG__4 z0}b(^*A9F59ABDAw&-`an;98?)2y6gArUDX-Rw?VSrH^slb|nmTTAPTJ0lZQO}w@^ zmE2m28A=8QhU)=E_Hc`ryPpq?<((PxM85u8Vm-)u!7rkML%*_>k8^a_TDHmJ2 zI5)8iua@INHHgjgYi34n{Ab?OccvF%sE~pBxS(IX<)gn6;%h5D-4aTT;q8{A zR8@%!Mz3eM^COUN7qj#aZ1V6^8u^#qsaX_*k4b$@GMw1`=jLXa9)6m7RvzD2c3Mxh zuUg{;|3?(6P}+^a3wCqeZ)=iFHDsdXB@P@oa7A3aF4?^Ii)rU27Nw*c$7Q2F>83f1 zB==5sz7TA+O8@Udj+yh6Q>l1;Oe#vl$;nAM)v9!4&f5HE?gmd*<;JZ(r?uU9EeEeZ{pbo<0o`;qyE#hJ{Nnw!SyA%e@ZIU_vh&!3+kh>Wequ21b9YRgo`J(s$6 ztwcF?%jV71@6SKrP|y7v&8R6oKi#O}!$rH97V89!0?O20X?#``WCpL*P@g*uDT(~& zw1d^*QqD<{w9RJ>8-wLk9b|mY7-tM7U`@y`*2Jm`&dtsyMLFN(sjBz$EYy_j%zw_Q z^JBluGGD3(?HMB-l+=8WkMS3@D6dj27jx+dX;E}lX|#^f6w};YZR`63MlWJpb65?@ z*Mv#rjM#>Uhue*1ueZc{iAtV3clWDtTg;PP$9s|v|G9bR%1rN*_OOfk=|z6~=`XrQ zPW>ToIZS`%8i<#&$VR=xHi>wv>ZmY2K2CO}8~s@=2d6S$9=)co(T$15qto-I)@A-* zhF$CHmjl9G}` zDFP^IvcQfD!)I6&J_}!bvePsv-@XttrzTR4&9LEfUPtnJO1c}NVt!u?8rVzUof68( z$PlaBMo%Bz>u)cHA~)l8;s1Ee>x1Lcd_QiFm2!ZuZ`8fj>zNN9zJ%^%xu82z2B@i+ zpyw$Ytr%j|lG6W_E#roX$)~D0QUwQuocr3dAMixzkG7_hQIIkFbJG?E0VQ;5lZE_k zPxC&sIWAl`H8rId58Rq1#*ws| z;;s^zL(wYfja%(I9_^l7nrUm|%)}KudHPgpxG}*Jm94*+KGRxUvu+0Kdz_K++wHl^ zsw(9-#ryJ}>@siI(6Z~eEC+`$_0bzyhipeRu|3INZq9H!*j@ZqoGpFQy-CdMYu(+y zR&N&HV$=EjjIW+nx@o5~8OYXc4rl5{Qc=PDOOsU?j((-~G7-`e!om-kWuwB-w+4T8 z7xUWw3g$AXfB&5J3^Nl`PGExL(v)0+aa*gF%5xgd>JJw;c^;Fxgh@t5H(9WW-twEV z4WDj3ApIY`t0zpN-UmN=g~4G`;%jLu760T*%%|A(`)tmk_EPMoBDU|?F|{z7dHleE zY!gFi>DQ}Puihg5^^=^QUc@Cgy6_D<4s%;xNMibU-RGm6if!KEWeg2wc6Q0Ur@Q5@ zTzSG}+!}>{9UQ2Mtkr+Ra`PjXVWSS#=huY!iW9rIFZcM@fB@7;ib=0?UHwasD-0kT(up}i%)m6Maadf%+K`(#5=T=my6@)X|7(r{3Av=RXXI#uD8{;XHen#wNZth z32803h|Qfpe%zpv`MHnj(z7ueb9)1WD6Ekdz01RgVNNT{l)aDL+-gZX9BxXa{fR!s zB`#xvHoKG4uu&S#iz^opBJ73WvOI2P#&9z(Sr=_#9`|5Kaj-s4gOqDvGC#0d)yM0_ zUlpb%$_49w1PV<7xZKoeN;JlzrW*)EF0-^ctvKXxt7hC8XiZO@pB|3HKy0&W#owpC zZDQ*~sejG?DCQSY=65#o{DXB?(S}?;oYo^s&*^w1U8oqi(t6$OO^V;J*iF^QZHcQ4 
zxq6(L`O4H_U1IVw?&m;N=nLu-@@miXa&!BJ6WeNWZOV!_tuQ$(n({|(-p1l%`q@pC zp4lRcK~k9))8I5QJklJ!D&#ux|4nH63@> z8}IL`YiJnTZ#TcUYVBl=oNB}3v))plQ1p~G%kah}KR!aKhxICwAInvy)l|j%k4OQDwId8@&?$0++9qvJj7{yN83LcO# zEIOn>FX*TwE-)8--Q522W!J;ky@LT2!^HyxtQ<65^`{Vhg@42AU5Wn7$3;OK z%$UV7t46(PN356TEdglvswY>pFIR{#^aGk$I~+Q6sK3PPG_U=16oHTEE(6%;SFT?5 zXfTikgl1LEh_Y_A6`_8b_vsS4RRRXsi-7i2yD7bB<8UB8zjV4X^Z7~LoFw(U zCr=(fuIMWC*tK=*%TC_-i5BZNT{QgKxI)cJD#69}vGCdD`kJd-8_oj>`FfdRt<4$s^)^2;f>!P9(U~mMo2&G-|WN!ST z>p?mK5IV1nT+>&>gxkf`=s?xsb7gxlbxYG1F(C#ql&nf;rlt}ageN*&7>W$IQll>Y zxkq<9e^be1M*Ru%+8$neKR(-EG083A1`VGv<5}&oPx$KetV5!}or7@-57KAn8z>^wk#KDPgBHdO(97q}&FY_~_gbC;`179(7amkV~ zyijL$Cctq%{8sbl+gC%+IA31zwc#`UDLmYqECt}n!!r6+Qc7wH-GEk|!ErXQw7))v z{nocNIZ2WsSFHTzS(nV)+!zE~w>Bl4hXRCjOS5zqyr=`}xE&`yS#&&Br|9)DvF>Eu zwhRZo`cHvk0RqvevaYbgrNSct)n5&pCMT=J&vItsCOMmeK`k#%h`X2(c#5w^G&;T7 zsD5dBYmX^Moyy?H-kkbcbpOssSiZH` zC`ZByj5qSo)i*joiFV6#y)DxUrep2r4eLK;Z>8lnc`wAXJi4+hj>eqe2HZ@z@Z0KM zJwIN{&yO*$ny2S6UG-2t&hq`0A)HRjcXk2O-L{;Yo72VJ2St5TY}Q)1nu+P*)K0nk#$7^vJIt^#;ZyxD! 
z*VJBtwz>HAvChCNLu6H+)1Q^?=;(kr^SZLKGHP_~nl(B@^>KUbUvni*q$!bfrSgFnh)Q&RTQ%-WX!VZm2mDa7UefVRlyMXYOO8o5raA6suwH zTj@~&oYUVMkEYp-s}oLRHakDBhswdqe#$>V{gf1bZ02%fppyOckVKC27Qhe9=Z_vf z)WNojP)vvrN><9#WIx8iQN8u7^@y*sxd@aUuz*yD9VL)P)b{|)a4*_Jc-e<#EZchX zi;jewzPStbSi_=9TYvx22kW-5Kg0A%FzL9ki&d$fWvctlkoQEqQGSvS3Wz43P}Dhd z%;SH5y|n*VFZrNb^)W&s$oT-5e(ed%;YJydij=`cybodgyIGY^<^lr#tmIvJOE!)t zv&`p=k;(K@yC+}H_}cmVf#&FD-Tez~@mZ_gAd4TreeD?b=M>?PTj+VE4|@~}$uxR{ z;@(=UmJ;V_{?xzW>YEd%_`Cu^vm#W|Q<`d@K7D%nnSOOxAmha++`gd@E1bvCq7w|8 zxLhXItXroCMS^q&POJDrhst14tL}OfEe)4}4)?uzEpvM#KUasdiAINCF7t#p`Dmsy z;5K(v=etGC2wk{vzf{}h%t^=&n>L-!HL^Cd1wsU*HgM&iX*Rb5k7 z*^OJ`+2L4LQc{gpZKg=`@8zPP$P)b*sp$a90F-O}D0&9)48LRt-mw#m3=+t39v%hg zQ&#|3K|ex!N__~><2q8s1ucc&Zt{C6Tc_$U5dzUVrYte+n^f%kejLcD_YpG@beR8g zSj;!KkYZd+x%_s8e0lko9om?W*Q| z&`R)Lwu_>>w9EYy8q}nLQ>bJ-(gD8qX+)mCRH` z!%DE|mn6*0`C}RZ4+K?1_A|1M0g@rYTe9Q8beq#kw03|)?d`eG{;qdozc;q4d^J#T z>2ZSA(2##zE@sF?Dgx*RTYTq^9YFxQeL(Ur8w|>(6`|a7 zV?wuO1fF9Z|KtxLqD{y~$p=D_JPO$cVuQ7>)Mrgm(v8#YrX-Wiy3N8RHG983+T@%s z-XGvm*V3KX=2!(`Jr*T+^wcQ{rwuD+YG3mFUj*_MVf?YZy{jaOci4_$ZshH$-+P#Djrwdx-B$GAs?IFv3Ehl1|hwE&0kzx z>-o; zHnP~3fNIhZZm8K*qM#B3M^C#A3JVJ-B_#7RF$uv+IsGq~GNp7(f3*cZ&}%QRaN;-g zlz3V7D#WQj0zc){E~87(t?Kz&2ddSO3qY>rl&zp?_@tCI^E8NaDA9!A57mHnMUS6p zwGk?K!B7XLq-L|~eBgygJ85b4XGEt*-T*K{<=dl9#4ikYK52>R&&z2WceCCpXj@4& zty^km1`M75`IK#Q-1b9P!o>V|x$3kW6ff%N=(uq`u(<%8ikMbp(&J9ej~6gtavd97 zAks2eN;R-?Xx&kjv?Riboz`hCDXspq++CEvm%E_5)JF~$j|@5}zw(pk7-0Ipx(cmS zSHSbaF?*cl&CEV;q(0F%P|5cZETJ)8`^9e`$;8b#WA1R14UmU{Iv{8@**{S6>(k^g ztFY?g7$#Cc9K>|^uF1|LeU$*XWOKr=6;MpS%PU4H9KtU4f)K>3my_b1<508ciiX2h zuctf;;cx44)r_>8bryrqC}4m6D+}Nm_53~NtIUgi{F_UM(Wq*PodDJ3C_zB*N>gnYr3Me`iXTeogKdH%cxEz-A>Q~3b43ovvgbgrmV=(Ue`94V@+ zt9#(tP-G{lxn#@BJh&J&$6u%(*9RG+284gh_&$C+%?0y9yZN+4qdn#ySuqEpXQ}`k zTNc*IWZbdM(Qd$noB8hebCcS+$xX`Q?*_xUA}01VV|jVO7~2lrSoDave>Dg-r&-tK zSc5k!iGRbsF)=Z5GfVqbbevLI5QAj6Azp-30x^yPFRYb`QEKRZ?hUrC*PIkbbH?a| zD8b3g^Mh9hnh$QHOhvDiG z++t4Hd)K`=L?QjTlUl)N46_<;Z1DGX(yvN52rjW-WNQ7|wGFwD9fm#!%tc=xsi>&n 
z8g+T}s2^=oE9haw&3oB}A@`lkC6FB6nqAn3Ci9sk6I?i5x49?Qh+tX8G@G~s z`}XaN2^4bCI(tOl+!1huphW1#?*x`-HBgp^o7z;hcOH|>J()4pPXa9e^L4jbi0CLL zPf>_B&`gp#Ikn;yaNXE;H!fXrW@l%ISj7oFaNx)yaCN!!P)lJqAR`+ z>1k;r0ehRR9Eqsa{Z@4#dp(0RWV5Joa9Mrwu|8`)F7=H%($YH_JT=s#t;`e(W;<1O zcJ%iKZL-#Y7vk+5f!7;^z6+xG zmXK@W51`DP^&dQVK$zXyb?dZzjilz#`l8A#V%GH>hCqRsyA)RUSL zZi{wLM&ES^j1F)G$(|s@=WR^?GjzysaCK*S9i1@Z_)42w7@5p0wU@1%@%0T;NwLsf zf=)=tZj1S6uCykNl6R-(F}CQuCvX?ItJ?*PnvuJ-oP)w$y?Kt0m1g)-2!?NW$*>bI$VqSvt65xur z$oQ>S>#tt5YI7$TLrOeHC~0iKFjmU;bH8pc0iY9pKV=OkM`$!-A;yi4=fmF9THG=Z zA_M+`iz!Upte05$DJafSG(!&DhBRH_d{2iPoDfy z)omBjl4`9F>P}fs8a?z)TIQ5^(l5Z!k~3oBQ7`SEl|V}Zt`9-GWi1|MRq%OgOt2d^P`>8jFuDXs98<;e)>Lh|L z>5eof*UlpL!2%l&h_C68{#ha_UwzBknCX;xU>V~-955kjn>FP4B82;rL@wK z(|iByqIO5ZcqtNHU%W_6%{~crdgI@J-vB_v2h_GP25Pqo=o^R{8#5t=X*dV6t@LIv zZ?~$&yk=wb)~#FLv)?Z2TV5EI10&*7F5^Fvs#Sz*w|&38Kr=BpSoraFuFUdjG9&`C z9R_QoL1Iq(3i}BYu^KN}jiLr>PsJl)LHAq4HE3e`8?HF0e-Zk@W_!;dPF-+Np zj6_5wuKRG6@`_!*UQ^7L8HJ^5^b^!)TErZMG^xq%HB=S{zw&H`6I#t$j2FLh86vDh zlml`s9Xrbfu|Iw6+a^lEibI1{M}K#!f3IKp?=)S~MToC-{+)w^gU(mf*wEA=9g1Ky zfnm{A)X@x|Aey16R{K$1z%plf*!FOtRJ)7jc|4~azxU_A z1Q&I>EQ-0;29QO35EOa$IP=TSt2^bvT}}LHq{`j6i{l)eT*T zP89+Os)NF>l?R&4nCqc+IrEr~$1L}x)yz7~%YT2JoBq9qGrk0DqSIQZvqJIM9Wm}0 z7Z(Tamgvx?$!=TIb04=Q{AQw&KWh#XVLhNm#Ph7J&wj9;U;dQ?XuwWL8_@dYHE^pe zc~z}KSpx)}T7MXLPp#39Zy*;Q4i#1`+P0@;Z%+NI9iQXlZ{@55WhnWfqYO`!Y%&Pz z&iatmrhv!37Ziv=FS7UYt`efDF(KHQL{$Jmj6q8H>v3#Ai3Cf2C z%Li-Q4rnLMJDSL1By-q|sX-5`rtvwY;Y{EMWa>swrqscoot-P3TSnZgL$2-?dXRb$ zLQP5Fx*3ptAitumU~k@62mS%5|HbM}$`v48{vJHt`Y535r-KrSn3iLn7H&ZtIV z51>#0vN55?Iei>WB``bPqQ6qCw`fFoHi}jjva5!9DG;Tvk%^$`&xf~s6E?JL1F?dd z?$xn#41R z^(^`+Lbap%{~`|UlTwG2Yz4cdq3?4#F52i53Y*|Rk*bY{J%5lT5txpeppVm&$l+m+xCZ-K2H1JsTDP)^iRMO;01fsaMkOkwe%3-EO+m**&FgQ4v93oyG@;DpA+d zzedP;vxK6exAFw+V7)z|$jsplSR>16HLWF6(?b)lt}HIMMxfE#wclHBrt}|CEnkuv z7~w&r;FGRZX&D^X>QM0jrLez~JHUszu6rlnhDHVWT;!SH4>=Uwd#`seMEq#jh3IN! 
z5vrLPx9AucMlE761oq~G^pjMYVT}Fac!B53mumFd)~n+ADd~BM#j;1;&A-_oPQDtN zN=|t+L<^#ugiD6d6?J!Z20=qF)MVf5{ZeqQYX``&D0s5@QB_%7k}TS-#qy$UjyIZz z9U_Y6ubVs|{00Fwm-4D)I;O*$?lxE^_^mGYO#hDz_afTw=9UrL zUtJYa;`|{%YlSom0E?s%{9aM1P&cAWKuHyK2hI$|x{(b_*{k?P=hdxd5N?Cmr*N@= z&!am*cctF4se(#=5*mz_kNb8q1CUJ`xkDGTch?RR<%md%(40tGCn<>=l3!&Y-`n2}}**Nu!zghSo0HT`tz$`Y`Ykz;E8E?>jB2XWguDr_D zz%LjHbP5BKAS$umBm@#G5izJivD0767wY@Gp4T zcgF64X}0jdDnCnRG4J&Fc*Q2RwD!+5<`GC=5SDOY8hx@cVXmmHO%?n7b!KMf0RfEq z^tm1%N{4{7La)=>ld!S(s3U4^wXzTXD`fQYt@cPK(h1Bl{Mk6|yo=9SC@98gP}QE0 znbQ!};a8?M3xi-pQe++7&tTpAEacF~0>`J$o^9s_ngqbD+PVPm9h{b5`jvwoTr{ZH z3I%nss&bfb!@Lg55OsdXO5pgozM$b8a0?BmqT#pHyUftb{XycOF=V^caDb zYSrxu=?gP5S_7NGa#7$h2IwIG48JQ1*>1J>p|YhKQ8UFT7ZW_ZiT!<+OD z2lRex-?s?i!n)|gPN{u|9RtV3w7_=e0|ufzIT>(37U+t25?rCSZG*(2LF?SB4w>>J zjHVH8V`N$F2Es!()0#jLPd?3@Jj4{GtKW%^- z7yB)cmW%e2Xg~HwAPhKJjB*cbZ83koaQuiKL|iUgK!Hf(%%$2GP7>$`+U)=Sipl*T zsOqg8=VNeH;CZJoGBWPmx^=%OFU(CW&Lq-@gUAm#^NprWxRG7y!PYiNoB_MR=)}!L z<0Pesqz+FR8l-)W^E%^`hK(FhqNE@qxCbMgLTnMRNExg$xwXfT3R=>}@@Aq%dkZ+EVCHJq*)1~!r&Mf?i%R5P7cphEf-fk?Zl0bS_! 
zB&F11F0eQzlvB4v`T~MsJQ4pu`!CJ^=ZM_7=?gI0;6Ko)!*H6u!RZDR_8fA0Fsh`1 zKNL5Z%q8?L4Qum->#z?b@7!556ju|(KL1)e3=R%y+85zzMvua-AKqgsSfZ*F%fZR4 z)l~Iq+Hvw0_+0~^Cdit*cI^rQ!R@`|B?j>`0Lrn}jul9G(cy5=7^d zhMeRZ5ek@L*X=3y1#veLk;M^|R{^RcL!nN9vODo=m~q_I&5lUWl5iz2E3#VoZuEoB z8~+(4`niVqR#{Jp07iHw8u=knxZcvxkRX}%brg-n;A0ZQ0RM>uQ;9@Hq#;(7w4H4{M{rn#i|A5+@EJ4WzL;oGAE$5Q>W#5~TXy9&VUN zc54d8KIs`m=ffIhFR@K4cL^zhZ%-WqJ1Bn32Fvh@a&knKWz zUos{-Q>TKF1h``KnHU+rXqD0oCIKEq$j5z1y%JBdyKpN8rY#RPa2%3_l`*`L*QaJXU(l+<0Fr@?;lk{WH6lmC6vTaF5axl+5lhGUV1dqUuMAGGW1-Vucc@wT3{+$vz1b_ffPm0xHN^1*}hTfy;W|rC2OO zozwCJL(5{O5J{E;3y=I`myB)y&4u_zFpdEqZ^T5u_44Ys&=^EzTEM57wz(WFQTb#0?1YYGgZTI5UoTY`FEPR;JXSsZw+z{;&1Npb(1;40R$6*z>F}RE)_!r8R*CeneLryggCNSVJPF?IHvo_LDlMG}!_N;o4s%M{f9)wox>nrKID2p9qDUcOQ-*07G|X zKl$IkOF_|aO2J6rEiJxtmjvb-;;qxRW{LqiXaz)j{oDbzEl3btbe=}{x!C}v;QBMl1h{ z2}IHxvvB?Kl(Qg5d)Smyq~U4*y1_Z!0K5g_qUB1@Vyi`r#aFKaAN0Kp2LKp{Ra2?bIC&xLynPM+Z;RBsq1#a)*FxOrltVq&!jgwgr$MvMsEtaZz-zf2NgS_+xl zBlJ=8)sXC+EYIee;Ai$tsNEVehfCDsk_^*Hw}XaX12_}`>|boaMcc}TeoxLHA~!(X z;)2qVX>VT32z-bS&@B5hSC(Q`?u=h7^*eh;u6Etsl|#Rk`2V>bch@6jM@$p3(ggVv z(0%d$qLLEnf7y$qyhsEv&$E#nZdn$b9yuFTXkbq}=5b@~&NZ}e;u7PmB_z&EXGxsU z$ZowMLcb6vx07~S;QmJWIO`iCYlZ3UY4_&2G3GsCr2nm7V5VJ9ld=o^*-+E6F3=>4$9`{<`8y;hXwJkD+-->zeeUwoLL3-!i9bIr0*2=ib>Q$< z#V!7X89X^yM@7*-I{E_K=oXL)O`DHvpZ2#^*1)I#_@XCnoJo7uUkNd|A}WEW-v7=5 zh;sArgj<#_G9Np(0gP3)$!_h&op%vjY^*IVM5JyT9bMG4Wpq?jb;d__F<wWd9`9&H8Gph95i z9LB{hUERlE7}>S8)M8>{?8?D1pfn=={!p^=aC!wc?0QIR@8MY;RZ2RET3dJX=FKn0 zZToNz;ro>TTm_)nwmo}X@ow!X7n_x3et|o)6mjy;fDLkriZ7cYDTwK)1lx6{~ckOfh z{PnAIWaM$A7hPeQ4eZZpww)UAY_|=!`!ud?0`U(_WcfRlDp~Y>6P|9BH&6{@aIi1zcn>Azo+Be?q3K1CE3rg;)INvj*cs2+}|&b_H<>z zaj9(0V1;KY@k8s@Ry@u>1bG#L|s!swx?=eR3?AK^a|a2ASf z9VJ~@r2yg+*;rs{??@wr(XueW#}FGj5&QVblWze3pKz$)gmTJGBO{}^h#;bXs~O;E zZN(DArs&BIfycu4@z(t~c*YC__!J*U@90&9nUNMDA1=eKyLTU%AnEK|8X6C9xPl2E z1$CDP=>E#vn>on{&~kRNe+NleEua33;2~^7ioJXHUKSI(gjl%g!nEP@-E}**EuD~W z7en0X&)4MYW0JwHwJAPtm}NG?P?xq-L0jj1Sy~>1o0qI2R7c;;3Em^zY5&m0r4t5= zp0X6*W|F3prPq?i|=n$-Zjvm$z 
z4!RTT0~(dp(xPm1&(;d=L3?kn3oyMJ=&y;nxs;xxjSV+2)?IM0` z9EkUZFG-INHJpZuB-FWZN8IZybuJDR_^M(UNqdJHA-qjZ=UXVMq(tXT#o&h8CDC@8 zl4}l#glM+9a4acUVg*Zl&al-D);BchMrZM(%aNkb8|LETl20-@fFu*k>0U5Ec}TZV zC>~IoUxu)-;3n#%9T>TP zzWQ%qtlkK|yv-}=@fM{E5$-u3S`EfZDYEm zV<QYFz;~Z(rWTQ_sY$8IKoX!Y()=6mDUPd8!ZzBOxh`6z#T}WwFRmDfdm>3SFg*@&xeAP2Rd$ok?*x?5mMR# zlshsRGD+_Q1^4274D!A#JI>%xDOQT?bR^#P^Yhb_qO}=k1f^UXI|yC+BvOwD4<1y* z8G((%wA;4*Wp8gUeCg80pBPJ|gGnIjjHQBmv_`~@E*lIqxGpz0_dSs3-y1fZ2VIuA z(_{OVi?Y9Kmw8XgtDDU#_>C`mH56{gBiisejQ6fGoT}K1(Dpjm zPvdRYmt}mc_NZp?iGu)MwX*;nFIqcoJ zboBIbj!SlrMj9nxdjK@8fpz226Y%bxD{_u%Aj~v{b3P#WflxRrSx0cPG#e63I}RZH zo{a1MdomVR5q3b#-q7MwcXo7*yCTrCU-w@!y{WwogPd9c6|BMWQ zXV0FgX=*+MOhFiX!#+0Yu$S-8rpH7nB}&PXGlrh=6!{vhn@4qxjNEw4d$P*P zD9g&q?i-CjzI){3LmL$x9bbC(_zMQXOR&;jGpm$|+1c?z_m)Vvv-Iq7n)|gD*SkA+ za&qz$%m;u7rlw?cgaQ!RA9xzWHD7KCnzxq^zT-A!garC#co0SkFAT$rTbsV?>`^%;aXApF}2QXzd(tPYWa%c>Y6f0)wxsk_u0#)YP4h#v zZ`{}lAC2i~Bgd&z*-+Xg5SzHpxAqaV;9J#V5F|L9Wj9=NiYAC2Oxigv|-|IbQ;dh1i3nO*!zrI$*~hJY#@ky1RHcPFL)J z9VThxXAI?4OSVy%HEY)V zoewNipK8qp6)rT6AHLQ5O8e_QjOK<*MoS10+&4wmUz#@yJD%bEJ^O3dHltxZfGpb0 zTQG#DWNe_Mq=clj`EIPsOSINCY;0`LN3IG_26e6oo%@5e@NciFZ0+nA4;{L=RVrjw zaxWZG=*35>Q?w8vRcIDtE>;STe7?#W!J5}cB^c`l!RFVZBII!l<;!tSig#XK9S~(d zB_xzi?Mv<+9%eb3#-{k?D5TMkKbnDKHs)hEkByDp;B~=~1p-|W?0LXp1wS--?Z{t$ z{l)HLvm4oeIOj_&S5 zOA41oMY93YHk6c3~gp;#ApK6njpoKm3wKKRfT32!(Se zi=uR=J>YH_hyBWy|6nMLlBwWlg%*ztTGMzii4M^_%};JXTwcMkCEZ@GNw;_kA@%{@ zwM7MZbjDf@%bouE-olL8sLUHc34rj@H=0%OxC6AZwQwA2ZaJ@LV=~`2g;a72;nE+? 
zK%krQNvB9iNT}r5vv+THMTNvx`^mmn7wn?9(acJ%uC`W!~h@>`2XEXgS) zH2`Mq*DN<>JC-2Ix&mb0PStXpn48efo)uv;c`G-@4;U9xi)Q>mjgFey8KrinE?M^* z_Ir1JP*4yu`;!Y+Bh_kIXD_;LXR zj(S+rveSnd=OyB@{QlkBe*g1mfT_8;$FY~b5RTLqqs>iZRU<40OCMi8cpv|lKjDnw zcJ10Vyz)J8nILSXFr<1)d=U^0$+b;;N`_K0OA&$S{mz}65GUCbYU8nSED%|x-^r#CAv zju6-Z05muDmGcFRu!8>Q@_M(~OkVSU_K)|kN z5JA)s+vwQrxqHWs2l$XdI0tTGY3bRoLc*JfYp#P8W*cEY0gkY?6WbsVPNb7J_2x0_RoRskAvgz7etLLzT{b+9oI;6TGW=Sp+8zg>7CL=vg_K+WRyy1 z$)iw+Q8)U}xu7ziKi`690r+<}&Rvw0+>Q>kCN3a4D;-mqIlAio`PJARmw3zc*N$Ku zc<{Vt2XwQsv-<$m%myf*Vm~i04Nhd|<5-m%&hb8#4grr^+X?iqXJ`!RCOa?!Y6%qy zKg{MEx+zzdc8*o zh`oh8g^Uu7&}$GUL+>GyeLGnNr3lsQ<1}A9Hr~E$-0JC*h~nr`3f?bqp_ArR%eu=a z*BiHoa{}Ybj{rVZ@mzlj?U29gaUfqB1K7g!2@ae(rI2VS}19XI~OjLgFgGV=sM+k~fZv66feJ%m=t z2q8g=*@XoroZ!(iv}Lz;iGhhOY3~q=-g1A{3|28B?G0&E+Ye`WXfKpoo(wSMeg27Jv9Y~lOz@Cg#; zy=iD^gD}2e2!7_E#`;6u(EZ;#;n!U`jRK73f7WbK4#spxSJy$Q%YRb<^9kW2w+{`S zaD97RjuLaZ*sK(gGnzUH=k(j*v+T+x7!OT(x8X4+M6aKFz2wEiXg1^{AbcuNOF%HrDJ=yiGI*Ka+|qK(*f`eY%uk@k_tp9(xt?)!8)}tnw7X90a$J81a z;CfV0{odI>>Kgt{V&60x>KCMHKF$vLpxZ9uI0U!jBKOenaDuPjPekZX?2!Yw#{)fo z6t#k(Ru>5bD|?;200611VGqfAGpy4b%swJ6rrAwh`SFPV_~FBc`#{T3xc8vfT~@P3=90EF?Y>vgt$KP0`x7)DKAhHyGrKql|+ChkGOe-=mI?(lC zaJxdj*GlchC_agEyGrSHGBl?-IC3%EZ}Ub)7GY~KzWN(1&VxGH#o75=Ma4Gk96-zq zIDnx^*;5iA$Ux4aT%}Zx+k-}N34z(*M@euB?n42GhV=sp1%ux8Yu2bi&9|8u@#B#F zcx|tZy?t@I06)L-{OPe$a*)z&*=w}+B~(Im1A_OMW5YLjIrVxDx9+2-e-0V`n8h%z z@$;RM;^Ix{5|VnxG7voqb4kRYQt_7~Jfy|2s#P<9);VQmVm%AII1BD2W4Jbv>&8f2cb_IH&9e zxoJPLRa;{f;t!KbaCQBgLN!(N?Afy#(0u#wkT6!s(puzx)?8G|suL?q9+sRxt<#)z z3QuVY80JJg5>RYcE^O}?uyi?_Y@y_Qw{%Rw9%hvKq<|RO^I=5_mjmpU@0424SO3?&iGrpS;CiRL1NG^;cxO^%eY zK@*2`&_tT2tq@8Xj*@1n&|GRa_j^CP!tUIk*X1kMJ+Hg}aJaYj=ks}<^{n+?@AY2m zBOOsQYxeBvVq(ex^A4U*Vzy9X_jtI#-8(iVhDX|J0&H{=PY1;Tk@N107mAfu2+fG< zX&}#F+1?M3|BjnHc?8JLgBcbYfa~!>XLpef6F>>JsZ*!kt$zTlNb+mYJ=};a>(GI8 z`t)fBH1d{$kaWyBVMq#fFkHPQw%5T}J|<@s9MXC($wS?mDzyo3sYU~^Rx01#Zt zJtWRB+EJ)g{YK3bx(@(EN`+XpOWPOUm;>Lcjh;(KpXg{&wCK-r!(NUtH#g4@ztA4t 
[git base85 binary-patch data omitted — encoded contents of the newly added images/rl/MsPacman.png, images/rl/cart_pole_plot.png, and images/rl/preprocessing_plot.png]