Welcome to PyCon 23 Beginners' Day! In this workshop you will learn the basics of programming in Python by building several versions of the classic Rock-Paper-Scissors game from scratch: from the classic version all the way to a final version in which, through Machine Learning, the program recognizes your move from the webcam.
In the next cell we will look at some basic building blocks: variables of a few different types (integer, boolean, float, string) and the print function.
integer = 10
boolean = True
number_float = 0.13
string = 'pycon'

print('Hello World :)')
print(integer)
print(string, integer)
result = input('give me a number?')
print(result)
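Note that input() always returns a string, even when the user types digits. A minimal sketch (the variable names here are just for illustration) of converting the result to a number:
raw = input('give me a number: ')   # e.g. the user types 42
number = int(raw)                   # convert the string "42" into the integer 42
print(number + 1)                   # 43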
user_action = input("Enter a choice (rock, paper, scissors): ")
import random
possible_actions = ["rock", "paper", "scissors"]
computer_action = random.choice(possible_actions)
print(f"\nYou chose {user_action}, computer chose {computer_action}.\n")
In this cell we will compare the moves of the computer and the player to figure out who won and display an appropriate message
if user_action == computer_action:
    print(f"Both players selected {user_action}. It's a tie!")
elif user_action == "rock":
    if computer_action == "scissors":
        print("Rock smashes scissors! You win!")
    else:
        print("Paper covers rock! You lose.")
elif user_action == "paper":
    if computer_action == "rock":
        print("Paper covers rock! You win!")
    else:
        print("Scissors cuts paper! You lose.")
elif user_action == "scissors":
    if computer_action == "paper":
        print("Scissors cuts paper! You win!")
    else:
        print("Rock smashes scissors! You lose.")
In this cell we will use a Python loop (specifically a while loop) to play an indefinite number of rounds. Inside the while loop we repeat everything we have done so far for a single round, and we add one extra step: we ask the user whether they want to play again and, if not, we exit the game loop with break.
while True:
    user_action = input("Enter a choice (rock, paper, scissors): ")
    possible_actions = ["rock", "paper", "scissors"]
    computer_action = random.choice(possible_actions)
    print(f"\nYou chose {user_action}, computer chose {computer_action}.\n")
    if user_action == computer_action:
        print(f"Both players selected {user_action}. It's a tie!")
    elif user_action == "rock":
        if computer_action == "scissors":
            print("Rock smashes scissors! You win!")
        else:
            print("Paper covers rock! You lose.")
    elif user_action == "paper":
        if computer_action == "rock":
            print("Paper covers rock! You win!")
        else:
            print("Scissors cuts paper! You lose.")
    elif user_action == "scissors":
        if computer_action == "paper":
            print("Scissors cuts paper! You win!")
        else:
            print("Rock smashes scissors! You lose.")
    play_again = input("Play again? (y/n): ")
    if play_again.lower() != "y":
        break
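One thing the loop above does not handle yet is an invalid move (for example a typo such as "rok"). A minimal sketch of one possible guard, assuming we simply want to skip the round and ask again:
while True:
    user_action = input("Enter a choice (rock, paper, scissors): ")
    possible_actions = ["rock", "paper", "scissors"]
    if user_action not in possible_actions:
        # The typed move is not one of the three valid strings: ask again
        print(f"'{user_action}' is not a valid choice, try again.")
        continue
    computer_action = random.choice(possible_actions)
    print(f"\nYou chose {user_action}, computer chose {computer_action}.\n")
    # ... same tie/win/lose checks as above ...
    play_again = input("Play again? (y/n): ")
    if play_again.lower() != "y":
        break
We will see a more structured way to validate the input later, when we introduce functions and exceptions.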
Now that we have a basic version of the game in which we can play against the computer for as many rounds as we like, let's be a little more pro.
In the next cells we will implement a series of improvements that make our code more maintainable and readable.
In this cell we are going to generalize the concept of "action" by creating a class that inherits the behavior of Python's IntEnum
from enum import IntEnum
class Action(IntEnum):
    Rock = 0
    Paper = 1
    Scissors = 2

print('Action.Rock == Action.Rock', Action.Rock == Action.Rock)
print('Action.Rock == Action(0)', Action.Rock == Action(0))
print('Action(0)', Action(0))
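Because Action is an IntEnum, its members behave both like named constants and like plain integers. A small exploratory sketch (illustrative only, not part of the game):
print(Action.Paper == 1)   # True: members compare equal to their integer value
print(list(Action))        # all members, in definition order
print(len(Action))         # 3
print(Action(2).name)      # 'Scissors': from a number back to a member's name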
Through the use of functions we divide our main program into "blocks" of code that can be called whenever we need them. In particular, our game can be divided into three phases:
get_user_selection()
get_computer_selection()
determine_winner(user_selection, computer_selection)
def get_user_selection():
    user_input = input("Enter a choice (rock[0], paper[1], scissors[2]): ")
    selection = int(user_input)
    action = Action(selection)
    return action

# A more flexible version: build the prompt string from the Action enum itself,
# so it stays correct if we add new moves later
def get_user_selection():
    choices = [f"{action.name}[{action.value}]" for action in Action]
    choices_str = ", ".join(choices)
    selection = int(input(f"Enter a choice ({choices_str}): "))
    action = Action(selection)
    return action

def get_computer_selection():
    selection = random.randint(0, len(Action) - 1)
    action = Action(selection)
    return action
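As an aside, get_computer_selection could equally well use random.choice on the enum members instead of drawing a random integer; a minimal alternative sketch:
def get_computer_selection():
    # Pick a random member of Action directly, instead of picking a random
    # integer and converting it back to an Action
    return random.choice(list(Action))
The next function, determine_winner, compares the two selections.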
def determine_winner(user_action, computer_action):
    if user_action == computer_action:
        print(f"Both players selected {user_action.name}. It's a tie!")
    elif user_action == Action.Rock:
        if computer_action == Action.Scissors:
            print("Rock smashes scissors! You win!")
        else:
            print("Paper covers rock! You lose.")
    elif user_action == Action.Paper:
        if computer_action == Action.Rock:
            print("Paper covers rock! You win!")
        else:
            print("Scissors cuts paper! You lose.")
    elif user_action == Action.Scissors:
        if computer_action == Action.Paper:
            print("Scissors cuts paper! You win!")
        else:
            print("Rock smashes scissors! You lose.")
Once these functions have been created, we can create a single one that contains all the game logic that we can invoke (or call) every time we want to start a new game:
start_game()
def start_game():
    while True:
        try:
            user_action = get_user_selection()
        except ValueError as e:
            range_str = f"[0, {len(Action) - 1}]"
            print(f"Invalid selection. Enter a value in range {range_str}")
            continue
        computer_action = get_computer_selection()
        determine_winner(user_action, computer_action)
        play_again = input("Play again? (y/n): ")
        if play_again.lower() != "y":
            break
start_game()
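The try/except around get_user_selection catches the ValueError that Python raises both when the typed text is not a number and when the number does not correspond to any Action. A quick illustrative sketch of the two failure modes:
try:
    Action(7)       # no member of Action has the value 7
except ValueError as e:
    print("Out of range:", e)
try:
    int("rock")     # the user typed a word instead of a number
except ValueError as e:
    print("Not a number:", e)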
Let's create a dictionary with a key/value pair for every possible move. In particular:
the keys are the members of the Action class;
the values are the lists of Action moves that lose against the move specified as key.
victories = {
    Action.Rock: [Action.Scissors],  # Rock beats scissors
    Action.Paper: [Action.Rock],  # Paper beats rock
    Action.Scissors: [Action.Paper]  # Scissors beats paper
}
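With this dictionary in place, checking who wins becomes a simple membership test. A tiny illustrative sketch of the in operator on it:
print(Action.Scissors in victories[Action.Rock])   # True: rock beats scissors
print(Action.Paper in victories[Action.Rock])      # False: paper does not lose to rock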
We can now rewrite determine_winner using the in operator to simplify the checks:
def determine_winner(user_action, computer_action):
    print(f"You chose {user_action.name}. The computer chose {computer_action.name}.")
    defeats = victories[user_action]
    if user_action == computer_action:
        print(f"Both players selected {user_action.name}. It's a tie!")
    elif computer_action in defeats:
        print(f"{user_action.name} beats {computer_action.name}! You win!")
    else:
        print(f"{computer_action.name} beats {user_action.name}! You lose.")
start_game()
Adding lizard and spock
It is important to note that, thanks to the optimizations already made, adding new moves is almost free!
class Action(IntEnum):
    Rock = 0
    Paper = 1
    Scissors = 2
    Lizard = 3
    Spock = 4

victories = {
    Action.Scissors: [Action.Lizard, Action.Paper],
    Action.Paper: [Action.Spock, Action.Rock],
    Action.Rock: [Action.Lizard, Action.Scissors],
    Action.Lizard: [Action.Spock, Action.Paper],
    Action.Spock: [Action.Scissors, Action.Rock]
}
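As a quick sanity check (an aside, not part of the workshop code), in Rock-Paper-Scissors-Lizard-Spock every move should beat exactly two of the other four:
for action, beaten in victories.items():
    # Every move must beat exactly two others and must never beat itself
    assert len(beaten) == 2 and action not in beaten, action
print("victories table looks consistent")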
We will create two new dictionaries:
in ascii_action we will put the ASCII art of the moves;
in ascii_result we will put the ASCII art of the possible results.
ascii_action = {
Action.Scissors: r"""
_____ _
/ ___| (_)
\ `--. ___ _ ___ ___ ___ _ __ ___
`--. \/ __| / __/ __|/ _ \| '__/ __|
/\__/ / (__| \__ \__ \ (_) | | \__ \\
\____/ \___|_|___/___/\___/|_| |___/
""",
Action.Paper: r"""
______
| ___ \
| |_/ /_ _ _ __ ___ _ __
| __/ _` | '_ \ / _ \ '__|
| | | (_| | |_) | __/ |
\_| \__,_| .__/ \___|_|
| |
|_|
""",
Action.Rock: r"""
______ _
| ___ \ | |
| |_/ /___ ___| | __
| // _ \ / __| |/ /
| |\ \ (_) | (__| <
\_| \_\___/ \___|_|\_\
""",
Action.Lizard: r"""
_ _ _
| | (_) | |
| | _ __________ _ _ __ __| |
| | | |_ /_ / _` | '__/ _` |
| |___| |/ / / / (_| | | | (_| |
\_____/_/___/___\__,_|_| \__,_|
""",
Action.Spock: r"""
_____ _
/ ___| | |
\ `--. _ __ ___ ___| | __
`--. \ '_ \ / _ \ / __| |/ /
/\__/ / |_) | (_) | (__| <
\____/| .__/ \___/ \___|_|\_\\
| |
|_|
"""
}
COMPUTER_WIN = -1
HUMAN_WIN = 1
DRAW = 0
ascii_result = {
COMPUTER_WIN: r"""
_____ ________ _________ _ _ _____ ___________
/ __ \ _ | \/ || ___ \ | | |_ _| ___| ___ \\
| / \/ | | | . . || |_/ / | | | | | | |__ | |_/ /
| | | | | | |\/| || __/| | | | | | | __|| /
| \__/\ \_/ / | | || | | |_| | | | | |___| |\ \
\____/\___/\_| |_/\_| \___/ \_/ \____/\_| \_|
_ _ _____ _ _ _____ _ _ _
| | | |_ _| \ | |/ ___| | | | |
| | | | | | | \| |\ `--. | | | |
| |/\| | | | | . ` | `--. \ | | | |
\ /\ /_| |_| |\ |/\__/ / |_|_|_|
\/ \/ \___/\_| \_/\____/ (_|_|_)
""",
HUMAN_WIN: r"""
_ _ _ ____ ___ ___ _ _
| | | | | | | \/ | / _ \ | \ | |
| |_| | | | | . . |/ /_\ \| \| |
| _ | | | | |\/| || _ || . ` |
| | | | |_| | | | || | | || |\ |
\_| |_/\___/\_| |_/\_| |_/\_| \_/
_ _ _____ _ _ _____ _ _ _
| | | |_ _| \ | |/ ___| | | | |
| | | | | | | \| |\ `--. | | | |
| |/\| | | | | . ` | `--. \ | | | |
\ /\ /_| |_| |\ |/\__/ / |_|_|_|
\/ \/ \___/\_| \_/\____/ (_|_|_)
__
/ _|
| |_ ___ _ __ _ __ _____ __
| _/ _ \| '__| | '_ \ / _ \ \ /\ / /
_ _ _| || (_) | | | | | | (_) \ V V / _ _ _
(_|_|_)_| \___/|_| |_| |_|\___/ \_/\_/ (_|_|_)
""",
DRAW: r"""
_ _ _
| | (_) | |
__ _ | |_ _ ___ __| | __ _ __ _ _ __ ___ ___
/ _` | | __| |/ _ \/ _` | / _` |/ _` | '_ ` _ \ / _ \\
| (_| | | |_| | __/ (_| | | (_| | (_| | | | | | | __/
\__,_| \__|_|\___|\__,_| \__, |\__,_|_| |_| |_|\___|
__/ |
|___/
___ _ _ __
/ / | | | (_) \ \\
| || |__ _____ __ | |__ ___ _ __ _ _ __ __ _ | |
| || '_ \ / _ \ \ /\ / / | '_ \ / _ \| '__| | '_ \ / _` || |
| || | | | (_) \ V V / | |_) | (_) | | | | | | | (_| || |
| ||_| |_|\___/ \_/\_/ |_.__/ \___/|_| |_|_| |_|\__, || |
\_\ __/ /_/
|___/ """
}
After that we will create two functions to easily display actions and results in ASCII art:
display_action
display_result
def display_action(action):
    print(ascii_action[action])

def display_result(result):
    print(ascii_result[result])
display_action(Action.Spock)
To use these functions we will also need to modify the determine_winner function:
def determine_winner(user_action, computer_action):
    print("You chose")
    display_action(user_action)
    print("The computer chose")
    display_action(computer_action)
    defeats = victories[user_action]
    if user_action == computer_action:
        display_result(DRAW)
        return DRAW
    elif computer_action in defeats:
        display_result(HUMAN_WIN)
        return HUMAN_WIN
    else:
        display_result(COMPUTER_WIN)
        return COMPUTER_WIN
start_game()
We will no longer be satisfied with the single-round victory messages: we want to play a full match and see who wins, user or computer, after N rounds. Now we can have a real game against the computer and decide how long it lasts!
def print_game_results(game_results):
    num_tied = game_results.count(DRAW) / len(game_results) * 100
    num_player_wins = game_results.count(HUMAN_WIN) / len(game_results) * 100
    num_computer_wins = game_results.count(COMPUTER_WIN) / len(game_results) * 100
    print('There were ', num_tied, '% tied games',
          "\nthe player won ", num_player_wins, '% of games\nthe computer won ',
          num_computer_wins, '% of games\nin a total of ', len(game_results), ' games')
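A quick standalone check of this function with a hand-made list of results (purely illustrative):
# 2 human wins, 1 computer win, 1 draw -> 50.0%, 25.0%, 25.0%
print_game_results([HUMAN_WIN, HUMAN_WIN, COMPUTER_WIN, DRAW])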
def start_game(num_games=1):
    game_results = []
    counter = 0
    while True:
        try:
            user_action = get_user_selection()
        except ValueError as e:
            range_str = f"[0, {len(Action) - 1}]"
            print(f"Invalid selection. Enter a value in range {range_str}")
            continue
        computer_action = get_computer_selection()
        game_results.append(determine_winner(user_action, computer_action))
        counter += 1
        if counter >= num_games:
            break
    print_game_results(game_results)
    return game_results
game_results=start_game(5)
In the next cell we're going to use a Jupyter feature that allows us to create a drop-down menu on the fly (after all, this is an HTML page, isn't it?) and to associate a behavior with the choice of item from the menu!
Related concepts:
widgets.Dropdown
import ipywidgets as widgets

options = [(action.name, action.value) for action in Action]
menu = widgets.Dropdown(
    options=options,
    description='Choose:')
output = widgets.Output(layout={'border': '1px solid black'})

def on_button_clicked(b):
    output.clear_output()
    with output:
        computer_action = get_computer_selection()
        determine_winner(Action(menu.value), computer_action)

button = widgets.Button(description="Play!", button_style='success', icon='check')
button.on_click(on_button_clicked)
box = widgets.VBox([menu, button, output])
display(box)
In the following cells we will use Machine Learning to build a predictive model capable of recognizing the user's move from a snapshot of their hand captured with the webcam.
Let's install the necessary libraries and import them:
!pip install numpy
!pip install opencv-python
!pip install mediapipe
!pip install requests
import numpy as np
import mediapipe as mp
import cv2
import requests
url = "https://raw.githubusercontent.com/ntu-rris/google-mediapipe/main/data/gesture_train.csv"
# If repo is private - we need to add a token in header:
resp = requests.get(url)
with open('./gesture_train.csv', 'wb') as f:
f.write(resp.content)
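If the download fails (network problems, moved file, etc.) the cell above would silently write an error page to disk. A slightly more defensive variant (optional) checks the HTTP status first:
resp = requests.get(url, timeout=30)
resp.raise_for_status()   # raise an exception if the server did not answer with 200 OK
with open('./gesture_train.csv', 'wb') as f:
    f.write(resp.content)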
# Define default camera intrinsic
img_width = 640
img_height = 480
intrin_default = {
    'fx': img_width * 0.9,  # Approx 0.7w < f < w https://www.learnopencv.com/approximate-focal-length-for-webcams-and-cell-phone-cameras/
    'fy': img_width * 0.9,
    'cx': img_width * 0.5,  # Approx center of image
    'cy': img_height * 0.5,
    'width': img_width,
}
class GestureRecognition:
    def __init__(self):
        # 11 types of gesture 'name':class label
        self.gesture = {
            'fist': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6,
            'rock': 7, 'spiderman': 8, 'yeah': 9, 'ok': 10,
        }
        # Load training data
        file = np.genfromtxt('./gesture_train.csv', delimiter=',')
        # Extract input joint angles
        angle = file[:, :-1].astype(np.float32)
        # Extract output class label
        label = file[:, -1].astype(np.float32)
        # Use OpenCV KNN
        self.knn = cv2.ml.KNearest_create()
        self.knn.train(angle, cv2.ml.ROW_SAMPLE, label)

    def eval(self, angle):
        # Use KNN for gesture recognition
        data = np.asarray([angle], dtype=np.float32)
        ret, results, neighbours, dist = self.knn.findNearest(data, 3)
        idx = int(results[0][0])  # Index of class label
        return list(self.gesture)[idx]  # Return name of class label
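A brief usage sketch of this class (the 15 angle values below are made up, only to show the expected input: one flexion angle in degrees per finger segment, as produced later by MediaPipeHand):
gest = GestureRecognition()     # assumes gesture_train.csv is in the current folder
fake_angles = np.zeros(15)      # placeholder vector of 15 joint angles
print(gest.eval(fake_angles))   # prints the nearest of the 11 gesture names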
class MediaPipeHand:
def __init__(self, static_image_mode=True, max_num_hands=1,
model_complexity=1, intrin=None):
self.max_num_hands = max_num_hands
if intrin is None:
self.intrin = intrin_default
else:
self.intrin = intrin
# Access MediaPipe Solutions Python API
mp_hands = mp.solutions.hands
# help(mp_hands.Hands)
# Initialize MediaPipe Hands
# static_image_mode:
# For video processing set to False:
# Will use previous frame to localize hand to reduce latency
# For unrelated images set to True:
# To allow hand detection to run on every input images
# max_num_hands:
# Maximum number of hands to detect
# model_complexity:
# Complexity of the hand landmark model: 0 or 1.
# Landmark accuracy as well as inference latency generally
# go up with the model complexity. Default to 1.
# min_detection_confidence:
# Confidence value [0,1] from hand detection model
# for detection to be considered successful
# min_tracking_confidence:
# Minimum confidence value [0,1] from landmark-tracking model
# for hand landmarks to be considered tracked successfully,
# or otherwise hand detection will be invoked automatically on the next input image.
# Setting it to a higher value can increase robustness of the solution,
# at the expense of a higher latency.
# Ignored if static_image_mode is true, where hand detection simply runs on every image.
self.pipe = mp_hands.Hands(
static_image_mode=static_image_mode,
max_num_hands=max_num_hands,
model_complexity=model_complexity,
min_detection_confidence=0.5,
min_tracking_confidence=0.5)
# Define hand parameter
self.param = []
for i in range(max_num_hands):
p = {
'keypt' : np.zeros((21,2)), # 2D keypt in image coordinate (pixel)
'joint' : np.zeros((21,3)), # 3D joint in camera coordinate (m)
'class' : None, # Left / right / none hand
'score' : 0, # Probability of predicted handedness (always>0.5, and opposite handedness=1-score)
'angle' : np.zeros(15), # Flexion joint angles in degree
'gesture' : None, # Type of hand gesture
'rvec' : np.zeros(3), # Global rotation vector Note: this term is only used for solvepnp initialization
'tvec' : np.asarray([0,0,0.6]), # Global translation vector (m) Note: Init z direc to some +ve dist (i.e. in front of camera), to prevent solvepnp from wrongly estimating z as -ve
'fps' : -1, # Frame per sec
# https://github.com/google/mediapipe/issues/1351
# 'visible' : np.zeros(21), # Visibility: Likelihood [0,1] of being visible (present and not occluded) in the image
# 'presence': np.zeros(21), # Presence: Likelihood [0,1] of being present in the image or if its located outside the image
}
self.param.append(p)
def result_to_param(self, result, img):
# Convert mediapipe result to my own param
img_height, img_width, _ = img.shape
# Reset param
for p in self.param:
p['class'] = None
if result.multi_hand_landmarks is not None:
# Loop through different hands
for i, res in enumerate(result.multi_handedness):
if i>self.max_num_hands-1: break # Note: Need to check if exceed max number of hand
self.param[i]['class'] = res.classification[0].label
self.param[i]['score'] = res.classification[0].score
# Loop through different hands
for i, res in enumerate(result.multi_hand_landmarks):
if i>self.max_num_hands-1: break # Note: Need to check if exceed max number of hand
# Loop through 21 landmark for each hand
for j, lm in enumerate(res.landmark):
self.param[i]['keypt'][j,0] = lm.x * img_width # Convert normalized coor to pixel [0,1] -> [0,width]
self.param[i]['keypt'][j,1] = lm.y * img_height # Convert normalized coor to pixel [0,1] -> [0,height]
# Ignore it https://github.com/google/mediapipe/issues/1320
# self.param[i]['visible'][j] = lm.visibility
# self.param[i]['presence'][j] = lm.presence
if result.multi_hand_world_landmarks is not None:
for i, res in enumerate(result.multi_hand_world_landmarks):
if i>self.max_num_hands-1: break # Note: Need to check if exceed max number of hand
# Loop through 21 landmark for each hand
for j, lm in enumerate(res.landmark):
self.param[i]['joint'][j,0] = lm.x
self.param[i]['joint'][j,1] = lm.y
self.param[i]['joint'][j,2] = lm.z
# Convert relative 3D joint to angle
self.param[i]['angle'] = self.convert_joint_to_angle(self.param[i]['joint'])
# Convert relative 3D joint to camera coordinate
self.convert_joint_to_camera_coor(self.param[i], self.intrin)
return self.param
def convert_joint_to_angle(self, joint):
# Get direction vector of bone from parent to child
v1 = joint[[0,1,2,3,0,5,6,7,0,9,10,11,0,13,14,15,0,17,18,19],:] # Parent joint
v2 = joint[[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],:] # Child joint
v = v2 - v1 # [20,3]
# Normalize v
v = v/np.linalg.norm(v, axis=1)[:, np.newaxis]
# Get angle using arcos of dot product
angle = np.arccos(np.einsum('nt,nt->n',
v[[0,1,2,4,5,6,8,9,10,12,13,14,16,17,18],:],
v[[1,2,3,5,6,7,9,10,11,13,14,15,17,18,19],:])) # [15,]
return np.degrees(angle) # Convert radian to degree
def convert_joint_to_camera_coor(self, param, intrin, use_solvepnp=True):
# MediaPipe version 0.8.9.1 onwards:
# Given real-world 3D joint centered at middle MCP joint -> J_origin
# To estimate the 3D joint in camera coordinate -> J_camera = J_origin + tvec,
# We need to find the unknown translation vector -> tvec = [tx,ty,tz]
# Such that when J_camera is projected to the 2D image plane
# It matches the 2D keypoint locations
# Considering all 21 keypoints,
# Each keypoints will form 2 eq, in total we have 42 eq 3 unknowns
# Since the equations are linear wrt [tx,ty,tz]
# We can solve the unknowns using linear algebra A.x = b, where x = [tx,ty,tz]
# Consider a single keypoint (pixel x) and joint (X,Y,Z)
# Using the perspective projection eq:
# (x - cx)/fx = (X + tx) / (Z + tz)
# Similarly for pixel y:
# (y - cy)/fy = (Y + ty) / (Z + tz)
# Rearranging the above linear equations by keeping constants to the right hand side:
# fx.tx - (x - cx).tz = -fx.X + (x - cx).Z
# fy.ty - (y - cy).tz = -fy.Y + (y - cy).Z
# Therefore, we can factor out the unknowns and form a matrix eq:
# [fx 0 (x - cx)][tx] [-fx.X + (x - cx).Z]
# [ 0 fy (y - cy)][ty] = [-fy.Y + (y - cy).Z]
# [tz]
idx = [i for i in range(21)] # Use all landmarks
if use_solvepnp:
# Method 1: OpenCV solvePnP
fx, fy = intrin['fx'], intrin['fy']
cx, cy = intrin['cx'], intrin['cy']
intrin_mat = np.asarray([[fx,0,cx],[0,fy,cy],[0,0,1]])
dist_coeff = np.zeros(4)
ret, param['rvec'], param['tvec'] = cv2.solvePnP(
param['joint'][idx], param['keypt'][idx],
intrin_mat, dist_coeff, param['rvec'], param['tvec'],
useExtrinsicGuess=True)
# Add tvec to all joints
param['joint'] += param['tvec']
else:
# Method 2:
A = np.zeros((len(idx),2,3))
b = np.zeros((len(idx),2))
A[:,0,0] = intrin['fx']
A[:,1,1] = intrin['fy']
A[:,0,2] = -(param['keypt'][idx,0] - intrin['cx'])
A[:,1,2] = -(param['keypt'][idx,1] - intrin['cy'])
b[:,0] = -intrin['fx'] * param['joint'][idx,0] \
+ (param['keypt'][idx,0] - intrin['cx']) * param['joint'][idx,2]
b[:,1] = -intrin['fy'] * param['joint'][idx,1] \
+ (param['keypt'][idx,1] - intrin['cy']) * param['joint'][idx,2]
A = A.reshape(-1,3) # [8,3]
b = b.flatten() # [8]
# Use the normal equation AT.A.x = AT.b to minimize the sum of the sq diff btw left and right sides
x = np.linalg.solve(A.T @ A, A.T @ b)
# Add tvec to all joints
param['joint'] += x
def forward(self, img):
# Extract result
result = self.pipe.process(img)
# Convert result to my own param
param = self.result_to_param(result, img)
return param
import io
try:
    from google.colab.output import eval_js
    colab = True
except ImportError:
    colab = False
# colab=False
if colab:
from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode
from PIL import Image as PIL_Image
def take_photo(quality=0.8):
js = Javascript('''
async function takePhoto(quality) {
const div = document.createElement('div');
const capture = document.createElement('button');
capture.textContent = 'Capture';
div.appendChild(capture);
const video = document.createElement('video');
video.style.display = 'block';
const stream = await navigator.mediaDevices.getUserMedia({video: true});
document.body.appendChild(div);
div.appendChild(video);
video.srcObject = stream;
await video.play();
// Resize the output to fit the video element.
google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);
// Wait for Capture to be clicked.
await new Promise((resolve) => capture.onclick = resolve);
const canvas = document.createElement('canvas');
canvas.width = video.videoWidth;
canvas.height = video.videoHeight;
canvas.getContext('2d').drawImage(video, 0, 0);
stream.getVideoTracks()[0].stop();
div.remove();
return canvas.toDataURL('image/jpeg', quality);
}
''')
display(js)
data = eval_js('takePhoto({})'.format(quality))
binary = b64decode(data.split(',')[1])
image = PIL_Image.open(io.BytesIO(binary))
image_np = np.array(image)
# with open(filename, 'wb') as f:
# f.write(binary)
return image_np
else:
import cv2
def take_photo(filename='photo.jpg', quality=0.8):
cam = cv2.VideoCapture(0)
cv2.namedWindow("test")
img_counter = 0
while True:
ret, frame = cam.read()
# Convert the image from BGR color (which OpenCV uses) to RGB color (which face_recognition uses)
if not ret:
print("failed to grab frame")
break
cv2.imshow("test", frame)
k = cv2.waitKey(1)
if k%256 == 27 or k%256 == 32 :
# ESC pressed
break
cam.release()
cv2.destroyAllWindows()
# Preprocess image
img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
# Flip image for 3rd person view
img = cv2.flip(img, 1)
# To improve performance, optionally mark image as not writeable to pass by reference
img.flags.writeable = False
return img
def start_game(num_games=1):
    game_results = []
    counter = 0
    # Load mediapipe hand class
    pipe = MediaPipeHand(static_image_mode=True, max_num_hands=1)
    # Load gesture recognition class
    gest = GestureRecognition()
    while True:
        try:
            img = take_photo()
            # # Show the image which was just taken.
            # plt.imshow(img)
            # Feedforward to extract keypoint
            param = pipe.forward(img)
            # Evaluate gesture for all hands
            for p in param:
                if p['class'] is not None:
                    p['gesture'] = gest.eval(p['angle'])
                    # print(p['class'])
                    # print(p['gesture'])
                    # Map the recognized gesture to one of the game moves;
                    # start from None so an unmapped gesture is simply ignored
                    action = None
                    if p['gesture'] == 'fist':
                        action = Action.Rock
                    elif p['gesture'] == 'five':
                        action = Action.Paper
                    elif (p['gesture'] == 'three') or (p['gesture'] == 'yeah'):
                        action = Action.Scissors
                    elif p['gesture'] == 'rock':
                        action = Action.Lizard
                    elif p['gesture'] == 'four':
                        action = Action.Spock
                    if action is not None:
                        computer_action = get_computer_selection()
                        game_results.append(determine_winner(action, computer_action))
                        counter += 1
                        print_game_results(game_results)
                        old_action = action
            if counter >= num_games:
                break
        except Exception as err:
            # Errors will be thrown if the user does not have a webcam or if they do not
            # grant the page permission to access it.
            print(str(err))
            raise err
    pipe.pipe.close()
start_game(num_games=5)