AI Gesture Control Logic

A deep dive into how `grafgameN_8_5-5d.html` translates physical hand movements into game inputs using Computer Vision.

1. The Technology Stack

The game relies on two libraries provided by Google's TensorFlow team. Both are loaded from a Content Delivery Network (CDN) and run entirely in the browser, with no backend server.

Core TensorFlow.js

The machine learning engine. It performs the heavy math required for neural networks in the browser and can use WebGL for hardware acceleration.

<script src=".../tfjs"></script>

Model HandPose

A pre-trained neural network specifically designed to detect 21 3D landmarks of a human hand from a single video frame.

<script src=".../handpose"></script>

Implementation Strategy: The code loads the model asynchronously using handpose.load(). Once the model is ready, it repeatedly passes the current webcam frame to the model to get real-time predictions.
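A minimal sketch of that load-then-predict flow follows. It assumes the global handpose object exposed by the script tag above and a video element that is already streaming the webcam. The helper names detectOnce, startTracking, and onHand are hypothetical and not taken from the game file; the model is passed in as a parameter so the detection step can be exercised without a camera.

```javascript
// Hypothetical helper: run one detection pass and report the first hand, if any.
// `model` is whatever handpose.load() resolves to; `video` is a playing <video>.
async function detectOnce(model, video, onHand) {
  const predictions = await model.estimateHands(video);
  if (predictions.length > 0) {
    // Each prediction carries 21 landmarks, each an [x, y, z] array.
    onHand(predictions[0].landmarks);
    return true;
  }
  return false;
}

// Hypothetical driver (browser only): load once, then detect on every frame.
async function startTracking(video, onHand) {
  const model = await handpose.load();
  const tick = async () => {
    await detectOnce(model, video, onHand);
    requestAnimationFrame(tick); // schedule the next pass only after this one ends
  };
  tick();
}
```

Scheduling the next pass only after the previous one resolves keeps detections from piling up when the model is slower than the display refresh rate.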

2. Detection & Tracking Logic

The code does not track the hand as a whole. It stabilizes the input by averaging the positions of three specific finger joints (the knuckles).

Step A: Landmark Extraction

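The estimateHands(video) function returns a promise that resolves to an array of predictions, one per detected hand. Each prediction has a landmarks array of 21 points in [x, y, z] form. From it, the code extracts the MCP (metacarpophalangeal) joints, the knuckles where the fingers meet the palm.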

  • Index Finger MCP (Index 5)
  • Middle Finger MCP (Index 9)
  • Ring Finger MCP (Index 13)
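The extraction step can be sketched as below. The function name extractKnuckles and the constant MCP_INDICES are hypothetical; the only thing assumed about the library is the prediction shape described above (a landmarks array of 21 [x, y, z] entries).

```javascript
// Landmark indices of the three knuckles listed above.
const MCP_INDICES = [5, 9, 13];

// Hypothetical helper: pull the x and y of each knuckle out of one prediction.
// Returns null when the prediction is missing or does not have 21 landmarks.
function extractKnuckles(prediction) {
  if (!prediction || !Array.isArray(prediction.landmarks)) return null;
  if (prediction.landmarks.length < 21) return null;
  return MCP_INDICES.map((i) => {
    const [x, y] = prediction.landmarks[i]; // z is ignored for 2D control
    return { x, y };
  });
}
```

Returning null for a malformed prediction lets the caller skip the frame instead of averaging undefined values.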

Step B: Stability Averaging

To prevent "jitter" (shaky movement), the code averages the X and Y coordinates of these three joints to create a single Control Point.

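const landmarkIndices = [5, 9, 13];
let totalX = 0, totalY = 0;

// Sum each knuckle's x and y (assumes `landmarks` holds [x, y, z] entries)
for (const i of landmarkIndices) {
    totalX += landmarks[i][0];
    totalY += landmarks[i][1];
}

// Average the position
controlPointX = totalX / landmarkIndices.length;
controlPointY = totalY / landmarkIndices.length;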

3. Calculating Movement (The Swipe)

The core innovation in this file is the History Buffer. Instead of reacting to the hand's absolute position, the code compares where the hand is now with where it was a moment ago.

The History Array

The controlPointHistory array stores positions with timestamps. Entries older than 500 ms are continuously removed.

  • Oldest Point (t - 0.5s): where you started
  • Current Point (t - 0s): where you are now
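A minimal sketch of such a time-windowed buffer is shown below. The function name recordControlPoint and the field name t are hypothetical; only the array name controlPointHistory and the 500 ms window come from the description above.

```javascript
const MAX_AGE_MS = 500; // retention window stated in the text

// Hypothetical helper: append the newest control point, then drop stale ones.
// `history` is an array of { x, y, t } ordered oldest-first; `now` is in ms.
function recordControlPoint(history, x, y, now) {
  history.push({ x, y, t: now });
  // Entries are time-ordered, so stale ones are always at the front.
  while (history.length > 0 && now - history[0].t > MAX_AGE_MS) {
    history.shift();
  }
  return history;
}

// Example: three samples, the first of which ages out.
const controlPointHistory = [];
recordControlPoint(controlPointHistory, 100, 80, 0);
recordControlPoint(controlPointHistory, 120, 82, 300);
recordControlPoint(controlPointHistory, 160, 85, 600);
// controlPointHistory now holds the samples taken at t = 300 and t = 600.
```

After pruning, the first element is the oldest point and the last element is the current point, which is all the vector math needs.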

The Vector Math

The code calculates the difference (Delta) between the Oldest point in memory and the Current point.

// Note: X is mirrored!
diffX = oldestPoint.x - currentPoint.x;
diffY = currentPoint.y - oldestPoint.y;

const threshold = 25; // Pixels

Direction Resolution Logic

The code determines the dominant axis (Horizontal vs Vertical) to decide the game input.

  1. Threshold Check: Is the movement > 25px? If no: IDLE (0).
  2. Axis Check: Is abs(diffX) > abs(diffY)? If yes: horizontal move. If no: vertical move.
  3. Direction Check: For a horizontal move, left or right? For a vertical move, up or down?
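The three checks above can be sketched as a single function. Several things here are assumptions rather than facts from the game file: the name resolveDirection, the numeric codes other than 0 for idle, the reading of "movement" as the larger of the two axis deltas, and the sign conventions (positive diffX meaning the user moved right because X is mirrored, positive diffY meaning down because image Y grows downward).

```javascript
// Direction codes: 0 for idle comes from the text; the other values are
// placeholders chosen for this sketch.
const IDLE = 0, LEFT = 1, RIGHT = 2, UP = 3, DOWN = 4;

// Assumed inputs: diffX = oldest.x - current.x (mirrored) and
// diffY = current.y - oldest.y, both in pixels.
function resolveDirection(diffX, diffY, threshold = 25) {
  const absX = Math.abs(diffX);
  const absY = Math.abs(diffY);
  // 1. Threshold check: ignore movement at or below the threshold.
  if (Math.max(absX, absY) <= threshold) return IDLE;
  // 2. Axis check, then 3. direction check on the dominant axis.
  if (absX > absY) {
    return diffX > 0 ? RIGHT : LEFT;
  }
  return diffY > 0 ? DOWN : UP;
}
```

For example, resolveDirection(40, 10) yields RIGHT under these conventions, while resolveDirection(10, -5) stays IDLE because neither axis exceeds 25 pixels.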

4. Visualizing the Logic

The diagram below simulates how the code interprets a physical hand swipe. The Red Line represents the controlPointHistory (the vector).

5. Game Loop Integration

Finally, the code injects this logic into the game loop using a non-blocking approach.

requestAnimationFrame(gameLoop);

function gameLoop(now) {
    // 1. Run CV (Promise based)
    model.estimateHands(video).then( preds => {
        direction = getMotionDirection(now);
        
        // Persistent Movement Logic
        if (direction !== 0) {
            lastValidDirection = direction;
            arah_gelinding = direction;
        }
    });

    // 2. Update Game State
    // Moves the ball based on 'arah_gelinding'
}

Critical Detail: The direction is copied into lastValidDirection and arah_gelinding only when it is non-zero. If the camera momentarily loses track of the hand, or the user stops moving, the ball therefore keeps rolling in the last intended direction.
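The effect of that guard can be illustrated in isolation. The function name nextRollDirection and the sample direction codes 2 and 3 are hypothetical; only the rule "ignore 0, keep the previous value" comes from the loop above.

```javascript
// Hypothetical reducer: keep the previous rolling direction whenever the
// detector reports idle (0), mirroring the `direction !== 0` guard.
function nextRollDirection(previous, detected) {
  return detected !== 0 ? detected : previous;
}

// Detector output over six frames; 0 means idle or hand lost.
const detectedPerFrame = [0, 2, 0, 0, 3, 0];
let rolling = 0;
const rollingPerFrame = detectedPerFrame.map((d) => {
  rolling = nextRollDirection(rolling, d);
  return rolling;
});
// rollingPerFrame is [0, 2, 2, 2, 3, 3]
```

The ball stays still until the first real swipe, then changes course only when a new non-zero direction arrives.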