Hand Pose Estimation

This function block detects and estimates hand keypoints (21 per hand) in images and provides both a visual overlay and structured detection data. It is designed for real-time use and offers controls for detection sensitivity, keypoint visibility, style of skeleton output, and the maximum number of hands to process.

📥 Inputs

Image: Feed an image (camera frame, loaded image, or processed image) to analyze for hands.

📤 Outputs

Visualization: Annotated image showing keypoints, skeletons, and bounding boxes.

Hands: Structured detection data (list/dictionary) including bounding boxes, per-keypoint positions, confidence scores, and visibility flags.

Model Info: Basic runtime information such as the selected skeleton style and threshold settings.

Hand Count: Number of hands detected (after applying the configured limits).
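The exact schema of the Hands output depends on the application version; the field names below (`bbox`, `score`, `keypoints`, `visible`) are illustrative assumptions, not a fixed API. A minimal sketch of consuming such a structure in Python:

```python
# Hypothetical shape of the Hands output; actual field names may differ
# in your version of the application.
hands = [
    {
        "bbox": [120, 80, 260, 240],   # x1, y1, x2, y2
        "score": 0.91,                 # detection confidence (0-1)
        "keypoints": [                 # 21 entries, one per hand joint
            {"x": 130.5, "y": 95.2, "score": 0.88, "visible": True},
            # ... remaining keypoints ...
        ],
    },
]

def visible_points(hands, det_thr=0.5):
    """Collect (x, y) of visible keypoints from sufficiently confident hands."""
    pts = []
    for hand in hands:
        if hand["score"] < det_thr:
            continue  # skip weak detections
        pts.extend((kp["x"], kp["y"])
                   for kp in hand["keypoints"] if kp["visible"])
    return pts

print(len(visible_points(hands)))  # number of visible keypoints found
```

This kind of flattening is convenient when feeding keypoints into later logic, UI overlays, or logging blocks.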

πŸ•ΉοΈ Controls

Skeleton Style: Choose how keypoints/skeletons are formatted for the visualization (e.g., MMPose or OpenPose style).

Det Threshold: Minimum confidence required for a hand detection to be considered valid (0–100 scale).

Keypoint Threshold: Minimum confidence for an individual keypoint to be considered visible (0–100 scale).

Max Hands: Limit how many hands are returned and drawn (useful for reducing output size and processing in crowded scenes).

🎨 Features

  • Visual overlay with keypoints, skeleton connections and bounding boxes for each detected hand.

  • Structured JSON-like output for downstream logic: bounding boxes, per-keypoint (x,y) positions, confidence and visibility.

  • User-adjustable thresholds to trade off sensitivity vs. false positives.

  • Limit the number of hands processed with Max Hands for predictable downstream behavior.

  • Automatically uses available hardware to improve speed (will prefer GPU if available).

πŸ“ Usage Instructions

  1. Connect an image source (live camera, stream, or image file) to the Image input.

  2. Choose the preferred Skeleton Style for the visualization and downstream format.

  3. Adjust Det Threshold to control whether weak detections are ignored.

  4. Adjust Keypoint Threshold to control which keypoints are considered visible.

  5. Set Max Hands if you only want to track a limited number of hands.

  6. Read outputs: use Visualization to preview, and use Hands / Hand Count for logic, UI or logging.
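As an illustration of step 6, downstream logic can gate on the Hand Count and Hands outputs. The function and field names below are hypothetical stand-ins for whatever your pipeline wires these outputs into:

```python
# Hypothetical downstream check: react only when at least one confident
# hand is present. "hand_count" and "hands" stand in for the block's
# Hand Count and Hands outputs; the "score" field is illustrative.
def should_trigger(hand_count, hands, min_score=0.6):
    """Return True when at least one detected hand exceeds min_score."""
    if hand_count == 0:
        return False
    return any(h["score"] >= min_score for h in hands)

hands = [{"score": 0.72}, {"score": 0.40}]
print(should_trigger(len(hands), hands))  # one hand passes the 0.6 bar
```

A check like this is a cheap way to drive UI state or logging without touching the full keypoint data.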

📊 How it runs

Given an input image, the block detects hands, applies the detection and keypoint confidence thresholds, limits the results to Max Hands, and then produces four outputs: the annotated Visualization image, the structured Hands list with bounding boxes and per-keypoint details, a brief Model Info summary, and the Hand Count.
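The thresholding and limiting steps can be sketched as follows. This is a simplified illustration, not the block's actual implementation; it assumes the 0–100 UI thresholds are scaled down and compared against 0–1 model confidences:

```python
# Simplified sketch of the post-detection filtering described above.
# Data structures and scaling are assumptions, not the block's real code.
def filter_hands(raw_hands, det_threshold=50, kp_threshold=30, max_hands=2):
    """Apply detection/keypoint thresholds (0-100 UI scale) and Max Hands."""
    det_thr = det_threshold / 100.0   # UI scale -> model confidence scale
    kp_thr = kp_threshold / 100.0
    kept = [h for h in raw_hands if h["score"] >= det_thr]
    kept.sort(key=lambda h: h["score"], reverse=True)  # strongest first
    kept = kept[:max_hands]                            # enforce Max Hands
    for hand in kept:
        for kp in hand["keypoints"]:
            kp["visible"] = kp["score"] >= kp_thr      # visibility flag
    return kept

raw = [
    {"score": 0.9, "keypoints": [{"score": 0.8}, {"score": 0.2}]},
    {"score": 0.4, "keypoints": [{"score": 0.9}]},
]
result = filter_hands(raw, det_threshold=50, kp_threshold=30, max_hands=2)
print(len(result))  # only the 0.9-confidence hand survives the 50% threshold
```

Note how a high Det Threshold can discard a hand whose individual keypoints are confident, which is why the two thresholds are tuned separately.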

💡 Tips and Tricks

  • For live input combine with Camera USB, Camera IP (ONVIF), or Stream Reader to feed continuous frames.

  • Use Show Image to preview the Visualization output in a larger window while tuning thresholds.

  • Preprocess noisy images with Blur, Denoising or Image Resize to improve detection stability.

  • If the hands appear cropped or you only want to analyze a specific area, place an Image ROI Select or Image ROI block before this block.

  • To annotate results for reporting, combine Visualization with Write Text On Image or Draw Result On Image and then save with Image Logger, Image Write or Record Video.

  • Use Object Detection or Object Detection - Custom before this block when you want to first locate people and then analyze only person regions for handsβ€”this reduces false positives and speeds processing.

  • If you need full-body keypoints as well as hand keypoints, consider pairing with Skeleton Estimation and merge results in subsequent processing steps.

🛠️ Troubleshooting

  • No detections: Try lowering Det Threshold and Keypoint Threshold slightly, or improve image clarity with Image Resizer.

  • False positives / noisy keypoints: Increase thresholds and/or crop the region of interest with Image ROI Select to remove clutter.

  • Too slow: Lower image resolution via Image Resize, reduce Max Hands, or use a faster image source. Using a system with GPU will accelerate processing.

  • Missing dependencies or model not available: The block requires the hand-pose model to be available. If the model or runtime components are not present, follow the application’s module installer / module downloader to add the required runtime and model packages.

If you need example combinations or a recommended small pipeline for live hand tracking (camera → preprocess → hand pose → display / save), ask for a suggested block chain and a short explanation.
