Skeleton Estimation

This function block performs full-body skeleton estimation on input images. It provides multiple detail levels (Body, Body with Feet, Wholebody), adjustable performance modes, and confidence thresholds so you can balance speed and accuracy. Results include a visualization image, structured skeleton data, model metadata, and a person count.

πŸ“₯ Inputs

Image The image to analyze for human poses. Accepts typical image sources (camera frames, loaded images, or processed images from other blocks).

πŸ“€ Outputs

Visualization An image with skeletons and bounding boxes drawn for detected people.

Skeletons Structured data describing detected persons, their bounding boxes, keypoints (with names and confidences), and optional body-part groupings.

Model Info Metadata about the current model selection and runtime settings (model type, mode, device, thresholds).

Person Count Number of detected persons included in the structured output.
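
If you consume Skeletons in a downstream custom logic block, it helps to picture the data shape. The exact schema is application-defined and not documented here; the field names below (`persons`, `bbox`, `score`, `keypoints`) are illustrative assumptions only:

```python
# Illustrative sketch of a possible Skeletons payload. The real field names
# and units depend on the application; everything below is an assumption.
example_skeletons = {
    "persons": [
        {
            "bbox": [120, 80, 340, 560],   # assumed [x1, y1, x2, y2] in pixels
            "score": 0.92,                 # person detection confidence
            "keypoints": [
                {"name": "nose", "x": 230.0, "y": 110.0, "confidence": 0.97},
                {"name": "left_shoulder", "x": 190.0, "y": 190.0, "confidence": 0.88},
                # ... remaining keypoints (17, 26, or more, depending on Model Type)
            ],
        }
    ]
}

# Person Count corresponds to the number of persons in the structured output.
person_count = len(example_skeletons["persons"])
```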

πŸ•ΉοΈ Controls

Model Type Choose the detail level: Body (17 keypoints), Body with Feet (26 keypoints), or Wholebody (body + face + hands).

Mode Select a processing profile that trades speed against accuracy, e.g. lightweight, balanced, or performance.

Skeleton Style Choose the keypoint output format, for example the MMPose or OpenPose convention.

Detection Threshold Adjust minimum confidence required to consider a detected person valid.

Keypoint Threshold Adjust minimum confidence required to mark individual keypoints as visible.

Max Persons Limit the number of people processed and returned to keep performance stable.

🎯 Key Features

  • Multiple model types to suit your use case: quick body-only detection or detailed whole-body analysis (face + hands).

  • Performance tuning via Mode, Detection Threshold, and Max Persons to adapt to device capabilities.

  • Confidence-based keypoint filtering so only reliable keypoints are reported as visible.

  • Visual feedback with skeleton overlays and bounding boxes for easy verification.

  • Structured outputs suitable for downstream automation, analytics, or logging.

βš™οΈ Running Mechanism (User-Facing)

  • When an image is provided, the block runs the selected estimation model and returns both an annotated image and structured pose data.

  • Detection Threshold controls whether a detected person is considered valid. Lower values return more detections but may include false positives; higher values are stricter.

  • Keypoint Threshold controls which keypoints are flagged as visible; use this to ignore low-confidence joints.

  • Max Persons truncates results to the top detections to preserve performance on crowded scenes.

  • The block adapts to the chosen Mode to trade off speed and accuracy: choose lighter modes for real-time needs and heavier modes when precision matters.
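
The filtering behavior described above can be mirrored in plain Python. This is a hypothetical sketch of the logic, not the block's actual implementation; it assumes the illustrative person-dict schema with `score` and per-keypoint `confidence` fields:

```python
def filter_detections(persons, det_thr=0.5, kpt_thr=0.3, max_persons=10):
    """Hypothetical post-filter mirroring the block's controls.

    `persons` is a list of dicts with 'score' (person confidence) and
    'keypoints' (each with a 'confidence'). Schema is assumed.
    """
    # Detection Threshold: keep only persons whose confidence passes it.
    kept = [p for p in persons if p["score"] >= det_thr]
    # Max Persons: sort by confidence and truncate to the top detections.
    kept = sorted(kept, key=lambda p: p["score"], reverse=True)[:max_persons]
    # Keypoint Threshold: flag individual keypoints visible only above it.
    for p in kept:
        for kp in p["keypoints"]:
            kp["visible"] = kp["confidence"] >= kpt_thr
    return kept
```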

πŸ“ Usage Instructions

  1. Provide an image source to Image. Typical sources are camera frames (e.g. Camera USB or Camera IP (ONVIF)) or a loaded image (Load Image).

  2. Choose the desired Model Type depending on the detail you need.

  3. Set Mode to match your performance expectations (faster or more accurate).

  4. Tune Detection Threshold and Keypoint Threshold to filter unreliable detections.

  5. Optionally reduce Max Persons for faster processing on resource-limited systems.

  6. Use the outputs: visualize Visualization, send Skeletons to analytics or logging, and monitor Person Count.

πŸ’‘ Tips and Tricks

  • For live setups, use Camera USB, Camera IP (ONVIF), or Stream Reader as image sources. For testing, use Load Image.

  • If processing is slow, try:

    • Selecting a faster Mode.

    • Reducing Max Persons.

    • Pre-resizing images with Image Resize before feeding into this block.

  • Improve robustness in noisy images by applying preprocessing such as Blur or Denoising before the skeleton block.

  • To focus on a specific area (e.g., a doorway or assembly line), crop with Image ROI or Image ROI Select and run skeleton estimation only on that region.

  • Combine outputs with visualization and logging blocks:

    • Send Visualization to Show Image for an interactive preview.

    • Overlay bounding boxes or labels using Draw Detections or Write Text On Image to create clear operator displays.

    • Save verification frames with Image Logger or record sessions with Record Video for audits.

    • Convert structured Skeletons data into logs using Data to JSON or export counts via CSV Export.

  • For higher-level safety or analytics:

    • Use Skeletons (person positions) together with Social Distance Detector to check proximity violations (you may need a perspective transform or calibration via Perspective Transform).

    • Feed person bounding boxes or centers into custom logic blocks to trigger alerts or external actions (e.g., Send Mail or MQTT Publish).
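
Extracting person centers for distance checks or alert triggers can be done in a small custom logic step. A minimal sketch, assuming each person dict carries a `bbox` in `[x1, y1, x2, y2]` pixel form (an assumed schema, not the block's documented output):

```python
def person_centers(persons):
    """Compute bounding-box centers, e.g. as input to a distance check.

    Assumes each person dict has 'bbox' = [x1, y1, x2, y2] in pixels.
    """
    centers = []
    for p in persons:
        x1, y1, x2, y2 = p["bbox"]
        centers.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0))
    return centers
```

Note that image-space centers are only a proxy for real-world position; for accurate proximity checks, map them to a ground plane first (e.g. via Perspective Transform).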

πŸ› οΈ Troubleshooting

  • If you get few or no detections:

    • Lower Detection Threshold and/or Keypoint Threshold incrementally, or try a more accurate Mode.

    • Ensure the subject is well-lit and clearly visible in the Image source.

    • Try preprocessing with Image Resize (upsample or downsample) to match the expected subject scale.

  • If performance or responsiveness is poor:

    • Pick a lighter Mode, lower Max Persons, or preprocess images to smaller sizes with Image Resize.

  • If results look noisy or jittery across frames:

    • Consider adding temporal smoothing in downstream logic or only logging confident detections (use thresholds).

  • If model initialization or runtime fails: ensure required runtime components are available (installable via the application’s module tools) and then re-run the block.
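
For the jitter case above, a simple exponential moving average over keypoint positions is often enough. This is a downstream helper you would add yourself, not a feature of the block; it assumes keypoints arrive as `(x, y)` tuples per frame:

```python
def smooth_keypoints(prev, curr, alpha=0.6):
    """Exponentially smooth keypoint positions across frames to reduce jitter.

    `prev` and `curr` are lists of (x, y) tuples for the same person;
    `alpha` in (0, 1] weights the current frame (higher = less smoothing).
    Assumed downstream helper, not part of the Skeleton Estimation block.
    """
    if prev is None:  # first frame: nothing to smooth against
        return list(curr)
    return [
        (alpha * cx + (1 - alpha) * px, alpha * cy + (1 - alpha) * py)
        for (px, py), (cx, cy) in zip(prev, curr)
    ]
```

In practice, keep the previous smoothed result per tracked person and only smooth keypoints whose confidence passes the Keypoint Threshold.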

πŸ”— Example Block Flows

  • Real-time monitoring: Camera USB β†’ Image Resize β†’ Skeleton Estimation β†’ Draw Detections β†’ Show Image

  • Logged audit trail: Camera IP (ONVIF) β†’ Skeleton Estimation β†’ Image Logger + Data to JSON

  • Safety enforcement (distance checks): Camera USB β†’ Skeleton Estimation β†’ (extract person centers) β†’ Social Distance Detector β†’ Draw Result On Image

Use these combinations to build reliable and performant pose-detection systems without needing to touch implementation details.
