πŸ“₯ Dataset Collection

The fastest way to build a high-performance AI model is to capture data on purpose. This page covers how to collect high-quality images and videos using AugeLab Studio's native tools.

You may skip this section if you already have a folder of images/videos ready for annotation.


Planning Your Dataset

It's crucial to plan your dataset before collection. A well-structured dataset leads to better model performance.

πŸ“Š How Much Data Do You Need?

The number of images required depends on how much the environment changes. Use this table as a starting point for your collection goal.

| Project Type | Environment | Recommended Images per Class* |
| --- | --- | --- |
| Simple | Controlled lighting, fixed camera, 1–2 classes | 50–150 |
| Industrial | Factory floor, changing shifts, conveyor belt | 200–500 |
| Complex | Variable lighting, many classes, moving camera | 1,000+ |
| Complex Outdoor | Outdoor scenes with weather changes | 2,000+ |
| Rare Event | Detecting occasional defects or leaks | 50 Target / 100 Empty |

*Images per class refers to the number of annotated instances of each object category, not just total images.

For best results, aim for diversity in angles, distances, and lighting within your dataset.


πŸ—οΈ Define the boundaries:

Write these down before taking the first photo to ensure your dataset is representative and consistent.

  1. Class List: What specific objects are you detecting?

  2. Camera Specs: What is the final mounting angle, distance, and Field of View (FoV)? Single or multiple cameras?

  3. Variations: Will there be shifts in lighting (glare/shadows) or background clutter?

  4. Negatives: What does an "empty" scene look like?

  5. Scope: What objects should the model intentionally ignore?


Camera Configuration

Whether you use a USB, IP, or industrial camera, optimize the following settings before collection (a scripted example follows the list):

  • Resolution: Aim for 480p to 720p (640x480 is a common standard). Higher resolutions can be downscaled later.

  • Frame Rate: 15-30 FPS is sufficient for most object detection tasks.

  • Focus: Set to manual focus to avoid shifts during collection.

  • Exposure: Use manual exposure settings to maintain consistent lighting.

  • Save Settings: Save your camera settings profile, most cameras allow saving presets, so the settings remain consistent across sessions.

Dataset Collection

You can collect images and videos for your object detection dataset directly within AugeLab Studio using built-in tools. This ensures compatibility and streamlines the annotation process.

Another option is to download public datasets or use external cameras/software, but this may require additional formatting steps.

Capture Inside AugeLab Studio

Capturing inside the Studio environment lets you use triggers (buttons, PLC signals, or timers) to automate your collection.

1. Start from the Example Project

AugeLab ships with a pre-configured template for this exact task.

  • Path: File β†’ Example Projects (or "Example Scenarios")

  • Project: "Data Collection for AI Training"

πŸ“Έ Single Images: The Image Write Block

Use this for high-quality static frames. It is best for "same scene, many positions."

| Input/Setting | Logic |
| --- | --- |
| Folder Path | Where images are stored. |
| Save (Trigger) | Set to True to capture a frame. Pair this with a button or a timer. |
| Compress Image | Checked = .jpg (smaller) |
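The Image Write Block keeps this workflow entirely inside Studio. If you ever need the same behaviour from an external script, the sketch below mimics the Save trigger paired with a timer using OpenCV; the camera index, folder path, interval, and image count are placeholder choices.

```python
import time
from pathlib import Path

import cv2

out_dir = Path("dataset/images")      # placeholder folder path
out_dir.mkdir(parents=True, exist_ok=True)

cap = cv2.VideoCapture(0)             # placeholder camera index
interval_s = 2.0                      # timer trigger: save one frame every 2 seconds
last_save = 0.0
saved = 0

while saved < 100:                    # stop after 100 images
    ok, frame = cap.read()
    if not ok:
        break
    now = time.time()
    if now - last_save >= interval_s:  # the "Save" condition fires
        # Writing .jpg here plays the role of "Compress Image" checked.
        cv2.imwrite(str(out_dir / f"img_{saved:04d}.jpg"), frame)
        saved += 1
        last_save = now

cap.release()
print(f"saved {saved} images to {out_dir}")
```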

πŸŽ₯ Continuous Motion: The Record Video Block

Best for conveyor belts or fast-moving inspections where you intend to extract frames later.

| Input/Setting | Logic |
| --- | --- |
| Video Quality | Compressed = .mp4 |
| Trigger Mode: Spacebar | Press Space to start/stop recording. |
| Trigger Mode: Once | Record = True toggles recording on/off. |

Plan recordings as short, focused clips (10–60s) rather than one massive file. This makes frame extraction much easier.
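Those short clips still have to be broken into individual images before annotation. A minimal frame-extraction sketch with OpenCV (the clip path, output folder, and every-10th-frame step are placeholders; widen the step if consecutive frames look nearly identical):

```python
from pathlib import Path

import cv2

video_path = "recordings/clip_001.mp4"   # placeholder clip name
out_dir = Path("dataset/images")
out_dir.mkdir(parents=True, exist_ok=True)

step = 10                                 # keep every 10th frame to avoid near-duplicates
cap = cv2.VideoCapture(video_path)
index = saved = 0

while True:
    ok, frame = cap.read()
    if not ok:                            # end of clip (or unreadable file)
        break
    if index % step == 0:
        cv2.imwrite(str(out_dir / f"clip001_{saved:04d}.jpg"), frame)
        saved += 1
    index += 1

cap.release()
print(f"extracted {saved} frames")
```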


πŸ“‰ Collecting Background (Negative) Images

A robust model needs to know what not to detect. You must capture "Empty" scenes on purpose.

  • What to capture: Empty conveyors, empty workstations, or common non-target objects (fixtures, tools).

  • Empty: An annotation file exists, but has no boxes.

  • Excluded: No annotation file exists.


Public Datasets

If you need to supplement your own data, consider these public datasets:

  • COCO Dataset: Large-scale object detection, segmentation, and captioning dataset.

  • Pascal VOC: Standard dataset for object detection and segmentation.

  • Open Images Dataset: A dataset with ~9 million images annotated with image-level labels and bounding boxes.

  • ImageNet: Large visual database designed for use in visual object recognition research.

  • Kaggle Datasets: Various datasets for machine learning, including object detection.

πŸ“‚ Folder Structure & Preparation

AugeLab Studio loads datasets by folder, so keep every captured image and video for a project (and, later, its annotation files) together in a single dataset directory.
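A minimal sketch of one possible layout, assuming a simple split between frames, raw clips, and label files; the folder names are placeholders rather than a Studio requirement, so adapt them to whatever your annotation step expects:

```python
from pathlib import Path

# Assumed layout (placeholder names, not a required convention):
#   my_dataset/
#     images/   captured or extracted frames (.jpg)
#     videos/   raw .mp4 clips kept for later frame extraction
#     labels/   annotation files created during labelling
root = Path("my_dataset")
for sub in ("images", "videos", "labels"):
    (root / sub).mkdir(parents=True, exist_ok=True)
```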


🏁 Capture Checklist

| Check | Requirement |
| --- | --- |
| Quality | Avoid heavy motion blur or over-exposure where edges disappear. |
| Coverage | Capture objects in the center, corners, and edges of the frame. |
| Scale | Match the real-world distance from the camera to the object. |
| Clutter | Include the messy backgrounds the camera will actually see. |
| Resolution | Most AI models work best between 480p and 720p (640x480 is a common standard). |
