Dataset Collection
The fastest way to build a high-performance AI model is to capture data on purpose. This page covers how to collect high-quality images and videos using AugeLab Studio's native tools.
Planning Your Dataset
It's crucial to plan your dataset before collection. A well-structured dataset leads to better model performance.
How Much Data Do You Need?
The number of images required depends on how much the environment changes. Use this table as a starting point for your collection goal.
| Scenario | Conditions | Images per class* |
| --- | --- | --- |
| Simple | Controlled lighting, fixed camera, 1-2 classes. | 50 - 150 |
| Industrial | Factory floor, changing shifts, conveyor belt. | 200 - 500 |
| Complex | Variable lighting, many classes, moving camera. | 1,000+ |
| Complex Outdoor | Outdoor scenes with weather changes. | 2,000+ |
| Rare Event | Detecting occasional defects or leaks. | 50 Target / 100 Empty |
*Images per class refers to the number of annotated instances of each object category, not just total images.
The number of instances per class should be roughly consistent across the dataset. If it is not, augmentation can help balance the classes later.
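To check a collection against these targets, you can count annotated instances per class. The sketch below is a standalone script, not part of AugeLab Studio. It assumes YOLO-style label files (one .txt per image, one box per line, class ID as the first token); if your annotations use a different format, adapt the parsing. The function names are illustrative.

```python
from collections import Counter
from pathlib import Path


def count_instances(label_dir):
    """Count annotated boxes per class ID across all .txt label files.

    Assumes YOLO-style lines: "<class_id> <x> <y> <w> <h>".
    """
    counts = Counter()
    for label_file in Path(label_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if parts:  # blank lines and empty files contribute nothing
                counts[int(parts[0])] += 1
    return counts


def classes_below_target(counts, class_ids, target):
    """Return the class IDs that have fewer than `target` instances."""
    return [cid for cid in class_ids if counts.get(cid, 0) < target]
```

For example, calling classes_below_target(count_instances("labels"), [0, 1, 2], 200) would list the classes that still need more data for an "Industrial" scenario. The folder name "labels" is a placeholder.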
Define the boundaries:
Write these down before taking the first photo to ensure your dataset is Representative and Consistent.
Class List: What specific objects are you detecting?
Camera Specs: What is the final mounting angle, distance, and Field of View (FoV)? Single or multiple cameras?
Variations: Will there be shifts in lighting (glare/shadows) or background clutter?
Negatives: What does an "empty" scene look like?
Scope: What objects should the model intentionally ignore?
Camera Configuration
Whether using a USB camera, IP camera, or industrial camera, ensure the following settings are optimized before collection:
Resolution: Aim for 480p to 720p (640x480 is a common standard). Higher resolutions can be downscaled later.
Frame Rate: 15-30 FPS is sufficient for most object detection tasks.
Focus: Set to manual focus to avoid shifts during collection.
Exposure: Use manual exposure settings to maintain consistent lighting.
Save Settings: Save your camera settings as a profile so they remain consistent across sessions. Most cameras allow saving presets.
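If your camera software cannot store presets, one option is to keep the settings in a small file of your own and compare against it before each session. The sketch below is a standalone helper, not an AugeLab Studio feature. The setting names (width, height, fps, autofocus, exposure) are hypothetical; use whatever your camera actually exposes.

```python
import json
from pathlib import Path


def save_profile(path, settings):
    """Write the camera settings to a JSON file, sorted for easy diffing."""
    Path(path).write_text(json.dumps(settings, indent=2, sort_keys=True))


def settings_drift(path, current):
    """Return {name: (saved, current)} for every setting that differs from the profile."""
    saved = json.loads(Path(path).read_text())
    names = sorted(set(saved) | set(current))
    return {
        name: (saved.get(name), current.get(name))
        for name in names
        if saved.get(name) != current.get(name)
    }


# Hypothetical profile matching the recommendations above.
PROFILE = {"width": 640, "height": 480, "fps": 30, "autofocus": False, "exposure": -6}
```

An empty result from settings_drift means the session matches the saved profile. Any entry in the result names a setting to correct before capturing.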
Dataset Collection
You can collect images and videos for your object detection dataset directly within AugeLab Studio using built-in tools. This ensures compatibility and streamlines the annotation process.
Another option is to download public datasets or use external cameras/software, but this may require additional formatting steps.
Capture Inside AugeLab Studio
Using the Studio environment allows you to use triggers (buttons, PLC signals, or timers) to automate your collection.
1. Start from the Example Project
AugeLab ships with a pre-configured template for this exact task.
Path: File → Example Projects (or "Example Scenarios") → Project: "Data Collection for AI Training"
Single Images: The Image Write Block
Use the Image Write block for high-quality static frames. It is best for "same scene, many positions."
Folder Path: Where images are stored.
Save (Trigger): Set to True to capture a frame. Pair this with a button or a timer.
Compress Image: Checked = .jpg (smaller files).
Continuous Motion: The Record Video Block
Use the Record Video block for conveyor belts or fast-moving inspections where you intend to extract frames later.
Video Quality: Compressed = .mp4
Trigger Mode (Spacebar): Press Space to start/stop recording.
Trigger Mode (Once): Record=True toggles recording on/off.
Plan recordings as short, focused clips (10-60s) rather than one massive file. This makes frame extraction much easier.
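A 60-second clip at 30 FPS holds 1,800 frames, most of them nearly identical, so extraction usually keeps only a sample. The sketch below computes which frame numbers to keep when taking one frame every few seconds. It is a standalone illustration with hypothetical names, and it does not read video files itself; pass the resulting indices to whatever extraction tool you use.

```python
def frame_indices(duration_s, fps, every_s):
    """Return the frame numbers to keep when sampling one frame every `every_s` seconds."""
    if fps <= 0 or every_s <= 0:
        raise ValueError("fps and every_s must be positive")
    total_frames = int(duration_s * fps)
    step = max(1, round(every_s * fps))  # never step by less than one frame
    return list(range(0, total_frames, step))
```

For a 10-second clip at 30 FPS sampled every 0.5 seconds, this keeps 20 frames (0, 15, 30, and so on). Slow-moving scenes can use a longer interval; fast conveyors may need a shorter one.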
Collecting Background (Negative) Images
A robust model needs to know what not to detect. You must capture "Empty" scenes on purpose.
What to capture: Empty conveyors, empty workstations, or common non-target objects (fixtures, tools).
During annotation, a background image ends up in one of two states:
Empty: An annotation file exists but contains no boxes.
Excluded: No annotation file exists.
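Before training, it can help to confirm which state each image is in. The sketch below groups images by annotation state. It assumes each label is a .txt file with the same base name stored next to the image, which may not match your project layout; the function name and extension list are illustrative.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image types


def annotation_status(image_dir):
    """Group image file names into 'labeled', 'empty', and 'excluded'."""
    status = {"labeled": [], "empty": [], "excluded": []}
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue
        label = image.with_suffix(".txt")  # assumed: label sits beside the image
        if not label.exists():
            status["excluded"].append(image.name)  # no annotation file
        elif label.read_text().strip():
            status["labeled"].append(image.name)  # file contains at least one box
        else:
            status["empty"].append(image.name)  # file exists but has no boxes
    return status
```

If a background image you captured on purpose shows up under "excluded", it has no annotation file yet.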
Public Datasets
If you need to supplement your own data, consider these public datasets:
COCO Dataset: Large-scale object detection, segmentation, and captioning dataset.
Pascal VOC: Standard dataset for object detection and segmentation.
Open Images Dataset: A dataset with ~9 million images annotated with image-level labels and bounding boxes.
ImageNet: Large visual database designed for use in visual object recognition research.
Kaggle Datasets: Various datasets for machine learning, including object detection.
Folder Structure & Preparation
AugeLab Studio loads datasets by folder. Ensure your structure looks like this:
Capture Checklist
Quality: Avoid heavy motion blur or over-exposure where edges disappear.
Coverage: Capture objects in the center, corners, and edges of the frame.
Scale: Match the real-world distance from the camera to the object.
Clutter: Include the messy backgrounds the camera will actually see.
Resolution: Most AI models work best between 480p and 720p (640x480 average).
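The coverage item can be checked roughly once some images are annotated. The sketch below bins box centers into a 3x3 grid and reports cells that never contain an object. It assumes centers are given as (x, y) pairs normalized to the 0-1 range, as in YOLO-style labels; the function names are illustrative.

```python
def grid_coverage(centers, rows=3, cols=3):
    """Count box centers per grid cell; centers are (x, y) pairs normalized to 0-1."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in centers:
        col = min(int(x * cols), cols - 1)  # clamp x == 1.0 into the last column
        row = min(int(y * rows), rows - 1)  # clamp y == 1.0 into the last row
        grid[row][col] += 1
    return grid


def empty_cells(grid):
    """Return (row, col) positions that contain no box centers."""
    return [(r, c) for r, row in enumerate(grid) for c, n in enumerate(row) if n == 0]
```

Empty cells along the borders suggest capturing more frames with objects near the corners and edges. Whether a gap matters depends on where objects can actually appear in production.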