Status
Open
Goal: Show how to generate datasets for AI training (Computer Vision).
- Annotations can be bounding boxes, segmentation masks, depth maps, etc.
Outline:
- What synthetic data generation (SDG) is and why it’s important (e.g., filling gaps where real data is expensive).
- Setting up a scene and a Replicator script.
- Randomizing environments: lighting, materials, object placement.
- Defining annotations (e.g. bounding boxes, segmentation masks, keypoints).
- Exporting the dataset.
- Demo: Generate a small dataset of cubes and spheres with randomized lighting and show exported JSON + images.
- Train a model, e.g. using Roboflow or custom models such as Faster R-CNN or YOLO.
- Takeaway: Understand how to prototype synthetic datasets for training models.
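The demo step above (randomized lighting, cube/sphere scenes, exported JSON + images) could be sketched in plain Python before wiring it into Replicator. Everything below is an illustrative assumption, not Replicator API code: the randomization ranges, the 640x480 image size, and the COCO-style `images`/`annotations`/`categories` layout are stand-ins for what the real writer would emit.

```python
import json
import random

def make_record(image_id, rng):
    """Randomize a toy scene (one cube or sphere plus a light) and emit a
    COCO-style image/annotation pair. All ranges are illustrative."""
    shape = rng.choice(["cube", "sphere"])
    # Randomized lighting, analogous to a Replicator light randomizer.
    light = {
        "intensity": rng.uniform(500.0, 5000.0),
        "position": [rng.uniform(-2, 2), rng.uniform(2, 4), rng.uniform(-2, 2)],
    }
    # Random object placement: a 2D bounding box [x, y, w, h] as if the
    # object were projected into a 640x480 image.
    x, y = rng.uniform(0, 560), rng.uniform(0, 400)
    w, h = rng.uniform(40, 80), rng.uniform(40, 80)
    image = {"id": image_id, "file_name": f"rgb_{image_id:04d}.png",
             "width": 640, "height": 480, "light": light}
    annotation = {"id": image_id, "image_id": image_id,
                  "category_id": 0 if shape == "cube" else 1,
                  "bbox": [round(v, 1) for v in (x, y, w, h)]}
    return image, annotation

def build_dataset(num_frames, seed=0):
    """Assemble a small COCO-style dataset dict; seed makes it reproducible."""
    rng = random.Random(seed)
    images, annotations = [], []
    for i in range(num_frames):
        img, ann = make_record(i, rng)
        images.append(img)
        annotations.append(ann)
    return {"images": images, "annotations": annotations,
            "categories": [{"id": 0, "name": "cube"},
                           {"id": 1, "name": "sphere"}]}

dataset = build_dataset(5)
print(json.dumps(dataset["annotations"][0], indent=2))
```

In the actual tutorial this logic would be replaced by Replicator randomizers and a writer; the sketch is only meant to make the exported-JSON shape concrete for the demo.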
- Additional info:
- Best to build on top of existing Replicator tutorials from the NVIDIA docs and reference them.
- Please suggest a few additional synthetic data topics.