Accelerate AI models training in Vision AI and the Metaverse with Synthetic Data Cloud for Deep Learning

SKY ENGINE AI is a simulation and deep learning cloud that generates fully annotated, synthetic data, provides advanced domain adaptation algorithms and trains AI computer vision algorithms at scale.

SKY ENGINE AI is a tool for developers: Data Scientists, ML/Software Engineers creating computer vision projects in any industry.

SKY ENGINE AI developers platform for training AI models on synthetic data in virtual reality for computer vision



SKY ENGINE AI Platform Features

Synthetic Data Cloud for Deep Learning in a Virtual Reality for Computer Vision Developers

Evolutionary Synthetic Data Cloud AI stack

Deep Learning in the Metaverse full technology stack – Developer environment core modules

GPU simulator
with sets of physically-based rendering shaders tailored to sensor fusion

AI-based image
and video processor for domain adaptation

Garden of deep neural
network architectures
for 3D/4D training

Multi-GPU and network
level adaptive deep learning and tasks scheduler

GPU memory level
integration with PyTorch and TensorFlow

includes deep integration of well-known technologies for

Data Scientists and Software Engineers






Everything you would expect
Benefits of using Synthetic Data Cloud

Reduced data acquisition

Large volumes of real-world training images are no longer required, which reduces data acquisition costs, because synthetic data is diverse and covers edge cases.

Perfectly labelled data

Synthetic data and digital twins come with labels, annotated instances and ground truths, reducing manual labelling work.

Up to 85% cost savings

AI business transformation can be reality when generating massive synthetic training datasets at a fraction of real-world data collection and labelling cost.

Up to 50% more accurate computer vision

Training AI models with pure synthetic data and advanced domain adaptation and testing digital twins in virtual reality greatly improves model performance.

Up to 40x faster AI model development

Accelerate deployment of computer vision models by shortening training iteration cycles with adaptive data-centric evolutionary deep learning workflow.

Democratize access to AI training data

Build trust with your customers and community when creating anonymized synthetic datasets and work safely with data while preserving privacy.




GPU Simulator with Physically-based Rendering shaders for Sensor Fusion

  • Multispectral, physically-based rendering and simulation:
    • Visible light (RGB)
    • NIR (Near-infrared)
    • Thermal
    • X-ray
    • Lidar
    • Radar
    • Sonar
    • Satellite
  • Render passes dedicated to deep learning
  • Animation and motion capture systems support
  • Complex ground truth generation system
  • Determinism and advanced machinery for randomization strategies of scene parameters for continuous and active learning approach
  • Generative materials and images postprocessing
  • Support for Nvidia MDL and Adobe Substance textures
  • Data scientist friendly
  • Compatibility with popular CGI formats


Garden of Neural Networks, Metrics and Tools for streamlined Data Science

+ Specific neural network models, ready for parallel training, tested and optimised for the following tasks:

  • Object detection
  • Classification
  • Semantic Segmentation
  • Image Translation
  • Geometry Reasoning (3D pose and position estimation)
  • Key Points 2D/3D
  • Pose Estimation
  • Domain adaptation
  • Depth Estimation
  • Spectrogram classification


Data generation parameters randomization

Shape every aspect of your virtual environment, including camera characteristics, lighting, weather conditions, background, occlusions, colours, object placement, motion, and environment evolution, to build diverse and balanced datasets for robust AI model training and accurate real-world performance.

SKY ENGINE AI Synthetic Data Cloud – Workflow

Allows data generation and building optimal, customised AI models from scratch and training them in a virtual reality


Accelerate AI models training
with Synthetic Data Cloud for
Computer Vision Developers

Unleash the potential of computer vision at scale using a full-stack synthetic data cloud with evolutionary deep learning in virtual reality that enables massive, customised, well-balanced synthetic data simulation and generation for adaptive AI model training.

Get SKY ENGINE AI Synthetic Data Cloud

Create virtual environments, simulate and generate image and video datasets, and train AI models of high performance in reality.

What’s included:

  • Synthetic data multimodality simulator and image renderer
  • Garden of computer vision deep neural network architectures
  • Library of pre-trained AI models and domain adaptation algorithms
  • Purchase option for professional services, research & assets

Get Synthetic Data for Computer Vision

Our data science team and computer vision experts will help you create high-quality datasets.

What’s included:

  • Consulting of your project’s goals and data requirements
  • Tailored services and data generation
  • Pricing packages for affordable, large datasets
  • Scientific research collaboration and project development


SKY ENGINE AI raises $7M to accelerate vision AI development for automotive, robotics, medical diagnosis & more

SKY ENGINE AI | Synthetic Data Cloud | Series A | Generative AI | Computer Vision


SKY ENGINE AI has raised its Series A round to help tech companies improve computer vision with Synthetic Data Cloud for AI developers.

17 January 2024
What is Transfer Learning?

Transfer Learning | Data science | Machine Learning | Synthetic Data | Vision AI


Transfer learning is a computer vision approach that involves building a new model on top of a prior one. The goal is to enable the new model to learn characteristics from the existing one, allowing the new...

14 September 2023
SKY ENGINE AI announces Synthetic Data Cloud for Driver Monitoring win with Renault Group

Driver Monitoring | Synthetic Data Cloud | Automotive | Generative AI | Computer Vision


SKY ENGINE AI has been selected to deliver its world-leading Synthetic Data Cloud for Driver Monitoring System (DMS) software to boost car models capabilities with a major European car manufacturer – Renault...

05 September 2023


Working example: Rugby gameplay 3D Pose Estimation

[1]: from skyrenderer.utils.common import get_presentation_folder_from_assets
[2]: get_presentation_folder_from_assets('/dli/mount/assets', 'gtc03_assets')

1 Motivation

In rugby, as in any similar sport, building the right playing strategy before the championship season is key to success for any professional coach and club owner. Although coaches strive to give the best tips and point out mistakes during the game, they cannot notice every detail and behavioral pattern of both teams, even when rewatching the matches. Sophisticated AI algorithms can be used to collect such data, analyze it and draw inferences about team behavior.

In particular, the types of the tasks we would like to solve for fostering the analysis of the rugby team are the location of each player during the match and the 3D pose of each player on the field.

Having such information in real-time will provide necessary evidence for building better playing strategy.

2 Problem

Machine learning algorithms are getting more and more powerful in all kinds of classification problems, including image recognition. Increasingly sophisticated models, given enough correctly labeled data, are able to achieve superb performance and accuracy. In many cases a class of a problem has an efficient solution already discovered, but it cannot be applied - the only bottleneck is missing data.

The process of gathering and - especially - labeling data can be extremely expensive and time-consuming. The images must be manually analyzed by humans, whose labor in such repetitive tasks is not only slow and expensive, but also less accurate, compared to computers.

In addition, some cases require modern equipment to produce labeled data and highly qualified specialists to maintain the production process. This significantly increases the project cost and, in many cases, puts the project out of reach for stakeholders.

3 Solution

What if we could generate automatically the images suited perfectly for the task at hand with the complete and always correct ground truth built-in?

We would like to show our attempt to achieve exactly this using football player pose recognition as an example. The goal is to train a model that accurately recognizes football players and their poses, as human keypoints in 3D space, in real-life match footage like that shown below, after being trained exclusively on artificial, synthetic data. The images are rendered scenes that are fully controlled by our renderer, so all kinds of ground truth can be provided, depending on the model's requirements.

[3]: from skyrenderer.example_assistant.markdown_helpers import show_jupyter_picture, show_jupyter_movie
[4]: show_jupyter_picture('gtc03_assets/illustrations/football_frame_1.png')
[5]: show_jupyter_picture('gtc03_assets/illustrations/football_frame_2.png')
[6]: show_jupyter_picture('gtc03_assets/illustrations/football_frame_3.png')

3.1 Agenda

  • Dependencies
  • Context Configuration
    • The graphic assets
    • Assets configuration
    • Scene Tree Structure
    • Scene
    • Renderer Scenario
    • Renderer Datasource
    • On Synthetic Data
    • On Real Data
[7]: from skyrenderer.core.logger_config import configure_logger
[8]: logger = configure_logger()

First let's visualize the GPUs available on the machine. Based on this we can select which GPUs will be used by rendering and learning. By default we use all available devices.

[9]: !gpustat

/bin/sh: 1: gpustat: not found

[10]: import torch
[11]: AVAILABLE_GPUS = list(range(torch.cuda.device_count()))
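
The `gpustat` call above fails because the tool is not installed in this environment. As a fallback, a minimal sketch (not part of the original notebook) can ask `nvidia-smi` for the device indices. The helper names `parse_gpu_indices` and `list_gpu_indices` are hypothetical, and the sketch assumes `nvidia-smi` is on the PATH whenever a GPU driver is installed.

```python
import shutil
import subprocess


def parse_gpu_indices(smi_output):
    """Turn the output of `nvidia-smi --query-gpu=index --format=csv,noheader` into ints."""
    return [int(line) for line in smi_output.splitlines() if line.strip()]


def list_gpu_indices():
    """Return the GPU indices reported by nvidia-smi, or [] when it is unavailable."""
    if shutil.which('nvidia-smi') is None:
        return []
    result = subprocess.run(
        ['nvidia-smi', '--query-gpu=index', '--format=csv,noheader'],
        capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_indices(result.stdout)


print(list_gpu_indices())
```

On a machine without GPUs this prints an empty list, which mirrors what `torch.cuda.device_count()` would report as zero devices.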

4 Sky Engine renderer configuration

4.0.1 Context configuration

It is required to set the path where the assets (images, meshes, animations etc.) are stored. For convenience, the example assistant is configured. It will help with visualizations.

[12]: from skyrenderer.scene.renderer_context import RendererContext
from skyrenderer.scene.scene import SceneOutput
from skyrenderer.example_assistant.visualization_settings import VisualizationDestination
from skyrenderer.example_assistant.display_config import DisplayConfig
from skyrenderer.example_assistant.example_assistant import ExampleAssistant
[13]: root_paths_config = {
  'assets_root': '/dli/mount/assets',
  'cache_root': '/dli/mount/cache'
}
renderer_ctx = RendererContext(root_paths_config)
     2021-03-16 15:05:38,218 | skyrenderer.scene.renderer_context | INFO: Root
- root path: /home/skyengine/.local/lib/python3.6/site-packages/skyrenderer
- assets path: /dli/mount/assets
- config path: /home/skyengine/.local/lib/python3.6/sitepackages/
- optix sources path: /home/skyengine/.local/lib/python3.6/sitepackages/
- cache path: /dli/mount/cache
2021-03-16 15:05:38,480 | skyrenderer.service.service | INFO: Open GUI here:
[14] display_config = DisplayConfig(visualization_destination=VisualizationDestination.DISPLAY,
example_assistant = ExampleAssistant(context=renderer_ctx, display_config=display_config)

4.1 The graphic assets

In the Sky Engine pipeline the graphic assets, the building blocks for the scene, are prepared by a CG Artist using third-party software tools. Assets prepared for this scene:

  1. Geometries

    The main format used in Sky Engine for carrying scene definition information (models and their relative positions, or position ranges for randomization) is Alembic (.abc). The Alembic exchange format, developed by Sony Pictures Imageworks and Lucasfilm, is widely used in the industry and is supported by most modern CG tools.

    For this scene an artist prepared:

    • Model of a rugby stadium,
    • Animation of a rugby player with keypoints,
    • Scene definition Alembic file - locators specifying positions of all the geometries, lights and camera. The player does not have a fixed position, it has a position range instead.
  2. Materials

    Sky Engine uses a metallic-roughness PBR shader by default. The input maps for the shader can come from files or from a Substance archive (.sbsar). Sky Engine's built-in support for Substance allows for parameter randomization and on-the-fly texture rendering in the background.

    For this scene an artist prepared:

    • Substance archive for rugby players,
    • Substance archive for parts of the stadium: base, grass, logos, crowd,
    • Files with maps for banners, bumpers and screen.
  3. Environmental mapping

    The background for this scene is a simple cloudy sky HDR.

4.1.1 Context configuration

For the Alembic assets prepared according to Sky Engine guidelines, the whole scene can be loaded and visualized without further configuration.

[15]: renderer_ctx.load_abc_scene('stadium')
     2021-03-16 15:05:38,525 | skyrenderer.core.asset_manager.asset_manager | INFO: Syncing git annex…
2021-03-16 15:05:40,209 | skyrenderer.core.asset_manager.asset_manager | INFO: Syncing git annex done.
[16]: renderer_ctx.setup()
logger.info(f'Scene\n{str(renderer_ctx)}')
     2021-03-16 15:05:54,532 | main | INFO: Scene
top_node (count: 1)
|-- bumper_GEO_NUL_000 (count: 1)
| +-- bumper_GEO (count: 1)
|-- bumper_GEO_NUL_001 (count: 1)
| +-- bumper_GEO_0 (count: 1)
|-- bumper_GEO_NUL_002 (count: 1)
| +-- bumper_GEO_1 (count: 1)
|-- bumper_GEO_NUL_003 (count: 1)
| +-- bumper_GEO_2 (count: 1)
|-- light_L01_LIGHT_NUL (count: 1)
|-- light_L02_LIGHT_NUL (count: 1)
|-- light_L03_LIGHT_NUL (count: 1)
|-- light_L04_LIGHT_NUL (count: 1)
|-- player_GEO_NUL (count: 1)
| +-- player_GEO (count: 1)
|-- rugby_pitch_GEO_NUL_000 (count: 1)
| +-- rugby_pitch_GEO (count: 1)
|-- rugby_pitch_GEO_NUL_001 (count: 1)
| +-- rugby_pitch_GEO_0 (count: 1)
|-- screen_GEO_NUL (count: 1)
| +-- screen_GEO (count: 1)
|-- banners_GEO_NUL (count: 1)
| +-- banners_GEO (count: 1)
|-- crowd_GEO_NUL (count: 1)
| +-- crowd_GEO (count: 1)
|-- grass_baners_GEO_NUL (count: 1)
| +-- grass_baners_GEO (count: 1)
|-- grass_GEO_NUL (count: 1)
| +-- grass_GEO (count: 1)
|-- logo_adidas_GEO_NUL (count: 1)
| +-- logo_adidas_GEO (count: 1)
|-- stadium_base_GEO_NUL (count: 1)
| +-- stadion_base_GEO (count: 1)
|-- stadium_details_GEO_NUL (count: 1)
| +-- stadion_details_GEO (count: 1)
|-- stripes_GEO_NUL (count: 1)
| +-- stripes_GEO (count: 1)
|-- camera_CAM_NUL (count: 1)
| +-- camera_CAM (count: 1)
+-- camera_target_NUL (count: 1)
[17]: with example_assistant.get_visualizer() as visualizer:
  visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:06:09,938 | skyrenderer.utils.time_measurement | INFO: Render time: 15.40 seconds

4.1.2 Materials

Each loaded object needs to have a material assigned.

[18]: from skyrenderer.scene.scene_layout.layout_elements_definitions import MaterialDefinition
from skyrenderer.basic_types.provider import SubstanceTextureProvider, FileTextureProvider
from skyrenderer.basic_types.procedure import PBRShader
[19]: player_textures = SubstanceTextureProvider(renderer_ctx, 'rugby_player')
renderer_ctx.set_material_definition('player_GEO', MaterialDefinition(player_textures))
[20]: stadium_base_textures = SubstanceTextureProvider(renderer_ctx, 'concrete')
stadium_base_params = PBRShader.create_parameter_provider(renderer_ctx, tex_scale=50)
  MaterialDefinition(stadium_base_textures, parameter_set=stadium_base_params))
[21]: crowd_textures = SubstanceTextureProvider(renderer_ctx, 'crowd')
crowd_params = PBRShader.create_parameter_provider(renderer_ctx, tex_scale=5)
renderer_ctx.set_material_definition('crowd_GEO', MaterialDefinition(crowd_textures, parameter_set=crowd_params))
[22]: grass_textures = SubstanceTextureProvider(renderer_ctx, 'grass')
renderer_ctx.set_material_definition('grass_GEO', MaterialDefinition(grass_textures))
[23]: grass_logos_textures = SubstanceTextureProvider(renderer_ctx, 'logos_grass')
renderer_ctx.set_material_definition('grass_baners_GEO', MaterialDefinition(grass_logos_textures))
[24]: banners_textures = FileTextureProvider(renderer_ctx, 'banners', 'stadium/banners')
renderer_ctx.set_material_definition('banners_GEO', MaterialDefinition(banners_textures))
[25]: screen_texture = FileTextureProvider(renderer_ctx, 'screen', 'stadium/screen')
renderer_ctx.set_material_definition('screen_GEO', MaterialDefinition(screen_texture))
[26]: bumpers_texture = FileTextureProvider(renderer_ctx, 'bumpers', 'stadium/bumpers')
renderer_ctx.set_material_definition('bumper_GEO.?.?$', MaterialDefinition(bumpers_texture), use_regex=True)
[27]: white_params = PBRShader.create_parameter_provider(renderer_ctx, 'white_params', material_color=(0.8, 0.8, 0.8))
renderer_ctx.set_material_definition('stripes_GEO', MaterialDefinition(parameter_set=white_params))
renderer_ctx.set_material_definition('rugby_pitch_GEO.?.?$', MaterialDefinition(parameter_set=white_params),
  use_regex=True)
[28]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:06:32,080 | skyrenderer.utils.time_measurement | INFO: Render time: 7.86 seconds
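
The regular expressions passed with `use_regex=True` are easiest to check against the node names printed in the scene tree. The sketch below does this with the standard `re` module only. It assumes the pattern must match the whole node name, which is our assumption and not a documented Sky Engine rule, and `matching_nodes` is a hypothetical helper.

```python
import re

# Node names copied from the scene tree printed earlier.
node_names = [
    'bumper_GEO_NUL_000', 'bumper_GEO', 'bumper_GEO_0', 'bumper_GEO_1',
    'bumper_GEO_2', 'rugby_pitch_GEO_NUL_000', 'rugby_pitch_GEO',
    'rugby_pitch_GEO_0', 'stripes_GEO',
]


def matching_nodes(pattern, names):
    """Return the names that the pattern matches in full (an assumed matching rule)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


bumpers = matching_nodes('bumper_GEO.?.?$', node_names)
pitch = matching_nodes('rugby_pitch_GEO.?.?$', node_names)
print(bumpers)  # ['bumper_GEO', 'bumper_GEO_0', 'bumper_GEO_1', 'bumper_GEO_2']
print(pitch)    # ['rugby_pitch_GEO', 'rugby_pitch_GEO_0']
```

The two optional wildcard characters cover the `_0`, `_1`, `_2` suffixes, while the longer `..._NUL_000` parent nodes are left out.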

Let's replace the gray background with a sky.

[29]: from skyrenderer.basic_types.item_component import Background
from skyrenderer.basic_types.procedure import EnvMapMiss
from skyrenderer.basic_types.provider import HdrTextureProvider
[30]: renderer_ctx.define_env(Background(renderer_ctx,
  HdrTextureProvider(renderer_ctx, 'light_sky')))
     2021-03-16 15:06:35,475 | skyrenderer.scene.renderer_context | WARNING: Setting background definition after setup.
[31]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:06:45,492 | skyrenderer.utils.time_measurement | INFO: Render time: 9.75 seconds

4.1.3 Scene configuration

The Sky Engine renderer provides virtually endless possibilities to shuffle, multiply, randomize and organize the assets.

From one Alembic animation we are creating two teams of 20 players each.

[32]: renderer_ctx.layout().duplicate_subtree(renderer_ctx, 'player_GEO_NUL', suffix='team2')
renderer_ctx.layout().get_node('player_GEO_NUL').n_instances = 20
renderer_ctx.layout().get_node('player_GEO_NUL_team2').n_instances = 20
[33]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:06:53,184 | skyrenderer.utils.time_measurement | INFO: Render time: 6.84 seconds

By default, all materials are drawn randomly. To create two proper teams we need to ensure that each team has a single shirt color that differs from the other team's color, while keeping all the other inputs (hair, skin color, socks color, shirt number etc.) random.

To achieve this, we need to put the players into separate randomization groups and define their drawing strategy. The Substance archive input that controls shirt color is called "Colors_select". It needs to be the same (synchronized) inside a randomization group and different between groups. All the other inputs are kept randomized by default.

[34]: from skyrenderer.randomization.strategy.input_drawing_strategy import SynchronizedInput
from skyrenderer.randomization.strategy.synchronization import Synchronization, SynchronizationDescription
from skyrenderer.randomization.strategy.drawing_strategy import DrawingStrategy
[35]: shirt_sync = SynchronizedInput(SynchronizationDescription(
player_material_strategy = DrawingStrategy(renderer_ctx, inputs_strategies={'Colors_select': shirt_sync})
[36]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:08:15,715 | skyrenderer.utils.time_measurement | INFO: Render time: 4.22 seconds
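
The idea of a synchronized input can be illustrated without the renderer. The sketch below is plain Python, not the Sky Engine API: the palette, the `draw_team_materials` helper and the `shirt_number` field are all hypothetical. One shirt color is drawn per group and shared by every player in it, while the remaining inputs are drawn independently per player.

```python
import random

SHIRT_COLORS = ['red', 'blue', 'green', 'yellow']  # hypothetical palette


def draw_team_materials(team_sizes, rng):
    """Draw one shirt color per team (distinct across teams) and random per-player inputs."""
    # Synchronized input: sampled once per group, without replacement across groups.
    team_colors = rng.sample(SHIRT_COLORS, k=len(team_sizes))
    players = []
    for team_id, (size, color) in enumerate(zip(team_sizes, team_colors)):
        for _ in range(size):
            players.append({
                'team': team_id,
                'shirt_color': color,                # shared inside the group
                'shirt_number': rng.randint(1, 99),  # independent per player
            })
    return players


players = draw_team_materials([20, 20], random.Random(0))
print(players[0]['shirt_color'], players[20]['shirt_color'])
```

Sampling the team colors without replacement is what guarantees that the two groups never end up with the same shirt.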

If you look closely at the picture above, you may notice that every player is in exactly the same pose. By default, Sky Engine plays animations from Alembic files frame by frame, so we need to randomize the frame number.

[37]: from skyrenderer.randomization.strategy.input_drawing_strategy import UniformRandomInput
[38]: player_geometry_strategy = DrawingStrategy(renderer_ctx, frame_numbers_strategy=UniformRandomInput())
[39]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:08:23,451 | skyrenderer.utils.time_measurement | INFO: Render time: 6.82 seconds

During a rugby match, players are not distributed uniformly across the pitch; they tend to gather in groups. To make the scene look more natural, we can change the way the players' positions are drawn. Instead of drawing them uniformly, we can use a random Gaussian distribution. It is doubly random: first 𝜇 and 𝜎 are drawn, and then the players' positions are drawn from a Gaussian with these parameters.

[40]: from skyrenderer.randomization.strategy.input_drawing_strategy import RandomGaussianRandomInput
[41]: gauss_strategy = DrawingStrategy(renderer_ctx,
  default_input_strategy=RandomGaussianRandomInput(sigma_relative_limits=(0.1, 0.2)))
[42]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  for _ in range(5):
    visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:08:31,028 | skyrenderer.utils.time_measurement | INFO: Render time: 6.71 seconds
     2021-03-16 15:08:37,221 | skyrenderer.utils.time_measurement | INFO: Render time: 5.84 seconds
     2021-03-16 15:08:44,746 | skyrenderer.utils.time_measurement | INFO: Render time: 7.18 seconds
     2021-03-16 15:08:48,757 | skyrenderer.utils.time_measurement | INFO: Render time: 3.66 seconds
     2021-03-16 15:08:52,932 | skyrenderer.utils.time_measurement | INFO: Render time: 3.82 seconds
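
The two-stage draw described above can be sketched in a few lines of standard-library Python. This is an illustration of the idea only, not the `RandomGaussianRandomInput` implementation: we assume that `sigma_relative_limits` is expressed as a fraction of the position range width and that out-of-range samples are clamped, and the function name is hypothetical.

```python
import random


def random_gaussian_positions(n, low, high, sigma_relative_limits, rng):
    """Draw n positions in [low, high] from a Gaussian whose mu and sigma are themselves random."""
    width = high - low
    mu = rng.uniform(low, high)                          # first draw: the cluster centre
    sigma = width * rng.uniform(*sigma_relative_limits)  # first draw: the cluster spread
    positions = []
    for _ in range(n):
        x = rng.gauss(mu, sigma)                         # second draw: one player position
        positions.append(min(max(x, low), high))         # clamp to the allowed range
    return mu, sigma, positions


mu, sigma, xs = random_gaussian_positions(20, 0.0, 100.0, (0.1, 0.2), random.Random(0))
print(round(mu, 1), round(sigma, 1), [round(x, 1) for x in xs[:5]])
```

Because 𝜇 and 𝜎 change on every render, the cluster of players moves around the pitch and tightens or loosens from image to image.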

This concludes configuration of materials, geometries and their positions.

4.1.4 Lights

The artist defined light positions in the Alembic scene definition. By default they have a constant intensity. We will randomize them.

[43]: from skyrenderer.basic_types.provider.provider_inputs import HSVColorInput
from skyrenderer.basic_types.lights import BasicLight
[44]: white_light_provider = BasicLight.create_parameter_provider(renderer_ctx,
  color=HSVColorInput(hue_range=(0, 0),
    saturation_range=(0, 0),
        value_range=(0.4, 1)))
renderer_ctx.set_light('light_L01_LIGHT_NUL', BasicLight(renderer_ctx, white_light_provider))
renderer_ctx.set_light('light_L02_LIGHT_NUL', BasicLight(renderer_ctx, white_light_provider))
renderer_ctx.set_light('light_L03_LIGHT_NUL', BasicLight(renderer_ctx, white_light_provider))
renderer_ctx.set_light('light_L04_LIGHT_NUL', BasicLight(renderer_ctx, white_light_provider))
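
To see why fixing hue and saturation at zero while varying the value gives white lights of different intensity, the HSV ranges above can be reproduced with the standard `colorsys` module. This is a sketch of the color arithmetic only; `draw_white_light` is a hypothetical helper and says nothing about how `HSVColorInput` samples internally.

```python
import colorsys
import random


def draw_white_light(value_range, rng):
    """Sample an RGB light color with hue 0 and saturation 0, so only brightness varies."""
    value = rng.uniform(*value_range)
    return colorsys.hsv_to_rgb(0.0, 0.0, value)


r, g, b = draw_white_light((0.4, 1.0), random.Random(0))
print(r, g, b)
```

With zero saturation the three channels are always equal, so the light stays neutral and only its brightness is randomized between 0.4 and 1.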

4.1.5 Camera

Now we must improve the camera and its filters. The rendering process in Sky Engine is defined by a chain of render steps. The output of one step is the input of the next. We define four render steps:

  • PinholeRenderStep - a simple pinhole camera with a randomized horizontal field of view (hfov, in degrees), which simulates random zoom,
  • Denoiser - the AI-accelerated Optix denoiser,
  • Tonemapper - the Optix tonemapper with randomized parameters: gamma and exposure,
  • GaussianBlurPostprocess - the training images should not be perfect. Images with different degrees of blurring help the model to generalize better. Here we're using random Gaussian blur.

Additionally, we're reducing the output size to match the deep learning model required input size.
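
The chaining principle, where each step consumes the output of the previous one, can be sketched with ordinary functions. Everything below is a hypothetical stand-in operating on a flat list of pixel values; it is not the `RenderChain` API, and the box blur merely takes the place of a Gaussian blur.

```python
from functools import reduce


def run_chain(steps, initial=None):
    """Feed the output of each step into the next one and return the final result."""
    return reduce(lambda data, step: step(data), steps, initial)


def camera(_):
    return [0.0, 0.5, 2.0, 4.0]           # raw radiance values "seen" by the camera


def denoise(pixels):
    return [round(p, 1) for p in pixels]  # placeholder for a real denoiser


def tonemap(pixels, exposure=1.0, gamma=2.0):
    # Scale by exposure, clip to 1 and apply gamma correction.
    return [min(p * exposure, 1.0) ** (1.0 / gamma) for p in pixels]


def blur(pixels):
    # 3-tap box blur with edge clamping, standing in for a Gaussian blur.
    n = len(pixels)
    return [(pixels[max(i - 1, 0)] + pixels[i] + pixels[min(i + 1, n - 1)]) / 3
            for i in range(n)]


image = run_chain([camera, denoise, tonemap, blur])
print(image)
```

Keeping every step behind the same "data in, data out" shape is what lets steps be reordered, removed or randomized independently.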

[45]: from skyrenderer.render_chain import RenderChain, PinholeRenderStep, Denoiser, Tonemapper, GaussianBlurPostprocess
from skyrenderer.basic_types.provider.provider_inputs import IntInput, FloatInput
[46]: HEIGHT = 768
WIDTH = 1024
[47]: pinhole_params = PinholeRenderStep.create_hfov_parameter_provider(renderer_ctx,
  hfov=IntInput(min_value=15, max_value=35))
camera_step = PinholeRenderStep(renderer_ctx, origin_name='camera_CAM_NUL', target_name='camera_target_NUL',
[48]: denoiser = Denoiser(renderer_ctx)
[49]: tonemapper_params = Tonemapper.create_parameter_provider(renderer_ctx, gamma=FloatInput(min_value=2, max_value=5),
    exposure=FloatInput(min_value=0.7, max_value=1))
tonemapper = Tonemapper(renderer_ctx, parameter_provider=tonemapper_params)
[50]: gauss_blur_params = GaussianBlurPostprocess.create_random_parameter_provider(renderer_ctx, max_kernel_radius=7,
     max_sigma_x=2, max_sigma_y=0.7)
gauss_blur = GaussianBlurPostprocess(renderer_ctx, parameter_provider=gauss_blur_params)
[51]: renderer_ctx.define_render_chain(RenderChain([camera_step, denoiser, tonemapper, gauss_blur],
[52]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  for _ in range(5):
    visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:09:02,229 | skyrenderer.utils.time_measurement | INFO: Render time: 8.31 seconds
     2021-03-16 15:09:07,964 | skyrenderer.utils.time_measurement | INFO: Render time: 5.50 seconds
     2021-03-16 15:09:11,144 | skyrenderer.utils.time_measurement | INFO: Render time: 2.97 seconds
     2021-03-16 15:09:14,030 | skyrenderer.utils.time_measurement | INFO: Render time: 2.61 seconds
     2021-03-16 15:09:17,424 | skyrenderer.utils.time_measurement | INFO: Render time: 3.17 seconds

4.1.6 Scene semantics

Last but not least, we must provide ground truth - information about scene semantics. This setup is designed for player detection with keypoints, so we must assign the semantic class only to players.

As mentioned before, the keypoints are already present in the player animation. By default, Sky Engine calculates all the information about keypoints if it receives them in the input assets, so we just need to visualize them to be sure everything is configured correctly. Green keypoints are visible, red ones are hidden.

[53]: renderer_ctx.set_semantic_class('player_GEO', 1)
renderer_ctx.set_semantic_class('player_GEO_team2', 1)
[54]: from skyrenderer.scene.scene import SceneOutput
example_assistant.visualized_outputs = {SceneOutput.BEAUTY, SceneOutput.SEMANTIC, SceneOutput.KEYPOINTS}
[55]: renderer_ctx.setup()
with example_assistant.get_visualizer() as visualizer:
  for _ in range(5):
    visualizer(renderer_ctx.render_to_numpy())
     2021-03-16 15:09:23,943 | skyrenderer.utils.time_measurement | INFO: Render time: 5.87 seconds
     2021-03-16 15:09:29,358 | skyrenderer.utils.time_measurement | INFO: Render time: 5.18 seconds
     2021-03-16 15:09:33,988 | skyrenderer.utils.time_measurement | INFO: Render time: 4.38 seconds
     2021-03-16 15:09:37,252 | skyrenderer.utils.time_measurement | INFO: Render time: 3.01 seconds
     2021-03-16 15:09:40,526 | skyrenderer.utils.time_measurement | INFO: Render time: 3.01 seconds

Everything is OK, so we can create a renderer datasource for training.

[56]: from skyengine.datasources.multi_purpose_renderer_data_source import MultiPurposeRendererDataSource
[57]: datasource = MultiPurposeRendererDataSource(renderer_context=renderer_ctx, images_number=20,

5 Training

Training configuration.

[58]: from deepsky.evaluators.sample_savers import ImageBboxKeyPointSaver, EvalHook
from deepsky.models.pose3d import get_pose_3d_model
from deepsky.trainers.trainer import DefaultTrainer
from deepsky.serializers.simple import SimpleSerializer
from skyengine.datasources.wrappers.mpose3d_wrapper import SEWrapperForDistancePose3D
from torch.utils.data import DataLoader
import torchvision.transforms as standard_transforms
[59]: import numpy as np
[60]: class Constants:
  DROP_LAST = True
  EPOCHS = 1
[61]: transform = standard_transforms.Compose([standard_transforms.ToPILImage(),
[62]: main_datasource = SEWrapperForDistancePose3D(datasource, imgs_transform=transform)
# split the dataset in train and test set
indices = torch.randperm(len(main_datasource)).tolist()
dataset = torch.utils.data.Subset(main_datasource, indices[:int(len(indices) * 0.9)])
dataset_test = torch.utils.data.Subset(main_datasource, indices[int(len(indices) * 0.9):])
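
The 90/10 split depends on an integer slice bound on both sides of the cut. The standalone sketch below shows the same shuffle-and-cut logic with the standard library instead of torch; `split_indices` is a hypothetical helper.

```python
import random


def split_indices(n_samples, train_fraction, rng):
    """Shuffle range(n_samples) and cut it into train and test index lists."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = int(n_samples * train_fraction)  # slice bounds must be integers
    return indices[:cut], indices[cut:]


train_idx, test_idx = split_indices(20, 0.9, random.Random(0))
print(len(train_idx), len(test_idx))  # 18 2
```

For the 20 images generated here this gives 18 training and 2 test samples, which agrees with the batch counts reported in the log further down.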
[63]: def collate_fn(batch):
  return tuple(zip(*batch))
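
What `collate_fn` does is easiest to see on a toy batch. In the sketch below the samples are hypothetical `(image, target)` pairs with strings standing in for tensors; the function itself is the one defined above.

```python
def collate_fn(batch):
    return tuple(zip(*batch))


# Three hypothetical samples, each an (image, target) pair.
batch = [('img0', {'boxes': 2}), ('img1', {'boxes': 5}), ('img2', {'boxes': 1})]
images, targets = collate_fn(batch)
print(images)   # ('img0', 'img1', 'img2')
print(targets)  # ({'boxes': 2}, {'boxes': 5}, {'boxes': 1})
```

The batch is transposed rather than stacked, so each image can keep its own number of boxes and keypoints.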
[64]: train_data_loader = DataLoader(dataset,
[65]: valid_data_loader = DataLoader(dataset_test,
[66]: model = get_pose_3d_model(main_datasource.joint_num, backbone_pretrained=True)
model = model.cuda(0)
logger.info('Train length in batches {}'.format(len(train_data_loader)))
logger.info('Test length in batches {}'.format(len(valid_data_loader)))
     2021-03-16 15:21:34,651 | main | INFO: Train length in batches 18
2021-03-16 15:21:34,652 | main | INFO: Test length in batches 2
[67]: def keypoint_saver_transform(x):
  return x
[68]: key_point_saver = ImageBboxKeyPointSaver(keypoint_saver_transform, labels=['person'], colors_per_class=None,
                  use_labels=False, connections=main_datasource.CONNECTIONS)
[69]: evalbatch = {'keypoints_3D_image_saver': (key_point_saver, 1)}
[70]: def hook_func(evalhook, images_batch, predictions_batch, metas_batch):
  """ hook_func is the function which EvalHook instance will execute after method "update" call. You should
  define this function according to evalbatch input of evaluator inside Trainer """

  for img, preds, metas in zip(images_batch, predictions_batch, metas_batch):
    img = standard_transforms.ToPILImage()(img.cpu())
    poses_coords = metas['poses_coords'].numpy()
    poses_coords[:, :, 2] = metas['poses_viz'].squeeze(2).numpy()
    keypoint_targets = {'boxes': metas['boxes'], 'keypoints': poses_coords, 'stamp': 'GTS'}
    pred_poses_coords = preds['pred_poses_coords'].cpu().numpy()
    pred_poses_coords[:, :, 2] = 1
    keypoint_preds = {'boxes': preds['boxes'].cpu(), 'keypoints': pred_poses_coords, 'stamp': 'PREDS'}
    keypoint_name = 'img{}.png'.format(evalhook.counter)
    keypoints_image_saver_object, freq = evalhook.keypoints_3D_image_saver
      img_name=keypoint_name, images=[img], preds=[keypoint_preds], gts=[keypoint_targets])
[71]: evaluator = EvalHook(evalbatch, hook_func)
[73]: optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[16, 64], gamma=0.5)
[74]: serializer = SimpleSerializer(
  train_dir='pose3D_tests', ckpt_dir='checkpoints')
[75]: trainer = DefaultTrainer(
  data_loader=train_data_loader, model=model, epochs=Constants.EPOCHS, save_freq=1,
  valid_data_loader=valid_data_loader, optimizer=optimizer, evaluator=evaluator, scheduler=scheduler,
     2021-03-16 15:21:39,627 | deepsky.serializers.simple | INFO: Checkpoint was not provided, start from epoch 1
2021-03-16 15:21:39,628 | deepsky.trainers.generic | INFO: Scheduler step will be applied per epoch
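
The learning-rate schedule configured above (`lr=0.0001`, `milestones=[16, 64]`, `gamma=0.5`) halves the rate each time a milestone epoch is reached. The sketch below reproduces that arithmetic in plain Python, assuming the scheduler is stepped once per epoch as the log states; `multistep_lr` is a hypothetical helper, not a PyTorch function.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate after `epoch` scheduler steps: decay by gamma at every milestone passed."""
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


for epoch in (0, 15, 16, 63, 64):
    print(epoch, multistep_lr(1e-4, [16, 64], 0.5, epoch))
```

With `Constants.EPOCHS = 1` neither milestone is reached, so the schedule only matters for the longer training runs.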

You've probably noticed that we set up a tiny dataset: just 20 images. Normally it would contain at least a few thousand images, but we can't wait a few hours for the training to finish here, and we want to present the full Sky Engine workflow.

[76]: trainer.train()

[epoch 1]: 100%|  | 18/18 [00:14<00:00, 1.26it/s, str=pose_head:7.064 loss_sum:7.064]
[epoch 1]: 100%|  | 2/2 [00:05<00:00, 2.71s/it, str=pose_head:6.501 loss_sum:6.501]
[inf]: 100%|  | 2/2 [00:00<00:00, 6.32it/s]

     2021-03-16 15:22:02,089 | deepsky.trainers.generic | INFO: {'epoch': 1,'train_loss': 7.064170413547092, 'val_loss': 6.500706195831299}
2021-03-16 15:22:02,709 | deepsky.serializers.simple | INFO: Saving checkpoint checkpoints/pose3D_tests/ckpt_epoch_1_train_loss-7.064_val_loss-6.501.pth.tar

After each epoch we save a checkpoint and run inference on a few examples to track the training progress. Images generated during a longer training run than the one above, on a bigger dataset, look as follows:

[77]: show_jupyter_picture('gtc03_assets/trained/img1.png')
[78]: show_jupyter_picture('gtc03_assets/trained/img2.png')

5.1 On real data

Let's load a model for 3D pose estimation pretrained on a large synthetic dataset and run inference on a real rugby match.

[79]: device = torch.device('cuda')
[80]: resume_path = 'gtc03_assets/trained/pose3d.pth.tar'
checkpoint = torch.load(resume_path)
model_weights = checkpoint['state_dict']
model =

We will also need a player detection model, which was trained on the same synthetic data with bounding boxes provided by the same renderer datasource.

[81]: from deepsky.models.maskrcnn import get_model_from_coco_pretrained
from deepsky.datasources.image_inference import ImageInferenceDatasource
from dem_rugby_helpers import bboxes_viz, plot_pose3D, make_patch, _image_to_3dbox_world, _bboxes_to_low_corner
from PIL import Image
[82]: detection_model = get_model_from_coco_pretrained(num_classes=3,
        anchor_sizes=((16,), (32,), (48,), (64,), (72,)),
        ratios=((0.5, 0.75, 1.0),),
[83]: checkpoint = torch.load('gtc03_assets/trained/rugby_detection.pth.tar')
for k, v in sorted(checkpoint.items()):
  checkpoint[''.join(['_model.', k])] = checkpoint.pop(k)
detection_model =
real_dataset = ImageInferenceDatasource(dir='gtc03_assets/real_data', extension='png')
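
The loop in the cell above renames every checkpoint key by prepending `_model.`, presumably because the wrapper returned by `get_model_from_coco_pretrained` keeps the network under an attribute of that name (an assumption based only on the prefix). A minimal sketch of the same renaming that builds a new dictionary instead of mutating the one being read, and that leaves already-prefixed keys alone, could look like this. The helper name `add_key_prefix` is hypothetical.

```python
def add_key_prefix(state_dict, prefix="_model."):
    """Return a copy of state_dict with `prefix` added to every key.

    Keys that already start with the prefix are kept unchanged, so
    calling the function twice gives the same result as calling it once.
    """
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in state_dict.items()
    }


# Plain values stand in for tensors here; the renaming only touches keys.
weights = {"backbone.conv1.weight": 1, "_model.head.bias": 2}
renamed = add_key_prefix(weights)
```

Building a new dictionary avoids changing a mapping while iterating over it, which the original cell sidesteps by iterating over a sorted copy of the items.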

Let's detect the players and visualize the results.

[84]: img, file_path = real_dataset[75]
orig_img =
with torch.no_grad():
  img =
  outputs = detection_model(img.unsqueeze(0))
out = outputs.pop()
bboxes = out['boxes'].cpu().detach().numpy()
labels = out['labels'].cpu().detach().numpy()
bboxes = bboxes[np.where(labels == 1)[0]]
[85]: bbox_image = bboxes_viz(orig_img, bboxes)
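
The detection cell keeps only the boxes whose label equals 1. A minimal NumPy sketch of that selection using a boolean mask, with an optional confidence threshold, is shown below. The `scores` input and the `min_score` parameter are assumptions: the notebook only reads `boxes` and `labels` from the model output, and `select_detections` is a hypothetical helper.

```python
import numpy as np


def select_detections(boxes, labels, scores=None, target_label=1, min_score=0.0):
    """Keep the boxes of one class, optionally above a score threshold.

    boxes  : array-like of shape (N, 4)
    labels : array-like of shape (N,)
    scores : optional array-like of shape (N,); assumed, not used in the notebook
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    labels = np.asarray(labels)
    keep = labels == target_label  # boolean mask, same effect as np.where(...)[0]
    if scores is not None:
        keep &= np.asarray(scores) >= min_score
    return boxes[keep]
```

Indexing with the mask `labels == target_label` selects the same rows as `np.where(labels == 1)[0]` in the cell above.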

Once the bounding boxes have been generated, we can crop the target objects and estimate their pose in 3D space. For data preprocessing we will use the same datasource that we used during training.

[86]: model.eval()
with torch.no_grad():
  results = model((img,), ({'boxes': torch.from_numpy(bboxes).int()},))
results = results.pop()
output_coords, output_bboxes = results['pred_poses_coords'].cpu(), \
[87]: output_coords[:2]
[88]: n = 6
[89]: boxes = _bboxes_to_low_corner(output_bboxes)
crops = make_patch(img,
pil_img = standard_transforms.ToPILImage()(crops[n].squeeze(0).cpu())
[90]: coord = _image_to_3dbox_world(output_coords, boxes, 2000)
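
The cropping step mentioned above can be illustrated in isolation. The sketch below is not the implementation of `make_patch` or `_bboxes_to_low_corner`, whose source is not shown here. It only assumes boxes in `(x1, y1, x2, y2)` pixel coordinates and shows how a floating-point box can be rounded outward, clamped to the image, and used to cut out a patch. The image is a plain list of rows so that the example has no dependencies; `clip_box` and `crop` are hypothetical helpers.

```python
import math


def clip_box(box, width, height):
    """Round an (x1, y1, x2, y2) box outward to ints and clamp it to the image."""
    x1, y1, x2, y2 = box
    left = max(0, min(width, math.floor(x1)))
    top = max(0, min(height, math.floor(y1)))
    right = max(left, min(width, math.ceil(x2)))
    bottom = max(top, min(height, math.ceil(y2)))
    return left, top, right, bottom


def crop(image, box):
    """Crop a row-major image (a list of rows) to the clipped box."""
    height = len(image)
    width = len(image[0]) if height else 0
    left, top, right, bottom = clip_box(box, width, height)
    return [row[left:right] for row in image[top:bottom]]


# A 4x5 toy image whose pixel value encodes its position as row * 10 + column.
toy_image = [[r * 10 + c for c in range(5)] for r in range(4)]
patch = crop(toy_image, (1.2, 0.5, 3.7, 2.1))
```

Clamping matters for detections near the image border, where a predicted box can extend past the frame.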


Frequently Asked Questions

  • Synthetically generated 3D objects positioned on real-data scenes and virtual scenes
  • Overhead image scenes
  • Ground-based image scenes
  • Images of specific and complex interiors and exteriors
  • Motion imagery scenes
  • Images and video of animated humans, vehicles or animals

  • Classification (Whole-image content labels)
  • Image-aligned 2D and 3D bounding boxes (horizontal and vertical box edges)
  • Object-aligned 2D and 3D bounding boxes (bounding box fits snugly about the object)
  • Segmentation (Pixel-level shadings of training object shapes)
  • Instance segmentation
  • 3D Keypoints
  • Tracks of training object movement trajectories across sequences of generated imagery

  • Broad class of object (airplane, automobile)
  • Class of object (truck, van)
  • Specific subclass (Ford F150)
  • Specific object components (chassis, tyres, windows)

  • Light sources
    Point light and spotlight light sources with energy dissipation modelling are supported.
  • Environment lights
    Both image and HDR environment lights are supported.
  • Emissive objects
    Light sources with custom geometry (defined in the mesh) and surface luminance (defined in the material) can be used.

  • Cloud Cover
  • Haze
  • Snow
  • Dust

  • Scene pose or collection angle variance
  • Signal attenuation
  • Image imperfection (“bad pixel” or other degradation)
  • Surface reflectance
  • Lens distortion
  • Spectral characteristics

  • General biome type (desert, tundra, grassland, etc.)
  • Urban composition/density
  • Vegetation
  • Water or snow cover
  • Complex interiors (factories, warehouses, operating rooms, assembly lines)

  • Whole object or feature pose
  • Object component orientation (for example, portions of equipment swiveling, extending, or changing based on operation)
  • Damage to object or feature
  • Object or feature paint scheme
  • Domain randomization techniques (adding random shapes, textures, or other distraction for model training)
  • Multiple subclasses of a class of interest

  • Multispectral imagery (including RGB images)
  • Panchromatic imagery
  • Infrared imagery
  • Hyperspectral imagery
  • Radar data
  • Lidar data
  • X-ray imagery (including CT reconstruction)
  • Motion data – electro-optical
  • Motion data – infrared
  • Synthetic aperture radar data
  • Acoustic sensor data

  • Dynamic shadowing
  • Dynamic illumination
  • Vehicle activity and scenarios
  • Human activity and scenarios
  • Weather
  • Sensor platform movement (platform motion, fly-through behavior)
  • Sensor movement (camera jitter, sway, or tilting)

  • Conceptual geometry
  • Approximate geometry
  • Precise geometry
  • Precise geometry for geo-specific features and locations (digital twins)
  • Implicit geometries

  • Implicit materiality (default colors or textures used)
  • Overlaid imagery onto geometry (texture map)
  • Colors or textures included in the geometric primitive (textures rendered onto mesh for example)
  • Material is identified explicitly with a label or metadata artifact
  • Material is represented but not identified explicitly with a label or metadata artifact


Artificial Intelligence Evolved

The SKY ENGINE AI Platform lets you generate your own data, train ML models, and expand your use cases beyond the limitations of traditional AI.

The Sky Engine deep learning platform is designed to overcome the complex object recognition challenges of modern machine vision.

  • Instantly generate and visually inspect all of your data in SKY ENGINE Integral, regardless of scale
  • Leverage the highly accurate, physics-driven light propagation simulations, data generation and Python data science pipeline of SKY ENGINE Render
  • Reap the benefits of a full-stack AI platform and accelerate third-party BI and data science workflows with standard PyTorch and TensorFlow connectivity

Bridge Data Generation & Deep Learning

Sky Engine combines a physics simulation-driven image renderer with a directly integrated AI model training framework. It is designed to generate images for training machine vision AI systems in virtual environments.

Sky Engine generates training data from virtual scenes. By changing parameters in the CGI scene, it can generate a massive number of labelled images for AI vision training and feed them directly into the deep learning pipeline with multi-GPU scaling.

SKY ENGINE AI Disrupts Industries

Our solutions are used in areas as diverse as healthcare, for disease recognition from medical images or organ segmentation for radiation oncology planning, and sports analytics, for processing video footage of sports such as football.

Furthermore, Sky Engine provides highly efficient methods for defect discrimination in manufacturing and in agriculture, where they support improved food safety.

Generate Synthetic Data at scale to Accelerate your Computer Vision tasks

Subscribe to the SKY ENGINE AI newsletter.
Sign up for our news with press releases, inspiration, market reports and the latest updates, or talk directly to sales to get data or the AI platform.
