
Pin Data Types - 05 - Video, Audio


Now we've come to the most important data type of the system, considering that Aximmetry is a visual application.

Data Types
This data type is video, which is designated by the color yellow. It can be any image content generated by the modules. One of the simplest tools for producing image content is the video player module, which plays a video file, so on its output we can see the contents of the video file. The image content can be received from an external source as well: from cameras or from other types of studio equipment. And of course, it can be content generated internally by an Aximmetry module. A simple example is the Perlin noise module, which generates a texture of a given size filled with Perlin noise. Let's examine the peeker of the video signals. In its header, we can see some of the properties of the image: first of all, its size and its pixel format. On the output of the image generator modules, we usually encounter two kinds of pixel formats. The first is the most fundamental one, the 8 bits per channel RGBA format. The other one can usually be produced by turning on the HDR out switch that is found on most of the generator modules. It's a 16 bits per channel floating-point format. The letter F in the name of the format indicates floating-point. Its first benefit is that the intensity range of the channels is much bigger than the usual 0 to 1 interval, so it can be used in a high dynamic range context. The second benefit is that it provides higher precision, which can be important when working with fine color gradients.
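
The two benefits of the floating-point format can be illustrated outside Aximmetry with a small sketch. It assumes that the 8-bit format stores each channel as an integer 0..255 mapped onto the 0 to 1 interval, and that the 16-bit format behaves like IEEE 754 half precision; the function names are hypothetical and are not part of any Aximmetry API.

```python
import struct

def to_rgba8(value: float) -> int:
    """Store a channel intensity as an 8-bit normalized integer (0..255)."""
    clamped = min(max(value, 0.0), 1.0)  # anything outside 0..1 is clipped
    return round(clamped * 255)

def from_rgba8(stored: int) -> float:
    """Read an 8-bit normalized channel back as an intensity in 0..1."""
    return stored / 255

def through_half(value: float) -> float:
    """Round-trip a value through a 16-bit (half precision) float."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

# Range: an intensity four times brighter than "white".
bright = 4.0
print(from_rgba8(to_rgba8(bright)))  # clipped to 1.0
print(through_half(bright))          # survives as 4.0

# Precision: a very dark value in a fine gradient.
dark = 0.001
print(from_rgba8(to_rgba8(dark)))    # rounds down to 0.0
print(through_half(dark))            # stays close to 0.001
```

The first pair of prints shows the wider intensity range, the second pair the finer steps near black.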

Pixel Formats
There are a number of other pixel formats. We can encounter these when we load different kinds of image files. For example, let's choose a wood texture here. It has the DDS extension, which is a DirectX-specific file format. Let's drop it into the editor. Upon this, an image module is created, which is preloaded with the image I dropped in. If we look at its output, we can see it has the BC1 pixel format. It's a DirectX-specific compressed format. Also, we can see a +MIP tag, which indicates that the texture also contains a series of MIP maps, which can be beneficial when the image is scaled or applied as a texture on a 3D object. If we load an image of a usual file format, like JPG, PNG, TIFF, etc., we usually get the basic 8-bit RGBA pixel format. However, some of these file formats can store 16-bit or floating-point images. In these cases, we get the appropriate pixel formats. There is a special file format invented specifically for storing HDR images. Its extension is also HDR. Upon loading it, we naturally get a 16-bit floating-point pixel format. By the way, this image is a light map generated by 3D modeling software. Let's return to the image generator modules, like the Perlin noise. Usually, these have an out size property, which is for specifying the size of the resulting image. For example, let's make it 5600 by 400. We can see the result. The sizes can be specified up to 8K. If we only specify the width, let's make it 800, and we set the height to zero, we get a square-shaped image. If both sides are zero, then the current default frame size of the system is applied. Currently, it is 1280 by 720, also known as 720p. This default frame size can be specified in the Preferences window, in the Rendering section. If I now change the frame size to Full HD, the generator module will output a Full HD image. We've seen before that turning on the HDR out switch results in generating a 16-bit floating-point format image.
In the Preferences, we can specify that we want a 32-bit floating-point format instead when HDR images are generated. If we turn on this check box, we can see that the resulting image contains 32 bits per channel. In certain cases, when we need very high precision, this can be beneficial. Usually it's not needed, and it also puts much more load on the GPU when rendering a 3D scene.
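
The output size rules described for the generator modules can be summarized in a short sketch. The function name and the default constant are hypothetical. The case where only the height is given is not covered in the tutorial; the sketch assumes it mirrors the width-only case.

```python
DEFAULT_FRAME_SIZE = (1280, 720)  # the default frame size shown in the tutorial (720p)

def resolve_out_size(width: int, height: int, default=DEFAULT_FRAME_SIZE):
    """Return the (width, height) a generator would output for the given setting."""
    if width == 0 and height == 0:
        return default               # both zero: system default frame size
    if height == 0:
        return (width, width)        # only width given: square image
    if width == 0:
        return (height, height)      # ASSUMPTION: not stated in the tutorial
    return (width, height)           # both given: used as-is

print(resolve_out_size(800, 0))
print(resolve_out_size(0, 0))
print(resolve_out_size(0, 0, default=(1920, 1080)))  # after switching to Full HD
```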

More on Pixel Formats
So let's return to the 16-bit mode. In the video peeker, several other attributes might appear in addition to the pixel format. For example, here is the mini scene we've seen in the previous chapter. We have a camera here, which looks at a single cube. If we bring up a peeker for the camera output, we can see it's a basic 8-bit RGBA format image. But we can spot a times four tag. It indicates that the 3D rendering is performed with four times anti-aliasing. On the output of the camera, the full four times image information is preserved and can be used by certain post-processing modules. We can also see a +D32F tag, which indicates that a depth map is associated with the image. It contains the depth information generated during the 3D rendering. It's a single-channel 32-bit floating-point format image, and it also has the times four tag, since the anti-aliasing setting is applied to the depth map as well. The depth map can be used in certain post-processing operations, but it can also be extracted directly using the depth scaler module. If we wire the camera into it, it separates the depth map from the image. Let's expose it to the output. We can see that the cube, which is closer to the camera, appears as a dark region, but to see the finer depth differences we have to narrow down the displayed depth range. Now we only see the 0 to 7 meter depth range, and we can clearly see the increasing depth along the side of the cube. But why is this image red? The reason for this is that the system uses a single-channel pixel format for the depth map to save memory space. This single channel is interpreted as a red channel in the system. On the other hand, the single channel is a 32-bit floating-point one, and thus it preserves the original precision of the depth map. We can also spot that the +D32F tag is still present with this image. This means that the original associated depth map is passed through the module unaltered. This behavior is typical among the post-processing modules.
Even if I insert a blur, though the image itself is distorted, the depth map is still passed through. This preservation of the depth map can be beneficial if we use a depth-based post-processing step later in the chain.
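
Why narrowing the displayed depth range reveals finer differences can be shown with a small sketch. It assumes the display maps depth linearly onto a 0 to 1 brightness and clamps values outside the range; the function name and the exact mapping are illustrative assumptions, not the depth scaler's documented behavior.

```python
def depth_to_display(depth_m: float, range_min: float = 0.0, range_max: float = 7.0) -> float:
    """Map a depth in meters onto a 0..1 brightness for the chosen display range."""
    t = (depth_m - range_min) / (range_max - range_min)
    return min(max(t, 0.0), 1.0)  # clamp: anything beyond the range saturates

near_edge, far_edge = 3.0, 4.0  # two points along the side of the cube (made-up depths)

# Wide range: the two depths are almost the same brightness.
wide = depth_to_display(far_edge, 0.0, 1000.0) - depth_to_display(near_edge, 0.0, 1000.0)

# Narrow 0 to 7 meter range: the same depths are clearly separated.
narrow = depth_to_display(far_edge) - depth_to_display(near_edge)

print(wide, narrow)
```

Closer points map to lower values, which matches the cube appearing as a dark region.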

Processing Modules
Let's examine how the different post-processing modules affect the attributes of the image. Let's generate a vertically elongated image with the Perlin noise, for example an 80 by 700 one. If we wire it through a simple post-processing module that only has a single input and a single output, for example the blur, both the size and the pixel format of the image are retained during the operation. But there are modules that mix together two or more video inputs, for example the Blender module. Let's wire two different inputs into it. The first is the picture coming from a video file. Its size is 1280 by 720 and it has an 8-bit pixel format. The other is coming from the Perlin noise and has an 80 by 700 size with a 16-bit format. What is happening here is that the resulting image inherits the size of the input image with the larger area. In this particular case, the image coming from the video file is the winner. Regarding the pixel format, the higher precision one is always selected, in this case the format of the Perlin noise image. Let's expose the result to the output. Since the size of the video file image is carried through, the Perlin noise will be the one that suffers the stretching. The logic behind this is that in many cases we have a secondary image, for example a mask or vignette, that we want to blend onto the image content, but its resolution or aspect ratio is not necessarily identical to that of the image. In these cases, we can wire the mask into a Blender without worry, because it will be stretched automatically. If we do not want this behavior, and we want control over the resulting size, we can use placer modules to resize the inputs. We have to wire both inputs through a placer. Now we can specify an identical output size for both placers. For now, let's leave it at 0/0. Thus we get the system default frame size for both images. We can see that during the blend the Perlin noise image retains its aspect ratio.
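
The rule the Blender applies to its result can be written down as a short sketch. The `VideoInfo` type, the format names, and the precision table are hypothetical helpers for illustration. How ties are broken is not stated in the tutorial; the sketch simply prefers the first input.

```python
from dataclasses import dataclass

# Hypothetical format names mapped to bits per channel.
PRECISION_BITS = {"RGBA8": 8, "RGBA16F": 16, "RGBA32F": 32}

@dataclass(frozen=True)
class VideoInfo:
    width: int
    height: int
    pixel_format: str

def blend_result_info(a: VideoInfo, b: VideoInfo) -> VideoInfo:
    """Size comes from the input with the larger area, format from the more precise one."""
    larger = a if a.width * a.height >= b.width * b.height else b
    precise = a if PRECISION_BITS[a.pixel_format] >= PRECISION_BITS[b.pixel_format] else b
    return VideoInfo(larger.width, larger.height, precise.pixel_format)

video = VideoInfo(1280, 720, "RGBA8")    # picture from the video file
noise = VideoInfo(80, 700, "RGBA16F")    # elongated Perlin noise with HDR out on

print(blend_result_info(video, noise))
```

With the two inputs from the tutorial, the result takes its size from the video file and its pixel format from the Perlin noise.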

Let's return to the still images coming from files. We drag and drop a JPEG image into the editor. We get an 8 bits per channel RGBA image, which means a total of 32 bits per pixel. Suppose we want this image in a more compact format. This is beneficial not only from the perspective of GPU memory space; the rendering itself is also faster, since the GPU has to move less data. Let's make a compressed BC1 image. One method to do this is to convert the file itself into DDS format and use this instead of the JPEG. But the system provides settings for on-the-fly conversion as well, meaning that the conversion happens automatically each time this JPEG image is loaded. The image module has an Image File property. It specifies the path of the source file. By clicking this three dots button we get a file open dialog where we can change the image the module contains. For now, let's leave the original image. At the right of the property box there's a little arrow button. It opens a dialog where we can specify several conversion attributes. To change the pixel format we simply turn on the convert switch and choose the desired target format. A number of formats are available. Now we want to choose the BC1 format. There are two of them. We want the one with no alpha, since the JPEG image does not contain alpha information. We click OK. The conversion is performed immediately. If we bring up the peeker, we can see that the format has changed to BC1. Now if we save the compound, the conversion attributes are saved as well. So next time we open the compound, the image will be converted upon loading and placed into the GPU memory in BC1 format. If any conversion attribute is set for the image, the little arrow is shown in blue. We can clear all attributes with the right-click menu of the property, using the reset attributes item. We can see the blue color of the arrow has disappeared and we get back the original 8-bit RGBA format. What other conversion attributes do we have?
I won't demonstrate all of them in detail here, I'll just give a quick overview. We can crop and resize the image. We can create a cube map from various slice layouts. We can create a 3D volume texture by slicing up the image. We can generate a normal map if the image contains a height or bump map. We can generate MIP maps, and we can convert the pixel format as we've seen before.
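
To get a feel for how much GPU memory the BC1 conversion saves, here is a rough calculation. It relies on the standard BC1 layout, which stores each 4 by 4 pixel block in 8 bytes, against 4 bytes per pixel for 8-bit RGBA. The texture size is a made-up example and MIP maps are not counted.

```python
import math

def rgba8_bytes(width: int, height: int) -> int:
    """8 bits x 4 channels = 4 bytes per pixel."""
    return width * height * 4

def bc1_bytes(width: int, height: int) -> int:
    """BC1 stores every 4x4 pixel block in 8 bytes (partial blocks are padded)."""
    blocks_x = math.ceil(width / 4)
    blocks_y = math.ceil(height / 4)
    return blocks_x * blocks_y * 8

w, h = 2048, 2048  # hypothetical texture size
print(rgba8_bytes(w, h))                      # 16777216
print(bc1_bytes(w, h))                        # 2097152
print(rgba8_bytes(w, h) // bc1_bytes(w, h))   # 8
```

For block-aligned sizes the uncompressed image is eight times larger than its BC1 version.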

Video Inputs
Let's talk about the video input pins a bit. So far we have always used them by connecting a wire into them from another module that generated or processed the image. But the video pin can contain a still image as well, without the need to connect anything into it. Let's select the Blur module and click the three dots button of its video property to bring up the Open File dialog. But first, I wire the module to the output, so that we can see the result. So let's return to the Open File dialog. If we turn on the instant preview function here at the bottom, the files that we click on are loaded for preview, so that we can see the result immediately. It can be very handy when selecting textures. Another way to load an image into the pin is using the side File Browser. Here we can only see thumbnails of the images during the selection. From here we can drag and drop an image either onto the property editor box or onto the video pin of the module box itself. Beside the video property, we can find the arrow button for the conversion attributes, just like in the image module. To summarize, anywhere we have a video input pin we can load a still image into it as a constant, and we can specify conversion attributes for it. Let's also summarize the info on dragging and dropping images. Note that we can drag and drop images not only from the file browser of Aximmetry but from any external application, for example Windows Explorer. Where exactly can we drop the images? We can drop them onto an empty area of the flow editor. In this case, an image module is created automatically, which is preloaded with the image path in its Image File property. If we drop the image onto an already existing image module, no new module is created. Instead, the image is replaced within that module. For any given video input pin we can drop the image either onto its property edit box or onto the pin itself on the module box. We can also drag and drop an image from a property editor.
If I grab the property at its name field, it behaves exactly as if it were grabbed from the file browser. We can drop it onto any empty area, thus extracting the property into an image module. But we can also drop it onto an external application. For example, let's drop it onto the Windows Desktop, thus copying the JPEG file to the Desktop.

Our next and last data type is audio. It's designated by the color purple. In Aximmetry all audio data represents a 48-kilohertz sampling of the sound, which is the usual sampling frequency used with video material. It can consist of an arbitrary number of channels. Similar to the video data type, the source of the audio can be a file, an external device, or the result of an internal generator module. For example, the video player module also outputs the soundtrack of the video file. We can see that in this case the file contains two-channel stereo audio, but we can load pure audio files as well. If we drag and drop an MP3 file, an audio player module is created that only has an audio output. There are several formats of audio samples. When the audio comes from a file, we usually get the samples in their native format, which is a 16-bit format. But all audio processing in Aximmetry is done in 32-bit floating-point format. For example, if we add an audio filter module and wire the audio signal through it, the resulting audio will already be in the 32-bit floating-point format, designated by the 32F tag in the peeker. We can also see that the stereo signal has become a mono one. The reason for this is that the audio processing modules in Aximmetry can only work on a single channel at a time. But we have modules for splitting the audio into single channels, so that we can process them individually and then later merge them back into a multi-channel signal if necessary. Let's wire this stereo signal into the audio splitter module. We can see that the two channels appear individually on their outputs. The other outputs are empty, naturally. Let's re-merge them into a multi-channel signal in a different configuration. Let's also add a third signal coming from the filter.
Since the highest channel we connected something into is number 13, we get a 13-channel output, which of course has a number of silent channels besides the three we've connected. An example of internal sound generation is the Audio Oscillator module. It can output various waveforms that are conventional in the world of synthesizers, with an arbitrary frequency. Then we can process it in various ways. For example, we can wire it through a filter, we can modulate the cutoff frequency of the filter with an LFO, and so on. This way we can put together a modest modular synthesizer.
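
The merge behavior can be sketched in a few lines: the channel count of the result is the highest connected channel number, and every unconnected channel is silent. The function names and the dictionary-based interface are hypothetical. The 16-bit to floating-point conversion uses the common divide-by-32768 convention, which is an assumption about how the native samples are scaled, not a documented detail.

```python
def int16_to_float(sample: int) -> float:
    """Convert a 16-bit sample to floating point (ASSUMED scaling: divide by 32768)."""
    return sample / 32768.0

def merge_channels(inputs: dict, frames: int) -> list:
    """Merge mono signals into one multi-channel signal.

    inputs maps a 1-based channel number to a list of float samples.
    The channel count is the highest connected number; gaps are silent.
    """
    if not inputs:
        return []
    channel_count = max(inputs)
    silence = [0.0] * frames
    return [list(inputs.get(ch, silence)) for ch in range(1, channel_count + 1)]

left = [int16_to_float(s) for s in (0, 16384, -16384, 0)]
right = [int16_to_float(s) for s in (0, -32768, 16384, 0)]
filtered = [0.1, 0.2, 0.3, 0.4]  # third signal, e.g. from the filter

# Channel numbers chosen for illustration; only the highest one (13) comes from the tutorial.
merged = merge_channels({1: left, 2: right, 13: filtered}, frames=4)
print(len(merged))  # 13
```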

Audio Output
How can we hear the sound? A simple way is to use an Audio Out module. We have to specify a DirectX output device here. But if we have special video output equipment, for example an SDI card, we can pair an audio signal with the output video. Let's suppose that our Output 1 goes to an SDI device. Now if we expose an audio output as well, it is paired with the video directly above it, and the audio goes out on the SDI cable together with the picture. In the case of the Video Player and Audio Player modules, we don't have to use an Audio Out module to hear the sound on our PC. They can play the sound internally. We can see this little speaker icon on the modules. If it's turned on, the sound is played without any wiring. We can select the target audio device in the player's properties. The speaker icon is associated with the Use Audio Device property here. They're the same switch. The same goes for the Audio Player, of course. And this concludes the tutorial on all the data types that can be used in Aximmetry.
