Different types of cameras can be used to meet the specific needs of a telepresence project. The following table summarizes some of the advantages and disadvantages of the most frequently used types of camera:
Type of camera | Advantages | Disadvantages |
---|---|---|
Professional video camera (e.g. Canon) | Adaptable lens; Manual and full parameter control; Remote control possible; Quick setup option; SDI video output. | More imposing size (not discreet on stage); Moderate light sensitivity. |
Action camera (e.g. GoPro) | Wide-angle lens and narrow mode to simulate a standard angle; Remote control possible via smartphone; Lightweight and small, so easy to conceal. | Wide-angle mode distorts edges and narrow mode reduces pixel resolution; Poor low-light performance; Mini HDMI video output only. |
Mobile device (e.g. iPhone) | Standard lens useful for capturing proportions on a human scale; Wireless video output via Wi-Fi connection (NDI or IP). | Paid apps with few controllable settings; Lack of physical video output other than USB; Risk of conflict between the camera and other applications; Requires a good Wi-Fi network. |
Robotic camera (PTZ) | Designed to be controlled remotely; Advanced parameter control; Several options for physical video outputs (DVI, HDMI, Ethernet). | Sometimes bulky or difficult to install; Lack of manual control; Expensive device. |
Webcam | Easy to use: little or no configuration required. | Limited to a USB connection on a computer; Not very mobile; Not very configurable; Image quality sometimes questionable; Distance (zoom) often fixed; Often reserved for one application at a time. |
If the camera takes up too much space, it risks being visually conspicuous, obstructing the view of the local audience, or being difficult to position in relation to the desired capture area.
For example, a small action camera can be positioned more discreetly in front of a projection surface, at face level, to maximize eye contact between the actors. By the same token, the support holding the camera in place can be smaller.
This consideration therefore influences the choice of camera according to its size: small cameras offer greater flexibility in terms of positioning; in return, their video quality is generally lower than that of professional models, which are often larger.
These factors may also influence the choice of a camera or the addition of complementary accessories, such as a particular support or a certain type of lens.
For example, if the available space does not allow the camera to be positioned far enough away to obtain a sufficiently wide field of capture, a wide-angle lens could be considered, although this may come at the cost of image distortion.
In another case, you might want to shoot vertically rather than horizontally, in order to display a person at life size. To maintain optimal resolution, the camera could then be tilted 90° onto its side, which requires a support that allows this orientation.
Finally, one might also want a dynamic, variable-angle field of capture, either from a fixed point, as a robotic camera allows, or from a completely mobile device connected to the Wi-Fi network, such as a smartphone.
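The resolution gain from tilting the camera can be estimated with simple arithmetic. The sketch below is only an illustration: it assumes a 1920x1080 camera and a 9:16 portrait display, and the function name `portrait_pixels` is hypothetical.

```python
def portrait_pixels(sensor_w: int, sensor_h: int, rotate: bool) -> tuple[int, int]:
    """Return (width, height) in pixels available for a 9:16 portrait image.

    Assumption: the target display is portrait with a 9:16 aspect ratio.
    """
    if rotate:
        # Camera tilted 90 degrees: the whole frame is used in portrait.
        return sensor_h, sensor_w
    # Camera left horizontal: a 9:16 window is cropped from the full frame height.
    crop_w = sensor_h * 9 // 16
    return crop_w, sensor_h


if __name__ == "__main__":
    for rotate in (False, True):
        w, h = portrait_pixels(1920, 1080, rotate)
        label = "tilted 90 degrees" if rotate else "horizontal + crop"
        print(f"{label}: {w}x{h} = {w * h} pixels")
```

Under these assumptions, cropping a portrait window from a horizontal frame keeps roughly a third of the pixels that a tilted camera would deliver.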
Once the lighting and the cameras are in place, it is recommended to calibrate and balance their optical settings. This is all the more important in a telepresence context since there is no post-production for live video. Depending on the lighting level of the stage or scene, the ideal exposure depends on the following parameters:
One of the many advantages of a robotic camera is the possibility of remote control. Although the other types of camera do not offer as much flexibility, some still allow you to control certain parameters remotely, whether through a mobile application, a web page, or a remote control. During technical tests, for example, adjusting certain visual parameters (zoom, focus, brightness, etc.) remains easy even if the camera is not directly accessible (far from the video control room, mounted high up, etc.).
Portable devices usually have a video camera capable of capturing HD images. Some applications allow these images to be transmitted over a local network via a Wi-Fi (wireless) connection. By using a Wi-Fi access point or router connected to the same network as the telematics control room, it is possible to receive an audiovisual stream from any of these devices, whether it is an NDI stream or an HTTP stream. To obtain a quality stream, it is important to carefully consider any obstacles and the distance separating the access point or Wi-Fi router from the mobile device in question. If the mobile device strays too far from its wireless connection to the local network, the quality of the stream may deteriorate until it is interrupted.
In addition, it is recommended to restrict access to this wireless network so that it is only available to the mobile devices used as cameras for the event. If too many devices were to connect simultaneously to the same Wi-Fi access point or router, the connection could become saturated, making the quality of the transmission unpredictable.
One of the primary determining factors in choosing the type of camera to use is certainly the quality of the image it produces. However, in a context of telepresence, where the video signal must necessarily be transmitted via the Internet to one or more destinations, the visual quality of the signal can hardly be preserved at its initial value: compression is required. Nonetheless, it is still important to minimize any reduction in quality as much as possible at each stage of the signal transmission, but especially at its source.
While recording functions are not essential for telepresence, the sensitivity of the camera, i.e. its performance at low light intensity, is particularly important in order to limit the "noise" in the image, especially when there are few or no spotlights or lights available. In such a case, professional-grade cameras generally offer better dynamic range in luminosity (which shows more detail in shadows or bright areas) and better colour rendering, in addition to offering different options for the choice of lens (standard, wide angle, very wide angle, macro, etc.).
For internet transmission purposes, the image resolution is generally maintained at 1920x1080 pixels or reduced to 1280x720 pixels to save the bandwidth of the outgoing internet connection (upload). It is still recommended that you keep the camera signal at optimum resolution and then scale it down, if necessary, downstream of the video source – via a video encoding device or software, for example.
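To give a rough sense of the bandwidth at stake, the following sketch compares the uncompressed data rates of the two resolutions. It is only an estimate under stated assumptions: 30 frames per second and 24 bits per pixel are illustrative values not given in the text, and the function name `raw_bitrate_mbps` is hypothetical. Actual upload requirements depend on the codec and encoder settings.

```python
def raw_bitrate_mbps(width: int, height: int, fps: int = 30, bits_per_pixel: int = 24) -> float:
    """Uncompressed video data rate in megabits per second.

    Assumptions: fps and bits_per_pixel are illustrative defaults.
    """
    return width * height * fps * bits_per_pixel / 1_000_000


if __name__ == "__main__":
    full_hd = raw_bitrate_mbps(1920, 1080)
    hd = raw_bitrate_mbps(1280, 720)
    print(f"1920x1080: {full_hd:.1f} Mbit/s uncompressed")
    print(f"1280x720:  {hd:.1f} Mbit/s uncompressed")
    print(f"1280x720 carries {hd / full_hd:.0%} of the pixels of 1920x1080")
```

The ratio between the two pixel counts (about 44%) is independent of the assumed frame rate and bit depth, and the size of the uncompressed figures illustrates why compression is unavoidable for internet transmission.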
In general, two types of video display are used: projection systems, comprising a projector and a projection surface, and direct-view screens (TV, computer screen, showcase or LED video wall, etc.). Projectors can display a video stream on very large surfaces while allowing the format to be changed, which makes them ideal for showing subjects at life size in front of an audience. On the other hand, a projected image is more affected by ambient light than a direct-view screen. Direct-view screens, being more limited in size, are easy to move within a space, and their image is less affected by ambient light. They can also be used as an auxiliary video display for participants.
In general, projection systems require more time and preparation to be installed properly. Here are some points to consider:
To optimize the sound quality of an event, certain considerations must be anticipated. In particular, echo is a frequently encountered problem in a telepresence context, where voice or sound interaction between two places often plays an important role.
To limit or even completely eliminate this phenomenon, it is necessary to recognize its origin: the reinjection, with delay, of an incoming audio signal into the outgoing signal, which is then retransmitted to the originating location and played through a local output at that site. In other words, the sound picked up at one place is both heard locally and transmitted to a remote place, where it is emitted by loudspeakers; this signal is picked up again by a microphone at the remote location, which sends it back to its place of origin; the signal is then emitted again by the local loudspeakers with a delay of a few hundred milliseconds, hence the echo effect.
Here are the main factors to consider in managing the echo effect:
More directional microphones, that is to say those with a cardioid, supercardioid, hypercardioid or shotgun polar pattern, are less sensitive to surrounding noise. They can be pointed toward the sound sources that you specifically want to capture while limiting off-axis response, thereby reducing the intensity of any unwanted surrounding sound.
Omnidirectional microphones pick up sound from all directions and should be placed some distance from the source (12-20 cm), but are less susceptible to unwanted handling noise, wind or plosives ("t", "p" and "b");
Directional microphones are less likely to cause feedback, but are more susceptible to plosives, proximity effect (exaggerated bass), wind, and handling noise.
A shotgun microphone (supercardioid) is mainly used to capture subjects in a fixed position, such as during a seated interview; a subject who moves around risks constantly leaving the microphone's field of capture.
On the other hand, some types of microphones are designed for very specific uses. For example, miniature microphones, such as a lavalier or headband microphone, are generally wireless, hands-free, very discreet and placed very close to the voice source (mouth). They are therefore ideal for keeping the input signal gain low enough to limit leakage from other surrounding sound sources.
Dynamic type microphones (e.g. Shure SM58, Sennheiser E845, etc.) are commonly used on stage thanks to their robustness, both in terms of the simplicity of their construction and their ability to handle very large dynamic ranges without degradation. On the other hand, their sensitivity, and therefore their audio fidelity, is quite limited. This is why they are mainly used to capture the voice or close-miked instruments, which can be an advantage in limiting the capture of unwanted sound sources, such as venue or stage speakers.
Condenser type microphones (e.g. Neumann U87 or TLM 103, Shure Beta 53, Sennheiser HSP, most lavaliers and shotguns, etc.) are commonly used in controlled environments (recording studios, television or film sets, etc.) thanks to their greater sensitivity and audio fidelity. On the other hand, they are more fragile and more sensitive to unwanted noise than dynamic microphones, in addition to requiring a power supply and pre-amplification. Due to their greater sensitivity, these mics are not recommended in high-volume settings or near an amplified sound system. In addition, the smaller the microphone’s diaphragm (e.g. lavaliers or headset microphones), the more amplification the microphone requires, to compensate for its lower sensitivity to slight variations in amplitude. This has the effect of increasing its sensitivity to ambient sounds or feedback effects. In this context, headsets are preferred over lavaliers, because they can be located closer to the voice source (mouth) and therefore require less gain, which reduces the risk of leakage (echo) and feedback while in use on stage.
In addition to using a directional microphone pointed toward the source that one wishes to capture, the microphone should ideally be located as close as possible to that source and proportionally as far as possible from any other sound source that we do not want to capture.
By combining the directionality of the microphone and its proximity to the source, it becomes easier to keep the microphone’s input gain relatively low, thereby reducing its sensitivity to sounds from surrounding and more distant sources.
Conversely, the loudspeakers must be located as far as possible from the microphones whose signal is destined to be transmitted remotely, whenever those loudspeakers emit the signal coming from the remote location. Otherwise, the signal coming from the remote location and emitted by local loudspeakers will likely be picked up by a microphone and then rerouted to the loudspeaker outputs of the originating location. If the remote location has an equally problematic configuration, the signal will be picked up again, then sent back to the other location, and so on.
However, even when the speakers emitting sound from a remote location are moved away and pointed away from the microphones, this will very likely not be enough to completely eliminate audio leakage if the sound level coming out of the speakers is too high. This is why it is recommended to balance the loudness level of the outputs with the input gain of the microphones, in order to create a large enough dynamic difference between the level of the desired sounds (ideally stronger) and that of the undesired sounds (ideally weaker). This balance will notably facilitate the adjustment of the threshold of a noise gate and, by extension, its effectiveness.
The principle of the noise gate comes down to partially or completely attenuating (lowering) the sound level of an incoming audio signal when it is below a certain pre-established threshold (usually in dB). If, conversely, the sound level is at or above this threshold, the signal remains unchanged at the output (the gate remains open). The use of a noise gate, whether software or hardware, on the incoming signals can be very effective in eliminating the echo effect due to leakage, insofar as the preceding recommendations are well respected. Indeed, if the dynamic difference between the desirable and undesirable sounds is not large enough, it will be very difficult to adjust the threshold of the gate. Nevertheless, such a tool is absolutely essential for good audio management in a telepresence context. If the equipment used for an event does not include such a tool, then a way must be provided to add one, ideally for each of the microphone inputs being sent to a remote location.
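The gating principle can be illustrated with a minimal sketch. It is not a production implementation: it assumes audio samples normalized to the range -1.0 to 1.0, measures the level per block as RMS in dB relative to full scale, and omits the attack, hold and release smoothing that hardware and software gates normally provide. The names `rms_dbfs` and `noise_gate` and the default values (a -40 dB threshold, 256-sample blocks) are hypothetical choices for illustration.

```python
import math


def rms_dbfs(block):
    """RMS level of a block of samples in [-1, 1], in dB relative to full scale."""
    if not block:
        return float("-inf")
    mean_square = sum(s * s for s in block) / len(block)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)


def noise_gate(samples, threshold_db=-40.0, block_size=256, floor_gain=0.0):
    """Pass blocks at or above the threshold unchanged; attenuate quieter blocks.

    floor_gain=0.0 closes the gate completely; a value such as 0.1 would
    only partially attenuate the signal below the threshold.
    """
    out = []
    for start in range(0, len(samples), block_size):
        block = samples[start:start + block_size]
        gate_open = rms_dbfs(block) >= threshold_db
        gain = 1.0 if gate_open else floor_gain
        out.extend(s * gain for s in block)
    return out


if __name__ == "__main__":
    voice = [0.5] * 256      # about -6 dBFS: desired sound, gate stays open
    leakage = [0.001] * 256  # about -60 dBFS: speaker leakage, gate closes
    gated = noise_gate(voice + leakage)
    print(max(gated[:256]), max(gated[256:]))
```

In this example the desired and undesired levels are far apart, so any threshold between them works. When the two levels are close, no threshold separates them cleanly, which is why the level balancing described earlier matters.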
At the same time, good communication must be established with the remote technical team, in order to ensure that no echo effect is actually heard at the destination and to be able to notify the latter if this is the case locally.
From a strictly sonic point of view, it is clearly preferable to equip each participant, speaker or performer with in-ear monitors (either one or two earpieces) rather than using loudspeakers distributed around the stage, especially for musicians in the case of a concert. Such a solution may nevertheless prove to be too complex or costly, depending on the needs and the number of individuals to be equipped. This still limits the sound signals emitted by the main speakers, which are often directed at the audience rather than the participants.
Using the headset, participants could hear a remote speaker clearly while isolating the signal coming out of their microphone, thereby making the function of the noise gate more efficient. The fact remains that the sound intensity coming out of the main speakers must remain at a reasonable level in relation to the sensitivity of the microphones.
The act of spatializing sound sources so that they are perceived as coming from the same place as their image is a strategy that not only facilitates the intelligibility of a dialogue, but which fits more broadly into an immersive philosophy of telepresence by optimizing the credibility of the experience.
By locating sound sources as coming from their corresponding visual projection – using a perforated or woven (acoustically transparent) projection surface to conceal a loudspeaker directly behind it, for example – the feeling of tangible presence can then be reinforced by the fact that the image and the sound of a remote presence are located together within the space.
A simple example: the voice of a subject located on the right side of the stage is played through a loudspeaker located on the right of the stage, while the voice of a remote subject, whose image is projected on the left side of the stage, is played through a loudspeaker also located on the left.
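When a source sits between two loudspeakers rather than fully on one side, one common way to place it is constant-power panning. This technique is not described in the text above; it is offered here only as a sketch of one possible approach for a two-speaker setup. The function name `pan_gains` and the convention that a position of 0.0 means fully left and 1.0 means fully right are assumptions.

```python
import math


def pan_gains(position: float) -> tuple[float, float]:
    """Constant-power pan: return (left_gain, right_gain) for a position in [0, 1].

    Assumption: 0.0 is the left loudspeaker, 1.0 is the right loudspeaker.
    The squared gains always sum to 1, so perceived loudness stays roughly
    constant as the source moves across the stage.
    """
    position = min(max(position, 0.0), 1.0)  # clamp out-of-range values
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)


if __name__ == "__main__":
    for name, pos in [("remote subject, image on the left", 0.0),
                      ("centre of the stage", 0.5),
                      ("local subject on the right", 1.0)]:
        left, right = pan_gains(pos)
        print(f"{name}: left={left:.3f}, right={right:.3f}")
```

Each gain would be multiplied with the source signal before it is sent to the corresponding loudspeaker, so that a source fully on one side is heard only from that side.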
Several different ways of spatializing the sources can be used depending on the technical and telematic means available. However, in all cases, the theoretical principle remains: to simulate the presence of a place or of distant subjects in a credible way.