
COMUNICAÇÃO DE ÁUDIO E VÍDEO

INSTITUTO SUPERIOR TÉCNICO

Year 2011/2012 - 2nd Semester, Responsible: Prof. Fernando Pereira

1st Exam – 11th June 2012 (Monday)

MEEC: The marks should be out before 18th June (Monday) at 12am on the CAV Web page, and the exam checking session will be on 18th June (Monday) at 5pm in room LT4.

MERC: The marks should be out before 18th June (Monday) at 12am on the CAV Web page, and the exam checking session will be on 18th June (Monday) at 2pm in room 0.13.

The exam is 3 hours long. Answer all the questions in detail, including all the computations performed and justifying your answers well.

Don’t get ‘trapped’ by any question; move forward to another question and return later. Good luck!

I (1.0 + 1.0 + 1.0 = 3 val.)

Consider a facsimile transmission using the READ coding method at 3200 bit/s for pages with 1000 lines, each line with 1728 samples. Consider also that, on average, 80% of the samples in each line are white.

a) Assuming that

1. the unidimensionally coded lines have an average compression factor of 10 for the black runs and 20 for the white runs

2. the bidimensionally coded lines have an average compression factor of 20 for the black runs and 25 for the white runs

compute the global compression factor when a value of k equal to 3 is used to limit the propagation of channel errors. (R: 20.83)
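A minimal sketch of how the stated result for a) can be checked. It assumes 1 bit per uncompressed sample and that, with k = 3, each group of three lines contains exactly one unidimensionally coded line followed by two bidimensionally coded lines. The function names are illustrative only.

```python
SAMPLES_PER_LINE = 1728
WHITE_FRACTION = 0.8

def coded_bits_per_line(cf_black, cf_white):
    """Average coded bits for one line, given per-colour compression factors."""
    white = SAMPLES_PER_LINE * WHITE_FRACTION
    black = SAMPLES_PER_LINE - white
    return black / cf_black + white / cf_white

def global_compression_factor(k, cf_1d, cf_2d):
    """Compression factor over a group of k lines: one 1-D line, k - 1 2-D lines."""
    bits = coded_bits_per_line(*cf_1d) + (k - 1) * coded_bits_per_line(*cf_2d)
    return k * SAMPLES_PER_LINE / bits

# (black, white) compression factors for 1-D and 2-D coded lines
print(round(global_compression_factor(3, (10, 20), (20, 25)), 2))  # 20.83
```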

b) Assume now that the TMVL has to be increased due to receiver limitations and that this implies a 20% reduction in the compression factors stated above (that is, the values are now 80% of the values above). For this case, determine the maximum value of k that may be used if decoding resynchronization must occur, on average, at least once every 500 bits. (R: 5)
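One possible way to check the result for b), sketched under the assumption that resynchronization happens at each unidimensionally coded line, so a full group of k lines (one 1-D line plus k - 1 2-D lines) must fit, on average, within 500 bits. The helper names are hypothetical.

```python
def bits_per_line(cf_black, cf_white, samples=1728, white=0.8):
    """Average coded bits per line for the given (black, white) compression factors."""
    return samples * (1 - white) / cf_black + samples * white / cf_white

def max_k(limit_bits, cf_1d, cf_2d):
    """Largest k such that one 1-D line plus (k - 1) 2-D lines stay within limit_bits."""
    bits_1d = bits_per_line(*cf_1d)
    bits_2d = bits_per_line(*cf_2d)
    k = 1
    while bits_1d + k * bits_2d <= limit_bits:  # would one more 2-D line still fit?
        k += 1
    return k

# Compression factors reduced to 80% of the values given in a)
reduced_1d = (0.8 * 10, 0.8 * 20)
reduced_2d = (0.8 * 20, 0.8 * 25)
print(max_k(500, reduced_1d, reduced_2d))  # 5
```

With these values a 1-D line takes about 129.6 bits and a 2-D line about 90.72 bits, so five lines need about 492.5 bits while six would need about 583.2 bits.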

c) In general, identify two advantages and one drawback of using lossless source coding compared with lossy source coding.

II (1.0 + 0.5 + 1.0 = 2.5 val.)

Consider the JPEG standard to code photographic images.

a) Determine the average number of bits per pixel (considering both the luminance and the chrominances) that are spent when coding a 4:2:2 image with 16 bit/sample and a global compression factor (for the luminance and the chrominances) of 25. (R: 1.28 bit/pixel)
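As a rough check of a), the sketch below counts samples per pixel: in 4:2:2 each chrominance has half the horizontal luminance resolution, so the two chrominances together contribute one sample per pixel. The function name and parameters are illustrative.

```python
def bits_per_pixel(bits_per_sample, chroma_samples_per_pixel, compression_factor):
    """Average coded bits per pixel, counting luminance and chrominance samples."""
    samples_per_pixel = 1 + chroma_samples_per_pixel  # one luminance sample per pixel
    return samples_per_pixel * bits_per_sample / compression_factor

# 4:2:2: Cb and Cr each contribute 0.5 sample per pixel, 1.0 in total
print(bits_per_pixel(16, 1.0, 25))  # 1.28
```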

b) How many bits have to be spent to code a 4:2:0 colour image with 576×720 luminance resolution if the luminance compression factor is 20 and the chrominance compression factor is twice that of the luminance? (R: 414720 bits for 16 bit/sample)
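A short sketch of the computation for b), assuming 16 bit/sample as in the stated result and 4:2:0 chrominances subsampled by 2 in both directions. The function name is hypothetical.

```python
def coded_bits_420(rows, cols, bits_per_sample, cf_luma, cf_chroma):
    """Total coded bits for one 4:2:0 image with separate luma/chroma compression."""
    luma_samples = rows * cols
    chroma_samples = 2 * (rows // 2) * (cols // 2)  # Cb + Cr, each subsampled 2x2
    return (luma_samples / cf_luma + chroma_samples / cf_chroma) * bits_per_sample

print(coded_bits_420(576, 720, 16, 20, 40))  # 414720.0
```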

c) Identify the simplest modulation that may be used to transmit, in a 2 MHz bandwidth, a 25 Hz video sequence coded as JPEG images in the format and conditions defined in b). (R: 64-PSK or 64-QAM)
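A sketch of the reasoning behind c), assuming 414720 bits per image (the result of b)) and an idealized channel carrying 1 symbol/s per Hz of bandwidth. That spectral efficiency is an assumption, not something the question states; the function name is illustrative.

```python
import math

def simplest_constellation(bits_per_image, frame_rate_hz, bandwidth_hz,
                           symbols_per_hz=1.0):
    """Smallest power-of-two constellation size that carries the video bitrate."""
    bitrate = bits_per_image * frame_rate_hz        # 10.368 Mbit/s here
    symbol_rate = bandwidth_hz * symbols_per_hz     # assumed 1 symbol/s per Hz
    bits_per_symbol = math.ceil(bitrate / symbol_rate)
    return 2 ** bits_per_symbol

print(simplest_constellation(414720, 25, 2e6))  # 64
```

The required 5.184 bit/symbol rounds up to 6 bit/symbol, which corresponds to a 64-point constellation.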

III (1.0 + 1.5 + 0.5 + 0.5 = 3.5 val.)

Consider a videotelephony communication using Recommendation ITU-T H.261. The video sequence is coded with a CIF spatial resolution and a frame rate of 12.5 Hz at a rate of 128 kbit/s.

The video content to code is horizontally divided into two equal parts; while the bottom part is fixed, the top part is moving. Since the encoder processes the macroblocks sequentially, it is observed that all bits are uniformly generated in the first half of the time interval that the encoder usually dedicates to encoding each image. At the encoder, the bits wait for transmission in an output buffer.

Knowing that the first image has used 15360 bits, the second image 20480 bits, and the third image 10240 bits, determine:

a) The time instants at which the receiver obtains all bits for the first, second and third images. (R: 120, 280 and 360 ms)

b) The minimum size of the encoder output buffer so that all the bits above are transmitted without problems. (R: 20480 bits)

c) The initial visualization delay associated with the system defined in b). (R: 240 ms)

d) The maximum number of bits that the 4th image may spend (still assuming that it only spends bits in the top half). (R: 10240 bits)
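The buffer behaviour behind a), b) and d) can be sketched with a simple 1 ms step simulation. This is only an illustration under stated assumptions: the channel drains the buffer at a constant 128 bit/ms, propagation delay is ignored (a bit is "received" as soon as it is sent), and each image produces its bits uniformly during the first 40 ms of its 80 ms period. The function name and its parameters are hypothetical.

```python
def simulate_buffer(bits_per_image, rate=128, frame_ms=80, gen_ms=40):
    """Simulate the encoder output buffer in 1 ms steps.

    rate is in bit/ms (128 kbit/s); frame_ms is the image period (12.5 Hz).
    Returns (arrival times in ms, peak buffer occupancy in bits).
    """
    finish_marks, total = [], 0
    for bits in bits_per_image:
        total += bits
        finish_marks.append(total)   # cumulative bits sent when image n is complete

    occupancy = sent = peak = 0
    arrivals, t = [], 0
    n_images = len(bits_per_image)
    while t < n_images * frame_ms or occupancy > 0:
        n, phase = divmod(t, frame_ms)
        if n < n_images and phase < gen_ms:
            occupancy += bits_per_image[n] / gen_ms   # bits produced in this ms
        out = min(occupancy, rate)                    # bits sent in this ms
        occupancy -= out
        sent += out
        t += 1
        peak = max(peak, occupancy)
        # small tolerance guards against floating-point rounding
        while len(arrivals) < n_images and sent >= finish_marks[len(arrivals)] - 1e-6:
            arrivals.append(t)
    return arrivals, peak

images = [15360, 20480, 10240]
print(simulate_buffer(images))                # ([120, 280, 360], 20480.0)
print(simulate_buffer(images + [10240])[1])   # 20480.0
```

The first call reproduces the arrival instants of a) and the peak occupancy of b). The second call shows that a 4th image of 10240 bits keeps the peak at 20480 bits, while any larger value would exceed it, which is consistent with d).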

IV (2.5 + 0.5 = 3 val.)

Suppose that you are contacted by a company to design a digital storage system for short clips from Euro 2012. The company requires editing flexibility with a maximum access time per image below 0.8 s and needs to store the largest possible number of 1.5-minute clips on a disk with 100 GBytes of capacity. The maximum access speed to the disk is 10 Mbit/s. The clips have HDTV resolution, that is, 1920×1152 (Y) and 960×1152 (Cr, Cb) at 25 Hz. Assuming that you have at your disposal the following solutions, both providing the required video quality:

1. a JPEG coding solution with a compression factor of 30 for both the luminance and chrominances

2. an MPEG-2 Video coding solution using N=6 and M=2 with the following compression factors:

· I frames: 20 and 25 for the luminance and chrominances, respectively

· P frames: 40 and 50 for the luminance and chrominances, respectively

· B frames: 50 and 60 for the luminance and chrominances, respectively

a) Determine, with justification, which coding solution should be proposed to your client if the system is only for storage and non-real-time processing. (R: MPEG-2)

b) How many full video clips would you be able to store on the disk with each of the two coding solutions above? (R: 300 and 410 clips)

V (1.0 + 0.5 + 1.0 + 0.5 + 1.0 = 4 val.)

Consider a DVB system for the transmission of digital TV.

a) Regarding the audio signal, explain the benefits and drawbacks of increasing the number of audio channels, sampling frequency and number of bits/sample (1 benefit and 1 drawback for each case).

b) Still regarding the audio signal, why is it advisable to adopt a frequency decomposition of the signal to code it efficiently?

c) Consider now that H.264/AVC is used for video coding. Why does this standard define the 4×4 and 16×16 Intra prediction modes? For which type of content is each of these two prediction modes more appropriate?

d) Regarding the DVB channel coding, what are the main advantage and the main drawback, in terms of user experience, of using larger block sizes?

e) Considering now the DVB-T modulation, explain the main benefit of increasing the number of carriers if the bandwidth is kept constant. How should the number of carriers vary for increasing cell sizes?

VI (1.0 + 1.0 + 1.0 + 1.0 = 4 val.)

As you know, 3D video is nowadays very popular.

a) Identify and explain the two main ways of providing the user a 3D video experience.

b) Define stereo parallax and movement parallax, and explain the difference between them. Which of these types of parallax may be present in a multiview video system with many views? Why?

c) Explain what a requirement on view-switching random access typically asks for. When is this type of requirement important for a user?

d) Compute the typical bitrate for a stereo video pair when using Multiview Video Coding (MVC) at standard resolution if the two views are coded with similar PSNR. Compute the same bitrate for a system with 10 views. (R: 15.5 Mbit/s)