COMUNICAÇÃO DE ÁUDIO E VÍDEO
INSTITUTO SUPERIOR TÉCNICO
Year 2011/2012 - 2nd Semester, Responsible: Prof. Fernando Pereira
1st Exam – 11th June 2012 (Monday)
MEEC: The marks should be out before 18th June (Monday) at 12am at the CAV Web page and the exam checking session will be on the 18th June (Monday) at 5pm in room LT4.
MERC: The marks should be out before 18th June (Monday) at 12am at the CAV Web page and the exam checking session will be on the 18th June (Monday) at 2pm in room 0.13.
The exam is 3 hours long.
Answer all the questions in detail, including all the computations performed, and justify your answers well.
Don't get 'trapped' by any question; move on to another question and return later. Good luck!
Consider a facsimile transmission using the READ coding method at 3200 bit/s for pages with 1000 lines, each line with 1728 samples. Consider also that, on average, 80% of the samples in each line are white.
a) Assuming that
1. the unidimensionally coded lines have an average compression factor of 10 for the black runs and 20 for the white runs
2. the bidimensionally coded lines have an average compression factor of 20 for the black runs and 25 for the white runs
compute the global compression factor when a value of k equal to 3 is used to limit the propagation of channel errors. (R: 20.83)
b) Assume now that the TMVL must be increased due to receiver limitations and that this implies a 20% reduction of the compression factors stated above (this means the values are now 80% of the values above). For this case, determine the maximum value of k that may be used if decoding resynchronization must be obtained, on average, at least once every 500 bits. (R: 5)
c) In general, identify two advantages and one drawback of using lossless source coding compared with lossy source coding.
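The arithmetic behind a) and b) can be checked with a minimal sketch. It assumes that each group of k lines consists of one unidimensionally coded line followed by k - 1 bidimensionally coded lines, that resynchronization happens at each unidimensionally coded line, and that EOL and fill bits are ignored. The function names are illustrative only and are not part of the exam.

```python
SAMPLES_PER_LINE = 1728
WHITE_FRACTION = 0.8  # on average, 80% of the samples in a line are white


def bits_per_line(cf_black, cf_white):
    """Average coded bits in one line for the given compression factors."""
    white = SAMPLES_PER_LINE * WHITE_FRACTION
    black = SAMPLES_PER_LINE - white
    return white / cf_white + black / cf_black


def global_compression_factor(k, bits_1d, bits_2d):
    """Assumes one 1D-coded line followed by k - 1 2D-coded lines."""
    average_bits = (bits_1d + (k - 1) * bits_2d) / k
    return SAMPLES_PER_LINE / average_bits


def max_k_for_resync(max_bits, bits_1d, bits_2d):
    """Largest k whose group (one 1D line + (k - 1) 2D lines) fits in max_bits.

    Assumes a single 1D line already fits in max_bits.
    """
    k = 1
    while bits_1d + k * bits_2d <= max_bits:  # room for one more 2D line?
        k += 1
    return k


# a) k = 3 with the stated compression factors
cf_a = global_compression_factor(3, bits_per_line(10, 20), bits_per_line(20, 25))
print(round(cf_a, 2))  # 20.83

# b) all compression factors reduced to 80% of their values
scale = 0.8
bits_1d = bits_per_line(scale * 10, scale * 20)
bits_2d = bits_per_line(scale * 20, scale * 25)
k_b = max_k_for_resync(500, bits_1d, bits_2d)
print(k_b)  # 5
```

Under these assumptions a group of 5 lines in b) takes about 492 bits, while a group of 6 would exceed the 500-bit limit.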
II (1.0 + 0.5 + 1.0 = 2.5 val.)
Consider the JPEG standard to code photographic images.
a) Determine the average number of bits per pixel (considering both the luminance and the chrominances) that are spent when coding a 4:2:2 image with 16 bit/sample and a global compression factor (for the luminance and the chrominances) of 25. (R: 1.28 bit/pixel)
b) How many bits have to be spent to code a 4:2:0 colour image with 576×720 luminance resolution if the luminance compression factor is 20 and the chrominance compression factor is twice that of the luminance? (R: 414720 for 16 bit/sample)
c) Identify the simplest modulation that may be used to transmit in a 2 MHz bandwidth a 25 Hz video sequence coded as JPEG images in the format and conditions defined in b). (R: 64-PSK, 64-QAM)
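The three results above can be reproduced with a short sketch. Two points are assumptions rather than statements from the exam: in c) the symbol rate is taken as one symbol per second per hertz of bandwidth, and "simplest" is read as the smallest constellation that carries the required bits per symbol. Variable names are illustrative.

```python
import math

BITS_PER_SAMPLE = 16

# a) 4:2:2: each chrominance has half the luminance samples,
#    so there are 1 + 0.5 + 0.5 = 2 samples per pixel.
samples_per_pixel = 1 + 2 * 0.5
bits_per_pixel = samples_per_pixel * BITS_PER_SAMPLE / 25
print(bits_per_pixel)  # 1.28

# b) 4:2:0: each chrominance is subsampled by 2 in both directions.
luma_samples = 576 * 720
chroma_samples = 2 * (576 // 2) * (720 // 2)  # Cb + Cr
cf_luma = 20
cf_chroma = 2 * cf_luma
image_bits = BITS_PER_SAMPLE * (luma_samples / cf_luma + chroma_samples / cf_chroma)
print(image_bits)  # 414720.0

# c) 25 images per second through a 2 MHz channel.
bitrate = image_bits * 25                            # bit/s
bandwidth_hz = 2e6
bits_per_symbol = math.ceil(bitrate / bandwidth_hz)  # assumes 1 symbol/s per Hz
constellation_size = 2 ** bits_per_symbol
print(bits_per_symbol, constellation_size)  # 6 64
```

With 5.184 bit/symbol required, 6 bits per symbol is the smallest integer choice, which corresponds to a 64-point constellation.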
III (1.0 + 1.5 + 0.5 + 0.5 = 3.5 val.)
Consider a videotelephony communication using Recommendation ITU-T H.261. The video sequence is coded with a CIF spatial resolution and a frame rate of 12.5 Hz at a rate of 128 kbit/s.
The video content to code is horizontally divided into two equal parts; however, while the bottom part is fixed, the top part is moving. Since the encoder processes the macroblocks sequentially, it is observed that all bits are uniformly generated in the first half of the time interval that the encoder usually dedicates to encoding each image. At the encoder, the bits wait for transmission in an output buffer.
Knowing that the first image has used 15360 bits, the second image 20480 bits, and the third image 10240 bits, determine:
a) The time instants at which the receiver obtains all bits for the first, second and third images. (R: 120, 280 and 360 ms)
b) The minimum size of the encoder output buffer so that all the bits above are transmitted without problems. (R: 20480 bits)
c) The initial visualization delay associated with the system defined in b). (R: 240 ms)
d) The maximum number of bits that the 4th image may spend (still assuming that it only spends bits in the top half). (R: 10240 bits)
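Parts a), b) and d) can be checked with a small buffer simulation. This is a sketch under stated assumptions: the buffer is stepped in 1 ms ticks, propagation and decoding delays are ignored, the channel drains the buffer at a constant 128 bit/ms, and for d) the channel is assumed never to be idle before the 4th image starts. The function and variable names are illustrative.

```python
from fractions import Fraction

RATE = 128      # channel rate in bit/ms (128 kbit/s)
PERIOD = 80     # frame period in ms (12.5 Hz)
WINDOW = 40     # bits are generated during the first half of each period


def simulate(image_bits):
    """Step the encoder output buffer in 1 ms ticks.

    Returns the times (ms) at which the last bit of each image has been
    sent, and the peak buffer occupancy in bits.
    """
    targets, total = [], 0
    for bits in image_bits:
        total += bits
        targets.append(total)  # cumulative bits sent once image i is complete

    buffer = sent = peak = Fraction(0)
    done_ms, t = [], 0
    while len(done_ms) < len(image_bits):
        index, offset = divmod(t, PERIOD)
        if index < len(image_bits) and offset < WINDOW:
            buffer += Fraction(image_bits[index], WINDOW)  # uniform generation
        out = min(buffer, RATE)  # the channel takes at most RATE bits per ms
        buffer -= out
        sent += out
        peak = max(peak, buffer)
        t += 1
        while len(done_ms) < len(image_bits) and sent >= targets[len(done_ms)]:
            done_ms.append(t)
    return done_ms, int(peak)


images = [15360, 20480, 10240]
arrivals, buffer_size = simulate(images)
print(arrivals, buffer_size)  # [120, 280, 360] 20480

# d) backlog when the 4th image starts, assuming the channel was never idle
start_4 = 3 * PERIOD
backlog = sum(images) - RATE * start_4
# The peak occurs at the end of the generation window: backlog + x - drained
max_bits_4 = buffer_size - backlog + RATE * WINDOW
print(max_bits_4)  # 10240
```

The occupancy peaks at the end of each generation window, which is why the buffer reaches 20480 bits at both 120 ms and 200 ms in this trace.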
IV (2.5 + 0.5 = 3 val.)
Suppose that you are contacted by a company to design a digital storage system for short clips from Euro 2012. The company requires editing flexibility with a maximum access time per image below 0.8 s and needs to store the largest number of 1.5-minute clips on a disk with 100 GBytes of capacity. The maximum access speed to the disk is 10 Mbit/s.
The clips have HDTV resolution, that is, 1920×1152 (Y) and 960×1152 (Cr, Cb) at 25 Hz. Assume that you have at your disposal the following solutions, both providing the required video quality:
1. a JPEG coding solution with a compression factor of 30 for both the luminance and chrominances
2. an MPEG-2 Video coding solution using N=6 and M=2 with the following compression factors:
· I frames: 20 and 25 for the luminance and chrominances, respectively
· P frames: 40 and 50 for the luminance and chrominances, respectively
· B frames: 50 and 60 for the luminance and chrominances, respectively
a) Determine, with justification, which coding solution should be proposed to your client if the system is only for storage and non-real-time processing. (R: MPEG-2)
b) How many full video clips would you be able to store on the disk with each of the two coding solutions above? (R: 300 and 410 clips)
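The comparison between the two solutions can be sketched as follows. Several inputs are assumptions, since the problem does not state them: 8 bit/sample, 1 GByte = 10^9 bytes, and N=6, M=2 read as the frame pattern I B P B P B (1 I, 2 P and 3 B frames per group of 6). Under these assumptions the clip counts come out close to, but not exactly at, the quoted 300 and 410; the exact figures depend on these choices and on rounding.

```python
BITS_PER_SAMPLE = 8              # assumption: not stated in the problem
LUMA = 1920 * 1152               # Y samples per frame
CHROMA = 2 * 960 * 1152          # Cr + Cb samples per frame
FRAMES_PER_CLIP = 90 * 25        # 1.5 minutes at 25 Hz
DISK_BITS = 100 * 10**9 * 8      # assumption: 1 GByte = 10**9 bytes


def frame_bits(cf_luma, cf_chroma):
    """Coded bits for one frame with the given compression factors."""
    return BITS_PER_SAMPLE * (LUMA / cf_luma + CHROMA / cf_chroma)


def clips_on_disk(avg_frame_bits):
    """Number of complete clips that fit on the disk."""
    return int(DISK_BITS // (avg_frame_bits * FRAMES_PER_CLIP))


# JPEG: every frame is coded independently with the same factor
jpeg_frame = frame_bits(30, 30)

# MPEG-2: (count per group of 6, luminance factor, chrominance factor)
gop = [(1, 20, 25), (2, 40, 50), (3, 50, 60)]  # I, P, B
mpeg2_frame = sum(n * frame_bits(y, c) for n, y, c in gop) / 6

print(jpeg_frame > mpeg2_frame)  # True: MPEG-2 spends fewer bits per frame
print(clips_on_disk(jpeg_frame), clips_on_disk(mpeg2_frame))
```

Since the system in a) is only for storage and non-real-time processing, the lower average bits per frame is what favours MPEG-2 in this sketch.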
V (1.0 + 0.5 + 1.0 + 0.5 + 1.0 = 4 val.)
Consider a DVB system for the transmission of digital TV.
a) Regarding the audio signal, explain the benefits and drawbacks of increasing the number of audio channels, the sampling frequency and the number of bits/sample (1 benefit and 1 drawback for each case).
b) Still regarding the audio signal, why is it advisable to adopt a frequency decomposition of the signal to code it efficiently?
c) Consider now that H.264/AVC is used for video coding. Why does this standard define the 4×4 and 16×16 Intra prediction modes? For which type of content is each of these two prediction modes more appropriate?
d) Regarding the DVB channel coding, what are the main advantage and the main drawback for the user experience of using larger block sizes?
e) Considering now the DVB-T modulation, explain the main benefit of increasing the number of carriers if the bandwidth is kept constant. How should the number of carriers vary for increasing cell sizes?
VI (1.0 + 1.0 + 1.0 + 1.0 = 4 val.)
As you know, 3D video is nowadays very popular.
a) Identify and explain the two main ways of providing the user with a 3D video experience.
b) Define stereo parallax and movement parallax and explain the difference between them. Which of these types of parallax may be present in a multiview video system with many views? Why?
c) Explain what a requirement on view-switching random access typically asks for. When is this type of requirement important for a user?
d) Compute the typical bitrate for a stereo video pair when using Multiview Video Coding (MVC) with standard resolution if the two views are coded with similar PSNR. Compute the same bitrate for a system with 10 views. (R: 15.5 Mbit/s)