COMUNICAÇÃO DE ÁUDIO E VÍDEO
INSTITUTO SUPERIOR TÉCNICO
Year 2012/2013 – 1st Semester, Responsible: Prof. Fernando Pereira
1st Exam – 9th January 2013, 8am (Wednesday)
MEEC: The marks should be out before 11th January (Friday), 2pm on the CAV Web page, and the exam checking session will be on 11th January (Friday), 5pm in room LT4.
The exam is 3 hours long. Answer all the questions in a detailed way, including all the computations performed and justifying your answers well.
Don't get 'trapped' by any question; move forward to another question and return later. Good luck!
I
Consider a facsimile transmission using the READ coding method at 3200 bit/s for pages with 1000 lines, each line with 1728 samples. Consider also that, on average, 75% of the samples in each line are white.
Assume that:
1. the unidimensionally coded lines have an average compression factor of 15 for the black runs and 25 for the white runs;
2. the bidimensionally coded lines have an average compression factor of 22 for the black runs and 30 for the white runs.
a) How many bits do a unidimensionally and a bidimensionally coded line spend on average? (R: 80.64 and 62.84 bit/line)
b) If the k parameter of the READ method is 5, what is (on average) the periodicity (in bits) of recovering the decoding synchronization? (R: 332 bit)
c) Provide a formula for the global compression factor (of a full page) only as a function of the parameter k.
d) Why does bilevel image coding typically follow a lossless approach?
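The arithmetic behind the facsimile questions above can be sketched as follows. This is only an illustrative sketch, not an official solution: it assumes that with parameter k one line in every k is coded unidimensionally and the other k-1 bidimensionally, and that an uncoded bilevel sample costs 1 bit. The function names are hypothetical.

```python
# Sketch for the facsimile questions; figures come from the problem statement.
SAMPLES_PER_LINE = 1728
WHITE_FRACTION = 0.75


def bits_per_line(cf_black: float, cf_white: float) -> float:
    """Average coded bits for one line, given per-colour compression factors."""
    white = SAMPLES_PER_LINE * WHITE_FRACTION
    black = SAMPLES_PER_LINE - white
    return black / cf_black + white / cf_white


BITS_1D = bits_per_line(15, 25)  # unidimensionally coded line
BITS_2D = bits_per_line(22, 30)  # bidimensionally coded line


def sync_period_bits(k: int) -> float:
    """Assumption: one 1D line followed by k-1 2D lines between resyncs."""
    return BITS_1D + (k - 1) * BITS_2D


def global_compression_factor(k: int) -> float:
    """Uncoded bits of k lines (assumed 1 bit/sample) over their coded bits."""
    return k * SAMPLES_PER_LINE / sync_period_bits(k)


print(round(BITS_1D, 2), round(BITS_2D, 2))
print(round(sync_period_bits(5)))
print(round(global_compression_factor(5), 2))
```

Under these assumptions the formula asked for in c) takes the form 1728k / (80.64 + 62.84(k-1)), which grows with k towards the compression factor of a purely bidimensionally coded page.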
II (0.5 + 0.5 + 0.5 + 1.0 + 1.0 = 3.5 val.)
Consider the JPEG standard to code photographic images with a 576×720 luminance resolution, 4:2:2 colour subsampling and 8 bit/sample.
a) How many more luminance blocks than chrominance blocks exist in this type of image? (R: same number)
b) Determine the average number of bits per pixel (considering both the luminance and the chrominances) that are spent when coding this type of image with a global compression factor (for the luminance and the chrominances together) of 20. (R: 0.8 bit/pixel)
c) Determine the total number of bits that have to be spent to code an image if an average of 4 DCT coefficients are coded per block and each coefficient costs, on average, 3 bits for the luminance and 2 bits for the chrominance; additionally consider that the EOB (End of Block) word costs 2 bits. (R: 155 520 bit)
d) Why is it reasonable to say that the DCT representation involves a frequency interpretation of the image content?
e) Explain a mechanism that exploits some redundancy between neighboring blocks in JPEG coding. (R: prediction of the DC coefficient from the left neighboring block)
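The numerical parts of the JPEG questions above can be checked with a short sketch. It is hedged rather than authoritative: it assumes 8×8 DCT blocks, that 4:2:2 halves only the horizontal chrominance resolution, and that "bits per pixel" is measured against the number of luminance samples. The function names are hypothetical.

```python
# Sketch for the JPEG questions; figures come from the problem statement.
ROWS, COLS = 576, 720
BITS_PER_SAMPLE = 8
BLOCK = 8  # assumption: 8x8 DCT blocks

luma_samples = ROWS * COLS
# Assumption: 4:2:2 halves the horizontal chrominance resolution only,
# and there are two chrominance components.
chroma_samples = 2 * ROWS * (COLS // 2)

luma_blocks = luma_samples // BLOCK**2
chroma_blocks = chroma_samples // BLOCK**2


def bits_per_pixel(compression_factor: float) -> float:
    """Coded bits (luminance plus chrominances) per luminance pixel."""
    raw_bits = (luma_samples + chroma_samples) * BITS_PER_SAMPLE
    return raw_bits / compression_factor / luma_samples


def total_bits(coeffs: int, luma_cost: int, chroma_cost: int, eob: int) -> int:
    """Bits per image when every block codes `coeffs` coefficients plus EOB."""
    luma = luma_blocks * (coeffs * luma_cost + eob)
    chroma = chroma_blocks * (coeffs * chroma_cost + eob)
    return luma + chroma


print(luma_blocks, chroma_blocks)
print(bits_per_pixel(20))
print(total_bits(4, 3, 2, 2))
```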
III (0.5 + 0.5 + 1.0 + 1.0 + 0.5 = 3.5 val.)
Consider a videotelephony communication using Recommendation ITU-T H.261. The video sequence is coded with a CIF spatial resolution, a frame rate of 10 Hz and a constant bitrate channel of 64 kbit/s. The bits for each coded image are uniformly generated in the time between the acquisitions of two images.
Knowing that the first image has used 9600 bits, the second image 16000 bits, and the third image 4800 bits, determine:
a) Considering that a constant bitrate channel is used, what architectural element allows the encoder to spend a very different number of bits per frame? (R: encoder output buffer)
b) The time instants at which the receiver obtains all bits for the second and third images. (R: 400 and 475 ms)
c) The minimum size of the encoder output buffer so that all the bits above are transmitted without problems. (R: 12800 bit)
d) The initial visualization delay associated with the system defined in c), justifying the formula used. (R: 300 ms)
e) The maximum number of bits that the 5th image may spend. (R: 14400 bit)
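The buffer reasoning in the questions above can be simulated with a small sketch. Several points are assumptions rather than part of the statement: transmission starts at t = 0 and the channel never idles, the initial delay is taken as one frame period plus the time to drain a full buffer, and for e) the 4th image is assumed to spend no bits, which is the reading consistent with the reference answer. The function names are hypothetical.

```python
# Sketch for the H.261 buffer questions; figures come from the problem statement.
FRAME_PERIOD_MS = 100      # 10 Hz frame rate
CHANNEL_BITS_PER_MS = 64   # 64 kbit/s constant bitrate channel
DRAIN_PER_FRAME = FRAME_PERIOD_MS * CHANNEL_BITS_PER_MS


def reception_times_ms(frame_bits):
    """Instant at which the last bit of each frame has crossed the channel.

    Assumption: transmission starts at t = 0 and the channel never idles.
    """
    times, t = [], 0.0
    for bits in frame_bits:
        t += bits / CHANNEL_BITS_PER_MS
        times.append(t)
    return times


def buffer_occupancy(frame_bits):
    """Encoder buffer fullness (bits) at the end of each frame period."""
    levels, level = [], 0
    for bits in frame_bits:
        level = max(level + bits - DRAIN_PER_FRAME, 0)
        levels.append(level)
    return levels


def max_bits_next_frame(frame_bits, buffer_size):
    """Largest next frame that keeps the buffer within `buffer_size`."""
    return buffer_size - buffer_occupancy(frame_bits)[-1] + DRAIN_PER_FRAME


frames = [9600, 16000, 4800]
buffer_size = max(buffer_occupancy(frames))
# Assumed formula: one frame period plus the time to drain a full buffer.
initial_delay_ms = FRAME_PERIOD_MS + buffer_size / CHANNEL_BITS_PER_MS

print(reception_times_ms(frames))
print(buffer_size, initial_delay_ms)
# Assumption: the 4th image spends 0 bits, the most favourable case for the 5th.
print(max_bits_next_frame(frames + [0], buffer_size))
```

Because bits are generated uniformly and drained at a constant rate, the occupancy is linear within each frame period, so checking it only at the frame boundaries is enough to find the peak.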
IV (0.5 + 1 + 0.5 + 1 = 3 val.)
Consider the MPEG-1 and MPEG-2 Audio standards.
a) Determine the coding rate for stereo audio content with a 22 kHz bandwidth and the usual number of bit/sample if coded with a Layer 2 codec to reach CD transparent quality. (R: 176 kbit/s)
b) What are the 2 main ways audio perceptual masking contributes to reduce the bitrate when coding the audio signal?
c) Why does the Layer 3 codec use the (M)DCT with an overlapping window? (R: Reduce block effect)
d) Why is it reasonable to say that the Layer 3 codec has a hybrid time/frequency coding structure?
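For question a) above, a rough sketch of the rate computation follows. None of the three constants marked as assumptions is given in the statement: 16 bit/sample is taken as the "usual" CD value, the sampling rate is taken as twice the bandwidth, and a Layer 2 compression factor of 8 is the value consistent with the reference answer.

```python
# Sketch for the Layer 2 coding rate; several constants are assumptions.
BANDWIDTH_HZ = 22_000
CHANNELS = 2                # stereo
BITS_PER_SAMPLE = 16        # assumption: the "usual" CD value
LAYER2_COMPRESSION = 8      # assumption: matches the reference answer

sampling_rate = 2 * BANDWIDTH_HZ  # assumption: Nyquist rate sampling
pcm_rate = sampling_rate * BITS_PER_SAMPLE * CHANNELS
coded_rate_kbits = pcm_rate / LAYER2_COMPRESSION / 1000

print(pcm_rate, coded_rate_kbits)
```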
V (0.5 + 1.2 + 0.5 + 0.8 = 3 val.)
Suppose that you are contacted by a company to design a digital storage system for video clips. The company requires some editing flexibility and needs to store the largest number of 5-minute clips on a disk with 1 TByte (10^12) of capacity. The maximum access speed to the disk is 50 Mbit/s. The clips have HDTV resolution with the following characteristics: 1920×1152 (Y), 4:4:4, 8 bit/sample at 25 Hz.
a) Assuming that you have at your disposal, providing the required video quality, a JPEG coding solution with average compression factors of 40 and 45 for the luminance and chrominances, respectively, determine the maximum access time for an image, knowing that the compression factors for critical frames are 25% lower than average. (R: 32.768 ms)
b) Assuming now that you have at your disposal, providing the required video quality, an MPEG-2 Video coding solution with N=15 and M=3 with the following average compression factors:
· I frames: 30 and 35 for the luminance and chrominances, respectively
· P frames: 40 and 50 for the luminance and chrominances, respectively
· B frames: 50 and 60 for the luminance and chrominances, respectively
Determine the maximum access time for an image, knowing that the compression factors for critical frames are again 25% lower than average. (R: 233 ms)
c) Determine, with justification, which coding solution you would propose to your client if a maximum random access requirement of 50 ms is put forward together with the requirement of maximizing the number of clips stored on the disk. (R: JPEG)
d) How many full video clips would you be able to store on the disk with the proposed solution? (R: 868 full clips)
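The storage and access-time questions above can be worked through with the sketch below. It is a hedged reading, not an official solution: the MPEG-2 worst case is assumed to be a B frame just before the next I frame (which needs the current I frame, all P frames of the group, the next I frame and the B frame itself), critical compression factors are applied to every frame in that chain, and clip storage uses the average JPEG factors. The function and variable names are hypothetical.

```python
# Sketch for the storage questions; figures come from the problem statement.
WIDTH, HEIGHT = 1920, 1152
BITS_PER_SAMPLE = 8
FRAME_RATE = 25
DISK_BITS = 8 * 10**12     # 1 TByte = 10^12 bytes
ACCESS_RATE = 50e6         # 50 Mbit/s
CLIP_SECONDS = 5 * 60

LUMA_BITS = WIDTH * HEIGHT * BITS_PER_SAMPLE
CHROMA_BITS = 2 * LUMA_BITS  # 4:4:4: two full resolution chrominances


def frame_bits(cf_luma: float, cf_chroma: float, critical: bool = False) -> float:
    """Coded bits of one frame; critical frames compress 25% less."""
    scale = 0.75 if critical else 1.0
    return LUMA_BITS / (cf_luma * scale) + CHROMA_BITS / (cf_chroma * scale)


jpeg_access_ms = frame_bits(40, 45, critical=True) / ACCESS_RATE * 1000

# Assumed worst case: a B frame just before the next I frame.
N, M = 15, 3
p_frames = N // M - 1
i_bits = frame_bits(30, 35, critical=True)
p_bits = frame_bits(40, 50, critical=True)
b_bits = frame_bits(50, 60, critical=True)
mpeg2_access_ms = (2 * i_bits + p_frames * p_bits + b_bits) / ACCESS_RATE * 1000

# Clip storage with the JPEG solution at average compression factors.
clip_bits = frame_bits(40, 45) * FRAME_RATE * CLIP_SECONDS
full_clips = int(DISK_BITS // clip_bits)

print(jpeg_access_ms, round(mpeg2_access_ms), full_clips)
```

Under these assumptions only the JPEG solution stays below the 50 ms random access limit, which is why the clip count is computed for JPEG.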
VI (1.0 + 1.0 + 0.5 + 0.5 + 1.0 = 4.0 val.)
Consider a DVB system for the transmission of digital TV.
a) What does it mean to use hierarchical B frames in H.264/AVC? What is the main difference regarding classical B frames as used in MPEG-2 Video? Give a practical example.
b) What type of 3D video coding format would you choose to provide the users with stereo TV channels with the minimum impact on the transmission and coding chain? What is the main implication in the stereo views regarding the previous single 2D view? (R: frame compatible format)
c) Explain why it is possible to transmit H.264/AVC coded content in an MPEG-2 Systems enabled TV transmission chain.
d) Regarding channel coding in DVB-x2, explain which parameters may be used, and how, to tune the delay and the error correction power. (R: block length and coding rate)
e) In DVB-S2, the set of allowed modulations evolved in two different directions to improve the modulation efficiency regarding the single modulation specified in DVB-S. Which are these directions? Which of these directions may be more critical for applications requiring higher reliability? (R: increase the number of phases and the number of amplitudes)