With the advancement of broadband Internet services,
high-quality video on the web has become possible.
Watching movie trailers over the net is
now a common activity among movie lovers
and many webmasters have begun to enrich
their web sites by incorporating video presentations.
To support e-Learning activities, the
Computing Services Centre (CSC) has recently
upgraded both the video encoding and the
streaming servers of its video streaming
service, so that video teaching materials can be
accessed anywhere by taking advantage of
the increased bandwidth of Internet access.
The following is a brief account of the
technologies involved in this upgrade.
Roughly speaking, there are two common methods
to publish video materials over the net.
The first one is to store video clips as
data files on web sites. This requires users
to download a video file completely to their
local disk (this may take several minutes
depending on the file size and the bandwidth)
before the video can be played on their
client PC. Users do not welcome this,
though it is a simple and inexpensive
solution for web developers. The second method
is to store video clips as streaming files
on a dedicated video server. This allows videos to
be accessed in real time, which means users
can watch any part of a video without
waiting for a long download.
This is usually called Video-on-Demand
(VoD). It was introduced
to the University in 1999 and is called
CityVoD.
When CityVoD was first released,
videos were captured at a resolution of
352x288 (called CIF, which is also the
resolution of VCD). Although this resolution
was the best available for VoD services
at the time, it is not good enough for some
of our video materials, such as recordings
of lectures and forums. Since lectures and
forums often include PowerPoint slide shows
and demonstrations, CIF
resolution cannot show all of their
details. To show the content clearly,
we would need to roughly quadruple the pixel
count by moving to 720x576 (called full-D1
or SD, which is also the resolution of DVD-Video).
When we launched CityVoD,
it was nearly impossible to adopt such a
resolution. However, as technology advanced,
we foresaw that streaming full-D1 video
over the net would become possible in the
near future. Therefore, we have kept an eye on
video processing and encoding technology
over the past few years in order to adopt
full-D1 resolution in our video services.
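The jump from CIF to full-D1 can be checked with a little arithmetic. The short Python sketch below only restates the two resolutions given above; the helper name pixel_count is ours and is purely illustrative.

```python
# Pixel counts for the two capture resolutions mentioned in the text.
CIF = (352, 288)        # (width, height), the resolution of VCD
FULL_D1 = (720, 576)    # (width, height), the resolution of DVD-Video


def pixel_count(size):
    """Return the number of pixels in one frame of the given size."""
    width, height = size
    return width * height


ratio = pixel_count(FULL_D1) / pixel_count(CIF)
print(pixel_count(CIF), pixel_count(FULL_D1), round(ratio, 2))
```

Full-D1 carries a little more than four times as many pixels per frame as CIF, which is why it can show slide text that CIF cannot.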
Once powerful servers with large, fast
storage became available, two challenges
remained. The first is
how to convert interlaced
video, as stored on VHS
tapes or DVD-Video discs, into non-interlaced
video clips. This is called de-interlacing
(or line-doubling). The second is how to reduce
the bandwidth of the encoded files so that our
campus network is not overloaded when
many such videos are streamed over it.
Let us look at each in turn.
De-interlacing
Video cameras were originally designed for
interlaced displays such as CRT TVs, which
display images by drawing lines (scanning
lines) from top to bottom in alternating
interlaced fields (see figure 1). Each field
contains half of the scanning lines of the
display. The first field consists of the odd
lines (1, 3, 5 ...), the second field consists
of the even lines (2, 4, 6 ...), and so
on. In the PAL video standard, there are
50 such fields per second, and each field
consists of 288 lines, which gives a total
of 576 lines on the display.
Figure 1: Alternating interlaced fields
of CRT TV
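The line layout of the two fields can be sketched in a few lines of Python. This is only an illustration of which scanning lines belong to which field; the function name split_into_fields and the text labels used as "lines" are our own assumptions. It does not model timing: in real video-based material the two fields are captured at different instants.

```python
PAL_FIELDS_PER_SECOND = 50
PAL_LINES_PER_FIELD = 288


def split_into_fields(frame):
    """Split a list of scanning lines into (odd_field, even_field).

    Lines are numbered from 1 as in the text, so list index 0 is line 1.
    """
    odd_field = frame[0::2]    # lines 1, 3, 5, ...
    even_field = frame[1::2]   # lines 2, 4, 6, ...
    return odd_field, even_field


# A stand-in "frame" whose lines are just labels.
frame = [f"line {n}" for n in range(1, 2 * PAL_LINES_PER_FIELD + 1)]
odd, even = split_into_fields(frame)
print(len(frame), len(odd), len(even))
print(odd[:3], even[:3])
```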
It is often said that PAL video has
25 frames (complete images with all scanning
lines) per second, and that each frame is made up
of two interlaced fields (see figure 2a).
This is INCORRECT. To be exact, PAL video
has 50 unique fields per second, and each
field is an independent snapshot in time
(see figure 2b).
Figure 2a: Traditional thinking of interlaced fields
of a moving circle in video-based materials
Figure 2b: Interlaced fields of a moving circle
in video-based materials
A PC, however, uses a non-interlaced (progressive)
display, which shows all scanning lines
(without alternating fields) in each time
interval. Therefore, to play interlaced
video on a PC (or any progressive display), we
have to convert interlaced fields into non-interlaced
frames (complete images) in order to display
them progressively (see figure 3). This
process is called de-interlacing*.
Figure 3
De-interlacing video-based materials is
not an easy job.
Many de-interlacers on the market cannot
do the job well. Some of them simply scale
each field into a frame by interpolating the "missing"
lines from the lines above and below. This
results in images that look very soft because
vertical resolution is lost. Others merge
two consecutive fields directly to form
a frame. Although this maintains the
image resolution, a moving
object appears to have spiky
lines (like the tines of a comb) sticking out
from its sides. This is usually
called combing or feathering (see figure
4).
Figure 4
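The two simple approaches described above can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the solution we finally adopted: a field is represented as a list of lines, each line is a list of pixel brightness values, and the function names interpolate_field and weave are hypothetical.

```python
def interpolate_field(field):
    """Scale one field to a full frame.

    Each field line is kept, and the "missing" line below it is filled
    with the average of the field lines above and below.
    """
    frame = []
    for i, line in enumerate(field):
        frame.append(list(line))
        below = field[i + 1] if i + 1 < len(field) else line
        frame.append([(a + b) // 2 for a, b in zip(line, below)])
    return frame


def weave(first_field, second_field):
    """Merge two consecutive fields directly by interleaving their lines."""
    frame = []
    for first_line, second_line in zip(first_field, second_field):
        frame.append(list(first_line))
        frame.append(list(second_line))
    return frame


def field_with_bar(column, lines=2, width=8):
    """A tiny test field: a bright vertical bar on a black background."""
    return [[255 if x == column else 0 for x in range(width)]
            for _ in range(lines)]


# The bar moves two pixels to the right between the two fields.
first_field = field_with_bar(2)
second_field = field_with_bar(4)

woven = weave(first_field, second_field)
interpolated = interpolate_field(first_field)

for row in woven:
    print(row)          # the bar alternates between columns 2 and 4
for row in interpolated:
    print(row)          # the bar stays in column 2 on every line
```

In the woven frame the bar sits in a different column on alternate lines, which is the combing effect. The interpolated frame has no combing, but it is built from only half of the available lines.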
* De-interlacing is not needed when encoding
video to CIF resolution, since we can simply
squeeze each interlaced field into a progressive
frame as the resolution is reduced by half.
De-interlacing movies
(film-based materials), on the other hand,
is quite easy since movies themselves are
series of progressive images. Only simple
3:2 pulldown detection is needed to de-interlace
them, which we will not go into
here.
Many well-known, good de-interlacing techniques,
such as Directional Correlational Deinterlacing
(DCDi) by Faroudja, are unfortunately
available only in hardware chipsets and
not in PC software.
After sourcing and studying both commercial
and open-source software solutions for some
time, we finally came up with a reasonably
good solution by enhancing and integrating
several products.
Bandwidth
As you may know, the bandwidth of DVD-Video
is about 5 to 6 Mb/s on average. This may
congest our campus network if a
lot of such videos are streamed over it.
To tackle this problem, we tested
the most common encoding methods, e.g. Windows
Media, RealMedia, MPEG-4,
etc., to find an efficient one
that can encode video at a much
lower bandwidth without sacrificing
video quality. Thanks to the release of
RealMedia version 10, a good-quality (well
developed) video at full-D1 resolution
can be encoded at a bandwidth as low as 1 Mb/s.
If the video is of low quality, a bandwidth
of 2 Mb/s is needed.
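To see why the lower bandwidth matters, the rough Python sketch below counts how many simultaneous streams fit into one network link. The 100 Mb/s link capacity is a hypothetical figure chosen only for illustration, and the calculation ignores protocol overhead and all other traffic; the stream rates are the ones quoted above, with 5.5 Mb/s taken as the midpoint of the DVD-Video range.

```python
def max_concurrent_streams(link_mbps, stream_mbps):
    """Number of whole streams that fit into a link of the given capacity."""
    return int(link_mbps // stream_mbps)


LINK_MBPS = 100  # hypothetical link capacity, not a figure from our network

STREAM_RATES_MBPS = {
    "DVD-Video (about 5.5 Mb/s)": 5.5,
    "RealMedia 10, low-quality source (2 Mb/s)": 2.0,
    "RealMedia 10, good-quality source (1 Mb/s)": 1.0,
}

for label, rate in STREAM_RATES_MBPS.items():
    print(label, max_concurrent_streams(LINK_MBPS, rate))
```

Under these assumptions the same link carries several times as many viewers at 1 Mb/s as it would at DVD-Video rates.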
Higher bandwidth is needed for low-quality
video mainly because of the
noise (often loosely called
mosquito noise) contained in video frames.
Most encoded video formats do not
store every frame individually, but store the
differences between frames. If a video contains
a certain degree of noise, the differences
between frames will be large and require
more bandwidth to encode. Many
of our recordings were captured indoors (such
as in classrooms and lecture theatres) without
special lighting, which inevitably introduces
a certain degree of noise into the recorded
videos. To keep the required bandwidth to
a minimum, we tried applying de-noise filters
to this kind of video. Unfortunately, the simple
de-noise filters available in common video
encoders may also remove details along with
the noise. Thanks to many video enthusiasts
all over the world, a lot of open-source
de-noise filters are available to tackle
different kinds of noise. Finally, a filter
named "high quality three-dimensional
de-noise filter" was found to work
very well on most of our video materials.
With the help of the de-noise filter, we
can always keep all the video materials
to a bandwidth of about 1 Mb/s.
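The link between noise and bandwidth can be illustrated with a small Python sketch. It is a toy model under our own assumptions: a "frame" is a flat list of brightness values, the scene is completely static, and temporal_smooth is a simple recursive average written for this example. It is not the open-source filter named above, which is considerably more sophisticated.

```python
import random


def mean_abs_difference(frame_a, frame_b):
    """Average absolute pixel difference between two frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def add_noise(frame, rng, amplitude):
    """Add uniform random noise to every pixel, clamped to 0..255."""
    return [min(255, max(0, p + rng.randint(-amplitude, amplitude)))
            for p in frame]


def temporal_smooth(frames, strength=0.75):
    """Blend each frame with the previous smoothed frame (recursive average)."""
    smoothed = [[float(p) for p in frames[0]]]
    for frame in frames[1:]:
        previous = smoothed[-1]
        smoothed.append([strength * p + (1 - strength) * c
                         for p, c in zip(previous, frame)])
    return smoothed


rng = random.Random(0)                 # fixed seed so the run is repeatable
static_scene = [128] * 1000            # a flat grey frame that never changes
clean = [list(static_scene) for _ in range(20)]
noisy = [add_noise(static_scene, rng, 10) for _ in range(20)]
denoised = temporal_smooth(noisy)

clean_diff = mean_abs_difference(clean[-2], clean[-1])
noisy_diff = mean_abs_difference(noisy[-2], noisy[-1])
denoised_diff = mean_abs_difference(denoised[-2], denoised[-1])
print(clean_diff, noisy_diff, denoised_diff)
```

Although nothing in the scene moves, the noisy frames differ from one another, so an encoder that stores frame differences has to spend bandwidth on the noise. Smoothing over time shrinks those differences. On moving content such a naive average would also smear motion, which is why a more careful filter is needed in practice.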
Reminders
Video streaming at full-D1 resolution
is now available in our video services.
To experience what this new standard can
bring you, simply visit our CityVoD web
pages (select "CityVoD" in the
"News Services" box in the
e-Portal) and click on the "high resolution
version" icon to the right of your
desired video. Please note that this version
of the video is available on on-campus machines
only. For smooth playback, it also requires
a Pentium III 1GHz processor with at least
256MB of system memory and RealPlayer 10 installed.