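|Abstract: ||Conventional video coding technologies, such as ITU-T H.26x and ISO/IEC MPEG-x [2-9], are asymmetric between encoder and decoder: the encoder is 5 to 10 times as complex as the decoder. This large gap in complexity matters little for conventional broadcast-like systems or video distribution networks, because such systems encode once and decode many times. The situation is different for other applications, such as visual sensor networks, wireless mobile phones, and low-power surveillance systems, which need many encoders but only a single decoder. Conventional video coding is poorly suited to these applications because it raises the cost of the terminal devices; a better approach is to design a different encoder/decoder pair that shifts the encoder's complexity to the decoder. Distributed Video Coding (DVC) has exactly this property, can effectively replace conventional video coding, and is the best choice for achieving this goal [10-13]. A wireless video transmission network built on DVC can indeed reduce both the video coding complexity and the cost of terminal devices. The concept of conventional DVC originates from two important information-theoretic theorems: Slepian-Wolf (SW) and Wyner-Ziv (WZ). The SW theorem is a lossless coding theorem; it is applied in channel coding (error control coding), where the receiver can correct errors introduced during transmission. The WZ theorem extends the SW theorem to the lossy case and is applied mainly in source coding. The main function of DVC is to shift the high complexity of the encoder to the decoder, thereby reducing the cost of video terminal devices. The DVC wireless video transmission network architecture we propose is mainly an improvement on previously published reference designs. Conventional DVC wireless video transmission networks address only the complexity of the video compression encoder and therefore yield only a low-complexity encoder. In this project, besides reducing encoder complexity, we also improve the decoder: on the downlink we adopt the Decoding-Friendly Encoder Design (DFED) concept [14-18] instead of compressing video with a conventional technique (H.264/AVC). This lets us design video encoders and decoders that are simpler and lower in cost than those of conventional wireless video transmission networks while preserving good video communication quality. In addition, the DVC encoder on the network side uses the better-performing DVC encoder we recently proposed, the padding-based DVC architecture [19-20]. We can therefore design low-complexity video encoding and decoding terminal devices and apply them to low-power surveillance systems. In summary, in this two-year sub-project we take low-power security surveillance systems as our topic and, together with the other sub-projects, improve our recently completed indestructible wireless network security surveillance system based on the padding-based distributed video coding (Padding-Based DVC) architecture. Such a system is well suited to environments that house valuable collections and facilities, such as museums and libraries.|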
Conventional video coding technologies, e.g., ITU-T H.26x and ISO/IEC MPEG-x [2-9], rest on two principles, predictive coding and transform coding, which represent the video signal more compactly and thus achieve more efficient coding without degrading the decoded signal quality. H.264/AVC, for example, supports spatial intra prediction in addition to inter prediction, and its enhanced inter prediction features include multiple reference frames, variable block-size motion compensation, and quarter-pixel precision. This design, which implies complex encoders and lightweight decoders, is well suited to broadcast-like applications in which a single sender transmits data to many receivers. It usually results in unbalanced computational complexity (CC): the encoder is 5 to 10 times as complex as the decoder. This asymmetry becomes a major disadvantage for a growing number of emerging applications, such as (1) low-power sensor networks, (2) wireless video surveillance cameras, and (3) mobile communication devices, which rely instead on an upstream model. In that model, many clients, often mobile, low-power, and with limited computing resources, transmit data to a central server, so it is usually advantageous to have lightweight encoding with high compression efficiency and resilience to transmission errors. To meet this goal, a new coding paradigm, Distributed Video Coding (DVC) [10-13], has emerged based on two information-theoretic theorems from the 1970s (1973 and 1976): (1) Slepian-Wolf (SW) and (2) Wyner-Ziv (WZ). The SW theorem establishes lower bounds on the achievable rates for the lossless coding of two or more correlated data streams. The WZ theorem shows that this result still holds for lossy coding with side information at the decoder, under the assumption that the sources are jointly Gaussian.
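For reference, the two bounds mentioned above are commonly stated as follows. This is a summary in standard textbook notation, not a formulation taken from this proposal: the symbols $X$ and $Y$ (two correlated sources), $R_X$ and $R_Y$ (their coding rates), and $D$ (the allowed distortion) are introduced here for illustration.

```latex
% Slepian-Wolf: separate encoding, joint decoding, lossless recovery
\begin{align}
R_X       &\ge H(X \mid Y), \\
R_Y       &\ge H(Y \mid X), \\
R_X + R_Y &\ge H(X, Y).
\end{align}

% Wyner-Ziv: lossy coding of X with side information Y at the decoder only.
% R_{X|Y}(D) is the rate needed when Y is also available at the encoder.
\begin{equation}
R^{\mathrm{WZ}}_{X \mid Y}(D) \;\ge\; R_{X \mid Y}(D),
\end{equation}
% with equality when X and Y are jointly Gaussian and the distortion
% measure is the mean squared error.
```

The sum-rate bound $H(X, Y)$ is the same rate a joint encoder would need, which is why the encoders can, in principle, work independently without a rate penalty.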
Building on these two theorems, three major groups developed DVC architectures: Stanford's WZ video coding, Berkeley's PRISM, and Europe's DISCOVER. DVC offers several potential advantages that make it well suited to the emerging upstream applications mentioned above. First, in contrast to conventional video codecs, it shifts the complexity balance between encoder and decoder. Second, because of its intrinsic joint source-channel coding framework, DVC is robust to channel errors. Third, because it does not rely on a prediction loop, DVC provides codec-independent scalability. Finally, DVC is well suited to multi-view coding, since it exploits the correlation between views without requiring communication between the cameras, which may be an important architectural advantage. Conventional DVC-based transmission network systems consider only the encoder, with the aim of obtaining a low-complexity encoder. In this project we propose a new video transmission network scheme that also improves the decoder. We can therefore obtain simpler and lower-cost encoders and decoders for terminal (portable) devices than those in traditional wireless video transmission networks. In this multi-year project, we will apply the padding-based DVC codec that we recently proposed [19-20] to wireless video surveillance systems. Padding-based DVC can further reduce the CC and hardware cost of the encoder. Its major difference from the three traditional DVC codecs mentioned above is that it does not divide frames into WZ frames and key frames. Together with the other sub-projects, we aim to build an Indestructible Video Surveillance Wireless Network System with New Low-Cost DVC Paradigm.
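The shift of complexity from encoder to decoder described above can be illustrated with a classic toy Slepian-Wolf example. This is a minimal sketch and not the padding-based DVC codec or any of the three architectures named above. It assumes two 3-bit words, a source word x and side information y, that differ in at most one bit; the function names `syndrome`, `decode`, and `check_all` are hypothetical. The encoder sends only a 2-bit syndrome of x (two XOR operations), and the decoder does the search using y.

```python
from itertools import product

def syndrome(x):
    """Encoder: 2-bit syndrome of a 3-bit word.

    The two parity checks are those of the 3-bit repetition code, so each
    syndrome value identifies a coset of two words at Hamming distance 3.
    """
    return (x[0] ^ x[1], x[1] ^ x[2])

def decode(s, y):
    """Decoder: pick the word in the coset with syndrome s that is closest to y."""
    candidates = [w for w in product((0, 1), repeat=3) if syndrome(w) == s]
    return min(candidates, key=lambda w: sum(a != b for a, b in zip(w, y)))

def check_all():
    """Verify recovery for every x and every y within one bit flip of x."""
    for x in product((0, 1), repeat=3):
        for flip in (None, 0, 1, 2):
            y = list(x)
            if flip is not None:
                y[flip] ^= 1
            if decode(syndrome(x), tuple(y)) != x:
                return False
    return True

# Example: x is never sent in full; only its 2-bit syndrome is transmitted.
x = (1, 0, 1)
y = (1, 1, 1)  # side information at the decoder, one bit away from x
recovered = decode(syndrome(x), y)
```

The encoder sends 2 bits instead of 3 without ever seeing y, while all of the search effort sits in `decode`, which mirrors the lightweight-encoder, heavier-decoder balance that DVC aims for.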
In the first year, we focus on Mutual Bi-Directional Frames Coding at Decoder for the padding-based DVC. In the second year, we will focus on the software enhancement of the wireless network framework with a modified Decoding-Friendly Encoder Design (DFED) [14-18]. To simplify the downlink, we adopt DFED, which reduces part of the encoder-side complexity of conventional video coding (for example, a complex standard such as H.264/AVC) and thereby yields a lower-complexity decoder in the terminal (portable) device. To this end, DFED restricts the tools used at the encoder: for instance, a fixed (non-variable) block size for motion estimation, a single reference frame, integer-pixel motion search, and a larger block size [14-18]. With these restrictions applied at the encoder on the network side, a simpler video decoder can be obtained. On the uplink, we adopt our recently proposed padding-based DVC encoder. Combined with the DFED decoder described above, this gives a simpler, low-cost encoder/decoder for terminal (portable) devices in wireless video transmission networks, intended for Wireless Video Surveillance Network Systems. If time permits, we will also study the effects of noise and fading in the wireless channel on the performance of the padding-based DVC codec, make the necessary modifications to the decoding algorithm, and propose a MIMO-based diversity scheme that exploits multipath propagation to improve system performance.
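The encoder restrictions listed above (fixed block size, a single reference frame, integer-pixel motion search) can be sketched as a plain full-search block matcher. This is an illustrative sketch only, not the DFED design of [14-18]: the function names `sad`, `integer_pel_search`, and `pattern`, the sum-of-absolute-differences cost, the search range, and the synthetic frames are all assumptions made here.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for r in range(n):
        for c in range(n):
            total += abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
    return total

def integer_pel_search(cur, ref, bx, by, n=16, search=4):
    """Full search over integer displacements in [-search, search].

    Restrictions assumed for illustration: one reference frame, one fixed
    block size n, and no sub-pixel interpolation.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

def pattern(x, y):
    """Synthetic pixel value used to build test frames (hypothetical)."""
    return (7 * x + 13 * y + x * y) % 251

# Usage: the current frame is the reference frame shifted by (2, 1).
SIZE = 24
ref_frame = [[pattern(x, y) for x in range(SIZE)] for y in range(SIZE)]
cur_frame = [[pattern(x + 2, y + 1) for x in range(SIZE)] for y in range(SIZE)]
mv, cost = integer_pel_search(cur_frame, ref_frame, bx=8, by=8, n=8, search=3)
```

On these synthetic frames the search recovers the displacement (2, 1) with zero cost. Because the motion vectors are integer-valued and refer to one frame with one block size, a matching decoder would need neither sub-pixel interpolation filters nor multi-frame buffers, which is the kind of decoder simplification the restrictions aim for.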