Chapter I: Introduction

In the past few years, the rise of the pan-network audio-visual field, represented by online video, has been one of the most striking developments in the fast-growing Internet economy. The field has been a basic Internet application since the start of the 21st century and a main carrier of public cultural life, and it also plays an important role in China's shift from old to new growth drivers. According to the '2021 China Network Audio-Visual Development Report' released by the China Network Audio-Visual Program Service Association, as of December 2020 China had 944 million network audio-visual users, and the pan-network audio-visual industry exceeded 600 billion yuan in 2020. However, since the birth of pan-network audio-visual services, piracy has spread like a cancer, and the harm it causes grows by the day. Strengthening online copyright protection is an urgent and arduous task.

Copyright protection technology refers to the technology used to establish proof of ownership, monitor for infringement, and collect evidence of it. Current copyright protection applications of blockchain, artificial intelligence, and digital watermarking focus mainly on copyright confirmation, monitoring, and evidence collection. Digital watermarking is especially valuable in the confirmation and monitoring stages. Beyond detecting infringement, a digital watermark can trace an infringing copy back to its source, which goes further than the other technologies.

Digital watermarking technology embeds copyright information, unique identifiers, and other information into the carrier of a digital work, visibly or invisibly, in order to prove the source of the work. Invisible (hidden) watermarks cannot be seen by the naked eye but can be detected by algorithms, and they can resist a certain degree of cutting, splicing, and editing. However, as piracy techniques keep improving, the robustness of traditional hidden watermarking in complex attack scenarios faces tougher challenges. Attackers can use complex and diverse editing techniques to damage the protected carrier so that extraction of the copyright watermark fails. In response to this problem, this paper introduces a method that uses deep learning to enhance the robustness of hidden watermarking. As shown in the figure below, the method effectively improves the robustness of hidden watermarking under various complex attacks while keeping computation efficient and lightweight.

[Figure: deep-learning-enhanced hidden watermark robustness under complex attacks]

Chapter II: Technical Background

1. Steganography

Steganography generally refers to the process of embedding hidden information into carriers such as images or videos. Most steganography algorithms embed the information in the spatial domain. In recent years, image steganography has become increasingly diverse, moving from the earliest LSB and LSB-Match methods to content-adaptive steganography such as HUGO[1] (a spatial adaptive steganography algorithm), WOW[2], and S-UNIWARD[3], and on to today's deep learning steganography. Steganographic algorithms can now automatically embed hidden information into texture-rich and noise-rich image areas while preserving the complex high-order statistical characteristics of the image.
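To make the earliest method in this list concrete, the following is a minimal sketch of LSB embedding on a flat list of 8-bit grayscale values. The pixel values, the message bits, and the function names are illustrative assumptions, not part of any algorithm described in this article.

```python
def embed_lsb(pixels, bits):
    """Hide each message bit in the least significant bit of one pixel."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the carrier")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read the message back from the lowest bit of the first n_bits pixels."""
    return [p & 1 for p in pixels[:n_bits]]


# Hypothetical carrier values and message, chosen only for illustration.
carrier = [52, 55, 61, 66, 70, 61, 64, 73]
message = [1, 0, 1, 1, 0, 0, 1, 0]

stego = embed_lsb(carrier, message)
# Each pixel changes by at most one gray level, which is why the change is invisible.
difference = [s - c for s, c in zip(stego, carrier)]
print(stego)
print(difference)
```

Because only the lowest bit changes, plain LSB embedding is easy to hide from the eye but also easy to destroy, which is one reason content-adaptive and learned methods followed.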

2. Steganalysis

Steganalysis is a technology that analyzes the statistical characteristics of an image to determine whether additional information is hidden in it, and in some cases to estimate the amount of embedded information or recover the hidden content. Current research usually treats steganalysis as a binary classification problem whose goal is to distinguish carrier images from stego images. The following figure shows an example of steganalysis (the images come from the BOSSbase_1.0.1 dataset).

[Figure: steganalysis example. Left: carrier image; middle: stego image; right: difference image between the carrier image and the stego image.]

Steganalysis methods generally fall into two categories. The first is steganalysis based on traditional hand-crafted features, which consists of feature extraction, feature enhancement, and a feature classifier. Feature extraction and enhancement are decisive for the subsequent training of the classifier, yet feature selection depends heavily on manual work, which makes these methods time-consuming and poorly robust. Representative models include SPAM[4], SRM[5], and DCTR[6]. The second is steganalysis based on deep learning, whose models are mainly divided into semi-learning models and fully learning models. A semi-learning model relies on the 30 SRM filter kernels as a preprocessing layer for the network; representative networks include Xu-Net[7] and Ye-Net[8]. A fully learning model relies entirely on the learning ability of the deep neural network to extract important residual features from complex pixel information; a representative network is SRNet[9]. Fully learning deep networks achieve higher detection accuracy and better robustness than semi-learning ones.
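The idea of a fixed high-pass preprocessing layer can be sketched as follows. The example applies the 5x5 "KV" kernel, a high-pass filter widely used for steganalysis residuals, to a tiny grayscale image stored as a list of rows. It is an illustration under simplifying assumptions (one kernel instead of 30, pure Python, valid-mode filtering), not the preprocessing of any specific network cited above.

```python
# 5x5 KV high-pass kernel; its coefficients sum to zero, so flat regions vanish.
KV_KERNEL = [
    [-1,  2,  -2,  2, -1],
    [ 2, -6,   8, -6,  2],
    [-2,  8, -12,  8, -2],
    [ 2, -6,   8, -6,  2],
    [-1,  2,  -2,  2, -1],
]


def residual(image, kernel=KV_KERNEL, scale=12.0):
    """Valid-mode 2D filtering of a grayscale image given as a list of rows."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    out = []
    for y in range(height - k + 1):
        row = []
        for x in range(width - k + 1):
            acc = 0
            for dy in range(k):
                for dx in range(k):
                    acc += kernel[dy][dx] * image[y + dy][x + dx]
            row.append(acc / scale)
        out.append(row)
    return out


# Hypothetical 5x5 patches: a flat carrier and a copy with one pixel raised by 1.
flat_patch = [[100] * 5 for _ in range(5)]
stego_patch = [row[:] for row in flat_patch]
stego_patch[2][2] += 1

print(residual(flat_patch))   # image content is suppressed
print(residual(stego_patch))  # the one-level change survives in the residual
```

The filter removes smooth image content and keeps the small pixel-level perturbations, which is the signal a steganalysis classifier then has to learn from.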

3. Digital Watermarking

Digital watermarking technology refers to embedding specific encoded information into a digital signal, which may be audio, image, video, and so on. If a signal carrying a digital watermark is copied, the embedded information is copied with it. Digital watermarking is a content-based, non-cryptographic information hiding technology, and it is an effective means of protecting information security, supporting anti-counterfeiting and traceability, and protecting copyright. Digital watermarks are generally divided into visible watermarks and hidden watermarks. Hidden watermarks are markers added to carrier data (audio, video, etc.) that are generally imperceptible to the human eye and hard to find without the matching detection algorithm. One of their important applications is copyright protection, where they are used to deter or stop unauthorized copying of digital media.

4. Watermark Detection

Hidden watermarks are generally detected in one of two ways. The first is autocorrelation-based detection, in which the detection algorithm is derived from the correlation function produced by the watermark embedding algorithm. The second is template matching, which borrows the idea of template matching from image processing: a template is defined when the watermark is added, and the watermark is embedded through that template. At detection time, the similarity between the image under test and the template is computed; if the similarity exceeds a set threshold, a watermark is reported as detected, and otherwise the image is treated as unwatermarked.
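The threshold test described above can be sketched with a simple additive scheme. Everything here is an assumption made for illustration: the key-derived +/-1 template, the embedding strength, the synthetic carrier, and the threshold value are hypothetical and do not describe the production algorithms.

```python
import random


def make_template(n, key):
    """Pseudo-random +/-1 pattern derived from a secret key (hypothetical scheme)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def embed(pixels, template, strength=3):
    """Add the scaled template to the carrier (no clipping, to keep the sketch short)."""
    return [p + strength * t for p, t in zip(pixels, template)]


def similarity(pixels, template):
    """Normalized correlation between the mean-removed signal and the template."""
    mean = sum(pixels) / len(pixels)
    centered = [p - mean for p in pixels]
    dot = sum(c * t for c, t in zip(centered, template))
    norm = (sum(c * c for c in centered) * len(template)) ** 0.5
    return dot / norm if norm else 0.0


def detect(pixels, template, threshold=0.1):
    """Report a watermark when the similarity exceeds the threshold."""
    return similarity(pixels, template) > threshold


# Synthetic low-texture carrier, used only so the example is self-contained.
rng = random.Random(0)
carrier = [int(rng.gauss(128, 10)) for _ in range(4096)]
template = make_template(len(carrier), key=42)
marked = embed(carrier, template)

print(detect(marked, template), detect(carrier, template))
```

A detector of this kind depends on the pixels staying aligned with the template, so cropping, scaling, or re-encoding can push the similarity below the threshold. That fragility is the motivation for the learned detector discussed later.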

5. Connections and Distinctions

Steganography and steganalysis: steganography places more emphasis on the concealment of the embedded information, that is, on how to embed it so that an adversary cannot detect the stego image. The stego image is usually assumed to be transmitted without loss. Steganalysis aims to determine whether an image is a stego image or an original image without damaging the carrier data.

Digital watermarking: digital watermarking places more emphasis on the robustness of the embedded information. Watermarked carriers are attacked during dissemination, for example through compression, cropping, scaling, and editing. The digital watermark must remain effective in the face of such attacks, which is an essential premise for copyright protection.

Chapter III: Deep Learning for Hidden Watermark Recognition

Compared with digital steganography, hidden watermarking requires not only that the watermark be concealed but also, and more importantly, that the watermark information be robust. In real-world scenarios, hidden watermark carriers encounter many complex and unknown attacks, which usually destroy some or all of the watermark features and ultimately make the watermark impossible to detect or extract. Traditional watermark detection methods mostly rely on correlation detection, template extraction, and similar techniques to decide whether a carrier contains a watermark. These methods are less effective against complex attacks, and because different hidden watermarks add very different features, designing a separate analysis and detection scheme for each watermarking method takes a great deal of time and effort. Deep learning has a natural advantage here. We can simulate real-world attacks during training to enhance robustness, and train on a mixture of multiple watermarking algorithms to improve the generalization ability of the model.

1. Dataset Construction

Traditional datasets have defects such as a single training image size and small scale, so we constructed an original carrier dataset containing 1000 videos and 200,000 images. The carrier dataset is made as diverse as possible and includes videos and images of many styles, such as movies, people, landscapes, science and technology, music, and cartoons. From this dataset we created a hidden watermark dataset that covers various video and image watermarking algorithms. Finally, we merged the original carrier set and the watermark set to form our training set.

The quality of the dataset directly affects the final expressive ability of the model. We therefore clean the training set, using several image quality models to filter out low-quality carriers. To fully verify the generalization ability of the model, we use real data accumulated in production as the validation set, and we annotate and augment it. Some complex transformations are applied to the validation set to simulate the complex and unknown forms of attack seen in real life.

2. Model Training

2.1 Model

Considering both accuracy and performance, we chose the lightweight MobileNetV3[10] series, MobileNetV3_small and MobileNetV3_large, as candidate models, and adjusted the architecture to make it better suited to the watermark recognition task. The MobileNet series has shown excellent results in various computer vision tasks. It uses depthwise separable convolutions to build lightweight deep neural networks, which effectively balances latency and accuracy. To compare deep models from the computer vision field with deep models from image steganalysis, we also selected SRNet as a candidate model.
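The saving from depthwise separable convolutions can be checked with a short parameter count. This is a worked sketch with hypothetical layer sizes (a 3x3 kernel, 64 input channels, 128 output channels) and with biases ignored; the numbers are not taken from the models trained in this article.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


k, c_in, c_out = 3, 64, 128   # hypothetical layer sizes
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)

print(std, sep, round(std / sep, 2))   # 73728 8768 8.41
```

The ratio of separable to standard weights is 1/c_out + 1/k^2, so for 3x3 kernels the layer is roughly eight to nine times smaller, which is where most of the latency advantage comes from.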

The table below shows the results of preliminary training of the three candidate models on the test set under the same experimental environment. We examined both the speed and the accuracy of the models. MobileNetV3_large is superior to SRNet in both accuracy and speed, so it was selected as the base model for hidden watermark recognition.

2.2 Training

Robustness is the indicator we care about most in hidden watermark detection. After stealing a work, infringers may apply a series of modifications, obfuscations, and transformations. Our hidden watermarks therefore face many forms of attack: common ones such as translation, flipping (mirroring), Gaussian blur, color jittering, affine transformation, and random cropping, and complex ones such as splicing, image mixing, image cutting and pasting, information compression, and format conversion. To make the model robust to these transformations at detection time, we simulate the attack transformations that the data may encounter during network transmission and use them for data augmentation in the training phase, which further enhances the generalization ability of the model. The table below shows the generalization ability of the model on the validation set under different data augmentation scenarios.

[Table: generalization ability of the model on the validation set under different data augmentation scenarios]

Finally, mixed data augmentation is used during the training phase. Probability-based flipping, translation with padding, compression at different ratios, image mixing, and other operations are applied to the data first, and the data is then randomly cropped, while ensuring that the hidden watermark features are still present in the data after transformation.
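The ordering described above (probabilistic transformations first, random cropping last) can be sketched on a grayscale image stored as a list of rows. The probabilities, the shift range, and the crop size are hypothetical, and compression and image mixing are left out to keep the example short, so this is an outline of the idea rather than the training pipeline itself.

```python
import random


def hflip(img):
    """Mirror the image horizontally."""
    return [row[::-1] for row in img]


def translate_pad(img, dx, fill=0):
    """Shift each row right by dx pixels (left if dx is negative), padding with fill."""
    width = len(img[0])
    if dx >= 0:
        return [[fill] * dx + row[:width - dx] for row in img]
    return [row[-dx:] + [fill] * (-dx) for row in img]


def random_crop(img, size, rng):
    """Cut a size x size window at a random position."""
    height, width = len(img), len(img[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in img[top:top + size]]


def augment(img, rng, crop_size, p_flip=0.5, p_shift=0.5, max_shift=2):
    """Apply each transformation with its own probability, then always crop."""
    if rng.random() < p_flip:
        img = hflip(img)
    if rng.random() < p_shift:
        img = translate_pad(img, rng.randint(-max_shift, max_shift))
    return random_crop(img, crop_size, rng)


# Hypothetical 8x8 image whose pixel values encode their own position.
image = [[10 * y + x for x in range(8)] for y in range(8)]
sample = augment(image, random.Random(0), crop_size=6)
print(len(sample), len(sample[0]))   # 6 6
```

In practice each augmented sample would also need a check that enough of the watermark survives, since a label of "watermarked" on a crop that no longer contains the signal would teach the model the wrong thing.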

3. Generalization

A suitable optimizer paired with an appropriate learning rate decay strategy can speed up convergence and improve the model's ability to learn features. We use the AdamW optimizer, which adds a weight decay penalty, together with a cosine decay strategy, and this combination achieved good accuracy on both the test set and the validation set. Specifically, we trained the MobileNetV3_large model on the collected training set with the AdamW optimizer and the CosineAnnealingWarmRestarts learning rate schedule, and it reached an accuracy of 97.15% on the test set.
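The shape of a cosine schedule with warm restarts can be written out directly from its standard formula, lr = eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * t_cur / t_i)). The sketch below is a plain-Python illustration evaluated at whole epochs; the base learning rate and the cycle length are hypothetical, because the article does not state the hyperparameters that were used.

```python
import math


def cosine_warm_restarts(epoch, base_lr, t_0, t_mult=1, eta_min=0.0):
    """Learning rate at an integer epoch under cosine annealing with warm restarts."""
    t_i, t_cur = t_0, epoch
    while t_cur >= t_i:       # skip over the cycles that have already finished
        t_cur -= t_i
        t_i *= t_mult         # each new cycle is t_mult times longer
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_i))


# Hypothetical settings: start at 1e-3 and restart every 10 epochs.
schedule = [cosine_warm_restarts(e, base_lr=1e-3, t_0=10) for e in range(25)]
for epoch in (0, 5, 9, 10):
    print(epoch, f"{schedule[epoch]:.6f}")
```

The rate decays smoothly within each cycle and jumps back to the base value at each restart, which gives the optimizer repeated chances to leave a poor local minimum.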

In business scenarios involving various unknown combined attacks, our model achieves an overall accuracy of 92.08%. When the watermark detection model is connected in series with the watermark extraction model, watermark processing becomes more than twice as fast without sacrificing accuracy. When the two are connected in parallel, the robustness of the watermarking algorithm in complex attack scenarios improves significantly.
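One plausible reading of the series connection is a gate: the lightweight detector runs on every input, and the more expensive extractor runs only on inputs the detector flags. The sketch below illustrates that reading with stand-in functions; `process`, `fake_detect`, `fake_extract`, the score field, and the 0.5 threshold are all hypothetical and do not describe the actual integration.

```python
def process(frames, detect, extract, threshold=0.5):
    """Run the expensive extractor only on frames the detector flags."""
    results = {}
    for index, frame in enumerate(frames):
        if detect(frame) >= threshold:
            results[index] = extract(frame)
    return results


# Stand-ins for the two models, used only to show the control flow.
extract_calls = []


def fake_detect(frame):
    return frame["score"]


def fake_extract(frame):
    extract_calls.append(frame["id"])
    return frame["payload"]


frames = [
    {"id": 0, "score": 0.9, "payload": "owner-42"},
    {"id": 1, "score": 0.1, "payload": None},
    {"id": 2, "score": 0.7, "payload": "owner-42"},
]

found = process(frames, fake_detect, fake_extract)
print(found)           # {0: 'owner-42', 2: 'owner-42'}
print(extract_calls)   # [0, 2]
```

The speed-up in such a design comes from skipping extraction on unwatermarked inputs, so its size depends on how much of the traffic actually carries a watermark.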

Chapter IV: Summary

Digital watermarking is an important means of protecting the legitimate rights and interests of creators. To avoid the risk of being caught and to profit from infringement, pirates may edit original works in many ways, so the embedded digital watermark must keep working under such unknown conditions and provide continuous protection for creators. When carrier data is maliciously modified, the watermark in it may become unrecognizable, which seriously undermines the robustness of copyright protection technology. Deep learning allows the model to understand features of the hidden watermark that humans cannot perceive, helping us recover damaged digital watermark information and effectively improving the robustness and reliability of hidden watermarking in real-world scenarios.

The algorithms mentioned in this paper have been deployed in sensitive scenarios involving videos, images, and web pages on TikTok, Feishu, Toutiao, and Xigua Video, with good results. Among them, Feishu has applied hidden watermark algorithms in all scenarios. In practice, hidden watermark algorithms strengthen internal information security management for Feishu customers and deter leaks through screenshots and photographs of screens. In addition, hidden watermarks help enterprise users achieve copyright protection and trace the distribution chain, with advantages such as high accuracy, high effectiveness, strong resistance to attacks, and a seamless experience, giving users comprehensive protection from the physical level to the application level.

In the future, the related watermark capabilities will appear in the Volcano Engine Cloud Security product matrix, where they will serve Volcano Cloud customers by addressing copyright issues and tracing data leaks.

Chapter V: References

[1] Pevný T, Filler T, Bas P. Using high-dimensional image models to perform highly undetectable steganography. International Workshop on Information Hiding. Springer, Berlin, Heidelberg, 2010: 161-177.
[2] Holub V, Fridrich J. Designing steganographic distortion using directional filters. 2012 IEEE International Workshop on Information Forensics and Security (WIFS). IEEE, 2012: 234-239.
[3] Holub V, Fridrich J. Digital image steganography using universal distortion. Proceedings of the first ACM workshop on Information hiding and multimedia security. 2013: 59-68.
[4] Pevný T, Bas P, Fridrich J. Steganalysis by subtractive pixel adjacency matrix. IEEE Transactions on Information Forensics and Security, 2010, 5(2): 215-224.
[5] Fridrich J, Kodovsky J. Rich models for steganalysis of digital images. IEEE Transactions on Information Forensics and Security, 2012, 7(3): 868-882.
[6] Holub V, Fridrich J. Low-complexity features for JPEG steganalysis using undecimated DCT. IEEE Transactions on Information Forensics and Security, 2014, 10(2): 219-228.
[7] Xu G, Wu H Z, Shi Y Q. Structural design of convolutional neural networks for steganalysis. IEEE Signal Processing Letters, 2016, 23(5): 708-712.
[8] Ye J, Ni J, Yi Y. Deep learning hierarchical representations for image steganalysis. IEEE Transactions on Information Forensics and Security, 2017, 12(11): 2545-2557.
[9] Boroumand M, Chen M, Fridrich J. Deep residual network for steganalysis of digital images. IEEE Transactions on Information Forensics and Security, 2018, 14(5): 1181-1193.
[10] Howard A G, et al. Searching for MobileNetV3. 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019: 1314-1324.