
5. Watermark Embedding and Detection

Watermark Embedding

In the context of watermarking, the embedding process is where a watermark is inserted into a host asset (such as an image, audio, or video) while maintaining the quality and integrity of the original content. Depending on the strategy, the watermark embedding process can either be independent of the host asset or can adapt to it. The latter, known as informed watermarking, adjusts the embedding to the specific characteristics of the host, optimizing invisibility and robustness.

Additive Watermarking

A simple and widely used embedding strategy is additive watermarking, where the watermark signal is directly added to the host signal. The formula used for additive watermarking is:

$$f_{w,i} = f_i + \gamma w_i$$

Where:

  • $f_{w,i}$ is the $i$-th watermarked feature,
  • $f_i$ is the $i$-th feature of the host asset,
  • $w_i$ is the $i$-th sample of the watermark signal,
  • $\gamma$ is a scaling factor controlling the watermark strength.

This approach is simple and flexible, allowing the watermark strength to be adapted to the local characteristics of the host. For instance, the watermarking strength can be locally adjusted as:

$$f_{w,i} = f_i + \gamma_i w_i$$

Or, in cases where the watermark strength depends on the host feature $f_i$ itself, it can be expressed as:

$$f_{w,i} = f_i + \gamma_i w_i f_i$$

This technique assumes that the host features follow a Gaussian distribution and that the primary form of attack is the addition of white Gaussian noise (AWGN). Under these conditions, correlation-based decoding becomes optimal, minimizing the overall error probability for a given false detection rate.
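As a concrete illustration, here is a minimal sketch of the three additive variants in Python/NumPy. The feature vector, watermark, strength values, and the moving-average heuristic used for the local strengths $\gamma_i$ are all illustrative choices, not prescribed by the theory.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Host features (e.g., transform coefficients) and a bipolar watermark.
f = rng.normal(loc=0.0, scale=10.0, size=1000)   # host feature vector
w = rng.choice([-1.0, 1.0], size=f.size)         # pseudorandom watermark

gamma = 0.5                                      # global embedding strength

# 1) Plain additive embedding: f_w = f + gamma * w
fw_global = f + gamma * w

# 2) Locally adapted strength: gamma_i varies with local host activity.
#    Here gamma_i is proportional to a moving average of |f| (one
#    possible heuristic, not a prescribed rule).
local_activity = np.convolve(np.abs(f), np.ones(9) / 9, mode="same")
gamma_i = 0.05 * local_activity
fw_local = f + gamma_i * w

# 3) Host-dependent embedding: the watermark term also scales with f_i.
fw_hostdep = f + gamma_i * w * f
```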

Example of Additive Watermarking in DWT Domain

A well-known application of additive watermarking can be found in the work of Barni et al. (IEEE Transactions on Image Processing, 2001). The authors propose a watermarking method based on the Discrete Wavelet Transform (DWT). The watermark is embedded in the wavelet coefficients of the host image, and the modification is limited to the amount that can be tolerated without degrading the visual quality. This is achieved by exploiting perceptual rules, such as:

  • the eye is less sensitive to noise in the high-resolution (high-frequency) subbands;
  • the eye is less sensitive to noise in highly textured areas;
  • the eye's sensitivity to noise depends on the local brightness of the region.

The watermarking process is further optimized using a weighting function that takes into account the local sensitivity of different regions of the image to noise. This function ensures that the watermark is embedded more strongly in areas where it is less likely to be noticed, thus improving the invisibility of the watermark while maintaining robustness. The method effectively places the watermark in regions where the human eye is less perceptive to changes, allowing it to be hidden more securely.

The watermark itself is a binary pseudorandom sequence, which is embedded in the DWT coefficients of the three largest detail subbands of the image. These subbands correspond to the areas where modifications are less perceptible, ensuring that the watermark is well hidden. The DWT is particularly effective in this context, as it allows the identification of image regions where disturbances can be masked more easily, leveraging the multi-scale analysis inherent to the wavelet domain.
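The following sketch illustrates the general idea of DWT-domain additive embedding using the PyWavelets library. It is a simplified illustration, not Barni et al.'s actual algorithm: it uses a single decomposition level, a fixed global strength, and omits the perceptual weighting function entirely.

```python
import numpy as np
import pywt

rng = np.random.default_rng(seed=0)
image = rng.uniform(0, 255, size=(256, 256))  # stand-in for a grayscale image

# One-level 2D DWT: approximation cA plus the three detail subbands
# (horizontal, vertical, diagonal).
cA, (cH, cV, cD) = pywt.dwt2(image, "haar")

gamma = 2.0  # embedding strength (arbitrary here)

# Add a bipolar pseudorandom watermark to each detail subband.
marked_details = []
for band in (cH, cV, cD):
    w = rng.choice([-1.0, 1.0], size=band.shape)
    marked_details.append(band + gamma * w)

# Reconstruct the watermarked image from the modified subbands.
watermarked = pywt.idwt2((cA, tuple(marked_details)), "haar")
```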

For the detection phase, Barni et al. adopt a strategy based on the Neyman-Pearson criterion, which maximizes the probability of detecting the watermark subject to a constraint on the probability of false positives. This approach provides a reliable detection mechanism, even in scenarios where the watermarked image has been subjected to various forms of attack or degradation.

By combining these perceptual insights with the DWT's ability to localize modifications in less noticeable regions, Barni et al. achieve an effective balance between robustness and invisibility, making their watermarking technique both practical and resilient.

Multiplicative Watermarking

In contrast to additive watermarking, multiplicative watermarking scales the watermark based on the energy of the corresponding host feature. The watermark signal is embedded as:

$$f_{w,i} = f_i + \gamma w_i f_i \quad \text{or alternatively} \quad f_{w,i} = f_i + \gamma w_i |f_i|$$

This technique is more commonly used in the frequency domain, as it leverages the fact that disturbances at certain frequencies are less perceptible if the host asset already contains similar frequency components. It also makes it harder for attackers to estimate the watermark by averaging multiple watermarked versions of the host asset.

Multiplicative watermarking tends to insert the watermark into the most important parts of the host, making it more challenging to analyze and remove. Furthermore, traditional results from digital communication and information theory may not directly apply because they often assume additive noise, while multiplicative watermarking introduces different types of distortions.
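A minimal sketch of the two multiplicative variants above (all values are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
f = rng.normal(scale=10.0, size=1000)       # host features (e.g., transform coefficients)
w = rng.choice([-1.0, 1.0], size=f.size)    # bipolar pseudorandom watermark
gamma = 0.1                                 # embedding strength

fw_mult = f + gamma * w * f                 # f_w = f * (1 + gamma * w)
fw_abs  = f + gamma * w * np.abs(f)         # variant scaling by |f_i|
```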

Watermarking Robustness and Optimization

One of the key goals in watermarking is to achieve both invisibility and robustness. These two requirements often conflict, as increasing the strength of the watermark to improve robustness may make it more visible. Therefore, finding an optimal balance is crucial. In many cases, the watermark is embedded in important areas of the host asset to simultaneously meet invisibility and robustness constraints.

However, achieving this balance requires solving complex optimization problems, as traditional communication theory does not fully apply to the type of distortions introduced by watermarking. Moreover, multiplicative watermarking, in particular, requires sophisticated detection and decoding mechanisms due to its non-additive nature.

Watermarking Detection

In watermarking, once the watermark has been embedded in a host asset, the next crucial step is detection. The ability to detect the watermark reliably, even after the host signal has been attacked or modified, is essential for ensuring the watermark’s robustness and effectiveness.

In most cases, the structure of the detector (or decoder) is derived using a simplified channel model. This simplification is necessary due to the wide range of possible attacks and the difficulties in developing accurate statistical models of the host's features. In general, watermark detection can be treated as a signal detection problem in a noisy environment, where the noise may include both the unknown characteristics of the host signal and the effects of possible attacks.

The asset under inspection is denoted $A'$, rather than $A$ or $A_w$, to indicate that it may not coincide with either the original or the watermarked version of the asset: it may have been modified or attacked. The decision is based on the set of observed variables, $\mathbf{f}$, which are the features extracted from $A'$.

Reformulating Detection as Hypothesis Testing

We can reformulate the detection problem as a hypothesis testing problem, a classic statistical approach where the goal is to choose between two hypotheses based on observed data. In this context, the detection problem can be formulated as follows:

$$H_0: \ A' \text{ does not contain the watermark } w \qquad H_1: \ A' \text{ contains the watermark } w$$

Where:

  • $A'$ is the asset under inspection;
  • $w$ is the watermark whose presence we are testing for.

The goal of the detection process is to define a test that distinguishes between $H_1$ and the composite hypothesis $H_0$ (under $H_0$, the asset may contain no watermark at all, or a watermark other than $w$). The test must be optimum with respect to a predefined criterion, such as minimizing error rates or maximizing detection robustness.

Bayes Risk

In Bayesian hypothesis testing, decisions are made by minimizing the Bayes risk. The Bayes risk is defined as the expected value of a loss function that penalizes incorrect decisions. For our case, the loss $L_{ij}$ is the loss incurred when hypothesis $H_j$ is accepted while $H_i$ is actually true; correct decisions are assumed to carry no loss ($L_{00} = L_{11} = 0$).

The observation variables correspond to the vector of extracted features $\mathbf{f}$, and the decision rule maps $\mathbf{f}$ to a binary decision. To this end, the observation space is partitioned into two decision regions, $R_0$ and $R_1$: observations falling in $R_1$ lead to the acceptance of $H_1$, while observations falling in $R_0$ lead to the acceptance of $H_0$. The decision rule can thus be expressed as:

$$\phi(\mathbf{f}) = \begin{cases} 1 & \text{if } \mathbf{f} \in R_1 \text{ (we accept } H_1\text{)} \\ 0 & \text{if } \mathbf{f} \in R_0 \text{ (we accept } H_0\text{)} \end{cases}$$

Minimizing the Bayes risk leads to a decision criterion based on the likelihood ratio.

Minimization of Bayes Risk and Likelihood Ratio Test

The likelihood ratio is a fundamental concept in hypothesis testing, where the goal is to compare the probability of the observed data under two competing hypotheses: the presence or absence of the watermark.

The likelihood ratio $\lambda(\mathbf{f})$ is defined as:

$$\lambda(\mathbf{f}) = \frac{p(\mathbf{f} \mid H_1)}{p(\mathbf{f} \mid H_0)}$$

Where:

  • $p(\mathbf{f} \mid H_1)$ is the probability density function of the observed features when the watermark is present;
  • $p(\mathbf{f} \mid H_0)$ is the probability density function of the observed features when the watermark is absent.

Let $p_0$ and $p_1$ be the a priori probabilities of hypotheses $H_0$ and $H_1$, respectively. The decision rule that minimizes the Bayes risk can then be written as:

$$\phi(\mathbf{f}) = \begin{cases} 1 & \text{if } \dfrac{p(\mathbf{f} \mid H_1)}{p(\mathbf{f} \mid H_0)} > \dfrac{p_0 L_{01}}{p_1 L_{10}} \\ 0 & \text{otherwise} \end{cases}$$

Where:

  • $L_{01}$ is the loss incurred by accepting $H_1$ when $H_0$ is true (a false alarm);
  • $L_{10}$ is the loss incurred by accepting $H_0$ when $H_1$ is true (a missed detection).

This decision rule implies that the detector operates by comparing the likelihood ratio to a detection threshold. The threshold, T, is derived based on the probabilities and the associated losses.
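As an illustration, the following sketch applies the likelihood ratio test under an assumed model in which the features are i.i.d. Gaussian with known variance, the watermark is known, and the priors and losses are arbitrary placeholder values.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(seed=2)

n, gamma, sigma = 1000, 0.5, 10.0
w = rng.choice([-1.0, 1.0], size=n)

# Observed features: here we simulate the H1 case (watermark present).
f = rng.normal(scale=sigma, size=n) + gamma * w

# Log-likelihoods under each hypothesis, assuming i.i.d. Gaussian features.
log_p1 = norm.logpdf(f, loc=gamma * w, scale=sigma).sum()  # H1: mark present
log_p0 = norm.logpdf(f, loc=0.0, scale=sigma).sum()        # H0: mark absent
log_lambda = log_p1 - log_p0

# Bayes threshold T = p0*L01 / (p1*L10); priors and losses are arbitrary here.
p0, p1, L01, L10 = 0.5, 0.5, 1.0, 1.0
T = (p0 * L01) / (p1 * L10)

decision = int(log_lambda > np.log(T))  # 1: accept H1, 0: accept H0
print(decision)
```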

Detection Threshold

The detection threshold T is given by:

$$T = \frac{p_0 L_{01}}{p_1 L_{10}}$$

Setting the threshold is critical to minimize the overall error probability $P_e$. The error probability is a weighted sum of the probabilities of false alarm and missed detection, defined as:

$$P_e = p_0 P_f + p_1 (1 - P_d)$$

Where:

  • $P_f$ is the false alarm probability, i.e., the probability of deciding $H_1$ when $H_0$ is true;
  • $P_d$ is the detection probability, i.e., the probability of deciding $H_1$ when $H_1$ is true, so that $1 - P_d$ is the missed detection probability.

To minimize $P_e$, the optimal detection threshold $T$ should be set to 1, which, from the expression for $T$, happens when:

$$p_0 L_{01} = p_1 L_{10}$$

for instance, with equal a priori probabilities and equal losses for the two types of error. In this case, the false alarm probability $P_f$ and the detection probability $P_d$ are balanced, meaning:

$$P_f = 1 - P_d$$

Thus, the minimum error probability is achieved when the probability of missing the watermark and the probability of falsely detecting the watermark are equal.
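A small numerical check of this claim, under an assumed symmetric Gaussian model for a scalar detection statistic with equal priors (all parameter values are arbitrary):

```python
import numpy as np
from scipy.special import erfc

# Symmetric Gaussian model for a scalar detection statistic:
# rho | H0 ~ N(0, s^2),  rho | H1 ~ N(mu, s^2), with equal priors.
mu, s, p0, p1 = 1.0, 1.0, 0.5, 0.5

thresholds = np.linspace(-1.0, 2.0, 601)
Pf = 0.5 * erfc(thresholds / (np.sqrt(2) * s))          # P(rho > T | H0)
Pd = 0.5 * erfc((thresholds - mu) / (np.sqrt(2) * s))   # P(rho > T | H1)
Pe = p0 * Pf + p1 * (1.0 - Pd)

# For this symmetric setup the minimum sits at T* = mu/2, where Pf = 1 - Pd.
k = np.argmin(Pe)
print(thresholds[k], Pf[k], 1 - Pd[k])
```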

Challenges in Practical Applications

While the Bayes decision theory provides an optimal solution in theory, in practical applications, several challenges arise:

  1. Unknown A Priori Probabilities:

    • In many cases, the a priori probabilities $p_0$ and $p_1$ are unknown, making it difficult to calculate the exact threshold.
  2. Non-Gaussian Attacks:

    • When the attack model deviates from additive white Gaussian noise (AWGN), the probability of missing the watermark increases, complicating detection.
  3. False Detection Constraints:

    • In many real-world scenarios, it is unacceptable for the false detection probability $P_f$ to exceed a certain level. For example, false positives in copyright protection can lead to unnecessary legal disputes or penalties.

Alternative Optimization Criteria

Given the limitations of practical watermark detection systems, an alternative optimization criterion is often adopted. Instead of minimizing the overall error probability $P_e$, the goal is to minimize the probability of missing the watermark, $P_m$ (where $P_m = 1 - P_d$), subject to a constraint on the maximum allowable false alarm probability $P_f^{\max}$.

This criterion is more practical for applications where false alarms have a higher cost. The steps for optimizing under this criterion are:

  1. Set the Maximum Allowed False Alarm Probability:

    • Define a maximum allowable value for $P_f$, based on the specific application.
  2. Use the Likelihood Ratio as the Detection Statistic:

    • Continue using the likelihood ratio $\lambda(\mathbf{f}) = p(\mathbf{f} \mid H_1) / p(\mathbf{f} \mid H_0)$ as the detection statistic.
  3. Determine the Threshold According to the False Alarm Probability:

    • Adjust the threshold $T$ to ensure that the false alarm probability does not exceed the predefined maximum value $P_f^{\max}$.

In practical terms, this means that the detection system is fine-tuned to minimize missed detections, ensuring that the watermark is reliably detected when present, while keeping the false alarm rate within acceptable limits.
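When no reliable closed-form model of the statistic under $H_0$ is available, the threshold can be calibrated empirically. The following sketch estimates it by Monte Carlo simulation over unwatermarked feature vectors; the correlation statistic, the Gaussian host model, and the target $P_f^{\max}$ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=3)

n, sigma_x = 1000, 10.0
w = rng.choice([-1.0, 1.0], size=n)  # watermark whose presence is tested
Pf_max = 0.01                        # maximum allowed false alarm probability

# Monte Carlo: sample the detection statistic (here, a correlation)
# under H0, i.e., on unwatermarked feature vectors.
trials = 10_000
f0 = rng.normal(scale=sigma_x, size=(trials, n))
stats_h0 = f0 @ w / n                # one correlation value per trial

# The threshold is the (1 - Pf_max) quantile of the H0 statistic, so that
# only a fraction Pf_max of unwatermarked assets exceed it.
T = np.quantile(stats_h0, 1.0 - Pf_max)
print(f"threshold T = {T:.4f}")
```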

Neyman-Pearson Detection Criterion

In many practical applications of watermark detection, it is often preferable to adopt a detection criterion that focuses on minimizing the probability of missing the watermark while maintaining control over the false alarm probability. The most widely used approach in such cases is the Neyman-Pearson detection criterion, which provides a framework for optimizing the detector’s performance under these constraints.

Minimizing Missed Detection Subject to a False Alarm Constraint

The Neyman-Pearson criterion aims to maximize the probability of correctly detecting the watermark (i.e., to minimize the missed detection probability $P_m$) while ensuring that the false alarm probability $P_f$ does not exceed a prescribed limit. The optimization problem can be described as:

$$P\{\lambda(\mathbf{f}) > T \mid H_0\} = P_f$$

Where $T$ is the detection threshold, chosen so that the probability that the likelihood ratio exceeds it in the absence of the watermark equals the target false alarm probability $P_f$.

Once the detection threshold T is set based on the desired false alarm rate, the probability of missing the watermark can be computed as:

$$1 - P_d = \int_{0}^{T} p(\lambda \mid H_1)\, d\lambda$$

Where $p(\lambda \mid H_1)$ is the probability density function of the likelihood ratio when the watermark is present.

Performance Evaluation Using ROC Curves

The performance of a detector based on the Neyman-Pearson criterion is typically evaluated using Receiver Operating Characteristic (ROC) curves. These curves plot the detection probability $P_d$ against the false alarm probability $P_f$ as the detection threshold varies.

The ROC curve provides a visual representation of the trade-off between detection accuracy and false alarms. A curve that rises more sharply toward the top-left corner indicates better detector performance, as it reflects higher detection rates at lower false alarm probabilities.

The decision rule for the Neyman-Pearson criterion is given by:

$$\phi(\mathbf{f}) = \begin{cases} 1 & \text{if } \lambda(\mathbf{f}) > T \\ 0 & \text{otherwise} \end{cases}$$

AWGN Channel Model

To model a specific watermark embedding and detection strategy, we need to consider both the host signal features and the attacks on the watermarked signal. A common and simplified model used in watermarking is the Additive White Gaussian Noise (AWGN) channel. In this context, we assume:

  • the host features $f_i$ are independent, zero-mean Gaussian random variables with variance $\sigma_x^2$;
  • the attack is modeled as the addition of white Gaussian noise, independent of both the host features and the watermark.

In this scenario, the feature extracted from the watermarked and possibly attacked asset is described by the equation:

$$f_{w,i} = f_i + \gamma w_i + n_i$$

Where:

  • $f_i$ is the $i$-th host feature,
  • $\gamma$ is the watermark strength,
  • $w_i$ is the $i$-th watermark sample,
  • $n_i$ is the $i$-th sample of the additive white Gaussian noise, with variance $\sigma_n^2$.

Additionally, the model assumes that the host features, the watermark samples, and the noise samples are mutually independent, and that the watermark sequence $w$ is known at the detector.

Optimum Detection in the AWGN Case

In the case of an AWGN channel, the optimum detection strategy is to use correlation detection, where the decision statistic is based on the correlation between the received signal and the watermark.

The detection statistic for correlation detection is given by:

$$\rho = \frac{1}{n} \sum_{i=1}^{n} f_i w_i$$

Where $f_i$ represents the received (potentially noisy) feature, and $w_i$ is the original watermark signal. The decision rule compares the correlation $\rho$ to a threshold $T_\rho$:

$$\begin{cases} \rho > T_\rho & \text{(accept } H_1\text{)} \\ \rho \le T_\rho & \text{(accept } H_0\text{)} \end{cases}$$
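A minimal simulation of the correlation detector under the AWGN model follows. All parameter values, including the threshold, are arbitrary placeholders; in practice $T_\rho$ would be derived from the target false alarm probability, as shown below.

```python
import numpy as np

rng = np.random.default_rng(seed=4)

n, gamma, sigma_x, sigma_n = 1000, 0.5, 10.0, 2.0
w = rng.choice([-1.0, 1.0], size=n)

# Received features under H1: host + watermark + channel noise (AWGN model).
f_host = rng.normal(scale=sigma_x, size=n)
noise = rng.normal(scale=sigma_n, size=n)
f_received = f_host + gamma * w + noise

# Correlation statistic: rho = (1/n) * sum_i f_i * w_i.
rho = np.mean(f_received * w)

T_rho = 0.2  # placeholder threshold
decision = int(rho > T_rho)  # 1: watermark detected (H1), 0: not detected (H0)
print(rho, decision)
```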

Computing False Alarm and Missed Detection Probabilities

In the AWGN case, the false alarm and missed detection probabilities can be computed, provided that the mean and variance of the host features are known. Let $\mu_\rho$ and $\sigma_\rho^2$ denote the mean and variance of the correlation statistic $\rho$ under each of the two hypotheses $H_0$ and $H_1$.

The false alarm probability $P_f$ and detection probability $P_d$ are determined by the overlap between the distributions of the correlation statistic under the two hypotheses. By fixing the threshold $T_\rho$, we can control $P_f$ and $P_d$.

AWGN Channel and Detector Performance

In the case of an Additive White Gaussian Noise (AWGN) channel, we can derive a formula that completely characterizes the performance of the watermark detector. This allows us to evaluate the detection performance in terms of false alarm probability Pf, missed detection probability Pd, and the signal-to-noise ratio (SNR).

The detection threshold $T_\rho$ for the AWGN channel is given by:

$$T_\rho = \sqrt{\frac{2\,\sigma_x^2\,\sigma_w^2}{n}}\; \mathrm{erfc}^{-1}(2 P_f)$$

Where:

  • $\sigma_x^2$ is the variance of the host features,
  • $\sigma_w^2$ is the variance of the watermark samples,
  • $n$ is the number of features carrying the watermark,
  • $\mathrm{erfc}^{-1}$ is the inverse of the complementary error function.

The overall performance of the detector, in terms of the missed detection probability $1 - P_d$ as a function of $P_f$, is:

$$(1 - P_d)(P_f) = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{n\,\mathrm{SNR}}{2}} - \mathrm{erfc}^{-1}(2 P_f)\right)$$

Where the signal-to-noise ratio (SNR) is defined as:

$$\mathrm{SNR} = \frac{\gamma^2 \sigma_w^2}{\sigma_x^2}$$
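These closed-form expressions are straightforward to evaluate numerically. The sketch below computes the threshold for a target $P_f$ and traces the ROC via the formula above, using erfc and erfcinv from scipy.special (all parameter values are arbitrary):

```python
import numpy as np
from scipy.special import erfc, erfcinv

n, gamma = 10_000, 1.0
sigma_x, sigma_w = 10.0, 1.0             # host / watermark standard deviations
SNR = gamma**2 * sigma_w**2 / sigma_x**2

# Threshold achieving a target false alarm probability.
Pf_target = 1e-6
T_rho = np.sqrt(2 * sigma_x**2 * sigma_w**2 / n) * erfcinv(2 * Pf_target)

# Missed detection probability as a function of Pf: the ROC in closed form.
Pf_axis = np.logspace(-12, -2, 200)
Pm_axis = 0.5 * erfc(np.sqrt(n * SNR / 2) - erfcinv(2 * Pf_axis))

Pm_at_target = 0.5 * erfc(np.sqrt(n * SNR / 2) - erfcinv(2 * Pf_target))
print(f"T_rho = {T_rho:.4f}, missed detection at Pf=1e-6: {Pm_at_target:.2e}")
```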

Key Points About AWGN Channel Performance

  1. Detector Independence: The detector does not need to know the exact strength γ of the watermark used during embedding. This allows flexibility in the detection process, as the watermark strength can vary.

  2. Embedding Flexibility: The embedder can adjust the watermark strength γ based on the specific needs of the application, balancing the trade-off between imperceptibility (ensuring the watermark remains undetected by the human eye) and robustness (ensuring the watermark is detectable by the system). Importantly, the detector need not be informed about the specific strength chosen for embedding.