1. Image Processing Using Python
When working with images in Python, several libraries provide tools for processing and manipulating image data. This section relies on Pillow, OpenCV, NumPy, and Matplotlib, with SciPy and scikit-image for some of the filters.
Let's dive into how we can use these tools to load, manipulate, and analyze images.
Opening and Displaying Images
To start working with images, the first step is to load them into our program. Both Pillow and OpenCV offer easy ways to achieve this.
Loading an Image with Pillow
Pillow allows you to load an image and convert it into a format that NumPy can handle:
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
image = Image.open('lena_color.jpg') # Using Pillow
print(type(image)) # Verifies that it's a PIL image
im = np.asarray(image, dtype=np.uint8) # Converts to NumPy array
print(type(im)) # Now it's a NumPy array
plt.imshow(im) # Displays the image
plt.show()
Loading an Image with OpenCV
OpenCV can also be used to load and display images:
import cv2
input_image = cv2.imread('lena_color.jpg') # Using OpenCV
plt.imshow(input_image) # This may display incorrect colors (BGR format)
Why Does OpenCV Display Images Incorrectly?
By default, OpenCV reads images in BGR (Blue-Green-Red) format instead of RGB (Red-Green-Blue). This causes the image to look incorrect (often with a blue tint). To fix this, we need to convert the image to RGB:
# Option 1: Manually split and merge channels
b, g, r = cv2.split(input_image)
merged = cv2.merge([r, g, b])
plt.imshow(merged) # Correct colors
# Option 2: Automatic conversion using OpenCV
plt.imshow(cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB)) # Simple conversion
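Since BGR and RGB differ only in channel order, the same conversion can also be sketched with plain NumPy slicing. The tiny array below is a made-up stand-in for an image returned by cv2.imread, not data from the tutorial.

```python
import numpy as np

# Hypothetical 1x2 "image" in BGR order: one pure-blue and one pure-red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps B and R and leaves G in place
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])  # the blue pixel, now in RGB order
print(rgb[0, 1])  # the red pixel, now in RGB order
```

The slice is a view on the original array, so calling .copy() on it is a reasonable precaution before passing it to code that expects a contiguous array.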
Understanding Color Spaces
RGB Color Space
RGB is the most common color space, used by cameras and displays. It has three color channels: Red, Green, and Blue, each capturing a different band of wavelengths. Interestingly, a typical camera sensor uses a Bayer filter with twice as many green filters as red or blue ones, because the human eye is more sensitive to green light.
We can visualize each channel individually:
plt.subplot(131)
plt.title('R')
plt.imshow(im[:, :, 0], cmap='gray')
plt.subplot(132)
plt.title('G')
plt.imshow(im[:, :, 1], cmap='gray')
plt.subplot(133)
plt.title('B')
plt.imshow(im[:, :, 2], cmap='gray')
plt.show()
YCbCr Color Space
Sometimes, working in the RGB color space isn't optimal. The YCbCr color space is particularly useful for compression and broadcasting. YCbCr splits the image into:
- Y: Luminance (brightness)
- Cb: Chrominance Blue (blue color difference)
- Cr: Chrominance Red (red color difference)
The luminance channel, Y, is a weighted sum of the RGB components and closely matches human brightness perception. Cb and Cr represent the color differences from blue and red. YCbCr is commonly used in JPEG compression.
The Y component (luminance) should be prioritized when preserving quality, since the human eye is more sensitive to changes in brightness than to changes in chrominance.
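To make the "weighted sum" concrete, here is a minimal sketch of the conversion, assuming the full-range weights from the JFIF (JPEG) convention. The function name rgb_to_ycbcr and the three test pixels are illustrative only; a library implementation may round slightly differently.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) uint8 RGB array to YCbCr (assumed JFIF full-range weights)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b                # luminance: weighted sum of R, G, B
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue difference, centered on 128
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red difference, centered on 128
    out = np.stack([y, cb, cr], axis=-1)
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# Made-up pixels: white, black, pure red
pixels = np.array([[[255, 255, 255], [0, 0, 0], [255, 0, 0]]], dtype=np.uint8)
ycbcr_pixels = rgb_to_ycbcr(pixels)
print(ycbcr_pixels)
```

Gray pixels (white and black here) land on Cb = Cr = 128, which is why the chrominance channels of a colorless region look uniformly mid-gray.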
Converting to YCbCr:
transformed = image.convert('YCbCr')
ycbcr = np.asarray(transformed, dtype=np.uint8)
plt.title('YCbCr Image')
plt.imshow(ycbcr)
plt.show()
# Visualizing individual Y, Cb, Cr channels
plt.rcParams['figure.figsize'] = [25, 25]  # plt.rcParams works without a separate "import matplotlib"
plt.subplot(131)
plt.title('Y')
plt.imshow(ycbcr[:, :, 0], cmap='gray')
plt.subplot(132)
plt.title('Cb')
plt.imshow(ycbcr[:, :, 1], cmap='gray')
plt.subplot(133)
plt.title('Cr')
plt.imshow(ycbcr[:, :, 2], cmap='gray')
plt.show()
Image Processing Example: JPEG Compression
import os

def jpeg_compression(img, QF):
    # Round-trip the array through a temporary JPEG file saved at quality QF
    from PIL import Image
    img = Image.fromarray(img)
    img.save('tmp.jpg', "JPEG", quality=QF)
    compressed_img = Image.open('tmp.jpg')
    compressed_img = np.asarray(compressed_img, dtype=np.uint8)
    os.remove('tmp.jpg')
    return compressed_img
# Apply compression to the Y, Cb, or Cr channel
transformed = image.convert('YCbCr')
ycbcr = np.asarray(transformed, dtype=np.uint8)
idx = 0 # Modify luminance (Y channel)
component = ycbcr[:, :, idx]
compressed = jpeg_compression(component, 25)
# Visualize the result
plt.figure(figsize=(15, 6))
plt.subplot(121)
plt.title('Original')
plt.imshow(component, cmap='gray')
plt.subplot(122)
plt.title('Compressed')
plt.imshow(compressed, cmap='gray')
plt.show()
Additional YCbCr Image Processing
You can perform additional operations such as inversion, brightness adjustment, or gamma correction on individual Y, Cb, or Cr channels. For example, here's how to invert the luminance:
# Inversion
plt.subplot(141)
plt.title('Original')
plt.imshow(ycbcr)
plt.subplot(142)
plt.title('Luminance (Y)')
plt.imshow(ycbcr[:, :, 0], cmap='gray')
plt.subplot(143)
plt.title('Inversion')
plt.imshow(255 - ycbcr[:, :, 0], cmap='gray')
plt.show()
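Brightness adjustment and gamma correction follow the same pattern as inversion: a pointwise function applied to one channel. The sketch below uses hypothetical helper names (adjust_gamma, adjust_brightness) and a made-up row of luminance values; it is one possible formulation, not the only one.

```python
import numpy as np

def adjust_gamma(channel, gamma):
    """Apply out = 255 * (in / 255) ** gamma to a uint8 channel."""
    normalized = channel.astype(np.float64) / 255.0
    corrected = 255.0 * normalized ** gamma
    return np.clip(np.round(corrected), 0, 255).astype(np.uint8)

def adjust_brightness(channel, offset):
    """Add a constant offset, clipping so values stay in [0, 255]."""
    shifted = channel.astype(np.int16) + offset   # int16 avoids uint8 overflow
    return np.clip(shifted, 0, 255).astype(np.uint8)

y = np.array([[0, 64, 128, 255]], dtype=np.uint8)  # made-up luminance values
brighter_midtones = adjust_gamma(y, 0.5)           # gamma < 1 lifts mid-tones
shifted = adjust_brightness(y, 40)                 # uniform shift, saturating at 255
print(brighter_midtones)
print(shifted)
```

Applied to ycbcr[:, :, 0], either helper would change the brightness while leaving the Cb and Cr channels untouched.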
Masking Operations
Masking is useful for editing specific areas of an image. For example, we can set a region of the image to black:
edited = np.copy(im)
edited[20:220, 20:220, :] = 0 # Mask region
plt.figure(figsize=(10, 10))
# Display original and edited images side by side
plt.subplot(121)
plt.title('Original')
plt.imshow(im)
plt.subplot(122)
plt.title('Edited')
plt.imshow(edited)
plt.show()
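Masks do not have to be rectangular. As a sketch, a boolean array can select an arbitrary region, here a circle; the image, center, and radius below are made-up values chosen for illustration.

```python
import numpy as np

height, width = 100, 100
img = np.full((height, width, 3), 200, dtype=np.uint8)  # made-up uniform gray image

# Row and column coordinates as broadcastable vectors
rows, cols = np.ogrid[:height, :width]

# True inside a circle of radius 30 centered at (50, 50)
mask = (rows - 50) ** 2 + (cols - 50) ** 2 <= 30 ** 2

edited = img.copy()
edited[mask] = 0  # black out only the pixels selected by the mask
```

The same mask can be reused to copy pixels from another image, for example edited[mask] = other[mask], provided both arrays have the same shape.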
Cropping and Slicing
Cropping or slicing specific regions of an image can be useful when focusing on particular areas:
# Cropping a region
plt.subplot(121)
hat = im[60:250, 70:350]
plt.title('CROP')
plt.imshow(hat)
# Slicing a region
plt.subplot(122)
crop = im[:, 130:300]
plt.title('SLICE')
plt.imshow(crop)
plt.show()
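One detail worth keeping in mind: basic NumPy slicing returns a view, not a copy, so editing a crop also edits the image it came from. The small sketch below, on a made-up 4x4 array, shows the difference.

```python
import numpy as np

img = np.zeros((4, 4), dtype=np.uint8)

view = img[1:3, 1:3]                 # basic slicing returns a view
view[:] = 255                        # ...so this also changes img

independent = img[1:3, 1:3].copy()   # an explicit copy owns its data
independent[:] = 7                   # img is not affected by this

print(img)
```

This is why the masking example works on np.copy(im) rather than on im directly.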
Image Composition
You can combine or superimpose different images. Here's how to place the cropped "hat" image back onto the original image:
fresh_image = cv2.cvtColor(cv2.imread('lena_color.jpg'), cv2.COLOR_BGR2RGB)
fresh_image[200:200 + hat.shape[0], 200:200 + hat.shape[1]] = hat
plt.imshow(fresh_image)
plt.show()
Edge Detection
Edge detection is a fundamental tool in image processing, helping to identify points in an image where the brightness changes sharply. These edges often correspond to boundaries of objects and can be used to extract features or to embed watermarks in less noticeable areas. One of the first successful techniques for detecting edges was Sobel’s edge detector, which uses convolution to detect gradients in the image.
Edges are particularly useful for watermarking because they correspond to textures and high-frequency areas, making it easier to hide the watermark without it being visible.
Sobel Edge Detection
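The Sobel filter computes the gradient of an image in two directions. The gradient along the x-axis responds to vertical edges (intensity changing from left to right), and the gradient along the y-axis responds to horizontal edges (intensity changing from top to bottom). By combining these gradients, we can highlight the strongest edges in the image.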
Here’s how to apply the Sobel filter in both directions:
sobelimage = cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY) # Convert to grayscale
plt.rcParams['figure.figsize'] = [10, 10]  # plt.rcParams works without a separate "import matplotlib"
# Compute Sobel gradients
sobelx = cv2.Sobel(sobelimage, cv2.CV_64F, 1, 0, ksize=3) # Gradient along x-axis
sobely = cv2.Sobel(sobelimage, cv2.CV_64F, 0, 1, ksize=3) # Gradient along y-axis
# Display the results
plt.subplot(131)
plt.title('Edges over X axis')
plt.imshow(sobelx, cmap='gray')
plt.subplot(132)
plt.title('Edges over Y axis')
plt.imshow(sobely, cmap='gray')
# Compute gradient magnitude
magnitude = np.sqrt(sobelx**2 + sobely**2)
magnitude = cv2.convertScaleAbs(magnitude) # Convert to 8-bit for display
# Apply threshold to highlight strong edges
_, thresholded = cv2.threshold(magnitude, 100, 255, cv2.THRESH_BINARY)
plt.subplot(133)
plt.title('Combined edges')
plt.imshow(thresholded, cmap='gray')
plt.show()
Edges correspond to high-frequency areas in the image, where changes in intensity are more abrupt. Watermarks are often embedded in these areas because they are less noticeable to the human eye.
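To see what the convolution is doing, the Sobel gradients can be sketched by hand with NumPy. The helper filter3x3, the edge-replicating border, and the 6x6 test image are assumptions made for this illustration, so border values may differ from what cv2.Sobel returns.

```python
import numpy as np

# Standard 3x3 Sobel kernels for the x and y directions
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
KY = KX.T

def filter3x3(img, kernel):
    """Correlate img with a 3x3 kernel, replicating edge pixels at the border."""
    padded = np.pad(img.astype(np.float64), 1, mode='edge')
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

# Made-up test image: dark left half, bright right half (one vertical edge)
img = np.zeros((6, 6), dtype=np.uint8)
img[:, 3:] = 255

gx = filter3x3(img, KX)                  # nonzero only around the vertical edge
gy = filter3x3(img, KY)                  # zero everywhere: no change from top to bottom
magnitude = np.sqrt(gx ** 2 + gy ** 2)
```

Only the x-gradient reacts here, which illustrates that the derivative along x picks up vertical edges.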
Canny Edge Detection
The Canny edge detector is another powerful technique that produces cleaner and more precise edge detection compared to Sobel. Canny uses two thresholds to control edge detection sensitivity:
- Threshold 1: The lower boundary for detecting weak edges.
- Threshold 2: The upper boundary for stronger edges.
Here’s how you can experiment with the thresholds:
th1 = 30 # Lower threshold
th2 = 60 # Upper threshold
d = 3 # Gaussian blur parameter
# Apply Gaussian blur to smooth the image
edgeresult = cv2.GaussianBlur(input_image, (2*d+1, 2*d+1), -1)[d:-d, d:-d]
# Convert to grayscale and apply Canny edge detection
gray = cv2.cvtColor(edgeresult, cv2.COLOR_BGR2GRAY)
edge = cv2.Canny(gray, th1, th2)
# Highlight the edges in the image
edgeresult[edge != 0] = (0, 255, 0) # Mark edges in green
plt.imshow(cv2.cvtColor(edgeresult, cv2.COLOR_BGR2RGB))
plt.show()
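Gradients above the upper threshold are accepted as edges immediately, gradients below the lower threshold are discarded, and values in between are kept only if they connect to an already accepted edge. Raising the upper threshold therefore produces fewer edge starts, while lowering the lower threshold lets accepted edges be followed further. Experimenting with these values helps fine-tune the sensitivity of the detector.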
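The interaction between the two thresholds can be sketched in NumPy. This covers only the double-threshold (hysteresis) step on a made-up row of gradient magnitudes; a full Canny detector also performs smoothing and non-maximum suppression, and the function name hysteresis is hypothetical.

```python
import numpy as np

def hysteresis(magnitude, low, high):
    """Keep pixels above high, plus pixels above low that connect to them (8-neighborhood)."""
    strong = magnitude > high
    candidate = magnitude > low
    edges = strong.copy()
    while True:
        # Grow the current edge set by one pixel in every direction
        padded = np.pad(edges, 1, mode='constant', constant_values=False)
        grown = np.zeros_like(edges)
        for di in range(3):
            for dj in range(3):
                grown |= padded[di:di + edges.shape[0], dj:dj + edges.shape[1]]
        updated = grown & candidate      # only candidates may join the edge set
        if np.array_equal(updated, edges):
            return edges
        edges = updated

# Made-up gradient magnitudes: one strong pixel, weak neighbors, one isolated weak pixel
magnitude = np.array([[0, 40, 40, 100, 40, 0, 40]], dtype=np.float64)
edges = hysteresis(magnitude, low=30, high=60)
print(edges)
```

The weak pixels next to the strong one are kept, while the isolated weak pixel at the end is dropped even though its magnitude is the same.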
Image Processing Techniques
In this section, we will explore different ways to modify and enhance images using Python.
Adding Noise (AWGN)
Additive White Gaussian Noise (AWGN) is a common technique used to simulate noise in an image. By adding noise, we can test the robustness of image processing techniques or watermarks under challenging conditions.
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# RGB image used throughout this section (assumed to be the same file loaded earlier)
original = np.asarray(Image.open('lena_color.jpg'), dtype=np.uint8)

# Add Additive White Gaussian Noise (AWGN) to an image
def awgn(img, std, seed):
    """
    Add Additive White Gaussian Noise (AWGN) to an image.

    Parameters:
        img  : numpy array, the input image (grayscale or channel)
        std  : float, the standard deviation of the Gaussian noise
        seed : int, seed for the random number generator (for reproducibility)
    Returns:
        attacked : numpy array, the image with Gaussian noise added
    """
    mean = 0.0  # The mean of the Gaussian noise, typically 0.0
    np.random.seed(seed)  # Set the seed for reproducibility
    noise = np.random.normal(mean, std, img.shape)  # Generate Gaussian noise with the given standard deviation
    attacked = img + noise  # Add noise to the original image (the result is a float array)
    attacked = np.clip(attacked, 0, 255)  # Ensure pixel values stay between 0 and 255 (valid pixel range)
    return attacked

plt.figure(figsize=(15, 6))  # Set the figure size (15 width, 6 height)
# Apply noise to the green channel
attacked = awgn(original[:, :, 1], 30.0, 123)
# Show original and noisy images side by side
plt.subplot(121)
plt.title('Original (Green Channel)')
plt.imshow(original[:, :, 1], cmap='gray')
plt.subplot(122)
plt.title('AWGN Applied')
plt.imshow(attacked, cmap='gray')
plt.show()
Let's clarify any doubts you might have about the code:
- mean = 0.0: This defines the mean of the noise, which is set to 0.0. This is typical for Gaussian noise: most of the noise values will be close to zero, with some values distributed both positively and negatively.
- np.random.seed(seed): This sets the seed for the random number generator. Using a seed ensures that the same noise is generated every time the code is executed, which is useful for repeatable experiments.
- np.random.normal(mean, std, img.shape): This line generates an array of random numbers that follow a normal (Gaussian) distribution, with a mean of 0.0 and a standard deviation specified by std (in this case, 30.0). The array has the same shape as the image (img.shape), so each pixel in the image has an associated noise value.
- attacked = img + noise: Here, the generated noise is added to the original image. Since the image's pixel values range from 0 to 255 (in the case of grayscale or RGB images), adding noise might produce pixel values that exceed 255 or fall below 0.
- np.clip(attacked, 0, 255): This function clips (limits) the pixel values in the "attacked" image to the range [0, 255]. Any pixel values that exceed this range are set to 255 (the maximum), while values below 0 are set to 0 (the minimum), since pixel values cannot be negative or greater than 255.
- return attacked: This returns the "attacked" image, which is the original image with the added Gaussian noise.
Adding noise helps simulate real-world imperfections in images and test the effectiveness of image enhancement or watermarking techniques under noisy conditions.
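As a variation, the sketch below draws the noise from a local generator (np.random.default_rng) instead of reseeding the global one, and rounds the result back to uint8. The name awgn_uint8 and the flat gray test image are assumptions made for this example.

```python
import numpy as np

def awgn_uint8(img, std, seed):
    """Variant of awgn that uses a local generator and returns a uint8 array."""
    rng = np.random.default_rng(seed)            # local generator, global state untouched
    noise = rng.normal(0.0, std, img.shape)
    attacked = np.clip(img.astype(np.float64) + noise, 0, 255)
    return np.round(attacked).astype(np.uint8)

flat = np.full((64, 64), 128, dtype=np.uint8)    # made-up uniform gray image
a = awgn_uint8(flat, 30.0, 123)
b = awgn_uint8(flat, 30.0, 123)
measured_std = a.astype(np.float64).std()

print(np.array_equal(a, b))   # same seed, same noise
print(measured_std)           # should be close to the requested std
```

Measuring the standard deviation on a flat image is a quick sanity check that the noise level matches the std argument.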
Blurring with Gaussian Filter
Blurring is used to reduce noise and smooth images. The Gaussian Filter works by averaging pixel values with their neighbors, giving more weight to closer pixels. This technique is commonly used to attenuate high-frequency components like noise or fine details.
Here's an example:
from scipy.ndimage import gaussian_filter

def blur(img, sigma):
    attacked = gaussian_filter(img, sigma)
    return attacked
# Apply Gaussian blur to the green channel
attacked = blur(original[:, :, 1], [3, 2])
# Show original and blurred images
plt.figure(figsize=(15, 6))
plt.subplot(121)
plt.title('Original (Green Channel)')
plt.imshow(original[:, :, 1], cmap='gray')
plt.subplot(122)
plt.title('Gaussian Blurred')
plt.imshow(attacked, cmap='gray')
plt.show()
Blurring can be useful in watermarking when you want to reduce noise or smooth the image before embedding a watermark, ensuring that the watermark is embedded in less noticeable areas.
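The "more weight to closer pixels" idea can be illustrated by building the weights directly. This is a one-dimensional sketch with an assumed radius of 3 and sigma of 1; the helper gaussian_kernel_1d is hypothetical and is not how SciPy exposes its kernel.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius):
    """Sampled Gaussian weights on [-radius, radius], normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    weights = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()

kernel = gaussian_kernel_1d(sigma=1.0, radius=3)

# Blur a made-up 1D signal containing a single spike
signal = np.zeros(11)
signal[5] = 1.0
blurred = np.convolve(signal, kernel, mode='same')

print(kernel)
print(blurred)
```

The spike is spread over its neighbors in proportion to the weights, and the total intensity is preserved because the weights sum to 1.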
Sharpening
Sharpening is the opposite of blurring. It exaggerates the brightness difference along edges, making details more prominent. This can be useful to enhance an image before further processing or analysis.
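def sharpening(img, sigma, alpha):
    # Work in floating point: subtracting uint8 arrays would wrap around instead of going negative
    img = img.astype(np.float64)
    blurred = gaussian_filter(img, sigma)
    sharpened = img + alpha * (img - blurred)  # Enhance edges
    return np.clip(sharpened, 0, 255)  # Keep values in the valid pixel range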
# Apply sharpening to the green channel
attacked = sharpening(original[:, :, 1], 3, 0.2)
# Show original and sharpened images
plt.figure(figsize=(15, 6))
plt.subplot(121)
plt.title('Original (Green Channel)')
plt.imshow(original[:, :, 1], cmap='gray')
plt.subplot(122)
plt.title('Sharpened')
plt.imshow(attacked, cmap='gray')
plt.show()
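When the input is a uint8 array, the difference between the image and its blurred version is worth computing in floating point, because unsigned arithmetic wraps around instead of producing negative values. A small sketch with made-up pixel values:

```python
import numpy as np

img = np.array([10, 200], dtype=np.uint8)        # made-up pixel values
blurred = np.array([50, 150], dtype=np.uint8)    # made-up blurred values

wrapped = img - blurred                           # uint8 arithmetic wraps modulo 256
signed = img.astype(np.float64) - blurred.astype(np.float64)

print(wrapped)   # first entry is a large positive number, not -40
print(signed)
```

Casting to float before the subtraction keeps the sign of the detail term, which is what the sharpening formula relies on.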
Median Filtering
Median filtering is particularly effective for removing salt-and-pepper noise while preserving edges. Instead of averaging values, it replaces each pixel with the median of the surrounding pixels, so isolated outliers are discarded rather than smeared into their neighbors. "Surrounding pixels" is a generic term, so we need to specify the size of the local area over which the median is computed. This area is called the kernel, and its size is referred to as the kernel size.
from scipy.signal import medfilt

def median(img, kernel_size):
    attacked = medfilt(img, kernel_size)
    return attacked
# Apply median filter to the green channel
attacked = median(original[:, :, 1], [3, 3]) # passing 3 or [3,3] is the same
# Show original and filtered images
plt.figure(figsize=(15, 6))
plt.subplot(121)
plt.title('Original (Green Channel)')
plt.imshow(original[:, :, 1], cmap='gray')
plt.subplot(122)
plt.title('Median Filter Applied')
plt.imshow(attacked, cmap='gray')
plt.show()
Median filtering is more effective at preserving edges while removing noise, making it ideal for images with sharp features or for edge detection preprocessing.
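The difference between taking the median and taking the mean shows up clearly on a single outlier. The sketch below implements both over a 3x3 window in NumPy; the helper names and the edge-replicating border are assumptions (scipy.signal.medfilt pads with zeros instead), and the 5x5 image is made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windows3x3(img):
    """Return every 3x3 neighborhood of img, shape (H, W, 3, 3), with replicated borders."""
    padded = np.pad(img, 1, mode='edge')
    return sliding_window_view(padded, (3, 3))

def median3x3(img):
    return np.median(windows3x3(img), axis=(2, 3))

def mean3x3(img):
    return np.mean(windows3x3(img), axis=(2, 3))

img = np.full((5, 5), 100.0)
img[2, 2] = 255.0                 # a single "salt" pixel

med = median3x3(img)
avg = mean3x3(img)
print(med[2, 2], avg[2, 2])
```

The median removes the outlier entirely, while the mean spreads part of it over the whole 3x3 neighborhood.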
Resizing
Resizing an image can be useful when you need to scale it down or up for different applications. However, resizing often results in a loss of detail, especially when downscaling, because pixel values are interpolated.
from skimage.transform import rescale

def resizing(img, scale):
    x, y = img.shape
    # Note: for uint8 input, rescale returns floats in [0, 1] by default
    attacked = rescale(img, scale, anti_aliasing=True)  # Scale the image
    attacked = rescale(attacked, 1/scale, anti_aliasing=True)  # Scale back towards the original size
    attacked = attacked[:x, :y]  # Crop back to original size
    return attacked
# Apply resizing to the green channel
attacked = resizing(original[:, :, 1], 0.5)
# Show original and resized images
plt.figure(figsize=(15, 6))
plt.subplot(121)
plt.title('Original (Green Channel)')
plt.imshow(original[:, :, 1], cmap='gray')
plt.subplot(122)
plt.title('Resized and Restored')
plt.imshow(attacked, cmap='gray')
plt.show()
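Why detail is lost can be seen with a deliberately simple resampler. The sketch below downsamples by block averaging and upsamples by pixel repetition, which is cruder than the interpolation scikit-image performs; the helper down_up and both test images are made up, and the image sides are assumed to be divisible by the factor.

```python
import numpy as np

def down_up(img, factor):
    """Downscale by block averaging, then upscale by pixel repetition."""
    h, w = img.shape
    small = img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

# Finest possible detail: a one-pixel checkerboard of 0 and 255
checker = (np.indices((8, 8)).sum(axis=0) % 2) * 255.0
# No detail at all: a flat image
flat = np.full((8, 8), 80.0)

restored_checker = down_up(checker, 2)
restored_flat = down_up(flat, 2)
print(restored_checker[0, :4])
print(restored_flat[0, :4])
```

The checkerboard collapses to a uniform gray because each 2x2 block is replaced by its average, whereas the flat image survives unchanged: the loss is concentrated in high-frequency content.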
JPEG Compression
JPEG compression reduces the amount of data in an image by discarding certain details. It is widely used in multimedia applications because it offers a good balance between file size and quality. However, since it’s a lossy compression scheme, increasing the compression level results in visible artifacts such as blocking, blurring, and color distortions.
When applying JPEG compression, we can specify a Quality Factor (QF), which ranges from 0 to 100. A lower QF means higher compression and more noticeable artifacts, while a higher QF preserves more image quality but results in a larger file size.
Here’s how we can apply JPEG compression in Python:
import os

def jpeg_compression(img, QF):
    from PIL import Image
    img = Image.fromarray(img)
    img.save('tmp.jpg', "JPEG", quality=QF)  # Save with specified Quality Factor
    compressed_img = Image.open('tmp.jpg')
    compressed_img = np.asarray(compressed_img, dtype=np.uint8)
    os.remove('tmp.jpg')  # Clean up temporary file
    return compressed_img
# Apply JPEG compression to the original image
attacked = jpeg_compression(original, 10) # QF = 10 means strong compression
# Show original and compressed images side by side
plt.figure(figsize=(15, 6))
plt.subplot(121)
plt.title('Original Image')
plt.imshow(original)
plt.subplot(122)
plt.title('JPEG Compressed (QF=10)')
plt.imshow(attacked)
plt.show()
JPEG compression reduces file size by discarding fine details in areas with low visual impact, introducing artifacts. These artifacts become more pronounced as the compression level increases (lower Quality Factor).
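The way detail is discarded can be sketched on a single 8x8 block: transform it with a DCT, round the coefficients to a coarse grid, and transform back. This is a heavily simplified illustration. It uses one uniform step size (an assumption) where real JPEG uses per-frequency quantization tables, chroma subsampling, and entropy coding, and the block contents are random made-up data.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def quantize_block(block, step):
    """Transform an 8x8 block, round coefficients to multiples of step, transform back."""
    c = dct_matrix(8)
    coeffs = c @ (block - 128.0) @ c.T          # forward 2D DCT of the level-shifted block
    quantized = np.round(coeffs / step) * step  # the lossy step: coarser grid, fewer distinct values
    return c.T @ quantized @ c + 128.0          # inverse 2D DCT

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(np.float64)  # made-up noisy block

# Mean absolute reconstruction error for increasingly coarse steps
errors = {step: np.abs(quantize_block(block, step) - block).mean() for step in (1, 20, 80)}
for step, error in errors.items():
    print(step, round(error, 2))
```

A larger step plays the role of a lower Quality Factor: more coefficients collapse to the same value and the reconstruction drifts further from the original.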
Exercises
Exercise 1
Apply AWGN to an image, then evaluate which of the Gaussian, median, sharpening, and resizing filters best attenuates the noise introduced by the AWGN.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Function to add AWGN to an image
def awgn(img, std, seed):
    mean = 0.0  # Mean of the Gaussian noise
    np.random.seed(seed)
    attacked = img + np.random.normal(mean, std, img.shape)
    attacked = np.clip(attacked, 0, 255)
    return attacked

def blur(img, sigma):
    from scipy.ndimage import gaussian_filter  # scipy.ndimage.filters is a deprecated namespace
    attacked = gaussian_filter(img, sigma)
    return attacked

def sharpening(img, sigma, alpha):
    from scipy.ndimage import gaussian_filter
    filter_blurred_f = gaussian_filter(img, sigma)
    attacked = img + alpha * (img - filter_blurred_f)
    return attacked

def median(img, kernel_size):
    from scipy.signal import medfilt
    attacked = medfilt(img, kernel_size)
    return attacked

def resizing(img, scale):
    from skimage.transform import rescale
    x, y = img.shape
    attacked = rescale(img, scale)
    attacked = rescale(attacked, 1/scale)
    attacked = attacked[:x, :y]
    return attacked
# Load the image
image = cv2.imread('lena_color.jpg', cv2.IMREAD_GRAYSCALE)
# Apply AWGN to the image
noisy_image = awgn(image, 30, 123)
gaussian_filtered = blur(noisy_image, 1)
median_filtered = median(noisy_image, 3)
sharpened_image = sharpening(noisy_image, 1, 0.1)
resized_image = resizing(noisy_image, 0.7)
# Plot the results
plt.figure(figsize=(12, 8))
plt.subplot(2, 3, 1)
plt.imshow(image,cmap='gray')
plt.title("Original")
plt.subplot(2, 3, 2)
plt.imshow(noisy_image,cmap='gray')
plt.title("Noisy (AWGN)")
plt.subplot(2, 3, 3)
plt.imshow(gaussian_filtered,cmap='gray')
plt.title("Gaussian Filter")
plt.subplot(2, 3, 4)
plt.imshow(median_filtered,cmap='gray')
plt.title("Median Filter")
plt.subplot(2, 3, 5)
plt.imshow(sharpened_image,cmap='gray')
plt.title("Sharpening Filter")
plt.subplot(2, 3, 6)
plt.imshow(resized_image,cmap='gray')
plt.title("Resized")
plt.tight_layout()
plt.show()
print('You should play with different values of the parameters used to apply AWGN and the other filters. \nHowever, for this combination the choice is between the Median and the Gaussian filter. \nSpecifically, the Median filter preserves textures much better than the Gaussian one.')
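Visual comparison can be complemented with a number. One common choice, offered here as a suggestion rather than as part of the original exercise, is the peak signal-to-noise ratio (PSNR) between the clean image and each filtered result. The function name psnr and the tiny test arrays are illustrative.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means test is closer to reference."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

clean = np.full((4, 4), 100, dtype=np.uint8)   # made-up reference image
off_by_one = clean + 1                          # every pixel differs by exactly 1

print(psnr(clean, clean))
print(psnr(clean, off_by_one))
```

In the exercise, calling psnr(image, gaussian_filtered), psnr(image, median_filtered), and so on would rank the filters by how close each result is to the original.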
Exercise 2
Convert an RGB image into YCbCr, apply compression and other processing to the individual components, and evaluate which component degrades the resulting RGB image the most.
import os
def jpeg_compression(img, QF):
    from PIL import Image
    img = Image.fromarray(img)
    img.save('tmp.jpg', "JPEG", quality=QF)
    attacked = Image.open('tmp.jpg')
    attacked = np.asarray(attacked, dtype=np.uint8)
    os.remove('tmp.jpg')
    return attacked
# Convert BGR (OpenCV's default) to YCrCb
# Load the image
image = cv2.imread('lena_color.jpg')
# Note: OpenCV orders the channels as Y, Cr, Cb (index 1 is Cr, index 2 is Cb)
ycrcb_image = cv2.cvtColor(image, cv2.COLOR_BGR2YCrCb)
# Compress the individual channels
# Try to set Cb, Cr to 100 and Y to 10.
y_compressed = jpeg_compression(ycrcb_image[:,:,0], 100)
cr_compressed = jpeg_compression(ycrcb_image[:,:,1], 10)
cb_compressed = jpeg_compression(ycrcb_image[:,:,2], 10)
# Merge compressed channels back in OpenCV's Y, Cr, Cb order
ycrcb_compressed = cv2.merge([y_compressed, cr_compressed, cb_compressed])
# Convert back to RGB for comparison
rgb_compressed = cv2.cvtColor(ycrcb_compressed, cv2.COLOR_YCrCb2RGB)
# Plot the original and compressed image
plt.figure(figsize=(12, 6))
plt.subplot(1, 2, 1)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.title("Original RGB Image")
plt.subplot(1, 2, 2)
plt.imshow(rgb_compressed)
plt.title("Compressed RGB Image")
plt.tight_layout()
plt.show()
print('Compressing the Y channel visually degrades the image more \nthan compressing both the Cb and Cr channels with the same QF.')
Exercise 3
Extract the edges from an image and localize an attack to those regions.
def awgn(img, std, seed):
    mean = 0.0  # Mean of the Gaussian noise
    np.random.seed(seed)
    attacked = img + np.random.normal(mean, std, img.shape)
    attacked = np.clip(attacked, 0, 255)
    return attacked

# load the image directly as grayscale for edge detection
gray_image = cv2.imread('lena_grey.bmp', cv2.IMREAD_GRAYSCALE)
# create a copy of this image where we will localize the edges
edges_results = gray_image.copy()
# attack the entire image
attacked = awgn(gray_image, 10, 123)
# create a copy of this image where we will localize the attack
localized_Attack = gray_image.copy()
# Apply Canny edge detection
# edges is 0 if no edge, 255 if edge
edges = cv2.Canny(gray_image, 100, 200)
# localize the AWGN attack on edges
localized_Attack[edges > 0] = attacked[edges > 0]
# plot original image and edges
plt.figure(figsize=(12, 6))
plt.subplot(1, 4, 1)
plt.imshow(gray_image, cmap='gray')
plt.title("Original Image")
plt.subplot(1, 4, 2)
plt.imshow(edges, cmap='gray')
plt.title("Edges Detected")
plt.subplot(1, 4, 3)
plt.imshow(attacked, cmap='gray')
plt.title("Global Attack")
plt.subplot(1, 4, 4)
plt.imshow(localized_Attack, cmap='gray')
plt.title("Localized Attack")
plt.tight_layout()
plt.show()
print('Final check to know we localized the attacks')
print('Values of the pixels before applying AWGN locally to the image: ', gray_image[edges > 0])
print('Values of the attacked pixels after applying AWGN globally to the entire image: ', attacked[edges > 0])
print('Values of the attacked pixels after applying AWGN locally to the image: ',localized_Attack[edges > 0])
print('As you can see, localizing the attack to textured regions improves its invisibility!')