Linear Filters
Images
Denote a image (matrix of integer values) as \(I\)
For gray-scale images, \(I_{m\times n}\); where \(I(i,j)\) is called intensity
For color images, \(I_{m\times n\times \{1,2,3\}}\), 3 for RGB values
Alternatively, think of a grayscale image as a mapping \(I:\mathbb N^2 \rightarrow \{0,1,...,255\}\), i.e. position \((i,j)\rightarrow\) gray-scale, where 0 is black, 255 is white
Example For image \(I(i,j)\); \(I(i,j)+50\) lighten the image, \(I(i,-j)\) rotate the image horizontally
Image filters
Modify the pixels in an image based on some function of a local neighborhood of each pixel
Can be used to enhance (denoise), detect patterns (matching), extract information (texture, edges)
Boundary Effects
Consider the boundary of the image, there are three modes: full, same, valid
Correlation
Given a image \(I\), the filtered image is
The entries of the weight kernel of mask \(F(u,v)\) are often called the filter coefficients
Denote this correlation operator \(F\otimes I\)
OpenCV cv2.filter2d
We can also write correlation in a more compact form using vectors, i.e.
Let \(\mathbf f = F\) be the kernel matrix, \(\mathbf t_{ij} = T_{i,j}(I_{\{i-k, i+k\} \times \{j-k, j+k\}})\) be the matrix of the image centered at \((i,j)\) and of size \(2k + 1\), then \(G(i,j)=\mathbf f\cdot \mathbf t_{ij}\)
And the full image
Example of Filter kernels
Unchanged
\(\begin{bmatrix} 0 &0 &0 \\ 0 &1 &0 \\ 0 &0 &0 \end{bmatrix}\)
shifting to the left by 1px
\(\begin{bmatrix} 0 &0 &0 \\ 0 &0 &1 \\ 0 &0 &0 \end{bmatrix}\)
Sharpening
To enhance the effect, using a larger center value
If we enlarge the matrix to 5
k_sharp_3_2 = - np.ones((3,3), np.float32) / 9
k_sharp_3_2[1,1] = 2
k_sharp_3_2 = k_sharp_3_2
k_sharp_3_5 = - np.ones((3,3), np.float32) *4 / 9
k_sharp_3_5[1,1] = 5
k_sharp_5_5 = - np.ones((5,5), np.float32) *4 / 25
k_sharp_5_5[2,2] = 5
Smoothing
Smoothing 1: Moving Averaging filter
Simplest thing is to replace each pixel by the average of its neighbors, i.e. \(F\) will be a \((2k+1)^{-2} J_{(2k+1)\times (2k+1)}\) matrix
Assumption neighboring pixels are similar, and noise is independent of pixels.
For example, the following uses \(\frac{1}{9}J_{3\times 3}\) and \(\frac{1}{25}J_{5\times 5}\), note that the matrix should add up to one, hence the image is normalized.
k_3 = np.ones((3,3),np.float32) / 9
k_5 = np.ones((5,5),np.float32) / 25
k_11 = np.ones((11,11),np.float32) / 121
Smoothing 2: Gaussian Filter
When we want nearest neighboring pixels to have the most influence on the output, which can removes high-frequency components from the image (low-pass filter).
For example,
The Gaussian kernel is an approximation of a 2d Gaussian function
# simple version of a square Gaussian function
# this filter is equivalent to cv2.GaussianBlur(img, (size, size), sigma)
def Gaussian(size, sigma):
coef = 1 / (2 * np.pi * sigma **2)
gaus = np.fromfunction(lambda x, y: coef * np.e \
** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
return gaus / np.sum(gaus)
A more general form of the Gaussian kernel is obtained by the multi-variante Gaussian distribution \(\mathcal N(\mu, \Sigma)\), where
Properties of Smoothing
- All values are positive
- The kernel sums up to 1 to prevent re-scaling of the image
- Remove high-frequency components (edges); low-pass filter
Filtering image to find image crop (Normalized cross-correlation)
Let \(\mathbf f\) be a image crop, \(\mathbf t\) be the original image, then
is the "normalized score" \((0\sim 1)\), where \(1\) indicates the matching
Consider the following image crop
def findMatch(crop, target):
target = cv2.cvtColor(target, cv2.COLOR_BGR2GRAY).astype(float)
crop = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY).astype(float)
H_crop, W_crop = crop.shape[:2]
ave_kernel = np.ones((H_crop, W_crop)) / (H_crop * W_crop)
kNorm = (np.sum(crop * crop)) ** 0.5
targetNorm = np.sqrt(cv2.filter2D(target * target, -1, ave_kernel) * (H_crop * W_crop))
conv = cv2.filter2D(target, -1, crop / kNorm)
return conv / targetNorm
Convolution
A convolution \(F*I\) is equivalent to flip \(F\) along the diagonal and apply correlation. i.e.
Obviously, for a symmetric filter matrix, convolution and correlation will do the same
Convolution is the natural linear feature, and it is
- commutative \(f*g = g*f\),
- associative \(f*(g*h) = (f*g)*h\),
- distributive \(f*(g+h)=f*g + f*h\)
- associative with scalar \(\lambda (f*g)=(\lambda f)*g\)
- The Fourier transform of two convolved images is the product of their individual Fourier transforms \(\mathcal F(f*g) =\mathcal F(f)\cdot \mathcal F(g)\)
Separable Filters
A convolution filter is separable if it can be written as the outer product of two 1D filters. i.e. \(F = vh^T\), then \(F*I = v*(h*I)\) by associative property.
Example
kernel = np.array([[-1, 0, 1],
[-2, 0, 2],
[-1, 0, 1]])
kernel_h = np.array([[-1, 0, 1]])
kernel_v = np.array([[1, 2, 1]]).T
# note that images are in uint8
# and the 1D filters are not normalized
img_gray = img_gray.astype(float)
img_k = cv2.filter2D(img_gray, -1, kernel)
img_k1 = cv2.filter2D(img_gray, -1, kernel_h)
img_k2 = cv2.filter2D(img_k1, -1, kernel_v)
print(np.max(np.abs(img_k - img_k2)))
#>> 0.0
How to tell is separable
Quickcheck it has rank 1 (otherwise it cannot be written as 2 1D array)
Singular value decomposition (SVD) decompose by \(F = \mathbf U \Sigma \mathbf V^T = \sum_i \sigma_i u_i v_i^T\) with \(\Sigma = diag(\sigma_i)\) if only one singular value \(\sigma_i\) is non-zero, then it is separable and \(\sqrt{\sigma_1}\mathbf u, \sqrt{\sigma_1} \mathbf v_1^T\) are the 1D filters
Source code
import cv2
import numpy as np
from scipy.signal import convolve2d
# load as gray scale
img_gray = cv2.imread("../assets/yurina.jpg", cv2.IMREAD_GRAYSCALE)
# load as colored
img = cv2.imread("../assets/yurina.jpg")
# --8<-- [start:shift]
# shift down by 25px
k_shifting = np.zeros((50, 50))
k_shifting[0, 24] = 1
# --8<-- [end:shift]
img_shifting = cv2.filter2D(img, -1, k_shifting, borderType=cv2.BORDER_CONSTANT)
cv2.imwrite("../assets/filter_shift.jpg", np.hstack((img, img_shifting)))
# --8<-- [start:sharpen]
k_sharp_3_2 = - np.ones((3,3), np.float32) / 9
k_sharp_3_2[1,1] = 2
k_sharp_3_2 = k_sharp_3_2
k_sharp_3_5 = - np.ones((3,3), np.float32) *4 / 9
k_sharp_3_5[1,1] = 5
k_sharp_5_5 = - np.ones((5,5), np.float32) *4 / 25
k_sharp_5_5[2,2] = 5
# --8<-- [end:sharpen]
img_sharp_3_2 = cv2.filter2D(img, -1, k_sharp_3_2)
img_sharp_3_5 = cv2.filter2D(img, -1, k_sharp_3_5)
img_sharp_5_5 = cv2.filter2D(img, -1, k_sharp_5_5)
cv2.imwrite("../assets/filter_sharpen.jpg", np.hstack((img_sharp_3_2, img_sharp_3_5, img_sharp_5_5)))
# --8<-- [start:average]
k_3 = np.ones((3,3),np.float32) / 9
k_5 = np.ones((5,5),np.float32) / 25
k_11 = np.ones((11,11),np.float32) / 121
# --8<-- [end:average]
img_3 = cv2.filter2D(img,-1,k_3)
img_5 = cv2.filter2D(img,-1,k_5)
img_11 = cv2.filter2D(img,-1,k_11)
cv2.imwrite("../assets/filter_ave.jpg", np.hstack((img_3, img_5, img_11)))
# --8<-- [start:gauss]
# simple version of a square Gaussian function
# this filter is equivalent to cv2.GaussianBlur(img, (size, size), sigma)
def Gaussian(size, sigma):
coef = 1 / (2 * np.pi * sigma **2)
gaus = np.fromfunction(lambda x, y: coef * np.e \
** ((-1*((x-(size-1)/2)**2+(y-(size-1)/2)**2))/(2*sigma**2)), (size, size))
return gaus / np.sum(gaus)
# --8<-- [end:gauss]
smoothed = []
for size in [5, 10, 15]:
row = []
for sigma in [1, 3, 5]:
title = f"fsize={size}, sigma={sigma}"
temp = cv2.filter2D(img, -1, Gaussian(size, sigma))
temp = cv2.putText(temp, title, (10, 10),cv2.FONT_HERSHEY_SIMPLEX, .5, (255,255,255), 1)
row.append(temp)
smoothed.append(np.hstack(row))
cv2.imwrite("../assets/filter_gaussian.jpg", np.vstack(smoothed))
# --8<-- [start:match]
def findMatch(crop, target):
target = cv2.cvtColor(target, cv2.COLOR_BGR2GRAY).astype(float)
crop = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY).astype(float)
H_crop, W_crop = crop.shape[:2]
ave_kernel = np.ones((H_crop, W_crop)) / (H_crop * W_crop)
kNorm = (np.sum(crop * crop)) ** 0.5
targetNorm = np.sqrt(cv2.filter2D(target * target, -1, ave_kernel) * (H_crop * W_crop))
conv = cv2.filter2D(target, -1, crop / kNorm)
return conv / targetNorm
# --8<-- [end:match]
img_crop = img.copy()
cv2.rectangle(img_crop, (15, 35), (65, 85), 255)
crop = img[35: 85, 15: 65]
matched = findMatch(crop, img) * 255
cv2.rectangle(matched, (15, 35), (65, 85), 255)
cv2.imwrite(
"../assets/filter_match.jpg",
np.hstack((img_crop, np.tile(matched[:, :, None], 3)))
)
# --8<-- [start:sep]
kernel = np.array([[-1, 0, 1],
[-2, 0, 2],
[-1, 0, 1]])
kernel_h = np.array([[-1, 0, 1]])
kernel_v = np.array([[1, 2, 1]]).T
# note that images are in uint8
# and the 1D filters are not normalized
img_gray = img_gray.astype(float)
img_k = cv2.filter2D(img_gray, -1, kernel)
img_k1 = cv2.filter2D(img_gray, -1, kernel_h)
img_k2 = cv2.filter2D(img_k1, -1, kernel_v)
print(np.max(np.abs(img_k - img_k2)))
#>> 0.0
# --8<-- [end:sep]
cv2.putText(img_k, f"Sobel_x 3*3",
(20, 200),cv2.FONT_HERSHEY_SIMPLEX, .5, (255,255,255), 1)
cv2.putText(img_k1, f"h",
(20, 200),cv2.FONT_HERSHEY_SIMPLEX, .5, (255,255,255), 1)
cv2.putText(img_k2, f"v * h",
(20, 200),cv2.FONT_HERSHEY_SIMPLEX, .5, (255,255,255), 1)
cv2.imwrite("../assets/filter_sep.jpg", np.hstack((img_k, img_k1, img_k2)))