GrabCut: object extraction by separating the foreground and background of an image
GrabCut is an algorithm for extracting the foreground of an image from its background.
OpenCV implements this algorithm and exposes it through its Python bindings,
which we can use for our purpose.
GrabCut in OpenCV
OpenCV provides GrabCut as the function cv2.grabCut(). The function takes seven arguments (the last one, mode, is optional) and returns three values.
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, iterCount[, mode])
Parameters:
img: the input image (8-bit, 3-channel).
mask: input/output 8-bit single-channel mask. When mode is set to cv2.GC_INIT_WITH_RECT, the mask is initialized automatically.
rect: coordinates of a rectangle (bounding box) that contains the foreground object, in the format (x, y, w, h). This parameter is only used when mode is cv2.GC_INIT_WITH_RECT.
bgdModel: temporary array that the algorithm uses internally for the background model.
fgdModel: temporary array that the algorithm uses internally for the foreground model.
iterCount: number of iterations the algorithm should run. Note that the result can be refined with further calls using mode=cv2.GC_INIT_WITH_MASK or mode=cv2.GC_EVAL.
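mode: either cv2.GC_INIT_WITH_RECT or cv2.GC_INIT_WITH_MASK, depending on whether we are initializing GrabCut with a bounding box or a mask, respectively. Two further modes continue from an already initialized state: cv2.GC_EVAL resumes the algorithm, and cv2.GC_EVAL_FREEZE_MODEL runs a single iteration with the models kept fixed.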
Returns:
mask: The output mask after applying GrabCut
bgdModel: The temporary array used to model the background (we can ignore this value)
fgdModel: The temporary array for the foreground (we can ignore this value)
Note: grabCut() modifies mask, bgdModel, and fgdModel in place; the input image itself is not changed. If we need the original mask later, we have to keep a copy of it
in a separate variable.
GrabCut can be initialized in two ways:
- One is with a bounding box: the object we want to extract (segment) lies inside that bounding box.
- The other is with a mask that approximates the segmentation.
Let’s explore these two varieties of GrabCut.
GrabCut with bounding box initialization
The bounding box can be defined manually or produced by an object detection approach such as a Haar cascade, R-CNN, or SSD.
We will manually define the bounding box.
Import necessary packages
import numpy as np
import cv2
import matplotlib.pyplot as plt
Load image and visualize
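Read the image from disk with OpenCV and visualize it with matplotlib.
Note: OpenCV reads images in BGR order, while matplotlib expects RGB, so we convert the image before plotting.
# Load image (OpenCV returns the channels in BGR order)
file_name = 'sample_image.jpg'
image = cv2.imread(file_name)
# Visualize (convert BGR to RGB so matplotlib shows the right colors)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()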
Photo by Anoir Chafik on Unsplash
Initialize image mask
The mask is a single-channel image in which we mark each pixel as background, foreground, probable background, or probable foreground. This is done with the flags cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, and cv2.GC_PR_FGD, or simply by writing the values 0, 1, 2, 3 into the mask.
mask = np.zeros(image.shape[:2], np.uint8)
This creates a NumPy array of zeros with the same height and width as the image, but single-channel and of type 8-bit unsigned integer.
Defining bounding box
A tuple named bounding_box holds four elements: the x and y coordinates of the top-left corner of the bounding box, followed by its width and height.
bounding_box = (1200,300, 500,1100)
Our object lies within the bounding box. Everything outside the box is treated as definite background, while the pixels inside it are treated as probable foreground or probable background.
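To make the (x, y, w, h) convention concrete, here is a NumPy-only sketch of the starting labels that a rectangle implies. It is an illustration, not OpenCV's own code. The helper names and the image size of 1500 x 2000 are assumptions, since the size of the sample image is not stated here.

```python
import numpy as np

GC_BGD, GC_PR_FGD = 0, 3  # same values as cv2.GC_BGD and cv2.GC_PR_FGD

def rect_fits(shape, rect):
    """Return True if the (x, y, w, h) rectangle lies fully inside the image."""
    height, width = shape[:2]
    x, y, w, h = rect
    return (x >= 0 and y >= 0 and w > 0 and h > 0
            and x + w <= width and y + h <= height)

def rect_to_initial_mask(shape, rect):
    """Illustrate the starting labels implied by a rectangle."""
    x, y, w, h = rect
    mask = np.full(shape[:2], GC_BGD, dtype=np.uint8)
    mask[y:y + h, x:x + w] = GC_PR_FGD  # rows follow y, columns follow x
    return mask

bounding_box = (1200, 300, 500, 1100)
image_shape = (1500, 2000, 3)  # hypothetical (height, width, channels)
initial = rect_to_initial_mask(image_shape, bounding_box)
print(rect_fits(image_shape, bounding_box))
print(int((initial == GC_PR_FGD).sum()))  # 500 * 1100 pixels inside the box
```

Checking the rectangle against the image shape first is a cheap way to catch swapped coordinates, because NumPy indexes rows (y) before columns (x).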
Initialize the bgdModel and fgdModel
Two NumPy arrays of shape (1, 65), initialized with zeros, with elements of type float64.
fgd_model = np.zeros((1, 65), np.float64)
bgd_model = np.zeros((1, 65), np.float64)
GrabCut needs two empty arrays to use internally when segmenting the foreground from the background.
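The length 65 is not arbitrary. As far as I know from OpenCV's GrabCut implementation, each model is a Gaussian mixture with 5 components over the 3 color channels, and each component stores a weight, a mean, and a covariance matrix. The sketch below only reproduces that arithmetic; treat the breakdown as an assumption about OpenCV internals.

```python
import numpy as np

# Assumed layout of each model: a 5-component Gaussian mixture
# over 3 color channels.
components = 5
channels = 3
per_component = 1 + channels + channels * channels  # weight + mean + covariance
model_length = components * per_component

bgd_model = np.zeros((1, model_length), np.float64)
fgd_model = np.zeros((1, model_length), np.float64)
print(per_component, model_length, bgd_model.shape)
```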
Initialize number of iterations
A variable named num_iterations tells GrabCut how many iterations to run while segmenting and extracting our foreground.
num_iterations = 5
Apply grabCut()
Now run the GrabCut algorithm by calling the function cv2.grabCut() with mode cv2.GC_INIT_WITH_RECT.
cv2.grabCut(img=image, mask=mask, rect=bounding_box,
fgdModel=fgd_model, bgdModel=bgd_model,
iterCount=num_iterations, mode=cv2.GC_INIT_WITH_RECT
)
The call to grabCut() modifies the mask image in place. In the updated mask, each pixel is marked with one of the four flags described above, denoting background or foreground.
Extract the segmented image
In the NumPy array mask, replace all 0 and 2 values with 0 (background pixels) and all 1 and 3 values with 1 (foreground pixels), and store the result in another variable, modified_mask. Then multiply the image by modified_mask to get the segmented image.
This works because the mask image encodes 0 as background, 1 as foreground, 2 as probable background, and 3 as probable foreground.
# Modify the mask: replace values 0 and 2 with 0, and values 1 and 3 with 1
modified_mask = np.where((mask==0) | (mask==2), 0, 1).astype('uint8')
# Getting the segmented image
seg_image = image*modified_mask[:,:,np.newaxis]
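The same two steps can be followed on a tiny hand-made example. The 2x3 mask and the constant-color image below are hypothetical values chosen so the result is easy to check by eye. The np.isin variant is an alternative formulation that is not used in this article.

```python
import numpy as np

# Hypothetical 2x3 GrabCut output mask that uses all four labels.
mask = np.array([[0, 1, 2],
                 [3, 3, 0]], dtype=np.uint8)

# Labels 1 (foreground) and 3 (probable foreground) become 1, the rest 0.
modified_mask = np.where((mask == 0) | (mask == 2), 0, 1).astype('uint8')

# Equivalent formulation that names the labels to keep.
alt_mask = np.isin(mask, (1, 3)).astype('uint8')

# A tiny 2x3 BGR "image" in which every pixel is (10, 20, 30).
image = np.full((2, 3, 3), (10, 20, 30), dtype=np.uint8)

# np.newaxis turns the (2, 3) mask into (2, 3, 1), so it broadcasts
# across the three color channels.
seg_image = image * modified_mask[:, :, np.newaxis]
print(modified_mask)
print(seg_image[:, :, 0])
```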
Visualize the extracted result
Plot the segmented image with the imshow method from matplotlib.pyplot to visualize the result.
# Plot the segmented image (convert BGR to RGB for matplotlib)
plt.imshow(cv2.cvtColor(seg_image, cv2.COLOR_BGR2RGB))
plt.show()
Here is our segmented image with GrabCut,
The complete code
GrabCut with mask initialization
In this approach, we provide an approximate segmentation of the object in the image. GrabCut refines that segmentation and extracts the foreground that contains the object.
We can create a mask with basic image processing (thresholding, edge detection, contour filtering, etc.). Deep learning models can also be used to create a
mask (e.g., Mask R-CNN and U-Net). We can also create a mask manually with photo editing software.
Most of the steps are the same as before.
So here is our source image or target image.
Photo by Nico Meier on Unsplash
And here is the image which we will use as a mask.
So let's see how GrabCut improves the image segmentation.
Import necessary packages
import numpy as np
import cv2
import matplotlib.pyplot as plt
Load image and visualize
In this step, we will read two images. One is the target image to which we will apply GrabCut, and the other contains our mask.
# Load image
target_file = 'sample_image.jpg'
mask_file = 'mask_image.jpg'
image = cv2.imread(target_file)
mask = cv2.imread(mask_file, cv2.IMREAD_GRAYSCALE) # read as grayscale
We can see how well our mask segments the object by applying a bitwise AND operation, which blacks out every pixel where the mask is zero. Plot the resulting image to visualize the approximate segmentation of our foreground.
masked_img = cv2.bitwise_and(image, image, mask=mask)
# Visualize (convert BGR to RGB for matplotlib)
plt.imshow(cv2.cvtColor(masked_img, cv2.COLOR_BGR2RGB))
plt.show()
The figure above shows the approximate segmentation of our foreground.
Modify the mask and initialize the bgdModel and fgdModel
fgd_model = np.zeros((1, 65), np.float64)
bgd_model = np.zeros((1, 65), np.float64)
mask = np.where(mask>0, 3, 0).astype(np.uint8)
We need to set each mask pixel to either probable foreground (cv2.GC_PR_FGD = 3) or background (cv2.GC_BGD = 0). So we set a pixel to 3 if its value is greater than 0. That means every non-zero pixel is going to be treated as probable foreground.
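The conversion can be wrapped in a small helper, sketched below with NumPy only. The helper name to_grabcut_mask and the optional threshold are assumptions added for illustration. A threshold above 0 may help when the mask is stored as a JPEG, where compression can leave faint non-zero values in areas meant to be black. The check reflects that GrabCut needs both background and foreground samples to build its two models.

```python
import numpy as np

GC_BGD, GC_PR_FGD = 0, 3  # same values as cv2.GC_BGD and cv2.GC_PR_FGD

def to_grabcut_mask(gray_mask, threshold=0):
    """Map a grayscale mask to GrabCut labels: above threshold -> 3, else 0."""
    labels = np.where(gray_mask > threshold, GC_PR_FGD, GC_BGD).astype(np.uint8)
    # Both label groups must be present for the two color models.
    if not (labels == GC_BGD).any() or not (labels == GC_PR_FGD).any():
        raise ValueError("mask must contain both background and foreground pixels")
    return labels

# Hypothetical mask with faint values (3 and 12) next to strong ones.
gray = np.array([[0, 3, 250],
                 [12, 255, 240]], dtype=np.uint8)
loose = to_grabcut_mask(gray)                  # threshold 0, as in this article
strict = to_grabcut_mask(gray, threshold=127)  # ignores the faint values
print(loose)
print(strict)
```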
Apply grabCut()
Now run the GrabCut algorithm by calling the function cv2.grabCut() with mode=cv2.GC_INIT_WITH_MASK. We pass rect=None because the rectangle is not used in this mode.
# Apply grab cut with mask initialization
cv2.grabCut(img=image, mask=mask, rect=None,
fgdModel=fgd_model, bgdModel=bgd_model,
iterCount=5, mode=cv2.GC_INIT_WITH_MASK
)
Extract and visualize the segmented foreground
# Extract segmented image
modified_mask = np.where((mask==0) | (mask==2), 0, 1).astype('uint8')
seg_image = image * modified_mask[:,:,np.newaxis]
# Visualize (convert BGR to RGB for matplotlib)
plt.imshow(cv2.cvtColor(seg_image, cv2.COLOR_BGR2RGB))
plt.show()
So we can see how the segmentation is improved.