GrabCut: object extraction by separating the foreground and background of an image

GrabCut is an algorithm for extracting the foreground from an image.
OpenCV provides an implementation of this algorithm that is available from Python, which we can use for our purpose.

GrabCut in OpenCV

OpenCV provides GrabCut as the function cv2.grabCut(). The function takes seven arguments and returns three values.


	cv2.grabCut(img, mask, rect, bgdModel, fgdModel, iterCount[, mode])

Parameters:

img: the input image (8-bit, 3-channel).

mask: input/output 8-bit single-channel mask. When we set parameter mode to cv2.GC_INIT_WITH_RECT, the mask will be initialized automatically.

rect: coordinates of a rectangle (bounding box) that contains the foreground object, in the format (x, y, w, h). This parameter is only used when mode is cv2.GC_INIT_WITH_RECT.

bgdModel: temporary array for the background model, used internally by the algorithm.

fgdModel: temporary array for the foreground model, used internally by the algorithm.

iterCount: number of iterations the algorithm should run. Note that the result can be refined with further calls with mode=cv2.GC_INIT_WITH_MASK or mode=cv2.GC_EVAL.

mode: the operation mode. Use cv2.GC_INIT_WITH_RECT or cv2.GC_INIT_WITH_MASK to initialize GrabCut with a bounding box or a mask, respectively. There are two further modes: cv2.GC_EVAL resumes the algorithm from its current state, and cv2.GC_EVAL_FREEZE_MODEL runs it with the models held fixed.

Returns:

mask: The output mask after applying GrabCut

bgdModel: The temporary array used to model the background (we can ignore this value)

fgdModel: The temporary array for the foreground (we can ignore this value)

Note: grabCut() updates mask, bgdModel, and fgdModel in place; the input image itself is only read. If we need the original mask afterwards, we have to keep a copy of it in a separate variable.

So there are two main ways to initialize the GrabCut algorithm.

One is bounding box initialization: the object we want to extract lies inside the bounding box.
The other is mask initialization: we supply a mask that approximates the segmentation.

Let’s explore these two varieties of GrabCut.

GrabCut with bounding box initialization

The bounding box can be defined manually or generated automatically with an object detector such as a Haar cascade, R-CNN, or SSD.

We will manually define the bounding box.
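If the box comes from a detector instead, it may be given as two corners (x1, y1, x2, y2) rather than the (x, y, w, h) tuple that grabCut() expects. The helper below is a hypothetical sketch of that conversion. Its name, the corner format, and the 1920x1080 image size are assumptions for illustration, not part of OpenCV.

```python
def corners_to_rect(x1, y1, x2, y2, image_width, image_height):
    """Convert corner coordinates to an (x, y, w, h) tuple clamped to the image."""
    x1, x2 = sorted((x1, x2))
    y1, y2 = sorted((y1, y2))
    # Clamp the top-left corner inside the image.
    x1 = max(0, min(int(x1), image_width - 1))
    y1 = max(0, min(int(y1), image_height - 1))
    # Clamp the bottom-right corner and keep the box at least 1 pixel wide/tall.
    x2 = max(x1 + 1, min(int(x2), image_width))
    y2 = max(y1 + 1, min(int(y2), image_height))
    return (x1, y1, x2 - x1, y2 - y1)


# A box that sticks out below a hypothetical 1920x1080 image gets trimmed.
print(corners_to_rect(1200, 300, 1700, 1400, 1920, 1080))  # (1200, 300, 500, 780)
```

Clamping matters because a rectangle that extends past the image border does not describe valid pixels.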

Import necessary packages


      import numpy as np
      import cv2
      import matplotlib.pyplot as plt

Load image and visualize

Read the image from the directory with OpenCV and visualize it by plotting with matplotlib.

Note: OpenCV reads images in BGR order, while matplotlib expects RGB.


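      # Load image (cv2.imread returns None if the file cannot be read)
      file_name = 'sample_image.jpg'
      image = cv2.imread(file_name)
      
      # Visualize (convert BGR to RGB so matplotlib shows the right colors)
      plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
      plt.show()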

Photo by Anoir Chafik on Unsplash
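The BGR note above matters whenever an OpenCV image is handed to matplotlib, which interprets a three-channel array as RGB. As a small sketch of what the channel swap does, the two-pixel array below (made up for illustration) is reordered with plain NumPy slicing.

```python
import numpy as np

# A 1x2 "image" in OpenCV's BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels, giving RGB order.
rgb = bgr[:, :, ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255] -> blue, now written in RGB order
print(rgb[0, 1].tolist())  # [255, 0, 0] -> red, now written in RGB order
```

Without this swap, red and blue trade places in the plot.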

Initialize image mask

The mask is a single-channel image in which we specify which areas are background, foreground, probable background, or probable foreground. This is done with the flags cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, and cv2.GC_PR_FGD, or simply by writing the values 0, 1, 2, and 3 into the mask.


		mask = np.zeros(image.shape[:2], np.uint8)

The mask is a NumPy array initialized with zeros. It has the same height and width as the image, but a single channel of 8-bit unsigned integers.
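To show how the four label values can sit in one mask, here is a small sketch that fills a mask by hand. The 6x8 size and the labelled regions are hypothetical. The constants are defined locally with the values listed above, so the snippet needs only NumPy.

```python
import numpy as np

# Label values as listed above; the names mirror OpenCV's constants.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

# Hypothetical 6x8 image size, chosen only to keep the output readable.
mask = np.full((6, 8), GC_PR_BGD, dtype=np.uint8)  # start as "probably background"
mask[1:5, 2:6] = GC_PR_FGD   # region that probably contains the object
mask[2:4, 3:5] = GC_FGD      # pixels we are sure belong to the object
mask[0, :] = GC_BGD          # top row is certainly background

print(mask)
```

A hand-labelled mask like this is the kind of input that mask initialization works with. For bounding box initialization, the all-zeros array is enough.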

Defining bounding box

A tuple named bounding_box holds four elements: the x and y coordinates of the box's top-left corner, followed by its width and height.


		bounding_box = (1200,300, 500,1100)

Our object lies within the bounding box. Everything outside the box is treated as definite background, while the inside of the box contains a mix of probable foreground and probable background pixels.
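The starting point described here can be sketched in NumPy: label everything outside the rectangle 0 (background) and everything inside 3 (probable foreground), which GrabCut then refines. This is an illustration of the idea under that assumption, not OpenCV's own code. The function name and the 8x10 size are made up.

```python
import numpy as np

def initial_mask_from_rect(image_shape, rect):
    """Mimic the starting labels for a rectangle: 0 outside, 3 inside."""
    x, y, w, h = rect
    mask = np.zeros(image_shape[:2], dtype=np.uint8)  # 0 = definite background
    mask[y:y + h, x:x + w] = 3                        # 3 = probable foreground
    return mask


# Hypothetical 8x10 image with a 5-wide, 4-tall box starting at x=2, y=1.
mask = initial_mask_from_rect((8, 10, 3), (2, 1, 5, 4))
print(mask)
```

Note that rows are indexed by y and columns by x, so the slice order is the reverse of the (x, y, w, h) tuple.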

Initialize the bgdModel and fgdModel

Two NumPy arrays of shape (1, 65), initialized with zeros, with element type float64.


    fgd_model = np.zeros((1, 65), np.float64)
    bgd_model = np.zeros((1, 65), np.float64)

GrabCut needs two empty arrays to use internally when segmenting the foreground from the background.
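The shape (1, 65) looks arbitrary but has a plausible explanation. As far as we understand OpenCV's implementation, each model is a Gaussian mixture with 5 components, and each component stores a weight, a 3-value mean color, and a 3x3 covariance matrix. Treat that breakdown as an assumption about the library's internals. The sketch below only checks the arithmetic.

```python
import numpy as np

N_COMPONENTS = 5                      # assumed number of Gaussians per model
PARAMS_PER_COMPONENT = 1 + 3 + 3 * 3  # weight + mean color + 3x3 covariance

model_size = N_COMPONENTS * PARAMS_PER_COMPONENT
print(model_size)  # 65

bgd_model = np.zeros((1, model_size), np.float64)
fgd_model = np.zeros((1, model_size), np.float64)
```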

Initialize number of iterations

A variable named num_iterations tells GrabCut how many iterations to run when segmenting and extracting the foreground.

		num_iterations = 5

Apply grabCut()

Now run the GrabCut algorithm by calling the function cv2.grabCut() with mode=cv2.GC_INIT_WITH_RECT.


    cv2.grabCut(img=image, mask=mask, rect=bounding_box,
                fgdModel=fgd_model, bgdModel=bgd_model,
                iterCount=num_iterations, mode=cv2.GC_INIT_WITH_RECT)

Calling grabCut() modifies the mask in place. In the updated mask, each pixel is marked with one of the four flags described above, denoting background or foreground.

Extract the segmented image

The mask contains four values: 0 (background), 1 (foreground), 2 (probable background), and 3 (probable foreground).
In the NumPy array mask, replace all 0 and 2 values with 0 (background pixel) and all 1 and 3 values with 1 (foreground pixel), and store the result in another variable, modified_mask. Then multiply the image by modified_mask to get the segmented image.


    # Modify the mask: values 0 and 2 become 0, values 1 and 3 become 1
    modified_mask = np.where((mask==0) | (mask==2), 0, 1).astype('uint8')
    
    # Getting the segmented image
    seg_image = image*modified_mask[:,:,np.newaxis]
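To see what these two lines do on actual numbers, here is a tiny sketch with a hand-written 2x2 mask and a uniform image. Both are made up for illustration. It uses np.isin as an equivalent way to pick out labels 1 and 3.

```python
import numpy as np

# Tiny hand-written example: a 2x2 label mask and a 2x2 three-channel image.
mask = np.array([[0, 1],
                 [2, 3]], dtype=np.uint8)
image = np.full((2, 2, 3), 200, dtype=np.uint8)

# Labels 1 and 3 (foreground, probable foreground) become 1; 0 and 2 become 0.
binary_mask = np.isin(mask, (1, 3)).astype(np.uint8)

# The new axis turns shape (2, 2) into (2, 2, 1) so it broadcasts over channels.
segmented = image * binary_mask[:, :, np.newaxis]

print(binary_mask.tolist())          # [[0, 1], [0, 1]]
print(segmented[:, :, 0].tolist())   # [[0, 200], [0, 200]]
```

Multiplying by 0 blacks out background pixels, and multiplying by 1 leaves foreground pixels unchanged.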

Visualize the extracted result

Plot the segmented image with the imshow method from matplotlib.pyplot to visualize the result.


    # Plot the segmented image (converted from BGR to RGB for matplotlib)
    plt.imshow(cv2.cvtColor(seg_image, cv2.COLOR_BGR2RGB))
    plt.show()

Here is our segmented image with GrabCut.

The complete code

	import numpy as np
	import cv2
	import matplotlib.pyplot as plt

	file_name = "sample_image.jpg"

	# Read the image. Note: OpenCV reads images in BGR order, not RGB.
	image = cv2.imread(file_name)
	print(image.shape)
	# plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
	# plt.show()

	# Initialize the output mask: same height and width as the image, single channel, 8-bit unsigned integers
	mask = np.zeros(image.shape[:2], np.uint8)

	# Define the bounding box that contains our object
	bounding_box = (300,30, 700,400)

	# Initialize bgdModel and fgdModel, which GrabCut uses internally
	fgd_model = np.zeros((1, 65), np.float64)
	bgd_model = np.zeros((1, 65), np.float64)

	# Number of iterations: how many times the algorithm will run to segment our object
	num_iterations = 5

	# Apply GrabCut with 5 iterations and set mode to GC_INIT_WITH_RECT
	cv2.grabCut(img=image, mask=mask, rect=bounding_box,
	            fgdModel=fgd_model, bgdModel=bgd_model,
	            iterCount=num_iterations, mode=cv2.GC_INIT_WITH_RECT)

	# Extract the segmented image: mask values 0 and 2 become 0, values 1 and 3 become 1
	modified_mask = np.where((mask==0) | (mask==2), 0, 1).astype('uint8')

	# plot the masking image to visualize
	# plt.imshow(modified_mask)
	# plt.show()

	# Getting the segmented image
	seg_image = image*modified_mask[:,:,np.newaxis]

	# Plot the segmented image (converted from BGR to RGB for matplotlib)
	plt.imshow(cv2.cvtColor(seg_image, cv2.COLOR_BGR2RGB))
	plt.show()


GrabCut with mask initialization

In this approach, we provide an approximate segmentation of the object in the image. GrabCut refines that segmentation and extracts the foreground containing the object from the image.

We can create a mask with basic image processing (thresholding, edge detection, contour filtering, etc.). Deep learning models (e.g., Mask R-CNN and U-Net) can also be used to create a mask, or we can create one manually with photo editing software.
Most of the steps are the same as before.

Here is our source (target) image.

Photo by Nico Meier on Unsplash

And here is the image which we will use as a mask.

So let's see how GrabCut improves the image segmentation.

Import necessary packages

        import numpy as np
        import cv2
        import matplotlib.pyplot as plt

Load image and visualize

In this step, we will read two images. One is the target image to which we will apply GrabCut, and the other contains our mask.


        # Load image
        target_file = 'sample_image.jpg'
        mask_file = 'mask_image.jpg'

        image = cv2.imread(target_file)
        mask = cv2.imread(mask_file, cv2.IMREAD_GRAYSCALE)  # read as grayscale

We can see how well our mask segments the object by applying a bitwise AND operation between the image and the mask. Plot the resulting image to visualize the approximate segmentation of our foreground.


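        masked_img = cv2.bitwise_and(image, image, mask=mask)
        
        # Visualize (convert BGR to RGB so matplotlib shows the right colors)
        plt.imshow(cv2.cvtColor(masked_img, cv2.COLOR_BGR2RGB))
        plt.show()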

The figure above shows the approximate segmentation of our foreground.
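When used this way, the bitwise AND with a mask keeps every pixel where the mask is non-zero and blacks out the rest. The NumPy sketch below reproduces that effect on a made-up 2x3 image. The helper name apply_mask is hypothetical, and the snippet is meant as an illustration of the idea rather than a drop-in replacement for cv2.bitwise_and.

```python
import numpy as np

def apply_mask(image, mask):
    """Keep pixels where mask is non-zero and set the rest to black."""
    keep = mask[:, :, np.newaxis] > 0  # shape (H, W, 1), broadcasts over channels
    return np.where(keep, image, 0).astype(image.dtype)


# Made-up data: a uniform gray image and a mask covering its right side.
image = np.full((2, 3, 3), 90, dtype=np.uint8)
mask = np.array([[0, 255, 255],
                 [0,   0, 255]], dtype=np.uint8)

print(apply_mask(image, mask)[:, :, 0].tolist())  # [[0, 90, 90], [0, 0, 90]]
```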

Modify the mask and initialize the bgdModel and fgdModel


        fgd_model = np.zeros((1, 65), np.float64)
        bgd_model = np.zeros((1, 65), np.float64)
        
        mask = np.where(mask>0, 3, 0).astype(np.uint8)

We need to set each mask pixel to either probable foreground (cv2.GC_PR_FGD = 3) or background (cv2.GC_BGD = 0). So we set the pixel value to 3 wherever the original value is greater than 0. That means every non-zero pixel is treated as probable foreground.
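One caveat with the greater-than-zero rule is that a mask saved as JPEG is lossy, so areas that were pure black can come back with small non-zero values. A mid-range cutoff is a common way around that. The sketch below compares the two rules on made-up pixel values. The threshold of 127 and the helper name are assumptions, not something the original code uses.

```python
import numpy as np

GC_BGD, GC_PR_FGD = 0, 3  # label values, as listed earlier

def labels_from_gray_mask(gray_mask, threshold=127):
    """Map a grayscale mask to GrabCut labels using a mid-range cutoff."""
    return np.where(gray_mask > threshold, GC_PR_FGD, GC_BGD).astype(np.uint8)


# Hypothetical pixel values: 4 and 9 stand in for JPEG noise in a black area.
gray_mask = np.array([[0, 4, 250],
                      [9, 255, 248]], dtype=np.uint8)

print(labels_from_gray_mask(gray_mask).tolist())  # [[0, 0, 3], [0, 3, 3]]
print(np.where(gray_mask > 0, 3, 0).tolist())     # [[0, 3, 3], [3, 3, 3]]
```

With the cutoff, the noisy near-black pixels stay background instead of being pulled into the probable foreground. Saving the mask in a lossless format such as PNG avoids the issue altogether.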

Apply grabCut()

Now run the GrabCut algorithm by calling cv2.grabCut() with mode=cv2.GC_INIT_WITH_MASK. We pass rect=None because the rectangle is not used in this mode.


        # Apply grab cut with mask initialization
        cv2.grabCut(img=image, mask=mask, rect=None,
                    fgdModel=fgd_model, bgdModel=bgd_model,
                    iterCount=5, mode=cv2.GC_INIT_WITH_MASK
                   )

Extract and visualize the segmented foreground


    # Extract segmented image
    modified_mask = np.where((mask==0) | (mask==2), 0, 1).astype('uint8')
    seg_image = image * modified_mask[:,:,np.newaxis]

    # Visualize (convert BGR to RGB so matplotlib shows the right colors)
    plt.imshow(cv2.cvtColor(seg_image, cv2.COLOR_BGR2RGB))
    plt.show()

So we can see how the segmentation is improved.
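Besides looking at the picture, we can put a number on how much GrabCut changed the approximate mask. The sketch below uses intersection over union (IoU), a measure we are bringing in for illustration and that the original code does not compute. The two 3x4 masks are made up. Note that this measures the change between the two masks, not accuracy against a ground truth.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over union of two binary (0/1) masks."""
    a = mask_a.astype(bool)
    b = mask_b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # both masks empty: treat them as identical
    return float(np.logical_and(a, b).sum() / union)


# Hypothetical 3x4 masks standing in for the initial and refined results.
initial = np.array([[1, 1, 1, 0],
                    [1, 1, 1, 0],
                    [0, 0, 0, 0]], dtype=np.uint8)
refined = np.array([[0, 1, 1, 0],
                    [0, 1, 1, 1],
                    [0, 0, 0, 0]], dtype=np.uint8)

changed = int(np.count_nonzero(initial != refined))
print(iou(initial, refined))  # 4 shared pixels out of 7 in the union
print(changed)                # 3
```

In the real pipeline, `initial` would be the binarized input mask and `refined` would be modified_mask after the grabCut() call.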

Full Implementation

	# import necessary packages
	import numpy as np
	import cv2
	import matplotlib.pyplot as plt

	# Load image
	target_file = 'sample_image.jpg'
	mask_file = 'mask_image.jpg'

	image = cv2.imread(target_file)
	mask = cv2.imread(mask_file, cv2.IMREAD_GRAYSCALE)  # read as grayscale

	# approximate segmentation for foreground
	masked_img = cv2.bitwise_and(image, image, mask= mask)

	# visualize image (convert BGR to RGB so matplotlib shows the right colors)
	plt.imshow(cv2.cvtColor(masked_img, cv2.COLOR_BGR2RGB))
	plt.show()

	# Initialize bgdModel and fgdModel
	fgd_model = np.zeros((1, 65), np.float64)
	bgd_model = np.zeros((1, 65), np.float64)

	# set pixel value to probable foreground(3) or background (0)
	mask = np.where(mask>0, 3, 0).astype(np.uint8)

	# Apply grab cut with mask initialization
	cv2.grabCut(img=image, mask=mask, rect=None,
	            fgdModel=fgd_model, bgdModel=bgd_model,
	            iterCount=5, mode=cv2.GC_INIT_WITH_MASK)
	# Extract segmented image
	modified_mask = np.where((mask==0) | (mask==2), 0, 1).astype('uint8')
	seg_image = image*modified_mask[:,:,np.newaxis]

	# Visualize (convert BGR to RGB so matplotlib shows the right colors)
	plt.imshow(cv2.cvtColor(seg_image, cv2.COLOR_BGR2RGB))
	plt.show()
