Semantic & Instance Segmentation (Mask R-CNN, U-Net) — MCQs | Digital Image Processing

1. What is the primary goal of semantic segmentation in image processing?

(A) Detecting edges in images

(B) Assigning a class label to each pixel

(D) Converting color images to grayscale

2. Which segmentation technique assigns different labels to distinct object instances of the same class?

(A) Semantic segmentation

(B) Edge detection

(D) Thresholding

3. Which deep learning model is widely used for instance segmentation?

(A) ResNet

(B) Mask R-CNN

(D) AlexNet

4. Which architecture is commonly used for biomedical image segmentation tasks?

(A) U-Net

(B) R-CNN

(D) DenseNet

5. In Mask R-CNN, what component is responsible for predicting object masks?

(A) Feature Pyramid Network

(B) Region Proposal Network

(D) Mask branch

6. What does U-Net use to combine low-level and high-level features?

(A) Dense connections

(B) Skip connections

(D) Residual blocks

7. Which part of the U-Net architecture is responsible for upsampling the feature maps?

(A) Contracting path

(B) Bottleneck

(D) Residual path

8. What is ROIAlign used for in Mask R-CNN?

(A) Generating bounding boxes

(B) Detecting object classes

(D) Downsampling feature maps

9. Which loss function is commonly used for pixel-wise classification in segmentation?

(A) Cross-entropy loss

(B) Triplet loss

(D) Contrastive loss
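
For readers who want to see what pixel-wise classification loss looks like in practice, here is a minimal sketch of cross-entropy averaged over every pixel. The function name `pixelwise_cross_entropy` and the nested-list layout (height × width × classes) are illustrative assumptions, not part of any particular library.

```python
import math

def pixelwise_cross_entropy(probs, labels):
    """Mean cross-entropy over all pixels (hypothetical helper).

    probs:  H x W x C nested lists of per-class probabilities per pixel
    labels: H x W nested lists of integer class ids
    """
    total, count = 0.0, 0
    for prob_row, label_row in zip(probs, labels):
        for pixel_probs, label in zip(prob_row, label_row):
            # Penalise the negative log-probability assigned to the true class;
            # the small floor avoids log(0).
            total += -math.log(max(pixel_probs[label], 1e-12))
            count += 1
    return total / count

# A 1 x 2 image with two classes: the first pixel is class 0, the second class 1.
probs = [[[0.9, 0.1], [0.2, 0.8]]]
labels = [[0, 1]]
loss = pixelwise_cross_entropy(probs, labels)
```

Deep learning frameworks compute the same quantity from raw logits for numerical stability; the explicit loops here are only meant to make the per-pixel nature of the loss visible.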

10. What is a major limitation of semantic segmentation?

(A) Cannot detect object boundaries

(B) Cannot handle multiple classes

(D) Cannot process RGB images

11. What advantage does instance segmentation provide over semantic segmentation?

(A) Faster training

(B) Fewer parameters

(D) Easier implementation

12. Which layer in U-Net is responsible for downsampling the input image?

(A) Transpose convolution

(B) Max pooling

(D) Softmax
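
To make the downsampling step concrete, the following sketch applies 2×2 max pooling with stride 2 to a small feature map, which halves its height and width. The helper `max_pool_2x2` is a hypothetical name, and the sketch assumes even input dimensions.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on an H x W nested list (H, W assumed even)."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            # Keep the largest value in each non-overlapping 2x2 window.
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]

feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool_2x2(feature_map)  # 4x4 input becomes 2x2
```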

13. In semantic segmentation, which class index is conventionally used to represent background pixels?

(A) Class 1

(B) Class 0

(D) Class -1

14. What is the spatial size of the output mask that a segmentation model produces for a 256×256 input image?

(A) 1×1

(B) 256×256

(D) 512×512

15. Why are skip connections used in U-Net?

(A) To reduce training time

(B) To improve gradient flow

(D) To regularize the model

16. Which component does Mask R-CNN add to Faster R-CNN to enable segmentation?

(A) Backbone network

(B) Region proposal network

(D) Additional mask prediction head

17. What type of data is required to train a semantic segmentation model?

(A) Image-level labels

(B) Pixel-level annotations

(D) Object counts

18. What does the term “end-to-end” mean in the context of segmentation models?

(A) Model runs only on CPUs

(B) Training is done in multiple stages

(D) Only inference is performed

19. Which evaluation metric is commonly used for segmentation tasks?

(A) Accuracy

(B) Mean Average Precision

(D) F1 Score

20. What is the role of data augmentation in training segmentation models?

(A) Reduces model accuracy

(B) Improves generalization

(D) Reduces dataset size

21. What kind of masks does Mask R-CNN generate for detected objects?

(A) Binary masks

(B) Grayscale masks

(D) Alpha masks
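
As a small illustration of how per-pixel mask probabilities can be turned into a binary mask, the sketch below thresholds each value at 0.5. The function name `binarize_mask` is hypothetical, and 0.5 is an assumed cut-off rather than a fixed rule.

```python
def binarize_mask(soft_mask, threshold=0.5):
    """Convert per-pixel foreground probabilities into a 0/1 mask."""
    return [
        [1 if prob >= threshold else 0 for prob in row]
        for row in soft_mask
    ]

# A 2x2 patch of foreground probabilities for one detected object.
soft_mask = [
    [0.10, 0.70],
    [0.50, 0.49],
]
binary_mask = binarize_mask(soft_mask)
```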

22. Which model extends Faster R-CNN for segmentation tasks?

(A) YOLOv5

(B) U-Net++

(D) PSPNet

23. What does the “contracting path” in U-Net primarily consist of?

(A) Upsampling layers

(B) Convolution and pooling layers

(D) Residual connections

24. What kind of convolution is used to upsample feature maps in U-Net?

(A) Depthwise convolution

(B) Standard convolution

(D) Dilated convolution

25. Which activation function is typically used at the final layer of a binary segmentation model?

(A) ReLU

(B) Sigmoid

(D) Softmax

26. Which activation function is used at the final layer of a multi-class semantic segmentation model?

(A) Sigmoid

(B) Softmax

(D) Tanh
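
The difference between the final-layer activations in questions 25 and 26 can be sketched for a single pixel as follows. Sigmoid maps one logit to a foreground probability, while softmax turns a vector of per-class logits into a distribution over classes. The logit values are made up for illustration.

```python
import math

def sigmoid(logit):
    """Map a single logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))

def softmax(logits):
    """Map a list of logits to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Binary segmentation: one logit per pixel, read as the foreground probability.
foreground_prob = sigmoid(0.0)

# Multi-class segmentation: one logit per class per pixel.
class_probs = softmax([2.0, 1.0, 0.1])
predicted_class = class_probs.index(max(class_probs))
```

Applying the same functions at every pixel location yields the full probability map, and taking the most probable class per pixel yields the label map.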

27. What problem does semantic segmentation aim to solve in image processing?

(A) Object detection

(B) Classifying each pixel in the image

(D) Removing noise

28. In segmentation, what does class imbalance mean?

(A) Equal number of samples per class

(B) More classes than objects

(D) Same mask for all objects

29. Which of the following models is NOT primarily used for segmentation?

(A) U-Net

(B) SegNet

(D) ResNet

30. Which backbone is commonly used in Mask R-CNN?

(A) MobileNet

(B) ResNet

(D) Inception

31. How does Mask R-CNN improve spatial alignment compared to Faster R-CNN?

(A) Uses ROIAlign instead of ROIPool

(B) Removes max pooling

(D) Uses depth-wise convolutions

32. Which model uses a U-shaped architecture for segmentation?

(A) YOLOv4

(B) DeepLab

(D) Mask R-CNN

33. What is one limitation of U-Net?

(A) Cannot handle grayscale images

(B) Not suitable for large images without tiling

(D) Does not support skip connections

34. In segmentation, what is meant by a “mask”?

(A) Image caption

(B) Region of interest

(D) Feature vector

35. Which layer increases the spatial resolution of features in U-Net?

(A) Max pooling

(B) Dropout

(D) Flatten

36. Which problem arises due to overlapping objects in instance segmentation?

(A) Poor bounding boxes

(B) Low image resolution

(D) Gradient vanishing

37. How is multi-class segmentation different from binary segmentation?

(A) Operates only on grayscale images

(B) Uses multiple input images

(D) Requires bounding boxes

38. What does pixel-wise classification involve?

(A) Assigning RGB values to each pixel

(B) Predicting class labels for individual pixels

(D) Calculating object velocity

39. Which of the following is a challenge in semantic segmentation?

(A) Object tracking

(B) Class overlap

(D) Reducing color channels

40. What does “fine-grained segmentation” refer to?

(A) Classifying only large objects

(B) Coarse labeling of regions

(D) Only segmenting backgrounds

41. What is the function of the decoder in U-Net?

(A) Extract low-level features

(B) Reduce spatial dimensions

(D) Normalize data

42. Which image modality is U-Net particularly effective for?

(A) Natural scene images

(B) Aerial images

(D) Infrared images

43. What is the role of batch normalization in segmentation networks?

(A) Removes noise from input

(B) Stabilizes and speeds up training

(D) Downsamples the image

44. Which type of data labeling is required for training Mask R-CNN?

(A) Image-level labels only

(B) Bounding boxes only

(D) Captions for each object

45. Which model applies atrous (dilated) convolution for segmentation tasks?

(A) YOLOv3

(B) DeepLab

(D) AlexNet

46. Why is semantic segmentation important in autonomous driving?

(A) To reduce latency

(B) To compress video data

(D) To track object motion

47. What is one benefit of using data augmentation in segmentation tasks?

(A) Lower memory usage

(B) Faster inference

(D) Increased overfitting

48. What is an important property of the masks predicted by Mask R-CNN?

(A) Grayscale masks for better depth

(B) Pixel-wise alignment with detected objects

(D) Fixed size regardless of object scale

49. What is the main purpose of the Region Proposal Network (RPN) in Mask R-CNN?

(A) Segment objects pixel-by-pixel

(B) Generate feature maps

(D) Upsample the image resolution

50. Which challenge is commonly addressed using post-processing in segmentation models?

(A) Reducing number of classes

(B) Aligning image histogram

(D) Increasing training loss
