1. What is the primary goal of semantic segmentation in image processing?
(A) Detecting edges in images
(B) Assigning a class label to each pixel
(C) Enhancing image contrast
(D) Converting color images to grayscale
2. Which segmentation technique assigns different labels to distinct object instances of the same class?
(A) Semantic segmentation
(B) Edge detection
(C) Instance segmentation
(D) Thresholding
3. Which deep learning model is widely used for instance segmentation?
(A) ResNet
(B) Mask R-CNN
(C) VGGNet
(D) AlexNet
4. Which architecture is commonly used for biomedical image segmentation tasks?
(A) U-Net
(B) R-CNN
(C) YOLO
(D) DenseNet
5. In Mask R-CNN, which component is responsible for predicting object masks?
(A) Feature Pyramid Network
(B) Region Proposal Network
(C) ROIAlign
(D) Mask branch
6. What does U-Net use to combine low-level and high-level features?
(A) Dense connections
(B) Skip connections
(C) Pooling layers
(D) Residual blocks
7. Which part of the U-Net architecture is responsible for upsampling the feature maps?
(A) Contracting path
(B) Bottleneck
(C) Expanding path
(D) Residual path
8. What is ROIAlign used for in Mask R-CNN?
(A) Generating bounding boxes
(B) Detecting object classes
(C) Preserving accurate spatial alignment of region features
(D) Downsampling feature maps
9. Which loss function is commonly used for pixel-wise classification in segmentation?
(A) Cross-entropy loss
(B) Triplet loss
(C) Mean squared error
(D) Contrastive loss
10. What is a major limitation of semantic segmentation?
(A) Cannot detect object boundaries
(B) Cannot handle multiple classes
(C) Cannot distinguish between instances of the same class
(D) Cannot process RGB images
11. What advantage does instance segmentation provide over semantic segmentation?
(A) Faster training
(B) Fewer parameters
(C) Better object counting and tracking
(D) Easier implementation
12. Which layer in U-Net is responsible for downsampling the feature maps in the contracting path?
(A) Transpose convolution
(B) Max pooling
(C) Batch normalization
(D) Softmax
13. In semantic segmentation, which class index is conventionally used to represent background pixels?
(A) Class 1
(B) Class 0
(C) Class 255
(D) Class -1
14. What is the typical spatial size of the mask output by a segmentation model for an input image of size 256×256?
(A) 1×1
(B) 256×256
(C) 64×64
(D) 512×512
15. Why are skip connections used in U-Net?
(A) To reduce training time
(B) To improve gradient flow
(C) To combine low-level spatial features with high-level semantic features
(D) To regularize the model
16. Which component does Mask R-CNN add to Faster R-CNN to enable segmentation?
(A) Backbone network
(B) Region proposal network
(C) Classifier head
(D) Additional mask prediction head
17. What type of data is required to train a semantic segmentation model?
(A) Image-level labels
(B) Pixel-level annotations
(C) Bounding boxes
(D) Object counts
18. What does the term “end-to-end” mean in the context of segmentation models?
(A) Model runs only on CPUs
(B) Training is done in multiple stages
(C) Full pipeline is trained as a single unified model
(D) Only inference is performed
19. Which evaluation metric is most commonly used to measure the overlap between predicted and ground-truth masks in segmentation tasks?
(A) Accuracy
(B) Mean Average Precision
(C) Intersection over Union (IoU)
(D) F1 Score
20. What is the role of data augmentation in training segmentation models?
(A) Reduces model accuracy
(B) Improves generalization
(C) Slows down training
(D) Reduces dataset size
21. What kind of masks does Mask R-CNN generate for detected objects?
(A) Binary masks
(B) Grayscale masks
(C) RGB masks
(D) Alpha masks
22. Which model extends Faster R-CNN for segmentation tasks?
(A) YOLOv5
(B) U-Net++
(C) Mask R-CNN
(D) PSPNet
23. What does the “contracting path” in U-Net primarily consist of?
(A) Upsampling layers
(B) Convolution and pooling layers
(C) Recurrent units
(D) Residual connections
24. What kind of convolution is used to upsample feature maps in U-Net?
(A) Depthwise convolution
(B) Standard convolution
(C) Transposed convolution
(D) Dilated convolution
25. Which activation function is typically used at the final layer of a binary segmentation model?
(A) ReLU
(B) Sigmoid
(C) Tanh
(D) Softmax
26. Which activation function is typically used at the final layer of a multi-class semantic segmentation model?
(A) Sigmoid
(B) Softmax
(C) ReLU
(D) Tanh
27. What problem does semantic segmentation aim to solve in image processing?
(A) Object detection
(B) Classifying each pixel in the image
(C) Enhancing image resolution
(D) Removing noise
28. In segmentation, what does class imbalance mean?
(A) Equal number of samples per class
(B) More classes than objects
(C) Some classes have far fewer labeled pixels than others
(D) Same mask for all objects
29. Which of the following models is NOT primarily used for segmentation?
(A) U-Net
(B) SegNet
(C) Mask R-CNN
(D) ResNet
30. Which backbone is commonly used in Mask R-CNN?
(A) MobileNet
(B) ResNet
(C) VGG16
(D) Inception
31. How does Mask R-CNN improve spatial alignment compared to Faster R-CNN?
(A) Uses ROIAlign instead of ROIPool
(B) Removes max pooling
(C) Uses feature pyramids
(D) Uses depth-wise convolutions
32. Which model uses a U-shaped architecture for segmentation?
(A) YOLOv4
(B) DeepLab
(C) U-Net
(D) Mask R-CNN
33. What is one limitation of U-Net?
(A) Cannot handle grayscale images
(B) Not suitable for large images without tiling
(C) Requires more than 10 GPUs
(D) Does not support skip connections
34. In segmentation, what is meant by a “mask”?
(A) Image caption
(B) Region of interest
(C) Binary or labeled pixel map
(D) Feature vector
35. Which layer increases the spatial resolution of features in U-Net?
(A) Max pooling
(B) Dropout
(C) Up-convolution
(D) Flatten
36. Which problem arises due to overlapping objects in instance segmentation?
(A) Poor bounding boxes
(B) Low image resolution
(C) Ambiguous mask assignment
(D) Gradient vanishing
37. How is multi-class segmentation different from binary segmentation?
(A) Operates only on grayscale images
(B) Uses multiple input images
(C) Assigns each pixel one of more than two classes
(D) Requires bounding boxes
38. What does pixel-wise classification involve?
(A) Assigning RGB values to each pixel
(B) Predicting class labels for individual pixels
(C) Estimating camera parameters
(D) Calculating object velocity
39. Which of the following is a challenge in semantic segmentation?
(A) Object tracking
(B) Class overlap
(C) Small object detection
(D) Reducing color channels
40. What does “fine-grained segmentation” refer to?
(A) Classifying only large objects
(B) Coarse labeling of regions
(C) Precise boundaries and detailed labeling
(D) Only segmenting backgrounds
41. What is the function of the decoder in U-Net?
(A) Extract low-level features
(B) Reduce spatial dimensions
(C) Reconstruct segmentation map
(D) Normalize data
42. Which image modality is U-Net particularly effective for?
(A) Natural scene images
(B) Aerial images
(C) Medical images
(D) Infrared images
43. What is the role of batch normalization in segmentation networks?
(A) Removes noise from input
(B) Stabilizes and speeds up training
(C) Compresses the model
(D) Downsamples the image
44. Which type of data labeling is required for training Mask R-CNN?
(A) Image-level labels only
(B) Bounding boxes only
(C) Pixel-wise masks and bounding boxes
(D) Captions for each object
45. Which model applies atrous (dilated) convolution for segmentation tasks?
(A) YOLOv3
(B) DeepLab
(C) Mask R-CNN
(D) AlexNet
46. Why is semantic segmentation important in autonomous driving?
(A) To reduce latency
(B) To compress video data
(C) To identify road areas and obstacles
(D) To track object motion
47. What is one benefit of using data augmentation in segmentation tasks?
(A) Lower memory usage
(B) Faster inference
(C) Improved model generalization
(D) Increased overfitting
48. What is an important property of the masks predicted by Mask R-CNN?
(A) Grayscale masks for better depth
(B) Pixel-wise alignment with detected objects
(C) Class labels for the whole image
(D) Fixed size regardless of object scale
49. What is the main purpose of the Region Proposal Network (RPN) in Mask R-CNN?
(A) Segment objects pixel-by-pixel
(B) Generate feature maps
(C) Propose candidate object bounding boxes
(D) Upsample the image resolution
50. Which challenge is commonly addressed using post-processing in segmentation models?
(A) Reducing number of classes
(B) Aligning image histogram
(C) Refining mask boundaries and removing noise
(D) Increasing training loss
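
Supplementary illustration: Intersection over Union (IoU), listed among the metrics in question 19, compares a predicted mask with a ground-truth mask. The sketch below is one minimal way to compute it for binary masks stored as nested lists of 0/1. The function name `mask_iou`, the example masks, and the convention that two empty masks score 1.0 are assumptions made for illustration, not part of the quiz.

```python
def mask_iou(pred, target):
    """Intersection over Union of two equally sized binary masks (nested lists of 0/1)."""
    same_shape = len(pred) == len(target) and all(
        len(p_row) == len(t_row) for p_row, t_row in zip(pred, target)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels set in both masks
    union = 0         # pixels set in at least one mask
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1

    # Assumption: two completely empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 3x3 masks: 3 pixels overlap, 5 pixels are covered in total.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
target = [[0, 0, 1],
          [0, 1, 1],
          [0, 1, 0]]
print(mask_iou(pred, target))  # 3 / 5 = 0.6
```

For multi-class masks, one common approach is to compute this value per class on the binarized mask of that class and then average the results.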
