Unit 3: Making Machines See – Computer Vision

Notes

Introduction to Computer Vision

  • Computer Vision (CV) enables machines to “see” and understand digital images and videos.
  • It is a field of Artificial Intelligence (AI) that uses cameras or other sensors, together with techniques such as deep learning, to interpret visual input.
  • Applications span many industries, from healthcare to self-driving cars.

Fundamentals of Computer Vision

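  • CV mimics human vision: cameras play the role of the eyes, and algorithms play the role of the visual cortex.
  • It aims to extract meaningful data from visual content and use it to make decisions.
  • On narrow, well-defined tasks, CV systems can exceed humans in speed and scalability, and sometimes in accuracy; understanding broader context remains harder for machines.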

How Machines See

Computer Vision is the art and science of enabling machines to perceive and understand visual information. Here’s how machines interpret images and learn to “see”:


📸 Digital Images – The Basics

  • A digital image is made up of tiny squares called pixels (short for picture elements).
  • Each pixel holds numeric values that represent color or brightness.

Types of Images:

  • Grayscale Images:
    • Each pixel has a value from 0 to 255:
      • 0 = Black, 255 = White
      • Values in between represent different shades of gray.
  • Color Images:
    • Use the RGB model:
      • R = Red, G = Green, B = Blue
    • Each channel has a value between 0–255.
    • Combining the three channels gives 256 × 256 × 256 = 16,777,216 possible colors (over 16 million).
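
To make the two image types concrete, here is a minimal sketch in plain Python. The tiny images and the helper name `count_rgb_colors` are made up for illustration; real images are usually loaded with an imaging library rather than typed in by hand.

```python
# A tiny 2x3 grayscale image: each pixel is one brightness value (0-255).
gray_image = [
    [0, 128, 255],   # black, mid-gray, white
    [64, 192, 32],
]

# A color image of the same size: each pixel is an (R, G, B) triple.
color_image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],        # red, green, blue
    [(255, 255, 0), (255, 255, 255), (0, 0, 0)],    # yellow, white, black
]


def count_rgb_colors(levels_per_channel=256):
    """Number of distinct colors when each of the 3 channels has this many levels."""
    return levels_per_channel ** 3


total_colors = count_rgb_colors()
print(total_colors)  # 16777216
```

Each grayscale pixel needs one number, while each color pixel needs three, which is why color images take roughly three times the storage.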

💻 Binary Representation of Images

  • Computers don’t understand images the way humans do; they process them as numbers.
  • Each pixel value is stored in binary (0s and 1s).
  • Example:
    • 1 byte = 8 bits → 2⁸ = 256 possible values
    • 00000000 (binary) = 0 (black)
    • 11111111 (binary) = 255 (white)
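
The byte examples above can be checked with a short sketch. The function names `pixel_to_bits` and `bits_to_pixel` are illustrative, not part of any library.

```python
def pixel_to_bits(value):
    """Return the 8-bit binary string for a pixel value in the range 0-255."""
    if not 0 <= value <= 255:
        raise ValueError("an 8-bit pixel value must be between 0 and 255")
    return format(value, "08b")


def bits_to_pixel(bits):
    """Convert an 8-character binary string back to its pixel value."""
    return int(bits, 2)


print(pixel_to_bits(0))             # 00000000
print(pixel_to_bits(255))           # 11111111
print(bits_to_pixel("10000000"))    # 128
print(2 ** 8)                       # 256
```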

Activity Insight:

Students can convert grayscale images into binary (0s and 1s) using tools and visualize how images are reconstructed from numeric data.


🧠 From Pixels to Understanding

  • Pixels alone don’t mean anything to a machine.
  • Through pattern recognition and machine learning, computers learn:
    • What patterns of pixels represent (e.g., edges, faces, objects).
    • How to classify or locate those patterns in images.

🧬 Human vs Machine Vision (Analogy)

Human Vision              | Machine Vision (CV)
Eyes                      | Camera/Sensor
Retina (light detection)  | Image sensor collects pixel data
Optic nerve (signal path) | Data pipeline (software & hardware)
Visual cortex (interpret) | Algorithms + Deep learning models

Machines replicate this process using layers of neural networks that:

  • Detect edges ➡️ shapes ➡️ patterns ➡️ entire objects
  • Make predictions (e.g., “This is a cat”)

🧪 Visual Processing Activities

To help students understand this, the handbook includes a step-by-step grayscale activity:

  1. Choose an image and resize to reduce complexity.
  2. Convert it to grayscale (removes color).
  3. Extract pixel values using a tool.
  4. Paste the values into a document → observe the image as a grid of numbers (0–255 for grayscale, or only 0s and 1s if the tool reduces it to pure black and white).
  5. Explore how computers build visual meaning from numbers.
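
The activity can also be sketched in code. This is an illustration under stated assumptions, not the handbook's tool: the grayscale weights (0.299, 0.587, 0.114) are one widely used luminance formula, the threshold of 128 is arbitrary, and the function names are invented for this example.

```python
def rgb_to_gray(pixel):
    """Weighted average of R, G, B using common luminance weights (an assumption)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def to_grayscale(image):
    """Convert a grid of (R, G, B) pixels into a grid of 0-255 brightness values."""
    return [[rgb_to_gray(pixel) for pixel in row] for row in image]


def to_binary(gray, threshold=128):
    """Reduce a grayscale grid to 0s and 1s: 1 where brightness >= threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]


sample = [
    [(255, 255, 255), (0, 0, 0)],   # white, black
    [(255, 0, 0), (0, 255, 0)],     # red, green
]
gray = to_grayscale(sample)
binary = to_binary(gray)

for row in gray:
    print(row)      # rows of brightness values
for row in binary:
    print(row)      # rows of 0s and 1s
```

Printing the two grids side by side shows the difference between a grayscale image (many shades) and a pure black-and-white one (only two values).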

Computer Vision Process – 5 Stages

1. Image Acquisition

  • Capture images via cameras, scanners, or software.
  • Quality depends on lighting, resolution, angles.
  • Specialized imaging devices, such as MRI and CT scanners, are also sources of images.

2. Preprocessing

  • Noise Reduction – removes distortions.
  • Normalization – standardizes pixel ranges (e.g., 0–1).
  • Resizing/Cropping – adjusts image size.
  • Histogram Equalization – enhances contrast and detail.
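
Two of these preprocessing steps are easy to sketch in plain Python. The code below assumes an 8-bit grayscale image stored as a list of rows, and uses the textbook cumulative-histogram form of equalization; library implementations may differ in details, and the function names are illustrative.

```python
def normalize(gray):
    """Scale 0-255 pixel values into the range 0.0-1.0."""
    return [[value / 255 for value in row] for row in gray]


def equalize_histogram(gray, levels=256):
    """Spread pixel values over the full range using the cumulative histogram."""
    flat = [value for row in gray for value in row]

    # Count how often each brightness level occurs.
    hist = [0] * levels
    for value in flat:
        hist[value] += 1

    # Cumulative distribution: number of pixels at or below each level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:
        # Every pixel has the same value, so there is no contrast to stretch.
        return [row[:] for row in gray]

    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[value] - cdf_min) * scale) for value in row] for row in gray]


low_contrast = [[100, 101], [102, 103]]
equalized = equalize_histogram(low_contrast)
print(equalized)                    # [[0, 85], [170, 255]]
print(normalize([[0, 255]]))        # [[0.0, 1.0]]
```

The four nearly identical input values end up spread across the whole 0–255 range, which is what "enhancing contrast" means in practice.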

3. Feature Extraction

  • Identifies visual patterns:
    • Edge Detection – boundaries between regions.
    • Corner Detection – identifies sharp intersections.
    • Texture Analysis – surface features.
    • Color Features – distinguishes based on color.
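
As one example of feature extraction, the sketch below applies the Sobel operator, a classic edge detector, to a small grayscale grid. It is only one of several edge detection methods and is not necessarily the one the handbook has in mind; border pixels are simply left at 0 here.

```python
# Sobel kernels: one responds to horizontal change, one to vertical change.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(gray):
    """Return the edge strength at every interior pixel of a grayscale grid."""
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    pixel = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# Dark region on the left, bright region on the right: one vertical boundary.
image = [[0, 0, 0, 255, 255] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1])  # [0.0, 0.0, 1020.0, 1020.0, 0.0]
```

The response is zero inside the flat dark region and large next to the boundary, which is exactly the "boundary between regions" that edge detection looks for.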

4. Detection & Segmentation

  • Single Object:
    • Classification – what the object is.
    • Localization – where it is in the image.
  • Multiple Objects:
    • Detection – identifies and labels multiple objects.
    • Segmentation – pixel-wise analysis:
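      • Semantic: labels every pixel with a class (e.g., all animal pixels as “animal”) without separating one object from another.
      • Instance: distinguishes each individual object, even when several belong to the same class.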
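
The difference between semantic and instance segmentation can be illustrated with a toy mask. In this sketch the semantic mask marks object pixels with 1, and a simple flood fill gives each connected blob its own number. This is a deliberate simplification: real instance segmentation relies on trained models, and connected-component labeling only works when objects do not touch.

```python
from collections import deque


def label_instances(mask):
    """Give each 4-connected group of 1s in the mask its own label (1, 2, ...)."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    next_id = 0
    for start_y in range(height):
        for start_x in range(width):
            if mask[start_y][start_x] != 1 or labels[start_y][start_x] != 0:
                continue
            # Found an unlabeled object pixel: flood-fill its whole blob.
            next_id += 1
            labels[start_y][start_x] = next_id
            queue = deque([(start_y, start_x)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == 1 and labels[ny][nx] == 0:
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
    return labels, next_id


# Semantic view: every object pixel carries the same class label (1).
semantic_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
instance_labels, instance_count = label_instances(semantic_mask)
for row in instance_labels:
    print(row)
print(instance_count)  # 2
```

The semantic mask only says "these pixels are objects"; the instance labels additionally say which pixels belong to object 1 and which to object 2.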

5. High-Level Processing

  • Deeper interpretation:
    • Understand scene context, relationships, and insights.
    • Used in medical diagnosis, robotics, etc.

Applications of Computer Vision 🌍

Computer Vision is widely used in real-life applications, many of which are integrated into products we use daily.

📸 Facial Recognition

  • Used in security systems and on social media platforms (e.g., auto-tagging on Facebook).
  • Identifies individuals based on unique facial features.

🏥 Healthcare

  • Assists in diagnosing diseases using medical image analysis (X-rays, MRIs).
  • Detects abnormalities (e.g., tumors) and enhances precision in treatment.

🚗 Autonomous Vehicles

  • Cameras help cars understand their environment.
  • Real-time video processing recognizes traffic signs, pedestrians, and lanes.

📄 Optical Character Recognition (OCR)

  • Converts handwritten or printed text in images into machine-readable text.
  • Used in scanning documents, digitizing invoices, etc.

🔍 Machine Inspection

  • Automatically inspects products in manufacturing for defects and flaws.

🧱 3D Model Building

  • Converts 2D images into 3D models.
  • Used in AR/VR, urban planning, robotics, and navigation.

🎥 Surveillance

  • CCTV and computer vision work together to monitor public spaces.
  • Helps identify suspicious behavior and potential threats to improve public safety.

🧬 Biometrics

  • Fingerprint and iris recognition systems validate user identity.
  • Widely used in smartphones, banking, and immigration.

Challenges in Computer Vision

Despite its power, CV has limitations and ethical concerns:

🧠 Reasoning & Analytical Skills

  • Difficult to train machines to interpret abstract or complex visual cues.
  • Understanding context still lags behind human capabilities.

📷 Image Acquisition Issues

  • Quality of input affects output.
  • Factors like lighting, background clutter, occlusions, and camera angles degrade performance.

🔒 Privacy and Security

  • Facial recognition in public raises surveillance and consent concerns.
  • CV systems may inadvertently violate privacy rights.

🧪 Bias & Misinformation

  • Biased training data can result in discriminatory outputs.
  • Fake images/videos (deepfakes) can spread misinformation.

Future of Computer Vision 🌟

The field is evolving rapidly with increasing integration into everyday tech.

🚀 Improved Accuracy

  • Newer deep learning models approach human-level accuracy on specific, well-defined tasks.
  • Better feature extraction leads to more reliable results.

⏱️ Real-Time Processing

  • Faster image recognition with minimal latency.
  • Important for safety-critical tasks (e.g., self-driving cars).

🤖 Integration with AI

  • CV will become more powerful when combined with natural language processing, robotics, etc.

🧬 Enhanced Applications

  • Healthcare: personalized diagnostics and treatment.
  • Security: smarter surveillance and fraud detection.
  • Retail: automated inventory and customer insights.


