NOTES CBSE AI X

Data Sciences

MCQs

  1. What is the primary component that AI depends on?
    a) Algorithms
    b) Hardware
    c) Data
    d) Software
    Answer: c) Data
  2. Which of the following is NOT a domain of AI mentioned in the document?
    a) Data Sciences
    b) Computer Vision
    c) Natural Language Processing
    d) Quantum Computing
    Answer: d) Quantum Computing
  3. Data Sciences primarily work around which type of data?
    a) Image data
    b) Numeric and alphanumeric data
    c) Textual data
    d) Speech data
    Answer: b) Numeric and alphanumeric data
  4. Which of the following is an example of an AI application in finance?
    a) Airline route planning
    b) Fraud and risk detection
    c) Website recommendations
    d) Personality prediction
    Answer: b) Fraud and risk detection
  5. How do search engines use data science?
    a) To predict flight delays
    b) To suggest movies
    c) To deliver search results
    d) To classify images
    Answer: c) To deliver search results
  6. What is the primary objective of data science in genomics?
    a) To improve airline route planning
    b) To enhance movie recommendations
    c) To personalize treatment based on DNA
    d) To predict stock prices
    Answer: c) To personalize treatment based on DNA
  7. What is the primary focus of targeted advertising?
    a) Fraud detection
    b) Customer segmentation
    c) Maximizing click-through rates
    d) Improving search results
    Answer: c) Maximizing click-through rates
  8. Which model is chosen for predicting food wastage in restaurants?
    a) Classification
    b) Clustering
    c) Regression
    d) Reinforcement
    Answer: c) Regression
  9. What does the AI project goal in the restaurant scenario aim to predict?
    a) Customer behavior
    b) Food wastage
    c) Food quantity to be prepared
    d) Restaurant ratings
    Answer: c) Food quantity to be prepared
  10. Which factor is NOT part of the system map for the food wastage problem?
    a) Number of customers
    b) Price of dish
    c) Customer feedback
    d) Quantity of unconsumed dish
    Answer: c) Customer feedback
  11. Which of the following is an offline data collection method?
    a) Sensors
    b) Surveys
    c) Government portals
    d) Kaggle
    Answer: b) Surveys
  12. What format is used to store tabular data with comma-separated values?
    a) JSON
    b) XML
    c) CSV
    d) SQL
    Answer: c) CSV
  13. What is the primary use of NumPy in Python?
    a) Text processing
    b) Web development
    c) Arithmetic operations on arrays
    d) Image recognition
    Answer: c) Arithmetic operations on arrays
  14. Which Python package is primarily used for data manipulation and analysis?
    a) NumPy
    b) Matplotlib
    c) Pandas
    d) TensorFlow
    Answer: c) Pandas
  15. Which data structure is used by Pandas to handle 2-dimensional data?
    a) Series
    b) DataFrame
    c) Array
    d) Tuple
    Answer: b) DataFrame
  16. Which visualization library is mentioned for creating 2D plots in Python?
    a) NumPy
    b) Matplotlib
    c) Pandas
    d) Seaborn
    Answer: b) Matplotlib
  17. What type of data is typically visualized using scatter plots?
    a) Continuous data
    b) Discontinuous data
    c) Textual data
    d) Image data
    Answer: b) Discontinuous data
  18. What is the key feature of a histogram?
    a) Represents frequency distribution
    b) Shows textual data
    c) Displays social networks
    d) Analyzes time series
    Answer: a) Represents frequency distribution
  19. What do box plots represent in data visualization?
    a) Frequency of data
    b) Quartiles and outliers
    c) Discontinuous data
    d) Categorical data
    Answer: b) Quartiles and outliers
  20. What is the key concept of the K-Nearest Neighbour algorithm?
    a) Majority voting
    b) Decision trees
    c) Nearest neighbors classification
    d) Predicting stock prices
    Answer: c) Nearest neighbors classification
  21. What is the primary application of K-Nearest Neighbour (KNN) in the document’s game activity?
    a) Predicting stock market trends
    b) Personality prediction
    c) Flight delay prediction
    d) DNA analysis
    Answer: b) Personality prediction
  22. In the personality prediction game, which axis represents being task-focused?
    a) Positive X-axis
    b) Negative X-axis
    c) Positive Y-axis
    d) Negative Y-axis
    Answer: b) Negative X-axis
  23. How many nearest neighbors are considered in KNN when K=1?
    a) 1
    b) 2
    c) 3
    d) 5
    Answer: a) 1
  24. As the value of K increases in KNN, predictions become:
    a) Less stable
    b) More stable
    c) Faster
    d) Less accurate
    Answer: b) More stable
  25. In the example of predicting fruit sweetness, when K=2, the prediction:
    a) Is sweet
    b) Is not sweet
    c) Becomes uncertain
    d) Depends on color
    Answer: c) Becomes uncertain

  1. What is the disadvantage of having K=1 in KNN?
    a) More computational resources
    b) Unstable predictions
    c) Slower prediction times
    d) Too many errors
    Answer: b) Unstable predictions
  2. What is an advantage of using Python in data science?
    a) Complex syntax
    b) Limited libraries
    c) Predefined functions for statistics
    d) Slow performance
    Answer: c) Predefined functions for statistics
  3. What is Mean in statistics?
    a) The sum of all values divided by the number of values
    b) The middle value of a dataset
    c) The most frequent value in a dataset
    d) The range of values in a dataset
    Answer: a) The sum of all values divided by the number of values
  4. What does the term ‘mode’ refer to in statistics?
    a) The highest value
    b) The most frequent value
    c) The middle value
    d) The sum of all values
    Answer: b) The most frequent value
  5. Standard deviation is used to measure:
    a) The mean
    b) The range
    c) The variability of data
    d) The mode
    Answer: c) The variability of data
  6. Which Python package is primarily used for data visualization?
    a) NumPy
    b) Pandas
    c) Matplotlib
    d) SciPy
    Answer: c) Matplotlib
  7. What type of plot is used to represent continuous data frequency?
    a) Scatter plot
    b) Box plot
    c) Histogram
    d) Line plot
    Answer: c) Histogram
  8. Outliers in a dataset are typically represented in box plots as:
    a) Boxes
    b) Whiskers
    c) Circles or dots
    d) Lines
    Answer: c) Circles or dots
  9. Which type of data is commonly used in data science projects?
    a) Textual data
    b) Numeric and alpha-numeric data
    c) Visual data
    d) Audio data
    Answer: b) Numeric and alpha-numeric data
  10. Data Science integrates methods from:
    a) History and Literature
    b) Statistics and Computer Science
    c) Biology and Chemistry
    d) Architecture and Engineering
    Answer: b) Statistics and Computer Science
  11. What is a common source of online data for data science projects?
    a) Personal interviews
    b) Kaggle
    c) Classroom surveys
    d) Newspapers
    Answer: b) Kaggle
  12. Which AI domain deals with image and visual data?
    a) Data Sciences
    b) Computer Vision
    c) Natural Language Processing
    d) Genetic Analysis
    Answer: b) Computer Vision
  13. What is the significance of NaN in a dataset?
    a) Represents text data
    b) Represents an error
    c) Represents missing or invalid data
    d) Represents numerical data
    Answer: c) Represents missing or invalid data
  14. What does CSV stand for?
    a) Comma-Separated Values
    b) Code-Specific Variables
    c) Constant Search Value
    d) Computer-Structured Variables
    Answer: a) Comma-Separated Values
  15. Which tool is used to predict flight delays in airlines?
    a) Classification model
    b) K-Nearest Neighbours
    c) Regression model
    d) Neural networks
    Answer: c) Regression model
  16. What is the primary benefit of using a box plot?
    a) It shows continuous data
    b) It provides frequency distribution
    c) It displays data quartiles and outliers
    d) It identifies trends over time
    Answer: c) It displays data quartiles and outliers
  17. How are errors in data typically represented?
    a) Through graphs
    b) Through incorrect or invalid values
    c) Through statistical models
    d) Through outliers
    Answer: b) Through incorrect or invalid values
  18. What is the role of Pandas in Python?
    a) Image processing
    b) Text analysis
    c) Data manipulation and analysis
    d) Speech recognition
    Answer: c) Data manipulation and analysis
  19. Which data format is primarily used for tabular data?
    a) JSON
    b) CSV
    c) XML
    d) SQL
    Answer: b) CSV
  20. What is a common use of regression models in data science?
    a) Classification of text
    b) Predicting numerical values
    c) Recognizing speech
    d) Processing images
    Answer: b) Predicting numerical values
  21. What is the importance of cleaning data before analysis?
    a) To format the data
    b) To ensure accuracy
    c) To speed up the analysis
    d) To display data visually
    Answer: b) To ensure accuracy
  22. Which of the following is NOT a statistical tool used in data analysis?
    a) Mean
    b) Mode
    c) Regression
    d) Histogram
    Answer: d) Histogram
  23. What is the key challenge in airline route planning?
    a) Flight delays
    b) Customer loyalty programs
    c) Predicting customer satisfaction
    d) Predicting profitability
    Answer: a) Flight delays
  24. Which of the following is an advantage of using NumPy arrays?
    a) Can hold multiple data types
    b) Homogeneous data collection
    c) Simple to initialize
    d) Require more memory
    Answer: b) Homogeneous data collection
  25. What is the purpose of website recommendation systems?
    a) Predict user behavior
    b) Improve user experience
    c) Boost product sales
    d) All of the above
    Answer: d) All of the above
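
The statistical terms tested in the questions above (mean, mode, standard deviation) can be tried out with Python's built-in `statistics` module. This is a minimal sketch; the `marks` list is hypothetical data made up for illustration, and the population standard deviation (`pstdev`) is chosen here as an assumption about which variant is wanted.

```python
import statistics

# Hypothetical marks scored by nine students (illustrative data only)
marks = [40, 55, 55, 60, 65, 70, 70, 70, 100]

mean_value = statistics.mean(marks)      # sum of all values / number of values
median_value = statistics.median(marks)  # middle value of the sorted data
mode_value = statistics.mode(marks)      # most frequent value
spread = statistics.pstdev(marks)        # population standard deviation (variability)

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
print("standard deviation:", round(spread, 2))
```

A larger standard deviation means the values are spread further from the mean; here the single mark of 100 contributes most of the spread.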

QUESTION-ANSWERS:

  1. Explain how Artificial Intelligence (AI) depends on data.
    AI fundamentally relies on data to function and improve its intelligence. Data fed into AI systems allow them to identify patterns, make predictions, and learn. Different types of data—numeric, visual, or textual—are used in various AI domains like Data Science (numeric data), Computer Vision (visual data), and Natural Language Processing (NLP) (textual data). Without data, AI cannot operate effectively.
  2. What are the key fields that Data Science integrates, and how do they contribute to its functions?
    Data Science integrates fields such as Mathematics, Statistics, Computer Science, and Information Science. Mathematics provides theoretical foundations, while statistics offers tools for analyzing data. Computer Science enables the development of algorithms to process large datasets, and Information Science focuses on managing and retrieving this data. Together, these fields allow data scientists to analyze and make predictions from complex datasets.
  3. Describe the role of Data Science in fraud and risk detection in finance.
    Data Science plays a crucial role in fraud and risk detection by analyzing historical data, customer profiles, and expenditures. In finance, it helps companies identify potential risks, detect fraud, and prevent bad debts. By examining customer behavior and transaction patterns, data scientists create predictive models that reduce financial losses and optimize risk management strategies.
  4. How does Data Science contribute to advancements in genetics and genomics?
    In genetics and genomics, Data Science enables personalized treatments by analyzing large-scale genomic data. It helps in understanding the relationship between DNA and health, allowing researchers to predict how individuals may react to certain drugs or be predisposed to diseases. This deeper insight into human DNA helps in developing advanced genetic risk prediction models for more tailored healthcare solutions.
  5. Explain the use of data science algorithms in internet search engines.
    Search engines like Google, Bing, and Yahoo use data science algorithms to deliver accurate search results quickly. These algorithms analyze user queries, rank relevant results, and improve over time through machine learning. With massive amounts of data processed daily, such algorithms help filter, sort, and rank results based on relevance, thereby enhancing the search experience.
  6. What is targeted advertising, and how does Data Science enhance it?
    Targeted advertising uses data science algorithms to personalize advertisements based on user behavior and preferences. Data collected from online activity, such as browsing history and past purchases, is analyzed to display relevant ads. This increases the effectiveness of digital marketing campaigns by ensuring that ads reach the right audience, leading to higher conversion rates compared to traditional advertising methods.
  7. What is the AI project cycle, and why is it important in Data Science?
    The AI project cycle consists of several stages: problem scoping, data acquisition, data exploration, modeling, and evaluation. This cycle is essential for systematically addressing real-world problems using AI and Data Science. It helps ensure that the problem is well understood, the right data is collected, accurate models are built, and their predictions are properly evaluated for effective solutions.
  8. How can Data Science help reduce food wastage in restaurants?
    Data Science can help predict the amount of food to be prepared by analyzing historical data such as the number of customers, dish preferences, and past food wastage. Using regression models, restaurants can estimate how much food is needed each day, which reduces wastage and the losses that come with it. Such predictions also help restaurants plan their supplies and daily food production.
  9. Describe the significance of regression models in predicting food wastage.
    Regression models, which are part of supervised learning, predict continuous (numeric) values by learning the relationship between input features and a target from past examples. In the restaurant scenario, regression models use historical data, such as customer footfall and dish consumption, to predict future demand. Trained on previous data, these models let restaurants estimate the amount of food to prepare more reliably, which helps prevent excess wastage.
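
As a rough illustration of the regression idea in the answer above, the sketch below fits a straight line to a few past days and uses it to estimate the quantity for a new day. The numbers, the single input feature (customer count), and the helper name `predict_quantity` are all assumptions made up for this example; a real project would use the restaurant's own records and probably more features.

```python
import numpy as np

# Hypothetical history: customers served per day and kilograms of a dish consumed
customers = np.array([50, 60, 80, 100, 120], dtype=float)
consumed_kg = np.array([10, 12, 16, 20, 24], dtype=float)

# Least-squares fit of a straight line: consumed_kg ~ slope * customers + intercept
slope, intercept = np.polyfit(customers, consumed_kg, deg=1)

def predict_quantity(expected_customers):
    """Estimate how many kilograms to prepare for the expected footfall."""
    return slope * expected_customers + intercept

estimate = predict_quantity(90)
print("Estimated kilograms to prepare:", round(float(estimate), 1))
```

In this toy data every customer accounts for 0.2 kg, so the estimate for 90 customers comes out at about 18 kg. Real data would be noisier, and the fitted line would only approximate it.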
  10. What is the difference between offline and online data collection methods?
    Offline data collection involves gathering data through traditional means such as surveys, interviews, or observations, while online data collection utilizes digital platforms like sensors, government portals, and open-source websites (e.g., Kaggle). Offline methods are useful for localized or manual data collection, whereas online methods allow access to larger datasets from reliable sources, facilitating more comprehensive analysis in Data Science.
  11. Explain the importance of data cleaning in Data Science.
    Data cleaning is crucial because it ensures the accuracy and consistency of the data used in analysis. It involves identifying and correcting errors such as incorrect values, missing entries, and outliers. Without cleaning, faulty data can lead to incorrect conclusions and poor model performance. Clean data enhances the reliability of the models, ensuring accurate and meaningful insights.
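
One simple cleaning step is sketched below with NumPy: invalid entries are marked as missing (NaN) and then filled with the mean of the valid values. The data and the choice of mean-filling are assumptions for illustration only; depending on the project, dropping the rows or using another fill strategy may be more appropriate.

```python
import numpy as np

# Hypothetical daily customer counts: NaN is a missing entry, -5 is an invalid value
customers = np.array([52.0, np.nan, 48.0, -5.0, 50.0])

cleaned = customers.copy()
cleaned[cleaned < 0] = np.nan             # a negative count is impossible, treat it as missing

fill_value = np.nanmean(cleaned)          # mean of the valid entries only (NaN ignored)
cleaned[np.isnan(cleaned)] = fill_value   # replace every missing entry with that mean

print(cleaned)
```

The original array is left untouched so the raw data can still be inspected after cleaning.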
  12. What are the advantages of using Python’s NumPy for data analysis?
    NumPy is a powerful library for numerical operations in Python, widely used in Data Science for handling large datasets. It allows for efficient mathematical computations, such as matrix operations and arithmetic functions, on arrays. NumPy’s ability to process multi-dimensional data and its speed in handling large datasets make it a preferred choice for data manipulation and analysis in scientific computing.
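
The array arithmetic described above can be sketched as follows. The prices and quantities are hypothetical values chosen for illustration; the point is that one expression operates on the whole array without an explicit loop.

```python
import numpy as np

# Hypothetical prices (in rupees) and quantities sold for three dishes
prices = np.array([120, 80, 150])
quantities = np.array([10, 25, 4])

revenue_per_dish = prices * quantities    # element-wise multiplication
total_revenue = revenue_per_dish.sum()    # add up all elements

# A 2-D array: each row is a day, each column is a dish
sales = np.array([[10, 25, 4],
                  [12, 20, 6]])
daily_revenue = sales @ prices            # matrix-vector product: one total per day

print(revenue_per_dish, total_revenue, daily_revenue)
```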
  13. How do box plots represent data distribution, and what insights do they offer?
    Box plots graphically display the distribution of a dataset by dividing it into quartiles. They show the minimum, first quartile, median, third quartile, and maximum, along with any outliers. Box plots are useful for identifying the spread and skewness of data, as well as detecting outliers. They offer a quick visual summary of data distribution and are commonly used in exploratory data analysis.
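
The numbers a box plot draws can be computed directly, as in the sketch below. The dataset is hypothetical, and the 1.5 × IQR rule used to flag outliers is a common convention assumed here, not something stated in these notes.

```python
import numpy as np

# Hypothetical dataset with one unusually large value
data = np.array([12, 15, 16, 18, 19, 20, 22, 24, 60])

# Quartiles: the edges of the box (q1, q3) and the line inside it (median)
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                             # interquartile range = height of the box

# Assumed convention: points beyond 1.5 * IQR from the box are drawn as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print("Q1:", q1, "median:", median, "Q3:", q3)
print("outliers:", outliers)
```

Passing the same array to Matplotlib's `plt.boxplot(data)` would draw the box from Q1 to Q3 with the median line inside it, and show the flagged value as a separate point.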
  14. What is the K-Nearest Neighbour (KNN) algorithm, and how does it work?
    The KNN algorithm is a simple supervised learning technique used for classification and regression tasks. For classification, it assigns a data point the majority label of its K nearest neighbors; for regression, it averages the neighbors' values. For instance, if K=3, the algorithm looks at the 3 closest points to the unknown data point and assigns it the most common class among those neighbors. KNN is based on the principle that similar data points lie near each other.
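
A minimal sketch of KNN classification using only the standard library is shown below. The training points, the two features, and the function name `knn_predict` are hypothetical choices for illustration; straight-line (Euclidean) distance is assumed as the measure of closeness.

```python
import math
from collections import Counter

# Hypothetical labelled points: (sweetness score, crunchiness score) -> label
training = [
    ((8.0, 2.0), "sweet"),
    ((7.5, 3.0), "sweet"),
    ((9.0, 1.5), "sweet"),
    ((2.0, 7.0), "not sweet"),
    ((3.0, 8.0), "not sweet"),
]

def knn_predict(point, k):
    """Return the majority label among the k training points closest to `point`."""
    # Sort all training points by Euclidean distance to the query point
    by_distance = sorted(training, key=lambda item: math.dist(item[0], point))
    nearest_labels = [label for _, label in by_distance[:k]]
    # Majority vote among the k nearest labels
    return Counter(nearest_labels).most_common(1)[0][0]

prediction = knn_predict((7.0, 2.5), k=3)
print(prediction)
```

With an even K the vote can tie (for example one "sweet" and one "not sweet" neighbor when K=2); this sketch then simply returns whichever tied label was encountered first, which is one reason odd values of K are often preferred.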
  15. What are some common sources of open data for Data Science projects?
    Common sources of open data for Data Science projects include government portals, open data platforms such as Kaggle, and world organizations’ statistical websites. These sources provide datasets that can be used for various purposes, from academic research to business analytics. Access to open data gives data scientists diverse, publicly documented datasets to work with for model training and analysis.