Chapter 26: Statistics and Measures of Dispersion (Set-1)
In statistics, a “population” means
A Small chosen group
B Random guess list
C Entire group studied
D Only numeric data
A population includes all the individuals or items about which we want information. A sample is only a part of it. Conclusions about the population are often drawn from sample results.
A “sample” is best described as
A Part of population
B Whole population
C Only qualitative set
D Only grouped table
A sample is a smaller subset selected from the population. We study the sample to estimate population characteristics when studying the full population is difficult or costly.
Which is an example of qualitative data
A Heights in cm
B Blood group types
C Marks scored
D Time in seconds
Qualitative data represent categories or labels, not measured numbers. Blood group (A, B, AB, O) is categorical, while height, marks, and time are numerical values.
Which is an example of discrete data
A Weight of bag
B Temperature today
C Number of siblings
D Height of plants
Discrete data come from counting and take whole-number values. Number of siblings is counted. Weight, temperature, and height are measured and can take fractional values.
A frequency distribution mainly shows
A Exact order only
B Only pie percentages
C Only class midpoints
D Counts for values
A frequency distribution organizes data into values or class intervals and shows how many observations fall in each. It helps summarize large datasets clearly.
In grouped data, a class interval is
A Range of a class
B Highest data value
C Total frequencies sum
D Middle of class
A class interval is the span covered by a class, like 10–20. It shows which values are included in that group, helping organize continuous data.
Cumulative frequency means
A Frequency of midpoint
B Difference of classes
C Running total frequency
D Average of frequencies
Cumulative frequency adds frequencies successively. It tells how many observations are at or below (less-than type) a boundary, helping locate medians, quartiles, and draw ogives.
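The running total can be sketched in Python. The class frequencies below are hypothetical values chosen for illustration, not data from this chapter.

```python
from itertools import accumulate

# Hypothetical frequencies for the classes 0-10, 10-20, 20-30, 30-40
frequencies = [3, 7, 12, 8]

# Less-than cumulative frequency: running total up to each upper class boundary
cumulative = list(accumulate(frequencies))

for freq, cum in zip(frequencies, cumulative):
    print(freq, cum)
```

The last cumulative frequency (30 here) always equals the total number of observations N.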
A histogram is used for
A Only categorical data
B Continuous grouped data
C Only individual values
D Only two variables
Histograms display grouped continuous data using adjacent rectangles. Width shows class interval and height shows frequency (or frequency density), helping visualize distribution shape.
An ogive is a graph of
A Cumulative frequency
B Class frequency only
C Scatter of points
D Bar chart categories
An ogive is a cumulative frequency curve. It is drawn using cumulative frequencies against class boundaries and is used to find median, quartiles, and percentiles graphically.
Measures of central tendency include
A Range, variance, SD
B Quartile deviation only
C Mean, median, mode
D Covariance and r
Central tendency describes the “center” of data. Mean, median, and mode represent typical values. Dispersion measures like range and standard deviation describe spread, not center.
The arithmetic mean is
A Middle value only
B Most frequent value
C Largest minus smallest
D Sum divided by n
Mean is computed by adding all observations and dividing by the number of observations. It uses every value, so it can be affected by extreme observations.
Median of ordered data is
A Most repeated value
B Total of all values
C Middle position value
D Half of maximum
Median is the middle value when data are arranged in order. For even n, it is the average of the two middle values. Because it depends on position rather than magnitude, it is less affected by extremes than the mean.
Mode refers to
A Most frequent value
B Middle value always
C Sum of values
D Spread of values
Mode is the observation occurring most often. It is useful for categorical data too. A dataset may have one mode, more than one mode, or no clear mode.
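As a quick check of the three measures of central tendency, here is a minimal sketch using Python's standard `statistics` module. The data values are hypothetical, and `multimode` assumes Python 3.8 or later.

```python
import statistics

# Hypothetical observations, used only for illustration
data = [2, 3, 3, 7, 10]

mean_value = statistics.mean(data)      # sum of values divided by n
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequent value

# With an even number of values, the median averages the two middle values
even_median = statistics.median([1, 2, 3, 4])

# multimode returns every value tied for the highest frequency
all_modes = statistics.multimode([1, 1, 2, 2, 5])

print(mean_value, median_value, mode_value, even_median, all_modes)
```

In this sketch the mean (5) sits above the median (3) because the single large value 10 pulls it upward, while the median and mode are unaffected.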
Weighted mean is used when
A All values equal
B Values have weights
C Data are only ranks
D Only for medians
Weighted mean gives different importance to different values using weights, like credit points. It is calculated as (Σwx)/(Σw), not a simple average.
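The formula (Σwx)/(Σw) can be sketched as follows. The function name `weighted_mean` and the grade and credit values are illustrative assumptions.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical grade points and the credit points that weight them
grades = [8, 6, 9]
credits = [4, 3, 2]

result = weighted_mean(grades, credits)           # (32 + 18 + 18) / 9
simple = weighted_mean(grades, [1, 1, 1])         # equal weights give the simple mean
print(round(result, 2), round(simple, 2))
```

Here the weighted mean (about 7.56) is lower than the simple mean (about 7.67) because the lowest grade carries more credits than the highest one.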
Empirical relation (approx.) is
A Mean = Median + Mode
B Median = Mean + Mode
C Mode = 3Median − 2Mean
D Mode = Mean − Median
For moderately skewed distributions, an approximate relation connects mean, median, and mode: Mode ≈ 3 Median − 2 Mean. It is not exact for every dataset but helps estimate one measure from the other two.
Dispersion in data mainly means
A Total sum of values
B Only the median value
C Number of classes
D Spread around center
Dispersion describes how scattered observations are around a central value. Two datasets can have the same mean but different dispersion, so spread measures are essential.
The range equals
A Max minus min
B Mean minus median
C Median minus mode
D Q3 minus Q1
Range is the simplest dispersion measure: largest value minus smallest value. It is easy but depends only on extremes, so it is sensitive to outliers.
Interquartile range is
A Q1 minus Q3
B Q2 minus Q1
C Q3 minus Q1
D Q3 minus Q2
Interquartile range (IQR) covers the middle 50% of data: Q3 − Q1. It reduces the effect of extreme values and is useful for skewed distributions.
Quartile deviation equals
A (Q3 + Q1)/2
B (Q3 − Q1)/2
C Q3 × Q1
D Q3 − Q2
Quartile deviation (semi-interquartile range) is half of IQR. It measures spread of the central half of the data and is less affected by extreme values.
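Interquartile range and quartile deviation can be computed together. This sketch assumes Python 3.8 or later for `statistics.quantiles` and uses hypothetical data. Textbooks use several slightly different quartile conventions, so hand-computed quartiles may differ a little from the library's default method on other datasets.

```python
import statistics

# Hypothetical ordered data
data = [2, 4, 6, 8, 10, 12, 14]

# quantiles with n=4 returns the three quartile cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                  # interquartile range
quartile_deviation = iqr / 2   # semi-interquartile range

print(q1, q3, iqr, quartile_deviation)
```

For this data Q1 = 4 and Q3 = 12, so IQR = 8 and the quartile deviation is 4.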
Mean deviation uses
A Average of absolute deviations
B Squares of deviations
C Cubes of deviations
D Only extreme values
Mean deviation is the average of absolute deviations from a central value (mean or median). Absolute values avoid cancellation of positive and negative deviations.
Mean deviation about mean is
A Σ(x−x̄) / n
B Σ(x−x̄)² / n
C Σ|x−x̄| / n
D Σ|x| / n
Mean deviation about mean averages the absolute distances from the mean. Using absolute values ensures all deviations contribute positively to total spread.
Mean deviation is minimum about
A Median
B Mean
C Mode
D Maximum value
The sum of absolute deviations is minimized at the median. Therefore mean deviation about the median is never larger than mean deviation about the mean.
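A small numerical check illustrates the comparison between the two centers. The helper `mean_deviation` and the data are hypothetical, chosen so that one large value separates the mean from the median.

```python
import statistics

def mean_deviation(data, center):
    """Average absolute deviation of the data from a chosen center."""
    return sum(abs(x - center) for x in data) / len(data)

# Hypothetical data with one large value
data = [1, 2, 3, 4, 10]

md_about_mean = mean_deviation(data, statistics.mean(data))      # center = 4
md_about_median = mean_deviation(data, statistics.median(data))  # center = 3

print(md_about_mean, md_about_median)
```

The deviation about the mean is 12/5 = 2.4, while the deviation about the median is 11/5 = 2.2, consistent with the median giving the minimum.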
A limitation of mean deviation is
A Uses all values
B Easy to compute
C Based on squares
D Signs are ignored
Mean deviation takes absolute values, so the direction of deviation (above or below center) is ignored. Absolute values are also awkward to handle algebraically, which limits its use in further mathematical work.
Variance is based on
A Absolute deviations
B Only class widths
C Squared deviations
D Cumulative counts
Variance is the mean of squared deviations from the mean. Squaring makes all deviations positive and gives more weight to larger deviations, useful in many formulas.
For ungrouped data, population variance is
A Σ|x−x̄| / n
B Σ(x−x̄)² / n
C Σ(x−x̄) / n
D (max−min)/n
Population variance divides by n (total observations). It measures average squared spread around mean. Standard deviation is the square root of variance.
Standard deviation equals
A Square root variance
B Variance squared
C Mean deviation only
D Half of range
Standard deviation (SD) is √variance. It returns the spread to original units, making interpretation easier than variance, which is in squared units.
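Population variance and standard deviation can be computed directly from the definitions and compared with the standard-library functions. The data below are hypothetical.

```python
import math
import statistics

# Hypothetical data with mean 5
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(data) / len(data)

# Population variance: mean of squared deviations, dividing by n
variance = sum((x - mean_value) ** 2 for x in data) / len(data)

# Standard deviation: square root of the variance, back in the data's units
std_dev = math.sqrt(variance)

print(variance, std_dev)

# Standard-library equivalents for the population formulas
print(statistics.pvariance(data), statistics.pstdev(data))
```

The squared deviations sum to 32, so the variance is 32/8 = 4 and the SD is 2.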
If all observations increase by 5, SD
A Becomes five times
B Becomes zero
C Remains unchanged
D Doubles always
Adding a constant shifts all values equally, so deviations from the mean stay the same. Therefore variance and SD do not change under translation.
If all observations are multiplied by 3, variance
A Multiplies by 3
B Multiplies by 9
C Adds 3 only
D Becomes unchanged
Scaling data by a factor k multiplies each deviation by k, so squared deviations multiply by k². Hence variance becomes k² times and SD becomes |k| times.
If all observations are multiplied by 3, SD
A Multiplies by 9
B Adds 3 only
C Becomes unchanged
D Multiplies by 3
Standard deviation scales linearly with the multiplication factor. If each value is multiplied by 3, spread in the same units becomes three times larger.
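The effect of adding a constant and of multiplying by a constant can be verified numerically. This is a sketch on hypothetical data, using the population formulas from the `statistics` module.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
shifted = [x + 5 for x in data]   # add 5 to every value
scaled = [3 * x for x in data]    # multiply every value by 3

sd_original = statistics.pstdev(data)
sd_shifted = statistics.pstdev(shifted)
sd_scaled = statistics.pstdev(scaled)

var_original = statistics.pvariance(data)
var_scaled = statistics.pvariance(scaled)

print(sd_original, sd_shifted, sd_scaled)
print(var_original, var_scaled)
```

The SD stays at 2 after the shift and becomes 6 after scaling by 3, while the variance goes from 4 to 36, which is 9 times the original.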
Coefficient of variation (CV) is
A (Mean/SD)×100
B SD minus mean
C (SD/Mean)×100
D Mean plus SD
CV compares dispersion relative to the mean. It is useful for comparing variability across different datasets, especially when means are different or units differ.
Smaller CV indicates
A More consistency
B Less consistency
C Larger spread always
D Mean is maximum
A smaller coefficient of variation means the standard deviation is small compared to the mean, showing values are more tightly clustered and the series is more consistent.
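To compare consistency with CV, the following sketch computes (SD/Mean)×100 for two hypothetical series that share the same mean. The function name and the data are assumptions for illustration, and the population SD is used.

```python
import statistics

def coefficient_of_variation(data):
    """CV as a percentage: (population SD / mean) x 100."""
    mean_value = statistics.mean(data)
    if mean_value == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.pstdev(data) / mean_value * 100

# Two hypothetical series, both with mean 50
steady = [48, 50, 52]
erratic = [20, 50, 80]

cv_steady = coefficient_of_variation(steady)
cv_erratic = coefficient_of_variation(erratic)

print(round(cv_steady, 2), round(cv_erratic, 2))
```

The first series has a CV of roughly 3.3% and the second roughly 49%, so the first is the more consistent one.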
Which measure depends only on extremes
A Variance
B Mean deviation
C Range
D Quartile deviation
Range uses only the maximum and minimum values. Because it ignores all middle observations, it is highly sensitive to outliers and may not represent typical spread.
A relative measure of dispersion is
A Variance
B Coefficient of variation
C Standard deviation
D Interquartile range
Relative measures compare dispersion to a central value, making them unit-free. CV is relative. Variance, SD, and IQR are absolute measures: SD and IQR are in data units, and variance is in squared units.
Mean deviation coefficient (about mean) is
A Mean / MD
B MD × Mean
C MD − Mean
D MD / Mean
Coefficient of mean deviation makes mean deviation comparable across datasets. It is computed as mean deviation divided by the central value (mean or median), giving a relative measure.
For grouped data, mean is computed using
A Class boundaries only
B Only cumulative total
C Class midpoints
D Only extreme classes
In grouped data, individual values are not known, so we use class marks (midpoints) as representative values. Then mean is Σ(f×m)/Σf.
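The grouped mean Σ(f×m)/Σf can be sketched as below. The class boundaries and frequencies are hypothetical.

```python
# Hypothetical grouped data: (lower boundary, upper boundary) per class
classes = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

# Class marks (midpoints) stand in for the unknown individual values
midpoints = [(low + high) / 2 for low, high in classes]

grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / sum(frequencies)

print(midpoints, grouped_mean)
```

Here Σ(f×m) = 10 + 75 + 75 = 160 and Σf = 10, giving a mean of 16.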
Median class in grouped data is the class where
A CF crosses N/2
B Frequency is maximum
C Midpoint is highest
D Range is smallest
For grouped data, locate N/2 in the cumulative frequency column. The class whose cumulative frequency first reaches or exceeds N/2 is the median class, which is used in the median formula.
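The chapter does not state the median formula itself. The sketch below assumes the usual textbook form, Median = L + ((N/2 − cf)/f) × h, where L is the lower boundary of the median class, cf the cumulative frequency before it, f its frequency, and h its width. The data are hypothetical, and the median class is taken as the first class whose cumulative frequency reaches N/2.

```python
from itertools import accumulate

# Hypothetical grouped data
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [3, 7, 12, 8]

total = sum(frequencies)                    # N
cumulative = list(accumulate(frequencies))  # less-than cumulative frequencies

# Median class: first class whose cumulative frequency reaches N/2
median_index = next(i for i, cf in enumerate(cumulative) if cf >= total / 2)

lower, upper = classes[median_index]
cf_before = cumulative[median_index - 1] if median_index > 0 else 0
f_median = frequencies[median_index]
width = upper - lower

grouped_median = lower + (total / 2 - cf_before) / f_median * width

print(classes[median_index], round(grouped_median, 2))
```

With N = 30, N/2 = 15 falls in the class 20 to 30, and the estimate is 20 + (5/12) × 10, about 24.17.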
Mode for grouped data uses
A Median class only
B Modal class frequency
C Mean of class marks
D Cumulative frequency
The modal class is the class with highest frequency. Grouped mode formula uses frequencies of modal class and adjacent classes to estimate the most typical value.
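The grouped mode formula is not written out above. This sketch assumes the common textbook form, Mode = L + ((f1 − f0)/(2f1 − f0 − f2)) × h, where f1 is the modal class frequency, f0 and f2 are the frequencies of the classes before and after it, L is the lower boundary of the modal class, and h is its width. The data are hypothetical.

```python
# Hypothetical grouped data
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [3, 7, 12, 8]

# Modal class: first class with the highest frequency
modal_index = frequencies.index(max(frequencies))
lower, upper = classes[modal_index]
width = upper - lower

f1 = frequencies[modal_index]
# A missing neighbour (modal class at either end) is treated as frequency 0
f0 = frequencies[modal_index - 1] if modal_index > 0 else 0
f2 = frequencies[modal_index + 1] if modal_index < len(frequencies) - 1 else 0

denominator = 2 * f1 - f0 - f2
if denominator == 0:
    raise ValueError("formula is undefined when neighbouring frequencies equal the modal frequency")

grouped_mode = lower + (f1 - f0) / denominator * width

print(classes[modal_index], round(grouped_mode, 2))
```

The modal class is 20 to 30, and the estimate is 20 + (5/9) × 10, about 25.56.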
A scatter plot is used for
A Only grouped frequencies
B Only one-variable mean
C Relationship of two variables
D Only class intervals
Scatter plots show paired data points (x, y). The pattern of dots suggests positive, negative, or no relationship, giving a visual idea of correlation.
Correlation describes
A Total data size
B Class width changes
C Only central value
D Degree of association
Correlation measures how strongly two variables move together. It does not imply causation. Positive means both increase together; negative means one increases as the other decreases.
Correlation coefficient r lies between
A −1 and +1
B 0 and 1 only
C −∞ to +∞
D 1 and 10
The correlation coefficient ranges from −1 (perfect negative) to +1 (perfect positive). Values near 0 indicate weak linear relation, though other patterns may still exist.
Covariance indicates
A Median location only
B Maximum class width
C Joint variability sign
D Frequency total only
Covariance shows whether two variables tend to increase together (positive) or move oppositely (negative). Its magnitude depends on units, so correlation is preferred for comparison.
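The difference between covariance and correlation can be seen by changing units. The sketch below writes both from their definitions (population covariance, Pearson r). The function names and data are illustrative assumptions, and `correlation` assumes neither variable is constant.

```python
import math

def covariance(xs, ys):
    """Population covariance: mean product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Pearson r: covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical paired data with a perfect positive linear relation
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_scaled = [100 * v for v in y]   # same quantity in units 100 times smaller
y_reversed = y[::-1]              # perfect negative relation

print(covariance(x, y), covariance(x, y_scaled))    # covariance changes with units
print(correlation(x, y), correlation(x, y_scaled))  # r does not
print(correlation(x, y_reversed))
```

Rescaling y multiplies the covariance by 100 (from 4 to 400) but leaves r at +1, and reversing y gives r = −1.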
A z-score represents
A Raw class midpoint
B Standardized distance
C Absolute deviation sum
D Cumulative frequency
A z-score tells how many standard deviations a value is away from the mean. It helps compare values from different distributions using a common standardized scale.
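A z-score is (value − mean)/SD. The following sketch compares two hypothetical test scores from distributions with different means and spreads. The function name and all numbers are assumptions for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd

# Hypothetical scores: 70 where mean = 60, SD = 5; 80 where mean = 75, SD = 10
z_first = z_score(70, 60, 5)
z_second = z_score(80, 75, 10)

print(z_first, z_second)
```

Although 80 is the higher raw score, the score of 70 is relatively better: it lies 2 standard deviations above its mean, against 0.5 for the other.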
Standard error is related to
A Sample mean variability
B Maximum of dataset
C Class interval size
D Mode of dataset
Standard error measures how much a sample mean is expected to vary from sample to sample. It depends on standard deviation and sample size, and is used in estimation.
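The dependence on standard deviation and sample size can be made concrete. The sketch assumes the usual formula for the standard error of the mean, SE = SD/√n, which the explanation above does not write out. The numbers are hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

# Same hypothetical SD of 10, two different sample sizes
se_small_sample = standard_error(10, 25)
se_large_sample = standard_error(10, 100)

print(se_small_sample, se_large_sample)
```

Quadrupling the sample size from 25 to 100 halves the standard error, from 2 to 1.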
Which graph is best for categorical data
A Histogram
B Ogive
C Bar chart
D Box plot only
Bar charts compare frequencies of categories with separated bars. Histograms are for continuous grouped data with adjacent bars. Ogives show cumulative frequency, not categories.
Percentiles divide data into
A 4 equal parts
B 10 equal parts
C 2 equal parts
D 100 equal parts
Percentiles split ordered data into 100 equal parts. For example, the 90th percentile is the value below which 90% of the observations lie, which is useful for ranking and cutoffs.
Quartiles divide data into
A Ten equal parts
B Four equal parts
C Hundred equal parts
D Two equal parts
Quartiles split ordered data into four equal parts: Q1 (25%), Q2 (50%, the median), and Q3 (75%). They are used to understand spread and build box plots.
Chebyshev’s idea states that
A Most data near mean
B Mean equals mode always
C At least 1−1/k² within k SD
D Range equals 2SD
Chebyshev’s inequality guarantees that, for any distribution, at least 1 − 1/k² of the data lie within k standard deviations of the mean (k > 1). It works even without normality assumptions.
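The bound 1 − 1/k² can be compared with the actual proportion in a dataset. This is a sketch on hypothetical data. The helper name is an assumption, and "within" is taken strictly (distance less than k SD).

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is informative only for k > 1")
    return 1 - 1 / k ** 2

# Hypothetical data with mean 5 and population SD 2
data = [2, 4, 4, 4, 5, 5, 7, 9]
mean_value = statistics.mean(data)
sd = statistics.pstdev(data)

k = 2
bound = chebyshev_lower_bound(k)

# Observed proportion strictly within k standard deviations of the mean
within = sum(1 for x in data if abs(x - mean_value) < k * sd) / len(data)

print(bound, within)
```

For k = 2 the guaranteed minimum is 0.75, and the observed proportion here is 7/8 = 0.875, which satisfies the bound.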
If variance is 25, SD is
A 5
B 10
C 25
D 50
Standard deviation is the square root of variance. √25 = 5. SD is in the same unit as data, while variance is in squared units.
For data 2, 4, 6, mean is
A 3
B 6
C 4
D 2
Mean = (2+4+6)/3 = 12/3 = 4. Mean uses all values and gives the average level of the dataset.
For data 3, 3, 7, mode is
A 7
B 3
C 5
D 10
Mode is the value occurring most frequently. Here 3 appears twice while 7 appears once, so 3 is the mode. Mode helps identify the most common observation.