Chapter 26: Statistics and Measures of Dispersion (Set-2)
A “variable” in statistics is
A Fixed constant always
B Only a formula
C Characteristic that changes
D Only a table
A variable is any measurable or observable characteristic that can take different values among individuals or observations, such as height, marks, age, or category like blood group.
Data collected first-hand is called
A Primary data
B Secondary data
C Grouped data
D Bivariate data
Primary data are collected directly by the investigator for a specific study, such as surveys or experiments. Secondary data are already collected by others and reused.
Data taken from books/reports is
A Primary data
B Secondary data
C Raw data
D Nominal data
Secondary data come from existing sources like reports, census records, journals, or websites. They are useful but may not perfectly match the current study purpose.
Raw data means
A Only grouped values
B Only percentages
C Unorganized original values
D Only graph points
Raw data are the original observations as collected, without sorting, grouping, or summarizing. They are later organized into tables or graphs for analysis.
A “class mark” is
A Midpoint of class
B Lower class limit
C Upper class limit
D Class width only
Class mark (mid-value) represents a class interval and equals (lower limit + upper limit)/2. It is widely used for computing mean and variance in grouped data.
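The class-mark formula can be illustrated with a minimal sketch. The class intervals and frequencies below are hypothetical, chosen only for illustration, and `class_mark` is an assumed helper name.

```python
def class_mark(lower, upper):
    """Return the midpoint of a class interval."""
    return (lower + upper) / 2

# Hypothetical grouped data: (lower limit, upper limit) with frequencies.
classes = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

marks = [class_mark(lo, hi) for lo, hi in classes]

# Grouped mean: each class mark stands in for every value in its class.
grouped_mean = sum(f * m for f, m in zip(frequencies, marks)) / sum(frequencies)

print(marks)         # [5.0, 15.0, 25.0]
print(grouped_mean)  # 16.0
```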
Class width is found by
A Lower minus upper
B Mean minus median
C Frequency minus CF
D Upper minus lower
Class width (class size) is the difference between upper and lower class boundaries/limits. It shows how wide each group interval is in a frequency distribution.
In a “less-than ogive,” points are plotted using
A Upper boundaries
B Lower boundaries
C Class marks only
D Frequencies only
A less-than ogive uses cumulative frequencies against upper class boundaries. It shows how many observations are less than a given value, useful for median and quartiles.
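As a rough sketch, the points of a less-than ogive can be built by pairing each upper class boundary with the running total of frequencies. The boundaries and frequencies here are hypothetical.

```python
from itertools import accumulate

# Hypothetical frequency distribution for classes 0-10, 10-20, 20-30, 30-40.
upper_boundaries = [10, 20, 30, 40]
frequencies = [3, 7, 6, 4]

# Running totals give the "less than" cumulative frequencies.
less_than_cf = list(accumulate(frequencies))

# Each ogive point is (upper boundary, cumulative frequency).
ogive_points = list(zip(upper_boundaries, less_than_cf))
print(ogive_points)  # [(10, 3), (20, 10), (30, 16), (40, 20)]
```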
In a “more-than ogive,” points use
A Upper boundaries
B Midpoints only
C Lower boundaries
D Percentages only
A more-than ogive uses cumulative frequencies (more-than type) against lower class boundaries. It shows how many observations exceed a value and helps locate quantiles.
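A more-than ogive can be sketched the same way, accumulating frequencies from the last class backwards and pairing them with lower boundaries. The data below are hypothetical.

```python
from itertools import accumulate

# Hypothetical frequency distribution for classes 0-10, 10-20, 20-30, 30-40.
lower_boundaries = [0, 10, 20, 30]
frequencies = [3, 7, 6, 4]

# Accumulate from the last class backwards, then restore class order.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# Each ogive point is (lower boundary, number of observations at or above it).
ogive_points = list(zip(lower_boundaries, more_than_cf))
print(ogive_points)  # [(0, 20), (10, 17), (20, 10), (30, 4)]
```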
A frequency polygon is drawn by joining
A Class limits with bars
B Midpoints with lines
C Cumulative CF points
D Only highest bars
Frequency polygon is made by plotting frequencies at class marks (midpoints) and joining them by straight lines. It gives a clear picture of distribution shape.
A pie chart mainly shows
A Parts of whole
B Cumulative frequency
C Two-variable relation
D Class width pattern
A pie chart represents categories as sectors of a circle. Each sector angle is proportional to its share, making it useful for showing percentage composition.
When data has an extreme outlier, best average is
A Mean
B Range
C Median
D Variance
Median is resistant to extreme values because it depends on position, not magnitude. Mean gets pulled toward outliers, so median better represents “typical” value in such cases.
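A small sketch with made-up numbers shows how differently the two averages react when a single extreme value is added.

```python
from statistics import mean, median

# Hypothetical observations, then the same data with one extreme value added.
values = [10, 12, 13, 14, 15]
with_outlier = values + [200]

print(mean(values), median(values))              # 12.8 13
print(mean(with_outlier), median(with_outlier))  # 44 13.5
```

The mean jumps from 12.8 to 44, while the median moves only from 13 to 13.5.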
Mean is most affected by
A Middle values
B Class labels
C Sample size only
D Extreme values
Mean uses every value, so very large or very small observations strongly influence it. That is why mean is sensitive to outliers compared to median and mode.
If all values are equal, variance is
A 1
B 0
C Mean value
D Range value
When all observations are identical, every deviation from the mean is zero. Squared deviations are also zero, so variance and standard deviation both become zero.
A distribution with one peak is
A Bimodal
B Trimodal
C Unimodal
D Uniform
Unimodal distribution has a single most frequent class/value. If there are two peaks, it is bimodal. Modality describes the number of prominent modes.
The second quartile Q2 is
A Median
B Mean
C Mode
D Range
Q2 divides ordered data into two equal halves, which is exactly the median. Quartiles are position-based measures: Q1 (25%), Q2 (50%), Q3 (75%).
Percentile P50 equals
A Mode
B Range
C Median
D Variance
The 50th percentile is the value below which 50% of the observations lie. That is the definition of the median. Therefore P50 and Q2 represent the same location measure.
A box plot is based on
A Five-number summary
B Mean and SD
C Class width only
D Correlation value
Box plot uses minimum, Q1, median, Q3, and maximum. It clearly shows spread, central value, and possible outliers, often using IQR to flag outliers.
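The five-number summary and an IQR-based outlier check might be sketched as follows. Two assumptions are not from the text: the 1.5 × IQR fences are a widely used convention, and the "inclusive" quartile method is only one of several conventions, so other tools may report slightly different quartiles. The data are hypothetical.

```python
from statistics import quantiles

# Hypothetical sample with one suspiciously large value.
data = sorted([2, 4, 4, 5, 6, 7, 8, 9, 30])

# Quartiles under the "inclusive" convention (an assumption; conventions vary).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
summary = (data[0], q1, q2, q3, data[-1])

# Assumed rule: flag values beyond 1.5 * IQR from the quartiles.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(outliers)  # [30]
```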
Which is an absolute dispersion measure
A Coefficient variation
B Standard deviation
C Coefficient range
D Relative deviation
Absolute measures are in original units (or squared units), like range, IQR, SD, variance. Relative measures are unit-free ratios, like coefficient of variation.
A unit-free dispersion measure is
A Standard deviation
B Mean deviation
C Coefficient of variation
D Interquartile range
CV = (SD/Mean)×100, so units cancel. It helps compare variability of datasets having different scales or different measurement units.
If mean = 20 and SD = 4, CV is
A 20%
B 4%
C 80%
D 5%
CV = (SD/Mean)×100 = (4/20)×100 = 20%. This indicates the SD is one-fifth of the mean, showing moderate relative variability.
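The arithmetic can be checked with a minimal sketch. The function name `coefficient_of_variation` and the two comparison series are assumptions introduced for illustration.

```python
def coefficient_of_variation(sd, mean):
    """CV as a percentage; assumes a nonzero mean."""
    return 100 * sd / mean

# The worked example: mean = 20, SD = 4.
cv = coefficient_of_variation(4, 20)
print(cv)  # 20.0

# Hypothetical comparison of two series measured on different scales.
cv_a = coefficient_of_variation(5, 50)     # 10.0
cv_b = coefficient_of_variation(10, 200)   # 5.0
```

Series B has the larger SD but the smaller CV, so it is relatively less variable.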
Mean deviation is also called
A Standard deviation
B Quartile deviation
C Relative deviation
D Mean absolute deviation
Mean deviation is the average of absolute deviations from a central value. Because absolute values are used, it is often called mean absolute deviation.
Mean deviation about median uses
A Squares from median
B Absolute from mean
C Absolute from median
D Squares from mean
Mean deviation about median is computed by averaging |x − median| (or using frequencies in grouped data). It measures typical absolute distance from the median.
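For ungrouped data, the calculation can be sketched as below. The helper name and the sample values are hypothetical.

```python
from statistics import median

def mean_deviation_about_median(values):
    """Average of |x - median| over all observations."""
    med = median(values)
    return sum(abs(x - med) for x in values) / len(values)

# Hypothetical data: median is 6, absolute deviations are 4, 2, 0, 2, 4.
data = [2, 4, 6, 8, 10]
result = mean_deviation_about_median(data)
print(result)  # 2.4
```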
For grouped data, mean deviation uses
A Class marks
B Class boundaries only
C Ogive points
D Pie angles
In grouped data, exact values are unknown, so class marks represent each class. Mean deviation is computed using frequencies and absolute deviations of class marks from mean/median.
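A minimal sketch of the grouped version, taking deviations about the mean and using hypothetical class marks and frequencies:

```python
# Hypothetical class marks (for classes 0-10, 10-20, 20-30) and frequencies.
marks = [5.0, 15.0, 25.0]
frequencies = [2, 5, 3]
n = sum(frequencies)

# Grouped mean, with each class mark standing in for its class.
grouped_mean = sum(f * m for f, m in zip(frequencies, marks)) / n

# Frequency-weighted average of absolute deviations from the mean.
mean_deviation = sum(f * abs(m - grouped_mean) for f, m in zip(frequencies, marks)) / n

print(grouped_mean)    # 16.0
print(mean_deviation)  # 5.4
```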
Variance unit is
A Same unit
B Squared unit
C No unit
D Percent unit
Variance is based on squared deviations, so its unit becomes the square of the original unit (e.g., cm²). Standard deviation returns to original unit by square root.
SD is preferred over variance because
A Uses only extremes
B Ignores outliers
C Same unit as data
D Always integer
Standard deviation is in the same unit as observations, making interpretation easier. Variance is in squared units, which is less intuitive for understanding spread.
Population SD formula divides by
A n
B n−1
C n+1
D √n
For a population, variance and SD use division by n. For a sample variance used as an unbiased estimator, division by (n−1) is commonly used.
Sample variance commonly divides by
A n
B n−1
C n+1
D 2n
Sample variance often uses (n−1) in the denominator to correct bias when estimating population variance from a sample. This adjustment is called Bessel’s correction.
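The two denominators can be compared using Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n−1. The data set is hypothetical.

```python
from statistics import pvariance, variance

# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_var = pvariance(data)  # 32 / 8
sample_var = variance(data)       # 32 / 7

print(population_var)          # 4
print(round(sample_var, 3))    # 4.571
```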
If a constant is added to data, variance
A Increases by constant
B Doubles always
C Unchanged
D Becomes zero
Adding a constant shifts all values and the mean equally, leaving deviations unchanged. Therefore squared deviations, variance, and standard deviation remain the same.
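A quick check of this property, using a hypothetical data set and an arbitrary shift of 10:

```python
from statistics import pvariance

# Hypothetical data and the same data shifted by a constant.
data = [2, 4, 4, 4, 5, 5, 7, 9]
shifted = [x + 10 for x in data]

# Deviations from the mean are identical, so the variance is unchanged.
print(pvariance(data), pvariance(shifted))  # 4 4
```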
If data are multiplied by k, SD becomes
A |k| times
B k² times
C k/2 times
D Unchanged
Multiplying every value by k multiplies each deviation by k, so SD scales by |k|. Variance scales by k² due to squaring of deviations.
If data are multiplied by k, variance becomes
A |k| times
B k times
C Unchanged
D k² times
Variance is the mean of squared deviations. Multiplying all deviations by k makes squared deviations multiply by k², so variance becomes k² times the original.
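The scaling rules for SD and variance can be checked with a short sketch. The data and the factor k = −3 are hypothetical; a negative k is used to show why SD scales by |k| rather than k.

```python
from statistics import pstdev, pvariance

# Hypothetical data (population variance 4, SD 2) and a negative scale factor.
data = [2, 4, 4, 4, 5, 5, 7, 9]
k = -3
scaled = [k * x for x in data]

variance_ratio = pvariance(scaled) / pvariance(data)  # expected k**2 = 9
sd_ratio = pstdev(scaled) / pstdev(data)              # expected |k| = 3

print(variance_ratio, sd_ratio)
```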
Combined mean of two groups depends on
A Only two SDs
B Only two medians
C Sizes and means
D Only two ranges
Combined mean is a weighted average: (n1x̄1 + n2x̄2)/(n1 + n2). It depends on group sizes and their means, not on dispersion measures directly.
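The weighted-average formula can be sketched as follows; the group sizes and means are made up for illustration.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Weighted average of two group means, weighted by group size."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Hypothetical groups: 30 items averaging 50 and 20 items averaging 60.
result = combined_mean(30, 50, 20, 60)
print(result)  # 54.0
```

The result sits closer to 50 than to 60 because the first group is larger.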
If two series have equal mean, lower SD means
A Less consistency
B More consistency
C More skewness
D Higher median
With equal means, smaller SD indicates values are closer to the mean, showing less spread. Hence that series is more consistent or stable.
Which measure is most robust to outliers
A Interquartile range
B Range
C Variance
D Mean
IQR uses the middle 50% of the data and ignores the extreme tails, making it resistant to outliers. Range and variance react strongly to extreme observations.
Median for even number of values is
A Middle value only
B Most frequent only
C Sum of two middle
D Average of two middle
For even n, there is no single middle value. Median is the mean of the two central values after ordering, keeping the position-based nature of median.
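A minimal sketch of the rule for odd and even n follows. The function name `median_of` is an assumption, and Python's built-in `statistics.median` behaves the same way.

```python
def median_of(values):
    """Middle value for odd n; mean of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([4, 1, 3, 2]))  # 2.5
print(median_of([7, 1, 3]))     # 3
```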
The relationship “r = 0” indicates
A Perfect positive
B Perfect negative
C No linear relation
D Same mean always
r = 0 means no linear association between variables. However, a nonlinear relationship may still exist. Correlation measures only linear association.
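To see that r = 0 does not rule out a relationship, consider a sketch where y depends exactly on x through y = x². The data are hypothetical, and `pearson_r` is an illustrative helper implementing the usual product-moment formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from sums of products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# y is completely determined by x, but the relation is not linear.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # 0.0
```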
If r = −1, the relation is
A Perfect negative
B Weak negative
C No relation
D Perfect positive
r = −1 indicates a perfect negative linear relationship: as one variable increases, the other decreases exactly along a straight line.
If r = +1, the relation is
A Weak positive
B Perfect positive
C No relation
D Perfect negative
r = +1 indicates a perfect positive linear relationship: all points lie exactly on an increasing straight line, so each unit increase in one variable corresponds to a fixed increase in the other.
Correlation is unaffected by
A Extreme outliers always
B Sample size always
C Change of origin, scale
D Frequency table only
Correlation coefficient is unit-free and based on standardized values, so changing origin (adding constants) or scale (multiplying by positive constants) does not change r; multiplying one variable by a negative constant only reverses its sign. Outliers, however, can affect it.
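The invariance can be checked numerically with a small sketch. The data, the transformation constants, and the helper `pearson_r` are all hypothetical. A negative scale factor is included to show that it flips the sign of r while leaving its magnitude unchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from sums of products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_original = pearson_r(xs, ys)

# Change of origin and positive scale on both variables: r is unchanged.
r_transformed = pearson_r([2 * x + 7 for x in xs], [0.5 * y - 3 for y in ys])

# Negative scale on one variable: same magnitude, opposite sign.
r_flipped = pearson_r([-2 * x for x in xs], ys)

print(round(r_original, 4), round(r_transformed, 4), round(r_flipped, 4))
```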
Moments in statistics are mainly used for
A Shape description
B Finding class width
C Drawing histogram
D Counting frequencies
Moments summarize distribution characteristics. Central moments help describe variability and shape, including skewness and kurtosis. They are more advanced summaries beyond mean and variance.
First raw moment about origin equals
A Mean
B Variance
C Median
D Mode
The first moment about origin is the average of x values, which is the arithmetic mean. Higher moments relate to spread and shape features of distribution.
Skewness describes
A Peak height only
B Asymmetry of distribution
C Total data size
D Class width change
Skewness tells whether a distribution is symmetric or tilted. Positive skew generally has a longer right tail; negative skew has a longer left tail.
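One common way to quantify skewness, assumed here because the text does not name a formula, is the moment coefficient m3 / m2^1.5, where m2 and m3 are the second and third central moments. A sketch with hypothetical data:

```python
def moment_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]   # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]   # mirror image of the above

print(moment_skewness(symmetric))         # 0.0
print(moment_skewness(right_tailed) > 0)  # True
print(moment_skewness(left_tailed) < 0)   # True
```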
Kurtosis describes
A Central location
B Class frequency sum
C Peakedness/tailedness
D Scatter direction
Kurtosis is traditionally described as the peakedness of a distribution, but it mainly reflects how heavy the tails are compared to a normal distribution. It complements skewness in describing distribution shape.
A regression line is used to
A Find median only
B Draw histogram
C Compute range
D Predict one variable
Regression relates two variables and provides an equation to estimate or predict one variable based on the other. It is widely used in forecasting and data modeling.
Bivariate data means
A Two variables per item
B One variable only
C Only grouped classes
D Only categorical list
Bivariate data include paired observations like (height, weight) or (study hours, marks). It is used for scatter plots, correlation, covariance, and regression analysis.
The normal distribution is
A Always two peaks
B Uniform flat line
C Bell-shaped curve
D Only step curve
Normal distribution is symmetric and bell-shaped around the mean. Many natural measurements approximately follow it, and its spread is described well by standard deviation.
In normal distribution, mean equals
A Mean = mode only
B Mean = median = mode
C Median = range
D Mode = variance
For a perfectly normal (symmetric) distribution, the mean, median, and mode coincide at the center. This reflects symmetry and a single highest peak.
The 68–95–99.7 rule relates to
A Normal distribution spread
B Range calculation
C Ogive drawing
D Pie chart angles
For a normal distribution, about 68% of values lie within 1 SD, 95% within 2 SD, and 99.7% within 3 SD of the mean. This helps interpret what SD means.
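These percentages can be reproduced from the normal cumulative distribution function. The sketch below uses `NormalDist` from Python's standard `statistics` module; `share_within` is an illustrative helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Probability that a normal value lies within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```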
Standard deviation mainly measures
A Middle position only
B Most frequent value
C Typical spread around mean
D Total observations
SD indicates the typical distance of observations from the mean. Larger SD means more scatter. It uses squared deviations, so larger gaps influence SD more strongly.
For data 1, 2, 3, 4, median is
A 2.5
B 2
C 3
D 4
Ordered data are 1,2,3,4. With even count, median is average of the two middle values: (2+3)/2 = 2.5. Median is position-based, not magnitude-based.
For data 5, 5, 5, 5, SD is
A 5
B 1
C 25
D 0
All observations are identical, so each deviation from the mean is zero. Therefore variance becomes zero and standard deviation, being its square root, is also zero.