Chapter 26: Statistics and Measures of Dispersion (Set-5)
Data: 2, 4, 6, 8, 10; the sample variance is
A 8
B 6
C 10
D 5
Mean = 6. Sum of squared deviations = 40. Sample variance = 40/(5−1) = 10. Population variance would be 40/5 = 8, but sample variance divides by n−1.
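The sample/population distinction can be checked with Python's statistics module, which provides both forms:

```python
import statistics

data = [2, 4, 6, 8, 10]
mean = statistics.mean(data)                 # 6
ss = sum((x - mean) ** 2 for x in data)      # sum of squared deviations = 40

sample_var = ss / (len(data) - 1)            # divide by n-1 -> 10
population_var = ss / len(data)              # divide by n   -> 8

# statistics.variance uses n-1; statistics.pvariance uses n
assert sample_var == statistics.variance(data) == 10
assert population_var == statistics.pvariance(data) == 8
```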
If variance of X is 9, variance of (2X+5) is
A 18
B 9
C 4
D 36
Variance under linear transformation: Var(aX+b)=a²Var(X). Here a=2, b=5. So Var(2X+5)=4×9=36. Adding 5 does not change variance.
If SD of X is 7, SD of (−3X+2) is
A 21
B −21
C 9
D 7
SD scales by |a| under transformation (aX+b). Here a=−3, so SD becomes |−3|×7 = 3×7 = 21. SD is never negative.
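Both transformation rules above, Var(aX+b) = a²Var(X) and SD(aX+b) = |a|·SD(X), can be verified numerically (the data values here are illustrative, not from the question):

```python
import statistics

x = [1.0, 4.0, 7.0, 2.0, 6.0]   # arbitrary illustrative data
a, b = -3, 2
y = [a * v + b for v in x]

# Var(aX+b) = a^2 * Var(X); SD(aX+b) = |a| * SD(X)
assert abs(statistics.pvariance(y) - a**2 * statistics.pvariance(x)) < 1e-9
assert abs(statistics.pstdev(y) - abs(a) * statistics.pstdev(x)) < 1e-9
```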
If Var(X)=4, Var(Y)=9 and Cov(X,Y)=3, then Var(X+Y)=
A 22
B 19
C 25
D 16
Use Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y). Substituting 4, 9, and 3 gives 4+9+6=19. Covariance increases variance of the sum when positive.
If Var(X)=4, Var(Y)=9 and Cov(X,Y)=3, then Var(X−Y)=
A 10
B 19
C 22
D 7
Var(X−Y)=Var(X)+Var(Y)−2Cov(X,Y)=4+9−6=7. Positive covariance reduces variance of the difference because common movement cancels out.
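Both identities, Var(X±Y) = Var(X) + Var(Y) ± 2Cov(X,Y), can be checked on a small paired dataset (the values are illustrative, not from the question):

```python
import statistics

x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 3.0, 2.0, 6.0]        # arbitrary paired data
n = len(x)
mx, my = statistics.mean(x), statistics.mean(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n   # population covariance

vx, vy = statistics.pvariance(x), statistics.pvariance(y)
vsum = statistics.pvariance([a + b for a, b in zip(x, y)])
vdiff = statistics.pvariance([a - b for a, b in zip(x, y)])

assert abs(vsum - (vx + vy + 2 * cov)) < 1e-9    # Var(X+Y)
assert abs(vdiff - (vx + vy - 2 * cov)) < 1e-9   # Var(X-Y)
```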
If r=0.6, SDx=5, SDy=10, then Cov(X,Y) is
A 30
B 3
C 0.6
D 50
Cov = r·SDx·SDy = 0.6×5×10 = 30. Covariance carries units equal to the product of the units of X and Y.
If Cov(X,Y)=0 and Var(X),Var(Y) finite, it guarantees
A Independent always
B Perfect linearity
C Same distributions
D Uncorrelated only
Covariance zero means no linear association (uncorrelated). It does not necessarily imply independence unless additional conditions hold (like jointly normal variables).
For class intervals 0–10,10–20,20–30, frequencies 2,3,5; the mean is
A 20
B 21
C 18
D 22
Using class marks 5, 15, 25: Σfx = 2×5 + 3×15 + 5×25 = 180 and Σf = 10, so mean = 180/10 = 18, which matches option C.
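The grouped-mean computation can be written as a short sketch, taking each class mark as the interval midpoint:

```python
# Grouped mean: class marks are interval midpoints; mean = sum(f*x) / sum(f)
intervals = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 3, 5]

marks = [(lo + hi) / 2 for lo, hi in intervals]            # 5, 15, 25
mean = sum(f * x for f, x in zip(freqs, marks)) / sum(freqs)
assert mean == 18.0
```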
For grouped data, median formula uses
A L + [(N/2−CF)/f]h
B L + [(N−CF)/f]h
C L + [(CF−N/2)/f]h
D L + [(N/2)/f]h
Grouped median is found by interpolation: Median = L + [(N/2 − cumulative frequency before median class)/frequency of median class] × class width h. It assumes uniform spread in the class.
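A minimal implementation of this interpolation formula, applied to the class intervals from the earlier question, looks like:

```python
def grouped_median(intervals, freqs):
    """Median = L + [(N/2 - CF) / f] * h for the median class."""
    n = sum(freqs)
    cf = 0
    for (lo, hi), f in zip(intervals, freqs):
        if cf + f >= n / 2:      # first class whose cumulative frequency reaches N/2
            return lo + ((n / 2 - cf) / f) * (hi - lo)
        cf += f

intervals = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 3, 5]
# N = 10, median class is 10-20 (CF before = 2, f = 3, h = 10)
assert grouped_median(intervals, freqs) == 20.0
```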
For grouped mode, the formula includes
A L + [(f0−f1)/(f2−f1)]h
B L + [(f2−f0)/(f1)]h
C L + [(f1)/(f0+f2)]h
D L + [(f1−f0)/(2f1−f0−f2)]h
Grouped mode uses modal class frequency f1 and adjacent class frequencies f0 and f2: Mode = L + [(f1−f0)/(2f1−f0−f2)]h, estimating peak within modal class.
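The grouped-mode formula can be sketched the same way; missing neighbour classes are treated here as having zero frequency, which is one common convention:

```python
def grouped_mode(intervals, freqs):
    """Mode = L + [(f1 - f0) / (2*f1 - f0 - f2)] * h for the modal class."""
    i = freqs.index(max(freqs))
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                  # frequency before modal class
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0     # frequency after modal class
    lo, hi = intervals[i]
    return lo + ((f1 - f0) / (2 * f1 - f0 - f2)) * (hi - lo)

intervals = [(0, 10), (10, 20), (20, 30)]
freqs = [2, 3, 5]
# Modal class 20-30: f1=5, f0=3, f2=0 -> 20 + (2/7)*10
assert abs(grouped_mode(intervals, freqs) - (20 + 20 / 7)) < 1e-9
```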
If mean deviation is taken about median, it is
A Maximum possible
B Minimum possible
C Always equals SD
D Always equals range
The sum (and hence mean) of absolute deviations is minimized when taken about the median. This is why the median is the best center for measures based on absolute deviations.
If all values are increased by 10, coefficient of variation generally
A Changes
B Unchanged always
C Becomes zero
D Doubles
Adding a constant does not change SD, but it changes mean. Since CV = (SD/mean)×100, the denominator changes, so CV typically changes under translation.
If all values are multiplied by 2, coefficient of variation
A Doubles
B Halves
C Becomes zero
D Unchanged
Multiplying by k scales mean by k and SD by |k|, so SD/mean stays the same. Hence CV remains unchanged under scaling by a nonzero constant.
Two series have same SD but different means; lower CV means
A Lesser consistency
B Same variability
C Greater consistency
D No conclusion
With same SD, the series with larger mean will have smaller CV, indicating smaller relative variability. Lower CV implies values are more consistent relative to the average level.
For data 1, 3, 3, 7, 9, the second central moment equals
A Variance
B Mean deviation
C Range
D Median
The second central moment is the average of squared deviations from the mean, which is variance (population version). Central moments describe shape; second central moment measures spread.
For any distribution with finite variance, Chebyshev guarantees within 3 SD at least
A 3/4
B 8/9
C 2/3
D 9/10
Chebyshev: proportion within k SD is at least 1−1/k². For k=3: 1−1/9 = 8/9. This is a minimum guarantee for any distribution.
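The bound 1 − 1/k² can be computed exactly with fractions:

```python
from fractions import Fraction

def chebyshev_lower_bound(k):
    """At least 1 - 1/k^2 of any distribution lies within k SDs of the mean."""
    return 1 - Fraction(1, k**2)

assert chebyshev_lower_bound(2) == Fraction(3, 4)   # within 2 SD: at least 3/4
assert chebyshev_lower_bound(3) == Fraction(8, 9)   # within 3 SD: at least 8/9
```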
A histogram with unequal class widths should use height as
A Frequency only
B Cumulative frequency
C Relative frequency only
D Frequency density
With unequal widths, heights must be frequency/width so that rectangle area represents frequency. Otherwise, wider classes would look falsely larger just due to width.
A distribution is strongly right-skewed; best location measure is
A Median
B Mean
C Mode always
D Range
In strong skewness, mean is pulled toward long tail and becomes less representative. Median is resistant to outliers and skewness, giving a better typical value.
A distribution is roughly symmetric with no extreme values; best location measure is
A Median
B Mode
C Range
D Mean
For symmetric distributions without outliers, mean uses all values and is stable. It also supports many algebraic properties, making it preferred when assumptions are reasonable.
If r is computed after adding constants to both variables, r
A Changes always
B Unchanged
C Becomes zero
D Becomes negative
Correlation is invariant under change of origin: adding constants shifts means but does not alter standardized deviations. Hence correlation coefficient remains the same.
If r is computed after multiplying X by 3 and Y by 2, r
A Unchanged sign same
B Becomes six times
C Becomes zero
D Becomes negative
Correlation is invariant under positive scaling; multiplying by positive constants does not change r. If one multiplier were negative, sign of r would flip.
If X is multiplied by −1 and Y unchanged, correlation becomes
A Unchanged
B Zero always
C Undefined
D Sign flips
Multiplying one variable by −1 reverses its direction. Covariance changes sign, SD stays positive, so r changes sign while keeping the same magnitude.
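The three invariance results above (shift, positive scale, sign flip) can all be checked with a hand-rolled Pearson r (the paired data are illustrative, not from any question):

```python
import math

def pearson_r(x, y):
    """Pearson correlation from raw deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

x = [1.0, 2.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 9.0]        # arbitrary paired data
r = pearson_r(x, y)

# Shifting or positively scaling either variable leaves r unchanged
assert abs(pearson_r([3 * a + 7 for a in x], [2 * b - 1 for b in y]) - r) < 1e-9
# Negating one variable flips the sign of r, magnitude unchanged
assert abs(pearson_r([-a for a in x], y) + r) < 1e-9
```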
Standard error of mean equals
A SD×√n
B √n/SD
C SD/√n
D SD/n
Standard error measures sampling variability of the mean. As n increases, √n increases, so standard error decreases, meaning sample mean becomes more precise.
If SD=12 and n=36, standard error equals
A 2
B 3
C 4
D 6
√36 = 6, so SE = 12/6 = 2. Standard error decreases with larger sample sizes, showing improved reliability of the sample mean.
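The calculation is a one-liner:

```python
import math

def standard_error(sd, n):
    """SE of the sample mean = SD / sqrt(n)."""
    return sd / math.sqrt(n)

assert standard_error(12, 36) == 2.0
assert standard_error(12, 144) == 1.0   # quadrupling n halves the SE
```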
If two independent variables have Var(X)=5, Var(Y)=7, then Var(X+Y)=
A 2
B 35
C 0
D 12
For independent variables, Cov=0, so Var(X+Y)=Var(X)+Var(Y)=5+7=12. Independence removes the covariance term from sum variance.
If two independent variables have Var(X)=5, Var(Y)=7, then Var(X−Y)=
A 2
B 12
C 35
D 0
Var(X−Y)=Var(X)+Var(Y)−2Cov. With independence, Cov=0, so it becomes 5+7=12. Variance of sum and difference are equal when Cov=0.
If data are coded y=(x−A)/h, then variance of x equals
A h Var(y)
B Var(y)/h²
C h² Var(y)
D Var(y)+A
x = A + hy. Translation by A doesn’t affect variance, scaling by h multiplies variance by h². Hence Var(x)=h²Var(y).
If SD of coded variable y is 2 and h=5, then SD of x is
A 20
B 5
C 2.5
D 10
Since x = A + hy, SD scales by |h|. So SD(x)=|h|×SD(y)=5×2=10. Adding A does not change SD.
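The coding relationships Var(x) = h²Var(y) and SD(x) = |h|·SD(y) can be verified directly (A, h, and the data are illustrative):

```python
import statistics

x = [10.0, 25.0, 40.0, 55.0]      # illustrative raw data
A, h = 10, 5
y = [(v - A) / h for v in x]      # coded values y = (x - A) / h

# Var(x) = h^2 * Var(y); SD(x) = |h| * SD(y)
assert abs(statistics.pvariance(x) - h**2 * statistics.pvariance(y)) < 1e-9
assert abs(statistics.pstdev(x) - abs(h) * statistics.pstdev(y)) < 1e-9
```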
For a dataset, if mean deviation about mean is 0, then
A All values equal
B Mean is zero
C Median is zero
D Data are negative
Mean deviation about mean is average of |x−mean|. If it is zero, every absolute deviation must be zero, implying each value equals the mean.
In a normal distribution, about 95% values lie within
A 1 SD
B 3 SD
C 2 SD
D 0.5 SD
Empirical rule: approximately 68% within 1 SD, 95% within 2 SD, and 99.7% within 3 SD of the mean. This helps interpret SD in normal data.
If a distribution has heavy tails compared to normal, it is
A Platykurtic
B Mesokurtic
C Uniform
D Leptokurtic
Leptokurtic distributions have higher kurtosis and heavier tails than normal. Platykurtic have lighter tails, mesokurtic matches normal reference.
If a distribution has flatter peak than normal, it is
A Platykurtic
B Leptokurtic
C Mesokurtic
D Skewed
Platykurtic distributions have lower kurtosis, flatter peak, and lighter tails compared to normal. Kurtosis relates to tail weight and peak shape, not central location.
Pearson’s correlation coefficient measures
A Any association
B Linear association
C Only ranks
D Only causation
Pearson’s r measures strength and direction of linear relationship between two variables. It does not capture nonlinear patterns well and does not imply cause-effect.
A strong nonlinear relationship can still have
A r equal 1
B Covariance huge
C SD equal zero
D r near zero
Pearson’s r captures only linear association. If variables have a curved relationship, r can be close to zero even though a strong pattern exists in the scatter plot.
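A classic example: y = x² with x symmetric about zero gives a perfect curved relationship yet r exactly zero:

```python
import math

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v ** 2 for v in x]           # perfect quadratic (nonlinear) relationship

mx, my = sum(x) / len(x), sum(y) / len(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
sx = math.sqrt(sum((a - mx) ** 2 for a in x))
sy = math.sqrt(sum((b - my) ** 2 for b in y))
r = cov / (sx * sy)

assert abs(r) < 1e-9              # r is zero despite the strong nonlinear pattern
```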
In grouped mode, if modal class and adjacent class frequencies are equal (f0 = f1 = f2), then the mode is
A Not uniquely determined
B L + h/2
C L
D U
When the modal class and its neighbours have equal frequencies, the distribution is flat at the top and there is no single peak. The grouped mode formula evaluates to 0/0, so the mode cannot be uniquely determined.
If Q1=20, Q3=50, then IQR and QD are
A 15 and 30
B 70 and 35
C 30 and 15
D 50 and 20
IQR = Q3−Q1 = 50−20=30. Quartile deviation QD = IQR/2 = 15. These describe spread of middle 50% of data.
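The two measures follow mechanically from the quartiles:

```python
def iqr_and_qd(q1, q3):
    """IQR = Q3 - Q1; quartile deviation (QD) is half the IQR."""
    iqr = q3 - q1
    return iqr, iqr / 2

assert iqr_and_qd(20, 50) == (30, 15.0)
```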
For any data set, variance equals
A Square of mean minus mean of squares
B Mean absolute deviation
C Range divided by n
D Mean of squares minus square of mean
Var(X)=E(X²)−[E(X)]² for population form. It is a useful shortcut: compute mean of squares and subtract square of mean to get variance.
If mean is 4 and mean of squares is 22, variance is
Var(X) = E(X²) − [E(X)]² = 22 − 4² = 22 − 16 = 6. This is the population-variance shortcut: compute the mean of squares and subtract the square of the mean.
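The shortcut and the direct definition agree, as a quick sketch confirms (the small dataset is illustrative):

```python
data = [2.0, 4.0, 6.0]            # illustrative data
n = len(data)
mean = sum(data) / n
mean_sq = sum(v * v for v in data) / n

# Population variance: mean of squares minus square of mean
shortcut = mean_sq - mean ** 2
direct = sum((v - mean) ** 2 for v in data) / n
assert abs(shortcut - direct) < 1e-9

# With mean 4 and mean of squares 22: variance = 22 - 16 = 6
assert 22 - 4 ** 2 == 6
```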
Which statement about variance is true
A Can be negative
B Unit-free always
C Depends on origin
D Always nonnegative
Variance is average of squared deviations, so it cannot be negative. It is independent of change of origin (adding constant) but changes with scale (multiplication).
If two series have equal CV, then
A Means equal
B SDs equal
C Relative spreads equal
D Ranges equal
Equal CV means SD/mean ratio is same for both, so their variability relative to mean is equal. Absolute SD and mean may still differ.
If a dataset is shifted by +k, which changes
A Variance changes
B SD changes
C IQR changes
D Mean changes
Adding a constant shifts central location measures (mean, median, mode) by k. Dispersion measures based on deviations (variance, SD, IQR, range) remain unchanged.
If a dataset is scaled by ×k, which changes
A CV changes always
B SD scales by |k|
C Correlation changes always
D Skewness shifts by k
Scaling data by k multiplies SD by |k| and variance by k². CV remains unchanged for nonzero k. Correlation also remains unchanged under positive scaling.
Using the position rule for P_k at k(n+1)/100, for ordered data 1 to 9, the 90th percentile position is
A 9th position
B 8th position
C 7th position
D 10th position
Here n=9, so the position of P90 is 90(n+1)/100 = 90×10/100 = 9; the 9th value in the ordered list is 9. Different textbooks may use slightly different percentile conventions.
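The k(n+1)/100 rule is a one-line function (note that other percentile conventions, such as NumPy's default, can give different positions):

```python
def percentile_position(k, n):
    """Position of P_k under the k(n+1)/100 rule (other conventions exist)."""
    return k * (n + 1) / 100

assert percentile_position(90, 9) == 9.0   # P90 of 9 ordered values: 9th position
assert percentile_position(25, 7) == 2.0   # Q1 of 7 ordered values: 2nd position
```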
If a variable is constant, its correlation with any variable is
A Zero
B One
C Negative one
D Undefined
Constant variable has SD=0. Correlation formula divides by SDx·SDy, so division by zero occurs, making correlation undefined even if covariance may be zero.
In regression, the line of Y on X minimizes
A Horizontal squared errors
B Absolute errors only
C Vertical squared errors
D Percent errors
Regression of Y on X estimates Y given X and minimizes sum of squared vertical deviations (errors in Y). The other regression line (X on Y) minimizes horizontal errors.
If r=0, regression lines are
A Perpendicular
B Parallel
C Same line
D No lines exist
When r=0, both regression coefficients become 0, so one regression line is horizontal (Y = ȳ) and the other is vertical (X = x̄), which are perpendicular.
If r=±1, regression lines are
A Perpendicular
B Parallel
C Random
D Same line
With perfect correlation, all points lie exactly on a single straight line. Both regression lines coincide because prediction is exact and errors are zero along that line.
If SDx=4, SDy=6 and r=0.5, regression coefficient of Y on X is
A 1.5
B 0.75
C 0.5
D 2
b_yx = r(SDy/SDx) = 0.5×(6/4)=0.5×1.5=0.75. It is slope of regression line of Y on X.
If SDx=4, SDy=6 and r=0.5, regression coefficient of X on Y is
A 0.75
B 1.5
C 2
D 0.333
b_xy = r(SDx/SDy) = 0.5×(4/6)=0.5×(2/3)=1/3≈0.333. This is slope of regression line of X on Y.
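Both regression coefficients, and the identity b_yx·b_xy = r², can be checked in a short sketch:

```python
def regression_coefficients(r, sdx, sdy):
    """b_yx = r*(SDy/SDx); b_xy = r*(SDx/SDy). Their product equals r^2."""
    b_yx = r * (sdy / sdx)
    b_xy = r * (sdx / sdy)
    return b_yx, b_xy

b_yx, b_xy = regression_coefficients(0.5, 4, 6)
assert b_yx == 0.75
assert abs(b_xy - 1 / 3) < 1e-9
assert abs(b_yx * b_xy - 0.5 ** 2) < 1e-9   # product of slopes = r^2
```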