Chapter 3: Numerical Descriptive Measures Notes

 Contents


3.1 Central Tendency

3.2 Variation and Shape

3.3 Exploring Numerical Data 

3.4 Numerical Descriptive Measures for a Population 

3.5 The Covariance and the Coefficient of Correlation

3.6 Descriptive Statistics: Pitfalls and Ethical Issues


3.1 Central Tendency 

Central tendency is a summary measure that describes a whole data set with a single value representing the center, or middle position, of the data.

There are three measures of central tendency: the mean, the median, and the mode.

The Mean 

The arithmetic mean, or simply the mean, is the balance point of the data. Just like the fulcrum on a seesaw, it is the point at which the values on either side balance, so it represents the central value of the data set. It is the most common measure of central tendency.

The mean is calculated by adding all the values in the data set and dividing the sum by the number of values. The mean of a sample consisting of n values is

"Mean = Sum of the values / Number of values"

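The formula above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `arithmetic_mean` and the sample values are made up for the example and do not come from the chapter.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean is undefined for an empty data set")
    return sum(values) / len(values)


# Hypothetical sample data, chosen only for illustration.
sample = [2, 4, 6, 8, 10]
print(arithmetic_mean(sample))  # 6.0
```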

The Median 

The median is the middle value of the ordered data. To find the median, first order the data from the smallest to the largest value. Half of the values will then be less than or equal to the median, and the other half will be greater than or equal to it.

Rule to calculate the median: If the data set has an odd number of values, the median is the middle value. If it has an even number of values, the median is the average of the two middle values.
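The odd/even rule can be written out as a short Python sketch. The function name `median_value` and both sample lists are hypothetical, chosen only to show one odd-sized and one even-sized case.

```python
def median_value(values):
    """Return the median: order the data, then pick the middle value(s)."""
    if not values:
        raise ValueError("the median is undefined for an empty data set")
    ordered = sorted(values)          # smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_value([7, 1, 5]))      # 5
print(median_value([4, 1, 3, 2]))   # 2.5
```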

The Mode 

The mode is the value that appears most often in the data, that is, the value with the highest frequency. A data set can have several modes or no mode at all. When two or more values share the highest frequency, each of them is a mode. If no value repeats (every value occurs only once), there is no mode.
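A minimal Python sketch of this rule follows. The function name `modes` and the sample lists are illustrative assumptions. The sketch returns an empty list when no value repeats, which matches the "no mode" convention described above.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency, or [] if none repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                     # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3]))        # [2]
print(modes([1, 1, 2, 2, 3]))     # [1, 2]
print(modes([1, 2, 3]))           # []
```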

The Geometric Mean 

The geometric mean of n values is the nth root of the product of those values. It is used to measure the average rate of change of a variable over time.

"Geometric Mean = (X1 × X2 × ... × Xn)^(1/n)"

This differs from the arithmetic mean: for the arithmetic mean we add the values and divide by the number of values, whereas for the geometric mean we multiply the values and then take the nth root of the product.
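To make the contrast concrete, here is a small Python sketch that computes both means for the same data. The function name `geometric_mean` and the sample values are hypothetical. The sketch assumes all values are positive, since the nth root of a negative or zero product is not meaningful here.

```python
import math


def geometric_mean(values):
    """Return the nth root of the product of n positive values."""
    if not values:
        raise ValueError("the geometric mean is undefined for an empty data set")
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes all values are positive")
    return math.prod(values) ** (1 / len(values))


sample = [2, 8]                        # hypothetical data
print(geometric_mean(sample))          # 4.0  (square root of 2 * 8)
print(sum(sample) / len(sample))       # 5.0  (arithmetic mean, for comparison)
```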

3.2 Variation and Shape

Variation and shape are two further characteristics that describe a numerical variable.

Variation is the spread, or dispersion, of the values in the data. The simplest measure of variation is the range, which is the difference between the largest and the smallest values. The range describes the total spread of the values, but it does not tell us how the values are distributed between the two extremes, for example whether they cluster near the smallest value, near the largest value, or around the middle. For this reason the range alone is a weak measure of variation.

"Range = X(largest) - X(smallest)"

Example: Suppose the following values are the numbers of apples you eat on mornings when you are hungry.

8, 7, 3, 9, 2, 5, 8, 8, 12

Range: 12 - 2 = 10. The range shows that the largest difference between the numbers of apples eaten on any two mornings is 10.
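The apple example can be checked with a short Python sketch. The function name `value_range` and the second list, `clustered`, are assumptions added for illustration: `clustered` is a made-up data set with the same extremes, included to show that two differently distributed data sets can share one range.

```python
def value_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


apples = [8, 7, 3, 9, 2, 5, 8, 8, 12]      # data from the example above
clustered = [2, 7, 7, 7, 7, 7, 7, 7, 12]   # hypothetical data, same extremes

print(value_range(apples))      # 10
print(value_range(clustered))   # 10, although the values are spread differently
```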

The Variance and The Standard Deviation 

We saw that the range says little about how the values are distributed (whether they cluster together or sit at the extremes). The variance and the standard deviation are two measures that take this into account, because they reflect how all the values are spread around the mean.

  • Sum of Squares (SS): To measure how the values vary around the mean, take the difference between each value and the mean, square each difference, and then sum the squared differences. This sum, known as the sum of squares (SS), is then used to compute the sample variance (S^2) and the sample standard deviation (S). The sample variance is SS divided by n - 1, and the sample standard deviation is the square root of the sample variance.
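The steps from SS to S^2 to S can be sketched in Python as follows. The function names are illustrative, and the data are the apple counts from the range example. The sketch uses the standard sample definitions, dividing SS by n - 1.

```python
import math


def sum_of_squares(values):
    """Sum the squared differences between each value and the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def sample_variance(values):
    """Sample variance S^2: the sum of squares divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two values")
    return sum_of_squares(values) / (n - 1)


def sample_std(values):
    """Sample standard deviation S: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))


apples = [8, 7, 3, 9, 2, 5, 8, 8, 12]
print(sum_of_squares(apples), sample_variance(apples), sample_std(apples))
```

Because the standard deviation is the square root of the variance, it is expressed in the same units as the data (here, apples), which usually makes it easier to interpret than the variance.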

The Coefficient of Variation: The coefficient of variation (CV) is equal to the standard deviation divided by the mean, multiplied by 100%. Unlike the measures of variation presented previously, the CV measures the scatter in the data relative to the mean. It is a relative measure of variation that is always expressed as a percentage rather than in the units of the particular data.
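A minimal Python sketch of the CV is shown here, using the standard library's `statistics.mean` and `statistics.stdev` (the sample standard deviation). The function name `coefficient_of_variation` is an assumption, and the data are the apple counts from the range example. The second call rescales the data by 10 to illustrate that the CV does not depend on the units.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean, as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


apples = [8, 7, 3, 9, 2, 5, 8, 8, 12]
cv_apples = coefficient_of_variation(apples)
cv_scaled = coefficient_of_variation([10 * x for x in apples])
print(round(cv_apples, 1), round(cv_scaled, 1))  # the two values agree
```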

Z Scores: The Z score of a value is the difference between that value and the mean, divided by the standard deviation. A Z score of 0 indicates that the value is the same as the mean. A positive Z score indicates that the value is above the mean, and a negative Z score indicates that it is below the mean; in both cases the size of the Z score gives the distance from the mean in standard deviations. Z scores help identify outliers, the values that seem excessively different from most of the rest of the values.
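The Z score calculation can be sketched in Python as follows. The function name `z_scores` is illustrative, the data are the apple counts from the range example, and the sample standard deviation (`statistics.stdev`) is used. The chapter gives no cutoff for what counts as an outlier, so the sketch only computes the scores and leaves that judgment to the reader.

```python
import statistics


def z_scores(values):
    """Return (value - mean) / standard deviation for every value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)     # sample standard deviation
    if sd == 0:
        raise ValueError("Z scores are undefined when all values are equal")
    return [(x - mean) / sd for x in values]


apples = [8, 7, 3, 9, 2, 5, 8, 8, 12]
z = z_scores(apples)
for value, score in zip(apples, z):
    print(value, round(score, 2))     # positive: above the mean, negative: below
```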

