# Statistics

### Collection and Presentation of Data

Facts or figures collected with a definite purpose are called data. Data is a systematic record of facts or of the different values of a quantity. Data is of two types: primary data and secondary data.

Primary data
Data collected directly from the source through observation, conversation, or participation is called primary data.

Secondary data
The data gathered from a source where it already exists is called secondary data.

Range of the data
The difference between the highest and lowest values in the given data is called the range of the given data.
Range = Highest value – Lowest value
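
As a quick illustration of the range formula, here is a minimal Python sketch. The list `marks` is made-up sample data, not taken from these notes.

```python
# Hypothetical sample: marks scored by ten students.
marks = [35, 42, 58, 61, 47, 73, 66, 58, 90, 49]

highest = max(marks)
lowest = min(marks)

# Range = Highest value - Lowest value
data_range = highest - lowest

print("Highest:", highest, "Lowest:", lowest, "Range:", data_range)
```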

Frequency
The number of times a value occurs in the given data is called the frequency of that value.

Frequency distribution table
A table that shows the frequency of different values in the given data is called a frequency distribution table. Tally marks are used to condense the data in tabular form. A frequency distribution table that shows the frequency of each individual value in the given data is called an ungrouped frequency distribution table. A table that shows the frequency of groups of values in the given data is called a grouped frequency distribution table.
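
An ungrouped frequency distribution table can be sketched in a few lines of Python. The data in `scores` is a made-up sample, and printing one `|` per occurrence is only a rough stand-in for hand-drawn tally marks.

```python
from collections import Counter

# Hypothetical sample: scores of ten students in a 5-mark quiz.
scores = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(scores)

print("Value  Tally  Frequency")
for value in sorted(frequency):
    tally = "|" * frequency[value]
    print(f"{value:<6} {tally:<6} {frequency[value]}")
```

The frequencies always add up to the total number of observations, which is a useful check on any table built by hand.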

The groupings used to group the values in the given data are called classes or class intervals. The range of values that each class covers is called the class size or class width.

The least value of a class is called the lower class limit. The greatest value of a class is called the upper class limit. The difference between two successive upper class limits or two successive lower class limits gives the class width of the interval.
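
The following Python sketch builds a grouped frequency distribution table from class limits. It assumes whole-number data and inclusive classes such as 30-39 and 40-49; the function name `grouped_frequency` and the sample `marks` are illustrative only.

```python
def grouped_frequency(values, lowest_limit, class_width, class_count):
    """Count the values in each class, using inclusive class limits."""
    table = []
    for k in range(class_count):
        lower = lowest_limit + k * class_width  # lower class limit
        upper = lower + class_width - 1         # upper class limit (whole-number data)
        count = sum(1 for v in values if lower <= v <= upper)
        table.append((lower, upper, count))
    return table

# Hypothetical sample: marks scored by ten students.
marks = [35, 42, 58, 61, 47, 73, 66, 58, 90, 49]
table = grouped_frequency(marks, lowest_limit=30, class_width=10, class_count=7)

for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")

# The difference between two successive lower limits gives the class width.
width = table[1][0] - table[0][0]
print("Class width:", width)
```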

When classes are written with inclusive limits, such as 1-10 and 11-20, there is a gap between the upper limit of a class and the lower limit of the class that follows it. To make the class intervals continuous, we add half of this difference to each of the upper limits and subtract the same amount from each of the lower limits.
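
The adjustment of class limits described above can be sketched in Python as follows. The helper name `make_continuous` and the sample classes are assumptions made for illustration, and the sketch assumes the gap is the same between every pair of consecutive classes.

```python
def make_continuous(classes):
    """Turn inclusive class limits into class intervals without gaps."""
    # Difference between the lower limit of the second class
    # and the upper limit of the first class.
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    # Subtract half the gap from each lower limit and add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in classes]

classes = [(1, 10), (11, 20), (21, 30)]
boundaries = make_continuous(classes)
print(boundaries)
```

After the adjustment, the upper limit of each class coincides with the lower limit of the next one, so no value can fall between two classes.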

### Graphical Representation of Data

Graphical representation of data helps in faster and easier interpretation of data.

Bar graph
A bar graph is a visual representation of data. A bar graph uses bars or rectangles of the same width but different heights to represent different values of data.

In a bar graph:
• bars represent the data items whose values are to be plotted.
• the bars have equal gaps between them.
• the width of the bars carries no meaning, but all bars have the same width.
• the height of the bars represents different values of the data items.

Histogram
A histogram is a form of bar graph which is used for continuous class intervals.

In a histogram:
• the bars do not have gaps between them.
• the width of the bars is proportional to the class intervals of data.
• the height of each bar represents the frequency of its class when all class intervals have the same width.
• the area of each rectangle is proportional to its corresponding frequency.

Frequency polygon
A frequency polygon is formed by joining the midpoints of the tops of adjacent rectangles in a histogram with line segments.

$$\text{Class mark for a class interval} = \frac{\text{Upper class limit} + \text{Lower class limit}}{2}$$

A frequency polygon can also be formed by joining the class marks of the given data with line segments.

The midpoints at each end are joined to the class mark of an assumed class interval of zero frequency immediately below the first class and immediately above the last class. This ensures that the area of a histogram is equal to the area enclosed by its corresponding frequency polygon. Frequency polygons are used to represent data that is continuous and very large.
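
The construction of a frequency polygon can be checked numerically with a short Python sketch. It assumes continuous classes of equal width; the names `frequency_polygon_points` and `area_under`, and the sample data, are illustrative only.

```python
def frequency_polygon_points(boundaries, frequencies):
    """Return the (class mark, frequency) vertices of a frequency polygon."""
    width = boundaries[0][1] - boundaries[0][0]  # assumes equal class widths
    # Class mark = (upper class limit + lower class limit) / 2
    marks = [(lower + upper) / 2 for lower, upper in boundaries]
    # Assumed zero-frequency classes just before the first and after the last class.
    first = (marks[0] - width, 0)
    last = (marks[-1] + width, 0)
    return [first] + list(zip(marks, frequencies)) + [last]

def area_under(points):
    """Area between the polygon and the horizontal axis (sum of trapezia)."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

boundaries = [(0, 10), (10, 20), (20, 30)]
frequencies = [4, 7, 3]

points = frequency_polygon_points(boundaries, frequencies)
histogram_area = sum((upper - lower) * f
                     for (lower, upper), f in zip(boundaries, frequencies))

print(points)
print("Histogram area:", histogram_area, "Polygon area:", area_under(points))
```

For this sample the two areas agree, which illustrates why the polygon is closed using the zero-frequency classes at each end.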

### Measures of Central Tendency

Mean
The mean of a given set of values is equal to the sum of all the values divided by the total number of values.

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

where, $\bar{x}$ = Mean
∑ = Summation sign
n = Number of values
$x_i$ = Values of x with i ranging from 1 to n.
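
The definition of the mean translates directly into Python. The list `values` below is a made-up sample.

```python
# Hypothetical sample of n = 6 values.
values = [4, 8, 15, 16, 23, 42]

n = len(values)          # number of values
total = sum(values)      # sum of all the values
mean = total / n         # mean = sum of values / number of values

print("Sum:", total, "n:", n, "Mean:", mean)
```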

Median
The value that lies in the very centre of a given set of values arranged in ascending or descending order is called the median of the given data. The median is the value that divides the given observations into exactly two equal parts.

If the number of given values is odd, median = $\left(\frac{n+1}{2}\right)$th value, where n = number of given values.

If the number of given values is even, median = mean of $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th values, where n = number of given values.
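
A minimal Python sketch of the two cases follows. It uses the usual rule that the median is the ((n + 1)/2)th value when n is odd and the mean of the (n/2)th and (n/2 + 1)th values when n is even. The function name `median` is illustrative, and positions are shifted by one because Python lists are indexed from 0.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)  # arrange in ascending order first
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th value.
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the (n / 2)th and (n / 2 + 1)th values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

odd_case = median([7, 3, 9, 1, 5])   # ordered: 1, 3, 5, 7, 9
even_case = median([7, 3, 9, 1])     # ordered: 1, 3, 7, 9

print(odd_case, even_case)
```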

Mode
The value that occurs the greatest number of times in a given set of values is called the mode of the given data. In other words, the observation with the maximum frequency is the mode.
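
The mode can be found by counting frequencies, as in this Python sketch. Returning every value that shares the maximum frequency is a choice made here for illustration; the function name `modes` is hypothetical, and the input is assumed to be non-empty.

```python
from collections import Counter

def modes(values):
    """Return all values that occur with the maximum frequency."""
    counts = Counter(values)
    highest = max(counts.values())  # the maximum frequency
    return sorted(v for v, c in counts.items() if c == highest)

single_mode = modes([2, 3, 3, 5, 7, 3, 5])  # 3 occurs three times
two_modes = modes([1, 1, 2, 2, 3])          # 1 and 2 both occur twice

print(single_mode, two_modes)
```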

Mean, median and mode together are called the measures of central tendency of data. They are the representatives of the data. The measures of central tendency depend on the distribution of values and must be considered with other information for effective interpretation of data. Extreme values in the data affect the mean, whereas the median and the mode are not affected by extreme values.
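
The effect of an extreme value can be seen in a small experiment using Python's standard `statistics` module. Both lists are made-up samples; the second differs from the first only in its largest value.

```python
import statistics

# Hypothetical sample, and the same sample with its largest value made extreme.
original = [20, 22, 22, 25, 27]
with_extreme = [20, 22, 22, 25, 270]

for label, data in (("original", original), ("with extreme value", with_extreme)):
    print(label,
          "mean:", statistics.mean(data),
          "median:", statistics.median(data),
          "mode:", statistics.mode(data))
```

In this sample the mean rises sharply while the median and the mode stay at 22.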
