
Grouping of Data


What is Grouping Data?

Grouping data plays a significant role when we have to deal with a large amount of data. Grouped data is formed by arranging the individual observations of a variable into groups, so that a frequency distribution table of these groups provides a convenient way of summarising or analyzing the data. This is how we define grouped data. Such information can also be displayed using a pictograph or a bar graph.


In mathematics, the topic of grouping data is about defining grouped data precisely. When the number of observations is very large, we may condense the data into several groups. We record the frequency of observations falling in each of the groups. The presentation of data in groups along with the frequency of each group is called the frequency distribution of the grouped data.


What are the Advantages of Grouping Data?

The advantages of grouping data in statistics are:

  • It helps to focus on important subpopulations and ignore irrelevant ones.

  • It improves the accuracy and efficiency of estimation.


Frequency Distribution Table for Grouped Data

When the collected data is large, a frequency distribution table for grouped data lets us analyze it easily.


Example

Consider the marks of 50 students of class VII obtained in an examination. The maximum mark of the exam is 50.


23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25, 9, 15, 20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11, 34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32


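If we create a frequency distribution table with a separate row for every distinct observation, the table will be very large. For easier understanding, we can instead make a table with groups of observations, say 0 to 10, 10 to 20, and so on. Counting the marks listed above in each group, with a boundary mark such as 10 counted in the higher group, gives:


Marks (class interval)    Number of students (frequency)
0-10                      3
10-20                     11
20-30                     14
30-40                     14
40-50                     8
Total                     50


The distribution obtained in the above table is known as the grouped frequency distribution. It helps us to draw various significant inferences, such as: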

  1. Most students have secured between 20 and 40 marks: 28 of the 50 marks fall in the classes 20-30 and 30-40.

  2. 8 students have secured 40 marks or more, i.e. they got 80% or more in the examination.


In the above table, the groups 0-10, 10-20, 20-30,… are known as class intervals (or classes). Notice that 10 appears in both 0-10 and 10-20. Similarly, 20 appears in both 10-20 and 20-30. An observation of 10 or 20 cannot belong to two classes at the same time. To avoid this inconsistency, we adopt the rule that a common observation belongs to the higher class. This means that 10 belongs to the class interval 10-20 and not to 0-10. Similarly, 20 belongs to 20-30 and not to 10-20, and so on.


Consider a class, say 10-20, where 10 is the lower class limit and 20 is the upper class limit. The difference between the upper and lower class limits is called the class height, class size, or class width of the class interval. For the class 10-20 it is 20 - 10 = 10.
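The counting rule described above can be sketched in Python. This is a minimal illustration, not part of the original lesson: the function name `grouped_frequency` and its parameters are hypothetical, and the marks are the 50 values listed in the example. The condition `lower <= v < upper` is what sends a boundary mark such as 10 to the higher class.

```python
# Marks of the 50 students from the example above.
marks = [23, 8, 13, 18, 32, 44, 19, 8, 25, 27, 10, 30, 22, 40, 39, 17, 25,
         9, 15, 20, 30, 24, 29, 19, 16, 33, 38, 46, 43, 22, 37, 27, 17, 11,
         34, 41, 35, 45, 31, 26, 42, 18, 28, 30, 22, 20, 33, 39, 40, 32]


def grouped_frequency(values, start, stop, width):
    """Count values per class [lower, upper): the upper limit is excluded,
    so a boundary value is counted in the higher class."""
    table = {}
    for lower in range(start, stop, width):
        upper = lower + width
        table[(lower, upper)] = sum(1 for v in values if lower <= v < upper)
    return table


table = grouped_frequency(marks, 0, 50, 10)
for (lower, upper), freq in table.items():
    print(f"{lower}-{upper}: {freq}")
```

With this rule a full score of 50 would fall outside the class 40-50, so a data set containing it would need one more class (for example 50-60).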


This is how we create a frequency distribution table for grouped data as shown above.


Histogram

We can show the above frequency distribution table graphically using a histogram. The class intervals are marked on the horizontal axis and the frequencies on the vertical axis. A rectangle is drawn over each class interval, with its height equal to the frequency of that class.


Let's Look at a Few Grouped Data Examples with Detailed Step-by-Step Explanations.
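As a rough illustration, the histogram can be sketched as plain text in Python, with frequency on the vertical axis and the class intervals along the bottom. The frequencies used here were counted from the 50 marks in the example, and the helper name `vertical_histogram` is hypothetical. A plotting library would normally be used instead; this sketch only shows the idea.

```python
labels = ["0-10", "10-20", "20-30", "30-40", "40-50"]
counts = [3, 11, 14, 14, 8]  # frequencies counted from the 50 marks


def vertical_histogram(labels, counts, symbol="#"):
    """Return text rows: frequency on the vertical axis, classes along the bottom."""
    col = max(len(label) for label in labels) + 2  # column width per class
    rows = []
    # One row per frequency level, from the tallest bar down to 1.
    for level in range(max(counts), 0, -1):
        bars = "".join((symbol if c >= level else " ").center(col) for c in counts)
        rows.append(f"{level:>2} | {bars}")
    rows.append("   +" + "-" * (col * len(labels) + 1))
    rows.append("     " + "".join(label.center(col) for label in labels))
    return rows


rows = vertical_histogram(labels, counts)
print("\n".join(rows))
```

Each column of `#` symbols plays the role of one rectangle, and its height is the frequency of the class printed beneath it.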


Example 1. The marks obtained by forty students of class VIII in an examination are listed below: 

16, 17, 18, 3, 7, 23, 18, 13, 10, 21, 7, 1, 13, 21, 13, 15, 19, 24, 16, 2, 23, 5, 12, 18, 8, 12, 6, 8, 16, 5, 3, 5, 0, 7, 9, 12, 20, 10, 2, 23 


Divide the data into five groups, namely, 0-5, 5-10, 10-15, 15-20 and 20-25, where 0-5 means marks greater than or equal to 0 but less than 5 and similarly 5-10 means marks greater than or equal to 5 but less than 10, and so on. Prepare a grouped frequency table for the grouped data.


Solution: First, we arrange the given observations in ascending order:


0, 1, 2, 2, 3, 3, 5, 5, 5, 6, 7, 7, 7, 8, 8, 9, 10, 10, 12, 12, 12, 13, 13, 13, 15, 16, 16, 16, 17, 18, 18, 18, 19, 20, 21, 21, 23, 23, 23, 24


Thus, counting the sorted marks in each group, the frequency distribution of the data may be given as follows:


Marks (class interval)    Number of students (frequency)
0-5                       6
5-10                      10
10-15                     8
15-20                     9
20-25                     7
Total                     40


Note: Here, each of the groups 0-5, 5-10, 10-15, 15-20, and 20-25 is known as a class interval. In the class interval 10-15, the number 10 is known as the lower limit and 15 as the upper limit. The difference between the upper limit and the lower limit of a class interval is known as the class size.


Thus, the class size in the above frequency distribution is equal to 5.
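Because every class here has the same size and the first class starts at 0, the class of a mark can be found by integer division. The sketch below is an illustration under that assumption, using the 40 marks of Example 1 and Python's `collections.Counter`; the variable names are chosen for this example only.

```python
from collections import Counter

# Marks of the 40 students from Example 1.
marks = [16, 17, 18, 3, 7, 23, 18, 13, 10, 21, 7, 1, 13, 21, 13, 15, 19, 24,
         16, 2, 23, 5, 12, 18, 8, 12, 6, 8, 16, 5, 3, 5, 0, 7, 9, 12, 20, 10,
         2, 23]
class_size = 5

# mark // class_size is the index of the class: index 2 means the class 10-15.
# A boundary mark such as 10 gets index 2, so it lands in the higher class.
counts = Counter(mark // class_size for mark in marks)

for index in sorted(counts):
    lower = index * class_size
    print(f"{lower}-{lower + class_size}: {counts[index]}")
```

This shortcut only works for equal class sizes with classes starting at a multiple of the class size; for other layouts each mark has to be compared against the class limits.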


The mid-value of a class is called its class mark. The class mark is obtained by adding the upper and lower class limits and dividing the sum by 2.


Thus, the class mark of the class 0-5 is (0 + 5)/2 = 2.5


And the class mark of the class 5-10 is (5 + 10)/2 = 7.5, etc.
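The same arithmetic for all five classes of Example 1 can be written as a short Python sketch. The list `classes` simply restates the class limits given in the example; the variable names are illustrative.

```python
# Class limits (lower, upper) from Example 1.
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]

# Class mark = (lower limit + upper limit) / 2
class_marks = [(lower + upper) / 2 for lower, upper in classes]

# Class size = upper limit - lower limit
class_sizes = [upper - lower for lower, upper in classes]

for (lower, upper), mark, size in zip(classes, class_marks, class_sizes):
    print(f"{lower}-{upper}: class mark {mark}, class size {size}")
```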


Questions to be Solved:

Question 1) The weights (in kg) of 35 persons are given below:

43, 51, 62, 47, 48, 40, 50, 62, 53, 56, 40, 48, 56, 53, 50, 42, 55, 52, 48, 46, 45, 54, 52, 50, 47, 44, 54, 55, 60, 63, 58, 55, 60, 53, 58


Prepare a frequency distribution table using classes of equal size. One such class is 40-45 (where 45 is not included).

FAQs on Grouping of Data

1. What is meant by a grouping of data?

Grouping of data is important when working with a large volume of data. Grouped data can also be displayed using a bar graph or pictograph.


When the number of observations is large, we can use grouping to separate the data into several categories. Individual observations of a variable are arranged into groups, and the frequency distribution table of these groups is a useful way to summarise the data. The benefits of grouping data include improved estimation accuracy and efficiency, as well as the ability to focus on key subpopulations while ignoring unimportant ones.

2. What is data handling?

Data handling, also known as data manipulation, is a phrase that is used in everyday life as well as in mathematics. It covers recording, gathering, and presenting any type of information or data. In school mathematics, "statistics" and "data handling" are often used for the same topic. Data handling is used for everything from creating a bar chart of different pupils' favorite candy to presenting a major survey of Covid-19 cases.

We frequently come across facts like:

  • The number of Covid cases.

  • Total goals scored by Russia in the FIFA World Cup

In such circumstances, the information is referred to as Data. Data can be represented in a variety of ways, including statistical and graphical representations. Graphical methods are frequently more visually appealing and easier to comprehend for the average individual. Data can be represented in a graphical manner in various ways:

  • Pictograph 

  • Bar Graph 

  • Double Bar Graph 

3. What are histograms?

A histogram is an approximate representation of the distribution of numerical data. The term was introduced by Karl Pearson. The first step in making a histogram is to "bin" the range of values, which means dividing the full range of values into a series of intervals and then counting how many values fall into each interval. Bins are usually defined as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent and are often (but not always) of the same size.


If the bins are of equal width, a rectangle is drawn over each bin with a height proportional to the frequency, that is, the number of cases in the bin. Normalizing a histogram allows you to see "relative" frequencies. It then shows the proportion of cases falling into each category, with the sum of the heights equaling one.


Bins do not have to be of the same width. In that case, each rectangle is drawn so that its area is proportional to the number of cases in the bin. The vertical axis then shows the frequency density (the number of cases per unit of the variable on the horizontal axis) instead of the frequency.
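Both ideas can be checked with a few lines of Python. In this sketch the equal-width frequencies are the ones counted from the 50 marks in the example earlier in the article, and the unequal bins (0-20, 20-30, 30-50) are a hypothetical regrouping of the same marks, chosen only to illustrate frequency density.

```python
# (lower, upper, frequency) for equal-width bins of the 50 marks.
equal_bins = [(0, 10, 3), (10, 20, 11), (20, 30, 14), (30, 40, 14), (40, 50, 8)]

# Relative frequency = frequency / total number of cases; the values sum to 1.
total = sum(freq for _, _, freq in equal_bins)
relative = [freq / total for _, _, freq in equal_bins]

# Hypothetical unequal-width regrouping of the same marks.
unequal_bins = [(0, 20, 14), (20, 30, 14), (30, 50, 22)]

# Frequency density = frequency / bin width; it becomes the bar height.
density = [freq / (upper - lower) for lower, upper, freq in unequal_bins]

# Bar area = density * width, which recovers the frequency of each bin.
areas = [d * (upper - lower) for d, (lower, upper, _) in zip(density, unequal_bins)]

print(relative)
print(density)
print(areas)
```

The bins 0-20 and 20-30 both hold 14 marks, but the wider bin gets a bar half as tall, so the two bars have the same area.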

4. How are bar charts used?

Bar graphs and charts are used to visualize categorical data. Categorical data is information divided into separate categories, such as age groups, shoe sizes, months of the year, and animals. These are usually qualitative categories. In a column (vertical) bar chart, the categories appear along the horizontal axis, and the height of each bar corresponds to the value of its category.


Bar charts have a discrete set of categories and are typically scaled so that all of the data fits on the graph. When the categories being compared do not have a natural ordering, the bars on the chart can be arranged in any sequence. Pareto charts are bar charts that are organized from highest to lowest incidence.


With grouped (or "clustered") bar charts and stacked bar charts, bar graphs can be utilized for more sophisticated data comparisons.


Each categorical group in a grouped (clustered) bar chart has two or more color-coded bars to indicate that grouping. For example, a business owner with a couple of stores may make a grouped bar chart with different colored bars to represent each store, with the horizontal axis displaying the months of the year and the vertical axis showing income.

5. What is the difference between qualitative data and quantitative data?

Quantitative and qualitative data are the two categories of data available. Quantitative data describes numerical information, whereas qualitative data communicates non-numerical information. Temperature measurements, for example, are quantitative data.


Qualitative data, on the other hand, describes information in words. After gathering data, it must be organized, which is where the distinction between grouped and ungrouped data comes in. Both are useful, but ungrouped data is raw data: it has only been gathered and has not been organized into any groups or classifications. Grouped data, on the other hand, is raw data that has been arranged into groups.

6. What is grouped and ungrouped data?

Data is often described as ungrouped or grouped. Ungrouped data is given as individual data points, without a frequency distribution. Grouped data is given in intervals (classes), each with its frequency.

7. How can we convert ungrouped data to grouped data?

The first step is to decide how many classes you want to have. Next, subtract the lowest value in the data set from the highest value, and divide the result by the number of classes. This gives the approximate class width, which is usually rounded up to a convenient number. Finally, count how many observations fall into each class.
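These steps can be sketched in Python as follows. This is one possible reading of the procedure, not a fixed rule: the function name `class_limits` is hypothetical, rounding the width up to a whole number is an assumption, and so is the extra widening step that keeps the highest value inside the last class when the upper limit is excluded. The data are the 40 marks of Example 1.

```python
import math

# Marks of the 40 students from Example 1.
marks = [16, 17, 18, 3, 7, 23, 18, 13, 10, 21, 7, 1, 13, 21, 13, 15, 19, 24,
         16, 2, 23, 5, 12, 18, 8, 12, 6, 8, 16, 5, 3, 5, 0, 7, 9, 12, 20, 10,
         2, 23]


def class_limits(data, num_classes):
    """Return (lower, upper) limits for num_classes equal-width classes."""
    lowest, highest = min(data), max(data)
    # (highest - lowest) / number of classes, rounded up (assumption).
    width = math.ceil((highest - lowest) / num_classes)
    # If the division was exact, the highest value would sit on the excluded
    # upper limit of the last class, so widen the classes by one (assumption).
    if lowest + num_classes * width <= highest:
        width += 1
    return [(lowest + i * width, lowest + (i + 1) * width)
            for i in range(num_classes)]


limits = class_limits(marks, 5)
print(limits)
```

For these marks the range is 24 - 0 = 24, and 24 / 5 = 4.8 rounds up to a class width of 5, which reproduces the classes 0-5, 5-10, 10-15, 15-20 and 20-25 used in Example 1.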

8. What are grouped data examples?

Grouped data is data that has been bundled together into categories (classes). Frequency tables and histograms can be used to show this type of data, for example:

  • A relative frequency histogram showing book sales for a certain day, sorted by price.

  • A grouped frequency table showing data grouped by height.

These are only a few of the many possible examples of grouped data.