The GROUP BY clause is used to group data
SQL allows the user to store more than 30 types of data in as many columns as required, so sometimes, it becomes difficult to find similar data in these columns. Group By in SQL helps us club together identical rows present in the columns of a table. This is an essential statement in SQL as it provides us
with a neat dataset by letting us summarize important data like sales, cost, and salary. The Group By statement is used to group together any rows of a column with the same value stored in them, based on a function specified in the statement. Generally, these functions are one of the aggregate functions such as MAX() and SUM(). This statement is used with the SELECT
command in SQL. The SQL Group By statement uses the split-apply-combine strategy. The
SUM() aggregate function, which results in the arithmetic sum of the rows’ values, has been applied to the groups in the above illustration. To better understand how SQL Order By is used, let’s get an insight into this statement's syntax. SELECT
column_1, function_name(column_2) FROM table_name WHERE [condition] GROUP BY column_name ORDER BY column_name; NOTE: The WHERE clause always precedes the ORDER BY statement. NOTE:
The ORDER BY statement always appears after the GROUP BY statement and is applied to the groups of rows formed. Aggregate functions are one of the most vital parts of the SQL Group By statement, so let’s see what these are. The five aggregate functions that we can use with the SQL Order By statement are:
Let’s apply the concepts stated above to the columns of a table to get an even clearer picture of the Group By statement. Using Group By on a Single ColumnLet’s take a sample “Employee_dept” table:
We have used the AVG() aggregate function for this purpose. This will result in the following table: There was no ORDER BY clause, so the table has no definite order of rows.
To find the number of employees from each city, this is what is required: And it will result in the following table: As we can see, the groups have been ordered in ascending order, according to the number of employees from each city.
To find the maximum salary for each department with “Dept_ID” greater than 1003: The WHERE clause will be applied to the rows before they are grouped. The above query will result in the following:
To find the total salary of all the employees: As we can see, the count function on “Dept_ID” returns the total number of records in the table, and the sum function on “Salary” returns the arithmetic sum of all the employees’ salaries. The SQL Group By statement can be applied to multiple columns of a table in a single query. Let's see how this is done. Using Group By on Multiple ColumnsIn the “Employee_dept” table, to find the number of employees from each city in all the departments, we’ll use the following query: As we can see, the GROUP BY is applied to two columns. This will result in the following table: As we can see, each city and department combination is represented in the table, and the number of employees for each of these combinations has been calculated. Using the Having Clause With Group ByThe WHERE clause cannot be used to filter the grouped rows based on a condition. We have the HAVING clause for this purpose. The syntax for using this clause is: SELECT column_1, function(column_2) FROM table_name WHERE [condition] GROUP BY column_names HAVING [condition] ORDER BY column_names; The HAVING clause always precedes the ORDER BY statement in the query. In the “Employee_dept” table, we'll use the following query to find the cities with multiple employees. As we can see, a condition has been applied to the group of cities formed by the GROUP BY statement. This will result in the following: Only the cities with multiple employees are returned. With this, we come to the end of this article about the popular Group By statement. Gain expertise in the latest Business analytics tools and techniques with the Post Graduate Program in Business Analysis. Enroll now! Next StepsData aggregation or summarization is one of the most important tasks and skills for anyone who manages data. From profit and loss statements to creating a perfect visualization of data, SQL Group By is the tool that allows us to do these kinds of tasks efficiently. Now that you know how to aggregate and summarize data, it is time for you to start querying, manipulating, and visualizing all kinds of data to move forward in your journey to become an expert in SQL. If you liked this article and want to get certified, check out our Business Analyst Master’s Program as it will help you learn the A-Z of SQL as well. Do you have any questions for us? Please mention them in the comment section of our “How to Aggregate Data Using Group By in SQL” article, and we’ll have our experts in the field answer them for you right away! What is the GROUP BY clause used for?The GROUP BY Clause is utilized in SQL with the SELECT statement to organize similar data into groups. It combines the multiple records in single or more columns using some functions.
Which clause is used to group the data?The groupby clause is used to group the data according to particular column or row.
What is GROUP BY clause in SQL?In SQL, the GROUP BY clause is used to group rows by one or more columns. For example, SELECT country, COUNT(*) AS number FROM Customers GROUP BY country; Run Code. Here, the SQL command groups the rows by the country column, and counts the number of each country (because of the COUNT() function).
|