➤ How to Code a Game
➤ Array Programs in Java
➤ Java Inline Thread Creation
➤ Java Custom Exception
➤ Hibernate vs JDBC
➤ Object Relational Mapping
➤ Check Oracle DB Size
➤ Check Oracle DB Version
➤ Generation of Computers
➤ XML Pros & Cons
➤ Git Analytics & Its Uses
➤ Top Skills for Cloud Professional
➤ How to Hire Best Candidates
➤ Scrum Master Roles & Work
➤ CyberSecurity in Python
➤ Protect from Cyber-Attack
➤ Solve App Development Challenges
➤ Top Chrome Extensions for Twitch Users
➤ Mistakes That Can Ruin Your Test Metric Program
Most Asked SQL Interview Questions | We have listed mostly asked SQL interview questions on SQL. Let us first create schema and sample data that can be used to cover most of the SQL questions listed. The schema includes tables for employees, departments, customers, orders, products, and salaries.
Table of Contents
- SQL Fundamentals
- SQL Constraints and Data Types
- SQL Joins and Relationships
- Query
- Advanced SQL Concepts
-- Creating Employee Table
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
DepartmentID INT,
Salary DECIMAL(10, 2),
JoiningDate DATE,
ManagerID INT
);
-- Creating Department Table
CREATE TABLE Departments (
DepartmentID INT PRIMARY KEY,
DepartmentName VARCHAR(50)
);
CREATE TABLE Leaves (
LeaveID INT PRIMARY KEY,
EmployeeID INT,
LeaveDate DATE,
FOREIGN KEY (EmployeeID) REFERENCES Employees(EmployeeID)
);
-- Inserting Data into Employees Table
INSERT INTO Employees VALUES (1, 'John', 'Doe', 1, 60000, '2018-01-10', NULL);
INSERT INTO Employees VALUES (2, 'Jane', 'Smith', 2, 75000, '2017-03-15', 1);
INSERT INTO Employees VALUES (3, 'Sam', 'Brown', 1, 55000, '2020-06-20', 1);
INSERT INTO Employees VALUES (4, 'Lisa', 'White', 3, 90000, '2016-09-05', NULL);
INSERT INTO Employees VALUES (5, 'Mark', 'Green', 2, 80000, '2019-11-12', 2);
INSERT INTO Employees VALUES (6, 'Emily', 'Clark', 1, 67000, '2019-01-05', 1);
INSERT INTO Employees VALUES (7, 'Daniel', 'Lewis', 3, 72000, '2021-05-10', 4);
INSERT INTO Employees VALUES (8, 'Sophia', 'Wilson', 4, 98000, '2015-12-15', 6);
INSERT INTO Employees VALUES (9, 'James', 'Johnson', NULL, 65000, '2020-10-10', 7);
INSERT INTO Employees VALUES (10, 'Olivia', 'Jones', 2, 82000, '2018-03-23', 2);
INSERT INTO Employees VALUES (11, 'Zara', 'Ali', 5, 45000, '2021-07-11', 8);
INSERT INTO Employees VALUES (12, 'Amit', 'Sharma', 6, 50000, '2022-05-22', 3);
-- Inserting Data into Departments Table
INSERT INTO Departments VALUES (1, 'HR');
INSERT INTO Departments VALUES (2, 'Finance');
INSERT INTO Departments VALUES (3, 'Engineering');
INSERT INTO Departments VALUES (4, 'Marketing');
INSERT INTO Departments VALUES (5, 'Sales');
INSERT INTO Departments VALUES (6, 'Operations');
INSERT INTO Departments VALUES (7, 'Logistics');
INSERT INTO Leaves VALUES (1, 1, '2024-05-20');
INSERT INTO Leaves VALUES (2, 2, '2024-03-15');
INSERT INTO Leaves VALUES (3, 3, '2024-02-20');
INSERT INTO Leaves VALUES (4, 4, '2024-07-10');
INSERT INTO Leaves VALUES (5, 5, '2024-04-25');
INSERT INTO Leaves VALUES (6, 6, '2024-06-05');
INSERT INTO Leaves VALUES (7, 7, '2024-08-12');
INSERT INTO Leaves VALUES (8, 8, '2024-01-10');
INSERT INTO Leaves VALUES (9, 9, '2024-09-15');
INSERT INTO Leaves VALUES (10, 10, '2024-11-05');
INSERT INTO Leaves VALUES (11, 11, '2024-12-01');
INSERT INTO Leaves VALUES (12, 12, '2024-10-10');
CREATE TABLE Products (
ProductId INT PRIMARY KEY,
ProductName VARCHAR(100),
Price DECIMAL(10, 2)
);
CREATE TABLE Sales (
SaleId INT PRIMARY KEY,
ProductId INT,
Quantity INT,
FOREIGN KEY (ProductId) REFERENCES Products(ProductId)
);
CREATE TABLE Orders (
OrderId INT PRIMARY KEY,
CustomerId INT,
OrderDate DATE
);
CREATE TABLE Payments (
PaymentId INT PRIMARY KEY,
OrderId INT,
PaymentDate DATE,
FOREIGN KEY (OrderId) REFERENCES Orders(OrderId)
);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (1, 'Product A', 10.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (2, 'Product B', 20.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (3, 'Product C', 30.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (4, 'Product D', 40.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (5, 'Product E', 50.00);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (1, 1, 100);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (2, 2, 200);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (3, 3, 150);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (4, 4, 80);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (5, 5, 120);
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (1, 101, '2023-01-01');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (2, 102, '2023-01-02');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (3, 103, '2023-01-03');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (4, 104, '2023-01-04');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (5, 105, '2023-01-05');
INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (1, 1, '2023-01-06');
INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (2, 2, '2023-01-07');
INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (3, 3, '2023-01-08');
INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (4, 4, '2023-01-09');
SQL Fundamentals
1. Explain the order of execution of SQL.
The order of execution of SQL statements can be a bit tricky because it’s not always intuitive. However, understanding it can significantly enhance your ability to write efficient queries. Here’s a breakdown of the typical order of execution for a SQL query:
- FROM Clause: The first step is to process the
FROM
clause to determine the tables involved and perform any necessary joins. - WHERE Clause: Next, the
WHERE
clause filters the rows based on the specified conditions. - GROUP BY Clause: After filtering, the
GROUP BY
clause groups the remaining rows based on the specified columns. - HAVING Clause: The
HAVING
clause then filters the groups created in the previous step. - SELECT Clause: The
SELECT
clause processes the columns and expressions that should be returned in the result set. - DISTINCT Clause: If used, the
DISTINCT
clause removes duplicate rows from the result set. - ORDER BY Clause: The
ORDER BY
clause sorts the result set based on the specified columns and directions. - LIMIT/OFFSET Clause: Finally, the
LIMIT
orOFFSET
clause limits the number of rows returned.
SELECT DISTINCT DepartmentID, AVG(Salary)
FROM Employees
WHERE Salary > 50000
GROUP BY DepartmentID
HAVING AVG(Salary) > 60000
ORDER BY DepartmentID
LIMIT 10;
- FROM Employees: Identify the table to be queried.
- WHERE Salary > 50000: Filter rows where
Salary
is greater than 50000. - GROUP BY DepartmentID: Group the filtered rows by
DepartmentID
. - HAVING AVG(Salary) > 60000: Filter groups where the average
Salary
is greater than 60000. - SELECT DISTINCT DepartmentID, AVG(Salary): Select and compute the required columns, ensuring no duplicate rows.
- ORDER BY DepartmentID: Sort the result set by
DepartmentID
. - LIMIT 10: Limit the result set to 10 rows.
2. What is the difference between WHERE and HAVING?
The WHERE clause is used to filter rows before any groupings are made. It’s applied to individual rows in the database table.
SELECT *
FROM Employees
WHERE Salary > 50000;
The HAVING clause is used to filter groups after the GROUP BY operation has been applied. It’s used to filter the aggregated results. The HAVING clause is primarily used with aggregate functions (like SUM, COUNT, AVG, etc.).
SELECT DepartmentID, AVG(Salary)
FROM Employees
GROUP BY DepartmentID
HAVING AVG(Salary) > 60000;
3. What is the use of GROUP BY?
The GROUP BY
clause in SQL is used to arrange identical data into groups. It is often used in conjunction with aggregate functions (such as COUNT, SUM, AVG, MAX, MIN) to perform calculations on each group of data. Use Cases:
- Aggregate Data: Summarize data by certain categories.
- Data Analysis: Analyze patterns within groups.
- Reporting: Generate reports with grouped data insights.
This query groups employees by DepartmentID
and calculates the average salary for each department:-
SELECT DepartmentID, AVG(Salary)
FROM Employees
GROUP BY DepartmentID;
4. What is the difference between GROUP BY and WHERE Clause?
The GROUP BY clause groups rows with the same values in specified columns. It is used with aggregate functions.
SELECT DepartmentID, AVG(Salary)
FROM Employees
GROUP BY DepartmentID;
The WHERE clause filters rows before grouping. It specified conditions for individual rows.
SELECT *
FROM Employees
WHERE Salary > 50000;
5. What is the difference between DML, DDL, and DCL?
1. Data Manipulation Language (DML):
- Purpose: Used for managing data within schema objects.
- Common Commands:
SELECT
,INSERT
,UPDATE
,DELETE
. - Example:
INSERT INTO Employees (EmployeeID, FirstName, LastName) VALUES (1, 'John', 'Doe');
2. Data Definition Language (DDL):
- Purpose: Used for defining and modifying database structures or schema.
- Common Commands:
CREATE
,ALTER
,DROP
,TRUNCATE
. - Example:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50)
);
3. Data Control Language (DCL):
- Purpose: Used for controlling access to data in the database.
- Common Commands:
GRANT
,REVOKE
. - Example:
GRANT SELECT ON Employees TO userName;
Summary:
- DML: Manipulates the data.
- DDL: Defines the structure.
- DCL: Controls access.
6. How many types of clauses in SQL?
SQL (Structured Query Language) includes several types of clauses, each serving a different purpose in structuring and executing queries. Here are the key types of SQL clauses:
- SELECT Clause: Specifies the columns to be retrieved.
SELECT column1, column2 FROM table;
- FROM Clause: Specifies the table(s) from which to retrieve data.
SELECT * FROM table;
- WHERE Clause: Filters records based on specified conditions.
SELECT * FROM table WHERE condition;
- GROUP BY Clause: Groups rows sharing a property so aggregate functions can be applied to each group.
SELECT column, COUNT(*)
FROM table
GROUP BY column;
- HAVING Clause: Filters groups based on conditions, used with
GROUP BY
.
SELECT column, COUNT(*)
FROM table
GROUP BY column
HAVING COUNT(*) > 1;
- ORDER BY Clause: Sorts the result set of a query by one or more columns.
SELECT * FROM table
ORDER BY column ASC|DESC;
- LIMIT/OFFSET Clause: Limits the number of rows returned by a query.
SELECT * FROM table
LIMIT number OFFSET number;
- JOIN Clause: Combines rows from two or more tables based on a related column.
SELECT columns
FROM table1
JOIN table2 ON table1.column = table2.column;
7. Differences between GROUP BY
and PARTITION BY
?
GROUP BY
- Purpose: Used to aggregate data across multiple rows by one or more columns.
- Functionality: Combines rows that have the same values into summary rows, like
SUM
,AVG
,COUNT
, etc. - Usage: Commonly used with aggregate functions to generate summary data.
SELECT DepartmentId, SUM(Salary)
FROM Employees
GROUP BY DepartmentId;
This query groups the employees by their DepartmentId
and calculates the total salary for each department.
PARTITION BY
- Purpose: Used with window functions to perform calculations across a set of table rows that are somehow related to the current row.
- Functionality: Does not reduce the number of rows returned. Instead, it adds a column that contains aggregate values calculated over a partition.
- Usage: Commonly used with functions like
ROW_NUMBER()
,RANK()
,DENSE_RANK()
,NTILE()
,LAG()
,LEAD()
, etc.
SELECT EmployeeId, DepartmentId, Salary,
SUM(Salary) OVER (PARTITION BY DepartmentId) AS TotalDepartmentSalary
FROM Employees;
This query calculates the total salary for each department but does not group the rows. It adds a new column TotalDepartmentSalary
that contains the total salary for the department of each employee.
Key Differences
- Reduction of Rows:
GROUP BY
: Aggregates data and reduces the number of rows.PARTITION BY
: Adds aggregate data without reducing the number of rows.
- Usage Context:
GROUP BY
: Used for general data aggregation in summary reports.PARTITION BY
: Used for analytical purposes to provide detailed insights within groups of data.
8. Explain the use of CASE statements in SQL.
The CASE
statement in SQL is a powerful tool used to implement conditional logic directly within SQL queries. It’s similar to if-else
statements in programming languages. You can use the CASE
statement to create custom output, perform conditional aggregations, and more. Syntax:-
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
...
ELSE resultN
END
Categorizing Employees Based on Salary
SELECT
EmployeeID,
FirstName,
LastName,
Salary,
CASE
WHEN Salary < 50000 THEN 'Low Salary'
WHEN Salary >= 50000 AND Salary <= 80000 THEN 'Medium Salary'
ELSE 'High Salary'
END AS SalaryCategory
FROM
Employees;
Handling Null Values: You can also use the CASE
statement to handle NULL
values:
SELECT
EmployeeID,
FirstName,
LastName,
ManagerID,
CASE
WHEN ManagerID IS NULL THEN 'No Manager'
ELSE CAST(ManagerID AS VARCHAR)
END AS ManagerStatus
FROM
Employees;
Conditional Aggregations: The CASE
statement can also be used for conditional aggregations. For example, calculating the total salary for different departments:
SELECT
DepartmentID,
SUM(CASE WHEN Salary > 60000 THEN Salary ELSE 0 END) AS TotalHighSalaries,
SUM(CASE WHEN Salary <= 60000 THEN Salary ELSE 0 END) AS TotalLowSalaries
FROM
Employees
GROUP BY
DepartmentID;
SQL Constraints and Data Types
1. What are constraints and types of Constraints?
In SQL, constraints are rules applied to table columns to ensure the integrity, accuracy, and reliability of the data within the database. Constraints can be applied during table creation or after the table has been created.
Types of Constraints:
- Primary Key Constraint: Ensures that each row in a table has a unique identifier. Example:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50)
);
- Foreign Key Constraint: Ensures the referential integrity of the data by linking two tables. Example:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
EmployeeID INT,
CONSTRAINT fk_Employee
FOREIGN KEY (EmployeeID) REFERENCES Employees(EmployeeID)
);
- Unique Constraint: Ensures all values in a column or a set of columns are unique. Example:
CREATE TABLE Products (
ProductID INT PRIMARY KEY,
ProductName VARCHAR(50) UNIQUE
);
- Not Null Constraint: Ensures that a column cannot have a NULL value. Example:
CREATE TABLE Customers (
CustomerID INT PRIMARY KEY,
CustomerName VARCHAR(50) NOT NULL
);
- Check Constraint: Ensures that all values in a column satisfy a specific condition. Example:
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Age INT CHECK (Age >= 18)
);
- Default Constraint: Provides a default value for a column when no value is specified. Example:
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
OrderDate DATE DEFAULT GETDATE()
);
Summary:
- Primary Key: Unique identifier for each row.
- Foreign Key: Links two tables.
- Unique: Ensures unique values in a column.
- Not Null: Ensures no NULL values.
- Check: Ensures column values meet a condition.
- Default: Sets a default value for a column.
2. What is the difference between “Primary Key” and “Unique Key”?
- We can have only one Primary Key in a table whereas we can have more than one Unique Key in a table.
- The Primary Key cannot have a NULL value whereas a Unique Key may have only one null value.
- By default, a Primary Key is a Clustered Index whereas by default, a Unique Key is a unique non-clustered index.
- A Primary Key supports an Auto Increment value whereas a Unique Key doesn’t support an Auto Increment value.
3. Different types of Operators?
Type of Operator | Operators | Example | Explanation |
---|---|---|---|
Arithmetic | + , - , * , / , % | SELECT 5 + 2; | Performs basic arithmetic operations. |
Comparison | = , <> /!= , > , < , >= , <= | SELECT * FROM Employees WHERE Salary > 50000; | Compares two values. |
Logical | AND , OR , NOT | SELECT * FROM Employees WHERE Salary > 50000 AND DepartmentID = 1; | Combines multiple conditions. |
Bitwise | & , | , ^ , ~ , << , >> | SELECT 5 & 3; | Performs operations on bits of integer values. |
Other | BETWEEN , IN , LIKE , IS NULL , IS NOT NULL | SELECT * FROM Employees WHERE Salary BETWEEN 40000 AND 60000; | Performs various specialized operations. |
-- 1. Arithmetic Operators
-- Addition (+)
SELECT 5 + 2; -- Result: 7
-- Subtraction (-)
SELECT 5 - 2; -- Result: 3
-- Multiplication (*)
SELECT 5 * 2; -- Result: 10
-- Division (/)
SELECT 5 / 2; -- Result: 2.5
-- Modulo (%)
SELECT 5 % 2; -- Result: 1
-- 2. Comparison Operators
-- Equal to (=)
SELECT * FROM Employees WHERE Salary = 50000;
-- Not equal to (<> or !=)
SELECT * FROM Employees WHERE Salary <> 50000;
-- Greater than (>)
SELECT * FROM Employees WHERE Salary > 50000;
-- Less than (<)
SELECT * FROM Employees WHERE Salary < 50000;
-- Greater than or equal to (>=)
SELECT * FROM Employees WHERE Salary >= 50000;
-- Less than or equal to (<=)
SELECT * FROM Employees WHERE Salary <= 50000;
-- 3. Logical Operators
-- AND
SELECT * FROM Employees WHERE Salary > 50000 AND DepartmentID = 1;
-- OR
SELECT * FROM Employees WHERE Salary > 50000 OR DepartmentID = 1;
-- NOT
SELECT * FROM Employees WHERE NOT Salary = 50000;
-- 4. Bitwise Operators
-- Bitwise AND (&)
SELECT 5 & 3; -- Result: 1
-- Bitwise OR (|)
SELECT 5 | 3; -- Result: 7
-- Bitwise XOR (^)
SELECT 5 ^ 3; -- Result: 6
-- Bitwise NOT (~)
SELECT ~5; -- Result: -6
-- Bitwise Shift Left (<<)
SELECT 5 << 1; -- Result: 10
-- Bitwise Shift Right (>>)
SELECT 5 >> 1; -- Result: 2
-- 5. Other Operators
-- BETWEEN
SELECT * FROM Employees WHERE Salary BETWEEN 40000 AND 60000;
-- IN
SELECT * FROM Employees WHERE DepartmentID IN (1, 2, 3);
-- LIKE
SELECT * FROM Employees WHERE FirstName LIKE 'J%';
-- IS NULL
SELECT * FROM Employees WHERE LastName IS NULL;
-- IS NOT NULL
SELECT * FROM Employees WHERE LastName IS NOT NULL;
4. Difference between CHAR and VARCHAR?
1. CHAR
:
- Fixed-Length: Allocates a fixed amount of memory, regardless of the actual data length.
- Storage Efficiency: Less efficient for varying-length data; better for consistent-length data.
- Performance: Generally faster for read operations because of fixed length.
- Padding: Pads extra spaces to the right of the value to meet the defined length.
CREATE TABLE Employees (
EmployeeCode CHAR(10)
);
If you store “12345” in a CHAR(10)
, it will be stored as "12345_ _ _ _ _"
(with five trailing space characters).
2. VARCHAR
:
- Variable-Length: Allocates only the memory needed for the actual data length, plus one or two bytes for length storage.
- Storage Efficiency: More efficient for varying-length data.
- Performance: Slightly slower for read operations compared to
CHAR
due to length calculations. - Padding: Does not pad extra spaces; stores the exact length of the value.
CREATE TABLE Employees (
EmployeeName VARCHAR(50)
);
If you store “John” in a VARCHAR(50)
, it will be stored just as “John”.
Summary:
CHAR
: Best for fixed-length data, may cause wasted space if data length varies, generally faster for consistent-length data.VARCHAR
: Best for varying-length data, more storage-efficient, but can be slightly slower due to extra length calculations.
5. What is an index? Explain its different types.
An index in SQL is a database object that improves the speed of data retrieval operations on a table by providing quick access to the rows. Think of it as a book’s index, which helps you quickly find the information you need without having to read every page.
Types of Indexes:
- Primary Key Index:
- Automatically Created: When a primary key is defined.
- Purpose: Ensures that the primary key column has unique values and can be used to quickly access rows.
- Unique Index:
- Ensures Uniqueness: Prevents duplicate values in a column.
- Purpose: Used to enforce unique constraints on columns other than the primary key.
- Clustered Index:
- Physical Order: Reorders the physical storage of the table’s data to match the index.
- One Per Table: A table can have only one clustered index.
CREATE CLUSTERED INDEX idx_EmployeeID ON Employees(EmployeeID);
- Non-Clustered Index:
- Logical Order: Maintains a separate structure from the data rows, with pointers to the physical data.
- Multiple Per Table: A table can have multiple non-clustered indexes.
CREATE INDEX idx_LastName ON Employees(LastName);
- Full-Text Index:
- Text Search: Used for full-text searches in text-based columns.
- Purpose: Efficiently searches large text fields, like
VARCHAR
orTEXT
columns.
CREATE FULLTEXT INDEX idx_FullText ON Articles(Content);
- Composite Index:
- Multiple Columns: An index on multiple columns.
- Purpose: Used to improve performance on queries involving multiple columns in the
WHERE
clause.
CREATE INDEX idx_Composite ON Employees(LastName, FirstName);
Summary:
- Primary Key Index: Ensures primary key uniqueness and quick access.
- Unique Index: Ensures column value uniqueness.
- Clustered Index: Reorders physical storage.
- Non-Clustered Index: Logical ordering with pointers.
- Full-Text Index: Efficient text searches.
- Composite Index: Multiple column indexing.
6. How do you use indexing to improve SQL query performance?
Indexing can significantly enhance SQL query performance by reducing the amount of data that needs to be scanned to find the required information. Here’s how you can use indexing effectively:
Key Strategies for Using Indexing:
- Identify High-Frequency Queries: Analyze which queries are most commonly executed and create indexes on the columns frequently used in
WHERE
,JOIN
,ORDER BY
, andGROUP BY
clauses. - Choose the Right Index Type: Use Clustered Indexes for primary keys or columns with unique values, as they store the actual data rows in the index. Use Non-Clustered Indexes for columns used in search conditions and sorting, as they provide quick lookups without altering the physical order of data.
- Index Selective Columns: Create indexes on columns with high selectivity (i.e., columns where the majority of values are unique or not repetitive). This increases the efficiency of searches.
- Use Composite Indexes: When queries filter on multiple columns, create composite indexes to cover those columns. Ensure the columns in the composite index match the order of the columns in the query’s
WHERE
clause.CREATE INDEX idx_Composite ON Employees(LastName, FirstName);
- Avoid Over-Indexing: While indexes improve read performance, they can slow down write operations (INSERT, UPDATE, DELETE) because the index needs to be updated. Balance the number of indexes to optimize both read and write operations.
- Regular Maintenance: Periodically rebuild and reorganize indexes to remove fragmentation, which can degrade performance over time.
ALTER INDEX idx_Composite ON Employees REBUILD;
- Covering Indexes: Use covering indexes (indexes that include all the columns needed by a query) to avoid lookups to the actual table.
CREATE INDEX idx_Covering ON Employees (LastName, FirstName, DepartmentID);
- Monitor and Analyze Performance: Use database performance monitoring tools to track index usage and query performance. Adjust indexes based on the analysis.
EXPLAIN SELECT * FROM Employees WHERE LastName = 'Smith';
Here’s an example of creating a non-clustered index on a column often used in search conditions:
CREATE INDEX idx_LastName ON Employees(LastName);
7. What is the difference between TRUNCATE and DELETE?
1. TRUNCATE:
- Purpose: Used to quickly remove all rows from a table, effectively resetting it to an empty state.
- It permanently deletes all rows from a table.
- Operation Type: DDL (Data Definition Language).
- Performance: Generally faster than
DELETE
because it does not log individual row deletions. - Transaction Control: Cannot be rolled back in most databases; treated as a DDL command.
- Triggers: Does not fire triggers.
- Syntax:
TRUNCATE TABLE tableName;
Example:
TRUNCATE TABLE Employees;
2. DELETE:
- Purpose: Used to remove specific rows from a table based on a condition.
- Operation Type: DML (Data Manipulation Language).
- Performance: Slower than
TRUNCATE
for large tables because it logs individual row deletions. - Transaction Control: Can be rolled back if enclosed in a transaction block.
- Triggers: Fires triggers.
- Syntax:
DELETE FROM tableName WHERE condition;
Example:
DELETE FROM Employees WHERE DepartmentID = 1;
8. How would you optimize a slow SQL query?
Optimizing a slow SQL query can involve several strategies to enhance its performance. Here are some common techniques:
- Analyzing the Query Plan (Use EXPLAIN): Use the
EXPLAIN
command to understand how your query is executed.
EXPLAIN SELECT * FROM Employees WHERE DepartmentID = 1;
- Indexing (Create Indexes): Ensure that columns used in
WHERE
,JOIN
,ORDER BY
, andGROUP BY
clauses have appropriate indexes.
CREATE INDEX idx_DepartmentID ON Employees(DepartmentID);
- Optimize Joins (Use Proper Joins): Ensure that you use the most efficient join type for your query.
SELECT e.Name, d.DepartmentName
FROM Employees e
INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID;
- Avoiding SELECT * (Specify Columns): Select only the columns you need instead of using
SELECT *
.
SELECT Name, DepartmentName
FROM Employees
WHERE DepartmentID = 1;
- Filtering Early (Use WHERE Clauses): Filter data as early as possible in the query.
SELECT Name, DepartmentName
FROM Employees
WHERE DepartmentID = 1;
- Optimizing Subqueries (Use Joins or CTEs): Replace subqueries with joins or Common Table Expressions (CTEs) if possible.
-- Using a join instead of a subquery
SELECT e.Name, d.DepartmentName
FROM Employees e
INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID
WHERE d.DepartmentName = 'Sales';
- Limiting Results (Use LIMIT): If you need only a subset of the results, use the
LIMIT
clause.
SELECT Name, DepartmentName
FROM Employees
WHERE DepartmentID = 1
LIMIT 10;
- Analyzing and Updating Statistics (Update Statistics): Ensure that the database statistics are up-to-date for the optimizer to make informed decisions.
ANALYZE TABLE Employees;
- Avoiding Functions in WHERE Clause (Use Indexed Columns): Avoid using functions on columns in the
WHERE
clause as it can prevent index usage.
-- Instead of this
SELECT * FROM Employees WHERE YEAR(JoiningDate) = 2021;
-- Use this
SELECT * FROM Employees
WHERE JoiningDate BETWEEN '2021-01-01' AND '2021-12-31';
- Partitioning Large Tables (Partition Tables): Break down large tables into smaller, more manageable pieces for faster query performance.
Difference between ISNULL() and IFNULL()?
- ISNULL(expression): Checks if an expression is NULL and returns 1 if true, 0 otherwise.
- IFNULL(expression, replacement): Replaces NULL values with a specified replacement value.
SELECT
EmployeeID,
FirstName,
ISNULL(ManagerID) AS IsManagerIDNull
FROM
Employees;
SELECT
EmployeeID,
FirstName,
IFNULL(ManagerID, 'No Manager') AS ManagerID
FROM
Employees;
SQL Joins and Relationships
1. Explain all types of joins in SQL.
1. Inner Join: Returns records that have matching values in both tables. Syntax:
SELECT a.*, b.*
FROM TableA a
INNER JOIN TableB b ON a.CommonColumn = b.CommonColumn;
Example:
SELECT e.firstname, d.departmentname
FROM employees e
INNER JOIN departments d
ON e.departmentid = d.departmentid;
2. Left (Outer) Join: Returns all records from the left table and the matched records from the right table. Unmatched records from the left table are also included. Syntax:
SELECT a.*, b.*
FROM TableA a
LEFT JOIN TableB b ON a.CommonColumn = b.CommonColumn;
Example:
SELECT e.firstname, d.departmentname
FROM employees e
LEFT JOIN departments d
ON e.departmentid = d.departmentid;
3. Right (Outer) Join: Returns all records from the right table and the matched records from the left table. Unmatched records from the right table are also included. Syntax:
SELECT a.*, b.*
FROM TableA a
RIGHT JOIN TableB b ON a.CommonColumn = b.CommonColumn;
Example:
SELECT e.firstname, d.departmentname
FROM employees e
RIGHT JOIN departments d
ON e.departmentid = d.departmentid;
4. Full (Outer) Join: Returns all records when there is a match in either left or right table. Unmatched records from both tables are included. Syntax:
SELECT a.*, b.*
FROM TableA a
FULL OUTER JOIN TableB b ON a.CommonColumn = b.CommonColumn;
Example:
SELECT Employees.Name, Departments.DepartmentName
FROM Employees
FULL OUTER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
MySQL does not support the FULL OUTER JOIN
clause. Most modern relational database management systems (RDBMS) support FULL OUTER JOIN, including: PostgreSQL, SQL Server, Oracle, SQLite (with a workaround using LEFT JOIN and UNION). However, as mentioned earlier, MySQL does not support FULL OUTER JOIN
directly. You can achieve the same result using a combination of LEFT JOIN
and UNION
.
-- Simulating FULL OUTER JOIN in MySQL
SELECT a.column1, b.column2
FROM table1 a
LEFT JOIN table2 b ON a.common_column = b.common_column
UNION
SELECT a.column1, b.column2
FROM table1 a
RIGHT JOIN table2 b ON a.common_column = b.common_column;
5. Cross Join: Returns the Cartesian product of both tables, meaning it returns all possible combinations of rows. Syntax:
SELECT a.*, b.*
FROM TableA a
CROSS JOIN TableB b;
Example:
SELECT e.firstname, d.departmentname
FROM employees e
CROSS JOIN departments d;
6. Self Join: A self join is a regular join but the table is joined with itself. Syntax:
SELECT a.*, b.*
FROM TableA a
INNER JOIN TableA b ON a.CommonColumn = b.CommonColumn;
Example: SQL Query will show the employees’ and managers’ names.
SELECT e1.FirstName AS EmployeeName, e2.FirstName AS ManagerName
FROM Employees e1
INNER JOIN Employees e2
ON e1.ManagerID = e2.EmployeeID;
Summary:
- Inner Join: Matches records in both tables.
- Left Join: All records from the left table and matched records from the right.
- Right Join: All records from the right table and matched records from the left.
- Full Join: All records when there is a match in either left or right table.
- Cross Join: Cartesian product of both tables.
- Self Join: Table joined with itself.
2. Difference between INNER JOIN and OUTER JOIN?
INNER JOIN returns only the rows where there is a match in both tables. OUTER JOIN returns all rows from one or both tables, along with the matched rows. If there is no match, NULL values are included for non-matching rows. Types of outer join: LEFT JOIN, RIGHT JOIN, FULL JOIN.
- INNER JOIN: Only returns rows with matching data in both tables.
- LEFT JOIN: Returns all rows from the left table, with matched rows from the right table (NULL if no match).
- RIGHT JOIN: Returns all rows from the right table, with matched rows from the left table (NULL if no match).
- FULL JOIN: Returns all rows from both tables, with matched rows and NULLs where there is no match.
3. Difference between UNION and UNION ALL.
1. UNION:
- Purpose: Combines the results of two or more SELECT statements into a single result set, removing duplicates.
- Duplicates: Removes duplicate rows from the result set.
- Use Case: When you need to combine results and ensure there are no duplicate rows.
- Syntax:
SELECT column1, column2 FROM table1
UNION
SELECT column1, column2 FROM table2;
Example:
SELECT FirstName, LastName FROM Employees
UNION
SELECT FirstName, LastName FROM Managers;
SELECT departmentId
FROM employees
UNION
SELECT departmentId
FROM departments;
2. UNION ALL:
- Purpose: Combines the results of two or more SELECT statements into a single result set, including duplicates.
- Duplicates: Does not remove duplicate rows; all rows from each SELECT statement are included.
- Use Case: When you need to combine results and are okay with including duplicates for performance reasons or specific data requirements.
- Syntax:
SELECT column1, column2 FROM table1
UNION ALL
SELECT column1, column2 FROM table2;
Example:
SELECT FirstName, LastName FROM Employees
UNION ALL
SELECT FirstName, LastName FROM Managers;
SELECT departmentId
FROM employees
UNION ALL
SELECT departmentId
FROM departments;
Summary:
- UNION: Combines results and removes duplicates, ensuring unique rows in the result set. May have a performance overhead due to duplicate removal.
- UNION ALL: Combines results and includes all duplicates, which can be faster since it doesn’t check for duplicates.
4. What are the various types of relationships in SQL?
In SQL, relationships between tables are defined to maintain data integrity and facilitate querying related data. There are several types of relationships:
1. One-to-One (1:1) Relationship: Each row in Table A is linked to one and only one row in Table B, and vice versa. Example: A table for Employees
and a table for EmployeeDetails
where each employee has one set of detailed information.
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(50)
);
CREATE TABLE EmployeeDetails (
DetailID INT PRIMARY KEY,
EmployeeID INT,
Address VARCHAR(100),
FOREIGN KEY (EmployeeID) REFERENCES Employees(EmployeeID)
);
2. One-to-Many (1:N) Relationship: Each row in Table A can be linked to multiple rows in Table B, but each row in Table B is linked to only one row in Table A. Example: A table for Departments
and a table for Employees
where each department can have multiple employees.
CREATE TABLE Departments (
DepartmentID INT PRIMARY KEY,
DepartmentName VARCHAR(50)
);
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(50),
DepartmentID INT,
FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
);
3. Many-to-One (N:1) Relationship: Each row in Table B can be linked to multiple rows in Table A, but each row in Table A is linked to only one row in Table B (essentially the inverse of One-to-Many). Example: Similar to the One-to-Many example, viewed from the perspective of the Employees
table.
4. Many-to-Many (N:N) Relationship: Each row in Table A can be linked to multiple rows in Table B, and each row in Table B can be linked to multiple rows in Table A. Example: A table for Students
and a table for Courses
where each student can enroll in multiple courses and each course can have multiple students. This relationship is usually implemented using a junction table.
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
StudentName VARCHAR(50)
);
CREATE TABLE Courses (
CourseID INT PRIMARY KEY,
CourseName VARCHAR(50)
);
CREATE TABLE StudentCourses (
StudentID INT,
CourseID INT,
PRIMARY KEY (StudentID, CourseID),
FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);
5. Self-Referencing Relationship: A table has a foreign key that references its own primary key, useful for hierarchical data. Example: An Employees
table where each employee may have a manager, who is also an employee.
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(50),
ManagerID INT,
FOREIGN KEY (ManagerID) REFERENCES Employees(EmployeeID)
);
Summary:
- One-to-One: Each row in one table is linked to one row in another table.
- One-to-Many: One row in a table is linked to multiple rows in another table.
- Many-to-One: Multiple rows in one table are linked to one row in another table.
- Many-to-Many: Multiple rows in one table are linked to multiple rows in another table via a junction table.
- Self-Referencing: A table references itself, useful for hierarchical relationships.
5. Difference between Primary Key and Secondary Key?
Primary Key:
- Purpose: Uniquely identifies each record in a table.
- Uniqueness: Enforces uniqueness; no duplicate values are allowed.
- Null Values: Cannot contain NULL values.
- Definition: Typically defined at the table creation.
CREATE TABLE Employees (
EmployeeID INT PRIMARY KEY,
Name VARCHAR(50)
);
In this example, EmployeeID
is the primary key, ensuring each employee has a unique identifier.
Secondary Key:
- Purpose: Provides additional means to identify records, often used for searching or indexing.
- Uniqueness: Not necessarily unique; can contain duplicate values.
- Null Values: Can contain NULL values.
- Definition: Usually created as an index to improve query performance.
CREATE INDEX idx_LastName ON Employees(LastName);
In this example, LastName
is used as a secondary key to speed up searches based on the last name.
Summary:
- Primary Key:
- Ensures each record in a table is unique.
- Cannot be NULL.
- Defined when the table is created.
- Secondary Key:
- Used to improve query performance and searches.
- Can have duplicate values and contain NULLs.
- Typically created as an index after the table is created.
6. What is the difference between SUBQUERY and JOIN?
1. SUBQUERY:
- Definition: A query nested within another query. It can be used to return data that will be used in the main query.
- Usage: Commonly used in SELECT, INSERT, UPDATE, and DELETE statements to provide a result set for the outer query.
- Syntax:
SELECT column1
FROM table1
WHERE column2 = (SELECT column2
FROM table2
WHERE condition);
Example:
SELECT FirstName
FROM Employees
WHERE DepartmentID = ( SELECT DepartmentID
FROM Departments
WHERE DepartmentName = 'HR'
);
- Performance: Can be less efficient than joins for large datasets since it may execute the subquery for each row of the outer query.
2. JOIN:
- Definition: Combines rows from two or more tables based on a related column between them.
- Usage: Used to retrieve related data from multiple tables in a single query.
- Syntax:
SELECT a.column1, b.column2
FROM table1 a
JOIN table2 b ON a.common_column = b.common_column;
Example:
SELECT Employees.Name, Departments.DepartmentName
FROM Employees
JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
- Types of Joins: Includes INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN, etc.
- Performance: Generally more efficient than subqueries for large datasets as it allows the database to combine tables and filter rows in a single pass.
Summary:
- Subquery:
- A query within another query.
- Can be used in SELECT, INSERT, UPDATE, and DELETE.
- May be less efficient for large datasets.
- Join:
- Combines rows from multiple tables based on related columns.
- Typically more efficient for retrieving related data.
- Various types (INNER, LEFT, RIGHT, FULL).
In essence, subqueries are useful for situations where you need to perform a query on the result of another query, while joins are more efficient for combining and retrieving data from multiple tables. Each has its own use cases and can be chosen based on the specific requirements and performance considerations of the query.
Query
1. Write a SQL query to find the second highest salary of an employee.
SELECT *
FROM Employees
ORDER BY Salary DESC
LIMIT 1 OFFSET 1;
Find nth highest salary of the Employee table
SELECT DISTINCT Salary
FROM Employees
ORDER BY Salary DESC
LIMIT 1 OFFSET n-1;
With Employee information:-
WITH RankedSalaries AS (
SELECT
EmployeeID,
DENSE_RANK() OVER (ORDER BY Salary DESC) AS sal_rank
FROM Employees
)
SELECT *
FROM Employees e
WHERE e.EmployeeID IN (
SELECT EmployeeID
FROM RankedSalaries
WHERE sal_rank = 7
);
2. Write a SQL query to find the top 3 departments with the highest average salary.
SELECT DepartmentId, AVG(Salary) AS AvgSal
FROM Employees
GROUP BY DepartmentId
ORDER BY AvgSal DESC
LIMIT 3;
3. Write a SQL query to find the top 2 products with the highest sales.
SELECT p.ProductId, p.ProductName, SUM(s.Quantity) AS TotalSales
FROM Products p
JOIN Sales s ON p.ProductId = s.ProductId
GROUP BY p.ProductId, p.ProductName
ORDER BY TotalSales DESC
LIMIT 2;
4. Write a SQL query to find the top 3 products with the highest average price.
SELECT ProductId, ProductName, AVG(Price) AS AvgPrice
FROM Products
GROUP BY ProductId, ProductName
ORDER BY AvgPrice DESC
LIMIT 3;
5. How do you handle duplicate rows in a SQL query?
SELECT DISTINCT column1, column2
FROM table_name;
6. Write a SQL query to find the customers who have placed an order but have not made a payment.
SELECT DISTINCT o.CustomerId
FROM Orders o
LEFT JOIN Payments p ON o.OrderId = p.OrderId
WHERE p.PaymentId IS NULL;
7. Write a SQL query to find the employees who work in the same department as their manager.
SELECT
e.EmployeeId,
e.Firstname,
e.departmentId
FROM
Employees e
JOIN
Employees e1 ON e.departmentid = e1.departmentid
AND e.managerid = e1.employeeid;
8. Write a SQL query to find the employees who have worked for more than 5 years.
SELECT
EmployeeID,
joiningDate
FROM
Employees
WHERE
joiningDate <= DATE_SUB(CURDATE(), INTERVAL 5 YEAR);
9. Write a SQL query to find the employees who have the same name and work in the same department.
SELECT
e.EmployeeId,
e.FirstName
FROM
Employees e
JOIN
Employees e1 ON e.departmentId = e1.departmentId
AND e.FirstName = e1.FirstName
AND e.EmployeeId != e1.EmployeeId;
10. Write a SQL query to find the departments with no employees.
SELECT
d.*
FROM
Departments d
LEFT JOIN
Employees e ON d.departmentid = e.departmentid
WHERE
e.employeeId IS NULL;
11. Write a SQL query to find the employees who have not taken any leave in the last 6 months.
SELECT
e.*
FROM
Employees e
LEFT JOIN
Leaves l ON e.EmployeeID = l.EmployeeID
AND l.LeaveDate >= DATE_SUB(CURDATE(),
INTERVAL 6 MONTH)
WHERE
l.LeaveID IS NULL;
- LEFT JOIN Employees and Leaves: Links to the
Employees
table with theLeaves
table on theEmployeeID
and filter leave dates within the last 6 months. - WHERE l.LeaveID IS NULL: Ensures that only employees who have not taken any leave in the last 6 months are selected.
12. Find the highest-earning employee in each department. Show their id, name, and department.
SELECT
e.EmployeeID,
e.Firstname,
d.DepartmentName
FROM
Employees e
JOIN
Departments d ON e.departmentId = d.DepartmentId
WHERE
e.EmployeeID IN (
SELECT
EmployeeID
FROM
Employees
GROUP BY
DepartmentId
HAVING
MAX(Salary)
);
13. Write a SQL query to find the employee name, department name, and manager name.
SELECT
e1.Firstname,
d.DepartmentName,
e2.Firstname AS ManagerName
FROM
Employees e1
INNER JOIN
Employees e2 ON e1.managerId = e2.EmployeeID
JOIN
Departments d ON e1.departmentId = d.DepartmentId;
14. How to delete DUPLICATE records from a table using a SQL Query?
Deleting duplicate records from a table can be handled in several ways. Below is a common approach using a DELETE
statement with a subquery. This example assumes we have a table named Employees
and we want to delete duplicate rows based on the FirstName
and LastName
columns, while keeping the row with the lowest EmployeeID
.
Step 1: Identify Duplicates. First, identify the duplicates and decide which row to keep. Typically, you might keep the row with the lowest EmployeeID
.
SELECT
EmployeeID,
FirstName,
LastName,
COUNT(*)
FROM
Employees
GROUP BY
FirstName,
LastName
HAVING
COUNT(*) > 1;
Step 2: Delete Duplicates. Next, delete the duplicate rows. Here’s how you can do it:
DELETE e1
FROM Employees e1
INNER JOIN Employees e2
ON e1.FirstName = e2.FirstName
AND e1.LastName = e2.LastName
AND e1.EmployeeID > e2.EmployeeID;
Explanation:
- INNER JOIN Employees e2 ON e1.FirstName = e2.FirstName AND e1.LastName = e2.LastName: Joins the table to itself based on the columns that define duplicates (
FirstName
andLastName
in this case). - AND e1.EmployeeID > e2.EmployeeID: Ensures that we keep the row with the smallest
EmployeeID
by comparing each pair of duplicates. - DELETE e1: Deletes the duplicate rows.
15. How to read the TOP 5 records from a table using an SQL query?
The LIMIT
clause allows you to specify the number of records to return from the result set. To read the top 5 records from a table, you can use the following query:-
SELECT *
FROM Employees
ORDER BY Salary DESC
LIMIT 5;
Advanced SQL Concepts
1. Explain all types of window functions. (mainly rank, row_num, dense_rank, lead & lag)
Window functions in SQL are a powerful feature that allow you to perform calculations across a set of table rows that are related to the current row. Here are the main types of window functions:
1. Aggregate Functions: These functions perform a calculation on a set of values and return a single value. When used as window functions, they can provide cumulative or moving calculations.
- SUM(): Calculates the total sum of a numeric column.
SELECT DepartmentId, SUM(Salary) As DeptTotal
FROM Employees
GROUP BY DepartmentId;
SELECT FirstName, DepartmentID,
SUM(Salary) OVER (PARTITION BY DepartmentID) AS DeptTotal
FROM Employees;
- AVG(): Calculates the average value of a numeric column.
SELECT FirstName, DepartmentID,
AVG(Salary) OVER (PARTITION BY DepartmentID) AS DeptAVG
FROM Employees;
- MIN(): Finds the minimum value in a set.
SELECT FirstName, DepartmentID,
MIN(Salary) OVER (PARTITION BY DepartmentID) AS DeptMin
FROM Employees;
- MAX(): Finds the maximum value in a set.
SELECT FirstName, DepartmentID,
MAX(Salary) OVER (PARTITION BY DepartmentID) AS DeptMax
FROM Employees;
- COUNT(): Counts the number of rows.
SELECT FirstName, DepartmentID,
Count(*) OVER (PARTITION BY DepartmentID) AS DeptCount
FROM Employees;
2. Ranking Functions: These functions assign a ranking value to each row within a partition.
- ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition.
SELECT FirstName, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS RowNum
FROM Employees;
- RANK(): Assigns a rank to each row within a partition, with gaps for ties.
SELECT FirstName, DepartmentID, Salary,
RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS SalRank
FROM Employees;
- DENSE_RANK(): Assigns a rank to each row within a partition, without gaps for ties.
SELECT FirstName, DepartmentID, Salary,
DENSE_RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS DenseSalRank
FROM Employees;
- NTILE(): Divides rows into a specified number of approximately equal groups.
SELECT FirstName, DepartmentID, Salary,
NTILE(4) OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS NtileGroup
FROM Employees;
3. Value Functions: These functions allow you to access data from other rows within the same result set.
- LAG(): Provides access to a value from a previous row in the result set.
SELECT FirstName, Salary,
LAG(Salary, 1) OVER (ORDER BY Salary) AS PrevSalary
FROM Employees;
- LEAD(): Provides access to a value from a subsequent row in the result set.
SELECT FirstName, Salary,
LEAD(Salary, 1) OVER (ORDER BY Salary) AS NextSalary
FROM Employees;
- FIRST_VALUE(): Returns the first value in an ordered set of values.
SELECT FirstName, Salary,
FIRST_VALUE(Salary) OVER (ORDER BY Salary) AS FirstSalary
FROM Employees;
- LAST_VALUE(): Returns the last value in an ordered set of values.
SELECT FirstName, Salary,
LAST_VALUE(Salary) OVER (ORDER BY Salary) AS LastSalary
FROM Employees;
4. Cumulative Distribution Functions: These functions calculate the distribution of values in a result set.
- CUME_DIST(): Calculates the cumulative distribution of a value within a set of values.
SELECT FirstName, Salary,
CUME_DIST() OVER (ORDER BY Salary) AS CumeDist
FROM Employees;
- PERCENT_RANK(): Calculates the relative rank of a row within a partition.
SELECT FirstName, Salary,
PERCENT_RANK() OVER (ORDER BY Salary) AS CumeDist
FROM Employees;
Summary:
- Aggregate Functions:
SUM()
,AVG()
,MIN()
,MAX()
,COUNT()
- Ranking Functions:
ROW_NUMBER()
,RANK()
,DENSE_RANK()
,NTILE()
- Value Functions:
LAG()
,LEAD()
,FIRST_VALUE()
,LAST_VALUE()
- Cumulative Distribution Functions:
CUME_DIST()
,PERCENT_RANK()
1.a) RANK() vs DENSE_RANK()?
Both of these functions are used to assign ranks to rows in a result set based on the values of a specified column, but they handle ties differently.
Example Table: Scores
Student | Score |
---|---|
A | 90 |
B | 90 |
C | 85 |
D | 80 |
E | 80 |
F | 75 |
The RANK()
function assigns a unique rank to each row within a partition of the result set. If there are ties, the same rank is assigned to the tied values, but the next rank is incremented by the number of ties.
SELECT Student, Score, RANK() OVER (ORDER BY Score DESC) AS Rank
FROM Scores;
Student | Score | Rank |
---|---|---|
A | 90 | 1 |
B | 90 | 1 |
C | 85 | 3 |
D | 80 | 4 |
E | 80 | 4 |
F | 75 | 6 |
The DENSE_RANK()
function is similar to RANK()
, but it does not leave gaps between the ranks when there are ties.
SELECT Student, Score, DENSE_RANK() OVER (ORDER BY Score DESC) AS DenseRank
FROM Scores;
Student | Score | DenseRank |
---|---|---|
A | 90 | 1 |
B | 90 | 1 |
C | 85 | 2 |
D | 80 | 3 |
E | 80 | 3 |
F | 75 | 4 |
Key Differences:
- RANK():
- Leaves gaps in the ranking when there are ties.
- E.g., Scores of 90 are both ranked 1, but the next score (85) is ranked 3.
- DENSE_RANK():
- No gaps in the ranking.
- E.g., Scores of 90 are both ranked 1, and the next score (85) is ranked 2.
2. How do you use window functions to solve complex queries?
Window functions are a powerful tool in SQL that help you solve complex queries involving running totals, moving averages, cumulative sums, rankings, and more, all without the need for subqueries or self-joins. Here’s how you can use window functions to address different complex query scenarios:
1. Running Total: Calculate a running total of sales for each employee.
SELECT EmployeeID, SaleDate, SalesAmount,
SUM(SalesAmount) OVER (PARTITION BY EmployeeID ORDER BY SaleDate) AS RunningTotal
FROM Sales;
This query partitions the sales by employee and orders them by the sale date to calculate the cumulative sum.
2. Moving Average: Calculate a 3-day moving average of sales for each product.
SELECT ProductID, SaleDate, SalesAmount,
AVG(SalesAmount) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg
FROM Sales;
This calculates the average sales amount over the current row and the preceding two rows for each product.
3. Ranking: Rank employees within their departments based on their sales.
SELECT EmployeeID, DepartmentID, SalesAmount,
RANK() OVER (PARTITION BY DepartmentID ORDER BY SalesAmount DESC) AS SalesRank
FROM EmployeeSales;
This assigns a rank to each employee within their department based on their sales amount.
4. First and Last Values: Get the first and last sales dates for each employee.
SELECT EmployeeID, SaleDate, SalesAmount,
FIRST_VALUE(SaleDate) OVER (PARTITION BY EmployeeID ORDER BY SaleDate) AS FirstSaleDate,
LAST_VALUE(SaleDate) OVER (PARTITION BY EmployeeID ORDER BY SaleDate
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastSaleDate
FROM Sales;
This retrieves the first and last sale dates for each employee, considering all sales dates.
5. Cumulative Distribution: Calculate the cumulative distribution of salaries within a department.
SELECT EmployeeID, DepartmentID, Salary,
CUME_DIST() OVER (PARTITION BY DepartmentID ORDER BY Salary) AS CumulativeDistribution
FROM Employees;
This calculates the cumulative distribution of salaries within each department, showing the relative position of each salary.
6. Percent Rank: Determine the percentile rank of employees’ salaries within their department.
SELECT EmployeeID, DepartmentID, Salary,
PERCENT_RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary) AS PercentRank
FROM Employees;
This calculates the percentile rank of each employee’s salary within their department.
Summary:
- Running Total: Use
SUM()
for cumulative sums. - Moving Average: Use
AVG()
with a range frame. - Ranking: Use
RANK()
,ROW_NUMBER()
, orDENSE_RANK()
. - First and Last Values: Use
FIRST_VALUE()
andLAST_VALUE()
. - Cumulative Distribution: Use
CUME_DIST()
. - Percent Rank: Use
PERCENT_RANK()
.
3. Which is faster between CTE and Subquery?
The performance difference between Common Table Expressions (CTEs) and subqueries can vary depending on the specific context and database engine. Here’s a comparison to help understand when each might be faster or more appropriate:
Common Table Expressions (CTEs):
- Readability: CTEs often make complex queries easier to read and maintain.
- Performance: In some cases, CTEs can be optimized similarly to subqueries. However, they may not always be as efficient as using temporary tables or indexes for complex operations.
- Reusability: CTEs can be defined once and referenced multiple times within the same query.
- Execution: Some database engines optimize CTEs by materializing them (storing the result temporarily), while others treat them like inline views.
Subqueries:
- Performance: Subqueries, especially correlated subqueries, can sometimes be less efficient than joins or CTEs because they may execute multiple times. However, this depends on the specific query and database engine.
- Readability: Inline subqueries can make queries harder to read and maintain compared to CTEs.
- Execution: Subqueries can be optimized differently by the database engine, and in some cases, they might be more efficient than CTEs for simple, non-correlated scenarios.
Using a CTE:
WITH SalesCTE AS (
SELECT EmployeeID, SUM(SalesAmount) AS TotalSales
FROM Sales
GROUP BY EmployeeID
)
SELECT Employees.Name, SalesCTE.TotalSales
FROM Employees
JOIN SalesCTE ON Employees.EmployeeID = SalesCTE.EmployeeID;
Using a Subquery:
SELECT Employees.Name, (
SELECT SUM(SalesAmount)
FROM Sales
WHERE Sales.EmployeeID = Employees.EmployeeID
) AS TotalSales
FROM Employees;
Summary:
- CTEs: Generally better for readability and maintainability, potentially reusable within the same query.
- Subqueries: May be more efficient in simple, non-correlated scenarios, but can be less readable and maintainable.
Ultimately, the performance depends on the specific query, the data involved, and the database engine’s optimization capabilities. It’s often a good practice to test both approaches and analyze the execution plans to determine which performs better for your particular use case.
4. How do you use stored procedures to improve SQL query performance?
Stored procedures can enhance SQL query performance in several key ways. Let’s dive into some strategies and benefits specific to MySQL:
1. Precompiled Execution: Stored procedures are compiled once and stored in the database, meaning they can be executed more quickly than dynamic SQL because the database engine doesn’t have to parse and compile the query each time it’s run. Example:
DELIMITER //
CREATE PROCEDURE GetEmployeeDetails()
BEGIN
SELECT * FROM Employees;
END //
DELIMITER ;
2. Reduced Network Traffic: Instead of sending multiple SQL statements across the network, a single call to a stored procedure can be made. This reduces network overhead and speeds up the overall performance. Example:
CALL GetEmployeeDetails();
3. Code Reusability and Maintenance: Stored procedures allow for code reuse and centralize logic in the database, making it easier to maintain and update queries. Changes made to the stored procedure are automatically reflected wherever the procedure is called. Example:
DELIMITER //
CREATE PROCEDURE UpdateEmployeeSalary(IN EmployeeID INT, IN NewSalary DECIMAL(10,2))
BEGIN
UPDATE Employees
SET Salary = NewSalary
WHERE EmployeeID = EmployeeID;
END //
DELIMITER ;
4. Enhanced Security: Stored procedures can provide an additional layer of security by controlling access to data. Users can be granted permission to execute the stored procedure without granting them direct access to the underlying tables. Example:
GRANT EXECUTE ON PROCEDURE UpdateEmployeeSalary TO UserName;
5. Improved Performance through Execution Plans: Stored procedures use cached execution plans, which means the database engine can reuse previously generated plans for subsequent executions, leading to faster performance. Example:
DELIMITER //
CREATE PROCEDURE GetEmployeeByDepartment(IN DepartmentID INT)
BEGIN
SELECT * FROM Employees
WHERE DepartmentID = DepartmentID;
END //
DELIMITER ;
6. Better Error Handling: Stored procedures can include comprehensive error handling mechanisms to ensure that exceptions are managed efficiently, improving the robustness of the application. Example:
DELIMITER //
CREATE PROCEDURE SafeUpdateEmployeeSalary(IN EmployeeID INT, IN NewSalary DECIMAL(10,2))
BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
-- Error handling code
ROLLBACK;
END;
START TRANSACTION;
UPDATE Employees
SET Salary = NewSalary
WHERE EmployeeID = EmployeeID;
COMMIT;
END //
DELIMITER ;
7. Modular Programming: Benefit: Stored procedures support modular programming by allowing complex operations to be broken down into smaller, reusable procedures. This enhances code readability and maintainability. Example:-
DELIMITER //
CREATE PROCEDURE CalculateBonus(IN EmployeeID INT, OUT Bonus DECIMAL(10,2))
BEGIN
DECLARE Salary DECIMAL(10,2);
SELECT Salary INTO Salary FROM Employees WHERE EmployeeID = EmployeeID;
SET Bonus = Salary * 0.10;
END //
DELIMITER ;
Summary:
- Precompiled Execution: Faster execution due to precompilation.
- Reduced Network Traffic: Less network overhead.
- Code Reusability: Centralized and reusable logic.
- Enhanced Security: Controlled data access.
- Cached Execution Plans: Faster performance through plan reuse.
- Error Handling: Robust error management.
- Modular Programming: Improved code organization.
5. Explain View concepts.
A view in SQL is a virtual table that is based on the result set of a SQL query. It does not store data physically but provides a way to simplify complex queries and enhance security by restricting access to specific data.
Characteristics:
- Virtual Table: Acts like a table but does not store data itself.
- Derived from Query: Created using a SELECT statement.
- Read-Only or Updatable: Views can be read-only or allow updates depending on the SQL database and the view’s definition.
Creating a View: To create a view, use the CREATE VIEW
statement followed by the view name and the SELECT query.
CREATE VIEW ViewName AS
SELECT column1, column2, ...
FROM table
WHERE condition;
Example:
CREATE VIEW EmployeeView AS
SELECT EmployeeID, Name, DepartmentName
FROM Employees
JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
Using a View: You can query a view just like a regular table.
SELECT * FROM EmployeeView;
Benefits:
- Simplifies Complex Queries: Abstracts complex joins and filters into a single view.
- Enhances Security: Restricts access to specific columns or rows.
- Data Consistency: Ensures consistent query results without duplicating data.
- Eases Maintenance: Centralizes query logic, making it easier to update.
Updating Data through a View: Views can sometimes be updatable, allowing you to modify data in the underlying tables.
UPDATE EmployeeView
SET DepartmentName = 'Marketing'
WHERE EmployeeID = 101;
Restrictions:
- Not all views are updatable. Restrictions depend on the SQL database and the complexity of the view.
- Views with complex joins, aggregate functions, or DISTINCT clauses are typically read-only.
Dropping a View: To remove a view, use the DROP VIEW
statement.
DROP VIEW ViewName;
Summary:
- Views: Virtual tables created from SQL queries.
- Creation: Defined using
CREATE VIEW
. - Usage: Queried like regular tables for simplified access and security.
- Benefits: Simplifies complex queries, enhances security, and ensures data consistency.
- Updating: Some views are updatable; others are read-only.
- Maintenance: Centralized query logic eases updates and management.
6. What are triggers in SQL?
A trigger in SQL is a set of instructions that automatically execute (or “fire”) in response to specific events on a particular table or view. Triggers can be used to enforce business rules, maintain data integrity, and audit changes.
Types of Triggers:
- Before Triggers:
- Fires before an insert, update, or delete operation on a table.
- Usage: Often used to validate or modify data before it is written to the table.
CREATE TRIGGER BeforeInsertEmployee
BEFORE INSERT ON Employees
FOR EACH ROW
BEGIN
IF NEW.Salary < 0 THEN
SET NEW.Salary = 0;
END IF;
END;
- After Triggers:
- Fires after an insert, update, or delete operation on a table.
- Usage: Often used to update related tables, enforce referential integrity, or log changes.
CREATE TRIGGER AfterInsertEmployee
AFTER INSERT ON Employees
FOR EACH ROW
BEGIN
INSERT INTO EmployeeLog(EmployeeID, Action)
VALUES (NEW.EmployeeID, 'INSERT');
END;
Trigger Timing:
- BEFORE: Executes before the triggering event.
- AFTER: Executes after the triggering event.
Trigger Events:
- INSERT: Fires when a new row is inserted.
- UPDATE: Fires when a row is updated.
- DELETE: Fires when a row is deleted.
Trigger Levels:
- Row-Level Triggers: Execute once for each row affected by the triggering event.
- Statement-Level Triggers: Execute once for the entire statement, regardless of the number of rows affected.
Creating a Trigger: To create a trigger, use the CREATE TRIGGER
statement.
CREATE TRIGGER trigger_name
BEFORE|AFTER INSERT|UPDATE|DELETE
ON table_name
FOR EACH ROW
BEGIN
-- Trigger logic here
END;
Example Triggers:
- Before Insert Trigger:
DELIMITER //
CREATE TRIGGER BeforeInsertEmployee
BEFORE INSERT ON Employees
FOR EACH ROW
BEGIN
IF NEW.Salary < 0 THEN
SET NEW.Salary = 0;
END IF;
END //
DELIMITER ;
- After Insert Trigger:
DELIMITER //
CREATE TRIGGER AfterInsertEmployee
AFTER INSERT ON Employees
FOR EACH ROW
BEGIN
INSERT INTO EmployeeLog(EmployeeID, Action)
VALUES (NEW.EmployeeID, 'INSERT');
END //
DELIMITER ;
Advantages of Triggers:
- Automation: Automatically enforce business rules and data integrity.
- Consistency: Ensure that related changes are handled uniformly.
- Auditing: Track changes and maintain historical records.
Disadvantages of Triggers:
- Complexity: Can make the system harder to debug and maintain.
- Performance Overhead: May introduce performance overhead if not used carefully.
Summary:
- Triggers: Automated actions that fire in response to specific table events.
- Types: Before and after triggers, each can fire on insert, update, or delete events.
- Levels: Row-level (per row) and statement-level (per statement).
- Advantages: Enforce rules, maintain consistency, and track changes.
- Disadvantages: Can increase complexity and impact performance.
7. What is the stored procedure in SQL?
A stored procedure in SQL is a precompiled set of one or more SQL statements that are stored under a name and processed as a unit. Stored procedures allow for efficient, reusable, and secure execution of SQL code within a database.
Key Features of Stored Procedures:
- Precompiled Execution: Stored procedures are compiled once and stored in the database. This precompilation means they can execute faster than ad-hoc SQL queries because the database does not need to parse and compile the query each time.
- Reusability: Stored procedures can be reused across multiple programs and applications, promoting code reuse and reducing redundancy.
- Security: They can encapsulate complex logic and restrict direct access to the underlying tables. Permission can be granted to execute the procedure without allowing access to the actual data.
- Maintainability: Changes to business logic are centralized within the stored procedure, making it easier to update and maintain without altering the application code.
- Network Traffic Reduction: Reduces network traffic by executing multiple SQL statements in a single call to the database.
Creating a Stored Procedure: To create a stored procedure, you use the CREATE PROCEDURE
statement followed by the procedure name and the SQL statements it encompasses.
DELIMITER //
CREATE PROCEDURE GetEmployeeDetails()
BEGIN
SELECT * FROM Employees;
END //
DELIMITER ;
In this example, GetEmployeeDetails
is a stored procedure that retrieves all records from the Employees
table.
Executing a Stored Procedure: To execute a stored procedure, you use the CALL
statement.
CALL GetEmployeeDetails();
This call runs the GetEmployeeDetails
procedure, which retrieves and displays the employee data.
Parameters in Stored Procedures: Stored procedures can accept parameters to make them more dynamic.
DELIMITER //
CREATE PROCEDURE GetEmployeeByID(IN EmployeeID INT)
BEGIN
SELECT * FROM Employees
WHERE EmployeeID = EmployeeID;
END //
DELIMITER ;
In this example, the GetEmployeeByID
procedure takes an input parameter EmployeeID
and returns the details of that specific employee.
Advantages of Stored Procedures:
- Performance: Precompiled and cached execution plans improve performance.
- Modularity: Break down complex operations into simpler, reusable procedures.
- Security: Protects database integrity by controlling access to data.
- Consistency: Ensures consistent implementation of business logic.
8. Difference between the Function and Store procedure?
The function performs calculations and returns a single value or a table. It can be used in SELECT, WHERE, and other clauses. Syntax in MySQL:-
DELIMITER //
CREATE FUNCTION FunctionName(param datatype)
RETURNS datatype
DETERMINISTIC
BEGIN
-- Function logic
RETURN value;
END //
DELIMITER ;
Example:-
DELIMITER //
CREATE FUNCTION CalculateSquare (Number INT)
RETURNS INT
DETERMINISTIC
BEGIN
RETURN Number * Number;
END //
DELIMITER ;
Calling the function:- You can call this function in a SELECT statement or wherever expressions are allowed.
SELECT CalculateSquare(5) AS SquareResult;
Stored Procedures perform a series of operations like insert, update, delete, and more complex logic. It was executed using the EXEC command in SQL server/Oracle, and CALL in MySQL. Syntax in MySQL:-
DELIMITER //
CREATE PROCEDURE ProcedureName()
BEGIN
-- Procedure logic
END //
DELIMITER ;
Example:-
DELIMITER //
CREATE PROCEDURE UpdateEmployeeSalary(IN EmployeeID INT, IN NewSalary DECIMAL(10, 2))
BEGIN
UPDATE Employees
SET Salary = NewSalary
WHERE EmployeeID = EmployeeID;
END //
DELIMITER ;
Executing the stored procedure in MySQL:-
CALL UpdateEmployeeSalary(1, 70000.00);
9. What is a Cursor?
A cursor in SQL is a database object used to retrieve, manipulate, and navigate through a result set row by row. It provides a way to process individual rows returned by a query, which is useful for performing operations that require row-by-row processing rather than set-based processing.
Key Concepts
- Declare a Cursor: Define the cursor and the SQL query that it will execute.
- Open the Cursor: Execute the query and populate the cursor with the result set.
- Fetch Rows: Retrieve individual rows from the cursor.
- Close the Cursor: Release the cursor and associated resources.
- Deallocate the Cursor: Remove the cursor definition from memory.
Here’s an example of how to use a cursor in SQL to iterate through employees and update their salaries based on certain conditions. MySQL Example
-- Declare variables
DECLARE done INT DEFAULT 0;
DECLARE emp_id INT;
DECLARE emp_salary DECIMAL(10, 2);
-- Declare the cursor
DECLARE emp_cursor CURSOR FOR
SELECT EmployeeID, Salary FROM Employees;
-- Declare a handler for the end of the cursor
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
-- Open the cursor
OPEN emp_cursor;
-- Fetch the first row
FETCH emp_cursor INTO emp_id, emp_salary;
-- Loop through all rows
WHILE done = 0 DO
-- Perform operations (example: increase salary by 10%)
UPDATE Employees SET Salary = emp_salary * 1.10 WHERE EmployeeID = emp_id;
-- Fetch the next row
FETCH emp_cursor INTO emp_id, emp_salary;
END WHILE;
-- Close the cursor
CLOSE emp_cursor;
Explanation:
- Declare Variables: Used to store the current row’s data.
- Declare Cursor: Define the cursor with a SELECT query.
- Declare Handler: Handle the condition when the cursor reaches the end.
- Open Cursor: Execute the query and prepare to fetch rows.
- Fetch Rows: Retrieve rows one by one within a loop.
- Perform Operations: Perform any row-by-row operations.
- Close Cursor: Release resources associated with the cursor.
Use Cases
- Iterating Over Rows: When you need to process each row individually.
- Complex Logic: Applying complex business logic that can’t be done in a single SQL statement.
- Row-by-Row Operations: Operations that depend on previous rows’ data.
While cursors are powerful, they can be less efficient than set-based operations. It’s often recommended to use them sparingly and opt for set-based SQL operations when possible.
10. How exceptions can be handled in MySQL?
In MySQL, exception handling can be managed using DECLARE ... HANDLER
statements within stored procedures or functions. These handlers allow you to specify actions to be taken when specific conditions or errors occur. Example Usage:- Here’s an example of how to use exception handling in a stored procedure:
Creating a Stored Procedure with Exception Handling
DELIMITER $$
CREATE PROCEDURE UpdateSalary(
IN emp_id INT,
IN new_salary DECIMAL(10, 2)
)
BEGIN
-- Declare a variable for errors
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
-- Handle the exception
ROLLBACK;
SELECT 'An error occurred while updating the salary';
END;
-- Start transaction
START TRANSACTION;
-- Update the employee's salary
UPDATE Employees
SET Salary = new_salary
WHERE EmployeeID = emp_id;
-- Commit the transaction
COMMIT;
-- Success message
SELECT 'Salary updated successfully';
END $$
DELIMITER ;
Explanation
- DELIMITER $$: Changes the statement delimiter to
$$
to allow for the creation of multi-statement procedures. - DECLARE EXIT HANDLER FOR SQLEXCEPTION: Declares a handler for SQL exceptions. If an error occurs, the specified block of code will be executed.
- BEGIN … END: Defines the block of code to execute when an error occurs. Here, it performs a
ROLLBACK
and returns an error message. - START TRANSACTION: Begins a new transaction.
- UPDATE: The operation that could potentially cause an error.
- COMMIT: Commits the transaction if no errors occur.
- SELECT ‘Salary updated successfully’: Returns a success message if the transaction is committed successfully.
Types of Handlers
- CONTINUE HANDLER: Continues execution after handling the exception.
- EXIT HANDLER: Exits the procedure after handling the exception.
Example of CONTINUE HANDLER
DELIMITER $$
CREATE PROCEDURE TestHandler()
BEGIN
DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
BEGIN
SELECT 'Handled an exception';
END;
-- Intentionally cause an error
DROP TABLE NonExistentTable;
-- Execution continues here after handling the error
SELECT 'This will still run';
END $$
DELIMITER ;
In this example, even after the error caused by trying to drop a non-existent table, the procedure continues execution and returns the second message.
Use Cases
- Transaction Management: Ensuring that transactions are correctly handled, rolled back, or committed.
- Custom Error Messages: Providing user-friendly error messages.
- Debugging: Capturing and logging errors for debugging purposes.
If you enjoyed this post, share it with your friends. Do you want to share more information about the topic discussed above or do you find anything incorrect? Let us know in the comments. Thank you!