Most Asked SQL Interview Questions

Most Asked SQL Interview Questions | We have listed mostly asked SQL interview questions on SQL. Let us first create schema and sample data that can be used to cover most of the SQL questions listed. The schema includes tables for employees, departments, customers, orders, products, and salaries.

Table of Contents

SQL Fundamentals
SQL Constraints and Data Types
SQL Joins and Relationships
Query
Advanced SQL Concepts

-- Creating Employee Table
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,
    FirstName VARCHAR(50),
    LastName VARCHAR(50),
    DepartmentID INT,
    Salary DECIMAL(10, 2),
    JoiningDate DATE,
    ManagerID INT
);

-- Creating Department Table
CREATE TABLE Departments (
    DepartmentID INT PRIMARY KEY,
    DepartmentName VARCHAR(50)
);

CREATE TABLE Leaves (
    LeaveID INT PRIMARY KEY,
    EmployeeID INT,
    LeaveDate DATE,
    FOREIGN KEY (EmployeeID) REFERENCES Employees(EmployeeID)
);

-- Inserting Data into Employees Table
INSERT INTO Employees VALUES (1, 'John', 'Doe', 1, 60000, '2018-01-10', NULL);
INSERT INTO Employees VALUES (2, 'Jane', 'Smith', 2, 75000, '2017-03-15', 1);
INSERT INTO Employees VALUES (3, 'Sam', 'Brown', 1, 55000, '2020-06-20', 1);
INSERT INTO Employees VALUES (4, 'Lisa', 'White', 3, 90000, '2016-09-05', NULL);
INSERT INTO Employees VALUES (5, 'Mark', 'Green', 2, 80000, '2019-11-12', 2);
INSERT INTO Employees VALUES (6, 'Emily', 'Clark', 1, 67000, '2019-01-05', 1);
INSERT INTO Employees VALUES (7, 'Daniel', 'Lewis', 3, 72000, '2021-05-10', 4);
INSERT INTO Employees VALUES (8, 'Sophia', 'Wilson', 4, 98000, '2015-12-15', 6);
INSERT INTO Employees VALUES (9, 'James', 'Johnson', NULL, 65000, '2020-10-10', 7);
INSERT INTO Employees VALUES (10, 'Olivia', 'Jones', 2, 82000, '2018-03-23', 2);
INSERT INTO Employees VALUES (11, 'Zara', 'Ali', 5, 45000, '2021-07-11', 8);
INSERT INTO Employees VALUES (12, 'Amit', 'Sharma', 6, 50000, '2022-05-22', 3);

-- Inserting Data into Departments Table
INSERT INTO Departments VALUES (1, 'HR');
INSERT INTO Departments VALUES (2, 'Finance');
INSERT INTO Departments VALUES (3, 'Engineering');
INSERT INTO Departments VALUES (4, 'Marketing');
INSERT INTO Departments VALUES (5, 'Sales');
INSERT INTO Departments VALUES (6, 'Operations');
INSERT INTO Departments VALUES (7, 'Logistics');

INSERT INTO Leaves VALUES (1, 1, '2024-05-20');
INSERT INTO Leaves VALUES (2, 2, '2024-03-15');
INSERT INTO Leaves VALUES (3, 3, '2024-02-20');
INSERT INTO Leaves VALUES (4, 4, '2024-07-10');
INSERT INTO Leaves VALUES (5, 5, '2024-04-25');
INSERT INTO Leaves VALUES (6, 6, '2024-06-05');
INSERT INTO Leaves VALUES (7, 7, '2024-08-12');
INSERT INTO Leaves VALUES (8, 8, '2024-01-10');
INSERT INTO Leaves VALUES (9, 9, '2024-09-15');
INSERT INTO Leaves VALUES (10, 10, '2024-11-05');
INSERT INTO Leaves VALUES (11, 11, '2024-12-01');
INSERT INTO Leaves VALUES (12, 12, '2024-10-10');

CREATE TABLE Products (
    ProductId INT PRIMARY KEY,
    ProductName VARCHAR(100),
    Price DECIMAL(10, 2)
);

CREATE TABLE Sales (
    SaleId INT PRIMARY KEY,
    ProductId INT,
    Quantity INT,
    FOREIGN KEY (ProductId) REFERENCES Products(ProductId)
);

CREATE TABLE Orders (
    OrderId INT PRIMARY KEY,
    CustomerId INT,
    OrderDate DATE
);

CREATE TABLE Payments (
    PaymentId INT PRIMARY KEY,
    OrderId INT,
    PaymentDate DATE,
    FOREIGN KEY (OrderId) REFERENCES Orders(OrderId)
);

INSERT INTO Products (ProductId, ProductName, Price) VALUES (1, 'Product A', 10.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (2, 'Product B', 20.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (3, 'Product C', 30.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (4, 'Product D', 40.00);
INSERT INTO Products (ProductId, ProductName, Price) VALUES (5, 'Product E', 50.00);

INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (1, 1, 100);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (2, 2, 200);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (3, 3, 150);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (4, 4, 80);
INSERT INTO Sales (SaleId, ProductId, Quantity) VALUES (5, 5, 120);

INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (1, 101, '2023-01-01');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (2, 102, '2023-01-02');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (3, 103, '2023-01-03');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (4, 104, '2023-01-04');
INSERT INTO Orders (OrderId, CustomerId, OrderDate) VALUES (5, 105, '2023-01-05');

INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (1, 1, '2023-01-06');
INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (2, 2, '2023-01-07');
INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (3, 3, '2023-01-08');
INSERT INTO Payments (PaymentId, OrderId, PaymentDate) VALUES (4, 4, '2023-01-09');

SQL Fundamentals

1. Explain the order of execution of SQL.

The order of execution of SQL statements can be a bit tricky because it’s not always intuitive. However, understanding it can significantly enhance your ability to write efficient queries. Here’s a breakdown of the typical order of execution for a SQL query:

FROM Clause: The first step is to process the FROM clause to determine the tables involved and perform any necessary joins.
WHERE Clause: Next, the WHERE clause filters the rows based on the specified conditions.
GROUP BY Clause: After filtering, the GROUP BY clause groups the remaining rows based on the specified columns.
HAVING Clause: The HAVING clause then filters the groups created in the previous step.
SELECT Clause: The SELECT clause processes the columns and expressions that should be returned in the result set.
DISTINCT Clause: If used, the DISTINCT clause removes duplicate rows from the result set.
ORDER BY Clause: The ORDER BY clause sorts the result set based on the specified columns and directions.
LIMIT/OFFSET Clause: Finally, the LIMIT or OFFSET clause limits the number of rows returned.

SELECT DISTINCT DepartmentID, AVG(Salary)
FROM Employees
WHERE Salary > 50000
GROUP BY DepartmentID
HAVING AVG(Salary) > 60000
ORDER BY DepartmentID
LIMIT 10;

FROM Employees: Identify the table to be queried.
WHERE Salary > 50000: Filter rows where Salary is greater than 50000.
GROUP BY DepartmentID: Group the filtered rows by DepartmentID.
HAVING AVG(Salary) > 60000: Filter groups where the average Salary is greater than 60000.
SELECT DISTINCT DepartmentID, AVG(Salary): Select and compute the required columns, ensuring no duplicate rows.
ORDER BY DepartmentID: Sort the result set by DepartmentID.
LIMIT 10: Limit the result set to 10 rows.

2. What is the difference between WHERE and HAVING?

The WHERE clause is used to filter rows before any groupings are made. It’s applied to individual rows in the database table.

SELECT * 
FROM Employees 
WHERE Salary > 50000;

The HAVING clause is used to filter groups after the GROUP BY operation has been applied. It’s used to filter the aggregated results. The HAVING clause is primarily used with aggregate functions (like SUM, COUNT, AVG, etc.).

SELECT DepartmentID, AVG(Salary) 
FROM Employees 
GROUP BY DepartmentID 
HAVING AVG(Salary) > 60000;

3. What is the use of GROUP BY?

The GROUP BY clause in SQL is used to arrange identical data into groups. It is often used in conjunction with aggregate functions (such as COUNT, SUM, AVG, MAX, MIN) to perform calculations on each group of data. Use Cases:

Aggregate Data: Summarize data by certain categories.
Data Analysis: Analyze patterns within groups.
Reporting: Generate reports with grouped data insights.

This query groups employees by DepartmentID and calculates the average salary for each department:-

SELECT DepartmentID, AVG(Salary)
FROM Employees
GROUP BY DepartmentID;

4. What is the difference between GROUP BY and WHERE Clause?

The GROUP BY clause groups rows with the same values in specified columns. It is used with aggregate functions.

SELECT DepartmentID, AVG(Salary)
FROM Employees
GROUP BY DepartmentID;

The WHERE clause filters rows before grouping. It specified conditions for individual rows.

SELECT *
FROM Employees
WHERE Salary > 50000;

5. What is the difference between DML, DDL, and DCL?

1. Data Manipulation Language (DML):

Purpose: Used for managing data within schema objects.
Common Commands: SELECT, INSERT, UPDATE, DELETE.
Example:

  INSERT INTO Employees (EmployeeID, FirstName, LastName) VALUES (1, 'John', 'Doe');

2. Data Definition Language (DDL):

Purpose: Used for defining and modifying database structures or schema.
Common Commands: CREATE, ALTER, DROP, TRUNCATE.
Example:

  CREATE TABLE Employees (
      EmployeeID INT PRIMARY KEY,
      FirstName VARCHAR(50),
      LastName VARCHAR(50)
  );

3. Data Control Language (DCL):

Purpose: Used for controlling access to data in the database.
Common Commands: GRANT, REVOKE.
Example:

  GRANT SELECT ON Employees TO userName;

Summary:

DML: Manipulates the data.
DDL: Defines the structure.
DCL: Controls access.

6. How many types of clauses in SQL?

SQL (Structured Query Language) includes several types of clauses, each serving a different purpose in structuring and executing queries. Here are the key types of SQL clauses:

SELECT Clause: Specifies the columns to be retrieved.

   SELECT column1, column2 FROM table;

FROM Clause: Specifies the table(s) from which to retrieve data.

   SELECT * FROM table;

WHERE Clause: Filters records based on specified conditions.

   SELECT * FROM table WHERE condition;

GROUP BY Clause: Groups rows sharing a property so aggregate functions can be applied to each group.

   SELECT column, COUNT(*)
   FROM table
   GROUP BY column;

HAVING Clause: Filters groups based on conditions, used with GROUP BY.

   SELECT column, COUNT(*)
   FROM table
   GROUP BY column
   HAVING COUNT(*) > 1;

ORDER BY Clause: Sorts the result set of a query by one or more columns.

   SELECT * FROM table
   ORDER BY column ASC|DESC;

LIMIT/OFFSET Clause: Limits the number of rows returned by a query.

   SELECT * FROM table
   LIMIT number OFFSET number;

JOIN Clause: Combines rows from two or more tables based on a related column.

   SELECT columns
   FROM table1
   JOIN table2 ON table1.column = table2.column;

7. Differences between GROUP BY and PARTITION BY?

GROUP BY

Purpose: Used to aggregate data across multiple rows by one or more columns.
Functionality: Combines rows that have the same values into summary rows, like SUM, AVG, COUNT, etc.
Usage: Commonly used with aggregate functions to generate summary data.

SELECT DepartmentId, SUM(Salary)
FROM Employees
GROUP BY DepartmentId;

This query groups the employees by their DepartmentId and calculates the total salary for each department.

PARTITION BY

Purpose: Used with window functions to perform calculations across a set of table rows that are somehow related to the current row.
Functionality: Does not reduce the number of rows returned. Instead, it adds a column that contains aggregate values calculated over a partition.
Usage: Commonly used with functions like ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LAG(), LEAD(), etc.

SELECT EmployeeId, DepartmentId, Salary,
       SUM(Salary) OVER (PARTITION BY DepartmentId) AS TotalDepartmentSalary
FROM Employees;

This query calculates the total salary for each department but does not group the rows. It adds a new column TotalDepartmentSalary that contains the total salary for the department of each employee.

Key Differences

Reduction of Rows:
- GROUP BY: Aggregates data and reduces the number of rows.
- PARTITION BY: Adds aggregate data without reducing the number of rows.
Usage Context:
- GROUP BY: Used for general data aggregation in summary reports.
- PARTITION BY: Used for analytical purposes to provide detailed insights within groups of data.

8. Explain the use of CASE statements in SQL.

The CASE statement in SQL is a powerful tool used to implement conditional logic directly within SQL queries. It’s similar to if-else statements in programming languages. You can use the CASE statement to create custom output, perform conditional aggregations, and more. Syntax:-

CASE 
    WHEN condition1 THEN result1
    WHEN condition2 THEN result2
    ...
    ELSE resultN
END

Categorizing Employees Based on Salary

SELECT 
    EmployeeID,
    FirstName,
    LastName,
    Salary,
    CASE
        WHEN Salary < 50000 THEN 'Low Salary'
        WHEN Salary >= 50000 AND Salary <= 80000 THEN 'Medium Salary'
        ELSE 'High Salary'
    END AS SalaryCategory
FROM 
    Employees;

Handling Null Values: You can also use the CASE statement to handle NULL values:

SELECT 
    EmployeeID,
    FirstName,
    LastName,
    ManagerID,
    CASE
        WHEN ManagerID IS NULL THEN 'No Manager'
        ELSE CAST(ManagerID AS VARCHAR)
    END AS ManagerStatus
FROM 
    Employees;

Conditional Aggregations: The CASE statement can also be used for conditional aggregations. For example, calculating the total salary for different departments:

SELECT 
    DepartmentID,
    SUM(CASE WHEN Salary > 60000 THEN Salary ELSE 0 END) AS TotalHighSalaries,
    SUM(CASE WHEN Salary <= 60000 THEN Salary ELSE 0 END) AS TotalLowSalaries
FROM 
    Employees
GROUP BY 
    DepartmentID;

SQL Constraints and Data Types

1. What are constraints and types of Constraints?

In SQL, constraints are rules applied to table columns to ensure the integrity, accuracy, and reliability of the data within the database. Constraints can be applied during table creation or after the table has been created.

Types of Constraints:

Primary Key Constraint: Ensures that each row in a table has a unique identifier. Example:

     CREATE TABLE Employees (
         EmployeeID INT PRIMARY KEY,
         FirstName VARCHAR(50),
         LastName VARCHAR(50)
     );

Foreign Key Constraint: Ensures the referential integrity of the data by linking two tables. Example:

     CREATE TABLE Orders (
         OrderID INT PRIMARY KEY,
         EmployeeID INT,
         CONSTRAINT fk_Employee
         FOREIGN KEY (EmployeeID) REFERENCES Employees(EmployeeID)
     );

Unique Constraint: Ensures all values in a column or a set of columns are unique. Example:

     CREATE TABLE Products (
         ProductID INT PRIMARY KEY,
         ProductName VARCHAR(50) UNIQUE
     );

Not Null Constraint: Ensures that a column cannot have a NULL value. Example:

     CREATE TABLE Customers (
         CustomerID INT PRIMARY KEY,
         CustomerName VARCHAR(50) NOT NULL
     );

Check Constraint: Ensures that all values in a column satisfy a specific condition. Example:

     CREATE TABLE Employees (
         EmployeeID INT PRIMARY KEY,
         Age INT CHECK (Age >= 18)
     );

Default Constraint: Provides a default value for a column when no value is specified. Example:

     CREATE TABLE Orders (
         OrderID INT PRIMARY KEY,
         OrderDate DATE DEFAULT GETDATE()
     );

Summary:

Primary Key: Unique identifier for each row.
Foreign Key: Links two tables.
Unique: Ensures unique values in a column.
Not Null: Ensures no NULL values.
Check: Ensures column values meet a condition.
Default: Sets a default value for a column.

2. What is the difference between “Primary Key” and “Unique Key”?

We can have only one Primary Key in a table whereas we can have more than one Unique Key in a table.
The Primary Key cannot have a NULL value whereas a Unique Key may have only one null value.
By default, a Primary Key is a Clustered Index whereas by default, a Unique Key is a unique non-clustered index.
A Primary Key supports an Auto Increment value whereas a Unique Key doesn’t support an Auto Increment value.

3. Different types of Operators?

Type of Operator	Operators	Example	Explanation
Arithmetic	`+`, `-`, `*`, `/`, `%`	`SELECT 5 + 2;`	Performs basic arithmetic operations.
Comparison	`=`, `<>`/`!=`, `>`, `<`, `>=`, `<=`	`SELECT * FROM Employees WHERE Salary > 50000;`	Compares two values.
Logical	`AND`, `OR`, `NOT`	`SELECT * FROM Employees WHERE Salary > 50000 AND DepartmentID = 1;`	Combines multiple conditions.
Bitwise	`&`, `\|`, `^`, `~`, `<<`, `>>`	`SELECT 5 & 3;`	Performs operations on bits of integer values.
Other	`BETWEEN`, `IN`, `LIKE`, `IS NULL`, `IS NOT NULL`	`SELECT * FROM Employees WHERE Salary BETWEEN 40000 AND 60000;`	Performs various specialized operations.

-- 1. Arithmetic Operators
-- Addition (+)
SELECT 5 + 2; -- Result: 7
-- Subtraction (-)
SELECT 5 - 2; -- Result: 3
-- Multiplication (*)
SELECT 5 * 2; -- Result: 10
-- Division (/)
SELECT 5 / 2; -- Result: 2.5
-- Modulo (%)
SELECT 5 % 2; -- Result: 1

-- 2. Comparison Operators
-- Equal to (=)
SELECT * FROM Employees WHERE Salary = 50000;
-- Not equal to (<> or !=)
SELECT * FROM Employees WHERE Salary <> 50000;
-- Greater than (>)
SELECT * FROM Employees WHERE Salary > 50000;
-- Less than (<)
SELECT * FROM Employees WHERE Salary < 50000;
-- Greater than or equal to (>=)
SELECT * FROM Employees WHERE Salary >= 50000;
-- Less than or equal to (<=)
SELECT * FROM Employees WHERE Salary <= 50000;

-- 3. Logical Operators
-- AND
SELECT * FROM Employees WHERE Salary > 50000 AND DepartmentID = 1;
-- OR
SELECT * FROM Employees WHERE Salary > 50000 OR DepartmentID = 1;
-- NOT
SELECT * FROM Employees WHERE NOT Salary = 50000;

-- 4. Bitwise Operators
-- Bitwise AND (&)
SELECT 5 & 3; -- Result: 1
-- Bitwise OR (|)
SELECT 5 | 3; -- Result: 7
-- Bitwise XOR (^)
SELECT 5 ^ 3; -- Result: 6
-- Bitwise NOT (~)
SELECT ~5; -- Result: -6
-- Bitwise Shift Left (<<)
SELECT 5 << 1; -- Result: 10
-- Bitwise Shift Right (>>)
SELECT 5 >> 1; -- Result: 2

-- 5. Other Operators
-- BETWEEN
SELECT * FROM Employees WHERE Salary BETWEEN 40000 AND 60000;
-- IN
SELECT * FROM Employees WHERE DepartmentID IN (1, 2, 3);
-- LIKE
SELECT * FROM Employees WHERE FirstName LIKE 'J%';
-- IS NULL
SELECT * FROM Employees WHERE LastName IS NULL;
-- IS NOT NULL
SELECT * FROM Employees WHERE LastName IS NOT NULL;

4. Difference between CHAR and VARCHAR?

1. CHAR:

Fixed-Length: Allocates a fixed amount of memory, regardless of the actual data length.
Storage Efficiency: Less efficient for varying-length data; better for consistent-length data.
Performance: Generally faster for read operations because of fixed length.
Padding: Pads extra spaces to the right of the value to meet the defined length.

  CREATE TABLE Employees (
      EmployeeCode CHAR(10)
  );

If you store “12345” in a CHAR(10), it will be stored as "12345_ _ _ _ _" (with five trailing space characters).

2. VARCHAR:

Variable-Length: Allocates only the memory needed for the actual data length, plus one or two bytes for length storage.
Storage Efficiency: More efficient for varying-length data.
Performance: Slightly slower for read operations compared to CHAR due to length calculations.
Padding: Does not pad extra spaces; stores the exact length of the value.

  CREATE TABLE Employees (
      EmployeeName VARCHAR(50)
  );

If you store “John” in a VARCHAR(50), it will be stored just as “John”.

Summary:

CHAR: Best for fixed-length data, may cause wasted space if data length varies, generally faster for consistent-length data.
VARCHAR: Best for varying-length data, more storage-efficient, but can be slightly slower due to extra length calculations.

5. What is an index? Explain its different types.

An index in SQL is a database object that improves the speed of data retrieval operations on a table by providing quick access to the rows. Think of it as a book’s index, which helps you quickly find the information you need without having to read every page.

Types of Indexes:

Primary Key Index:

Automatically Created: When a primary key is defined.
Purpose: Ensures that the primary key column has unique values and can be used to quickly access rows.

Unique Index:

Ensures Uniqueness: Prevents duplicate values in a column.
Purpose: Used to enforce unique constraints on columns other than the primary key.

Clustered Index:

Physical Order: Reorders the physical storage of the table’s data to match the index.
One Per Table: A table can have only one clustered index.

     CREATE CLUSTERED INDEX idx_EmployeeID ON Employees(EmployeeID);

Non-Clustered Index:

Logical Order: Maintains a separate structure from the data rows, with pointers to the physical data.
Multiple Per Table: A table can have multiple non-clustered indexes.

     CREATE INDEX idx_LastName ON Employees(LastName);

Full-Text Index:

Text Search: Used for full-text searches in text-based columns.
Purpose: Efficiently searches large text fields, like VARCHAR or TEXT columns.

     CREATE FULLTEXT INDEX idx_FullText ON Articles(Content);

Composite Index:

Multiple Columns: An index on multiple columns.
Purpose: Used to improve performance on queries involving multiple columns in the WHERE clause.

     CREATE INDEX idx_Composite ON Employees(LastName, FirstName);

Summary:

Primary Key Index: Ensures primary key uniqueness and quick access.
Unique Index: Ensures column value uniqueness.
Clustered Index: Reorders physical storage.
Non-Clustered Index: Logical ordering with pointers.
Full-Text Index: Efficient text searches.
Composite Index: Multiple column indexing.

6. How do you use indexing to improve SQL query performance?

Indexing can significantly enhance SQL query performance by reducing the amount of data that needs to be scanned to find the required information. Here’s how you can use indexing effectively:

Key Strategies for Using Indexing:

Identify High-Frequency Queries: Analyze which queries are most commonly executed and create indexes on the columns frequently used in WHERE, JOIN, ORDER BY, and GROUP BY clauses.
Choose the Right Index Type: Use Clustered Indexes for primary keys or columns with unique values, as they store the actual data rows in the index. Use Non-Clustered Indexes for columns used in search conditions and sorting, as they provide quick lookups without altering the physical order of data.
Index Selective Columns: Create indexes on columns with high selectivity (i.e., columns where the majority of values are unique or not repetitive). This increases the efficiency of searches.
Use Composite Indexes: When queries filter on multiple columns, create composite indexes to cover those columns. Ensure the columns in the composite index match the order of the columns in the query’s WHERE clause. CREATE INDEX idx_Composite ON Employees(LastName, FirstName);
Avoid Over-Indexing: While indexes improve read performance, they can slow down write operations (INSERT, UPDATE, DELETE) because the index needs to be updated. Balance the number of indexes to optimize both read and write operations.
Regular Maintenance: Periodically rebuild and reorganize indexes to remove fragmentation, which can degrade performance over time. ALTER INDEX idx_Composite ON Employees REBUILD;
Covering Indexes: Use covering indexes (indexes that include all the columns needed by a query) to avoid lookups to the actual table. CREATE INDEX idx_Covering ON Employees (LastName, FirstName, DepartmentID);
Monitor and Analyze Performance: Use database performance monitoring tools to track index usage and query performance. Adjust indexes based on the analysis. EXPLAIN SELECT * FROM Employees WHERE LastName = 'Smith';

Here’s an example of creating a non-clustered index on a column often used in search conditions:

CREATE INDEX idx_LastName ON Employees(LastName);

7. What is the difference between TRUNCATE and DELETE?

1. TRUNCATE:

Purpose: Used to quickly remove all rows from a table, effectively resetting it to an empty state.
It permanently deletes all rows from a table.
Operation Type: DDL (Data Definition Language).
Performance: Generally faster than DELETE because it does not log individual row deletions.
Transaction Control: Cannot be rolled back in most databases; treated as a DDL command.
Triggers: Does not fire triggers.
Syntax: TRUNCATE TABLE tableName;

Example:

  TRUNCATE TABLE Employees;

2. DELETE:

Purpose: Used to remove specific rows from a table based on a condition.
Operation Type: DML (Data Manipulation Language).
Performance: Slower than TRUNCATE for large tables because it logs individual row deletions.
Transaction Control: Can be rolled back if enclosed in a transaction block.
Triggers: Fires triggers.
Syntax: DELETE FROM tableName WHERE condition;

Example:

  DELETE FROM Employees WHERE DepartmentID = 1;

8. How would you optimize a slow SQL query?

Optimizing a slow SQL query can involve several strategies to enhance its performance. Here are some common techniques:

Analyzing the Query Plan (Use EXPLAIN): Use the EXPLAIN command to understand how your query is executed.

EXPLAIN SELECT * FROM Employees WHERE DepartmentID = 1;

Indexing (Create Indexes): Ensure that columns used in WHERE, JOIN, ORDER BY, and GROUP BY clauses have appropriate indexes.

  CREATE INDEX idx_DepartmentID ON Employees(DepartmentID);

Optimize Joins (Use Proper Joins): Ensure that you use the most efficient join type for your query.

  SELECT e.Name, d.DepartmentName
  FROM Employees e
  INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID;

Avoiding SELECT * (Specify Columns): Select only the columns you need instead of using SELECT *.

  SELECT Name, DepartmentName
  FROM Employees
  WHERE DepartmentID = 1;

Filtering Early (Use WHERE Clauses): Filter data as early as possible in the query.

  SELECT Name, DepartmentName
  FROM Employees
  WHERE DepartmentID = 1;

Optimizing Subqueries (Use Joins or CTEs): Replace subqueries with joins or Common Table Expressions (CTEs) if possible.

  -- Using a join instead of a subquery
  SELECT e.Name, d.DepartmentName
  FROM Employees e
  INNER JOIN Departments d ON e.DepartmentID = d.DepartmentID
  WHERE d.DepartmentName = 'Sales';

Limiting Results (Use LIMIT): If you need only a subset of the results, use the LIMIT clause.

  SELECT Name, DepartmentName
  FROM Employees
  WHERE DepartmentID = 1
  LIMIT 10;

Analyzing and Updating Statistics (Update Statistics): Ensure that the database statistics are up-to-date for the optimizer to make informed decisions.

  ANALYZE TABLE Employees;

Avoiding Functions in WHERE Clause (Use Indexed Columns): Avoid using functions on columns in the WHERE clause as it can prevent index usage.

-- Instead of this
SELECT * FROM Employees WHERE YEAR(JoiningDate) = 2021;

-- Use this
SELECT * FROM Employees 
WHERE JoiningDate BETWEEN '2021-01-01' AND '2021-12-31';

Partitioning Large Tables (Partition Tables): Break down large tables into smaller, more manageable pieces for faster query performance.

Difference between ISNULL() and IFNULL()?

ISNULL(expression): Checks if an expression is NULL and returns 1 if true, 0 otherwise.
IFNULL(expression, replacement): Replaces NULL values with a specified replacement value.

SELECT 
    EmployeeID, 
    FirstName,
    ISNULL(ManagerID) AS IsManagerIDNull
FROM 
    Employees;

SELECT 
    EmployeeID, 
    FirstName,
    IFNULL(ManagerID, 'No Manager') AS ManagerID
FROM 
    Employees;

SQL Joins and Relationships

1. Explain all types of joins in SQL.

1. Inner Join: Returns records that have matching values in both tables. Syntax:

  SELECT a.*, b.*
  FROM TableA a
  INNER JOIN TableB b ON a.CommonColumn = b.CommonColumn;

Example:

SELECT e.firstname, d.departmentname
FROM employees e
INNER JOIN departments d 
ON e.departmentid = d.departmentid;

2. Left (Outer) Join: Returns all records from the left table and the matched records from the right table. Unmatched records from the left table are also included. Syntax:

  SELECT a.*, b.*
  FROM TableA a
  LEFT JOIN TableB b ON a.CommonColumn = b.CommonColumn;

Example:

SELECT e.firstname, d.departmentname
FROM employees e
LEFT JOIN departments d 
ON e.departmentid = d.departmentid;

3. Right (Outer) Join: Returns all records from the right table and the matched records from the left table. Unmatched records from the right table are also included. Syntax:

  SELECT a.*, b.*
  FROM TableA a
  RIGHT JOIN TableB b ON a.CommonColumn = b.CommonColumn;

Example:

SELECT e.firstname, d.departmentname
FROM employees e
RIGHT JOIN departments d 
ON e.departmentid = d.departmentid;

4. Full (Outer) Join: Returns all records when there is a match in either left or right table. Unmatched records from both tables are included. Syntax:

  SELECT a.*, b.*
  FROM TableA a
  FULL OUTER JOIN TableB b ON a.CommonColumn = b.CommonColumn;

Example:

  SELECT Employees.Name, Departments.DepartmentName
  FROM Employees
  FULL OUTER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

MySQL does not support the FULL OUTER JOIN clause. Most modern relational database management systems (RDBMS) support FULL OUTER JOIN, including: PostgreSQL, SQL Server, Oracle, SQLite (with a workaround using LEFT JOIN and UNION). However, as mentioned earlier, MySQL does not support FULL OUTER JOIN directly. You can achieve the same result using a combination of LEFT JOIN and UNION.

-- Simulating FULL OUTER JOIN in MySQL

SELECT a.column1, b.column2
FROM table1 a
LEFT JOIN table2 b ON a.common_column = b.common_column

UNION

SELECT a.column1, b.column2
FROM table1 a
RIGHT JOIN table2 b ON a.common_column = b.common_column;

5. Cross Join: Returns the Cartesian product of both tables, meaning it returns all possible combinations of rows. Syntax:

  SELECT a.*, b.*
  FROM TableA a
  CROSS JOIN TableB b;

Example:

SELECT e.firstname, d.departmentname
FROM employees e
CROSS JOIN departments d;

6. Self Join: A self join is a regular join but the table is joined with itself. Syntax:

  SELECT a.*, b.*
  FROM TableA a
  INNER JOIN TableA b ON a.CommonColumn = b.CommonColumn;

Example: SQL Query will show the employees’ and managers’ names.

SELECT e1.FirstName AS EmployeeName, e2.FirstName AS ManagerName 
FROM Employees e1 
INNER JOIN Employees e2 
ON e1.ManagerID = e2.EmployeeID;

Summary:

Inner Join: Matches records in both tables.
Left Join: All records from the left table and matched records from the right.
Right Join: All records from the right table and matched records from the left.
Full Join: All records when there is a match in either left or right table.
Cross Join: Cartesian product of both tables.
Self Join: Table joined with itself.

2. Difference between INNER JOIN and OUTER JOIN?

INNER JOIN returns only the rows where there is a match in both tables. OUTER JOIN returns all rows from one or both tables, along with the matched rows. If there is no match, NULL values are included for non-matching rows. Types of outer join: LEFT JOIN, RIGHT JOIN, FULL JOIN.

INNER JOIN: Only returns rows with matching data in both tables.
LEFT JOIN: Returns all rows from the left table, with matched rows from the right table (NULL if no match).
RIGHT JOIN: Returns all rows from the right table, with matched rows from the left table (NULL if no match).
FULL JOIN: Returns all rows from both tables, with matched rows and NULLs where there is no match.

3. Difference between UNION and UNION ALL.

1. UNION:

Purpose: Combines the results of two or more SELECT statements into a single result set, removing duplicates.
Duplicates: Removes duplicate rows from the result set.
Use Case: When you need to combine results and ensure there are no duplicate rows.
Syntax:

  SELECT column1, column2 FROM table1
  UNION
  SELECT column1, column2 FROM table2;

Example:

  SELECT FirstName, LastName FROM Employees
  UNION
  SELECT FirstName, LastName FROM Managers;

SELECT departmentId 
FROM employees 
UNION
SELECT departmentId 
FROM departments;

2. UNION ALL:

Purpose: Combines the results of two or more SELECT statements into a single result set, including duplicates.
Duplicates: Does not remove duplicate rows; all rows from each SELECT statement are included.
Use Case: When you need to combine results and are okay with including duplicates for performance reasons or specific data requirements.
Syntax:

  SELECT column1, column2 FROM table1
  UNION ALL
  SELECT column1, column2 FROM table2;

Example:

  SELECT FirstName, LastName FROM Employees
  UNION ALL
  SELECT FirstName, LastName FROM Managers;

SELECT departmentId 
FROM employees 
UNION ALL
SELECT departmentId 
FROM departments;

Summary:

UNION: Combines results and removes duplicates, ensuring unique rows in the result set. May have a performance overhead due to duplicate removal.
UNION ALL: Combines results and includes all duplicates, which can be faster since it doesn’t check for duplicates.

4. What are the various types of relationships in SQL?

In SQL, relationships between tables are defined to maintain data integrity and facilitate querying related data. There are several types of relationships:

1. One-to-One (1:1) Relationship: Each row in Table A is linked to one and only one row in Table B, and vice versa. Example: A table for Employees and a table for EmployeeDetails where each employee has one set of detailed information.

  CREATE TABLE Employees (
      EmployeeID INT PRIMARY KEY,
      Name VARCHAR(50)
  );

  CREATE TABLE EmployeeDetails (
      DetailID INT PRIMARY KEY,
      EmployeeID INT,
      Address VARCHAR(100),
      FOREIGN KEY (EmployeeID) REFERENCES Employees(EmployeeID)
  );

2. One-to-Many (1:N) Relationship: Each row in Table A can be linked to multiple rows in Table B, but each row in Table B is linked to only one row in Table A. Example: A table for Departments and a table for Employees where each department can have multiple employees.

  CREATE TABLE Departments (
      DepartmentID INT PRIMARY KEY,
      DepartmentName VARCHAR(50)
  );

  CREATE TABLE Employees (
      EmployeeID INT PRIMARY KEY,
      Name VARCHAR(50),
      DepartmentID INT,
      FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID)
  );

3. Many-to-One (N:1) Relationship: Each row in Table B can be linked to multiple rows in Table A, but each row in Table A is linked to only one row in Table B (essentially the inverse of One-to-Many). Example: Similar to the One-to-Many example, viewed from the perspective of the Employees table.

4. Many-to-Many (N:N) Relationship: Each row in Table A can be linked to multiple rows in Table B, and each row in Table B can be linked to multiple rows in Table A. Example: A table for Students and a table for Courses where each student can enroll in multiple courses and each course can have multiple students. This relationship is usually implemented using a junction table.

  CREATE TABLE Students (
      StudentID INT PRIMARY KEY,
      StudentName VARCHAR(50)
  );

  CREATE TABLE Courses (
      CourseID INT PRIMARY KEY,
      CourseName VARCHAR(50)
  );

  CREATE TABLE StudentCourses (
      StudentID INT,
      CourseID INT,
      PRIMARY KEY (StudentID, CourseID),
      FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
      FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
  );

5. Self-Referencing Relationship: A table has a foreign key that references its own primary key, useful for hierarchical data. Example: An Employees table where each employee may have a manager, who is also an employee.

  CREATE TABLE Employees (
      EmployeeID INT PRIMARY KEY,
      Name VARCHAR(50),
      ManagerID INT,
      FOREIGN KEY (ManagerID) REFERENCES Employees(EmployeeID)
  );

Summary:

One-to-One: Each row in one table is linked to one row in another table.
One-to-Many: One row in a table is linked to multiple rows in another table.
Many-to-One: Multiple rows in one table are linked to one row in another table.
Many-to-Many: Multiple rows in one table are linked to multiple rows in another table via a junction table.
Self-Referencing: A table references itself, useful for hierarchical relationships.

5. Difference between Primary Key and Secondary Key?

Primary Key:

Purpose: Uniquely identifies each record in a table.
Uniqueness: Enforces uniqueness; no duplicate values are allowed.
Null Values: Cannot contain NULL values.
Definition: Typically defined at the table creation.

  CREATE TABLE Employees (
      EmployeeID INT PRIMARY KEY,
      Name VARCHAR(50)
  );

In this example, EmployeeID is the primary key, ensuring each employee has a unique identifier.

Secondary Key:

Purpose: Provides additional means to identify records, often used for searching or indexing.
Uniqueness: Not necessarily unique; can contain duplicate values.
Null Values: Can contain NULL values.
Definition: Usually created as an index to improve query performance.

  CREATE INDEX idx_LastName ON Employees(LastName);

In this example, LastName is used as a secondary key to speed up searches based on the last name.

Summary:

Primary Key:
- Ensures each record in a table is unique.
- Cannot be NULL.
- Defined when the table is created.
Secondary Key:
- Used to improve query performance and searches.
- Can have duplicate values and contain NULLs.
- Typically created as an index after the table is created.

6. What is the difference between SUBQUERY and JOIN?

1. SUBQUERY:

Definition: A query nested within another query. It can be used to return data that will be used in the main query.
Usage: Commonly used in SELECT, INSERT, UPDATE, and DELETE statements to provide a result set for the outer query.
Syntax:

  SELECT column1
  FROM table1
  WHERE column2 = (SELECT column2
                   FROM table2
                   WHERE condition);

Example:

SELECT FirstName
FROM Employees
WHERE DepartmentID = ( SELECT DepartmentID
		       FROM Departments
                       WHERE DepartmentName = 'HR'
);

Performance: Can be less efficient than joins for large datasets since it may execute the subquery for each row of the outer query.

2. JOIN:

Definition: Combines rows from two or more tables based on a related column between them.
Usage: Used to retrieve related data from multiple tables in a single query.
Syntax:

  SELECT a.column1, b.column2
  FROM table1 a
  JOIN table2 b ON a.common_column = b.common_column;

Example:

  SELECT Employees.Name, Departments.DepartmentName
  FROM Employees
  JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

Types of Joins: Includes INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN, etc.
Performance: Generally more efficient than subqueries for large datasets as it allows the database to combine tables and filter rows in a single pass.

Summary:

Subquery:
- A query within another query.
- Can be used in SELECT, INSERT, UPDATE, and DELETE.
- May be less efficient for large datasets.
Join:
- Combines rows from multiple tables based on related columns.
- Typically more efficient for retrieving related data.
- Various types (INNER, LEFT, RIGHT, FULL).

In essence, subqueries are useful for situations where you need to perform a query on the result of another query, while joins are more efficient for combining and retrieving data from multiple tables. Each has its own use cases and can be chosen based on the specific requirements and performance considerations of the query.

Query

1. Write a SQL query to find the second highest salary of an employee.

SELECT *
FROM Employees
ORDER BY Salary DESC
LIMIT 1 OFFSET 1;

Find nth highest salary of the Employee table

SELECT DISTINCT Salary
FROM Employees
ORDER BY Salary DESC
LIMIT 1 OFFSET n-1;

With Employee information:-

WITH RankedSalaries AS (
    SELECT 
        EmployeeID,
        DENSE_RANK() OVER (ORDER BY Salary DESC) AS sal_rank
    FROM Employees
)
SELECT *
FROM Employees e
WHERE e.EmployeeID IN (
    SELECT EmployeeID
    FROM RankedSalaries
    WHERE sal_rank = 7
);

2. Write a SQL query to find the top 3 departments with the highest average salary.

SELECT DepartmentId, AVG(Salary) AS AvgSal
FROM Employees
GROUP BY DepartmentId
ORDER BY AvgSal DESC
LIMIT 3;

3. Write a SQL query to find the top 2 products with the highest sales.

SELECT p.ProductId, p.ProductName, SUM(s.Quantity) AS TotalSales
FROM Products p
JOIN Sales s ON p.ProductId = s.ProductId
GROUP BY p.ProductId, p.ProductName
ORDER BY TotalSales DESC
LIMIT 2;

4. Write a SQL query to find the top 3 products with the highest average price.

SELECT ProductId, ProductName, AVG(Price) AS AvgPrice
FROM Products
GROUP BY ProductId, ProductName
ORDER BY AvgPrice DESC
LIMIT 3;

5. How do you handle duplicate rows in a SQL query?

SELECT DISTINCT column1, column2
FROM table_name;

6. Write a SQL query to find the customers who have placed an order but have not made a payment.

SELECT DISTINCT o.CustomerId
FROM Orders o
LEFT JOIN Payments p ON o.OrderId = p.OrderId
WHERE p.PaymentId IS NULL;

7. Write a SQL query to find the employees who work in the same department as their manager.

SELECT 
    e.EmployeeId, 
    e.Firstname, 
    e.departmentId
FROM 
    Employees e
JOIN 
    Employees e1 ON e.departmentid = e1.departmentid 
               AND e.managerid = e1.employeeid;

8. Write a SQL query to find the employees who have worked for more than 5 years.

SELECT 
    EmployeeID, 
    joiningDate 
FROM 
    Employees 
WHERE 
    joiningDate <= DATE_SUB(CURDATE(), INTERVAL 5 YEAR);

9. Write a SQL query to find the employees who have the same name and work in the same department.

SELECT 
    e.EmployeeId, 
    e.FirstName
FROM 
    Employees e
JOIN 
    Employees e1 ON e.departmentId = e1.departmentId 
               AND e.FirstName = e1.FirstName 
               AND e.EmployeeId != e1.EmployeeId;

10. Write a SQL query to find the departments with no employees.

SELECT 
    d.*
FROM 
    Departments d 
LEFT JOIN 
    Employees e ON d.departmentid = e.departmentid
WHERE 
    e.employeeId IS NULL;

11. Write a SQL query to find the employees who have not taken any leave in the last 6 months.

SELECT 
    e.*
FROM 
    Employees e
LEFT JOIN 
    Leaves l ON e.EmployeeID = l.EmployeeID 
           AND l.LeaveDate >= DATE_SUB(CURDATE(), 
                     INTERVAL 6 MONTH)
WHERE 
    l.LeaveID IS NULL;

LEFT JOIN Employees and Leaves: Links to the Employees table with the Leaves table on the EmployeeID and filter leave dates within the last 6 months.
WHERE l.LeaveID IS NULL: Ensures that only employees who have not taken any leave in the last 6 months are selected.

12. Find the highest-earning employee in each department. Show their id, name, and department.

SELECT 
    e.EmployeeID,
    e.Firstname, 
    d.DepartmentName
FROM 
    Employees e
JOIN 
    Departments d ON e.departmentId = d.DepartmentId
WHERE 
    e.EmployeeID IN (
        SELECT 
            EmployeeID 
        FROM 
            Employees 
        GROUP BY 
            DepartmentId
        HAVING 
            MAX(Salary)
    );

13. Write a SQL query to find the employee name, department name, and manager name.

SELECT 
    e1.Firstname, 
    d.DepartmentName, 
    e2.Firstname AS ManagerName
FROM 
    Employees e1
INNER JOIN 
    Employees e2 ON e1.managerId = e2.EmployeeID
JOIN 
    Departments d ON e1.departmentId = d.DepartmentId;

14. How to delete DUPLICATE records from a table using a SQL Query?

Deleting duplicate records from a table can be handled in several ways. Below is a common approach using a DELETE statement with a subquery. This example assumes we have a table named Employees and we want to delete duplicate rows based on the FirstName and LastName columns, while keeping the row with the lowest EmployeeID.

Step 1: Identify Duplicates. First, identify the duplicates and decide which row to keep. Typically, you might keep the row with the lowest EmployeeID.

SELECT 
    EmployeeID,
    FirstName,
    LastName,
    COUNT(*)
FROM 
    Employees
GROUP BY 
    FirstName, 
    LastName
HAVING 
    COUNT(*) > 1;

Step 2: Delete Duplicates. Next, delete the duplicate rows. Here’s how you can do it:

DELETE e1
FROM Employees e1
INNER JOIN Employees e2 
    ON e1.FirstName = e2.FirstName 
    AND e1.LastName = e2.LastName 
    AND e1.EmployeeID > e2.EmployeeID;

Explanation:

INNER JOIN Employees e2 ON e1.FirstName = e2.FirstName AND e1.LastName = e2.LastName: Joins the table to itself based on the columns that define duplicates (FirstName and LastName in this case).
AND e1.EmployeeID > e2.EmployeeID: Ensures that we keep the row with the smallest EmployeeID by comparing each pair of duplicates.
DELETE e1: Deletes the duplicate rows.

15. How to read the TOP 5 records from a table using an SQL query?

The LIMIT clause allows you to specify the number of records to return from the result set. To read the top 5 records from a table, you can use the following query:-

SELECT *
FROM Employees
ORDER BY Salary DESC
LIMIT 5;

Advanced SQL Concepts

1. Explain all types of window functions. (mainly rank, row_num, dense_rank, lead & lag)

Window functions in SQL are a powerful feature that allow you to perform calculations across a set of table rows that are related to the current row. Here are the main types of window functions:

1. Aggregate Functions: These functions perform a calculation on a set of values and return a single value. When used as window functions, they can provide cumulative or moving calculations.

SUM(): Calculates the total sum of a numeric column.

SELECT DepartmentId, SUM(Salary) As DeptTotal
FROM Employees
GROUP BY DepartmentId;

SELECT FirstName, DepartmentID, 
SUM(Salary) OVER (PARTITION BY DepartmentID) AS DeptTotal
FROM Employees;

AVG(): Calculates the average value of a numeric column.

SELECT FirstName, DepartmentID, 
AVG(Salary) OVER (PARTITION BY DepartmentID) AS DeptAVG
FROM Employees;

MIN(): Finds the minimum value in a set.

SELECT FirstName, DepartmentID, 
MIN(Salary) OVER (PARTITION BY DepartmentID) AS DeptMin
FROM Employees;

MAX(): Finds the maximum value in a set.

SELECT FirstName, DepartmentID, 
MAX(Salary) OVER (PARTITION BY DepartmentID) AS DeptMax
FROM Employees;

COUNT(): Counts the number of rows.

SELECT FirstName, DepartmentID, 
Count(*) OVER (PARTITION BY DepartmentID) AS DeptCount
FROM Employees;

2. Ranking Functions: These functions assign a ranking value to each row within a partition.

ROW_NUMBER(): Assigns a unique sequential integer to rows within a partition.

SELECT FirstName, DepartmentID, Salary,
ROW_NUMBER() OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS RowNum
FROM Employees;

RANK(): Assigns a rank to each row within a partition, with gaps for ties.

SELECT FirstName, DepartmentID, Salary,
RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS SalRank
FROM Employees;

DENSE_RANK(): Assigns a rank to each row within a partition, without gaps for ties.

SELECT FirstName, DepartmentID, Salary,
DENSE_RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS DenseSalRank
FROM Employees;

NTILE(): Divides rows into a specified number of approximately equal groups.

SELECT FirstName, DepartmentID, Salary,
NTILE(4) OVER (PARTITION BY DepartmentID ORDER BY Salary Desc) AS NtileGroup
FROM Employees;

3. Value Functions: These functions allow you to access data from other rows within the same result set.

LAG(): Provides access to a value from a previous row in the result set.

SELECT FirstName, Salary,
LAG(Salary, 1) OVER (ORDER BY Salary) AS PrevSalary
FROM Employees;

LEAD(): Provides access to a value from a subsequent row in the result set.

SELECT FirstName, Salary,
LEAD(Salary, 1) OVER (ORDER BY Salary) AS NextSalary
FROM Employees;

FIRST_VALUE(): Returns the first value in an ordered set of values.

SELECT FirstName, Salary,
FIRST_VALUE(Salary) OVER (ORDER BY Salary) AS FirstSalary
FROM Employees;

LAST_VALUE(): Returns the last value in an ordered set of values.

SELECT FirstName, Salary,
LAST_VALUE(Salary) OVER (ORDER BY Salary) AS LastSalary
FROM Employees;

4. Cumulative Distribution Functions: These functions calculate the distribution of values in a result set.

CUME_DIST(): Calculates the cumulative distribution of a value within a set of values.

SELECT FirstName, Salary,
CUME_DIST() OVER (ORDER BY Salary) AS CumeDist
FROM Employees;

PERCENT_RANK(): Calculates the relative rank of a row within a partition.

SELECT FirstName, Salary,
PERCENT_RANK() OVER (ORDER BY Salary) AS CumeDist
FROM Employees;

Summary:

Aggregate Functions: SUM(), AVG(), MIN(), MAX(), COUNT()
Ranking Functions: ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
Value Functions: LAG(), LEAD(), FIRST_VALUE(), LAST_VALUE()
Cumulative Distribution Functions: CUME_DIST(), PERCENT_RANK()

1.a) RANK() vs DENSE_RANK()?

Both of these functions are used to assign ranks to rows in a result set based on the values of a specified column, but they handle ties differently.

Example Table: Scores

Student	Score
A	90
B	90
C	85
D	80
E	80
F	75

The RANK() function assigns a unique rank to each row within a partition of the result set. If there are ties, the same rank is assigned to the tied values, but the next rank is incremented by the number of ties.

SELECT Student, Score, RANK() OVER (ORDER BY Score DESC) AS Rank
FROM Scores;

Student	Score	Rank
A	90	1
B	90	1
C	85	3
D	80	4
E	80	4
F	75	6

The DENSE_RANK() function is similar to RANK(), but it does not leave gaps between the ranks when there are ties.

SELECT Student, Score, DENSE_RANK() OVER (ORDER BY Score DESC) AS DenseRank
FROM Scores;

Student	Score	DenseRank
A	90	1
B	90	1
C	85	2
D	80	3
E	80	3
F	75	4

Key Differences:

RANK():
- Leaves gaps in the ranking when there are ties.
- E.g., Scores of 90 are both ranked 1, but the next score (85) is ranked 3.
DENSE_RANK():
- No gaps in the ranking.
- E.g., Scores of 90 are both ranked 1, and the next score (85) is ranked 2.

2. How do you use window functions to solve complex queries?

Window functions are a powerful tool in SQL that help you solve complex queries involving running totals, moving averages, cumulative sums, rankings, and more, all without the need for subqueries or self-joins. Here’s how you can use window functions to address different complex query scenarios:

1. Running Total: Calculate a running total of sales for each employee.

SELECT EmployeeID, SaleDate, SalesAmount,
       SUM(SalesAmount) OVER (PARTITION BY EmployeeID ORDER BY SaleDate) AS RunningTotal
FROM Sales;

This query partitions the sales by employee and orders them by the sale date to calculate the cumulative sum.

2. Moving Average: Calculate a 3-day moving average of sales for each product.

SELECT ProductID, SaleDate, SalesAmount,
       AVG(SalesAmount) OVER (PARTITION BY ProductID ORDER BY SaleDate ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg
FROM Sales;

This calculates the average sales amount over the current row and the preceding two rows for each product.

3. Ranking: Rank employees within their departments based on their sales.

SELECT EmployeeID, DepartmentID, SalesAmount,
       RANK() OVER (PARTITION BY DepartmentID ORDER BY SalesAmount DESC) AS SalesRank
FROM EmployeeSales;

This assigns a rank to each employee within their department based on their sales amount.

4. First and Last Values: Get the first and last sales dates for each employee.

SELECT EmployeeID, SaleDate, SalesAmount,
       FIRST_VALUE(SaleDate) OVER (PARTITION BY EmployeeID ORDER BY SaleDate) AS FirstSaleDate,
       LAST_VALUE(SaleDate) OVER (PARTITION BY EmployeeID ORDER BY SaleDate
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS LastSaleDate
FROM Sales;

This retrieves the first and last sale dates for each employee, considering all sales dates.

5. Cumulative Distribution: Calculate the cumulative distribution of salaries within a department.

SELECT EmployeeID, DepartmentID, Salary,
       CUME_DIST() OVER (PARTITION BY DepartmentID ORDER BY Salary) AS CumulativeDistribution
FROM Employees;

This calculates the cumulative distribution of salaries within each department, showing the relative position of each salary.

6. Percent Rank: Determine the percentile rank of employees’ salaries within their department.

SELECT EmployeeID, DepartmentID, Salary,
       PERCENT_RANK() OVER (PARTITION BY DepartmentID ORDER BY Salary) AS PercentRank
FROM Employees;

This calculates the percentile rank of each employee’s salary within their department.

Summary:

Running Total: Use SUM() for cumulative sums.
Moving Average: Use AVG() with a range frame.
Ranking: Use RANK(), ROW_NUMBER(), or DENSE_RANK().
First and Last Values: Use FIRST_VALUE() and LAST_VALUE().
Cumulative Distribution: Use CUME_DIST().
Percent Rank: Use PERCENT_RANK().

3. Which is faster between CTE and Subquery?

The performance difference between Common Table Expressions (CTEs) and subqueries can vary depending on the specific context and database engine. Here’s a comparison to help understand when each might be faster or more appropriate:

Common Table Expressions (CTEs):

Readability: CTEs often make complex queries easier to read and maintain.
Performance: In some cases, CTEs can be optimized similarly to subqueries. However, they may not always be as efficient as using temporary tables or indexes for complex operations.
Reusability: CTEs can be defined once and referenced multiple times within the same query.
Execution: Some database engines optimize CTEs by materializing them (storing the result temporarily), while others treat them like inline views.

Subqueries:

Performance: Subqueries, especially correlated subqueries, can sometimes be less efficient than joins or CTEs because they may execute multiple times. However, this depends on the specific query and database engine.
Readability: Inline subqueries can make queries harder to read and maintain compared to CTEs.
Execution: Subqueries can be optimized differently by the database engine, and in some cases, they might be more efficient than CTEs for simple, non-correlated scenarios.

Using a CTE:

WITH SalesCTE AS (
    SELECT EmployeeID, SUM(SalesAmount) AS TotalSales
    FROM Sales
    GROUP BY EmployeeID
)
SELECT Employees.Name, SalesCTE.TotalSales
FROM Employees
JOIN SalesCTE ON Employees.EmployeeID = SalesCTE.EmployeeID;

Using a Subquery:

SELECT Employees.Name, (
    SELECT SUM(SalesAmount)
    FROM Sales
    WHERE Sales.EmployeeID = Employees.EmployeeID
) AS TotalSales
FROM Employees;

Summary:

CTEs: Generally better for readability and maintainability, potentially reusable within the same query.
Subqueries: May be more efficient in simple, non-correlated scenarios, but can be less readable and maintainable.

Ultimately, the performance depends on the specific query, the data involved, and the database engine’s optimization capabilities. It’s often a good practice to test both approaches and analyze the execution plans to determine which performs better for your particular use case.

4. How do you use stored procedures to improve SQL query performance?

Stored procedures can enhance SQL query performance in several key ways. Let’s dive into some strategies and benefits specific to MySQL:

1. Precompiled Execution: Stored procedures are compiled once and stored in the database, meaning they can be executed more quickly than dynamic SQL because the database engine doesn’t have to parse and compile the query each time it’s run. Example:

  DELIMITER //

  CREATE PROCEDURE GetEmployeeDetails()
  BEGIN
      SELECT * FROM Employees;
  END //

  DELIMITER ;

2. Reduced Network Traffic: Instead of sending multiple SQL statements across the network, a single call to a stored procedure can be made. This reduces network overhead and speeds up the overall performance. Example:

  CALL GetEmployeeDetails();

3. Code Reusability and Maintenance: Stored procedures allow for code reuse and centralize logic in the database, making it easier to maintain and update queries. Changes made to the stored procedure are automatically reflected wherever the procedure is called. Example:

  DELIMITER //

  CREATE PROCEDURE UpdateEmployeeSalary(IN EmployeeID INT, IN NewSalary DECIMAL(10,2))
  BEGIN
      UPDATE Employees
      SET Salary = NewSalary
      WHERE EmployeeID = EmployeeID;
  END //

  DELIMITER ;

4. Enhanced Security: Stored procedures can provide an additional layer of security by controlling access to data. Users can be granted permission to execute the stored procedure without granting them direct access to the underlying tables. Example:

  GRANT EXECUTE ON PROCEDURE UpdateEmployeeSalary TO UserName;

5. Improved Performance through Execution Plans: Stored procedures use cached execution plans, which means the database engine can reuse previously generated plans for subsequent executions, leading to faster performance. Example:

  DELIMITER //

  CREATE PROCEDURE GetEmployeeByDepartment(IN DepartmentID INT)
  BEGIN
      SELECT * FROM Employees
      WHERE DepartmentID = DepartmentID;
  END //

  DELIMITER ;

6. Better Error Handling: Stored procedures can include comprehensive error handling mechanisms to ensure that exceptions are managed efficiently, improving the robustness of the application. Example:

  DELIMITER //

  CREATE PROCEDURE SafeUpdateEmployeeSalary(IN EmployeeID INT, IN NewSalary DECIMAL(10,2))
  BEGIN
      DECLARE EXIT HANDLER FOR SQLEXCEPTION
      BEGIN
          -- Error handling code
          ROLLBACK;
      END;

      START TRANSACTION;
      UPDATE Employees
      SET Salary = NewSalary
      WHERE EmployeeID = EmployeeID;
      COMMIT;
  END //

  DELIMITER ;

7. Modular Programming: Benefit: Stored procedures support modular programming by allowing complex operations to be broken down into smaller, reusable procedures. This enhances code readability and maintainability. Example:-

  DELIMITER //

  CREATE PROCEDURE CalculateBonus(IN EmployeeID INT, OUT Bonus DECIMAL(10,2))
  BEGIN
      DECLARE Salary DECIMAL(10,2);
      SELECT Salary INTO Salary FROM Employees WHERE EmployeeID = EmployeeID;
      SET Bonus = Salary * 0.10;
  END //

  DELIMITER ;

Summary:

Precompiled Execution: Faster execution due to precompilation.
Reduced Network Traffic: Less network overhead.
Code Reusability: Centralized and reusable logic.
Enhanced Security: Controlled data access.
Cached Execution Plans: Faster performance through plan reuse.
Error Handling: Robust error management.
Modular Programming: Improved code organization.

5. Explain View concepts.

A view in SQL is a virtual table that is based on the result set of a SQL query. It does not store data physically but provides a way to simplify complex queries and enhance security by restricting access to specific data.

Characteristics:

Virtual Table: Acts like a table but does not store data itself.
Derived from Query: Created using a SELECT statement.
Read-Only or Updatable: Views can be read-only or allow updates depending on the SQL database and the view’s definition.

Creating a View: To create a view, use the CREATE VIEW statement followed by the view name and the SELECT query.

CREATE VIEW ViewName AS
SELECT column1, column2, ...
FROM table
WHERE condition;

Example:

CREATE VIEW EmployeeView AS
SELECT EmployeeID, Name, DepartmentName
FROM Employees
JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;

Using a View: You can query a view just like a regular table.

SELECT * FROM EmployeeView;

Benefits:

Simplifies Complex Queries: Abstracts complex joins and filters into a single view.
Enhances Security: Restricts access to specific columns or rows.
Data Consistency: Ensures consistent query results without duplicating data.
Eases Maintenance: Centralizes query logic, making it easier to update.

Updating Data through a View: Views can sometimes be updatable, allowing you to modify data in the underlying tables.

UPDATE EmployeeView
SET DepartmentName = 'Marketing'
WHERE EmployeeID = 101;

Restrictions:

Not all views are updatable. Restrictions depend on the SQL database and the complexity of the view.
Views with complex joins, aggregate functions, or DISTINCT clauses are typically read-only.

Dropping a View: To remove a view, use the DROP VIEW statement.

DROP VIEW ViewName;

Summary:

Views: Virtual tables created from SQL queries.
Creation: Defined using CREATE VIEW.
Usage: Queried like regular tables for simplified access and security.
Benefits: Simplifies complex queries, enhances security, and ensures data consistency.
Updating: Some views are updatable; others are read-only.
Maintenance: Centralized query logic eases updates and management.

6. What are triggers in SQL?

A trigger in SQL is a set of instructions that automatically execute (or “fire”) in response to specific events on a particular table or view. Triggers can be used to enforce business rules, maintain data integrity, and audit changes.

Types of Triggers:

Before Triggers:

Fires before an insert, update, or delete operation on a table.
Usage: Often used to validate or modify data before it is written to the table.

     CREATE TRIGGER BeforeInsertEmployee
     BEFORE INSERT ON Employees
     FOR EACH ROW
     BEGIN
         IF NEW.Salary < 0 THEN
             SET NEW.Salary = 0;
         END IF;
     END;

After Triggers:

Fires after an insert, update, or delete operation on a table.
Usage: Often used to update related tables, enforce referential integrity, or log changes.

     CREATE TRIGGER AfterInsertEmployee
     AFTER INSERT ON Employees
     FOR EACH ROW
     BEGIN
         INSERT INTO EmployeeLog(EmployeeID, Action)
         VALUES (NEW.EmployeeID, 'INSERT');
     END;

Trigger Timing:

BEFORE: Executes before the triggering event.
AFTER: Executes after the triggering event.

Trigger Events:

INSERT: Fires when a new row is inserted.
UPDATE: Fires when a row is updated.
DELETE: Fires when a row is deleted.

Trigger Levels:

Row-Level Triggers: Execute once for each row affected by the triggering event.
Statement-Level Triggers: Execute once for the entire statement, regardless of the number of rows affected.

Creating a Trigger: To create a trigger, use the CREATE TRIGGER statement.

CREATE TRIGGER trigger_name
BEFORE|AFTER INSERT|UPDATE|DELETE
ON table_name
FOR EACH ROW
BEGIN
    -- Trigger logic here
END;

Example Triggers:

Before Insert Trigger:

   DELIMITER //

   CREATE TRIGGER BeforeInsertEmployee
   BEFORE INSERT ON Employees
   FOR EACH ROW
   BEGIN
       IF NEW.Salary < 0 THEN
           SET NEW.Salary = 0;
       END IF;
   END //

   DELIMITER ;

After Insert Trigger:

   DELIMITER //

   CREATE TRIGGER AfterInsertEmployee
   AFTER INSERT ON Employees
   FOR EACH ROW
   BEGIN
       INSERT INTO EmployeeLog(EmployeeID, Action)
       VALUES (NEW.EmployeeID, 'INSERT');
   END //

   DELIMITER ;

Advantages of Triggers:

Automation: Automatically enforce business rules and data integrity.
Consistency: Ensure that related changes are handled uniformly.
Auditing: Track changes and maintain historical records.

Disadvantages of Triggers:

Complexity: Can make the system harder to debug and maintain.
Performance Overhead: May introduce performance overhead if not used carefully.

Summary:

Triggers: Automated actions that fire in response to specific table events.
Types: Before and after triggers, each can fire on insert, update, or delete events.
Levels: Row-level (per row) and statement-level (per statement).
Advantages: Enforce rules, maintain consistency, and track changes.
Disadvantages: Can increase complexity and impact performance.

7. What is the stored procedure in SQL?

A stored procedure in SQL is a precompiled set of one or more SQL statements that are stored under a name and processed as a unit. Stored procedures allow for efficient, reusable, and secure execution of SQL code within a database.

Key Features of Stored Procedures:

Precompiled Execution: Stored procedures are compiled once and stored in the database. This precompilation means they can execute faster than ad-hoc SQL queries because the database does not need to parse and compile the query each time.
Reusability: Stored procedures can be reused across multiple programs and applications, promoting code reuse and reducing redundancy.
Security: They can encapsulate complex logic and restrict direct access to the underlying tables. Permission can be granted to execute the procedure without allowing access to the actual data.
Maintainability: Changes to business logic are centralized within the stored procedure, making it easier to update and maintain without altering the application code.
Network Traffic Reduction: Reduces network traffic by executing multiple SQL statements in a single call to the database.

Creating a Stored Procedure: To create a stored procedure, you use the CREATE PROCEDURE statement followed by the procedure name and the SQL statements it encompasses.

DELIMITER //

CREATE PROCEDURE GetEmployeeDetails()
BEGIN
    SELECT * FROM Employees;
END //

DELIMITER ;

In this example, GetEmployeeDetails is a stored procedure that retrieves all records from the Employees table.

Executing a Stored Procedure: To execute a stored procedure, you use the CALL statement.

CALL GetEmployeeDetails();

This call runs the GetEmployeeDetails procedure, which retrieves and displays the employee data.

Parameters in Stored Procedures: Stored procedures can accept parameters to make them more dynamic.

DELIMITER //

CREATE PROCEDURE GetEmployeeByID(IN EmployeeID INT)
BEGIN
    SELECT * FROM Employees
    WHERE EmployeeID = EmployeeID;
END //

DELIMITER ;

In this example, the GetEmployeeByID procedure takes an input parameter EmployeeID and returns the details of that specific employee.

Advantages of Stored Procedures:

Performance: Precompiled and cached execution plans improve performance.
Modularity: Break down complex operations into simpler, reusable procedures.
Security: Protects database integrity by controlling access to data.
Consistency: Ensures consistent implementation of business logic.

8. Difference between the Function and Store procedure?

The function performs calculations and returns a single value or a table. It can be used in SELECT, WHERE, and other clauses. Syntax in MySQL:-

DELIMITER //

CREATE FUNCTION FunctionName(param datatype)
RETURNS datatype
DETERMINISTIC
BEGIN
    -- Function logic
    RETURN value;
END //

DELIMITER ;

Example:-

DELIMITER //
CREATE FUNCTION CalculateSquare (Number INT)
RETURNS INT
DETERMINISTIC
BEGIN
    RETURN Number * Number;
END //
DELIMITER ;

Calling the function:- You can call this function in a SELECT statement or wherever expressions are allowed.

SELECT CalculateSquare(5) AS SquareResult;

Stored Procedures perform a series of operations like insert, update, delete, and more complex logic. It was executed using the EXEC command in SQL server/Oracle, and CALL in MySQL. Syntax in MySQL:-

DELIMITER //

CREATE PROCEDURE ProcedureName()
BEGIN
    -- Procedure logic
END //

DELIMITER ;

Example:-

DELIMITER //

CREATE PROCEDURE UpdateEmployeeSalary(IN EmployeeID INT, IN NewSalary DECIMAL(10, 2))
BEGIN
    UPDATE Employees
    SET Salary = NewSalary
    WHERE EmployeeID = EmployeeID;
END //

DELIMITER ;

Executing the stored procedure in MySQL:-

CALL UpdateEmployeeSalary(1, 70000.00);

9. What is a Cursor?

A cursor in SQL is a database object used to retrieve, manipulate, and navigate through a result set row by row. It provides a way to process individual rows returned by a query, which is useful for performing operations that require row-by-row processing rather than set-based processing.

Key Concepts

Declare a Cursor: Define the cursor and the SQL query that it will execute.
Open the Cursor: Execute the query and populate the cursor with the result set.
Fetch Rows: Retrieve individual rows from the cursor.
Close the Cursor: Release the cursor and associated resources.
Deallocate the Cursor: Remove the cursor definition from memory.

Here’s an example of how to use a cursor in SQL to iterate through employees and update their salaries based on certain conditions. MySQL Example

-- Declare variables
DECLARE done INT DEFAULT 0;
DECLARE emp_id INT;
DECLARE emp_salary DECIMAL(10, 2);

-- Declare the cursor
DECLARE emp_cursor CURSOR FOR
SELECT EmployeeID, Salary FROM Employees;

-- Declare a handler for the end of the cursor
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

-- Open the cursor
OPEN emp_cursor;

-- Fetch the first row
FETCH emp_cursor INTO emp_id, emp_salary;

-- Loop through all rows
WHILE done = 0 DO
    -- Perform operations (example: increase salary by 10%)
    UPDATE Employees SET Salary = emp_salary * 1.10 WHERE EmployeeID = emp_id;

    -- Fetch the next row
    FETCH emp_cursor INTO emp_id, emp_salary;
END WHILE;

-- Close the cursor
CLOSE emp_cursor;

Explanation:

Declare Variables: Used to store the current row’s data.
Declare Cursor: Define the cursor with a SELECT query.
Declare Handler: Handle the condition when the cursor reaches the end.
Open Cursor: Execute the query and prepare to fetch rows.
Fetch Rows: Retrieve rows one by one within a loop.
Perform Operations: Perform any row-by-row operations.
Close Cursor: Release resources associated with the cursor.

Use Cases

Iterating Over Rows: When you need to process each row individually.
Complex Logic: Applying complex business logic that can’t be done in a single SQL statement.
Row-by-Row Operations: Operations that depend on previous rows’ data.

While cursors are powerful, they can be less efficient than set-based operations. It’s often recommended to use them sparingly and opt for set-based SQL operations when possible.

10. How exceptions can be handled in MySQL?

In MySQL, exception handling can be managed using DECLARE ... HANDLER statements within stored procedures or functions. These handlers allow you to specify actions to be taken when specific conditions or errors occur. Example Usage:- Here’s an example of how to use exception handling in a stored procedure:

Creating a Stored Procedure with Exception Handling

DELIMITER $$

CREATE PROCEDURE UpdateSalary(
    IN emp_id INT,
    IN new_salary DECIMAL(10, 2)
)
BEGIN
    -- Declare a variable for errors
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        -- Handle the exception
        ROLLBACK;
        SELECT 'An error occurred while updating the salary';
    END;

    -- Start transaction
    START TRANSACTION;

    -- Update the employee's salary
    UPDATE Employees
    SET Salary = new_salary
    WHERE EmployeeID = emp_id;

    -- Commit the transaction
    COMMIT;

    -- Success message
    SELECT 'Salary updated successfully';
END $$

DELIMITER ;

Explanation

DELIMITER $$: Changes the statement delimiter to $$ to allow for the creation of multi-statement procedures.
DECLARE EXIT HANDLER FOR SQLEXCEPTION: Declares a handler for SQL exceptions. If an error occurs, the specified block of code will be executed.
BEGIN … END: Defines the block of code to execute when an error occurs. Here, it performs a ROLLBACK and returns an error message.
START TRANSACTION: Begins a new transaction.
UPDATE: The operation that could potentially cause an error.
COMMIT: Commits the transaction if no errors occur.
SELECT ‘Salary updated successfully’: Returns a success message if the transaction is committed successfully.

Types of Handlers

CONTINUE HANDLER: Continues execution after handling the exception.
EXIT HANDLER: Exits the procedure after handling the exception.

Example of CONTINUE HANDLER

DELIMITER $$

CREATE PROCEDURE TestHandler()
BEGIN
    DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
    BEGIN
        SELECT 'Handled an exception';
    END;

    -- Intentionally cause an error
    DROP TABLE NonExistentTable;

    -- Execution continues here after handling the error
    SELECT 'This will still run';
END $$

DELIMITER ;

In this example, even after the error caused by trying to drop a non-existent table, the procedure continues execution and returns the second message.

Use Cases

Transaction Management: Ensuring that transactions are correctly handled, rolled back, or committed.
Custom Error Messages: Providing user-friendly error messages.
Debugging: Capturing and logging errors for debugging purposes.

If you enjoyed this post, share it with your friends. Do you want to share more information about the topic discussed above or do you find anything incorrect? Let us know in the comments. Thank you!

SQL Fundamentals

SQL Constraints and Data Types

SQL Joins and Relationships

Query

Advanced SQL Concepts

Leave a Comment Cancel Reply