Data Structure is a way of collecting and organizing data in such a way that we can perform operations on these data in an effective way. It is about rendering data elements in terms of some relationship, for better organization and storage.
Data structures are broadly classified into two major categories:
- Built-in Data Structures (Primitive)
- User Defined Data Structures (Complex/Abstract)
A data structure is anything that can store data. The simplest of these, provided directly by the programming language, are known as Primitive Data Structures.
| Type | Description |
|---|---|
| Integer | Used to store whole numbers. |
| Float | Used to store decimal or fractional values. |
| Character | Used to store single characters. |
| Pointer | Used to store memory addresses. |
Example of Primitive Data Organization:
- We have some data which has a player's name "Virat" and age 26.
- Here "Virat" is of String data type and 26 is of integer data type.
- We can organize this data as a record like Player record.
User-defined (abstract) data structures are used to store large and connected data.
- Arrays: Stores elements sequentially.
- Files: Used for storing data permanently.
- Lists: Linear collections of elements; stacks and queues are specialized lists.
- Stacks: Follow the LIFO principle.
- Queues: Follow the FIFO principle.
- Trees: Hierarchical structure with a root node.
- Graphs: Nodes connected via edges.
(Diagram derived from the classification chart of Data Structures)
A linked list is a sequence of data structures connected via links.
- Link: Stores data.
- Next: Points to next node.
- Head: First node.
Singly Linked List:
- Contains data and a next pointer
- Navigation is forward only
Doubly Linked List:
- Contains next and previous pointers
- Forward and backward navigation
Circular Linked List:
- Last node connects to the first node
Advantages:
- Two-way navigation (doubly linked)
- Dynamic applications
- Efficient memory management
Disadvantages:
- Restricted navigation (singly linked)
- Complex deletion
- Sequential access only
Expression notation is a way of writing mathematical expressions using operators and operands. These notations are widely used in compiler design, expression evaluation, and stack applications.
There are three main types:
- Infix Notation
- Prefix Notation (Polish Notation)
- Postfix Notation (Reverse Polish Notation)
Infix Notation:
- Operator is written between operands
- Most commonly used form in mathematics
Format: Operand Operator Operand
A + B
3 * 5
(A + B) * C
- Requires parentheses to define precedence
- Easy for humans to understand
- Difficult for machines to evaluate directly
Prefix Notation (Polish Notation):
- Operator is written before operands
- No need for parentheses
Format: Operator Operand Operand
+ A B
* 3 5
* + A B C
Read expression right to left
Postfix Notation (Reverse Polish Notation):
- Operator is written after operands
- No parentheses required
Format: Operand Operand Operator
A B +
3 5 *
A B + C *
Read expression left to right
Uses stack for evaluation
Infix to Postfix Conversion Example: 3 * 3 / (4 - 1) + 6 * 2
| Step | Symbol | Stack | Output |
|---|---|---|---|
| 1 | 3 | — | 3 |
| 2 | * | * | 3 |
| 3 | 3 | * | 3 3 |
| 4 | / | / | 3 3 * |
| 5 | ( | / ( | 3 3 * |
| 6 | 4 | / ( | 3 3 * 4 |
| 7 | - | / ( - | 3 3 * 4 |
| 8 | 1 | / ( - | 3 3 * 4 1 |
| 9 | ) | / | 3 3 * 4 1 - |
| 10 | + | + | 3 3 * 4 1 - / |
| 11 | 6 | + | 3 3 * 4 1 - / 6 |
| 12 | * | + * | 3 3 * 4 1 - / 6 |
| 13 | 2 | + * | 3 3 * 4 1 - / 6 2 |
| Final | — | — | 3 3 * 4 1 - / 6 2 * + |
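The resulting postfix string can be evaluated with a stack in a single left-to-right pass. A minimal sketch in Python, assuming space-separated tokens and binary operators only:

```python
def eval_postfix(expr):
    """Evaluate a space-separated postfix expression using a stack."""
    stack = []
    for token in expr.split():
        if token in "+-*/":
            b = stack.pop()              # right operand (pushed last)
            a = stack.pop()              # left operand
            if token == "+":
                stack.append(a + b)
            elif token == "-":
                stack.append(a - b)
            elif token == "*":
                stack.append(a * b)
            else:
                stack.append(a / b)
        else:
            stack.append(float(token))   # operand goes straight onto the stack
    return stack[0]

print(eval_postfix("3 3 * 4 1 - / 6 2 * +"))  # → 15.0
```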
| Feature | Infix | Prefix | Postfix |
|---|---|---|---|
| Operator Position | Middle | Beginning | End |
| Parentheses Needed | Yes | No | No |
| Evaluation | Complex | Easy | Easy |
| Used In | Humans | Compilers | Stack Evaluation |
| Example | A+B | +AB | AB+ |
- Expression evaluation using stack
- Compiler design
- Syntax parsing
- Arithmetic expression conversion
A circular queue is a linear data structure in which the last position is connected back to the first position, forming a circular structure.
It overcomes the main limitation of a simple linear queue: space freed by deletions at the front is wasted because REAR cannot wrap around.
- Elements are arranged in a circle
- Uses modulo operation (%) for wrapping
- Step 1: If (REAR + 1) % SIZE == FRONT → Overflow
- Step 2: If queue empty → FRONT = REAR = 0
- Step 3: Else → REAR = (REAR + 1) % SIZE
- Step 4: Insert element at REAR
- Step 1: If FRONT == -1 → Underflow
- Step 2: Remove element at FRONT
- Step 3: If FRONT == REAR → FRONT = REAR = -1
- Step 4: Else → FRONT = (FRONT + 1) % SIZE
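The insertion and deletion steps above can be sketched as a small Python class, using FRONT = REAR = -1 as the empty-queue sentinel as in the deletion algorithm:

```python
class CircularQueue:
    def __init__(self, size):
        self.size = size
        self.data = [None] * size
        self.front = self.rear = -1          # -1 marks an empty queue

    def enqueue(self, value):
        if (self.rear + 1) % self.size == self.front:
            raise OverflowError("queue is full")
        if self.front == -1:                 # first element
            self.front = self.rear = 0
        else:
            self.rear = (self.rear + 1) % self.size   # wrap around
        self.data[self.rear] = value

    def dequeue(self):
        if self.front == -1:
            raise IndexError("queue is empty")
        value = self.data[self.front]
        if self.front == self.rear:          # queue becomes empty
            self.front = self.rear = -1
        else:
            self.front = (self.front + 1) % self.size
        return value
```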
🔹 Initial State:
Queue: [ ] [ ] [ ] [ ]
FRONT = -1, REAR = -1
🔹 Insert 10, 20, 30
[10] [20] [30] [ ]
FRONT = 0, REAR = 2
🔹 Delete one element
[ ] [20] [30] [ ]
FRONT = 1, REAR = 2
🔹 Insert 40
[ ] [20] [30] [40]
FRONT = 1, REAR = 3
🔹 Insert 50 (Circular Move)
[50] [20] [30] [40]
FRONT = 1, REAR = (3 + 1) % 4 = 0
| # | Advantage |
|---|---|
| 1 | Efficient memory utilization |
| 2 | No wastage of space |
| 3 | Faster operations |
| 4 | Useful in real-time systems |
| # | Disadvantage |
|---|---|
| 1 | Complex implementation |
| 2 | Hard to detect full/empty condition |
| 3 | Requires modulo operation |
- CPU scheduling
- Memory management
- Traffic signal systems
- Buffers (keyboard, printer)
| Feature | Linear Queue | Circular Queue |
|---|---|---|
| Structure | Straight | Circular |
| Space Usage | Wasted | Fully utilized |
| Efficiency | Low | High |
| Overflow Condition | Rear = Max | (Rear+1)%Size = Front |
A tree is a non-linear hierarchical data structure consisting of nodes connected by edges. It represents relationships in a parent–child form.
Node:
- Basic element of a tree
- Contains data + links to child nodes
Root:
- Topmost node of the tree
- Has no parent
Edge: Connection between two nodes
- Parent: Node having child nodes
- Child: Node derived from a parent
Leaf Node: Node with no children
Internal Node: Node having at least one child
Degree of a Node: Number of children of a node
Degree of a Tree: Maximum degree of any node in the tree
Level: Number of edges from the root to a node
Level 0 → A
Level 1 → B, C
Height: Maximum level of any node
Depth: Distance from the root to a specific node
Subtree: A tree formed by a node and its descendants
Tree can be represented in memory using different methods.
Array Representation:
- Tree stored in array form
- Mainly used for binary trees
Rules (using 1-based indexing):
- If node index = i
- Left child = 2i
- Right child = 2i + 1
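These index rules can be checked directly. A small Python sketch, assuming 1-based indexing with slot 0 left unused (the node letters are just an illustrative tree):

```python
# Complete binary tree stored in an array; index 0 is unused so the
# 1-based rules (left = 2i, right = 2i + 1) apply directly.
tree = [None, "A", "B", "C", "D", "E", "F", "G"]

def left(i):
    return 2 * i           # index of left child

def right(i):
    return 2 * i + 1       # index of right child

def parent(i):
    return i // 2          # index of parent

print(tree[left(2)], tree[right(2)])   # children of "B" → D E
print(tree[parent(5)])                 # parent of "E" → B
```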
Linked Representation:
- Each node contains:
  - Data
  - Pointer to left child
  - Pointer to right child
Parent Array Representation: Stores the parent of each node in an array
| Representation | Advantage | Disadvantage |
|---|---|---|
| Array | Simple & fast | Wastes memory |
| Linked List | Dynamic | Extra pointer memory |
| Parent Array | Easy parent access | Not efficient for traversal |
| Child List | Flexible | Complex structure |
- File systems
- Database indexing
- Expression trees
- Artificial Intelligence
A heap is a complete binary tree that satisfies the heap property.
There are two types:
- Min Heap
- Max Heap
Min Heap:
- Parent node is smaller than or equal to its children (Parent ≤ Child)
- Root contains the minimum element
Array representation: [10, 20, 30, 40, 50]
Property: smallest element is always at the root
Max Heap:
- Parent node is greater than or equal to its children (Parent ≥ Child)
- Root contains the maximum element
Array representation: [50, 30, 40, 10, 20]
Property: largest element is always at the root
Insertion Steps:
- Insert element at last position
- Compare with parent
- Swap if heap property violated
- Repeat until correct position
Deletion Steps:
- Remove root element
- Replace with last element
- Heapify (adjust tree)
- Maintain heap property
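The insertion (sift up) and deletion (heapify down) steps above can be sketched for a min heap in Python; a max heap just flips the comparisons. This sketch uses 0-based indexing, so children sit at 2i + 1 and 2i + 2:

```python
def heap_push(heap, value):
    """Insert at the last position, then sift up while the parent is larger."""
    heap.append(value)
    i = len(heap) - 1
    while i > 0 and heap[(i - 1) // 2] > heap[i]:
        heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]
        i = (i - 1) // 2

def heap_pop(heap):
    """Remove the root, move the last element to the root, then heapify down."""
    root, last = heap[0], heap.pop()
    if heap:
        heap[0] = last
        i = 0
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < len(heap) and heap[child] < heap[smallest]:
                    smallest = child
            if smallest == i:            # heap property restored
                break
            heap[i], heap[smallest] = heap[smallest], heap[i]
            i = smallest
    return root

h = []
for v in (30, 10, 50, 20):
    heap_push(h, v)
print(heap_pop(h))   # → 10 (root holds the minimum)
```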
| Feature | Min Heap | Max Heap |
|---|---|---|
| Root Value | Minimum | Maximum |
| Property | Parent ≤ Child | Parent ≥ Child |
| Use Case | Priority queue (min) | Priority queue (max) |
| Example | 10,20,30 | 50,30,40 |
- Priority Queue
- Heap Sort
- Scheduling algorithms
- Graph algorithms (Dijkstra, Prim)
Min Heap → Extract Minimum
Max Heap → Extract Maximum
Sorting is the process of arranging data elements in a specific order (ascending or descending). It improves searching efficiency and data organization.
Bubble Sort is a simple sorting technique where adjacent elements are compared and swapped if they are in the wrong order.
- Repeatedly compare adjacent elements
- Swap if left > right
- Largest element moves to the end in each pass
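The passes above translate directly to Python; an early-exit flag gives the O(n) best case when the array is already sorted:

```python
def bubble_sort(arr):
    """Compare adjacent elements; the largest bubbles to the end each pass."""
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):       # last i elements are already in place
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:                  # no swaps → already sorted, best case O(n)
            break
    return arr

print(bubble_sort([5, 1, 4, 2, 8]))   # → [1, 2, 4, 5, 8]
```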
| Case | Complexity |
|---|---|
| Best Case | O(n) |
| Average Case | O(n²) |
| Worst Case | O(n²) |
Radix Sort is a non-comparison sorting algorithm that sorts numbers digit by digit (from least significant digit to most significant).
- Sort numbers based on individual digits
- Use bucket system (0–9)
- Repeat for each digit place
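A minimal Python sketch of these steps, assuming non-negative integers:

```python
def radix_sort(nums):
    """Sort non-negative integers digit by digit (least significant first)."""
    place = 1                                    # 1s, 10s, 100s, ...
    while any(n // place > 0 for n in nums):
        buckets = [[] for _ in range(10)]        # one bucket per digit 0-9
        for n in nums:
            buckets[(n // place) % 10].append(n)
        nums = [n for bucket in buckets for n in bucket]   # collect in order
        place *= 10
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# → [2, 24, 45, 66, 75, 90, 170, 802]
```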
| Case | Complexity |
|---|---|
| Best | O(nk) |
| Average | O(nk) |
| Worst | O(nk) |
(k = number of digits)
| Feature | Bubble Sort | Radix Sort |
|---|---|---|
| Type | Comparison | Non-comparison |
| Complexity | O(n²) | O(nk) |
| Efficiency | Slow | Fast for large data |
| Stability | Stable | Stable |
| Space | In-place | Extra space needed |
| Use Case | Small data | Large numeric data |
Bubble Sort → Simple but inefficient
Radix Sort → Efficient for large numbers
Searching is the process of finding a specific element in a data structure.
Two main techniques:
- Linear Search
- Binary Search
Linear Search checks each element sequentially until the target element is found.
| Case | Complexity |
|---|---|
| Best | O(1) |
| Average | O(n) |
| Worst | O(n) |
Binary Search works on sorted arrays and repeatedly divides the search space into half.
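The halving strategy can be sketched in Python, assuming the array is already sorted in ascending order:

```python
def binary_search(arr, target):
    """Return the index of target in a sorted array, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1        # discard the left half
        else:
            high = mid - 1       # discard the right half
    return -1

print(binary_search([10, 20, 30, 40, 50], 40))   # → 3
```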
| Case | Complexity |
|---|---|
| Best | O(1) |
| Average | O(log n) |
| Worst | O(log n) |
| Feature | Linear Search | Binary Search |
|---|---|---|
| Requirement | Unsorted OK | Must be sorted |
| Complexity | O(n) | O(log n) |
| Speed | Slow | Fast |
| Method | Sequential | Divide & Conquer |
| Use Case | Small data | Large sorted data |
Linear Search → Check one by one
Binary Search → Divide and search
Graph traversal is the process of visiting all the vertices (nodes) of a graph in a systematic way.
Two important traversal techniques are:
- Breadth First Search (BFS)
- Depth First Search (DFS)
BFS visits nodes level by level, starting from the source node.
- Uses Queue (FIFO)
- Explores neighbors first
O(V + E)
- Shortest path in unweighted graphs
- Social networks (friend suggestions)
- Web crawling
- Network broadcasting
DFS explores as deep as possible before backtracking.
- Uses Stack / Recursion
- Follows depth-wise traversal
O(V + E)
- Cycle detection
- Topological sorting
- Path finding
- Solving puzzles (maze, backtracking)
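Both traversals can be sketched in Python on a small assumed adjacency-list graph (the vertex names here are illustrative only):

```python
from collections import deque

# Hypothetical directed graph as an adjacency list.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": [], "E": []}

def bfs(start):
    """Level-by-level traversal driven by a FIFO queue."""
    visited, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                queue.append(nb)
    return order

def dfs(node, order=None):
    """Go as deep as possible, then backtrack (recursion supplies the stack)."""
    if order is None:
        order = []
    order.append(node)
    for nb in graph[node]:
        if nb not in order:
            dfs(nb, order)
    return order

print(bfs("A"))   # → ['A', 'B', 'C', 'D', 'E']
print(dfs("A"))   # → ['A', 'B', 'D', 'C', 'E']
```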
| Feature | BFS | DFS |
|---|---|---|
| Data Structure | Queue | Stack/Recursion |
| Traversal | Level-wise | Depth-wise |
| Shortest Path | Yes | No |
| Memory Usage | More | Less |
| Speed | Slower | Faster (sometimes) |
BFS → Level by level traversal
DFS → Go deep first then backtrack
A Minimum Spanning Tree is a subset of edges that:
- Connects all vertices
- Has no cycles
- Has minimum total weight
Prim’s Algorithm is a greedy algorithm used to find the Minimum Spanning Tree of a weighted graph.
- Start from any vertex
- Select the minimum weight edge
- Add new vertex to MST
- Repeat until all vertices included
- Network design (roads, cables)
- Electrical wiring
- Telecommunication networks
| Feature | Shortest Path | Minimum Spanning Tree |
|---|---|---|
| Goal | Shortest distance between nodes | Minimum total weight |
| Algorithms | Dijkstra, Bellman-Ford | Prim, Kruskal |
| Path | Between two nodes | Covers all nodes |
| Cycle | Allowed | Not allowed |
| Example | Navigation systems | Network design |
Shortest Path → Min distance between two nodes
MST → Min total cost connecting all nodes
Asymptotic notations are used to analyze the performance of algorithms in terms of time and space complexity. They describe how an algorithm behaves as input size n increases.
- Big-O Notation (O)
- Omega Notation (Ω)
- Theta Notation (θ)
Big-O represents the upper bound (worst case) of an algorithm.
O(f(n)) = maximum time required
Linear Search worst case:
Element at last position → O(n)
Omega represents the lower bound (best case) of an algorithm.
Ω(f(n)) = minimum time required
Linear Search best case:
Element at first position → Ω(1)
Theta represents the tight bound: the running time grows at exactly this rate. It is commonly used to describe the average case.
θ(f(n)) = both upper and lower bound
Linear Search average case:
θ(n)
| Notation | Meaning | Case |
|---|---|---|
| Big-O | Upper bound | Worst |
| Omega | Lower bound | Best |
| Theta | Exact bound | Average |
Linear Search: Checks elements one by one sequentially
| Case | Complexity |
|---|---|
| Best Case | Ω(1) |
| Average Case | θ(n) |
| Worst Case | O(n) |
Array: [10, 20, 30, 40, 50]
Search 10 → 1 step → Ω(1)
Search 50 → 5 steps → O(n)
Binary Search:
- Divides the array into halves repeatedly
- Requires a sorted array
| Case | Complexity |
|---|---|
| Best Case | Ω(1) |
| Average Case | θ(log n) |
| Worst Case | O(log n) |
Array: [10, 20, 30, 40, 50]
Search 40:
Step 1 → mid = 30
Step 2 → search right → 40 found
| Feature | Linear Search | Binary Search |
|---|---|---|
| Method | Sequential | Divide & Conquer |
| Best | Ω(1) | Ω(1) |
| Worst | O(n) | O(log n) |
| Average | θ(n) | θ(log n) |
| Requirement | Unsorted OK | Sorted required |
Linear Search → Slower (O(n))
Binary Search → Faster (O(log n))
A linked list is a dynamic data structure consisting of nodes where each node contains:
- Data
- Pointer to next node
- Singly Linked List: Each node points to the next node only
- Doubly Linked List: Each node has two pointers (next and previous)
- Circular Linked List: The last node points back to the first node
➤ At Beginning
- Step 1: Create new node
- Step 2: new->next = head
- Step 3: head = new node
➤ At End
- Step 1: Create new node with new->next = NULL
- Step 2: Traverse to the last node
- Step 3: last->next = new node
➤ At Middle
- Step 1: Find position
- Step 2: new->next = temp->next
- Step 3: temp->next = new node
➤ From Beginning
- Step 1: temp = head
- Step 2: head = head->next
- Step 3: delete temp
➤ From End
- Step 1: Traverse to second last node
- Step 2: second_last->next = NULL
➤ From Middle
- Step 1: Find node
- Step 2: prev->next = current->next
- Step 3: delete current
➤ Traversal
- Step 1: Start from head
- Step 2: Print each node
- Step 3: Move to the next node until NULL
➤ Searching
- Step 1: Start from head
- Step 2: Compare each node's data with the key
- Step 3: If found → return position
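The operations above can be sketched as a minimal singly linked list in Python (only insertion and deletion at the beginning are shown, for brevity):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None          # pointer to the next node

class SinglyLinkedList:
    def __init__(self):
        self.head = None          # empty list

    def insert_at_beginning(self, data):
        node = Node(data)
        node.next = self.head     # new->next = head
        self.head = node          # head = new node

    def delete_from_beginning(self):
        if self.head:
            self.head = self.head.next   # head = head->next

    def traverse(self):
        values, temp = [], self.head
        while temp:                      # move to next until NULL
            values.append(temp.data)
            temp = temp.next
        return values

lst = SinglyLinkedList()
for v in (30, 20, 10):
    lst.insert_at_beginning(v)
print(lst.traverse())   # → [10, 20, 30]
```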
| # | Advantage |
|---|---|
| 1 | Dynamic size |
| 2 | Efficient insertion/deletion |
| 3 | No memory wastage |
| # | Disadvantage |
|---|---|
| 1 | No random access |
| 2 | Extra memory for pointers |
| 3 | Traversal is slow |
Linked List → Dynamic and flexible
Array → Fixed and faster access
A stack is a linear data structure that follows the LIFO (Last In First Out) principle. All operations are performed at one end called TOP.
- Stack is implemented using an array
- A variable top is used to track the top element
Initially:
top = -1
Push: Inserts an element into the stack
Pop: Removes the top element from the stack
Peek: Returns the top element without removing it
value = stack[top]
isEmpty → top == -1
isFull → top == MAX - 1
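The conditions above can be sketched as an array-based stack in Python (the capacity of 5 is an arbitrary illustration of MAX):

```python
class Stack:
    def __init__(self, max_size=5):
        self.max_size = max_size
        self.data = [None] * max_size
        self.top = -1                         # top = -1 means empty

    def is_empty(self):
        return self.top == -1

    def is_full(self):
        return self.top == self.max_size - 1  # top == MAX - 1

    def push(self, value):
        if self.is_full():
            raise OverflowError("stack overflow")
        self.top += 1
        self.data[self.top] = value

    def pop(self):
        if self.is_empty():
            raise IndexError("stack underflow")
        value = self.data[self.top]
        self.top -= 1
        return value

    def peek(self):
        return self.data[self.top]            # value = stack[top]

s = Stack()
for v in (1, 2, 3):
    s.push(v)
print(s.peek(), s.pop())   # → 3 3
```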
Stack is widely used for expression conversion and evaluation.
| Type | Example |
|---|---|
| Infix | A + B |
| Prefix | +AB |
| Postfix | AB+ |
- Expression evaluation
- Expression conversion
- Recursion
- Backtracking
- Undo operations
Stack → LIFO
Used in expression evaluation & recursion
A queue is a linear data structure that follows FIFO (First In First Out) principle.
Insertion → REAR
Deletion → FRONT
- Priority Queue: Elements processed based on priority (High → Medium → Low)
- Deque (Double-Ended Queue): Insertion and deletion from both ends
- CPU scheduling
- Printer queue
- Buffering
- Traffic systems
| Feature | Stack | Queue |
|---|---|---|
| Order | LIFO | FIFO |
| Ends Used | One | Two |
| Example | Plates | Line |
A Binary Search Tree is a binary tree where:
- Left subtree → values less than root
- Right subtree → values greater than root
Deletion cases:
- Case 1 (leaf node): Simply remove the node
- Case 2 (one child): Replace the node with its child
- Case 3 (two children): Find the inorder successor, copy its value into the node, then delete the successor
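Insertion and searching follow the same ordering rule; a minimal Python sketch (deletion omitted for brevity):

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

def insert(root, key):
    """Smaller keys go to the left subtree, larger keys to the right."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def search(root, key):
    """Follow the ordering rule downwards; True if the key exists."""
    if root is None:
        return False
    if root.key == key:
        return True
    return search(root.left, key) if key < root.key else search(root.right, key)

root = None
for k in (50, 30, 70, 20, 40):
    root = insert(root, k)
print(search(root, 40), search(root, 60))   # → True False
```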
| Operation | Average Complexity |
|---|---|
| Insertion | O(log n) |
| Searching | O(log n) |
| Deletion | O(log n) |
(Worst case for a skewed tree: O(n) for all three)
- Searching & sorting
- Database indexing
- Auto-complete systems
BST → Faster searching using ordered structure
An AVL Tree is a self-balancing Binary Search Tree (BST) where the difference between heights of left and right subtrees is at most 1.
BF = Height(Left Subtree) – Height(Right Subtree)
BF must be -1, 0, or +1
If BF is outside this range → Tree becomes unbalanced → Rotations are applied.
When imbalance occurs, rotations are performed to restore balance.
- RR Case: Node inserted in the right subtree of the right child
- LL Case: Node inserted in the left subtree of the left child
- LR Case: Node inserted in the right subtree of the left child
- RL Case: Node inserted in the left subtree of the right child
| Case | Rotation |
|---|---|
| LL | Right Rotation |
| RR | Left Rotation |
| LR | Left + Right |
| RL | Right + Left |
| Feature | AVL Tree | BST |
|---|---|---|
| Balance | Self-balanced | Not balanced |
| Height | O(log n) | O(n) worst |
| Search | Fast | Slow (skewed tree) |
| Complexity | O(log n) | O(n) worst |
| Rotations | Required | Not required |
AVL Tree → Balanced → Faster operations
BST → Can become skewed → Slower
Sorting arranges elements in ascending or descending order. Quick Sort and Merge Sort are efficient divide-and-conquer algorithms.
Quick Sort divides the array into smaller subarrays using a pivot element.
- Select pivot
- Partition array
- Elements < pivot → left
- Elements > pivot → right
- Recursively sort subarrays
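A minimal Python sketch of these steps; for clarity it builds new lists rather than partitioning in place (an in-place version swaps elements around the pivot instead):

```python
def quick_sort(arr):
    """Pick a pivot, partition into smaller/equal/larger, recurse on each side."""
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]      # elements < pivot
    middle = [x for x in arr if x == pivot]   # the pivot (and duplicates)
    right = [x for x in arr if x > pivot]     # elements > pivot
    return quick_sort(left) + middle + quick_sort(right)

print(quick_sort([10, 80, 30, 90, 40, 50, 70]))   # → [10, 30, 40, 50, 70, 80, 90]
```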
| Case | Complexity |
|---|---|
| Best | O(n log n) |
| Average | O(n log n) |
| Worst | O(n²) |
Merge Sort divides the array into halves and then merges sorted halves.
- Divide array into two halves
- Recursively sort each half
- Merge sorted halves
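The divide-and-merge steps translate to Python as:

```python
def merge_sort(arr):
    """Split into halves, sort each recursively, then merge the sorted halves."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left, right = merge_sort(arr[:mid]), merge_sort(arr[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # take the smaller front element
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]      # append whichever half remains

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))   # → [3, 9, 10, 27, 38, 43, 82]
```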
| Case | Complexity |
|---|---|
| Best | O(n log n) |
| Average | O(n log n) |
| Worst | O(n log n) |
| Feature | Quick Sort | Merge Sort |
|---|---|---|
| Method | Partition | Divide & Merge |
| Space | In-place | Extra memory |
| Speed | Faster (average) | Stable performance |
| Worst Case | O(n²) | O(n log n) |
| Stability | Not stable | Stable |
Quick Sort → Faster in practice
Merge Sort → Guaranteed O(n log n)
Hashing is a technique used to store and retrieve data efficiently using a hash function.
- Data is stored in a hash table
- Key is converted into an index (address)
Index = Hash Function(Key)
A hash function maps a key to an index.
H(key) = key % table_size
H(23) = 23 % 10 = 3
Division Method:
H(key) = key mod m
- Most commonly used
- m = table size
Mid-Square Method:
- Square the key
- Extract the middle digits
Key = 123 → 123² = 15129 → Middle = 512
Folding Method:
- Split the key into parts
- Add the parts
1234 → 12 + 34 = 46
Multiplication Method:
H(key) = floor(m × frac(key × A))
(A = a constant with 0 < A < 1; frac = fractional part)
Collision occurs when two keys map to same index.
H(23) = 3
H(33) = 3 → Collision
Separate Chaining: Use a linked list at each index
✔ Advantage:
- Simple
- Handles multiple collisions
❌ Disadvantage:
- Extra memory
Open Addressing: Stores elements in the table itself.
Linear Probing:
Index = (H(key) + i) % size
Example: slot 3 occupied → try 4 → 5 → ...
Quadratic Probing:
Index = (H(key) + i²) % size
Double Hashing:
Index = (H1(key) + i * H2(key)) % size
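Separate chaining can be sketched in Python, with ordinary lists standing in for the per-slot linked lists and the division-method hash from above:

```python
class ChainedHashTable:
    """Hash table with separate chaining: each slot holds (key, value) pairs."""

    def __init__(self, size=10):
        self.size = size
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        return key % self.size           # division method: H(key) = key mod m

    def put(self, key, value):
        bucket = self.slots[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present → update
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # colliding keys share the bucket

    def get(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.put(23, "a")
t.put(33, "b")                           # 23 and 33 both hash to index 3
print(t.get(23), t.get(33))              # → a b
```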
| Method | Idea | Advantage | Disadvantage |
|---|---|---|---|
| Chaining | Linked list | Easy | Extra space |
| Linear Probing | Next slot | Simple | Clustering |
| Quadratic | i² jump | Less clustering | Complex |
| Double Hashing | Two hash | Best distribution | Costly |
- Database indexing
- Symbol tables
- Password storage
- Caching systems
Hashing → Fast data access (O(1))
Collision → Must be handled properly
A graph is defined as:
G = (V, E)
V → Set of vertices
E → Set of edges
Adjacency Matrix: 2D array representation
✔ Advantages:
- Simple
- Easy to implement
❌ Disadvantages:
- Uses more memory
Adjacency List: Uses linked lists of neighbors
✔ Advantages:
- Memory efficient
- Better for sparse graphs
❌ Disadvantages:
- Slightly complex
| Feature | Matrix | List |
|---|---|---|
| Memory | High | Low |
| Speed | Fast lookup | Slower |
| Usage | Dense graph | Sparse graph |
Sample traversal order: A → B → D → C → E
| Feature | BFS | DFS |
|---|---|---|
| Structure | Queue | Stack |
| Traversal | Level-wise | Depth-wise |
| Shortest Path | Yes | No |
| Memory | High | Low |
- BFS → Shortest path
- DFS → Cycle detection
- Network routing
- AI search algorithms
BFS → Breadth (level)
DFS → Depth (deep traversal)
Dijkstra’s Algorithm is used to find the shortest path from a source vertex to all other vertices in a weighted graph with non-negative weights.
- Start from source node
- Select node with minimum distance
- Update distances of adjacent nodes
- Repeat until all nodes are visited
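The steps above can be sketched with a min-priority queue. The graph here is a hypothetical weighted graph chosen to match the distances A→C = 1, A→B = 4, A→D = 6 from the worked example:

```python
import heapq

# Hypothetical weighted graph: vertex → list of (neighbor, edge weight).
graph = {"A": [("B", 4), ("C", 1)], "B": [("D", 2)], "C": [("D", 5)], "D": []}

def dijkstra(graph, source):
    """Shortest distances from source; non-negative weights required."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    pq = [(0, source)]                    # (distance, vertex) min-heap
    while pq:
        d, u = heapq.heappop(pq)          # vertex with minimum distance
        if d > dist[u]:
            continue                      # stale queue entry, skip
        for v, w in graph[u]:
            if d + w < dist[v]:           # relax edge u → v
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

print(dijkstra(graph, "A"))   # → {'A': 0, 'B': 4, 'C': 1, 'D': 6}
```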
Shortest path from A:
A → C = 1
A → B = 4
A → D = 6
| Case | Complexity |
|---|---|
| Using array | O(V²) |
| Using priority queue | O((V+E) log V) |
- Network routing
- GPS navigation
- Shortest path problems
Kruskal’s Algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) of a graph.
- Sort all edges by weight
- Pick smallest edge
- Add edge if it does not form cycle
- Repeat until MST formed
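These steps can be sketched with a simple union-find structure to detect cycles. The edge list is hypothetical, built around the AB(1), BC(2), BD(4) edges of the worked example plus two extra edges that get rejected:

```python
def kruskal(vertices, edges):
    """Greedy MST: take edges in weight order, skipping any that form a cycle."""
    parent = {v: v for v in vertices}     # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]] # path compression
            v = parent[v]
        return v

    mst, total = [], 0
    for w, u, v in sorted(edges):         # edges sorted by weight
        ru, rv = find(u), find(v)
        if ru != rv:                      # different components → no cycle
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total

# Hypothetical weighted edges as (weight, u, v).
edges = [(1, "A", "B"), (2, "B", "C"), (3, "A", "C"), (4, "B", "D"), (5, "C", "D")]
mst, total = kruskal(["A", "B", "C", "D"], edges)
print(mst, total)   # → [('A', 'B', 1), ('B', 'C', 2), ('B', 'D', 4)] 7
```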
Minimum Spanning Tree:
AB(1), BC(2), BD(4)
Total Cost = 7
| Case | Complexity |
|---|---|
| Using sorting | O(E log E) |
- Network design
- Road construction
- Cable layout
| Feature | Dijkstra | Kruskal |
|---|---|---|
| Goal | Shortest Path | MST |
| Type | Path algorithm | Tree algorithm |
| Works on | Nodes | Edges |
| Cycle allowed | Yes | No |
Shortest Path:
- Finds minimum distance between two nodes
- Examples: Dijkstra, Bellman-Ford
Minimum Spanning Tree:
- Connects all nodes with minimum total weight
- Examples: Prim, Kruskal
| Feature | Shortest Path | Minimum Spanning Tree |
|---|---|---|
| Objective | Minimum distance between nodes | Minimum total weight |
| Coverage | Between two nodes | Covers all nodes |
| Algorithms | Dijkstra, Bellman-Ford | Prim, Kruskal |
| Cycle | Allowed | Not allowed |
| Result | Path | Tree |
| Use Case | Navigation | Network design |
Shortest Path:
A → B → D (minimum distance)
MST:
A-B, B-C, B-D (minimum total cost)
Shortest Path → Focus on one destination
MST → Connect all nodes with minimum cost