Updated contents of Hash table (#678)
* Resolved the issue of adding about HTTP - HTTPS

* Resolved the issue of adding object based model

* Hash table modified and added detailed content and modification

* Hash table modified and added detailed content and modification and refactored

* Refactoring

* Updated Hash table content

* Updated Hash table content refactoring

* Updated Hash table content refactoring

* Hash table error resolve

* Added Advanced concepts in OOPS , like Friend class , Operator overloading and Function overloading

* reverted changes

* Updated hash table
rohankaushal123 authored Oct 20, 2023
1 parent a950293 commit b9afd92
Showing 1 changed file with 194 additions and 53 deletions.
247 changes: 194 additions & 53 deletions Data Structures/HashTable.md
@@ -1,83 +1,224 @@
# Hash Tables
# Hash Tables: A Comprehensive Guide

* Hash tables are data structures that store data in an associative manner.
* In a hash table, data is stored in an array, where each value is placed at an index computed from its key.
* Accessing data is very fast when the index of the desired data is known.
* A hash table stores each entry as two components: a key and a value.
* The efficiency of the mapping depends on the efficiency of the hash function used.
Hash tables, also known as hash maps, are data structures for storing and retrieving data efficiently, which makes them a fundamental concept in computer science and software development. This guide covers their definition, structure, operations, advanced techniques, and practical implementations in several programming languages.

Some important points about hash tables:

1. Values are not stored in sorted order.
2. You must account for potential collisions. This is usually done with a technique called chaining, which keeps a linked list of the entries whose keys map to the same index.

## Hashing

* Hashing is a searching technique that takes constant time on average, i.e., its average time complexity is O(1).
* The two searching techniques covered so far are linear search and binary search. The worst-case time complexity is O(n) for linear search and O(log n) for binary search.
* In both techniques, the search time depends on the number of elements. Hashing instead aims for a search time that does not grow with the number of elements.
* Hashing uses a hash table and a hash function. The hash function calculates the address at which a value is stored.
* The main idea behind hashing is to create (key, value) pairs. Given a key, the algorithm computes the index at which the value is stored. It can be written as:

```py
Index = hash(key)
```

## Table of Contents

1. [Introduction to Hash Tables](#1-introduction-to-hash-tables)
    - 1.1 What are Hash Tables?
    - 1.2 Why Use Hash Tables?

2. [Key Concepts](#2-key-concepts)
    - 2.1 Data Storage
    - 2.2 Key-Value Pairs
    - 2.3 Hash Function
    - 2.4 No Sorting
    - 2.5 Handling Collisions

3. [Anatomy of a Hash Table](#3-anatomy-of-a-hash-table)

4. [Hashing](#4-hashing)
    - 4.1 What is Hashing?
    - 4.2 Hash Functions
    - 4.3 Properties of a Good Hash Function
    - 4.4 Example of Hashing

5. [Hash Collision Resolution](#5-hash-collision-resolution)
    - 5.1 Collision Problem
    - 5.2 Chaining
    - 5.3 Open Addressing

6. [Performance and Time Complexity](#6-performance-and-time-complexity)
    - 6.1 Time Complexity
    - 6.2 Efficiency Factors

7. [Practical Implementation](#7-practical-implementation)
    - 7.1 Hash Tables in Python
    - 7.2 Hash Tables in Java
    - 7.3 Hash Tables in C++

8. [Advanced Hash Table Techniques](#8-advanced-hash-table-techniques)
    - 8.1 Open Addressing
    - 8.2 Perfect Hashing
    - 8.3 Dynamic Hash Tables
    - 8.4 Hash Functions

9. [Use Cases and Applications](#9-use-cases-and-applications)

10. [Conclusion](#10-conclusion)
## 1. Introduction to Hash Tables

### 1.1 What are Hash Tables?

A hash table, also known as a hash map, is a data structure that enables efficient data storage and retrieval. It is based on the idea of associative arrays, where data is stored in a way that allows for quick access using a unique key. This key-value pair system makes hash tables versatile for various applications, from database systems to programming languages.

### 1.2 Why Use Hash Tables?

Hash tables are used extensively in computer science and software development for several reasons:

- **Fast Data Retrieval**: Hash tables provide constant-time average complexity for insertion, retrieval, and deletion operations, making them ideal for scenarios where speed is critical.

- **Unique Keys**: Each key in a hash table is unique; inserting a value under an existing key replaces the previous value rather than creating a duplicate entry.

- **Versatility**: Hash tables are the basis of various data structures, including sets, dictionaries, and caches.

- **Efficient Search**: Hash tables use a hash function to index data, leading to quick lookups.

## 2. Key Concepts

### 2.1 Data Storage

Hash tables store data in an array-like format, where each entry is placed at an index calculated from its key by a hash function. Because the index can be computed directly from the key, data can be retrieved efficiently.

### 2.2 Key-Value Pairs

Hash tables operate on a key-value pair system, where data is accessed using a key. The key is mapped to a specific value, allowing for structured data storage.

### 2.3 Hash Function

The heart of a hash table is the hash function. This function takes a key and returns an index, indicating where the associated value is stored. A good hash function minimizes collisions and ensures a uniform distribution of data.
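As a rough illustration of a function that takes a key and returns an index, the sketch below hashes a string by summing its character codes and reducing the sum modulo the table size. The name `simple_hash` and the table size of 8 are assumptions made for this example; real hash functions are considerably more careful about distribution.

```python
def simple_hash(key: str, table_size: int) -> int:
    """Map a string key to an index in range(table_size)."""
    # Sum the character codes (these equal the ASCII values for ASCII text).
    total = sum(ord(ch) for ch in key)
    # Reduce the sum to a valid array index.
    return total % table_size


if __name__ == "__main__":
    for name in ["Alice", "Bob", "Carol"]:
        print(name, "->", simple_hash(name, 8))
```

One weakness of this particular function is that strings containing the same characters in a different order, such as "listen" and "silent", always collide, which is why it is only suitable as a teaching example.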

### 2.4 No Sorting

Unlike some data structures that store data in a sorted order, hash tables store data based on the calculated index, which is not sorted.

### 2.5 Handling Collisions

Hash tables need to account for potential collisions when multiple keys map to the same index. This is typically managed using techniques such as chaining or open addressing.
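Of the two techniques just mentioned, chaining is perhaps the simpler one to sketch. The hypothetical `ChainedHashTable` class below is a minimal illustration, not a production design: it assumes a fixed number of buckets, uses Python lists in place of linked lists, and relies on the built-in `hash()` for the hash function.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (one list per bucket)."""

    def __init__(self, size: int = 8) -> None:
        # Each bucket holds a list of (key, value) pairs.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key) -> int:
        return hash(key) % len(self.buckets)

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite the value of an existing key
                return
        bucket.append((key, value))  # new key: extend the chain

    def get(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


table = ChainedHashTable(size=4)
table.put(1, "one")
table.put(5, "five")  # 1 and 5 both map to index 1 when the size is 4
print(table.get(1), table.get(5))
```

Both colliding keys remain retrievable because a lookup scans only the chain stored at the key's index.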

## 3. Anatomy of a Hash Table

A typical hash table consists of the following components:

- **Array**: The underlying data structure used to store the key-value pairs. It is often an array with a fixed or dynamically adjusted size.

- **Hash Function**: This function calculates the index for a given key. The quality of the hash function greatly affects the performance of the hash table.

- **Key-Value Pairs**: The actual data stored in the hash table. Each pair includes a key and the associated value.

- **Buckets**: The slots or containers in the array where data is stored. These are used to handle collisions.

## 4. Hashing

### 4.1 What is Hashing?

Hashing is a fundamental concept in computer science and plays a crucial role in the operation of hash tables. It involves taking an input (or 'key') and producing a fixed-size output, typically an integer. This output, often called a hash code or hash value, is then reduced to an array index (for example, by taking it modulo the array size), and that index is used to access the associated data.

## Features of Hashtables

The following points describe Java's `Hashtable` class:

1. It is similar to `HashMap`, but it is synchronized.
2. `Hashtable` stores key/value pairs in a hash table.
3. When using a `Hashtable`, you specify an object that is used as a key and the value to associate with that key. The key is then hashed, and the resulting hash code is used as the index at which the value is stored within the table.
4. The default initial capacity of the `Hashtable` class is 11, and the default load factor is 0.75.
5. `HashMap` does not provide an `Enumeration`, while `Hashtable` provides an `Enumeration` that is not fail-fast.

### Example

```md
Let a hash function H(x), for example H(x) = x % 10, be applied to the keys [11,12,13,14,15].
// They will be stored at positions {1,2,3,4,5}
// in the array or hash table respectively.
```

### 4.2 Hash Functions

A hash function is the core of hashing and is used to determine the index for a given key. The quality of the hash function is critical in ensuring minimal collisions and efficient data retrieval.

### 4.3 Properties of a Good Hash Function

A good hash function exhibits the following properties:

- **Deterministic**: For a given input, it should always produce the same hash code.

- **Efficient**: The function should compute the hash code quickly.

- **Uniform Distribution**: The hash codes should be uniformly distributed across the available indexes to minimize collisions.

### 4.4 Example of Hashing

Consider a simple example of a hash function for strings. The hash function could sum the ASCII values of the characters in the string and then take the sum modulo the array size to determine the index. For instance, if the hash function calculates that a key should map to index 5, the associated value would be stored in the bucket at index 5 of the array (the sixth bucket when indexes start at 0).

## 5. Hash Collision Resolution

### 5.1 Collision Problem

Collisions occur when two different keys map to the same index in the hash table. For instance, if two different keys result in the same hash code, they would attempt to occupy the same bucket. Collision resolution techniques are employed to manage such situations.

### 5.2 Chaining

Chaining is a common technique to handle collisions. In chaining, each bucket contains a linked list of key-value pairs. When a collision occurs, the new pair is added to the existing linked list at that index. Chaining ensures that multiple key-value pairs can coexist at the same index.

### 5.3 Open Addressing

Open addressing is an alternative collision resolution technique. When a collision occurs, the algorithm probes for the next available slot (index) in the hash table until an empty slot is found. This method avoids using linked lists and directly stores values in the table.
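The probing described above can be sketched with linear probing, the simplest probing strategy, which checks consecutive slots one step at a time. The hypothetical `LinearProbingTable` class below is a minimal sketch under stated assumptions: the table has a fixed size, never resizes, and does not support deletion (which would need extra bookkeeping such as tombstone markers).

```python
class LinearProbingTable:
    """Fixed-size hash table using open addressing with linear probing."""

    _EMPTY = object()  # sentinel marking a slot that has never been used

    def __init__(self, size: int = 8) -> None:
        self.slots = [self._EMPTY] * size

    def _probe(self, key):
        """Yield slot indexes starting at the key's home index, wrapping around once."""
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def put(self, key, value) -> None:
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is self._EMPTY or slot[0] == key:
                self.slots[i] = (key, value)
                return
        raise RuntimeError("hash table is full")

    def get(self, key):
        for i in self._probe(key):
            slot = self.slots[i]
            if slot is self._EMPTY:
                break  # an empty slot means the key was never inserted
            if slot[0] == key:
                return slot[1]
        raise KeyError(key)


table = LinearProbingTable(size=4)
table.put(1, "one")
table.put(5, "five")  # home index 1 is taken, so 5 is stored at index 2
print(table.get(5))
```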

## 6. Performance and Time Complexity

### 6.1 Time Complexity

Hash tables offer O(1) average time complexity for insertion, retrieval, and deletion operations. However, it is essential to consider worst-case scenarios. The quality of the hash function, as well as the collision resolution technique used, can affect the time complexity. In some cases, the time complexity may become O(n) when numerous collisions occur.
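To make the O(n) worst case concrete, the small sketch below compares how keys spread across buckets under two hash functions. The helper `bucket_sizes` and both lambda hash functions are hypothetical and exist only for this illustration.

```python
def bucket_sizes(keys, table_size, hash_fn):
    """Count how many keys land in each bucket under the given hash function."""
    sizes = [0] * table_size
    for key in keys:
        sizes[hash_fn(key) % table_size] += 1
    return sizes


keys = range(8)
spread = bucket_sizes(keys, 8, lambda k: k)   # every key gets its own bucket
clumped = bucket_sizes(keys, 8, lambda k: 0)  # every key lands in bucket 0
print(max(spread), max(clumped))  # prints "1 8"
```

With the constant hash function, a lookup may have to examine all eight entries in the single occupied bucket, which is the linear-time behaviour described above.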

### 6.2 Efficiency Factors

Efficient hashing and collision resolution are crucial for maintaining the constant-time average performance of hash tables. A hash function that spreads keys uniformly, a well-implemented collision resolution technique, and a low load factor (the number of stored entries divided by the number of buckets) all help keep lookups fast.

## 7. Practical Implementation

In this section, we will explore the practical implementation of hash tables in several programming languages. Hash tables are versatile and widely supported in various languages, making them a valuable tool for software development.

### 7.1 Hash Tables in Python

Python provides a built-in `dict` data structure, which is implemented as a hash table. Here's how to declare and use a dictionary:

```python
student_hash = {}
student_hash['Alice'] = {'age': 20, 'grade': 'A'}
print(student_hash['Alice']['age'])  # prints 20
```

### 7.2 Hash Tables in Java

In Java, you can use the `java.util.Hashtable` class to create a hash table. Here's an example:

```java
import java.util.Hashtable;

Hashtable<String, Integer> studentHash = new Hashtable<>();
studentHash.put("Alice", 20);
```

### 7.3 Hash Tables in C++

In C++, you can use the Standard Template Library (STL) to work with hash tables. Here's an example using `std::unordered_map`:

```C++
#include <string>
#include <unordered_map>

std::unordered_map<std::string, int> studentHash;
studentHash["Alice"] = 20;
```

## Declaration

1. In C++

```C++
#include <unordered_map>
#include <vector>

std::vector<int> arr = {1, 2, 2, 3};
// Map named mp; an unordered_map offers average O(1) access, versus O(log n) for an ordered std::map.
std::unordered_map<int, int> mp;
for (std::size_t i = 0; i < arr.size(); i++) {
    mp[arr[i]]++;  // count the occurrences of each element
}
```

In this way, the map named `mp` stores the frequency of every element in the array.

2. In Java

```java
import java.util.Hashtable;

Hashtable<Integer, String> hashtableName = new Hashtable<Integer, String>();
hashtableName.put(1, "one");  // put(key, value)
```

## 8. Advanced Hash Table Techniques

In this section, we will explore advanced techniques and concepts related to hash tables.

### 8.1 Open Addressing

Open addressing resolves collisions by probing for the next available slot when a collision occurs. It avoids the use of linked lists and stores values directly in the hash table. Common probing strategies include linear probing, quadratic probing, and double hashing.

### 8.2 Perfect Hashing

Perfect hashing is a technique that aims to eliminate collisions entirely by constructing a hash function that maps every key in a fixed, known set of keys to a unique index.
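For a small, fixed set of keys, one naive way to obtain such a function is to try candidate hash functions until one happens to be collision-free. The sketch below assumes distinct non-negative integer keys smaller than a chosen prime and searches over multipliers; the function name `find_perfect_multiplier`, the prime 101, and the sample keys are all assumptions for illustration. Practical perfect hashing schemes are more elaborate than this brute-force search.

```python
def find_perfect_multiplier(keys, table_size, prime=101):
    """Search for a multiplier a such that ((a * k) % prime) % table_size
    gives a distinct index for every key in the fixed key set."""
    for a in range(1, prime):
        indexes = {((a * k) % prime) % table_size for k in keys}
        if len(indexes) == len(keys):
            return a
    raise ValueError("no collision-free multiplier found")


keys = [10, 18, 37, 40]
a = find_perfect_multiplier(keys, table_size=8)
slots = [((a * k) % 101) % 8 for k in keys]
print(a, slots)
```

Once a suitable multiplier is found, lookups among these keys never collide, but adding a new key may require repeating the search.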

### 8.3 Dynamic Hash Tables

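Dynamic hash tables resize the underlying array as data grows. When the table becomes too full, typically when its load factor exceeds a chosen threshold, a larger array is allocated and the existing entries are rehashed into it, which keeps chains or probe sequences short.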
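A minimal sketch of resizing, assuming a chained table that doubles its bucket array once the ratio of entries to buckets exceeds 0.75, might look as follows. The class name `ResizingHashTable`, the initial size of 4, and the threshold are assumptions chosen for the example.

```python
class ResizingHashTable:
    """Chained hash table that doubles its bucket array when it gets too full."""

    def __init__(self, initial_size: int = 4, max_load: float = 0.75) -> None:
        self.buckets = [[] for _ in range(initial_size)]
        self.count = 0
        self.max_load = max_load

    def _index(self, key, size: int) -> int:
        return hash(key) % size

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key, len(self.buckets))]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # existing key: replace the value
                return
        bucket.append((key, value))
        self.count += 1
        # Grow once the ratio of entries to buckets passes the threshold.
        if self.count / len(self.buckets) > self.max_load:
            self._resize(2 * len(self.buckets))

    def get(self, key):
        for k, v in self.buckets[self._index(key, len(self.buckets))]:
            if k == key:
                return v
        raise KeyError(key)

    def _resize(self, new_size: int) -> None:
        old_entries = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(new_size)]
        for k, v in old_entries:
            # Indexes depend on the table size, so every entry is rehashed.
            self.buckets[self._index(k, new_size)].append((k, v))


table = ResizingHashTable()
for n in range(10):
    table.put(n, n * n)
print(len(table.buckets), table.get(9))  # prints "16 81"
```

Each resize costs time proportional to the number of entries, but because the capacity doubles, resizes become progressively rarer as the table grows.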

### 8.4 Hash Functions

The quality of the hash function is vital for the performance of hash tables. Advanced techniques involve designing hash functions that minimize collisions and distribute data uniformly.

3. In Python, the dictionary data type is an implementation of a hash table. Dictionaries have the following properties:

* The keys of the dictionary are hashable, i.e., a hash function can compute a hash value for each key, and equal keys always produce the same hash value.

* Dictionaries are not sorted by key. Since Python 3.7, they preserve insertion order.

Accessing Values in Dictionary

To access dictionary elements, you can use the familiar square brackets along with the key to obtain its value.

## Example

### Declaring a dictionary

    dict = {'Name': 'Zara', 'Age': 7, 'Class': 'First'}

### Accessing the dictionary with its key

    print("dict['Name']: ", dict['Name'])
    print("dict['Age']: ", dict['Age'])

## 9. Use Cases and Applications

Hash tables have a wide range of applications in computer science and software development:

- **Databases**: Hash tables are used in database management systems to optimize data retrieval.

- **Caches**: Caching systems use hash tables to store frequently accessed data for quick retrieval.

- **Language Features**: Hash tables are implemented in many programming languages as dictionaries or associative arrays.

- **Compiler Symbol Tables**: Compilers use hash tables to manage symbol tables efficiently.

- **DNS Lookups**: Domain Name System (DNS) resolvers and caches can use hash tables to map domain names to IP addresses.

## 10. Conclusion

Hash tables are a cornerstone of computer science and software development. They offer efficient data storage and retrieval, making them invaluable for a wide range of applications. Understanding how hash tables work, including their structure, hashing techniques, and collision resolution methods, is essential for building efficient software systems. Whether you are working with databases, implementing language features, or optimizing data access, hash tables can greatly improve the performance of your applications.

In this guide, we have explored the fundamental concepts of hash tables, covered their advanced techniques, and provided practical examples in several programming languages.
