Hashing A Guide For Beginners
Hashing: A Guide For Beginners
July 30, 2020
Hashing A Guide For Beginners
Hashing: A Guide For Beginners
July 30, 2020

Hashing refers to generate a value or values from a string of text by using a mathematical function, and it is a one way to enable security during the process of message transmission when the message is intended for a particular recipient only, as a formula generates the hash, which helps to protect the security of the transmission against tampering.

Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value. It is also used in many encryption algorithms.

In other words, it is an algorithm that calculates a fixed-size bit string value from a file, which basically contains blocks of data. It transforms the data into a far shorter fixed-length value or key which represents the original string. The hash value can be considered the distilled summary of everything within that file.

Usually, a hash is a hexadecimal string of several characters. Hashing is also a unidirectional process so you can never work backward to get back the original data.

Suppose, we want to design a system for storing employee records keyed using phone numbers. And we want the following queries to be performed efficiently:

  • Insert a phone number and corresponding information.
  • Search a phone number and fetch the information.
  • Delete a phone number and related information.

We can think of using the following data structures to maintain information about different phone numbers.

  • The array of phone numbers and records.
  • Linked list of phone numbers and records.
  • Balanced binary search tree with phone numbers as keys.
  • Direct Access Table.

For arrays and linked lists, we need to search in a linear fashion, which can be costly in practice. If we use arrays and keep the data sorted, then a phone number can be searched in O(Logn) time using Binary Search, but insert and delete operations become costly as we have to maintain sorted order.

With a balanced binary search tree, we get moderate search, insert and delete times. All of these operations can be guaranteed to be in O(Logn) time.

Hash Function

Hashing A Guide For Beginners
Image: Wikipedia

A function that converts a given big phone number to a small practical integer value. The mapped integer value is used as an index in the hash table. In simple terms, a hash function maps a big number or string to a small integer that can be used as an index in a hash table.
Here are some relatively simple hash functions that have been used:

  • Division-remainder method: The size of the number of items in the table is estimated. That number is then used as a divisor into each original value or key to extracting a quotient and a remainder. The remainder is the hashed value. (Since this method is liable to produce a number of collisions, any search mechanism would have to be able to recognize a collision and offer an alternate search mechanism).
  • Folding method: This method divides the original value (digits in this case) into several parts, add the parts together, and then uses the last four digits (or some other arbitrary number of digits that will work) as the hashed value or key.
  • Radix transformation method: Where the value or key is digital, the number base (or radix) can be changed resulting in a different sequence of digits. (For example, a decimal numbered key could be transformed into hexadecimal numbered key). High-order digits could be discarded to fit a hash value of uniform length.
  • Digit rearrangement method: This is simply taking part in the original value or key such as digits in positions 3 through 6, reversing their order, and then using that sequence of digits as the hash value or key.
ALSO READ :  Kakao Klaytn Ahead Of Facebook Libra, Says Kakao CEO

Hash Rate

Hash rate basically means how fast these hashing operations are taking place while mining. A high hash rate means more people and software machines are taking part in the mining process and as a result, the system is running smoothly. If the hash rate is too fast the difficulty level is increased. If the hash rate becomes too slow then the difficulty level is decreased.

Hash Table

Image: Dev.to

An array that stores pointer to records corresponding to a given phone number. An entry in the hash table is NIL if no existing phone number has hash function value equal to the index for the entry.

Collision Handling

Since a hash function gets us a small number for a big key, there is a possibility that two keys result in the same value. The situation where a newly inserted key maps to an already occupied slot in the hash table is called collision and must be handled using some collision handling technique.

Following are the ways to handle collisions:

  • Chaining: The idea is to make each cell of hash table point to a linked list of records that have the same hash function value. Chaining is simple but requires additional memory outside the table.
  • Open Addressing: In open addressing, all elements are stored in the hash table itself. Each table entry contains either a record or NIL. When searching for an element, we one by one examine table slots until the desired element is found or it is clear that the element is not in the table.
ALSO READ :  Russian Musician Oleg Kenzov Utilized Blockchain For Digital Rights Transfer Of His Song

HashMap

Hashing A Guide For Beginners
Image: Programmer Think

HashMap is a collection class that is designed to store elements as key-value pairs. Maps provide a way of looking up one thing based on the value of another.

The following are some of the HashMap methods:

  • map.get(key) — returns the value associated with that key. If the map does not associate any value with that key then it returns null. Referring to “map.get(key)” is similar to referring to “A[key]” for an array A.
  • map.put(key, value) — adds the key-value pair to the map. This is similar to “A[key] = value” for an array A.
  • map.containsKey(key) — returns true if the map has that key.
  • map.containsValue(value) — returns true if the map has that value.
  • map.keySet() — returns a set of all keys
  • map.values() — returns a collection of all value

Benefits of Hashing

One main use of hashing is to compare two files for equality. Without opening two document files to compare them word-for-word, the calculated hash values of these files will allow the owner to know immediately if they are different. Hashing is also used to verify the integrity of a file after it has been transferred from one place to another, typically in a file backup program like SyncBack. To ensure the transferred file is not corrupted, a user can compare the hash value of both files. If they are the same, then the transferred file is an identical copy. In some situations, an encrypted file may be designed to never change the file size nor the last modification date and time (for example, virtual drive container files). In such cases, it would be impossible to tell at a glance if two similar files are different or not, but the hash values would easily tell these files apart if they are different.

Types of Hashing

There are many different types of hash algorithms such as RipeMD, Tiger, xxhash, and more, but the most common type of hashing used for file integrity checks are MD5, SHA-2, and CRC32.

  • MD5 – An MD5 hash function encodes a string of information and encodes it into a 128-bit fingerprint. MD5 is often used as a checksum to verify data integrity. However, due to its age, MD5 is also known to suffer from extensive hash collision vulnerabilities, but it’s still one of the most widely used algorithms in the world.
  • SHA-2 – SHA-2, developed by the National Security Agency (NSA), is a cryptographic hash function. SHA-2 includes significant changes from its predecessor, SHA-1. The SHA-2 family consists of six hash functions with digests (hash values) that are 224, 256, 384 or 512 bits: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256.
  • CRC32 – A cyclic redundancy check (CRC) is an error-detecting code often used for the detection of accidental changes to data. Encoding the same data string using CRC32 will always result in the same hash output, thus CRC32 is sometimes used as a hash algorithm for file integrity checks. These days, CRC32 is rarely used outside of Zip files.
ALSO READ :  An Unknown Developer Deploys Curve Tokens And Contracts With An Unexpected Launch

Properties of Hashing

Image: REDCOM
  • Deterministic: This means that no matter how many times you parse through a particular input through a hash function, you will always get the same result. This is critical because if you get different hashes every single time, it will be impossible to keep track of the input.
  • Quick computation: The hash function should be capable of returning the hash of input quickly. If the process is not fast enough, then the system simply will not be efficient.
  • Pre-image resistance: What pre-image resistance states are that given H(A) it is infeasible to determine A, where A is the input, and H(A) is the output hash. Notice the use of the word “infeasible” instead of “impossible”. We already know that it is not impossible to determine the original input from its hash value.
  • Small changes in the input change the hash: Even if you make a small change in your input, the changes that will be reflected in the hash will be huge.
  • Collision resistant: Given two different inputs A and B where H(A) and H(B) are their respective hashes, it is infeasible for H(A) to be equal to H(B). What that means is that for the most part, each input will have its own unique hash. 

Conclusion

Hashing is a useful tool to verify files and copied correctly between two resources. It can also be used to check if files are identical without opening and comparing them. Thus, hashing has truly been fundamental in the creation of blockchain technology.

Cover Image: Studytonight


Disclaimer

Crypto News Point a news platform of Digital Notice Media Labs is primarily a regular publication of information, commentary and articles focused extensively on fintech, blockchain technology, cryptocurrency, blockchain-based tokens, cryptocurrency market trends, and trading strategies. We do not provide individually tailored investment advice and does not take a subscriber’s or anyone’s circumstances into consideration when discussing investments, nor is Crypto News Point registered as an investment adviser or broker-dealer in any jurisdiction. Information contained herein is not an offer or solicitation to buy, hold, or sell any digital assets.

Affiliate Disclosure: To help support the work we do here at CNP, we often link to products and deals from around the web. Should you buy some of these, we may get a portion of the sale.

We in generally gather content from the major websites. In every article there is always a clear link and attribution to the source publication. If you have any issue with any of our published content taken from your site, kindly let us know so that we can take appropriate action. In any case, the content of the pages of this website is for your general information and use only. It is subject to change without notice.

You May Also like

A Brief Overview On Wrapped Token

A Brief Overview On Wrapped Token

A wrapped token is an ERC-20 token with a value identical to another asset that it represents, either through a smart contract or by being backed one-to-one with the underlying asset. It is an asset hosted on the...

Blockchain in the Tourism Industry

Blockchain in the Tourism Industry

Tourism can be defined as traveling to a place that is different from your home city or country for various leisure or business purposes, and staying there for some considerable period of time at a length. It becomes...

Ishita Bora

Ishita Bora is a Senior Content Creator at Digital Notice Media Labs with an experience of 1 year. She has completed her Master's Degree in Language and Linguistics in 2019 from Gauhati University, India. Her interest lies in blockchain technology and cryptocurrency space, as she loves writing about blockchain and other blockchain-related articles. Currently, she is working on blockchain-based news, reviews, featured articles, and guides.
Share This

Share This

Share this post with your friends!