🌴208. Implement Trie (Prefix Tree)

Medium | Grind 75 | Sunday July 10

A trie (pronounced as "try") or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.

Implement the Trie class:

  • Trie() Initializes the trie object.

  • void insert(String word) Inserts the string word into the trie.

  • boolean search(String word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.

  • boolean startsWith(String prefix) Returns true if there is a previously inserted string word that has the prefix prefix, and false otherwise.

Input
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output
[null, null, true, false, true, null, true]

Explanation
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // return True
trie.search("app");     // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app");     // return True

If we were to visualize trie in our case of english alphabet, we will have the following.

Drawing
Trie Visualization

Our root will point to 26 english alphabet letters. Each alphabet will inturn point to another 26 alphabets. This will go on until you hit the word with the longest length.

However, in our case, our root will point to a dictionary. This dictionary contains key value pairs for the unique letters we have seen that are the first letter in the word. The key is the letter and the value will be another dictionary that contains the second letter (key) and the value as another dictionary. This will go on until we looked at every word.

For example: let's look at the word 1.car 2.far 3.care 4.Pure

Drawing
Trie structure store the above 4 words

WORD: Car when we first see the word car, we insert a node with value 'C' pointed from root since this is the first time seeing the letter c at first position of the word. The from 'C', we point to the Node 'A' since that is the next letter we encouter. From A we point to node with value 'R'. Now since we reached the end of the word car, the r will point to none since there is no other letter. When we reach letter 'R', we out a marker indication it is end of the word. This is need later when we check if a word exist.

WORD: Far and Pure The word Far and Pure follows the same step as the word car but with its corresponding letter node.

WORD: Care This is where the interesting thing happens. Since we already inserted a Node with value C, we move to the next letter in the word which is 'A' and next to the next pointer in our trie structure. We have seen A before as well so we move to the next letter and the next node. When we reach letter 'R', we we see that R is already there but now, when you move to teh next letter E, we see that there is no letter 'E' following R. So we add a node with value 'E' and there is a pointer coming from 'R'. We also mark this letter 'E' with the end marker to signify it is a word.

This is the inution behind all the operations. You traverse trhough each letters and accordingly carry out the require oreation.

SOLUTION

Given that we need to implement four methods, we will go one at a time.

  1. INITIALIZE We will initialize our class object with next_ which is a dictionary and a boolean endmark. a. dictionary next_: this basically a dictionary that holds all the seen letters following the current letter. This is equivalent to point to the next letter (the arrow in the diagram). When you first initialize it, it is empty because we haven't seen any letters yet so the root is pointing to None/empty. b. endmark: this is a boolean to indicate whether a letter in the tree is an end of the word. When we start reading letters, we will always assume it is not the end of the word until we hit the last letter in out word. Then at that point that letter will have endmark as True.

  2. INSERT When we need to insert a word, we will iterate over every character in the word. When we first start, as in read the first character, we will check if the next dictionary for the root contains the letter. If it does, then we will go along that dictionary. If not, we will create a new Node with value in the next_ dictionary. We repeat this until we reached the end of the word whereby we mark that letter with endmark true.

  3. SEARCH In order to search if again loop through each character in the word and see if the character exist in the trie. NOTE: given that we found every letter of the target word in the trie, we still have to check if the last letter's endmark is true or false. If it is true we know the word exist. However, if the endmark is False, then we only found a subset of another longer word and not the target word.

  4. STARTS_WITH We follow the same process as search. However, this time, we do not care if the endmark is true or false. If the prefix word exist, then we return true regardless of the endmark value. This is because a word is a prefix of itself.

class Node: 
    def __init__(self): 
        self.next_ = {}
        self.endmark = False
        
class Trie:
    def __init__(self):
        self.root = Node() 
        
    def insert(self, word: str) -> None:
        current = self.root
        for i in word: 
            if i not in current.next_: 
                current.next_[i]= Node()
            current = current.next_[i]
        current.endmark = True 
            
            
    def search(self, word: str) -> bool:
        current = self.root
        for i in word: 
            if i in current.next_: 
                current = current.next_[i]
            else: 
                return False
        if current.endmark: 
            return True
        else: 
            return False

        
    def startsWith(self, prefix: str) -> bool:
        current = self.root
        for i in prefix: 
            if i not in current.next_: 
                return False
            current = current.next_[i]
        return True
            
            

Last updated