π©721. Accounts Merge
Medium | Grind 75 | January 30th Monday, 2023
Given a list of accounts
where each element accounts[i]
is a list of strings, where the first element accounts[i][0]
is a name, and the rest of the elements are emails representing emails of the account.
Now, we would like to merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name.
After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order. The accounts themselves can be returned in any order.
Constraints:
1 <= accounts.length <= 1000
2 <= accounts[i].length <= 10
1 <= accounts[i][j].length <= 30
accounts[i][0]
consists of English letters.accounts[i][j] (for j > 0)
is a valid email.
Input: accounts = [["John","johnsmith@mail.com","john_newyork@mail.com"],["John","johnsmith@mail.com","john00@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]]
Output: [["John","john00@mail.com","john_newyork@mail.com","johnsmith@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]]
Explanation:
The first and second John's are the same person as they have the common email "johnsmith@mail.com".
The third John and Mary are different people as none of their email addresses are used by other accounts.
We could return these lists in any order, for example the answer [['Mary', 'mary@mail.com'], ['John', 'johnnybravo@mail.com'],
['John', 'john00@mail.com', 'john_newyork@mail.com', 'johnsmith@mail.com']] would still be accepted.
SOLUTION
Disjoint set union basically takes two sets with a commonality and combines them into a set with unique/disjointed values. This is exactly the data structure/function we need here. We have different sets each containing various emails and if the sets share a common email, we need to merge the two sets.
Part 1: DSU
INITIALIZE(): We initialize every element(a list containing the emails) in the accounts list as its own parent. We will use the index of the element to indicate the parent value. The idea is to merge these parents when an email is common between two elements.
FIND(): The find method gives us the root parent of an element. If two elements have the same root parent, then it means the elements have been merged or in our context, the two accounts belong to the same person.
UNION(): The union method is what 'merges' the two sets. In order to achieve the merge action, we set the root parent of element B as element A. In traditional DSU implementation, the root parent selection happens based on each parent's rank for optimization, but this is not needed in our case. This is because, even if one element's parent has a higher rank, the element with a lower rank could have several more emails compared to the high-ranking element.
PART 2: merge the accounts
We first iterate over the elements in the accounts list. Each element contains a name and emails. we iterate over the emails and see if the email has been seen before.
We initialize a dictionary called emails that will store key-value pairs made up of email and their parent value.
if the email has not been seen before, then we add the current email as a key and the index as a value to our emails dictionary.
if the email has been seen before, then we need to merge the parents of the two elements containing the seen email. In order to do that, we need to find the root parent of the previously seen email and the current index which will be the parent for the current email. Then we set the parent of the current element as the previously seen parent.
After iterating over every account and the emails inside it, the parents of the elements sharing a common email will have been merged.
We now iterate over the emails dictionary.
We initialize a dictionary of sets called merged that will now store all the emails belonging to the same root parent as a value for one key value.
Every time we iterate over the key-value pairs in the emails dictionary, we get the root parent by using the find method under DSU class. We then use this root parent as the key and the email as a value and add it to our new dictionary.
The new dictionary now contains all the merged emails with their root parent.
We use the key of the merged dictionary as an index in the original list(accounts) to fetch the name of the person.
Additionally, we also sort the emails for each key in the merged dictionary.
We now take the name and sorted emails and add them to a list and return it.
class DSU:
def __init__(self, size:int):
# initialize every index as its own parent
self.parent = [i for i in range(size)]
#NOTE: it doesn't make sense to have rank in this problem given how we use the union.
def find(self, x:int) -> int:
''' return the root parent of the element '''
root = x
while root != self.parent[root]:
# path compression
self.parent[root] = self.parent[self.parent[root]]
root = self.parent[root]
return root
def union(self, a:int, b:int) -> None:
''' union the parent of two elements '''
# get the root parent of the elements
a_parent, b_parent = self.find(a), self.find(b)
# two elements already have the same parent
if a_parent == b_parent: return
self.parent[b_parent] = a_parent
class Solution:
def accountsMerge(self, accounts: List[List[str]]) -> List[List[str]]:
dsu = DSU(len(accounts))
emails = {}
for index, account in enumerate(accounts):
for mail in account[1:]:
# previously seen so merge current index and previous seen index
if mail in emails:
parent = dsu.find(emails[mail])
dsu.union(index,parent)
else:
emails[mail] = index
# merge emails with same root parent into one list
merged = defaultdict(set)
for mail, parent in emails.items():
root = dsu.find(parent)
merged[root].add(mail)
# convert dict of set to list and append the person name
res = []
for index, value in merged.items():
name = [accounts[index][0]]
sorted_email = sorted(value)
res.append(name + sorted_email)
return res
TIME AND SPACE COMPLEXITY
TIME: Not quite sure if the runtime is correct: O(n * log n * log n)
1. find: O(log n)
where n is the number of emails.
2. Union: O(log n)
3. accountsMerge: O(n*log n * log n)
- You iterate over every email once and while iterating, you perform the find and union every single time in the worst case.
**would really appreciate it if someone knows the runtime well and can explain it to me. I am not super confident on this one.
SPACE: O(n)
where n is the total number of unique emails.
1. parent [list]: O(n) where n is the total number of unique emails.
2. emails [dictionary]: O(n)
3. merged [dictionary]: O(n)
4. res [list]: O(n)
So in total, we use O(4n)
space.
Last updated