Interview Questions

Find all duplicates in an array in linear time (v1)
This is a common interview question where you need to write a program to find all duplicates in an array where the numbers in the array are in the range of 0 to n-1 where n is the size of the array. For example: [1, 2, 3, 3] is okay but [1, 2, 6, 3] is not. In this version of the challenge there can be multiple duplicate numbers as well. The algorithm below is commented to explain what each piece of code does, but the general algorithm is:
(1) Loop through the array (2) For each element, find array[absolute(array[i])] in the array and set its value to negative if positive (3) If in step 2 you encounter a negative number, then it means the element at index i in the array is a duplicate

An example

arr = [1, 2, 2, 3, 1] At i = 0 arr[absolute(arr[0])] = 2 which is positive so make it negative arr = [1, -2, 2, 3, 1] At i = 1 arr[absolute(arr[1])] = 2 which is positive so make it negative arr = [1, -2, -2, 3, 1] At i = 2 arr[absolute(arr[2])] = -2 which is negative which means the element originally at index 2 is a duplicate duplicates = [2] At i = 3 arr[absolute(arr[3])] = 3 which is positive so make it negative arr = [1, -2, -2, -3, 1] At i = 4 arr[absolute(arr[4])] = -2 which is negative which means the element originally at index 4 is a duplicate duplicates = [2, 1]

Solution

function duplicates(arr) {

// where we will store our duplicate values
var dups = [];

for (var i = 0; i < arr.length; i++) {

// get element in array
var el = arr[Math.abs(arr[i])];

// element in array is positive so make it negative
if (el > 0) { arr[Math.abs(arr[i])] = -arr[Math.abs(arr[i])]; }

// special case if element is zero
// we set the value at this index to -arr.size as not to
// mix this number up with the others because we know the
// numbers are all in the range of 0 to n-1
else if (el === 0) { arr[Math.abs(arr[i])] = -arr.length; }

// element is negative so it is a duplicate
else {
if (Math.abs(arr[i]) === arr.length) { dups.push(0); }
else { dups.push(Math.abs(arr[i])); }
}

}

return dups;

}

duplicates([0,2,0,1,3,3]);
def duplicates(arr):

# where we will store our duplicate values
dups = []

for i in range(0, len(arr)):

# get element in array and check if array is in bounds
if abs(arr[i]) == len(arr):
el = -1
else:
el = arr[abs(arr[i])]

# element in array is positive so make it negative
if el > 0:
arr[abs(arr[i])] = -arr[abs(arr[i])]

# special case if element is zero
# we set the value at this index to -arr.size as not to
# mix this number up with the others because we know the
# numbers are all in the range of 0 to n-1
elif el == 0:
arr[abs(arr[i])] = -len(arr)

# element is negative so it is a duplicate
else:
if abs(arr[i]) == len(arr):
dups.append(0)
else:
dups.append(abs(arr[i]))

return dups

print duplicates([0, 2, 0, 1, 3, 3])

Running time

This algorithm runs in O(n) time because it only needs to loop through the array once. A problem though, is that this algorithm modifies the original array which may not be what you want. You can always copy the entire array before running the algorithm, which would add to the running time. Copying the entire array would run in linear time though, so the final running time would be O(2n) which is still O(n).

Sources

http://www.careercup.com/question?id=10255343 https://www.glassdoor.ca/Interview/Given-an-input-array-remove-all-any-duplicate-occurrences-and-return-the-array-QTN_1258495.htm http://www.glassdoor.com/Interview/Determine-if-an-array-from-1-n-has-a-duplicate-in-constant-time-and-space-QTN_504470.htm
mrdaniel published this on 11/11/15 |
• +
• 4
• -
• I did this using a hash table in Python - seems to still run in O(n) time (and with less array manipulation), and works for any numbers: def get_dups(array): ht = {} dups = [] for x in array: if x in ht: dups.append(x) else: ht[x] = x return list(set(dups))
• +
• 2
• -
• Since JavaScript has the concept of -0, the code can be significanlty simplified. The test for -0 is 1/x > 0. Evaluates to false if x is -0 and true if normal +0. var abs = (x) => Math.abs(arr[x]); for (var i = 0; i < arr.length; i++) { // get element in array var el = arr[abs(i)]; // element in array is positive so make it negative; handle -0 if (el >= 0 && 1/el > 0) arr[abs(i)] = -arr[abs(i)]; // element is negative so it is a duplicate else dups.push(Math.abs(arr[i])); }
• +
• 1
• -
• @numbergames Don't see how to edit, but need to update the test in my code -- don't need two tests, so if (el >= 0 && 1/el > 0) should be: if (1/el > 0)
• +
• 1
• -
• @karn09 This specific solution solves the challenges of finding duplicates in an array with numbers *only* in the range 0 to n-1 where n is the size of the array. So for [1,2,1,3,6, 5] it wouldn't work because the size of the array is 6 so the range must be 0 to 5, and there is a 6 in the array. Same goes for the second example. The v2 algorithm gives a solution for the more general challenge of finding duplicates in an array of any range of numbers.
• +
• 0
• -
• Java: List<Integer> duplicates = new ArrayList<Integer>(); for (int i = 0; i < ar.length; i++){ for (int a = 0; a < ar.length; a++){ if (a!=i && ar[a] == ar[i]){ duplicates.add(ar[i]); } } } System.out.println(duplicates);
• +
• 0
• -
• Sorry I just realized this was building up to using a hash table.
• +
• 0
• -
• Very good point. I will need to be more careful when reading over the cases. Thank you.
• +
• 0
• -
• There seems to be a problem with this solution. In certain cases it returns a duplicate multiple times and in others will return non-duplicated numbers. duplicates([1,2,1,3,6, 5]); => [1,0] duplicates([0,2,1,3,6, 6]); => [0,0] I think part of the issue is with setting the special case 0 to be equal to the length where arrays contain duplicates that are equal to the length.