I had originally written an ArrayList
and stored unique values (usernames, i.e. Strings
) in it. I later needed to use the ArrayList
to search if a user existed in it. That's O(n)
for the search.
My tech lead wanted me to change that to a HashMap
and store the usernames as keys in the array and values as empty Strings
.
So, in Java -
hashmap.put("johndoe","");
I can see if this user exists later by running -
hashmap.containsKey("johndoe");
This is O(1)
right?
My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.
My question is, is this a good approach? The efficiency beats ArrayList#contains
or an array search in general. It works.
My worry is, I haven't seen anyone else do this after a search. I may be missing an obvious issue somewhere but I can't see it.
Since you have a set of unique values, a Set
is the appropriate data structure. You can put your values inside HashSet
, an implementation of the Set
interface.
My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.
The advice of the lead is flawed. Map
is not the right abstraction for this, Set
is. A Map
is appropriate for key-value pairs. But you don't have values, only keys.
Example usage:
Set<String> users = new HashSet<>(Arrays.asList("Alice", "Bob"));
System.out.println(users.contains("Alice"));
// -> prints true
System.out.println(users.contains("Jack"));
// -> prints false
Using a Map
would be awkward, because what should be the type of the values? That question makes no sense in your use case,
as you have just keys, not key-value pairs.
With a Set
, you don't need to ask that, the usage is perfectly natural.
This is O(1) right?
Yes, searching in a HashMap
or a HashSet
is O(1) amortized worst case, while searching in a List
or an array is O(n) worst case.
Some comments point out that a HashSet
is implemented in terms of HashMap
.
That's fine, at that level of abstraction.
At the level of abstraction of the task at hand ---
to store a collection of unique usernames,
using a set is a natural choice, more natural than a map.