Is it a good idea to store data as keys in HashMap with empty/null values?

dozer :

I had originally written an ArrayList and stored unique values (usernames, i.e. Strings) in it. I later needed to use the ArrayList to search if a user existed in it. That's O(n) for the search.

My tech lead wanted me to change that to a HashMap and store the usernames as keys in the array and values as empty Strings.

So, in Java -

hashmap.put("johndoe","");

I can see if this user exists later by running -

hashmap.containsKey("johndoe"); 

This is O(1) right?

My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.

My question is, is this a good approach? The efficiency beats ArrayList#contains or an array search in general. It works. My worry is, I haven't seen anyone else do this after a search. I may be missing an obvious issue somewhere but I can't see it.

janos :

Since you have a set of unique values, a Set is the appropriate data structure. You can put your values inside HashSet, an implementation of the Set interface.

My lead said this was a more efficient way to do this and it made sense to me, but it just seemed a bit off to put null/empty as values in the hashmap and store elements in it as keys.

The advice of the lead is flawed. Map is not the right abstraction for this, Set is. A Map is appropriate for key-value pairs. But you don't have values, only keys.

Example usage:

Set<String> users = new HashSet<>(Arrays.asList("Alice", "Bob"));

System.out.println(users.contains("Alice"));
// -> prints true

System.out.println(users.contains("Jack"));
// -> prints false

Using a Map would be awkward, because what should be the type of the values? That question makes no sense in your use case, as you have just keys, not key-value pairs. With a Set, you don't need to ask that, the usage is perfectly natural.

This is O(1) right?

Yes, searching in a HashMap or a HashSet is O(1) amortized worst case, while searching in a List or an array is O(n) worst case.


Some comments point out that a HashSet is implemented in terms of HashMap. That's fine, at that level of abstraction. At the level of abstraction of the task at hand --- to store a collection of unique usernames, using a set is a natural choice, more natural than a map.

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=423546&siteId=1