HashSet handles duplicates through the hashCode() and equals() methods. When an object is added to a HashSet, its hashCode() determines the bucket it is stored in. If that bucket already holds other elements, equals() is used to check whether any of them is equal to the new object; if one is, the new object is not added. For a custom class to be deduplicated correctly, you must ① override hashCode() so that objects with the same content return the same hash value; ② override equals() to define the logical equality of the object; ③ keep the two consistent by basing them on the same fields. Common errors include overriding only one of the two methods, mutating an object after it has been added so that its hash value changes, and logical inconsistency between the two methods.
HashSet in Java handles duplicates by using the hashCode() and equals() methods to determine whether an object is already present in the set.
How HashSet Works Internally
Java's HashSet is backed by a HashMap. When you add an element to a HashSet, it's stored as a key in the internal HashMap, with a dummy value (typically a static PRESENT object). Since keys in a HashMap must be unique, this ensures that no duplicate elements can exist in a HashSet.
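The delegation described above can be sketched in a few lines. This is a simplified illustration, not the JDK source: MiniHashSet is a hypothetical name, and only add, contains, and size are shown.

```java
import java.util.HashMap;

// A minimal sketch of a set that delegates to a HashMap, in the spirit of HashSet.
// MiniHashSet is a hypothetical name used only for illustration.
class MiniHashSet<E> {
    // Shared dummy value stored against every key.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put() returns the previous value for the key, or null if there was none,
    // so add() returns true only when the element was not already present.
    boolean add(E element) {
        return map.put(element, PRESENT) == null;
    }

    boolean contains(Object element) {
        return map.containsKey(element);
    }

    int size() {
        return map.size();
    }
}

class MiniHashSetDemo {
    public static void main(String[] args) {
        MiniHashSet<String> set = new MiniHashSet<>();
        System.out.println(set.add("a")); // true
        System.out.println(set.add("a")); // false
        System.out.println(set.size());   // 1
    }
}
```

Because uniqueness is enforced by the map's keys, the set itself needs no duplicate-checking logic of its own.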
When adding an element:
- The hashCode() method of the object is called to compute a hash value.
- This hash value determines the bucket location in the underlying array of the HashMap.
- If there are other elements in the same bucket (due to hash collision), the equals() method is used to check if the current object is equal to any existing one.
- If a matching object is found, the new object is not added, ensuring uniqueness.
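To see the role of equals() under a collision, the sketch below forces every element into the same bucket. CollidingKey is a hypothetical class whose constant hashCode() is chosen purely for demonstration; a real class should not hash this way.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical key type whose hashCode() is deliberately constant,
// so every instance lands in the same bucket.
final class CollidingKey {
    final String id;

    CollidingKey(String id) {
        this.id = id;
    }

    @Override
    public int hashCode() {
        return 42; // forces a collision for every pair of keys
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof CollidingKey && ((CollidingKey) obj).id.equals(id);
    }
}

class CollisionDemo {
    public static void main(String[] args) {
        Set<CollidingKey> set = new HashSet<>();
        System.out.println(set.add(new CollidingKey("x"))); // true
        System.out.println(set.add(new CollidingKey("y"))); // true: same bucket, but not equal
        System.out.println(set.add(new CollidingKey("x"))); // false: equals() found a match
        System.out.println(set.size());                     // 2
    }
}
```

The return value of add() reports whether the set changed, which makes it a convenient way to detect a duplicate at insertion time.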
What Makes Two Objects Duplicates?
For two objects to be considered duplicates in a HashSet, they must:
- Return the same value from hashCode()
- Return true when compared via the equals() method
This means that if you're storing custom objects in a HashSet, you need to override both hashCode() and equals() in your class to ensure proper behavior. Otherwise, the default implementations from Object will be used, which consider two different instances as distinct even if their contents are the same.
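A small contrast may help here. String overrides both methods, so two distinct String instances with equal content collapse into one entry. HashOnly, a hypothetical class written for this sketch, overrides only hashCode() and keeps the identity-based equals() from Object, so its instances are still treated as different.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class that overrides hashCode() but keeps Object's identity-based equals().
final class HashOnly {
    final String label;

    HashOnly(String label) {
        this.label = label;
    }

    @Override
    public int hashCode() {
        return label.hashCode();
    }
}

class OverrideBothDemo {
    // String overrides both methods, so equal content means one entry.
    static int distinctStrings() {
        Set<String> set = new HashSet<>();
        set.add(new String("Alice"));
        set.add(new String("Alice")); // distinct instance, equal content
        return set.size();
    }

    // Same hash value, but equals() still compares references, so both are kept.
    static int distinctHashOnly() {
        Set<HashOnly> set = new HashSet<>();
        set.add(new HashOnly("Alice"));
        set.add(new HashOnly("Alice"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctStrings());  // 1
        System.out.println(distinctHashOnly()); // 2
    }
}
```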
Example: Custom Object in HashSet
Suppose we have a simple Person class:
    class Person {
        String name;

        Person(String name) {
            this.name = name;
        }
    }
Now let's try using it in a HashSet:
    Set<Person> people = new HashSet<>();
    people.add(new Person("Alice"));
    people.add(new Person("Alice"));
    System.out.println(people.size()); // prints 2
Even though both Person("Alice") objects seem the same, since we didn't override hashCode() and equals(), they are treated as separate entries.
To fix that, we'd need to update the Person class:
    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Person)) return false;
        Person other = (Person) obj;
        return this.name.equals(other.name);
    }
After this change, adding two Person("Alice") instances would result in only one being stored.
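For reference, here is one way the complete class might look when assembled into a runnable program. It is a sketch rather than the only correct form: it swaps in java.util.Objects helpers so that a null name does not throw, and it makes the field final, both of which are assumptions beyond the snippets above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class Person {
    private final String name; // final, so the hash value cannot change later

    Person(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Person)) return false;
        // Objects.equals tolerates a null name on either side.
        return Objects.equals(name, ((Person) obj).name);
    }

    @Override
    public int hashCode() {
        // Uses the same field as equals(); returns 0 for a null name.
        return Objects.hashCode(name);
    }
}

class PersonDemo {
    public static void main(String[] args) {
        Set<Person> people = new HashSet<>();
        people.add(new Person("Alice"));
        people.add(new Person("Alice"));
        System.out.println(people.size()); // prints 1
    }
}
```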
Common Pitfalls
Here are a few things to watch out for when working with HashSet and duplicates:
- Forgetting to override both hashCode() and equals(): doing just one won't work correctly.
- Mutating an object after adding it to the HashSet, especially if the mutation affects the fields used in hashCode() and equals(). This may cause the object to become "lost" in the set.
- Inconsistent logic between hashCode() and equals(), e.g., using different fields in each method.
Summary
HashSet prevents duplicates by relying on the uniqueness of keys in a HashMap. It uses the hashCode() method to find where an object should go, and then equals() to confirm whether it's already there.
If you're using custom classes, always remember to override both hashCode() and equals() properly.
Basically that's it.
