11.3 Set: HashSet, LinkedHashSet, TreeSet

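A Set is a collection that directly models the mathematical concept of a set: it does not allow duplicates. If you try to store the same value twice, the second add is silently ignored. It is mainly used to check whether an item exists, or to keep only unique data with no duplicates.

1. Set Interface Characteristics

Feature	Description
No duplicates	An equal value is stored only once
No order	Applies to HashSet. Iteration order may differ from insertion order
null allowed	HashSet can store a single null
Fast lookup	HashSet is O(1) on average, TreeSet is O(log n)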

import java.util.HashSet;
import java.util.Set;

Set<String> set = new HashSet<>();
set.add("apple");
set.add("banana");
set.add("apple"); // duplicate: ignored
System.out.println(set.size()); // 2 (duplicate removed)
System.out.println(set);        // e.g. [banana, apple]; order is not guaranteed

2. HashSet

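HashSet is the most commonly used Set implementation. It does not guarantee any iteration order (elements may come out in a different order than they went in). Internally it is implemented on top of a HashMap.

Key Methods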

import java.util.HashSet;
import java.util.Iterator;

public class HashSetMethods {
    public static void main(String[] args) {
        HashSet<String> tags = new HashSet<>();

        // ========== Add ==========
        tags.add("Java");
        tags.add("Spring");
        tags.add("MySQL");
        tags.add("Java"); // duplicate: returns false and is ignored
        boolean added = tags.add("Python");
        System.out.println("Python added: " + added);       // true
        boolean dup = tags.add("Java");
        System.out.println("Java added again: " + dup);     // false

        System.out.println("tags: " + tags);
        System.out.println("size: " + tags.size());         // 4

        // ========== Query ==========
        System.out.println("contains Java?: " + tags.contains("Java"));  // true
        System.out.println("contains Go?: " + tags.contains("Go"));      // false
        System.out.println("is empty?: " + tags.isEmpty());              // false

        // ========== Remove ==========
        tags.remove("MySQL");
        System.out.println("after remove: " + tags);

        // ========== Iterate ==========
        System.out.println("=== for-each iteration ===");
        for (String tag : tags) {
            System.out.println("- " + tag);
        }

        System.out.println("=== Iterator iteration ===");
        Iterator<String> it = tags.iterator();
        while (it.hasNext()) {
            System.out.println("- " + it.next());
        }

        // ========== Clear ==========
        tags.clear();
        System.out.println("isEmpty after clear: " + tags.isEmpty()); // true
    }
}
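
The iteration loops above only read elements. Removing elements while iterating needs more care: calling `tags.remove(...)` inside a for-each loop usually fails with a `ConcurrentModificationException`. The sketch below shows two safe alternatives, `Iterator.remove()` and `removeIf`. The class and method names (`SafeRemovalSketch`, `removeShortWithIterator`, `removeShortWithRemoveIf`) are made up for illustration and are not part of the original example.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class SafeRemovalSketch {

    // Hypothetical helper: drops every tag shorter than minLength via Iterator.remove().
    static Set<String> removeShortWithIterator(Set<String> source, int minLength) {
        Set<String> tags = new HashSet<>(source); // work on a copy, leave the input alone
        Iterator<String> it = tags.iterator();
        while (it.hasNext()) {
            if (it.next().length() < minLength) {
                it.remove(); // removes the element most recently returned by next()
            }
        }
        return tags;
    }

    // Hypothetical helper: same result using Collection.removeIf (Java 8+).
    static Set<String> removeShortWithRemoveIf(Set<String> source, int minLength) {
        Set<String> tags = new HashSet<>(source);
        tags.removeIf(tag -> tag.length() < minLength);
        return tags;
    }

    public static void main(String[] args) {
        Set<String> tags = new HashSet<>(List.of("Java", "Go", "Spring", "C"));
        System.out.println(removeShortWithIterator(tags, 3).size()); // 2
        System.out.println(removeShortWithRemoveIf(tags, 3).size()); // 2
    }
}
```

In most code `removeIf` is the shorter option; the explicit iterator is mainly useful when the loop body does more than filtering.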

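Why equals and hashCode Matter

HashSet decides whether two elements are duplicates by using both hashCode() and equals(). A class must override both methods for objects with the same content to be treated as duplicates.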

import java.util.HashSet;
import java.util.Objects;

public class HashSetEqualsExample {

    // A class that does not override equals/hashCode
    static class PointBad {
        int x, y;
        PointBad(int x, int y) { this.x = x; this.y = y; }
    }

    // A class that overrides equals/hashCode correctly
    static class PointGood {
        int x, y;
        PointGood(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof PointGood)) return false;
            PointGood p = (PointGood) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y); // hash code derived from x and y
        }

        @Override
        public String toString() { return "(" + x + "," + y + ")"; }
    }

    public static void main(String[] args) {
        // Problem: without equals/hashCode, objects with the same content count as different
        HashSet<PointBad> badSet = new HashSet<>();
        badSet.add(new PointBad(1, 1));
        badSet.add(new PointBad(1, 1)); // same content, but a different object!
        System.out.println("PointBad size: " + badSet.size()); // 2 (duplicate not removed!)

        // Fix: override equals/hashCode
        HashSet<PointGood> goodSet = new HashSet<>();
        goodSet.add(new PointGood(1, 1));
        goodSet.add(new PointGood(1, 1)); // same content: treated as a duplicate
        System.out.println("PointGood size: " + goodSet.size()); // 1 (duplicate removed)
        System.out.println("PointGood contents: " + goodSet); // [(1,1)]
    }
}

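Rules for equals and hashCode

When you put user-defined objects into a HashSet (or use them as HashMap keys), you must override equals() and hashCode() together.

If equals() returns true for two objects, their hashCode() values must also be equal.
Two objects can have the same hashCode() even when equals() returns false (a hash collision).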
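
The second rule can be observed with ordinary strings. As a small sketch: `"Aa"` and `"BB"` happen to produce the same `String.hashCode()` value, yet they are not equal, so a HashSet keeps both. The class name `HashCollisionSketch` and the helper `distinctCount` are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class HashCollisionSketch {

    // Hypothetical helper: how many distinct elements a HashSet keeps for the given values.
    static int distinctCount(String... values) {
        Set<String> set = new HashSet<>(Arrays.asList(values));
        return set.size();
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";

        System.out.println(a.hashCode() == b.hashCode()); // true  (same hash code)
        System.out.println(a.equals(b));                  // false (different content)

        // equals() has the final say, so both strings are stored.
        System.out.println(distinctCount(a, b));          // 2
        // A true duplicate is still rejected.
        System.out.println(distinctCount(a, b, "Aa"));    // 2
    }
}
```

A collision only costs a little lookup time; correctness depends on equals(), which is why the two methods must agree.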

3. LinkedHashSet

LinkedHashSet behaves like HashSet but also preserves insertion order. Internally it maintains an additional doubly linked list over its entries.

import java.util.LinkedHashSet;

public class LinkedHashSetExample {
    public static void main(String[] args) {
        LinkedHashSet<String> langs = new LinkedHashSet<>();

        langs.add("Python");
        langs.add("Java");
        langs.add("Go");
        langs.add("Java");    // duplicate: ignored
        langs.add("Kotlin");

        // Printed in insertion order
        System.out.println(langs); // [Python, Java, Go, Kotlin]
        System.out.println("size: " + langs.size()); // 4

        // Use it when you need both deduplication and insertion order
        System.out.println("contains Java? " + langs.contains("Java")); // true
    }
}

LinkedHashSet vs HashSet

Order does not matter and only speed counts: use HashSet
Deduplication plus insertion order: use LinkedHashSet
LinkedHashSet uses slightly more memory than HashSet.
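
One detail of "insertion order" is worth seeing in code: re-adding an element that is already present does not move it, while removing it and adding it again puts it at the end. The sketch below illustrates this; `InsertionOrderSketch` and its two helper methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class InsertionOrderSketch {

    // Re-adding an existing element leaves its position unchanged.
    static List<String> reAddExisting() {
        LinkedHashSet<String> set = new LinkedHashSet<>(List.of("a", "b", "c"));
        set.add("a"); // already present: returns false, order untouched
        return new ArrayList<>(set);
    }

    // Removing and then adding counts as a fresh insertion at the end.
    static List<String> removeThenAdd() {
        LinkedHashSet<String> set = new LinkedHashSet<>(List.of("a", "b", "c"));
        set.remove("a");
        set.add("a");
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(reAddExisting());  // [a, b, c]
        System.out.println(removeThenAdd());  // [b, c, a]
    }
}
```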

4. TreeSet

TreeSet keeps its elements sorted automatically. Internally it uses a red-black tree (a self-balancing binary search tree). By default it sorts in ascending (natural) order, and you can customize the ordering with a Comparator.

Basic Usage

import java.util.TreeSet;

public class TreeSetBasic {
    public static void main(String[] args) {
        TreeSet<Integer> scores = new TreeSet<>();
        scores.add(85);
        scores.add(92);
        scores.add(70);
        scores.add(100);
        scores.add(85); // duplicate: ignored

        System.out.println("sorted scores: " + scores); // [70, 85, 92, 100]

        // NavigableSet features
        System.out.println("min: " + scores.first());    // 70
        System.out.println("max: " + scores.last());     // 100
        System.out.println("smallest value >= 80: " + scores.ceiling(80));  // 85
        System.out.println("largest value <= 90: " + scores.floor(90));     // 85
        System.out.println("smallest value > 85: " + scores.higher(85));    // 92
        System.out.println("largest value < 85: " + scores.lower(85));      // 70

        // Range views: subSet, tailSet, headSet
        System.out.println("80 to 95 inclusive: " + scores.subSet(80, true, 95, true)); // [85, 92]
        System.out.println("greater than 90: " + scores.tailSet(90, false));            // [92, 100]
        System.out.println("90 or less: " + scores.headSet(90, true));                  // [70, 85]
    }
}
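
The navigation methods above always found a match. When no element qualifies, `ceiling`, `floor`, `higher`, and `lower` return `null`, which can cause a `NullPointerException` if the result is unboxed straight into an `int`. The following sketch shows one way to guard against that, along with `descendingSet()` and `pollFirst()`. The names `TreeSetNavigationSketch`, `sampleScores`, and `ceilingOrDefault` are illustrative only.

```java
import java.util.List;
import java.util.TreeSet;

public class TreeSetNavigationSketch {

    // Same scores as in the basic example above.
    static TreeSet<Integer> sampleScores() {
        return new TreeSet<>(List.of(85, 92, 70, 100));
    }

    // Hypothetical helper: smallest score >= min, or fallback when none exists.
    static int ceilingOrDefault(TreeSet<Integer> scores, int min, int fallback) {
        Integer found = scores.ceiling(min); // null when nothing qualifies
        return (found != null) ? found : fallback;
    }

    public static void main(String[] args) {
        TreeSet<Integer> scores = sampleScores();

        System.out.println(scores.ceiling(101));               // null
        System.out.println(ceilingOrDefault(scores, 101, -1)); // -1
        System.out.println(scores.descendingSet());            // [100, 92, 85, 70]
        System.out.println(scores.pollFirst());                // 70 (removed from the set)
        System.out.println(scores);                            // [85, 92, 100]
    }
}
```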

Custom Ordering with Comparator

import java.util.Comparator;
import java.util.TreeSet;

public class TreeSetComparator {
    public static void main(String[] args) {
        // Sort strings by length descending, then alphabetically for equal lengths
        TreeSet<String> words = new TreeSet<>(
            Comparator.comparingInt(String::length).reversed()
                      .thenComparing(Comparator.naturalOrder())
        );

        words.add("banana");
        words.add("apple");
        words.add("fig");
        words.add("cherry");
        words.add("kiwi");

        System.out.println(words); // [banana, cherry, apple, kiwi, fig]
        // Lengths: banana(6), cherry(6), apple(5), kiwi(4), fig(3)
        // Equal lengths fall back to alphabetical order: banana < cherry
    }
}
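
The alphabetical tie-breaker in the example above is not only cosmetic. TreeSet treats two elements as duplicates whenever the comparator returns 0, regardless of equals(). A sketch of what happens without a tie-breaker, using the hypothetical names `ComparatorTieSketch` and `sortedWith`:

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class ComparatorTieSketch {

    // Hypothetical helper: builds a TreeSet with the given ordering and adds the words in list order.
    static TreeSet<String> sortedWith(Comparator<String> order, List<String> words) {
        TreeSet<String> set = new TreeSet<>(order);
        set.addAll(words);
        return set;
    }

    public static void main(String[] args) {
        List<String> words = List.of("kiwi", "pear", "fig");

        // Length only: "kiwi" and "pear" compare as 0, so "pear" is dropped as a duplicate.
        Comparator<String> lengthOnly = Comparator.comparingInt(String::length);
        System.out.println(sortedWith(lengthOnly, words));      // [fig, kiwi]

        // Length, then natural order: ties are broken, so all three words survive.
        Comparator<String> lengthThenAlpha = lengthOnly.thenComparing(Comparator.naturalOrder());
        System.out.println(sortedWith(lengthThenAlpha, words)); // [fig, kiwi, pear]
    }
}
```

If distinct elements must never be merged, the comparator should return 0 only for elements that are also equal.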

5. Set Operations (Union, Intersection, Difference)

One of the most powerful features of Set is that it makes mathematical set operations easy to perform.

import java.util.HashSet;
import java.util.Set;

public class SetOperations {
    public static void main(String[] args) {
        Set<String> groupA = new HashSet<>(Set.of("Java", "Python", "Go", "Rust"));
        Set<String> groupB = new HashSet<>(Set.of("Python", "Kotlin", "Go", "Swift"));

        System.out.println("Group A: " + groupA);
        System.out.println("Group B: " + groupB);

        // Union: A ∪ B
        Set<String> union = new HashSet<>(groupA);
        union.addAll(groupB);
        System.out.println("Union (A ∪ B): " + union);
        // Java, Python, Go, Rust, Kotlin, Swift (element order may vary)

        // Intersection: A ∩ B
        Set<String> intersection = new HashSet<>(groupA);
        intersection.retainAll(groupB);
        System.out.println("Intersection (A ∩ B): " + intersection);
        // Python, Go (element order may vary)

        // Difference: A - B
        Set<String> difference = new HashSet<>(groupA);
        difference.removeAll(groupB);
        System.out.println("Difference (A - B): " + difference);
        // Java, Rust (element order may vary)

        // Symmetric difference: (A ∪ B) - (A ∩ B)
        Set<String> symDiff = new HashSet<>(union);
        symDiff.removeAll(intersection);
        System.out.println("Symmetric difference: " + symDiff);
        // Java, Rust, Kotlin, Swift (element order may vary)

        // Subset check: is groupA a subset of union?
        System.out.println("A ⊆ union? " + union.containsAll(groupA)); // true
    }
}

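Summary of Set Operation Methods

Operation	Method	Description
Union	`addAll(other)`	Adds every element of other
Intersection	`retainAll(other)`	Removes every element not in other
Difference	`removeAll(other)`	Removes every element that is in other
Subset	`containsAll(other)`	Checks whether every element of other is contained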
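
Note that `addAll`, `retainAll`, and `removeAll` modify the set they are called on, which is why the example above copies `groupA` first. One way to make that copy-first pattern reusable is a set of small generic helpers, sketched below. `SetOps`, `union`, `intersection`, and `difference` are hypothetical names, not JDK methods.

```java
import java.util.HashSet;
import java.util.Set;

public class SetOps {

    // Each helper copies the first set, so neither input is modified.
    static <T> Set<T> union(Set<? extends T> a, Set<? extends T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    static <T> Set<T> intersection(Set<? extends T> a, Set<? extends T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    static <T> Set<T> difference(Set<? extends T> a, Set<? extends T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> groupA = Set.of("Java", "Python", "Go", "Rust");
        Set<String> groupB = Set.of("Python", "Kotlin", "Go", "Swift");

        System.out.println(union(groupA, groupB).size());        // 6
        System.out.println(intersection(groupA, groupB).size()); // 2
        System.out.println(difference(groupA, groupB).size());   // 2
    }
}
```

Because the inputs are never mutated, these helpers also work when the arguments are immutable sets such as those returned by `Set.of`.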

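6. Set Implementation Comparison

Criterion	HashSet	LinkedHashSet	TreeSet
Internal structure	Hash table	Hash table + linked list	Red-black tree
Order	None	Insertion order	Sorted order
Duplicates	Not allowed	Not allowed	Not allowed
`add` performance	O(1) average	O(1) average	O(log n)
`contains` performance	O(1) average	O(1) average	O(log n)
null allowed	One null	One null	Not allowed (natural ordering)
Range queries	No	No	Yes (NavigableSet)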
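
The "null allowed" row can be checked directly. The sketch below tries to add `null` to each implementation; the TreeSet here uses natural ordering, which is the case where `null` is rejected with a `NullPointerException`. `NullElementSketch` and `acceptsNull` are hypothetical names.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

public class NullElementSketch {

    // Hypothetical helper: reports whether the given set accepts a null element.
    static boolean acceptsNull(Set<String> set) {
        try {
            set.add(null);
            return true;
        } catch (NullPointerException e) {
            return false; // thrown by TreeSet with natural ordering
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNull(new HashSet<>()));       // true
        System.out.println(acceptsNull(new LinkedHashSet<>())); // true
        System.out.println(acceptsNull(new TreeSet<>()));       // false

        // "One null" means a second null is just another duplicate.
        Set<String> set = new HashSet<>();
        set.add(null);
        set.add(null);
        System.out.println(set.size());                         // 1
    }
}
```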

7. Practical Example 1: Removing Duplicate Emails

import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class DuplicateEmailRemover {
    public static void main(String[] args) {
        // Signup request emails (including duplicates)
        List<String> signupEmails = new ArrayList<>();
        signupEmails.add("alice@example.com");
        signupEmails.add("bob@example.com");
        signupEmails.add("alice@example.com");   // duplicate
        signupEmails.add("charlie@example.com");
        signupEmails.add("bob@example.com");     // duplicate
        signupEmails.add("diana@example.com");

        System.out.println("signup list (" + signupEmails.size() + " entries): " + signupEmails);

        // LinkedHashSet: removes duplicates and keeps the signup order
        LinkedHashSet<String> uniqueEmails = new LinkedHashSet<>(signupEmails);
        System.out.println("after deduplication (" + uniqueEmails.size() + " entries): " + uniqueEmails);

        // Convert back to a List for further use
        List<String> finalList = new ArrayList<>(uniqueEmails);
        System.out.println("final signups: " + finalList);

        // Emails that are already registered
        LinkedHashSet<String> registeredEmails = new LinkedHashSet<>();
        registeredEmails.add("alice@example.com");
        registeredEmails.add("eve@example.com");

        // New emails = requested emails - already registered emails
        LinkedHashSet<String> newEmails = new LinkedHashSet<>(uniqueEmails);
        newEmails.removeAll(registeredEmails);
        System.out.println("new signups to process: " + newEmails);
        // [bob@example.com, charlie@example.com, diana@example.com]
    }
}

8. Practical Example 2: Finding Common Interests Between Two Groups

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class CommonInterests {
    public static void main(String[] args) {
        // Interests of user A
        Set<String> userA = new HashSet<>(Arrays.asList(
            "reading", "movies", "coding", "cooking", "travel", "photography"
        ));

        // Interests of user B
        Set<String> userB = new HashSet<>(Arrays.asList(
            "music", "coding", "travel", "exercise", "cooking", "games"
        ));

        System.out.println("user A interests: " + userA);
        System.out.println("user B interests: " + userB);

        // Common interests (intersection)
        Set<String> common = new HashSet<>(userA);
        common.retainAll(userB);
        System.out.println("common interests: " + new TreeSet<>(common));
        // [coding, cooking, travel] (sorted by wrapping in a TreeSet)

        // Interests only A has
        Set<String> onlyA = new HashSet<>(userA);
        onlyA.removeAll(userB);
        System.out.println("only A: " + new TreeSet<>(onlyA));

        // Interests only B has
        Set<String> onlyB = new HashSet<>(userB);
        onlyB.removeAll(userA);
        System.out.println("only B: " + new TreeSet<>(onlyB));

        // Match score (common interests / all interests * 100)
        Set<String> allInterests = new HashSet<>(userA);
        allInterests.addAll(userB);
        double matchScore = (double) common.size() / allInterests.size() * 100;
        System.out.printf("match score: %.1f%%%n", matchScore);
    }
}

9. Immutable Sets with Set.of() (Java 9+)

import java.util.Set;

public class ImmutableSet {
    public static void main(String[] args) {
        // Create an immutable Set (duplicates and null are not allowed)
        Set<String> weekdays = Set.of("Mon", "Tue", "Wed", "Thu", "Fri");
        System.out.println(weekdays);
        System.out.println("contains Mon? " + weekdays.contains("Mon")); // true
        // weekdays.add("Sat");    // UnsupportedOperationException!
        // weekdays.remove("Mon"); // UnsupportedOperationException!

        // Set.copyOf() (Java 10+): copy an existing Set into an immutable one
        java.util.HashSet<String> mutable = new java.util.HashSet<>();
        mutable.add("Java");
        mutable.add("Python");
        Set<String> immutableCopy = Set.copyOf(mutable);
        System.out.println(immutableCopy); // Java and Python (element order may vary)

        // Changing mutable does not affect immutableCopy
        mutable.add("Go");
        System.out.println("mutable: " + mutable);             // Java, Python, Go (order may vary)
        System.out.println("immutableCopy: " + immutableCopy); // still only Java and Python
    }
}
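
The comment "duplicates and null are not allowed" means that `Set.of` fails at creation time rather than silently ignoring the bad input, unlike `HashSet.add`. The sketch below records which exception each case produces. `ImmutableSetPitfalls` and `outcome` are hypothetical names used only for this illustration.

```java
import java.util.Set;
import java.util.function.Supplier;

public class ImmutableSetPitfalls {

    // Hypothetical helper: runs the action and reports "ok" or the exception's simple name.
    static String outcome(Supplier<Object> action) {
        try {
            action.get();
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Duplicate arguments are rejected when the set is created.
        System.out.println(outcome(() -> Set.of("Java", "Java")));        // IllegalArgumentException
        // null elements are rejected as well.
        System.out.println(outcome(() -> Set.of("Java", (String) null))); // NullPointerException
        // Any modification attempt fails on the finished set.
        System.out.println(outcome(() -> Set.of("Java").add("Go")));      // UnsupportedOperationException
        // Distinct, non-null elements work as expected.
        System.out.println(outcome(() -> Set.of("Java", "Go")));          // ok
    }
}
```

When the input may contain duplicates, one option is to collect into a HashSet first and then call `Set.copyOf` on the result.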

Pro Tip: Set Performance Considerations

import java.util.EnumSet;

public class SetAdvancedTips {
    // EnumSet for use with enums: bit-vector based and typically the fastest Set for enum types
    enum Permission { READ, WRITE, EXECUTE, DELETE }

    public static void main(String[] args) {
        // EnumSet: a Set dedicated to enum types, very fast (bit operations)
        EnumSet<Permission> adminPerms = EnumSet.allOf(Permission.class);
        EnumSet<Permission> readOnlyPerms = EnumSet.of(Permission.READ);

        System.out.println("admin permissions: " + adminPerms);   // [READ, WRITE, EXECUTE, DELETE]
        System.out.println("read-only: " + readOnlyPerms);        // [READ]

        // Intersection: which permissions do both sets share?
        EnumSet<Permission> commonPerms = EnumSet.copyOf(adminPerms);
        commonPerms.retainAll(readOnlyPerms);
        System.out.println("common permissions: " + commonPerms); // [READ]
    }
}
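
Beyond `allOf` and `of`, EnumSet offers a few other factory methods that are handy for permission-style flags. A brief sketch, reusing the same `Permission` constants as above; the class name `EnumSetFactoriesSketch` and its helper methods are hypothetical.

```java
import java.util.EnumSet;

public class EnumSetFactoriesSketch {

    enum Permission { READ, WRITE, EXECUTE, DELETE }

    // Everything except READ.
    static EnumSet<Permission> nonReadPermissions() {
        return EnumSet.complementOf(EnumSet.of(Permission.READ));
    }

    // A contiguous range of constants in declaration order, both ends inclusive.
    static EnumSet<Permission> writeToExecute() {
        return EnumSet.range(Permission.WRITE, Permission.EXECUTE);
    }

    public static void main(String[] args) {
        System.out.println(nonReadPermissions());   // [WRITE, EXECUTE, DELETE]
        System.out.println(writeToExecute());       // [WRITE, EXECUTE]

        // Start empty and grant permissions one by one.
        EnumSet<Permission> granted = EnumSet.noneOf(Permission.class);
        granted.add(Permission.DELETE);
        granted.add(Permission.READ);
        System.out.println(granted);                // [READ, DELETE] (declaration order)
    }
}
```

EnumSet always iterates in the order the constants are declared in the enum, not in insertion order.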

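Set Selection Guide Summary

Simple deduplication, fast lookup → HashSet (the most common choice)
Deduplication + insertion order → LinkedHashSet
Deduplication + automatic sorting + range queries → TreeSet
Sets of enum constants → EnumSet (typically the fastest)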
