자바 자료구조 스터디 5주차

이 md 파일은 Typora 에서 작성되었습니다. 읽으실 때, Typora 에서 읽으시는 것이 가장 보기에 좋습니다.

우선순위 큐(Priority Queue)와 힙(Heap)

우선순위 큐의 이해

'큐의 구현'과 '우선순위 큐의 구현'에는 많은 차이가 있습니다.

우선순위 큐와 우선순위의 이해

큐와 우선순위 큐의 공통점은 데이터를 삽입하고, 꺼내는 연산이 있다는 것입니다.

차이점은 그 연산의 결과가 서로 다르다는 점입니다.

이 차이는 '우선순위'에서 나옵니다. 우선순위가 있느냐 없느냐가 자료구조의 성질을 정의합니다.

이 우선순위는 전적으로 프로그래머가 결정하게 됩니다. 정렬되는 기준을 프로그래머가 설정합니다.

우선순위 큐의 구현 방법

배열을 기반으로 구현하는 방법
연결 리스트를 기반으로 구현하는 방법
힙(heap)을 이용하는 방법

구현방법은 배열이 가장 성능상 문제를 많이 가지고 있습니다. 배열은 삽입 연산이 O(n)이고, 삭제 연산도 O(n)이기 때문에, 우선순위 별로 저장하는데에는 적합하지 않습니다.

배열과 리스트의 공통적인 문제로는 삽입시에 우선순위를 확인하는데 O(n)이 걸린다는 것입니다.

그래서 힙(Heap)을 사용합니다.

힙(Heap)의 소개

힙은 '이진 트리'이되 '완전 이진 트리'입니다. 또한 모든 노드에 저장된 값이 자식 노드에 저장된 값보다 크거나 같을 수 있고, 작거나 같을 수 있습니다. 이는 각각 최대 힙(max heap), 최소 힙(min heap)으로 구분됩니다.

힙의 뜻은 '무엇인가를 차곡차곡 쌓아 올린 더미'라고 합니다.

힙의 구현과 우선순위 큐의 완성

힙에서의 데이터 저장과정

힙의 구현을 위해서는 데이터의 저장과 삭제의 방법을 알아야 합니다.

최소 힙을 기준으로 했을 때, 마지막 위치에 일단 저장을 하고, 부모 노드와 값을 비교해서 맞는 위치를 찾는 것이 저장의 과정입니다.

힙에서의 데이터 삭제과정

힙에서 삭제를 할 때, 우선순위 큐임을 고려해서 루트 노드를 제거합니다. 이 때, 루트노드를 어떻게 다시 채울 것인지가 삭제의 가장 큰 로직입니다.

여기서는 마지막 노드를 루트노드로 올리고, 자식노드 중 우선순위가 높은 노드와 위치를 변경합니다.

우선순위가 낮은 노드와 변경하면 힙의 기본 조건이 무너지기 때문에, 주의해야 합니다.

삽입과 삭제의 과정에서 보인 성능의 평가

힙의 시간 복잡도는 저장과 삭제 모두 $O(log_2n)$ 입니다.

따라서, 우선순위 큐를 구현할 때, 배열과 연결 리스트에 비해서 평균적으로 더 좋은 성능을 보임을 알 수 있습니다.

힙의 구현에 어울리는 것은 연결 리스트? 아니면 배열?

여기서는 구현말고 구현 방법에 대해서 다룹니다. 구현은 트리구조지만, 트리는 배열 또는 연결 리스트로 구현할 수 있습니다.

특이하게도 힙은 배열을 기반으로 구현해야합니다. 힙의 구현은 배열로 구현하는 것이 원칙으로 받아들여지고 있습니다.

왜냐하면 연결 리스트를 기반으로 했을 때, 새 노드를 마지막 위치에 추가하는 것이 쉽지 않기 때문이라고 합니다.

배열을 기반으로 힙을 구현하는데 필요한 지식들

노드에 고유한 번호를 부여하고, 그곳이 곧 인덱스다.
0번 노드는 구현의 편의를 위해 비워둔다.
왼쪽 자식 노드, 오른쪽 자식 노드의 인덱스 값을 얻는 방법을 알아야 한다.
또, 자식 노드에서 부모 노드의 인덱스 값을 얻는 방법을 알아야 한다.
방법
- 왼쪽 자식 노드의 인덱스 값: 부모 노드의 인덱스 값 X 2
- 오른쪽 자식 노드의 인덱스 값: 부모 노드의 인덱스 값 X 2 + 1
- 부모 노드의 인덱스 값: 자식 노드의 인덱스 값 / 2

원리 이해 중심의 힙 구현: ADT 정의

힙은 완전 이진 트리입니다.
힙의 구현은 배열을 기반으로 하며 인덱스가 0인 요소는 비워둡니다.
힙에 저장된 노드의 개수와 마지막 노드의 고유번호는 일치합니다.
노드의 고유번호가 노드가 저장되는 배열의 인덱스 값이 됩니다.
우선순위를 나타내는 정수 값이 작을수록 높은 우선순위를 나타낸다고 가정합니다.

SimpleHeap.java

/**
 * 우선순위를 기준으로 하는 Heap 자료구조
 *
 * @param <E> 타입 파라미터
 */
public interface SimpleHeap<E> {

  /**
   * 해당 Heap이 비어있는지 여부를 반환합니다.
   *
   * @return 해당 Heap이 비어있으면 true, 아니라면 false
   */
  boolean isEmpty();

  /**
   * 해당 Heap에 데이터를 저장합니다.
   *
   * @param data     저장할 데이터
   * @param priority 저장할 데이터의 우선순위(작을 수록 높은 우선순위)
   */
  void insert(E data, int priority);

  /**
   * 해당 Heap의 최우선순위 데이터를 제거합니다.
   *
   * @return 힙에 저장되어 있던 최우선순위 데이터
   */
  E delete();
}

원리 이해 중심의 힙 구현: 테스트 코드 작성

SimpleHeapTest.java

class SimpleHeapTest {

  @Test
  @DisplayName("힙의_초기화_테스트")
  void 힙의_초기화_테스트() {
    SimpleHeap<Character> charHeap = new ArraySimpleHeap<>();

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isTrue();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("힙_저장_테스트")
  void 힙_저장_테스트(char firstElem, char secondElem, char thirdElem) {
    SimpleHeap<Character> charHeap = new ArraySimpleHeap<>();

    charHeap.insert(firstElem, 1);
    charHeap.insert(secondElem, 2);
    charHeap.insert(thirdElem, 3);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("힙_저장_후_제거_테스트")
  void 힙_저장_후_제거_테스트(char firstElem, char secondElem, char thirdElem) {
    SimpleHeap<Character> charHeap = new ArraySimpleHeap<>();

    charHeap.insert(firstElem, 1);
    charHeap.insert(secondElem, 2);
    charHeap.insert(thirdElem, 3);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
    assertThat(charHeap.delete()).isEqualTo(firstElem);
    assertThat(charHeap.delete()).isEqualTo(secondElem);
    assertThat(charHeap.delete()).isEqualTo(thirdElem);
    assertThat(charHeap.isEmpty()).isTrue();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("힙_사용_테스트")
  void 힙_사용_테스트(char firstElem, char secondElem, char thirdElem) {
    SimpleHeap<Character> charHeap = new ArraySimpleHeap<>();

    charHeap.insert(firstElem, 1);
    charHeap.insert(secondElem, 2);
    charHeap.insert(thirdElem, 3);

    assertThat(charHeap.delete()).isEqualTo('A');

    charHeap.insert(firstElem, 1);
    charHeap.insert(secondElem, 2);
    charHeap.insert(thirdElem, 3);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
    assertThat(charHeap.delete()).isEqualTo(firstElem);
    assertThat(charHeap.delete()).isEqualTo(secondElem);
    assertThat(charHeap.delete()).isEqualTo(secondElem);
    assertThat(charHeap.delete()).isEqualTo(thirdElem);
    assertThat(charHeap.delete()).isEqualTo(thirdElem);
    assertThat(charHeap.isEmpty()).isTrue();
  }
}

원리 이해 중심의 힙 구현: 클래스 작성

ArraySimpleHeap.java

public class ArraySimpleHeap<E> implements SimpleHeap<E> {

  private static final int INITIAL_CAPACITY = 100 + 1;

  private int numOfData;
  private HeapElement<E>[] heapArr;

  public ArraySimpleHeap() {
    this.heapArr = new HeapElement[INITIAL_CAPACITY];
  }

  @Override
  public boolean isEmpty() {
    return numOfData == 0;
  }

  @Override
  public void insert(E data, int priority) {

  }

  @Override
  public E delete() {
    return null;
  }
  
  private static class HeapElement<T> {
    
    private final T data;
    private final int priority;

    public HeapElement(T data, int priority) {
      this.data = data;
      this.priority = priority;
    }
  }
}

클래스를 초기화했습니다.

INITIAL_CAPACITY가 100 + 1인 이유는 첫 인덱스는 사용하지 않기 때문입니다.

원리 이해 중심의 힙 구현: private 메소드 작성

/**
 * 부모 노드의 인덱스 반환 요청
 *
 * @param currentIndex 현재 자식 노드의 인덱스
 * @return 부모노드의 인덱스
 */
private int getParentIndex(int currentIndex) {
  return currentIndex / 2;
}

/**
  * 왼쪽 자식 노드의 인덱스 반환 요청
  *
  * @param parentIndex 부모 노드의 인덱스
  * @return 왼쪽 자식 노드의 인덱스
  */
private int getLeftChildIndex(int parentIndex) {
  return parentIndex * 2;
}

/**
  * 오른쪽 자식 노드의 인덱스 반환 요청
  *
  * @param parentIndex 부모 노드의 인덱스
  * @return 오른쪽 자식 노드의 인덱스
  */
private int getRightChildIndex(int parentIndex) {
  return parentIndex * 2 + 1;
}

/**
  * 두 개의 자식 노드 중 우선순위가 더 높은 자식의 인덱스 반환 요청
  *
  * @param currentIndex 판단의 기준이 되는 노드
  * @return 우선순위가 더 높은 자식 노드의 인덱스
  */
private int getHighPriorityChildIndex(int currentIndex) {
  int leftChildIndex = this.getLeftChildIndex(currentIndex);
  if (leftChildIndex > this.numOfData) { // 존재하지 않는 노드라면
    return 0;
  }
  if (leftChildIndex == this.numOfData) { // 왼쪽노드가 마지막 노드라면
    return leftChildIndex;
  }

  int rightChildIndex = this.getRightChildIndex(currentIndex);
  int leftChildNodePriority = this.heapArr[leftChildIndex].priority;
  int rightChildNodePriority = this.heapArr[rightChildIndex].priority;

  // 우선순위는 낮은것이 더 우위이므로 왼쪽노드가 더 우선순위가 낮다면
  if (leftChildNodePriority > rightChildNodePriority) {
    return rightChildIndex;
  }
  return leftChildIndex;
}

내부 구현에서 쓰일 private 메소드를 선언해주었습니다.

원리 이해 중심의 힙 구현: 삽입 로직 작성

@Override
public void insert(E data, int priority) {
  int index = numOfData + 1;
  HeapElement<E> nextElement = new HeapElement<>(data, priority);

  while (index != 1) {
    int parentIndex = getParentIndex(index);

    // 부모 노드와 비교했을 때, 부모노드보다 우선순위가 높다면 서로의 위치를 바꾼다.
    if (priority < this.heapArr[parentIndex].priority) {
      this.heapArr[index] = this.heapArr[parentIndex];
      index = parentIndex;
    } else {
      break;
    }
  }

  this.heapArr[index] = nextElement;
  this.numOfData += 1;
}

삽입하는 과정은 맨 마지막에 저장하고 부모 노드와 비교하면서 서로의 위치를 바꾸는 구조입니다.

원리 이해 중심의 힙 구현: 삭제 로직 작성

@Override
public E delete() {
  int rootIndex = 1;
  E retData = this.heapArr[rootIndex].data; // 루트노드에 존재하던 데이터
  HeapElement<E> lastElement = this.heapArr[this.numOfData]; // 마지막 노드

  int parentIndex = rootIndex; // 루트로 옮기는 것을 의미
  int childIndex = getHighPriorityChildIndex(parentIndex); // 더 우선순위인 자식노드

  while (childIndex > 0) { // 부모노드가 단말노드가 아니라면
    if (lastElement.priority <= this.heapArr[childIndex].priority) { // 마지막 노드가 우선순위가 높다면
      break;
    }
    this.heapArr[parentIndex] = this.heapArr[childIndex]; // 자식 노드와 부모노드의 위치를 변경
    parentIndex = childIndex;
    childIndex = getHighPriorityChildIndex(parentIndex);
  }

  this.heapArr[parentIndex] = lastElement; // 마지막에 위치했던 노드를 한 번에 옮긴다.
  this.numOfData -= 1;
  return retData;
}

루트에 있던 데이터를 제거하고, 마지막 노드를 루트 노드로 옮긴 상태를 가정합니다.

그리고 나서 자식노드 중 더 우선순위인 노드와 우선순위를 비교해서, 힙의 조건을 만족하도록 노드의 위치를 조정합니다.

제법 쓸만한 수준의 힙 구현: 힙의 변경

프로그래머가 우선순위의 판단 기준을 힙에 설정할 수 있어야 합니다.

일단 우선순위를 넣는 것이 없어야 하고, 대신 Comparator 를 넣을 수 있으면 요구사항을 만족하겠네요!

제법 쓸만한 수준의 힙 구현: Interface 정의

UsefulHeap.java

/**
 * Comparator를 이용한 쓸만한 Heap 자료구조
 *
 * @param <E> 타입 파라미터
 */
public interface UsefulHeap<E> {

  /**
   * 해당 Heap이 비어있는지 여부를 반환합니다.
   *
   * @return 해당 Heap이 비어있으면 true, 아니라면 false
   */
  boolean isEmpty();

  /**
   * 해당 Heap에 데이터를 저장합니다. 우선순위는 Comparator로 결정합니다.
   *
   * @param data 저장할 데이터
   */
  void insert(E data);

  /**
   * 해당 Heap의 최우선순위 데이터를 제거합니다.
   *
   * @return 힙에 저장되어 있던 최우선순위 데이터
   */
  E delete();
}

제법 쓸만한 수준의 힙 구현: Test 코드 작성

UsefulHeapTest.java

class UsefulHeapTest {

  @Test
  @DisplayName("쓸만한_힙의_초기화_테스트")
  void 쓸만한_힙의_초기화_테스트() {
    UsefulHeap<Character> charHeap = new ArrayUsefulHeap<>((o1, o2) -> o2 - o1);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isTrue();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("쓸만한_힙_저장_테스트")
  void 쓸만한_힙_저장_테스트(char firstElem, char secondElem, char thirdElem) {
    UsefulHeap<Character> charHeap = new ArrayUsefulHeap<>((o1, o2) -> o2 - o1);

    charHeap.insert(firstElem);
    charHeap.insert(secondElem);
    charHeap.insert(thirdElem);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("쓸만한_힙_저장_후_제거_테스트")
  void 쓸만한_힙_저장_후_제거_테스트(char firstElem, char secondElem, char thirdElem) {
    UsefulHeap<Character> charHeap = new ArrayUsefulHeap<>((o1, o2) -> o2 - o1);

    charHeap.insert(firstElem);
    charHeap.insert(secondElem);
    charHeap.insert(thirdElem);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
    assertThat(charHeap.delete()).isEqualTo(firstElem);
    assertThat(charHeap.delete()).isEqualTo(secondElem);
    assertThat(charHeap.delete()).isEqualTo(thirdElem);
    assertThat(charHeap.isEmpty()).isTrue();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("쓸만한_힙_사용_테스트")
  void 쓸만한_힙_사용_테스트(char firstElem, char secondElem, char thirdElem) {
    UsefulHeap<Character> charHeap = new ArrayUsefulHeap<>((o1, o2) -> o2 - o1);

    charHeap.insert(firstElem);
    charHeap.insert(secondElem);
    charHeap.insert(thirdElem);

    assertThat(charHeap.delete()).isEqualTo('A');

    charHeap.insert(firstElem);
    charHeap.insert(secondElem);
    charHeap.insert(thirdElem);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
    assertThat(charHeap.delete()).isEqualTo(firstElem);
    assertThat(charHeap.delete()).isEqualTo(secondElem);
    assertThat(charHeap.delete()).isEqualTo(secondElem);
    assertThat(charHeap.delete()).isEqualTo(thirdElem);
    assertThat(charHeap.delete()).isEqualTo(thirdElem);
    assertThat(charHeap.isEmpty()).isTrue();
  }
}

제법 쓸만한 수준의 힙 구현: 클래스 작성

ArrayUsefulHeap.java

public class ArrayUsefulHeap<E> implements UsefulHeap<E> {

  private static final int INITIAL_CAPACITY = 100 + 1;

  private int numOfData;
  private Object[] heapArr;
  // 첫번째 인자가 크면 양수, 작으면 음수, 같으면 0을 반환한다.
  private Comparator<E> comp;

  public ArrayUsefulHeap(Comparator<E> comparator) {
    this.heapArr = new Object[INITIAL_CAPACITY];
    this.comp = comparator;
  }

  @Override
  public boolean isEmpty() {
    return this.numOfData == 0;
  }

  @Override
  public void insert(E data) {

  }

  @Override
  public E delete() {
    return null;
  }
}

Comparator를 받는 부분과 내장 클래스가 제거된 부분을 제외하면 동일합니다.

제법 쓸만한 수준의 힙 구현: 클래스 구현

@Override
public void insert(E data) {
  int index = numOfData + 1;

  while (index != 1) {
    int parentIndex = getParentIndex(index);

    // 부모 노드와 비교했을 때, 부모노드보다 우선순위가 높다면 서로의 위치를 바꾼다.
    if (comp.compare(data, (E) this.heapArr[parentIndex]) > 0) {
      this.heapArr[index] = this.heapArr[parentIndex];
      index = parentIndex;
    } else {
      break;
    }
  }

  this.heapArr[index] = data;
  this.numOfData += 1;
}

@Override
public E delete() {
  int rootIndex = 1;
  E retData = (E) this.heapArr[rootIndex]; // 루트노드에 존재하던 데이터
  E lastElement = (E) this.heapArr[this.numOfData]; // 마지막 노드

  int parentIndex = rootIndex; // 루트로 옮기는 것을 의미
  int childIndex = getHighPriorityChildIndex(parentIndex); // 더 우선순위인 자식노드

  while (childIndex > 0) { // 부모노드가 단말노드가 아니라면
    if (comp.compare(lastElement, (E) this.heapArr[childIndex]) >= 0) { // 마지막 노드가 우선순위가 높다면
      break;
    }
    this.heapArr[parentIndex] = this.heapArr[childIndex]; // 자식 노드와 부모노드의 위치를 변경
    parentIndex = childIndex;
    childIndex = getHighPriorityChildIndex(parentIndex);
  }

  this.heapArr[parentIndex] = lastElement; // 마지막에 위치했던 노드를 한 번에 옮긴다.
  this.numOfData -= 1;
  return retData;
}

/**
   * 부모 노드의 인덱스 반환 요청
   *
   * @param currentIndex 현재 자식 노드의 인덱스
   * @return 부모노드의 인덱스
   */
private int getParentIndex(int currentIndex) {
  return currentIndex / 2;
}

/**
   * 왼쪽 자식 노드의 인덱스 반환 요청
   *
   * @param parentIndex 부모 노드의 인덱스
   * @return 왼쪽 자식 노드의 인덱스
   */
private int getLeftChildIndex(int parentIndex) {
  return parentIndex * 2;
}

/**
   * 오른쪽 자식 노드의 인덱스 반환 요청
   *
   * @param parentIndex 부모 노드의 인덱스
   * @return 오른쪽 자식 노드의 인덱스
   */
private int getRightChildIndex(int parentIndex) {
  return parentIndex * 2 + 1;
}

/**
   * 두 개의 자식 노드 중 우선순위가 더 높은 자식의 인덱스 반환 요청
   *
   * @param currentIndex 판단의 기준이 되는 노드
   * @return 우선순위가 더 높은 자식 노드의 인덱스
   */
private int getHighPriorityChildIndex(int currentIndex) {
  int leftChildIndex = this.getLeftChildIndex(currentIndex);
  if (leftChildIndex > this.numOfData) { // 존재하지 않는 노드라면
    return 0;
  }
  if (leftChildIndex == this.numOfData) { // 왼쪽노드가 마지막 노드라면
    return leftChildIndex;
  }

  int rightChildIndex = this.getRightChildIndex(currentIndex);
  E leftChildNode = (E) this.heapArr[leftChildIndex];
  E rightChildNode = (E) this.heapArr[rightChildIndex];

  // 우선순위는 낮은것이 더 우위이므로 왼쪽노드가 더 우선순위가 낮다면
  if (comp.compare(leftChildNode, rightChildNode) < 0) {
    return rightChildIndex;
  }
  return leftChildIndex;
}

달라진 부분은 책을 참고해주세요!

제법 쓸만한 수준의 힙을 이용한 우선순위 큐의 구현: 인터페이스 선언

ADT를 보고 인터페이스를 먼저 정의해봅시다.

PriorityQueue.java

/**
 * 우선순위 큐 인터페이스
 *
 * @param <E> 타입 파라미터
 */
public interface PriorityQueue<E> {

  /**
   * 해당 우선순위 큐가 비어있는지 여부를 반환합니다.
   *
   * @return 해당 우선순위 큐가 비어있는지 여부
   */
  boolean isEmpty();

  /**
   * 우선순위 큐에 데이터를 저장합니다.
   *
   * @param data 저장할 데이터
   */
  void enqueue(E data);

  /**
   * 우선순위 큐에서 우선순위가 가장 높은 데이터를 제거합니다.<br>이 메소드는 데이터가 최소 1개 이상 있을 때, 동작합니다.
   *
   * @return 우선순위가 가장 높은 데이터
   */
  E dequeue();
}

제법 쓸만한 수준의 힙을 이용한 우선순위 큐의 구현: 테스트 코드 작성

PriorityQueueTest.java

class PriorityQueueTest {

  @Test
  @DisplayName("우선순위_큐_초기화_테스트")
  void 우선순위_큐_초기화_테스트() {
    PriorityQueue<Character> priorityQueue = new HeapPriorityQueue<>((o1, o2) -> o2 - o1);

    assertThat(priorityQueue).isNotNull();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("우선순위_큐_저장_테스트")
  void 우선순위_큐_저장_테스트(char firstElem, char secondElem, char thirdElem) {
    PriorityQueue<Character> charHeap = new HeapPriorityQueue<>((o1, o2) -> o2 - o1);

    charHeap.enqueue(firstElem);
    charHeap.enqueue(secondElem);
    charHeap.enqueue(thirdElem);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("우선순위_큐_저장_후_제거_테스트")
  void 우선순위_큐_저장_후_제거_테스트(char firstElem, char secondElem, char thirdElem) {
    PriorityQueue<Character> charHeap = new HeapPriorityQueue<>((o1, o2) -> o2 - o1);

    charHeap.enqueue(firstElem);
    charHeap.enqueue(secondElem);
    charHeap.enqueue(thirdElem);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
    assertThat(charHeap.dequeue()).isEqualTo(firstElem);
    assertThat(charHeap.dequeue()).isEqualTo(secondElem);
    assertThat(charHeap.dequeue()).isEqualTo(thirdElem);
    assertThat(charHeap.isEmpty()).isTrue();
  }

  @ParameterizedTest
  @CsvSource({"'A','B','C'"})
  @DisplayName("우선순위_큐_사용_테스트")
  void 우선순위_큐_사용_테스트(char firstElem, char secondElem, char thirdElem) {
    PriorityQueue<Character> charHeap = new HeapPriorityQueue<>((o1, o2) -> o2 - o1);

    charHeap.enqueue(firstElem);
    charHeap.enqueue(secondElem);
    charHeap.enqueue(thirdElem);

    assertThat(charHeap.dequeue()).isEqualTo('A');

    charHeap.enqueue(firstElem);
    charHeap.enqueue(secondElem);
    charHeap.enqueue(thirdElem);

    assertThat(charHeap).isNotNull();
    assertThat(charHeap.isEmpty()).isFalse();
    assertThat(charHeap.dequeue()).isEqualTo(firstElem);
    assertThat(charHeap.dequeue()).isEqualTo(secondElem);
    assertThat(charHeap.dequeue()).isEqualTo(secondElem);
    assertThat(charHeap.dequeue()).isEqualTo(thirdElem);
    assertThat(charHeap.dequeue()).isEqualTo(thirdElem);
    assertThat(charHeap.isEmpty()).isTrue();
  }

  @Test
  @DisplayName("우선순위_큐_제거시_에러_테스트")
  void 우선순위_큐_제거시_에러_테스트() {
    PriorityQueue<Character> charHeap = new HeapPriorityQueue<>((o1, o2) -> o2 - o1);

    assertThat(charHeap).isNotNull();
    assertThatThrownBy(charHeap::dequeue).isInstanceOf(EmptyQueueException.class);
    assertThat(charHeap.isEmpty()).isTrue();
  }
}

제법 쓸만한 수준의 힙을 이용한 우선순위 큐의 구현: 클래스 구현

HeapPriorityQueue.java

public class HeapPriorityQueue<E> implements PriorityQueue<E> {

  private final UsefulHeap<E> heap;

  public HeapPriorityQueue(Comparator<E> comparator) {
    this.heap = new ArrayUsefulHeap<>(comparator);
  }

  @Override
  public boolean isEmpty() {
    return this.heap.isEmpty();
  }

  @Override
  public void enqueue(E data) {
    this.heap.insert(data);
  }

  @Override
  public E dequeue() {
    if (isEmpty()) {
      throw new EmptyQueueException();
    }
    return this.heap.delete();
  }
}

힙을 이용해서 간단하게 구현할 수 있었습니다!

정렬(Sorting)

Sort.java

public interface Sort {

  void sort(int[] array);
}

정렬은 테스트 코드를 어떻게 짤지 고민하다가, 다음과 같이 테스트를 작성했습니다.

class SortTest {

  long beforeTime;
  long afterTime;

  @BeforeEach
  void setUp() {
    beforeTime = System.currentTimeMillis();
  }

  @AfterEach
  void tearDown() {
    afterTime = System.currentTimeMillis();
    System.out.println("===========>실행시간(ms): " + (afterTime - beforeTime));
  }

  @Test
  @DisplayName("버블_정렬_테스트")
  void 버블_정렬_테스트() {
    Sort sort = new BubbleSort();

    // Repeated Test 사용시 각각 실행시간이 찍혀서 어쩔 수 없이 반복 (1억번)
    for (int i = 0; i < 100_000_000; i++) {
      int[] actual = {1, 4, 5, 2, 3, 7, 6};
      int[] expected = {1, 2, 3, 4, 5, 6, 7};
      sort.sort(actual);

      assertThat(actual).isEqualTo(expected);
    }
  }

  @Test
  @DisplayName("선택_정렬_테스트")
  void 선택_정렬_테스트() {
    Sort sort = new SelectionSort();

    // Repeated Test 사용시 각각 실행시간이 찍혀서 어쩔 수 없이 반복 (1억번)
    for (int i = 0; i < 100_000_000; i++) {
      int[] actual = {1, 4, 5, 2, 3, 7, 6};
      int[] expected = {1, 2, 3, 4, 5, 6, 7};
      sort.sort(actual);

      assertThat(actual).isEqualTo(expected);
    }
  }

  @Test
  @DisplayName("삽입_정렬_테스트")
  void 삽입_정렬_테스트() {
    Sort sort = new InsertionSort();

    // Repeated Test 사용시 각각 실행시간이 찍혀서 어쩔 수 없이 반복 (1억번)
    for (int i = 0; i < 100_000_000; i++) {
      int[] actual = {1, 4, 5, 2, 3, 7, 6};
      int[] expected = {1, 2, 3, 4, 5, 6, 7};
      sort.sort(actual);

      assertThat(actual).isEqualTo(expected);
    }
  }

  @Test
  @DisplayName("힙_정렬_테스트")
  void 힙_정렬_테스트() {
    Sort sort = new HeapSort();

    // Repeated Test 사용시 각각 실행시간이 찍혀서 어쩔 수 없이 반복 (1억번)
    for (int i = 0; i < 100_000_000; i++) {
      int[] actual = {1, 4, 5, 2, 3, 7, 6};
      int[] expected = {1, 2, 3, 4, 5, 6, 7};
      sort.sort(actual);

      assertThat(actual).isEqualTo(expected);
    }
  }

  @Test
  @DisplayName("병합_정렬_테스트")
  void 병합_정렬_테스트() {
    Sort sort = new MergeSort();

    // Repeated Test 사용시 각각 실행시간이 찍혀서 어쩔 수 없이 반복 (1억번)
    for (int i = 0; i < 100_000_000; i++) {
      int[] actual = {1, 4, 5, 2, 3, 7, 6};
      int[] expected = {1, 2, 3, 4, 5, 6, 7};
      sort.sort(actual);

      assertThat(actual).isEqualTo(expected);
    }
  }

  @Test
  @DisplayName("퀵_정렬_테스트")
  void 퀵_정렬_테스트() {
    Sort sort = new QuickSort();

    // Repeated Test 사용시 각각 실행시간이 찍혀서 어쩔 수 없이 반복 (1억번)
    for (int i = 0; i < 100_000_000; i++) {
      int[] actual = {1, 4, 5, 2, 3, 7, 6};
      int[] expected = {1, 2, 3, 4, 5, 6, 7};
      sort.sort(actual);

      assertThat(actual).isEqualTo(expected);
    }
  }
}

컴퓨터 성능에 따라 적절히 조정하시면 될 것 같습니다.

단순한 정렬 알고리즘

버블 정렬(Bubble Sort): 이해와 구현

버블 정렬은 인접한 두 개의 데이터를 비교해가면서 정렬을 진행하는 방식입니다.

BubbleSort.java

public class BubbleSort implements Sort {

  @Override
  public void sort(int[] array) {
    int length = array.length;
    int temp;
    for (int i = 0; i < length - 1; i++) {
      for (int j = 0; j < length - i - 1; j++) {
        if (array[j] > array[j + 1]) {
          temp = array[j];
          array[j] = array[j + 1];
          array[j + 1] = temp;
        }
      }
    }
  }
}

제 컴퓨터에서 테스트를 실행시켜보니 약 4.5초가 걸렸습니다.

버블 정렬(Bubble Sort): 성능 평가

구리다. $O(n^2)$

사실 학습용 정렬입니다.

선택 정렬(Selection Sort): 이해와 구현

제일 작은 것을 찾아서 앞에서부터 채우는 방식입니다.

SelectionSort.java

public class SelectionSort implements Sort {

  @Override
  public void sort(int[] array) {
    int length = array.length;
    int maxIndex;
    int temp;
    for (int i = 0; i < length - 1; i++) {
      maxIndex = i;
      for (int j = i + 1; j < length; j++) { // 최솟값 탐색
        if (array[j] < array[maxIndex]) {
          maxIndex = j;
        }
      }

      // swap
      temp = array[i];
      array[i] = array[maxIndex];
      array[maxIndex] = temp;
    }
  }
}

선택 정렬(Selection Sort): 성능 평가

$O(n^2)$

역시 비슷한 속도를 보임을 알 수 있습니다.

삽입 정렬(Insertion Sort): 이해와 구현

선택 된 원소를 삽입하는 식의 정렬입니다.

InsertionSort.java

public class InsertionSort implements Sort {

  @Override
  public void sort(int[] array) {
    int length = array.length;
    int insertData;

    for (int i = 1; i < length; i++) {
      insertData = array[i];
      int j;

      for (j = i - 1; j >= 0; j--) { // 정렬된 배열의 뒷부분 부터 비교해서 삽입위치 탐색
        if (array[j] > insertData) { // 삽입할 데이터가 작으면
          array[j + 1] = array[j]; // 위치 변경
        } else {
          break; // 해당 위치에 삽입되어야 하므로 탈출
        }
      }

      array[j + 1] = insertData;
    }
  }
}

삽입 정렬(Insertion Sort): 성능 평가

삽입 정렬은 이미 정렬되어 있는 원소가 많을 경우 매우 빠릅니다.

하지만 최악의 경우 $O(n^2)$ 임은 변함이 없습니다.

복잡하지만 효율적인 정렬 알고리즘

이제부터는 좀 쓸만한 정렬 알고리즘이 나옵니다.

대신 어렵습니다.

힙 정렬(Heap Sort): 이해와 구현

힙 자료구조를 정렬에 사용하는 형태입니다.

힙 자료구조가 사용되기 때문에 공간복잡도는 조금 높아집니다.

HeapSort.java

public class HeapSort implements Sort {

  @Override
  public void sort(int[] array) {
    this.sort(array, ((o1, o2) -> o2 - o1));
  }

  private void sort(int[] array, Comparator<Integer> comp) {
    UsefulHeap<Integer> heap = new ArrayUsefulHeap<>(comp);

    // 정렬 대상을 가지고 힙을 구성한다.
    for (int i : array) {
      heap.insert(i);
    }

    // 힙에서 하나씩 꺼내서 정렬된 구조로 반환한다.
    for (int i = 0; i < array.length; i++) {
      array[i] = heap.delete();
    }
  }
}

힙 정렬(Heap Sort): 성능 평가

직접 돌려보면, 그렇게 좋아보이지 않습니다.

하지만, 힙 자료구조의 특성으로 인하여, $O(nlogn)$ 이 됩니다.

오래 걸리는 이유는 정렬이 필요한 데이터가 적기 때문입니다.

병합 정렬(Merge Sort): 이해와 구현

이 알고리즘은 분할 정복(divide and conquer)기법을 알아야 합니다.

뒤에 나올 퀵 소트도 마찬가지 입니다.

설명은 책을 읽어보세요. 복잡한 문제를 단순한 문제로 쪼개서 문제를 풀어나가는 방법입니다.

보통 재귀를 많이 사용합니다.

병합 정렬의 탈출 조건은 정렬할 필요가 없을 때, 탈출하게 됩니다.

MergeSort.java

public class MergeSort implements Sort {

  @Override
  public void sort(int[] array) {
    mergeSort(array, 0, array.length - 1);
  }

  private void mergeSort(int[] array, int left, int right) {
    if (left < right) { // 더 나눌 수 있다면
      int mid = left + ((right - left) / 2); // https://endorphin0710.tistory.com/112 참조

      mergeSort(array, left, mid);
      mergeSort(array, mid + 1, right);

      mergeTwoArea(array, left, mid , right);
    }
  }

  private void mergeTwoArea(int[] array, int left, int mid, int right) {
    int[] temp = new int[right + 1];
    int fIndex = left;
    int rIndex = mid + 1;
    int sIndex = left;

    while (fIndex <= mid && rIndex <= right) { // 병합할 영역을 비교해 temp에 담는다.
      if (array[fIndex] <= array[rIndex]) {
        temp[sIndex] = array[fIndex++];
      } else {
        temp[sIndex] = array[rIndex++];
      }

      sIndex++;
    }

    if (fIndex > mid) { // 남은 데이터를 옮깁니다.
      for (int i = rIndex; i <= right; i++, sIndex++) {
        temp[sIndex] = array[i];
      }
    } else {
      for (int i = fIndex; i <= mid; i++, sIndex++) {
        temp[sIndex] = array[i];
      }
    }

    for (int i = left; i <= right; i++) {
      array[i] = temp[i];
    }
  }
}

병합 정렬(Merge Sort): 성능 평가

$O(nlogn)$ 입니다.

추가적인 공간복잡도를 소모하기 때문에 이를 유의해야 합니다.

항상 $nlogn$ 이므로 최악의 경우에도 좋은 성능을 기대할 수 있습니다.

퀵 정렬(Quick Sort): 이해와 구현

평균적으로 가장 빠른 정렬 방법입니다.

피벗을 지정해야 한다는 점이 특이할 수 있겠네요.

책에 있는 코드로 따라 쳤는데 ArrayIndexOutOfBoundException이 발생해서 제가 예전에 작성했던 코드로 대체합니다.

public class QuickSort implements Sort {

  @Override
  public void sort(int[] array) {
    quickSort(array, 0, array.length - 1);
  }

  private void quickSort(int[] array, int left, int right) {
    if (left < right) { // 교차되기 전까지
      int pivot = partition(array, left, right);
      quickSort(array, left, pivot - 1);
      quickSort(array, pivot + 1, right);
    }
  }

  private int partition(int[] array, int left, int right) {
    int pivot = array[left];
    int k = right;

    for (int i = right; i > left; i--) {
      if (array[i] > pivot) {
        swap(array, i, k);
        k--;
      }
    }

    swap(array, left, k);
    return k;
  }

  private void swap(int[] array, int index1, int index2) {
    int temp = array[index1];
    array[index1] = array[index2];
    array[index2] = temp;
  }
}

퀵 정렬(Quick Sort): 성능 평가

보통 $O(nlogn)$ 이고, 최악의 경우 $O(n^2)$ 입니다.

평균적으로 가장 빠르기 때문에 자주 사용되는 정렬입니다.

요즘엔 퀵 정렬 보다 진보된 정렬 알고리즘을 언어 자체적으로 내장하고 있는 경우가 많습니다.

기수 정렬(Radix Sort): 이해1

기수 정렬은 특정한 상황에서만 사용 가능한 정렬 방법입니다.

특정한 상황에서 사용할 수 있는 만큼 성능은 $O(nlogn)$ 보다 빠를 수 있습니다.

기수 정렬은 데이터를 구성하는 기본요소인 기수를 이용해서 정렬을 진행합니다.

기수 정렬(Radix Sort): 이해2

'LSD 기수 정렬': Least Significant Digit, '덜 중요한 자릿수'에서부터 정렬을 진행해 나간다.

버킷은 데이터가 표현 가능한 범위로 10진수의 경우 10개, 책에서 나오는 5진수의 경우엔 5개입니다.

가장 덜 중요한, 첫번째 자리의 수부터 비교합니다.

그리고, 데이터는 들어간 순서대로 꺼내면 됩니다.

이런 과정을 맨 앞자리 까지 거치면 정렬되게 됩니다.

단점은, 가장 영향력이 큰 자릿수를 마지막에 비교해서 마지막까지 결과를 알 수 없다는 것이 단점입니다.

기수 정렬(Radix Sort): LSD vs. MSD

MSD는 LSD를 반대로 진행합니다.

장점은 중간에 정렬이 완료될 수도 있지만, 단점으로는 구현 난이도가 상대적으로 높다는 데에 있습니다.

기수 정렬(Radix Sort): LSD 기준 구현

LSD로 구현하는 장점은 우리가 직접하는 것이 아닌 컴퓨터가 한다는데에 있습니다.

항상 일관된 방식으로 구현이 가능하다는 것이 가장 큰 장점입니다.

기수 정렬(Radix Sort): 테스트 코드 작성

기수정렬은 다른 정렬방식과는 정렬 방식 자체가 다르기 때문에, 인터페이스 없이 테스트 코드를 작성하려고 합니다.

RadixSortTest.java

class RadixSortTest {

  @Test
  @DisplayName("기수_정렬_테스트")
  void 기수_정렬_테스트() {
    int[] array = {13, 212, 14, 7141, 10987, 6, 15};
    int maxLength = 5;

    RadixSort radixSort = new RadixSort();
    radixSort.sort(array, maxLength);
    assertThat(array).containsExactly(6, 13, 14, 15, 212, 7141, 10987);
  }
}

제일 큰 원소의 길이를 알려주어야 합니다.

기수 정렬(Radix Sort): 구현

RadixSort.java

public class RadixSort {

  private static final int BUCKET_NUM = 10;

  public void sort(int[] array, int maxLength) {
    int divFac = 1;
    Queue<Integer>[] buckets = new Queue[BUCKET_NUM];

    // 버킷 배열 초기화
    for (int i = 0; i < BUCKET_NUM; i++) {
      buckets[i] = new LinkedListQueue<>();
    }

    // 가장 긴 데이터의 길이만큼 반복
    for (int i = 0; i < maxLength; i++) {
      // 정렬 대상 수의 개수만큼 반복
      for (int num : array) {
        // N번째 자리의 숫자 추출
        int radix = (num / divFac) % 10;

        // 추출한 숫자에 위치하도록 데이터 저장
        buckets[radix].enqueue(num);
      }

      // 버킷 수 만큼 반복
      for (int j = 0; j < BUCKET_NUM; j++) {
        // 버킷에 저장되어 있는 것을 순서대로 꺼내서 다시 array에 저장
        int arrayIndex = 0;
        while (!buckets[j].isEmpty()) {
          array[arrayIndex++] = buckets[j].dequeue();
        }
      }

      // N번째 자리 숫자를 추출할 수 있도록 변경
      divFac *= 10;
    }
  }
}

책의 내용을 거의 따라쳐서 비교하면서 보시면 될 것 같습니다.

기수 정렬(Radix Sort): 성능 평가

기수 정렬은 다른 정렬과는 다르게, 비교연산보다 버킷으로의 데이터 삽입과 추출을 핵심 연산으로 합니다.

$O(n)$ 의 성능을 보이는, 특정한 상황에서 최고의 성능을 보이는 정렬 알고리즘을 작성했네요.

ksundong/자료구조-스터디-5주차.md