검색 데이터 구조

컴퓨터 과학에서, 검색 데이터^{[citation needed]} 구조는 데이터베이스에서 특정 기록과 같은 일련의 항목에서 특정 항목을 효율적으로 검색할 수 있는 데이터 구조다.

가장 단순하고, 가장 일반적이며, 가장 효율적이지 않은 검색 구조는 모든 항목의 순서 없는 순차적 목록일 뿐이다.이러한 목록에서 원하는 항목을 선형 검색 방법에 의해 찾으려면, 불가피하게 항목 수 n개에 비례하는 많은 작업이 필요하며, 최악의 경우뿐만 아니라 평균적인 경우도 그러하다.유용한 검색 데이터 구조는 더 빠른 검색을 가능하게 하지만, 그것들은 특정한 종류의 쿼리로 제한된다.더욱이 그러한 구조를 구축하는 비용은 최소한 n에 비례하기 때문에, 동일한 데이터베이스(또는 쿼리 간에 거의 변경되지 않는 데이터베이스)에서 여러 개의 질의를 수행해야 하는 경우에만 그 대가를 치르게 된다.

정적 검색 구조는 고정된 데이터베이스에 대한 많은 질의에 응답하도록 설계된다. 동적 구조는 또한 연속적인 질의 사이에 항목을 삽입, 삭제 또는 수정할 수 있다.동적 사례에서는 데이터베이스 변경사항을 감안하여 검색 구조를 수정하는 비용도 고려해야 한다.

분류

가장 간단한 종류의 쿼리는 특정 필드(키)가 지정된 값 v와 동일한 레코드를 찾는 것이다.다른 일반적인 유형의 쿼리는 "키 값이 가장 작은 항목 찾기" "v를 초과하지 않는 가장 큰 키 값을 가진 항목 찾기" "지정된 범위 v와_min v 사이의_max 키 값을 가진 모든 항목 찾기"이다.

특정 데이터베이스에서 핵심 값은 일부 다차원 공간의 점일 수 있다.예를 들어, 키는 지구상의 지리적 위치(위도 및 경도)일 수 있다.이 경우 일반적인 종류의 쿼리는 "주어진 지점 v"에 가장 가까운 키가 있는 레코드를 찾거나 "v"로부터 주어진 거리에 키가 있는 모든 항목을 찾거나 "공간 R의 지정된 영역 내에 있는 모든 항목 찾기"이다.

후자의 일반적인 특별한 경우는 "5만에서 10만 사이의 급여를 받고 1995년에서 2007년 사이에 고용된 모든 직원 기록을 찾아라"와 같은 두 개 이상의 간단한 키에 대한 동시 범위 질의다.

단일 순서 키

키 값이 적당히 컴팩트한 간격에 걸쳐 있는 경우 배열.
우선순위 정렬 목록, 선형 검색 참조
키 정렬 배열. 이진 검색 참조
자체 균형 이진 검색 트리
해시 테이블

가장 작은 요소 찾기

힙

점근법 최악의 경우 분석

이 표에서 점근성 표기법 O(f(n)는 "최악의 경우 f(n)의 일부 고정 배수를 초과하지 않는다"는 뜻이다."

데이터 구조	삽입하다	삭제	잔액	인덱스에 가져오기	검색	최소값 찾기	최대값 찾기	공간 사용량
정렬되지 않은 배열	O(1) ^{(참고 참조)}	O(1) ^{(참고 참조)}	해당 없음	O(1)	O(n)	O(n)	O(n)	O(n)
정렬된 배열	O(n)	O(n)	해당 없음	O(1)	O(log n)	O(1)	O(1)	O(n)
쌓다	O(1)	O(1)			O(n)			O(n)
큐	O(1)	O(1)			O(n)			O(n)
정렬되지 않은 연결 목록	O(1)	O(1)^[1]	해당 없음	O(n)	O(n)	O(n)	O(n)	O(n)
정렬된 연결 목록	O(n)	O(1)^[1]	해당 없음	O(n)	O(n)	O(1)	O(1)	O(n)
목록 건너뛰기
자체 균형 이진 검색 트리	O(log n)	O(log n)	O(log n)	해당 없음	O(log n)	O(log n)	O(log n)	O(n)
힙	O(log n)	O(log n)	O(log n)	해당 없음	O(n)	O(1) Min-heap용 최대 높이^[2] O(n)	최대 heap의 경우 O(n) for min-heap^[2]	O(n)
해시 테이블	O(1)	O(1)	O(n)	해당 없음	O(1)	O(n)	O(n)	O(n)
트라이(k = 평균 키 길이)	O(k)	O(k)	해당 없음	O(k)	O(k)	O(k)	O(k)	O(k n)
데카르트 나무
B-트리	O(log n)	O(log n)	O(log n)	해당 없음	O(log n)	O(log n)	O(log n)	O(n)
적흑나무
스플레이 트리
AVL 트리					O(log n)
k-d 트리

참고: 정렬되지 않은 배열의 삽입은 삽입할 요소를 배열의 특정 위치에 삽입해야 한다는 가정 때문에 O(n)로 인용되기도 하는데, 이는 모든 후속 요소를 한 위치씩 이동시켜야 하기 때문이다.그러나 고전적인 배열에서는 임의의 변형되지 않은 원소를 저장하기 위해 배열을 사용하므로, 주어진 원소의 정확한 위치는 중요하지 않으며, 배열 크기를 1씩 증가시키고 배열 끝단에 원소를 저장함으로써 삽입을 수행하는데, 이는 O(1) 연산이다.^[3]^[4]마찬가지로 삭제 작업은 후속 요소를 이동시켜야 한다는 가정으로 인해 O(n)로 인용되기도 하지만, 고전적인 배열에서는 순서가 중요하지 않기 때문에(요소는 삽입 시간에 의해 암묵적으로 순서를 정함) 배열의 마지막 요소와 함께 삭제할 요소를 스와핑하여 삭제가 가능하다.그런 다음 배열 크기를 O(1) 연산인 1만큼 줄인다.^[5]

이 표는 단지 대략적인 요약일 뿐이다. 각 데이터 구조에 대해 다른 비용으로 이어질 수 있는 특수한 상황과 변형이 있다.또한 두 개 이상의 데이터 구조를 결합하여 비용을 절감할 수 있다.

각주

^ ^a ^b Cormen, Leiserson, Rivest. Introduction to Algorithms. The College of Information Sciences and Technology at Penn State. ISBN 9780262530910. LIST-DELETE runs in O(1) time, but if to delete an element with a given key, Θ(n) time is required in the worst case because we must first call LIST-SEARCH.{{cite book}}: CS1 maint: 작성자 매개변수 사용(링크)
^ ^a ^b Cormen, Leiserson, Rivest. Introduction to Algorithms. The College of Information Sciences and Technology at Penn State. ISBN 9780262530910. There are two kinds of binary heaps: max-heaps and min-heaps. In both kinds, the values in the nodes satisfy a heap property... the largest element in a max-heap is stored at the root... The smallest element in a min-heap is at the root... The operation HEAP-MAXIMUM returns the maximum heap element in Θ(1) time by simply returning the value A[1] in the heap.{{cite book}}: CS1 maint: 작성자 매개변수 사용(링크)
^ Allen Sherrod (2007). Data Structures and Algorithms for Game Developers. Cengage Learning. ISBN 9781584506638. The insertion of an item into an unordered array does not depend on anything other than placing the new item at the end of the list. This gives the insertion into an unordered array of O(1).
^ Cormen, Leiserson, Rivest. Introduction to Algorithms. The College of Information Sciences and Technology at Penn State. ISBN 9780262530910.{{cite book}}: CS1 maint: 작성자 매개변수 사용(링크)
^ "Algorithm - the time complexity of deletion in a unsorted array". Finding the element with a given value is linear. Since the array isn't sorted anyway, you can do the deletion itself in constant time. First swap the element you want to delete to the end of the array, then reduce the array size by one element.

참고 항목

[listdelete-1] Cormen, Leiserson, Rivest. Introduction to Algorithms. The College of Information Sciences and Technology at Penn State. ISBN 9780262530910. LIST-DELETE runs in O(1) time, but if to delete an element with a given key, Θ(n) time is required in the worst case because we must first call LIST-SEARCH.{{cite book}}: CS1 maint: 작성자 매개변수 사용(링크)

[heap-2] Cormen, Leiserson, Rivest. Introduction to Algorithms. The College of Information Sciences and Technology at Penn State. ISBN 9780262530910. There are two kinds of binary heaps: max-heaps and min-heaps. In both kinds, the values in the nodes satisfy a heap property... the largest element in a max-heap is stored at the root... The smallest element in a min-heap is at the root... The operation HEAP-MAXIMUM returns the maximum heap element in Θ(1) time by simply returning the value A[1] in the heap.{{cite book}}: CS1 maint: 작성자 매개변수 사용(링크)

[games-3] Allen Sherrod (2007). Data Structures and Algorithms for Game Developers. Cengage Learning. ISBN 9781584506638. The insertion of an item into an unordered array does not depend on anything other than placing the new item at the end of the list. This gives the insertion into an unordered array of O(1).

[4] Cormen, Leiserson, Rivest. Introduction to Algorithms. The College of Information Sciences and Technology at Penn State. ISBN 9780262530910.{{cite book}}: CS1 maint: 작성자 매개변수 사용(링크)

[5] "Algorithm - the time complexity of deletion in a unsorted array". Finding the element with a given value is linear. Since the array isn't sorted anyway, you can do the deletion itself in constant time. First swap the element you want to delete to the end of the array, then reduce the array size by one element.

[1]

[2]

[3]

[4]

[5]

Search

검색 데이터 구조

네임스페이스

더

목차

분류

단일 순서 키

가장 작은 요소 찾기

점근법 최악의 경우 분석

각주

참고 항목