Home Archives for 1월 2017

2017년 1월 14일 토요일

Which Index to use when

ArangoDB는 각 컬렉션의 _key 속성을 자동으로 인덱싱합니다. 이 속성을 별도로 색인 할 필요는 없습니다. 문서의 _id 속성은 _key 속성에서 파생되므로 암시적으로 인덱싱됩니다.

또한 ArangoDB는 모든 에지 콜렉션에서 _from 및 _to에 대한 색인을 자동으로 작성하므로 들어오는 edge와 나가는 edge를 효율적으로 결정할 수 있습니다.

Index types

사용자는 하나 이상의 문서 속성에 추가 색인을 정의 할 수 있습니다. ArangoDB는 여러 가지 인덱스 유형을 제공합니다. 이러한 인덱스의 사용 시나리오는 다음과 같습니다.

hash index : 검색어에 모든 색인된 속성이 제공되는 경우에만 개별 문서에 빠르게 액세스 할 수 있습니다. 색인은 동등한 비교에만 사용됩니다. 범위 쿼리를 지원하지 않으므로 정렬에 사용할 수 없습니다.

인덱스 된 속성의 모든 또는 대부분의 쿼리가 동등한 비교인 경우 해시 인덱스가 좋은 후보가됩니다. unique 해시 인덱스는 삽입, 업데이트, 제거 및 검색 작업을 위해 O (1)의 amortized 시간복잡성을 제공합니다. 고유하지 않은 해시 인덱스는 O (1) 삽입, 업데이트 및 제거 기능을 제공하며 amortized O (n) 복잡도의 인덱스 값으로 문서를 조회 할 수 있습니다. n은 해당 인덱스 값이 있는 문서 수입니다.

색인 속성이 설정되지 않은 문서를 색인하지 않도록 선택적 문서 속성의 고유하지 않은 해시 색인을 sparse로 선언해야합니다.
skiplist index : skiplists는 인덱스 된 값을 순서대로 유지하므로 동등성 검색, 범위 쿼리 및 정렬에 사용할 수 있습니다. 자주 선택되는 속성의 경우 skiplist 인덱스는 해시 인덱스보다 오버 헤드가 높습니다. 자주 선택되지 않는 속성의 경우 skiplist 인덱스는 고유하지 않은 해시 인덱스보다 더 효율적입니다.

또한 skiplist 인덱스는 해시 인덱스보다 더 많은 사용 사례 (예 : 범위 쿼리, 정렬)를 허용합니다. 또한 색인 속성의 가장 왼쪽 접두어를 기반으로 한 조회에 사용할 수 있습니다.
영구 색인 : 영구 색인은 정렬 된 skiplist 색인과 매우 유사하게 작동합니다. 단, 모든 색인 값은 디스크에 유지되므로 서버를 다시 시작하거나 색인 생성 된 모음을 다시 로드 할 때 메모리에서 다시 작성할 필요가 없습니다. 영속 인덱스의 연산은 로그 복잡성을 갖지만 인덱스를 사용하여 문서를 검색하기 위한 primary 인덱스에 추가 왕복이 필요할 수 있으므로 작업에 skiplist 인덱스의 연산보다 더 높은 상수 계수의 시간이 걸릴 수 있습니다.

동등한 검색, 범위 쿼리 및 정렬을 위해 영구 인덱스를 사용할 수 있습니다. 자주 사용되는 속성의 경우, 영구색인은 skiplist 또는 해시 인덱스보다 높은 오버 헤드를 갖습니다.

영구 인덱스는 해시 인덱스보다 더 많은 유스 케이스 (예 : 범위 쿼리, 정렬)를 허용합니다. 또한 색인 속성의 가장 왼쪽 접두어를 기반으로 한 조회에 사용할 수 있습니다. 메모리 내 skiplist 인덱스와 달리 영구 인덱스는 메모리 내에서 다시 빌드 할 필요가 없으므로 다른 메모리 내 인덱스처럼 컬렉션의로드 시간에 영향을 미치지 않습니다.

2017년 1월 13일 금요일

Reaching and harnessing consensus with ArangoDB

January 11, 2017Architecture, cluster, GeneralTags: cluster

nihil novi nisi commune consensu

nothing new unless by the common consensus

– law of the polish-lithuanian common-wealth, 1505

주의 : 긴 글이다. 하지만 매달려서 읽어보면 언젠가 많은 시간을 절약할 수 있을 것이다.

Introduction

Consensus has its etymological roots in the latin verb consentire, which comes as no surprise to mean to consent, to agree. As old as the verb equally old is the concept in the brief history of computer science. It designates a crucial necessity of distributed appliances. More fundamentally, consensus wants to provide a fault-tolerant distributed animal brain to higher level appliances such as deployed cluster file systems, currency exchange systems, or specifically in our case distributed databases, etc.
합의는 동의하는 것을 의미하는 놀라움이 아니라 동의하는 라틴어 동사 동의로 그 어원 뿌리를 가진다. 오래된 동사만큼 오래된 컴퓨터 과학의 역사의 개념입니다. 분산 장비의 중요한 필요성을 보여줍니다. 더 기본적으로는 합의가 배치 된 클러스터 파일 시스템, 환전 시스템 또는 구체적으로는 우리의 분산 데이터베이스와 같은 높은 수준의 장비에 결함 허용 분산 동물의 뇌를 제공하는 것 를 바라고 있습니다.

While conceptually, one could easily envision a bunch of computers on a network, who share and manipulate some central truth-base, the actual implementation of such a service will pose paradoxical demands to fail-safety and synchronicity. In other words, making guarantees about the truth of state x at a given time t or more demanding at any given time {t1,…,tn} turns out to be entirely and radically non-trivial.

This is the reason, why most implementations of consensus protocols as deployed in real production systems have relied upon one of two major publications1,2 referred to by their synonyms, Paxos, including its derivatives, and RAFT, respectively.

개념적으로는 진정한 기반을 공유하고 작업하는 네트워크에 다수의 컴퓨터를 쉽게 상상할 수 있지만, 그러한 서비스를 실제로 구현하면 안전성과 동기 성이 일치하지 않습니다. 즉, 주어진 시간 t에서 상태 x 진리를 보증하는 것은 어떤 주어진 시간 {t 1, ..., tn}에서 요구하는 것은 완전하고 근본적으로 중요하지 않음 알 수있다.

이것은 실제 생산 시스템에 도입 된 합의 프로토콜의 대부분의 구현이 동의어 인 Paxos (파생물 포함) RAFT 두 가지 주요 간행물 1,2 중 하나에 의존 있는 이유입니다.

Although it would be a beastly joy to discuss here the differences and the pro and contra argumentation of each, I suggest to have a look at the extent of both papers in excess of 15 pages to get a rough idea of the scope of such a discussion. As a matter of fact we were as audacious to try to define and implement a simpler protocol than the above. And we failed miserably – at least in arriving at a simpler solution.

Suffice it to say that we decided to utilise RAFT in our implementation. The choice for RAFT fell mainly because in my humble view it is the overall simpler method to understand.

여기에서 차이점과 각각의 의견과 반대의 논의를 논의하는 것은 매우 기쁜 일이지만, 이러한 논의의 범위를 대충 알기 위하여는 15 페이지 이상의 두 논문의 범위를 보면 것이 좋습니다. 사실 우리는 위보다 간단한 프로토콜을 정의하고 구현하려고하면 대담이었습니다. 그리고 우리는 비참하게 실패했습니다 - 적어도 간단한 해결책에 도달하려면 우리의 구현 RAFT를 이용하기로 결정했다고 말하면 충분합니다. RAFT의 선택은 내 겸손 생각으로는 이해하기가 전체적으로 쉬운 방법이므로 주로 떨어졌습니다.

In short, RAFT ensures at all times that one and only one instance of the deployed services assumes leadership. The leader projects permanently its reign over other instances, replicates all write requests to its followers and serves read requests from its local replica. As a matter of fact, it is crucial to note that the entire deployment needs to maintain the order of write requests, as arbitrary operations on the replicated log will not be commutable.

Furthermore and not least importantly, RAFT guarantees that any majority of instances is functional i.e. is capable of electing a leader and knows the replicated truth as promised to the outside world.

For more on RAFT and consensus in general, we would like to refer to the authors’ website.

즉, RAFT는 항상 배치 된 서비스의 하나의 인스턴스가 리더십을 가지고가는 것을 보증합니다. 리더는 다른 인스턴스보다 영구적으로 통치하고 모든 쓰기 요청을 그 추종자 복제하고 로컬 복제본에서 읽기 요청을 처리합니다. 실제로 복제 된 로그의 모든 작업은 커밋 할 수 없으므로 전체 배포가 쓰기 요청의 순서를 유지해야한다는 점에 유의하는 것이 중요합니다.

RAFT는 대다수의 인스턴스가 기능적인 것, 즉 리더를 선출 할 수 외계에 약속 된 복제 된 진실을 알고 있다는 것을 보장합니다.

RAFT와 합의에 대한 일반적인 정보는 저자의 웹 사이트

What is already out there and why not just take one of those

Say that you have realised by now that you need such a consensus among some set of deployed appliances. Should you do the implementation yourself? Most probably not. Unless obviously, you have good reasons. As there are multiple arguably very good implementations of both of the above algorithms out there. They come as code snippets, libraries or full-fledged services; some of which have enjoyed praise and criticism alike.

Common wisdom suggests to invest your development energy in more urgent matters that will put you ahead of your competition, or at least to spare the tedious work of proving the correctness of the implementation all the way to deliberate abuse case studies.

We, initially, started off building ArangoDB clusters before version 3.0 relying on etcd. etcd is a great and easy to use service, which is actively maintained and developed to date. We did hit limits though as we did have need for replication of transactions rather than single write requests, which etcd did not provide back then. We also dismissed other very good services such as zookeeper, as zookeeper, for example, would have added the requirement to deploy a Java virtual machine along. And so on.

But the need for our own consensus implementation imposed itself upon us for other reasons.

I mentioned earlier the consensus to be the animal brain of a larger service. When we think of our brainstem the first thing that comes to mind is not its great capabilities in storing things but its core function to control respiration and heart beat. So how about being able to make guarantees not only about replicating a log or key-value-store but also running a particular process as a combined effort of all agents. In other words, could one think of a resilient program thread which follows the RAFT leadership? And if so, what would the benefits look like?

Letting the cat entirely out of the bag, we built into the RAFT environment a supervision process, which handles failure as well as maintenance cases at cluster runtime. None of the implementations we were aware of could provide us with that. In my view the most intriguing argument.

This is how we use the agency

After all the nice story telling, let us look how we ended up using the agency.

ArangoDB cluster deployments consist of 3 types or roles of services, namely database servers, coordinators and agents. For details on the function of the first two roles, please refer to the cluster documentation. For them however to function and interact properly they rely on our distributed initialisation and configuration.

Every imaginable meta information is stored there. Which instance is storing a replica of a particular shard? Which one is the primary database server currently responsible for shard X or synchronous replication of shard Y as a follower and vice versa. When was the last heartbeat received from which service. etc.

ArangoDB-speak for the central cluster maintenance and configuration consensus is agency. The agency consists of an odd number of ArangoDB instances, which hold the replicated and persisted configuration of the cluster while maintaining integrity using the RAFT algorithm in their midst.

Database servers and coordinators interact with the agents through an HTTP API. Agents respond as a unit. Read and write API calls are redirected seamlessly by followers to the current leader. Changes in cluster configuration are stored in the “Plan” section of the key value store while actually committed changes are reported by the database servers and coordinators in the “Current” section. A single API call to the agency may consist of a bundle of transactions on the key value store, which are executed atomically with guarantees to transaction safety.

Last but not least, the agency holds a supervision thread whose job it is to perform automated failover of db servers with all the consequences for individual shards of individual collections. But it is also the place where deliberate maintenance jobs are executed such as orderly shutdown of a node for say hardware upgrade and the like.

Having all the power of consensus in such a program, we can guarantee that no 2 machines start giving contradictory orders to the rest of the cluster nodes. Additionally, a job like handling a failed server replacement is continued in the very same way it was meant to, if the agency’s leader catastrophically fails in midst of its execution.

How could you use ArangoDB as a consensus service

The online documentation describes how such an ArangoDB RAFT service is deployed on an odd (and arguably low) number of interconnected hosts.

During initial startup of such an agency the instances find each other, exchange identities and establish a clear key-value and program store. Once the initialisation phase is complete within typically a couple of seconds, the RAFT algorithm is established and the HTTP API is accessible and the replicated log is recording; the documentation is found here. Restarts of individual nodes of the agency must not influence the runtime behaviour and on a good network should go unnoticed.

The key-value store comes with some additional features such as assignment of time to live for entries and entire branches and the possibility of registering callbacks for arbitrary subsections. It also features transactions and allows one to precondition these transactions with high granularity.

Any distributed service could now use arangodb agencies for initialisation and configuration management in the very same way as we do with ArangoDB clusters through integration the HTTP-API.

But in addition, agents run local Foxx threads, which can be used to deploy a resilient monitoring and maintenance thread with access to the RAFT process with its replicated log and the current local RAFT personality.

Some initial performance and compliance test

Making claims about fault tolerance in a distributed deployment, needs some evidence to back that up. Failure scenarios clearly range all the way from malfunctioning network switches and fabric to crashed hosts or faulty hard disks. While the local sources of error as failing host hardware, can be dealt with in unit tests, it turns out that tests of correct function of such a distributed service is anything but trivial.

The main claim and minimum requirement: At all times any response from any member of the RAFT is true. Such a response could be the value of a requested key but it could also be that a preconditioned write to a particular key succeeded or failed. The nice to have and secondary goal: Performance.

Just a brief word on scalability of consensus deployments here. Consensus does not scale. The core nature of consensus could be most readily compared to that of a bottleneck. The amount of operation that go through a consensus deployment will be affected mostly by the number of members and the network’s quality. Not a big surprise when you think about how consensus is reached over back and forth of packets.

Jepsen

Thankfully, Kyle Kingsbury who has been extensively blogging about the subject of distributed correctness, has published a framework for running tests to that effect of github.

Kyle’s framework is a clojure library, which distills the findings of a couple of fundamental papers, which Kyle discusses on his blog into one flexible package. Kyle’s blog has taken the distributed world in a storm because of the way the tests are done and evaluated.

The tests are run and results are recorded as local time series on virtual machines, which are subject to random network partitioning and stability issues. The analysis then tries to find a valid ordering of the test results such that linearisability3 can be established.

Results

We have tested the ArangoDB agency inside Jepsen to validate that the agency does behave linear as above described. We have also stress tested the agency with hundreds of millions of requests over longer periods to both check memory leakage and compaction related issues. The tests show that ArangoDB is ready to be used as a fault tolerant distributed configuration management platform for small to large network appliances.

What is to come

Lookout for more in depth upcoming technical blogs, which will demonstrate best case scenarios for deploying ArangoDB agencies for your own network appliances. Also for 2017 we are planning more comprehensive and large-scale testing of linearisability over the entire ArangoDB cluster.

Examples

We have put together a small collection of go programs, which demonstrate some of the above concepts in hopes that you might find them helpful to get started on the ArangoDB agency at https://github.com/neunhoef/AgencyUsage

tldr;

Consensus is a key asset in distributed appliances. Groups of ArangoDB instances may be well integrated into appliance farms for initialisation and configuration management. Our own exhaustive tests have proven excellent dependability and resilience in vast scenarios of network and hardware failure.

_____________________________

References

1 Pease M, Shostak R and Lamport L. Reaching Agreement in the Presence of Faults. Journal of the Association for Computing Machinery, 27(2). 1980. pp 228-234
2 Ongaro D and Ousterhout J. In search of an understandable consensus algorithm. 2014 USENIX Annual Technical Conference (USENIX ATC 14). 2014. pp. 305-319
3 Herlihy MP and Wing JM. Linearizability: A Correctness Condition for Concurrent Objects. ACM Transactions on Programming Languages and Systems, 12(3). 1990. Pp.463-492
4 Max Neunhöfer. Creating Fault Tolerant Services on Mesos. Recording of the talk, MesosCon Asia 2016

January 11, 2017Author

Kaveh works on ArangoDB’s core. He is responsible for developing the consensus protocol in ArangoDB clusters.

2017년 1월 7일 토요일

Index of ArangoDB

Publisher ㅎH킹 Date 오전 6:00

Index basics

인덱스는 document로 빠른 접근을 허용하는데, 주어진 인덱스 속성들이 쿼리에서 사용된다. ArangoDB는 자동으로 시스템 속성들을 인덱스하며, 사용자들은 비시스템 속성을 자유롭게 추가 인덱스로 활용할 수 있다.

사용자 정의 인덱스들은 collection 레벨에서 생성될 수 있다. 대부분의 사용자 정의 인덱스들은 특정한 인덱스 속성의 이름을 가지고 생성될 수 있다. 일부 인덱스 타입은 단 하나의 속성만 인덱스 허용하는 반면(e.g fulltext index), 여러 속성을 동시에 인덱싱하는 것을 허용하는 것도 있다.

_id, _key, _from, _to와 같은 시스템 속성은 사용자가 인덱스로 정의하지 않더라도 ArangoDB가 자동으로 인덱싱한다. _id와 _key는 collection의 primary key로 다뤄지며, _from과 _to는 edge collection의 edge index로 다뤄진다.

시스템 속성인 _id를 사용자 정의 인덱스로 사용하는 것은 불가능하지만, _key, _rev, _from, _to는 가능하다.

ArangoDB는 다음과 같은 index들을 제공한다.

Primary Index

collection에서 모든 document의 key를 위한 hash index인 primary index가 있다. primary index는 사용시 매우 빠른 검색을 가능하게 하며, _key나 _id를 사용할 때 이것을 이용한다. AQL query에서 _key나 _id로 접근할 때 역시 사용된다.

주어진 _key 혹은 _id로 document를 검색하는 함수에서 primary index가 항상 사용된다.

db.collection.document("<document-key>");
db._document("<document-id>");

primary index는 정렬되어 있지 않은 hash index이므로, 범위 쿼리나 정렬 쿼리를 위해 사용될 수 없다.

primary index는 삭제나 변경할 수 없으며, 사용자 정의 방식의 primary index를 할 수 없다.

Edge Index

모든 edge collection은 자동으로 생성된 edge index를 가지고 있다. edge index는 _from과 _to 속성에 빠른 접근을 제공하여 vertex 사이의 연결점을 빠르게 찾는데 사용된다.이 인덱스는 vertex의 연결된 edge를 찾는 쿼리를 실행할 때 참조된다.

AQL에서 _from 또는 _to를 찾을 때 Edge index가 사용된다. 아래와 같이 주어진 _from, _to를 통해 edge를 찾는 함수는 항상 edge index를 사용한다.

db.collection.edges("<from-value>");
db.collection.edges("<to-value>");
db.collection.outEdges("<from-value>");
db.collection.outEdges("<to-value>");
db.collection.inEdges("<from-value>");
db.collection.inEdges("<to-value>");

내부적으로 edge index는 hash index로 구현이 되어 있는데 모든 _from과 _to를 저장한다. 이 index는 들여다보는데 사용할 수는 있지만, 정렬이나 범위 쿼리로 사용될 수는 없다. 또한 자동으로 생성되며, 사용자 정의에 따라 생성할 수는 없다. 그러나 사용자 정의 인덱스들에서 _from과 _to의 자유로운 사용은 가능하다.

edge index는 삭제나 변경할 수 없다.

Hash Index

hash index는 특정 속성값을 가진 document를 빠르게 찾는데 사용될 수 있다. hash index는 정렬되지 않기때문에, 검색에만 사용되고, 범위 쿼리나 정렬은 할 수 없다.

hash index는 하나 혹은 여러 document의 속성으로 생성될 수 있다. 만약 검색조건에 모든 인덱스 속성이 주어지고, 모든 속성이 동등 연산자(==)을 사용해서 비교하게 된다면 hash index가 사용될 것이다. AQL에서 byExample, firstExample과 같은 쿼리 함수에서 사용된다.

Hash index들은 선택적으로 unique로 선언될 수 있다. 그럴 경우 인덱스된 속성의 같은 값이 허용되지 않는다. 또 hash index는 sparse 선언이 가능하다.(sparse : 대부분 null이고 값을 가지는 경우가 많지 않은 경우 유리한 index)

hash index의 타입은 아래와 같다.

unique hash index:모든 documents들은 index된 속성 값이 유일해야 한다. unique index가 걸려 있는 속성 값이 같은 document를 insert할 경우 unique 제약 위반이 발생된다.
이 인덱스는 sparse가 아니다. index된 속성을 가지고 있지 않거나, null 값을 가진 document는 index도 index될 것이다.(null 역시 하나만 허용한다는 의미) 따라서 optional 속성에는 이 index를 사용할 수 없다.
unique 옵션은 _from, _to가 결합된 인덱스에 추가되어 중복된 edge 생성을 방지하는데 사용될 수 있다.
unique, sparse hash index: unique hash index와 동일하며 추가적으로, null 값이나 속성 값이 없을 경우 index에 포함되지 않는다.
non-unique hash index: collection의 모든 document들이 index 된다. sparse가 아니기 때문에 속성 값이 없거나 null인 것도 index되며, 중복된 값도 허용된다.
non-unique, sparse hash index: null이 아닌 값이 있는 document 만 index된다.

unique hash index의 lookup, insert, update, remove 연산의 할부시간복잡도는 O(1)이 된다.

Non-unique hash index도 마찬가지로 O(1)이다.

모든 혹은, 대부분의 document에서 값이 없는 속성에 hash index가 생성되면 아래와 같이 동작하게 된다.

index가 sparse이면, 속성값이 존재하지 않는 document는 index되지 않을 것이고, 메모리 index를 사용하지 않을 것이다. 따라서, update나 remove 연산의 성능에 영향을 미치지 않을 것이다.
sparse가 아니라면, 속성값이 없는 document들은 null key 값으로 index에 포함될 것이다.

Hash index들은 index 속성이름이 [*]를 포함한다면 배열값을 index하는 것도 지원한다.

Skiplist Index

skiplist는 정렬된 index 구조다. 범위에 대한 쿼리와 정렬된 순서로 document를 리턴하기 위한, 특정 속성 값을 빠르게 찾을 수 있다. byExample, firstExample 등에서 사용될 수 있다.

Skiplist 인덱스들은 쿼리에 인덱스 되는 속성이 주어지거나, index 속성의 prefix값이 지정될 경우 검색, 범위쿼리, 정렬에 사용될 수 있다.

예를 들어, value1과 value2로 skiplist인덱스가 생성된다면 아래의 필터 조건은 인덱스를 사용할 것이다.(<=, >=도 포함인데 밑에서는 생략되었다.)

FILTER doc.value1 == ...
FILTER doc.value1 < ...
FILTER doc.value1 > ...
FILTER doc.value1 > ... && doc.value1 < ...

FILTER doc.value1 == ... && doc.value2 == ...
FILTER doc.value1 == ... && doc.value2 > ...
FILTER doc.value1 == ... && doc.value2 > ... && doc.value2 < ...

정렬을 위해 skiplist 인덱스를 사용하기 위해서는, 인덱스 정의 순서와 동일한 순서로 SORT 구문에 속성이 명시되어야 한다. Skiplist 인덱스들은 항상 오름차순으로 생성되지만, 오름차순, 내림차순 모두에 사용될 수 있다. 그러나 복합 인덱스에서는, 복합 인덱스를 구성하는 모든 요소가 모드 오름차순이거나 모두 내림차순으로 SORT 구문에 명시되어야 한다.(기본적으로 오름차순으로 인식된다).

예를 들어, value1과 value2로 skiplist 인덱스가 생성될 때, 아래의 sort 구문은 인덱스를 사용할 수 있다.

SORT value1 ASC, value2 ASC (and its equivalent SORT value1, value2)
SORT value1 DESC, value2 DESC
SORT value1 ASC (and its equivalent SORT value1)
SORT value1 DESC

아래의 sort 구문은 인덱스가 만들어진 순서를 사용하지 않아서 추가적인 정렬 과정이 필요하게 된다.

SORT value1 ASC, value2 DESC
SORT value1 DESC, value2 ASC
SORT value2 (and its equivalent SORT value2 ASC)
SORT value2 DESC (because first indexed attribute value1 is not used in sort clause)

마지막 두 sort 구문은 인덱스 속성의 가장 좌측 prefix 속성을 사용하지 않으므로 인덱스를 사용하지 않는다.

Skiplist 역시 선택적으로 unique를 선언해서, 인덱스 되는 속성의 값을 불허할 수 잇다. sparse와 non-sparse 도 선언될 수 있다.

여러 종류의 skiplist index가 있으며 아래와 같은 특성이 있다.

unique skiplist index: collection의 모든 document는 index되는 속성의 값이 유일하다. 이미 존재하는 속성의 document를 넣으면 제약 조건 위반을 초래한다.이 인덱스는 sparse가 아니므로 값이 없거나 null인 document들도 인덱스 된다. null value는 그래서 딱 하나의 document에서만 허용된다. 따라서 optional 속성에는 이 인덱스를 사용할 수 없다.
unique, sparse skiplist index: unique skiplist index와 동일하지만, null값이나 속성값이 없는 document는 index하지 않는다.
non-unique skiplist index: null이나 속성값이 없는 document도 인덱스하며, 중복이 허용된다.
non-unique, sparse skiplist index: null이나 속성값이 없는 document는 인덱스 하지 않으며, 중복이 허용된다.

skiplist의 할부시간복잡도는 인덱스 되는 document의 개수를 N이라고 할 때 logN이 된다.

Skiplist index 역시 배열에 인덱스를 걸 수 있다.

Persistent Index

persistent index는 정렬된 인덱스로, document가 저장될 때와 업데이트 될 때 index가 disk에 쓰여져 영구 보존된다. 이는 서버가 재시작되거나, index된 collection이 초기에 로드될 때 data로부터 재생성할 필요가 없음을 의미한다. 그래서 영구 index를 사용하는 것은 collection의 로드 시간을 줄여줄 수 있다.

영구 인덱스 타입은 두 번째 인덱스로 바로 사용될 수 있다. 이것은 collection에 in-memory primary 인덱스가 항상 존재하고, 잠재적으로 edge index와 같이 더 많은 index가 있기 때문에, 영구 인덱스가 현재 collection의 유일한 인덱스로는 될 수 없음을 의미한다.

이 index의 구현은 RocksDB 엔진을 사용하며, insert, update, remove 연산을 log시간이 걸린다. 영구 index는 in-memory 인덱스가 아니기 때문에, primary index로 가는 pointer를 저장하지 않고, 대신에 document의 primary key를 저장한다. 영구 index를 통해 document를 찾을 경우, primary index를 찾는 추가적인 O(1) 시간이 필요하다.

영구 index는 정렬되어 있어서, 검색, 범위 검색, 정렬 동작에 사용될 수 있지만, query에 모든 속성이 제공되거나, 인덱스를 선언한 순서대로 속성이 나열되어야 한다.

Geo Index

사용자들은 collection에서 하나 혹은 여러 속성에 geo index를 생성할 수 있다. geo index는 장소나 지표면을 빠르게 검색하는데 사용할 수 있다.

geo index는 2차원 좌표를 저장한다. 이는 위도와 경도를 모두 포함하는 배열 속성이나 두 개의 document 속성에다가 생성될 수 있다. 위도와 경도 값은 숫자여야 한다.

geo index는 주어진 좌표와 가장 가까운 좌표를 갖는 document를 검색하는 동작을 제공하며, 주어진 좌표 주위 특정 반지름내에 존재하는 좌표를 갖는 document를 찾는 것을 제공한다.

geo index는 AQL 함수나, simple 쿼리 함수를 통해 사용되며, 다른 방법으로는 사용될 수 없다.

Fulltext Index

전체 텍스트 색인을 사용하여 문서에서 단어 또는 단어의 접두사를 찾을 수 있습니다. 전체 텍스트 색인은 하나의 속성에만 만들 수 있으며, 그 속성의 텍스트 값을 가지는 문서에 포함 된 모든 단어를 인덱싱합니다. 지정할 수 있는 최소 길이의 단어만 색인됩니다. 워드 토큰화는 libicu에서 제공하는 단어 경계 분석을 사용하여 수행되며, 서버 시작시 해당 서버의 언어를 따른다. 단어는 소문자 형태로 색인화됩니다. 이 인덱스는 완전 일치 조회 및 접두사 조회를 지원하며, 부분적인 결과를 결합하는 and, or, not 연산을 지원한다.

전체 텍스트 색인은 sparse입니다. 즉, 인덱스 속성이 설정된 문자열 값을 포함하는 문서만 인덱싱합니다. 또한 구성 가능한 최소 길이의 단어만 인덱스에 포함됩니다.
전체 텍스트 색인은 AQL의 전용 함수나 간단한 쿼리에서 사용되지만 다른 유형의 조회 및 조건에서는 사용하지 않습니다.

Indexing attributes and sub-attributes

최상위 속성뿐만 아니라 내부 속성도 인덱스 될 수 있다. 최상위 속성을 인덱싱하기 위해서는 속성 이름이 필요하다. 하나의 필드에 인덱스를 하기 위해서 ensureIndex() 메소드의 fields 파라미터에 하나의 element를 가진 배열을 기재한다. element는 인덱스하려는 속성의 이름이다. 여러 field에 걸쳐 복합 인덱스를 생성하려면 fields 배열에 나열하면 된다.

// { name: "Smith", age: 35 }
db.posts.ensureIndex({ type: "hash", fields: [ "name" ] })
db.posts.ensureIndex({ type: "hash", fields: [ "name", "age" ] })

sub속성에 인덱스를 걸기 위해서는 .(dot)를 사용하여 속성의 경로를 기재한다.

// { name: {last: "Smith", first: "John" } }
db.posts.ensureIndex({ type: "hash", fields: [ "name.last" ] })
db.posts.ensureIndex({ type: "hash", fields: [ "name.last", "name.first" ] })

Indexing array values

만약 array를 포함한 속성을 인덱스하려고 하면, ArangoDB는 기본적으로 인덱스 값으로 배열 전체를 저장한다. 배열의 각 멤버를 인덱스를 통해 접근하는 것은 이러한 방법으로는 불가능하다.

전체 배열 값을 저장하지 않고, 배열 각 멤버의 값을 인덱스하기 위해서는 특별한 배열 인덱스가 필요하다. 배열 인덱스는 collection.ensureIndex 함수를 통해 hash나 skiplist 인덱스로 설정될 수 있다. hash나 skiplist 배열 인덱스를 만들기 위해 인덱스를 생성할때 인덱스 속성명은 [*]가 추가되어야 한다.

아래는 배열 hash index를 tags 속성에 거는 예제이다.

db.posts.ensureIndex({ type: "hash", fields: [ "tags[*]" ] });
db.posts.insert({ tags: [ "foobar", "baz", "quux" ] });

이 배열 인덱스는 AQL 쿼리에서 IN 연산자를 통해 각 tags 값을 찾는데 사용될 수 있다.

FOR doc IN posts
  FILTER 'foobar' IN doc.tags
  RETURN doc

확장연산자 [*]를 추가해도 되지만 필수는 아니다. 확장 연산자를 기재하는 것은 명시적으로 array index를 사용하라는 것이지만, 명시적일뿐이다.

FOR doc IN posts
  FILTER 'foobar' IN doc.tags[*]
  RETURN doc

아래 필터 조건은 배열 인덱스를 사용하지 않음에 주의하라

FILTER doc.tags ANY == 'foobar'
FILTER doc.tags ANY IN 'foobar'
FILTER doc.tags IN 'foobar'
FILTER doc.tags == 'foobar'
FILTER 'foobar' == doc.tags

배열 값의 sub 속성에도 인덱스를 생성할 수 있다. 아래와 같이 object 배열이라면 가능한 상황이다.

db.posts.ensureIndex({ type: "hash", fields: [ "tags[*].name" ] });
db.posts.insert({ tags: [ { name: "foobar" }, { name: "baz" }, { name: "quux" } ] });

아래 쿼리는 배열 인덱스를 사용한다. 이 경우는 명시적으로 확장 연산자([*])를 기재해야 한다.

FOR doc IN posts
  FILTER 'foobar' IN doc.tags[*].name
  RETURN doc

만약 한 document가 하위 속성이 없는 element들의 집합인 배열을 가지고 있다면, 이 document는 null 값으로 인덱스 될 것이며, ArangoDB에서 이는 속성이 존재하지 않는 것과 과 같다.

ArangoDB는 배열 인덱스에 하나의 속성당 하나의 [*] 연산자만 허용한다. 예를 들어 아래는 인덱스가 가능하지 않다.

db.posts.ensureIndex({ type: "hash", fields: [ "tags[*].name[*].value" ] });

배열 인덱스 값은 자동으로 중복되지 않는다. 예를 들어 collection에 아래와 같은 document를 넣는다고 가정할 때, bar 값이 중복되지만 인덱스는 한 번만 저장된다.

db.posts.insert({ tags: [ "foobar", "bar", "bar" ] });

배열의 인덱스가 unique로 선언 된 경우 인덱스에 값을 삽입하기 전에 배열의 중복 제거가 이루어 지므로 위 삽입 작업은 실패하지 않습니다. 인덱스에 이미 bar 값의 인스턴스가 포함되어있는 경우는 실패하지만 값 bar가 인덱스에 존재하지 않는 경우는 성공합니다.

배열 인덱스가 선언되어 있고, 지정된 속성에 배열이 없는 문서를 저장하면 이 문서는 인덱스에 삽입되지 않습니다. 따라서 다음 개체는 인덱스에 등록되지 않습니다.

db.posts.ensureIndex({ type: "hash", fields: [ "tags[*]" ] });
db.posts.insert({ something: "else" });
db.posts.insert({ tags: null });
db.posts.insert({ tags: "this is no array" });
db.posts.insert({ tags: { content: [1, 2, 3] } });

배열 인덱스는 명시적인 null 값을 인덱싱 할 수 있습니다. 조회하면 배열에 명시 적으로 null이 저장되어있는 문서만 리턴할 것이며, 배열이 전혀 없는 문서는 반환되지 않습니다.

db.posts.ensureIndex({ type: "hash", fields: [ "tags[*]" ] });
db.posts.insert({tags: null}) // Will not be indexed
db.posts.insert({tags: []})  // Will not be indexed
db.posts.insert({tags: [null]}); // Will be indexed for null
db.posts.insert({tags: [null, 1, 2]}); // Will be indexed for null, 1 and 2

배열 인덱스 sparse로 선언하는 것은 인덱스의 배열 부분에는 영향을 주지 않습니다. 이것은 특히 명시적인 null값도 sparse 버전에서도 인덱스화되는 것을 의미합니다. 인덱스가 배열과 일반 속성으로 결합되어 있는 경우, 그 속성에 sparsity가 적용됩니다.

db.posts.ensureIndex({ type: "hash", fields: [ "tags[*]", "name" ], sparse: true });
db.posts.insert({tags: null, name: "alice"}) // Will not be indexed
db.posts.insert({tags: [], name: "alice"}) // Will not be indexed
db.posts.insert({tags: [1, 2, 3]}) // Will not be indexed
db.posts.insert({tags: [1, 2, 3], name: null}) // Will not be indexed
db.posts.insert({tags: [1, 2, 3], name: "alice"})
// Will be indexed for [1, "alice"], [2, "alice"], [3, "alice"]
db.posts.insert({tags: [null], name: "bob"})
// Will be indexed for [null, "bob"]

배열 인덱스를 사용한 필터링은 AQL 쿼리에서만 작동하며, 쿼리 IN 연산자를 사용하여 인덱스 속성을 필터링하는 경우에만 작동한다는 점에 유의하십시오. 현재 다른 비교 연산자 (==,! =,>,> =, <, <, ANY, ALL NONE)는 배열 인덱스를 사용할 수 없습니다.

Vertex centric indexes

위에서 언급한 바와 같이 그래프의 가장 중요한 인덱스는 에지 컬렉션 _from 속성과 _to 속성을 인덱스하는 에지 인덱스입니다. 그들은 주어진 정점에서 시작하거나 거기에 도착하는 모든 에지에 매우 빠른 액세스를 제공하여 정점의 모든 이웃을 빠르게 찾을 수 있도록 한다.

많은 경우 더 구체적인 쿼리를 실행하고 싶을 수 있습니다. 예를 들어, 타임 스탬프가 가장 최근인 20의 정점에서 시작하는 에지들을 찾을 수 있습니다. 이는 "정점 중심의 인덱스"로 얻을 수 있습니다. 어떤 의미에서는 이러한 것들은 edge 컬렉션의 지역화 된 인덱스이며, 이것은 모든 단일 정점에 위치합니다.

기술적으로는 이들은 ArangoDB에서 인덱스로 구현되어 있습니다. 이 인덱스는 먼저 _from에서 다른 속성으로 전체 edge 컬렉션을 정렬합니다. 예를 들어, 에지 컬렉션의 _from 및 타임 스탬프에 skiplist 인덱스가 있는 경우, 인덱스의 단일 범위 참조로 매우 빠르게 위의 문제를 해결할 수 있다.

ArangoDB 3.0부터는 특수 에지 속성인 _from 또는 _to와 기타 속성을 인덱스하는 정렬 된 인덱스 (유형 "skiplist"및 "persistent")을 만들 수 있습니다. ArangoDB 3.1부터는 optimizer에 의해 발견된 적절한 FILTER 구문에서 이것들이 그래프 탐색에 사용된다.

예를 들어, 위 형태의 vertex centric index를 생성하기 위해서는 아래와 같이 할 수 있다.

db.edges.ensureIndex({"type":"skiplist", "fields": ["_from", "timestamp"]});

그러면 다음과 같은 쿼리는

FOR v, e, p IN 1..1 OUTBOUND "V/1" edges
  FILTER e.timestamp >= "2016-11-09"
  RETURN p

정점 "V / 1"에서 시작하는 에지는 많지만, 최근의 타임 스탬프는 조금 밖에 존재하지 않는 경우에는 상당히 빠르게된다.

ㅎH킹의 저장소