[자료구조][Linux] Bitmap(비트맵), Bit Array(비트 배열)

티스토리 뷰

Algorithm/이론

[자료구조][Linux] Bitmap(비트맵), Bit Array(비트 배열)

SweetDev 2021. 11. 9. 22:49

비트를 지도처럼 쭉 늘여놓아서 비트맵이다.

데이터를 비트 단위로 처리하기 위해서 long 배열로 할당한 자료형

C의 모든 자료형은 최소 단위가 byte라서, 실제로는 8의 배수의 bit로 나오게 된다.

기본 연산

OR can be used to set a bit to one: 11101010 OR 00000100 = 11101110
AND can be used to set a bit to zero: 11101010 AND 11111101 = 11101000
AND together with zero-testing can be used to determine if a bit is set:11101010 AND 00000001 = 00000000 = 011101010 AND 00000010 = 00000010 ≠ 0
XOR can be used to invert or toggle a bit:11101010 XOR 00000100 = 1110111011101110 XOR 00000100 = 11101010
NOT can be used to invert all bits.NOT 10110010 = 01001101

Compression[edit]

A bit array is the most dense storage for "random" bits, that is, where each bit is equally likely to be 0 or 1, and each one is independent. But most data are not random, so it may be possible to store it more compactly. For example, the data of a typical fax image is not random and can be compressed.

Run-length encoding is commonly used to compress these long streams. However, most compressed data formats are not so easy to access randomly; also by compressing bit arrays too aggressively we run the risk of losing the benefits due to bit-level parallelism (vectorization).

Thus, instead of compressing bit arrays as streams of bits, we might compress them as streams of bytes or words (see Bitmap index (compression)).

Advantages and disadvantages[edit]

Bit arrays, despite their simplicity, have a number of marked advantages over other data structures for the same problems:

They are extremely compact; no other data structures can store n independent pieces of data in n/w words.
They allow small arrays of bits to be stored and manipulated in the register set for long periods of time with no memory accesses.
Because of their ability to exploit bit-level parallelism, limit memory access, and maximally use the data cache, they often outperform many other data structures on practical data sets, even those that are more asymptotically efficient.

However, bit arrays aren't the solution to everything. In particular:

Without compression, they are wasteful set data structures for sparse sets (those with few elements compared to their range) in both time and space. For such applications, compressed bit arrays, Judy arrays, tries, or even Bloom filters should be considered instead.
Accessing individual elements can be expensive and difficult to express in some languages. If random access is more common than sequential and the array is relatively small, a byte array may be preferable on a machine with byte addressing. A word array, however, is probably not justified due to the huge space overhead and additional cache misses it causes, unless the machine only has word addressing.

[출처]

https://junsoolee.gitbook.io/linux-insides-ko/summary/datastructures/linux-datastructures-3

저작자표시 (새창열림)

'Algorithm > 이론' 카테고리의 다른 글

이책을 추천합니다 (0)	2022.01.03
[Algorithm] Unrolled Linked List (0)	2021.11.23
[자료구조][Linux][WIP] XArray (0)	2021.11.09
[자료구조][Linux] Red Black Tree (0)	2021.11.08
[자료구조] 래딕스 트리(Radix Tree) (기수 트리) (0)	2021.11.08

공지사항

최근에 올라온 글

최근에 달린 댓글

Total

Today

Yesterday

링크

TAG more

« 2025/06 »
일	월	화	수	목	금	토
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

글 보관함

sweetdev

티스토리 뷰