Druid and related time-series databases: inverted-index optimizations are used here too
Published: 2019-06-18


Cattell [6] maintains a great summary of existing scalable SQL and NoSQL data stores. Hu [18] contributed another great summary for streaming databases. Feature-wise, Druid sits somewhere between Google’s Dremel [28] and PowerDrill [17]. Druid has most of the features implemented in Dremel (Dremel handles arbitrary nested data structures while Druid only allows for a single level of array-based nesting) and many of the interesting compression algorithms mentioned in PowerDrill.
Although Druid builds on many of the same principles as other distributed columnar data stores [15], many of these data stores are designed to be more generic key-value stores [23] and do not support computation directly in the storage layer. There are also other data stores designed for some of the same data warehousing issues that Druid is meant to solve. These systems include in-memory databases such as SAP’s HANA [14] and VoltDB [43]. These data stores lack Druid’s low-latency ingestion characteristics. Druid also has native analytical features baked in, similar to ParAccel [34]; however, Druid allows system-wide rolling software updates with no downtime.
Druid is similar to C-Store [38] and LazyBase [8] in that it has two subsystems: a read-optimized subsystem in the historical nodes and a write-optimized subsystem in the real-time nodes. Real-time nodes are designed to ingest a high volume of append-heavy data, and do not support data updates. Unlike the two aforementioned systems, Druid is meant for OLAP transactions and not OLTP transactions.
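The split between a write-optimized and a read-optimized subsystem can be illustrated with a small sketch. This is not Druid's implementation; the class names `WriteBuffer` and `ReadSegment` and the method `sum_between` are hypothetical, and the sketch only assumes that the write side accepts appends without updates and periodically hands off an immutable, time-sorted segment to the read side.

```python
from bisect import bisect_left


class ReadSegment:
    """Read-optimized side (hypothetical): immutable and sorted by timestamp."""

    def __init__(self, sorted_events):
        # Store columns as tuples so the segment cannot be modified after creation.
        self._timestamps = tuple(t for t, _ in sorted_events)
        self._values = tuple(v for _, v in sorted_events)

    def sum_between(self, start, end):
        """Sum values with start <= timestamp < end, located by binary search."""
        lo = bisect_left(self._timestamps, start)
        hi = bisect_left(self._timestamps, end)
        return sum(self._values[lo:hi])


class WriteBuffer:
    """Write-optimized side (hypothetical): append-only, no updates or deletes."""

    def __init__(self):
        self._events = []

    def append(self, timestamp, value):
        # Appending is cheap; no ordering is enforced at write time.
        self._events.append((timestamp, value))

    def freeze(self):
        """Hand off buffered events as an immutable, time-sorted segment."""
        segment = ReadSegment(sorted(self._events))
        self._events = []
        return segment


buffer = WriteBuffer()
buffer.append(3, 10)
buffer.append(1, 5)
buffer.append(2, 7)

segment = buffer.freeze()
total = segment.sum_between(1, 3)  # covers timestamps 1 and 2
```

Sorting is deferred to the hand-off, so writes stay cheap while reads over a time range need only two binary searches.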
Druid’s low-latency data ingestion features share some similarities with Trident/Storm [27] and Spark Streaming [45]; however, both systems are focused on stream processing whereas Druid is focused on ingestion and aggregation. Stream processors are great complements to Druid as a means of pre-processing the data before the data enters Druid.
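As a rough illustration of what "ingestion and aggregation" can mean in practice, the sketch below rolls raw events up into one row per hour and dimension value as they are ingested. It is a simplified assumption about the idea, not Druid's actual ingestion code; the function name `roll_up` and the field names `timestamp`, `page`, and `clicks` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def roll_up(events, dimensions, metric):
    """Aggregate raw events into one row per (hour, dimension values)."""
    totals = defaultdict(int)
    for event in events:
        # Truncate the timestamp to the hour to form the time bucket.
        hour = event["timestamp"].replace(minute=0, second=0, microsecond=0)
        key = (hour,) + tuple(event[d] for d in dimensions)
        totals[key] += event[metric]
    return dict(totals)


# Hypothetical raw events; in a real pipeline a stream processor might
# clean or enrich these before they reach the data store.
events = [
    {"timestamp": datetime(2019, 6, 18, 1, 5), "page": "A", "clicks": 2},
    {"timestamp": datetime(2019, 6, 18, 1, 40), "page": "A", "clicks": 3},
    {"timestamp": datetime(2019, 6, 18, 2, 10), "page": "A", "clicks": 1},
    {"timestamp": datetime(2019, 6, 18, 1, 15), "page": "B", "clicks": 7},
]

summary = roll_up(events, ["page"], "clicks")
```

Four raw events collapse into three stored rows here; the coarser the time bucket and the fewer the dimensions, the greater the reduction.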
There is a class of systems that specialize in queries on top of cluster computing frameworks. Shark [13] is such a system for queries on top of Spark, and Cloudera’s Impala [9] is another system focused on optimizing query performance on top of HDFS. Druid historical nodes download data locally and only work with native Druid indexes. We believe this setup allows for faster query latencies.
Druid leverages a unique combination of algorithms in its architecture. Although we believe no other data store has the same set of functionality as Druid, some of Druid’s optimization techniques, such as using inverted indices to perform fast filters, are also used in other data stores [26].
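The inverted-index filtering mentioned above can be sketched in a few lines. This is a minimal illustration using plain Python sets, not Druid's on-disk format or its compressed bitmaps; the helper `build_inverted_index` and the sample column names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(rows, dimension):
    """Map each distinct value of `dimension` to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[dimension]].add(row_id)
    return index


# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"page": "A", "city": "X", "added": 10},
    {"page": "A", "city": "Y", "added": 25},
    {"page": "B", "city": "Y", "added": 17},
    {"page": "B", "city": "X", "added": 3},
]

page_index = build_inverted_index(rows, "page")
city_index = build_inverted_index(rows, "city")

# The filter page = "B" AND city = "Y" becomes an intersection of row-id sets,
# so only matching rows are touched when aggregating.
matching = page_index["B"] & city_index["Y"]
total_added = sum(rows[i]["added"] for i in matching)
```

An OR filter would use set union instead of intersection; production systems typically store these row-id sets as compressed bitmaps so the same operations become bitwise AND and OR.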
 
Druid white paper: http://static.druid.io/docs/druid.pdf

Reposted from: http://iagkx.baihongyu.com/
