Fili Architecture

Fili is a low-latency query engine for Druid. It is designed to query petabytes of data at the interactive speeds that BI and analytics environments require.

Fili is useful for short, interactive, ad hoc queries on large-scale data sets. It can issue complex Druid queries from requests made over HTTPS.

High-Level Architecture


Fili Endpoints

Fili Query Execution

When you submit an HTTPS query, Fili translates it into the JSON format that Druid understands and sends the translated query to the Druid broker. The Request Flow section below explains how Fili translates an HTTPS request into a native Druid query and optimizes it.
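
As a rough illustration of this translation, the sketch below turns the parts of a data request into a native Druid groupBy query in JSON. The request shape (table, time grain, dimensions, metrics, interval), the class name QueryTranslationSketch, the hand-rolled JSON, and the choice of a longSum aggregator for every metric are assumptions made for illustration, not Fili's actual translation code.

```java
import java.util.List;
import java.util.stream.Collectors;

class QueryTranslationSketch {

    // Wraps a value in double quotes (no escaping; inputs here are simple identifiers).
    static String quote(String s) {
        return "\"" + s + "\"";
    }

    /**
     * Builds a Druid groupBy query from the pieces of a hypothetical data request.
     * Assumption: every metric is a longSum over a column of the same name.
     */
    static String toDruidGroupBy(String table, String grain, List<String> dimensions,
                                 List<String> metrics, String interval) {
        String dims = dimensions.stream()
                .map(QueryTranslationSketch::quote)
                .collect(Collectors.joining(","));
        String aggs = metrics.stream()
                .map(m -> "{\"type\":\"longSum\",\"name\":" + quote(m)
                        + ",\"fieldName\":" + quote(m) + "}")
                .collect(Collectors.joining(","));
        return "{\"queryType\":\"groupBy\""
                + ",\"dataSource\":" + quote(table)
                + ",\"granularity\":" + quote(grain)
                + ",\"dimensions\":[" + dims + "]"
                + ",\"aggregations\":[" + aggs + "]"
                + ",\"intervals\":[" + quote(interval) + "]}";
    }

    public static void main(String[] args) {
        // Hypothetical request: daily pageViews by gender from a "network" table.
        System.out.println(toDruidGroupBy("network", "day", List.of("gender"),
                List.of("pageViews"), "2014-09-01/2014-09-08"));
    }
}
```

The point of the sketch is only the direction of the mapping: a handful of request parameters expand into the much more verbose JSON body that the Druid broker expects.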

Request Flow

Request Modelling

When a request hits one of the endpoints, the first step is to deserialize it into an API request object.

Based on the nature of the request, Fili supports 6 types of API requests:

  1. Data API request
  2. Dimension API request
  3. Tables API request
  4. Metrics API request
  5. Jobs API request
  6. Slices API request

Each type is defined as an interface, and Fili provides a default implementation of each. The object-oriented model of Fili API requests is summarized below:

Request Model
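
The interface-plus-default-implementation pattern described above can be sketched as follows. All type names, fields, and methods here (ApiRequestSketch, DataApiRequestSketch, DimensionApiRequestSketch, DefaultDataApiRequest) are simplified stand-ins invented for this illustration; they are not Fili's real signatures, and only two of the six request types are shown.

```java
import java.util.List;

// Small demo entry point; all names in this file are hypothetical stand-ins.
class RequestModelDemo {
    public static void main(String[] args) {
        DataApiRequestSketch request = new DefaultDataApiRequest(
                "json", "network", List.of("pageViews"), "2018-01-01/2018-03-01");
        System.out.println(request.table() + " " + request.metrics() + " " + request.interval());
    }
}

// Common parent of every request type.
interface ApiRequestSketch {
    String format(); // requested response format, e.g. "json"
}

// One interface per request type; the other types would follow the same pattern.
interface DataApiRequestSketch extends ApiRequestSketch {
    String table();
    List<String> metrics();
    String interval();
}

interface DimensionApiRequestSketch extends ApiRequestSketch {
    String dimension();
}

// Default implementation of the data request interface.
final class DefaultDataApiRequest implements DataApiRequestSketch {
    private final String format;
    private final String table;
    private final List<String> metrics;
    private final String interval;

    DefaultDataApiRequest(String format, String table, List<String> metrics, String interval) {
        this.format = format;
        this.table = table;
        this.metrics = List.copyOf(metrics); // defensive, immutable copy
        this.interval = interval;
    }

    @Override public String format() { return format; }
    @Override public String table() { return table; }
    @Override public List<String> metrics() { return metrics; }
    @Override public String interval() { return interval; }
}
```

Coding against the interfaces rather than the default classes is what would let an application substitute its own request implementation without touching the code that consumes requests.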

Partial Data Request Handler

When a user queries data for a certain time range, 2018-01-01/2018-03-01 for example, it is possible that the most recent data in that range has not yet been ingested into the Druid cluster. In this case, the user will receive incomplete data. Fili notifies the user of this through PartialDataRequestHandler.

The PartialDataRequestHandler compares the interval that the query requests with the intervals for which Druid has data. If the query interval contains a date range outside of what’s available, Fili adds a header entry called “missingIntervals” to the response to indicate those intervals.
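
The comparison described above amounts to subtracting the available intervals from the requested one. The sketch below shows one way to do that with java.time. The class MissingIntervalsSketch, its nested Interval type, day-level granularity, and the requirement that available intervals are sorted and non-overlapping are all assumptions of this example, not a description of how PartialDataRequestHandler is implemented.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

class MissingIntervalsSketch {

    /** Half-open date interval [start, end), printed as start/end. */
    static final class Interval {
        final LocalDate start;
        final LocalDate end;

        Interval(String start, String end) {
            this(LocalDate.parse(start), LocalDate.parse(end));
        }

        Interval(LocalDate start, LocalDate end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + "/" + end;
        }
    }

    /**
     * Returns the parts of the requested interval that no available interval covers.
     * Assumes `available` is sorted by start date and non-overlapping.
     */
    static List<Interval> missing(Interval requested, List<Interval> available) {
        List<Interval> gaps = new ArrayList<>();
        LocalDate cursor = requested.start; // everything before cursor is accounted for
        for (Interval a : available) {
            if (!a.end.isAfter(cursor)) continue;          // ends before the cursor
            if (!a.start.isBefore(requested.end)) break;   // starts after the request ends
            if (a.start.isAfter(cursor)) gaps.add(new Interval(cursor, a.start));
            cursor = a.end;
            if (!cursor.isBefore(requested.end)) break;    // request fully covered
        }
        if (cursor.isBefore(requested.end)) gaps.add(new Interval(cursor, requested.end));
        return gaps;
    }

    public static void main(String[] args) {
        // Hypothetical availability: data ingested only up to 2018-02-10.
        Interval requested = new Interval("2018-01-01", "2018-03-01");
        List<Interval> available = List.of(new Interval("2017-01-01", "2018-02-10"));
        System.out.println(missing(requested, available));
    }
}
```

In this hypothetical case the uncovered tail of the request, 2018-02-10/2018-03-01, is the kind of value that would be reported under “missingIntervals”.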

Response Processing

If you do not specify a response format, the response is returned as JSON.

Fili Internals

Druid Query Model

A Druid JSON query corresponds to the type DruidQuery in Fili. An object of type DruidQuery contains information about the query type (e.g., a GroupBy query), the Druid datasource that the query is sent to, the query context, and the inner query. The type allows developers to set the datasource and the query context.

DruidQuery is broken down into two subtypes:

  1. DruidMetaDataQuery: Fili currently supports two native Druid metadata queries, Time Boundary queries and Segment Metadata queries
  2. DruidFactQuery: Fili currently supports the following native Druid data queries

An HTTPS request is parsed to generate the corresponding DruidQuery objects in Fili. Each query is then serialized into a Druid JSON query and sent to the Druid broker for processing.
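
To make the query model more concrete, here is a minimal sketch of a query type that carries a query type, a datasource, and a context, and serializes itself to Druid JSON. The classes DruidQuerySketch and TimeBoundaryQuerySketch, the withDataSource method, the string-only context values, and the hand-written serialization are hypothetical simplifications for this example; they do not reproduce Fili's DruidQuery interface or its serializer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Small demo entry point; all names in this file are hypothetical stand-ins.
class DruidQueryModelDemo {
    public static void main(String[] args) {
        TimeBoundaryQuerySketch query =
                new TimeBoundaryQuerySketch("network", Map.of("queryId", "demo"));
        System.out.println(query.toJson());
        System.out.println(query.withDataSource("network_hourly").toJson());
    }
}

// Fields shared by every query: type, datasource, and context.
abstract class DruidQuerySketch {
    final String queryType;
    final String dataSource;
    final Map<String, String> context;

    DruidQuerySketch(String queryType, String dataSource, Map<String, String> context) {
        this.queryType = queryType;
        this.dataSource = dataSource;
        this.context = new LinkedHashMap<>(context); // private copy of the caller's map
    }

    // Serializes the shared fields; assumes values need no JSON escaping.
    String toJson() {
        String ctx = context.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
                .collect(Collectors.joining(","));
        return "{\"queryType\":\"" + queryType + "\""
                + ",\"dataSource\":\"" + dataSource + "\""
                + ",\"context\":{" + ctx + "}}";
    }
}

// Metadata branch of the model: a time boundary query needs only the shared fields.
final class TimeBoundaryQuerySketch extends DruidQuerySketch {
    TimeBoundaryQuerySketch(String dataSource, Map<String, String> context) {
        super("timeBoundary", dataSource, context);
    }

    // Returns a copy of this query that targets a different datasource.
    TimeBoundaryQuerySketch withDataSource(String newDataSource) {
        return new TimeBoundaryQuerySketch(newDataSource, context);
    }
}
```

Returning a modified copy from withDataSource, rather than mutating the query, is a design choice of this sketch: it keeps a query object safe to share while later processing steps adjust the datasource or context.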