Filtering and Full-Text Search

1) Why you need a search layer

Filtering and full-text search (FTS) provide quick access to data "by meaning," not just by primary keys. A properly designed search layer combines:

Strict filters (categories, dates, prices, access rights)
Full text (lexical match and ranking)
Facets (aggregates for navigation)
Hybrid Ranking (BM25/TF-IDF + Vector Embeddings)
Reliable protocols (cursor pagination, token TTL, cross-sharding)

2) Architectural picture

Components:

1. Ingest/ETL → normalization, deduplication, enrichment, building fields for the index.

2. Indexer → inverted index (tokens → documents), columnar structures, vector index (HNSW/IVF-PQ).

3. Query Layer → request parser, application of filters/access rights, shard scheduler, k-way merge.

4. Ranker → BM25 + LTR/Neural re-rank.

5. Serving → cache, cursors, facets, highlights, autocomplete.

6. Observability → latency, quality metrics, A/B experiments.
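
The k-way merge performed by the Query Layer (component 3) can be sketched as follows. This is a minimal illustration, assuming each shard already returns its hits sorted by score descending with the document id as tie-breaker; the `Hit` tuple and the `merge_shards` name are hypothetical, not part of any specific engine.

```python
import heapq
from itertools import islice
from typing import Iterable, List, Tuple

# A hit is (score, doc_id). Assumption: every shard returns its hits
# already sorted by score descending, then doc_id ascending.
Hit = Tuple[float, str]


def merge_shards(shard_results: Iterable[List[Hit]], limit: int) -> List[Hit]:
    """Lazily merge per-shard sorted lists into one global top-`limit` list."""
    # The key mirrors the per-shard order: higher score first, doc_id breaks ties.
    merged = heapq.merge(*shard_results, key=lambda hit: (-hit[0], hit[1]))
    # islice stops after `limit` hits, so the tails of the shards are never read.
    return list(islice(merged, limit))


shard_a = [(9.1, "a3"), (4.0, "a1")]
shard_b = [(9.1, "b7"), (7.5, "b2"), (1.2, "b9")]
top = merge_shards([shard_a, shard_b], limit=3)
```

Because the merge is lazy, the cost depends on `limit` and the number of shards rather than on the total number of hits each shard holds.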

3) Data and index model

3.1 Fields and analyzers

Types: keyword (exact match), text (analyzed), numeric/date/geo, vector.
Analyzers: tokenization, normalization (lowercase, Unicode NFKC), filters (stopwords, stemming/lemmatization).
Multilingualism: per-field analyzers (ru, uk, en); ICU analysis; transliteration; diacritic handling.

3.2 Inverted index (sparse)

Structure: term → posting list (docID, term freq, positions).
Ranking: BM25 (or classic TF-IDF) with field boosts.
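
To make the term → posting list structure and BM25 ranking concrete, here is a minimal in-memory sketch. It assumes whitespace tokenization and lowercasing instead of a real analyzer, stores only term frequencies (no positions), and uses one common BM25 variant; the function names and the sample documents are illustrative.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Return (postings, doc_len); postings maps term -> {doc_id: term_freq}."""
    postings = defaultdict(dict)
    doc_len = {}
    for doc_id, text in docs.items():
        tokens = text.lower().split()  # stand-in for a real analyzer
        doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            postings[term][doc_id] = tf
    return postings, doc_len


def bm25(query, postings, doc_len, k1=1.2, b=0.75):
    """Score documents for `query`; returns [(doc_id, score)] best first."""
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        plist = postings.get(term, {})
        if not plist:
            continue
        # Smoothed IDF that stays non-negative for very frequent terms.
        idf = math.log(1 + (n_docs - len(plist) + 0.5) / (len(plist) + 0.5))
        for doc_id, tf in plist.items():
            # Length normalization: long documents need more matches.
            norm = tf + k1 * (1 - b + b * doc_len[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


docs = {
    "d1": "classic slots with classic reels",
    "d2": "live casino tables",
    "d3": "classic table games",
}
postings, doc_len = build_index(docs)
ranked = bm25("classic slots", postings, doc_len)
```

Field boosts would be applied by keeping one such index per field and summing the weighted per-field scores.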

3.3 Vector index (dense)

Text embeddings (for example, 384-1024-dimensional).
ANN structures: HNSW, IVF-PQ, Flat (for small sets).
Cosine similarity/inner product; score calibration against BM25 (hybrid).

3.4 Facets and aggregates

Precomputed/columnar storage of values for fast counts.
Hierarchical facets (category/subcategory).
Ranges (price bins, dates).
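
A small sketch of term facets and range bins follows. It assumes documents are plain dictionaries and that bins are half-open intervals; in a real engine the counts would be read from columnar structures rather than by scanning documents. The helper names are hypothetical.

```python
import bisect
from collections import Counter


def term_facet(docs, field):
    """Count documents per distinct value of `field`."""
    return Counter(d[field] for d in docs if field in d)


def range_facet(docs, field, edges):
    """Count values into half-open bins [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for d in docs:
        value = d.get(field)
        # Skip missing values and values outside the configured range.
        if value is None or not (edges[0] <= value < edges[-1]):
            continue
        counts[bisect.bisect_right(edges, value) - 1] += 1
    bins = zip(edges, edges[1:])
    return {f"{lo}-{hi}": n for (lo, hi), n in zip(bins, counts)}


docs = [
    {"brand": "NetEnt", "price": 5},
    {"brand": "NetEnt", "price": 10},
    {"brand": "EGT", "price": 25},
    {"brand": "EGT", "price": 99},
    {"brand": "EGT", "price": 100},  # falls outside the last bin
]
brand_counts = term_facet(docs, "brand")
price_counts = range_facet(docs, "price", [0, 10, 20, 50, 100])
```

The same functions can be run over the hits that pass the active filters, so the counts reflect what the user would actually see after clicking a facet value.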

4) Queries: filters + full-text + sort

4.1 API Contracts (REST)

Request:


GET /v1/search?q=classic slots&limit=20&cursor=...&sort=score:desc,created_at:desc
&filters=brand:("NetEnt","EGT");price:[10 TO 50];published_at:[2024-01-01 TO ]
&facets=brand,year,price:range(0,10,20,50,100)

Response (fragment):

json
{
  "items": [ { "id": "...", "title": "...", "score": 12.3, "highlight": { "content": ["..."] } } ],
  "facets": { "brand": [ { "value": "NetEnt", "count": 123 }, ... ] },
  "page": { "limit": 20, "has_more": true, "next_cursor": "opaque-token" }
}

4.2 GraphQL (simplified)

graphql
type Query {
  search(query: String!, filter: SearchFilter, first: Int, after: String, sort: [Sort!]): SearchConnection!
}

4.3 gRPC

proto
message SearchRequest {
  string query = 1;
  map<string, string> filters = 2;
  int32 page_size = 3;
  string page_token = 4; // cursor
  repeated string facets = 5;
}

5) Natural Language Processing (NLP)

Tokenization/normalization: Unicode-safe, with hyphen/apostrophe handling.
Stopwords: customizable per-language lists.
Stemming vs lemmatization: for ru/uk, lemmatization is better (quality over speed).
Synonyms: bidirectional/directional dictionaries; dictionary versions with TTL.
Typos (fuzzy): Damerau-Levenshtein with a distance limit and boosts for exact matches.
N-grams/edge-ngrams: for autocomplete and suggestions.
Transliteration: correspondence rules between Cyrillic and Latin spellings (for example "щ" ↔ "shch", or variant spellings of "kyiv").
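
Two of the techniques above are easy to show in a few lines: edge n-grams for autocomplete and a typo-tolerant edit distance. The sketch below uses the optimal string alignment distance, a restricted form of Damerau-Levenshtein that counts an adjacent transposition as one edit; the function names and default lengths are assumptions.

```python
def edge_ngrams(token, min_len=2, max_len=10):
    """Prefixes of `token` from min_len up to max_len characters."""
    upper = min(len(token), max_len)
    return [token[:n] for n in range(min_len, upper + 1)]


def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,         # deletion
                cur[j - 1] + 1,      # insertion
                prev[j - 1] + cost,  # substitution or match
            )
            # An adjacent transposition counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def fuzzy_match(query, candidate, max_distance=1):
    """True when `candidate` is within the allowed edit distance of `query`."""
    return osa_distance(query.lower(), candidate.lower()) <= max_distance
```

For example, `edge_ngrams("casino")` yields the prefixes that would be indexed for autocomplete, and `fuzzy_match("csaino", "casino")` accepts a swapped pair of letters as a single typo.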

6) Relevance and ranking

6.1 Basic lexical scoring

BM25 with 'k1' and 'b' tuned per collection.
Boosts by field (title^3, tags^1.5, body^1).
Freshness: 'score += freshness_boost(decay(created_at))'.
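
The field boosts and the freshness term can be combined as in the sketch below. It assumes per-field lexical scores are already available, uses an exponential half-life decay, and treats `freshness_boost` as a plain multiplier; the 30-day half-life and the weight of 2.0 are illustrative values, not recommendations.

```python
from datetime import datetime, timedelta, timezone

# Boosts from the text above: title^3, tags^1.5, body^1.
FIELD_BOOSTS = {"title": 3.0, "tags": 1.5, "body": 1.0}


def decay(created_at, now, half_life_days=30.0):
    """Exponential decay: 1.0 for a new document, 0.5 after one half-life."""
    age_days = max((now - created_at).total_seconds(), 0.0) / 86400
    return 0.5 ** (age_days / half_life_days)


def final_score(field_scores, created_at, now, freshness_boost=2.0):
    """Boosted sum of per-field scores plus a freshness bonus."""
    lexical = sum(
        FIELD_BOOSTS.get(name, 1.0) * score for name, score in field_scores.items()
    )
    return lexical + freshness_boost * decay(created_at, now)


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
month_old = now - timedelta(days=30)
score = final_score({"title": 1.0, "body": 2.0}, month_old, now)
```

An additive bonus keeps freshness from overwhelming a strong lexical match; a multiplicative form is another option when recency should scale with relevance.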

6.2 Behavioral cues

Click-through rate, dwell time, saves to favorites (with correction for position bias).
Deduplication: collapse documents with near-identical content (MinHash/SimHash).

6.3 Learning-to-Rank (LTR)

Features: per-field BM25, length, freshness, popularity, phrase match, positional score.
Models: LambdaMART/XGBoost; offline metrics NDCG@k, MAP, Precision@k; online A/B.
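
As a reference point for the offline metrics, NDCG@k can be computed as below. The sketch uses the exponential gain `2^rel - 1` (a common variant) and, as a simplifying assumption, derives the ideal ordering from the graded labels of the returned list only, not from all judged documents for the query.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


perfect = ndcg_at_k([3, 2, 1], k=3)
shuffled = ndcg_at_k([1, 3, 2], k=3)
```

A perfectly ordered list scores 1.0, and any misordering lowers the value, with mistakes near the top penalized most.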

6.4 Neural re-ranking

Two-stage: recall (BM25/ANN) → top-N (for example, 200) → cross-encoder re-rank.
Cost accounting: time budget; fall back to skipping the neural stage under load.

6.5 Hybrid search (sparse + dense)

Either fusion (normalize the scores and sum them), or multi-stage (dense as re-rank).
Calibration is important: min-max/z-score/quantile mapping.
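
A minimal sketch of the fusion option with min-max calibration follows. It assumes both retrievers return a `{doc_id: score}` mapping and that a document missing from one side contributes zero from that side; the weight `alpha` is a hypothetical tuning parameter.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to 1.0 everywhere."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(sparse, dense, alpha=0.5):
    """Weighted sum of normalized sparse (BM25) and dense (ANN) scores."""
    s, d = min_max(sparse), min_max(dense)
    fused = {
        doc: alpha * s.get(doc, 0.0) + (1 - alpha) * d.get(doc, 0.0)
        for doc in set(s) | set(d)
    }
    # doc_id is the tie-breaker so the order is deterministic.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


bm25_scores = {"a": 12.0, "b": 6.0, "c": 0.0}
cosine_scores = {"b": 0.9, "c": 0.8, "d": 0.1}
ranked = fuse(bm25_scores, cosine_scores)
```

Min-max is sensitive to outliers in either list, which is why z-score or quantile mapping may behave better on real traffic.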

7) Filtering, facets and access

7.1 Filters

Operators: '=', 'IN', ranges, prefixes, geo-bounding box/geo-distance.
Combinations: 'AND' across filters, 'OR' within a set of values (brand IN...).
Type safety: numeric fields are not parsed as text.

7.2 Facets

Cheap counts from precomputed structures.
"Applied" facets show the counts that remain after the active filters are applied (post-filter).

7.3 Access/multi-tenancy

Security filters are integrated before ranking (pre-filter).
ABAC/RBAC fields in the document ('tenant_id', 'visibility', 'acl').
The request token is signed; in multi-tenant setups the 'tenant_id' filter is applied automatically.
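
The pre-filter idea can be sketched as below: access checks run before any scoring, so documents the caller may not see never enter ranking. The `Principal` and `Doc` classes, the two visibility values, and the role-based `acl` check are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    tenant_id: str      # taken from the verified request token, never from the query
    roles: frozenset


@dataclass(frozen=True)
class Doc:
    id: str
    tenant_id: str
    visibility: str     # assumed values: "public" or "restricted"
    acl: frozenset = frozenset()


def allowed(doc, who):
    """Tenant isolation first, then visibility or an ACL/role overlap."""
    if doc.tenant_id != who.tenant_id:
        return False
    return doc.visibility == "public" or bool(doc.acl & who.roles)


def search(docs, who, score):
    """Apply the security pre-filter, then rank only the permitted documents."""
    candidates = [d for d in docs if allowed(d, who)]
    return sorted(candidates, key=lambda d: (-score(d), d.id))


docs = [
    Doc("1", "t1", "public"),
    Doc("2", "t2", "public"),
    Doc("3", "t1", "restricted", frozenset({"admin"})),
    Doc("4", "t1", "restricted", frozenset({"editor"})),
]
editor = Principal("t1", frozenset({"editor"}))
visible = [d.id for d in search(docs, editor, score=lambda d: 1.0)]
```

Filtering before ranking also keeps facet counts and total-hit numbers from leaking the existence of hidden documents.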

8) Pagination, cursors and consistency

Seek-cursor pagination over '(score, tie-breaker)', or over '(created_at, id)' when sorting by time.
Opaque 'page_token' with HMAC and TTL.
Consistency: near-real-time (NRT) index with a delay of 0.5-2 s between a write and its visibility. Document it in the SLA.
Cross-shard: local search → k-way merge by global order, per-shard cursors in token.
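
An opaque cursor with HMAC and TTL might look like the following sketch. The token layout (base64 JSON payload, a dot, a hex signature), the `SECRET` constant, and the function names are all assumptions; a production system would load the key from a secret store and might also encrypt the payload.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # assumption: in practice loaded from a secret store


def encode_cursor(position, ttl_s=300, now=None):
    """Pack the seek position and an expiry time into a signed token."""
    issued = int(time.time() if now is None else now)
    payload = {"pos": position, "exp": issued + ttl_s}
    raw = json.dumps(payload, separators=(",", ":")).encode()
    body = base64.urlsafe_b64encode(raw)
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    return (body + b"." + sig).decode()


def decode_cursor(token, now=None):
    """Verify signature and TTL, then return the seek position."""
    body, _, sig = token.encode().rpartition(b".")
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad cursor signature")
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["exp"] < int(time.time() if now is None else now):
        raise ValueError("cursor expired")
    return payload["pos"]


# The position is the sort key of the last returned hit: (score, tie-breaker).
token = encode_cursor([12.3, "doc-42"], now=1000)
position = decode_cursor(token, now=1100)
```

For the cross-shard case, the `pos` value could hold one such position per shard, so each shard resumes from its own last returned hit.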

9) Autocomplete and suggestions

Suggesters: prefix-trie / edge-ngrams on the `title` field.
Popular queries: click logs → suggestions by popularity + personalization (segments).
Spell-as-you-type: fast fuzzy search with distance limit '<= 1'.

Example REST:


GET /v1/suggest?q=kaz&limit=8&locale=ru
→ ["casino", "casual games", ...]
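
A prefix suggester ranked by popularity can be sketched with a sorted list and binary search, which behaves like a simple stand-in for a prefix trie. The `PrefixSuggester` class and the popularity counts are hypothetical; personalization and fuzzy matching are left out.

```python
import bisect


class PrefixSuggester:
    """Prefix lookup over a sorted phrase list, ranked by popularity."""

    def __init__(self, popularity):
        # popularity: {phrase: count}, for example derived from query logs.
        self._pop = {phrase.lower(): n for phrase, n in popularity.items()}
        self._sorted = sorted(self._pop)

    def suggest(self, prefix, limit=8):
        prefix = prefix.lower()
        # All phrases sharing the prefix form one contiguous run in sorted order.
        start = bisect.bisect_left(self._sorted, prefix)
        matches = []
        for phrase in self._sorted[start:]:
            if not phrase.startswith(prefix):
                break
            matches.append(phrase)
        # Most popular first; the phrase itself breaks ties deterministically.
        return sorted(matches, key=lambda p: (-self._pop[p], p))[:limit]


suggester = PrefixSuggester(
    {"casino": 50, "casual games": 20, "card games": 70, "slots": 90}
)
hits = suggester.suggest("cas")
```

The lookup costs a binary search plus the size of the matching run, so very short prefixes are usually capped or served from a precomputed top list.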

10) Highlights and snippets

Positional index → retrieving phrases with matches.
HTML escaping, length limit, merging of adjacent fragments.
Ranking snippets by density of relevant terms.
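
The snippet steps above can be illustrated with a simplified highlighter. As a stated simplification it locates matches with a regular expression over the stored text instead of reading positions from the index, and it returns a single fragment around the first match; the `highlight` name and the `window` parameter are assumptions.

```python
import html
import re


def highlight(text, terms, window=30):
    """Return one HTML-safe fragment around the first match, with <em> tags."""
    if not terms:
        return html.escape(text[: 2 * window])
    alternatives = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    first = pattern.search(text)
    if first is None:
        return html.escape(text[: 2 * window])
    # Length limit: keep `window` characters on each side of the first match.
    start = max(first.start() - window, 0)
    end = min(first.end() + window, len(text))
    fragment = text[start:end]
    out, pos = [], 0
    for m in pattern.finditer(fragment):
        # Escape everything from the document; only our own tags stay raw.
        out.append(html.escape(fragment[pos:m.start()]))
        out.append("<em>" + html.escape(m.group(0)) + "</em>")
        pos = m.end()
    out.append(html.escape(fragment[pos:]))
    return "".join(out)


snippet = highlight("Play <b>Classic</b> slots today", ["classic"])
```

Escaping the document text before inserting the highlight tags is what keeps markup stored in a document from being rendered in the results page.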

11) Performance, cache and SLO

Indexes: hot segments in memory; postings compression; doc values for facets.
Cache: L1 (process), L2 (Redis), facets/aggregates cache; invalidated by index version.
SLO: P95 < 150-200 ms at 'k <= 20', P99 < 500 ms; availability 99.9%.
Backpressure: decrease 'k', disable the neural stage when overloaded.
Rate limiting per API key/user/tenant.
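
One way to tie cache entries to the index version is to embed the version in the cache key, so that publishing a new index naturally stops old entries from being read. The key layout and the `cache_key` helper below are assumptions for illustration.

```python
import hashlib
import json


def cache_key(index_version, query, filters, sort, limit):
    """Build a deterministic key; a new index version yields a new key space."""
    canonical = json.dumps(
        {"q": query.strip().lower(), "f": filters, "s": sort, "l": limit},
        sort_keys=True,             # filter order must not change the key
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return f"search:v{index_version}:{digest}"


key_a = cache_key(7, "Live Casino ", {"brand": ["NetEnt"], "year": [2024]}, ["_score:desc"], 20)
key_b = cache_key(7, "live casino", {"year": [2024], "brand": ["NetEnt"]}, ["_score:desc"], 20)
key_c = cache_key(8, "live casino", {"year": [2024], "brand": ["NetEnt"]}, ["_score:desc"], 20)
```

With this layout no explicit purge is needed after re-indexing: entries under the old version simply age out through their TTL. In a multi-tenant setup the tenant identifier would also need to be part of the key.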

12) Observability and quality metrics

Technical metrics:

`search_latency_ms` (P50/P95/P99), `qps`, `timeouts`, `error_rate`
`cache_hit_ratio`, `facet_cache_hit`, `rerank_share`
`shard_fanout`, `merge_time_ms`, `ann_recall@k`

Quality (offline):

NDCG@k, MAP, MRR, Recall@k, Precision@k on labeled samples.

Online:

CTR@k, sCTR (satisfied clicks), dwell time, bounce (pogo-stick rate).

A/B: fix guardrail metrics (latency, errors) plus a target metric (NDCG proxy).

13) Testing

Relevance unit tests: checking expected matches for key queries.
Property-based: robustness to typos/synonyms/languages.
Pagination: no duplicates at the page boundary (seek contracts).
Security: access filters are always applied (even to facet counts).
Dictionary regressions: versioning synonyms and fuzzy rules.

14) Security and privacy

Fields with PII are not indexed as text; store separately/encrypt.
Minimize stored sources (store=false, snippet fields only).
Query privacy: do not log raw requests with PII; anonymization/hashing.
Multi-tenant: strict index isolation or a mandatory 'tenant_id' filter.

15) Migrations and interoperability

Index schema versioning (v1→v2) with dual writes and a gradual switch.
Analyzer compatibility: do not change existing analyzer chains without re-indexing.
Rotation of synonym/stopword dictionaries: 'version', 'activated_at', rollback.

16) Practical recipes

16.1 Classic Lexical Search (BM25)

Fields: 'title^3', 'tags^2', 'body^1'.
Analyzers: language-specific + lemmatization.
Fuzzy for short queries ('<= 3' tokens), 'fuzziness = 1'.

16.2 Hybrid sparse + dense

1. ANN search by query embedding (k = 200)

2. Merge with top-200 BM25

3. Score calibration / rank fusion

4. Take top-N (N = 20); optionally re-rank with a cross-encoder if the time budget allows.
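
Steps 2-4 can be sketched with reciprocal rank fusion (RRF), one well-known rank fusion method that works on ranks only and therefore needs no score calibration. The recipe does not name a specific method, so RRF and the constant `k = 60` are assumptions here, and the cross-encoder stage is omitted.

```python
def rrf(rankings, k=60, top_n=20):
    """Reciprocal rank fusion over several ranked lists of doc ids."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); k damps the top positions.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # doc_id is the tie-breaker so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


ann_top = ["d2", "d1", "d4"]    # from the vector index (would be k = 200)
bm25_top = ["d1", "d3", "d2"]   # from the lexical index (would be top-200)
fused = rrf([ann_top, bm25_top], top_n=20)
```

Documents found by both retrievers rise to the top, while documents found by only one still remain candidates for the optional re-rank.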

16.3 Faceted catalog navigation

Hard pre-filter by rights/tenant

Post-filter facets (counts including active filters)

Sort by relevance or business field (price/novelty)

17) Sample requests (pseudo-DSL)

Filters and sorting:

json
{
  "query": "live casino",
  "filters": {
    "country": ["EE", "LV", "LT"],
    "license": ["MGA", "UKGC"],
    "launched_at": {"gte": "2023-01-01"}
  },
  "sort": ["_score:desc", "launched_at:desc"],
  "facets": ["country", "license"],
  "page": {"limit": 20, "cursor": "opaque"}
}

Geo search:

json
{
  "query": "casino",
  "geo": {"lat": 59.437, "lon": 24.753, "radius_km": 50}
}

Autocomplete:

json
{ "prefix": "evo", "field": "brand_suggest", "limit": 8 }

18) UX patterns

Active filter chips + "reset all".
Empty results: show "try..." hints (synonyms, remove a filter).
Zero-state suggestions: popular queries/categories.
Cursor pagination (More button) and infinite scrolling; fixed indicator of applied filters.
Separate toggles: "tolerate typos", "exact phrase match".

19) Frequent errors and anti-patterns

No tie-breaker when sorting → duplicates/jumps between pages.
Facets that ignore active filters → misleading counts.
Applying access filters after ranking instead of before it.
Mixing different languages in one analyzer.
Deep pagination with OFFSET/LIMIT instead of a seek cursor.
Unbounded fuzzy matching → latency blow-up.

20) Implementation checklist

1. Define the fields and their types, assign per-locale analyzers.
2. Design the inverted index + (optional) vector ANN index.
3. Implement a query parser and secure pre-filters.
4. Set up BM25 and field boosts; attach facets.
5. Introduce cursors (opaque, HMAC, TTL) and k-way merge across shards.
6. Add autocomplete, highlights, safe escaping.
7. Metrics: latency, NDCG@k, CTR; L1/L2 cache.
8. A/B framework for tuning relevance.
9. Document the SLA: NRT delay, bounds on 'limit', consistency guarantees.
10. Migration plan: versions of index, dictionaries and analyzers.

A well-designed filtering and full-text search layer is not only a fast index, but also a clear protocol contract with cursors, security, predictable UX, and measurable relevance. This approach scales from thousands to billions of documents and supports both classical lexical search and modern hybrid scenarios with neural network ranking.
