Sovereign semantic search — when a house's memory stays with the house

Embeddings, vector databases, similarity, retrieval: a measured reading of semantic search and the sovereign / hosted trade-off.

(Lev Marchuk: Profiling Engineer / Data Scientist)

2 June 2026 · 7 min

With contributions from

Margaux Lefèvre, Chief Technology Officer

Chloé Garnier, Head of Architecture

The observation. Within a few years, a company accumulates thousands of documents: product sheets, technical notes, supplier exchanges, internal procedures. Finding them by exact keyword quickly becomes hopeless: the author wrote “wall slate”, while a colleague searches for “mounted chalkboard”. Semantic search addresses that gap by comparing meaning rather than character strings. Since 2018-2019, embeddings and vector databases have moved this idea from the lab into everyday tooling. The question today is no longer “is it possible” but “where should this knowledge live”: on an external service, or under one's own control.

From keyword to meaning

Classical search rests on lexical matching: TF-IDF and its successor BM25, which remain solid baselines in information retrieval to this day. It is fast and explainable but blind to synonyms, paraphrases and queries written in a different language from the document. Semantic search works differently: each fragment of text is turned into a vector, a list of numbers (often several hundred) that places the fragment in a space where geometric proximity reflects proximity of meaning. Two sentences close in meaning are neighbours in that space, even with no word in common.
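The blind spot of lexical matching can be seen on the example from the introduction. What follows is a minimal sketch of BM25 scoring, assuming whitespace tokenisation and the commonly used parameters k1 = 1.5 and b = 0.75; the three documents are invented for illustration and the function name `bm25_scores` is ours, not a library API.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenisation (assumption): lowercase, split on whitespace.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # a term absent from the document contributes nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical mini-corpus.
docs = [
    "wall slate for the kitchen, 60 x 40 cm",
    "mounted chalkboard with oak frame",
    "supplier invoice for oak frames",
]
scores = bm25_scores("mounted chalkboard", docs)
print(scores)
```

The first document describes the same kind of object as the query but shares no token with it, so its score is exactly zero. That is the gap an embedding-based search is meant to close.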

These vectors come from embedding models, heirs to a well-documented lineage: word2vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014), then the contextual encoders built on the Transformer architecture (Vaswani et al., 2017), such as BERT (Devlin et al., 2018). Sentence-BERT-style models (Reimers & Gurevych, 2019) produce sentence vectors that can be compared directly, which made semantic search practical at scale.

How a vector database works

Once documents are turned into vectors, you need to retrieve the ones closest to a query. That is what a vector database does. The most common proximity measure is cosine similarity, which compares the angle between two vectors rather than their length. Searching exhaustively across millions of vectors would be expensive, so approximate nearest-neighbour (ANN) indexes are used instead. The best-established families are HNSW (Malkov & Yashunin, 2016) and IVF/PQ, popularised by the FAISS library (Johnson et al., 2017). These indexes trade a fraction of accuracy for considerable speed gains.
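To make the measure concrete, here is a minimal sketch of cosine similarity with an exhaustive top-k search, which is the baseline that ANN indexes approximate. The three-dimensional vectors are hand-made toy values, not the output of any embedding model, and the names `cosine_similarity` and `top_k` are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal dimension."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Exhaustive search: score every stored vector, keep the k best."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return scored[:k]


# Toy "index" (assumption): document id -> vector.
index = {
    "wall-slate": [0.9, 0.1, 0.0],
    "chalkboard": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

# Same direction as "wall-slate" but twice as long: length does not matter.
query = [1.8, 0.2, 0.0]
results = top_k(query, index, k=2)
for score, doc_id in results:
    print(f"{doc_id}: {score:.3f}")
```

The exhaustive loop costs one comparison per stored vector for every query. Structures such as HNSW or IVF avoid visiting most of the vectors, at the price of occasionally missing a true neighbour.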

In practice this mechanism often feeds a retrieval-augmented generation pattern (RAG, Lewis et al., 2020): relevant fragments are retrieved by similarity, then supplied as context to a text-generation model. The quality of the answer then depends first on the quality of retrieval, hence the attention paid to document chunking, embedding-model choice and relevance evaluation (recall and precision).
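Chunking is the least glamorous of these steps and one of the easiest to get wrong. As an illustration only, the sketch below cuts a text into fixed-size word windows with an overlap, so that a sentence straddling a boundary still appears whole in at least one chunk. The window sizes are arbitrary assumptions, and real pipelines often split on headings or paragraphs instead.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap`
    words with the previous chunk."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


# Tiny demonstration on twelve placeholder words.
text = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(text, size=5, overlap=2)
for chunk in chunks:
    print(chunk)
```

Each chunk would then be encoded and indexed separately. The overlap costs some storage but reduces the chance that the relevant passage is cut in two.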

Why “sovereign”

Many providers offer embedding and vector storage as a hosted service: quick to set up and maintenance-free. But for a corporate knowledge base, indexing a document means sending its content to a third party. Keeping the whole chain (embedding model, index, queries) on infrastructure you control is what we call a sovereign approach. Three motives justify it.

Data control. Supplier notes, commercial terms and internal procedures never leave the company perimeter, neither as plain text nor as vectors, which the literature shows can sometimes be partially inverted.
Compliance. The GDPR (Regulation (EU) 2016/679) requires a legal basis for processing personal data and safeguards on its transfer; self-hosting simplifies the mapping of processing activities and data localisation.
Stability. An embedding model chosen and frozen in-house does not shift under the company's feet with a vendor's updates, which would otherwise force the whole corpus to be re-indexed without notice.

The trade-offs

The sovereign approach is not free. An open embedding model of reasonable size (for example the E5 family, Wang et al., 2022, or MiniLM models) runs on a modest server, often without a GPU, but rarely matches the absolute quality of the largest proprietary models. It must be hosted, monitored and backed up, and its index must be version-managed. The right balance is measurable: if a lightweight model answers real queries with sufficient recall, control and confidentiality often outweigh the few points of relevance a hosted service might add.
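“Measurable” can be as simple as a handful of real queries with hand-labelled relevant documents. The sketch below computes mean recall at k over such a set; the queries, document identifiers and rankings are invented for illustration, and the same function would be run once per candidate model to compare them.

```python
def recall_at_k(retrieved, relevant, k):
    """Share of the relevant documents found in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant)


def mean_recall_at_k(runs, judgments, k):
    """Average recall@k over all labelled queries."""
    scores = [recall_at_k(runs[q], judgments[q], k) for q in judgments]
    return sum(scores) / len(scores)


# Hypothetical relevance judgments: query -> set of relevant document ids.
judgments = {
    "mounted chalkboard": {"doc-3", "doc-1"},
    "supplier payment terms": {"doc-8"},
}

# Hypothetical ranked results returned by a locally hosted model.
local_model_runs = {
    "mounted chalkboard": ["doc-3", "doc-7", "doc-1"],
    "supplier payment terms": ["doc-2", "doc-8", "doc-5"],
}

recall_1 = mean_recall_at_k(local_model_runs, judgments, k=1)
recall_3 = mean_recall_at_k(local_model_runs, judgments, k=3)
print(f"recall@1 = {recall_1:.2f}, recall@3 = {recall_3:.2f}")
```

A few dozen labelled queries drawn from actual usage are usually more informative for this decision than a public leaderboard, because they reflect the vocabulary of the corpus in question.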

The question is therefore not “sovereign or hosted” in the abstract. It is: how sensitive is the corpus, what level of relevance is genuinely needed, and what operating cost can the team sustain over time? It is an engineering trade-off as much as a governance one.

Where we stand

Montandor Andorra runs a sovereign semantic knowledge base for its internal tooling: the corpus is chunked, encoded by an open embedding model hosted on our own infrastructure, then indexed in a vector database we operate ourselves. Nothing confidential leaves the perimeter. It is not an ideological choice; it is the application of a simple principle (a house's data stays with the house) to a search tool our teams use every day.

“A knowledge base is a house's memory. You can rent a great many things; your own memory is better kept at home. Semantic search makes that knowledge alive and findable, provided it stays under our own roof.”
— Wouter Meijboom, CEO, Montandor Andorra.

Sources

Published 2 June 2026 — research led by Lev Marchuk (Profiling Engineer / Data Scientist), in collaboration with Margaux Lefèvre (Chief Technology Officer) and Chloé Garnier (Head of Architecture).