Cognitive Search to Azure AI Search: Manufacturing Migration
Microsoft renamed Azure Cognitive Search to Azure AI Search in late 2023 and, more importantly, extended the service with vector search, semantic ranking improvements, agentic retrieval, and integrated embedding generation. Most manufacturing teams still running Cognitive Search workloads in production have not fully migrated. This is a practical guide to what changed, what did not, and how to upgrade without breaking running indexes.
What Actually Changed
The rename is the surface story. The underneath story is a set of substantive capability additions:
- Vector search became first-class. Previously, vector-capable features were bolted on via custom skills. The service now natively supports Collection(Edm.Single) vector fields, HNSW and exhaustive KNN algorithms, and vector profiles.
- Integrated embedding generation. Skillsets can now call Azure OpenAI embeddings directly without custom Azure Functions. One configuration step, no wrapper code.
- Semantic ranker matured. L2 re-ranking now handles longer documents better, supports more languages including Swedish and German, and exposes richer answers and captions.
- Agentic retrieval. New APIs that let an LLM issue multiple progressive queries against the index, reason over intermediate results, and compose a grounded answer. Replaces much of the custom orchestration that RAG pipelines used to require.
- Hybrid search defaults improved. Reciprocal rank fusion (RRF) now combines keyword and vector results with better default weighting, reducing the tuning burden.
- Higher tier capacities. Storage Optimized L1 and L2 tiers enable hundreds of gigabytes per partition, matching manufacturing corpus sizes that previously required multiple services.
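The RRF fusion mentioned above is simple enough to sketch. A minimal illustration of how reciprocal rank fusion combines a keyword ranking and a vector ranking (k=60 is the commonly cited default constant; the service's internal weighting may differ, and the document ids here are made up):

```python
def rrf_fuse(rankings, k=60):
    """Combine multiple ranked result lists with reciprocal rank fusion.

    rankings: list of lists of document ids, best first.
    Returns doc ids sorted by summed 1/(k + rank) score, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc-torque-spec", "doc-catalog", "doc-faq"]
vector_hits = ["doc-torque-spec", "doc-maintenance", "doc-catalog"]
fused = rrf_fuse([keyword_hits, vector_hits])
# doc-torque-spec ranks first: it appears at rank 1 in both lists.
```

Documents that appear high in both rankings dominate, which is why hybrid queries tend to promote documents that match a part number exactly *and* sit close in embedding space.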
What Did Not Change
The core primitives — index, indexer, skillset, data source — remained compatible. Queries continue to use the same REST API surface with additive changes. Existing client code written against the 2023 API versions keeps working; newer capabilities require upgrading the API version, not rewriting the client.
Why Manufacturing Workloads Benefit Most
Three capabilities in the new stack line up specifically with manufacturing use cases:
- Vector search on technical language. Manufacturing vocabulary (part numbers, material grades, tolerances) is poorly served by keyword-only retrieval. Embeddings capture the semantic proximity of "M12 flange bolt torque spec" to documents that describe it in different phrasing.
- Agentic retrieval for multi-hop queries. "Which safety-critical assemblies use supplier X's resistors, and what are the failure rates recorded in the last two years?" decomposes into two dependent lookups. Agentic retrieval handles that decomposition automatically.
- Integrated embeddings reduce the glue code that most Cognitive Search deployments accumulated. One enrichment step replaces the custom Function App that previously did the embedding.
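The decomposition the agentic path performs happens service-side, but its shape is worth seeing. A sketch of the two-hop pattern, with `search` standing in for any index query callable (all names and filter syntax here are illustrative, not the agentic retrieval API):

```python
def answer_multi_hop(search):
    """Sketch of a two-hop query decomposition.

    `search` is any callable(query_string) -> list of result dicts;
    in practice it would wrap an index query.
    """
    # Hop 1: find safety-critical assemblies using supplier X's resistors.
    assemblies = search("supplier:X AND component:resistor AND safety_critical:true")
    # Hop 2: for each assembly found, look up its recorded failure rates.
    return {a["id"]: search(f"failure_rate AND assembly:{a['id']}") for a in assemblies}

# Stub backend to show the control flow:
def fake_search(q):
    if "supplier:X" in q:
        return [{"id": "ASM-100"}]
    return [{"id": "FR-1", "rate": 0.02}]

result = answer_multi_hop(fake_search)
```

The point is that the second query depends on the first's results; a single retrieval call cannot express that, which is what the orchestration layer (or agentic retrieval) supplies.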
The Migration, in Order
Step 1: Move to the new API version
Update client SDKs and REST API version strings to 2024-07-01 or later. Existing indexes and queries continue to work without change. This unlocks access to vector fields and improved semantic configuration even before any index redesign.
Step 2: Add a vector field to the existing index
You do not have to rebuild the index from scratch. Azure AI Search allows adding new fields to an existing index without rebuilding it; the new field is simply null on existing documents until populated. Add a contentVector field and a vector profile, then backfill embeddings for existing documents by re-running the indexer with an updated skillset.
// PUT /indexes/mfg-documents?api-version=2024-07-01
{
  "name": "mfg-documents",
  "fields": [
    // ... existing fields ...
    {
      "name": "contentVector",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "dimensions": 1536,
      "vectorSearchProfile": "hnsw-cosine"
    }
  ],
  "vectorSearch": {
    "profiles": [ { "name": "hnsw-cosine", "algorithm": "hnsw-config" } ],
    "algorithms": [
      {
        "name": "hnsw-config",
        "kind": "hnsw",
        "hnswParameters": { "m": 4, "efConstruction": 400, "efSearch": 500, "metric": "cosine" }
      }
    ]
  }
}
Step 3: Replace custom embedding generation with integrated skill
Most Cognitive Search workloads with vectors used a custom Web API skill pointing at an Azure Function that called Azure OpenAI embeddings. Replace it with the built-in #Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill in the skillset. This removes a component from the architecture and reduces the moving parts.
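A minimal skillset entry for the built-in skill might look like the following. The resource URI and deployment name are placeholders, and exact property names should be checked against the 2024-07-01 API reference:

```json
{
  "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
  "context": "/document",
  "resourceUri": "https://<your-aoai-resource>.openai.azure.com",
  "deploymentId": "text-embedding-3-small",
  "modelName": "text-embedding-3-small",
  "dimensions": 1536,
  "inputs": [ { "name": "text", "source": "/document/content" } ],
  "outputs": [ { "name": "embedding", "targetName": "contentVector" } ]
}
```

The output maps straight to the contentVector field added in Step 2, so the indexer populates vectors during its normal enrichment pass.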
Step 4: Update query patterns to hybrid + semantic
If the application still uses keyword-only queries, introduce vector queries alongside them. The service's RRF fusion combines the two result sets; no application-side tuning required for most workloads. Semantic ranker goes on top for re-ranking the combined top-50 into the final top-10.
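A hybrid + semantic query against the index from Step 2 might look like this (the semantic configuration name is a placeholder; property names should be verified against the 2024-07-01 API):

```json
// POST /indexes/mfg-documents/docs/search?api-version=2024-07-01
{
  "search": "M12 flange bolt torque spec",
  "vectorQueries": [
    { "kind": "vector", "vector": [ /* 1536 floats from the same embedding model */ ],
      "fields": "contentVector", "k": 50 }
  ],
  "queryType": "semantic",
  "semanticConfiguration": "mfg-semantic",
  "top": 10
}
```

The keyword text and the query vector cover the same user intent; RRF fuses the two result sets, and the semantic ranker re-ranks the fused top results into the final top-10.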
Step 5: Introduce agentic retrieval where applicable
For LLM-powered applications with multi-step reasoning requirements, the agentic retrieval API replaces custom orchestration code. The LLM drives the query sequence; the service handles retrieval, ranking, and citation. Fits well for copilots answering engineering queries across heterogeneous document types.
Step 6: Consolidate services if you ran multiple
Previously, corpora above a few hundred gigabytes required splitting across multiple Cognitive Search services. Storage Optimized tiers now hold on the order of 1 TB (L1) to 2 TB (L2) per partition. A single service with the right tier consolidates what used to be two or three, simplifying operations and cutting cost.
Gotchas to Plan For
- Embedding model choice is not free to change later. Once the index is populated with text-embedding-3-large vectors, swapping to a different model requires re-embedding every document. Pick carefully at the start.
- HNSW memory footprint. High-dimensional vectors with dense indexes increase memory pressure. Monitor replica memory and scale the tier if query latency degrades.
- Rate limits on integrated embedding. The embedding skill calls Azure OpenAI synchronously during indexing. Provisioned Throughput Units (PTUs) or higher pay-as-you-go quotas avoid throttling during bulk backfills.
- Semantic configuration is per-index. When you add a semantic configuration to an existing index, existing queries do not automatically start using it. The client must explicitly pass queryType: "semantic".
Cost Impact of the Migration
Three components move. Embedding generation adds a one-time cost proportional to corpus size (roughly 0.00002 USD per 1K tokens for text-embedding-3-small; text-embedding-3-large costs several times more). Vector storage increases the index footprint by the vector dimension times 4 bytes per chunk. Semantic ranker is metered per query. On net, the total cost typically rises 20–40% while retrieval quality on technical queries improves substantially. For manufacturing search that previously ranked poorly on those queries, this is a good trade.
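Both the storage term and the embedding term are easy to estimate up front. A back-of-envelope sketch (the per-token price is an assumption that changes; check current Azure OpenAI pricing):

```python
def vector_storage_gb(num_chunks, dims=1536):
    """Raw vector payload: dims floats x 4 bytes per chunk (index overhead excluded)."""
    return num_chunks * dims * 4 / 1024**3

def embedding_cost_usd(total_tokens, usd_per_1k=0.00002):
    """One-time embedding cost; 0.00002 USD/1K tokens assumed for text-embedding-3-small."""
    return total_tokens / 1000 * usd_per_1k

# 2 million chunks of ~500 tokens each:
storage = vector_storage_gb(2_000_000)      # ~11.4 GB of raw vectors
cost = embedding_cost_usd(2_000_000 * 500)  # ~20 USD one-time
```

HNSW graph structures and replicas add overhead on top of the raw payload, so treat the storage figure as a floor, not a forecast.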
What to Measure Before Declaring the Migration Done
- Recall at 10 on a labelled eval set, before and after the migration.
- Mean Reciprocal Rank of the first correct document on technical queries that use domain-specific language.
- p95 query latency across the hybrid + semantic path.
- Indexer throughput during bulk embedding backfills. Target is at least 10,000 documents per hour.
- Per-tenant cost delta against the previous Cognitive Search deployment.
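The first two metrics are cheap to compute on a labelled eval set. A minimal sketch:

```python
def recall_at_k(results, relevant, k=10):
    """Fraction of relevant docs that appear in the top-k results."""
    return len(set(results[:k]) & set(relevant)) / len(relevant)

def mrr(queries):
    """Mean reciprocal rank of the first relevant hit.

    queries: list of (ranked_results, relevant_set) pairs.
    """
    total = 0.0
    for results, relevant in queries:
        for rank, doc in enumerate(results, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(queries)

r = recall_at_k(["a", "b", "c"], {"a", "z"})            # one of two relevant docs found
m = mrr([(["x", "a"], {"a"}), (["a", "b"], {"a"})])     # (1/2 + 1) / 2
```

Run both against the same labelled query set before and after the migration; the before/after delta is the number that justifies (or does not justify) the cost increase from the previous section.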
When Not to Migrate Yet
Three situations justify deferring. First, if the workload is a low-query-volume internal wiki where keyword-only retrieval is already adequate, the upgrade cost outweighs the quality gain. Second, if a major index schema overhaul is already scheduled within six months, bundle the migration into that change rather than doing it twice. Third, if the organisation is in the middle of a broader RAG or agent platform decision, wait until that architecture is settled before committing to agentic retrieval dependencies.
For most manufacturing teams running Cognitive Search workloads that serve engineering or operations queries, the migration pays for itself within one release cycle. The longer teams wait, the more custom embedding code and orchestration layers accumulate, and the harder the migration gets. Start with Step 1 this quarter.