AI & Cloud Infrastructure
February 17, 2026
Small Language Models On-Prem: The Phi-4 and Llama 3.3 ROI Math
When running small language models on-prem actually beats hosted inference — Phi-4, Llama 3.3, GPU sizing, Ollama and vLLM deployment patterns, and the honest cost math for Swedish data-residency workloads.
Tags: SLM, On-Premise AI, Phi-4, Llama, Ollama
By Technspire Team
AI & Cloud Infrastructure
November 28, 2025
Running Open-Source AI Models at Scale: Azure Container Apps, AKS, and On-Premise Deployments - Microsoft Ignite 2025
Microsoft Ignite BRK117: deploy open-source AI models (Llama 3.3, Mistral) with Azure Container Apps serverless GPUs, AKS with KAITO workflows, and on-premise infrastructure. Covers 60-85% cost reduction, data sovereignty, and hybrid architectures with Azure Arc.
Tags: Microsoft Ignite 2025, Azure Container Apps, Azure Kubernetes Service, Open-Source AI, Llama 3.3, Mistral AI, Serverless GPU, KAITO, vLLM, On-Premise AI, Azure Arc, Hybrid Cloud, GPU Orchestration, Cost Optimization, Data Sovereignty, Model Deployment, Fine-Tuning, RAG Pipelines, Self-Hosted Models
By Technspire Team