Anass Ez-zouaine — Senior Backend Engineer · Software Architect

Senior Lead Backend Engineer, Software Architect, and AI Engineer. 12+ years building production Laravel SaaS, Shopify Plus apps, and AI features (Claude, MCP, RAG, agentic systems). Remote-first since 2014.
https://ansezz.com/

Caching for speed: Redis and semantic layers in RAG
https://ansezz.com/blog/redis-semantic-caching-rag/
Stop paying for the same LLM call twice. Two-tier caching (exact-match Redis keys plus semantic vector lookups via RedisVL) that cuts RAG latency from seconds to milliseconds and slashes API spend by up to 80%. With tenant isolation, TTL tiers, and the precision metrics that keep it honest.
Published: Tue, 26 May 2026 00:00:00 GMT
Category: architecture
Tags: redis, caching, ai, rag, semantic-cache, redisvl, vector-search, performance, infrastructure

Scaling on demand: smart auto-scaling for modern AI apps
https://ansezz.com/blog/smart-auto-scaling-ai/
CPU autoscaling is a lie for GPU workloads. Why queue depth, KV-cache pressure, and TTFT beat CPU as scaling triggers: KEDA-driven patterns, ARIMA forecasting, and composite metrics that scale your AI SaaS before users hit the spinner.
Published: Mon, 25 May 2026 00:00:00 GMT
Category: architecture
Tags: auto-scaling, ai, llm, kubernetes, keda, gpu, kv-cache, queue-depth, ttft, infrastructure

GPU-aware load balancing: managing AI compute like a pro
https://ansezz.com/blog/gpu-aware-load-balancing/
Round-robin is a relic when LLM requests span 50 tokens to 50,000. Prefill vs decode disaggregation, KV-cache-aware routing, prefix matching, and the four metrics that matter: how to route AI traffic so your P99 stops bleeding.
Published: Sun, 24 May 2026 00:00:00 GMT
Category: architecture
Tags: gpu, load-balancing, ai, llm, inference, vllm, kv-cache, prefill, decode, infrastructure

Circuit breakers: preventing cascading failures in your vector DB
https://ansezz.com/blog/circuit-breakers-vector-db/
A slow vector DB kills SaaS faster than a dead one. The circuit-breaker pattern for AI infrastructure: closed/open/half-open states, fallback tiers, semantic caches, LLM-only mode, and Laravel-friendly wiring to keep production from melting under one bad dependency.
Published: Sat, 23 May 2026 00:00:00 GMT
Category: architecture
Tags: circuit-breakers, ai, rag, resilience, vector-db, fallback, laravel, observability, infrastructure

Message queues: handling the heavy lifting of document processing
https://ansezz.com/blog/message-queues-document-processing/
Stop running embeddings inside the request-response cycle. A production-grade document ingestion pipeline: staged workers, exponential backoff, dead-letter quarantines, batched embeddings, and queue-depth autoscaling that keeps your AI app from melting under a 500-page PDF.
Published: Fri, 22 May 2026 00:00:00 GMT
Category: architecture
Tags: message-queues, ai, rag, document-processing, redis, bullmq, laravel-queues, dlq, batching, infrastructure

Rate limiting: protecting your AI wallet
https://ansezz.com/blog/rate-limiting-ai-wallet/
One runaway agent loop = $5,000 OpenAI bill. Why request-per-second limits lie for LLM apps, how to architect hierarchical token-bucket limits across global / tenant / user layers, and adaptive throttling patterns that protect margins without breaking UX.
Published: Thu, 21 May 2026 00:00:00 GMT
Category: architecture
Tags: rate-limiting, ai, llm, rag, multi-tenancy, redis, laravel, denial-of-wallet, api-gateway

API Gateway: the front door of your AI stack
https://ansezz.com/blog/api-gateway-ai-stack/
Stop exposing LLM providers directly to the frontend. The gateway pattern for AI apps: JWT-scoped tenant isolation, model aliases, denial-of-wallet rate limiting, streaming-safe timeouts, and the wallet-saving guardrails every senior engineer needs.
Published: Wed, 20 May 2026 00:00:00 GMT
Category: architecture
Tags: api-gateway, ai, rag, security, rate-limiting, multi-tenancy, streaming, infrastructure

Shopify Storefront Web Components: headless commerce for the rest of us
https://ansezz.com/blog/shopify-storefront-web-components/
Headless used to mean six engineers and a Hydrogen rebuild. Shopify Storefront Web Components let you drop products, collections, and cart into any HTML page with a script tag: no React, no build step, no DevOps tax.
Published: Tue, 19 May 2026 00:00:00 GMT
Category: shopify
Tags: shopify, web-components, headless, llms-txt, hydrogen, agentic, storefront-api

Why agentic commerce will change the way you build Shopify stores
https://ansezz.com/blog/agentic-commerce-shopify/
AI agents don't browse; they query. The shift from human-centric Shopify themes to agent-ready infrastructure: Shopify Catalog, UCP, Agentic Storefronts, MCP servers, and why structured data is the new CSS.
Published: Mon, 18 May 2026 00:00:00 GMT
Category: shopify
Tags: shopify, agentic-commerce, ai, mcp, hydrogen, graphql, ucp, metaobjects

7 mistakes you're making with your production RAG stack (and how to fix them)
https://ansezz.com/blog/7-rag-mistakes-production/
Naive chunking, no reranker, embedding drift, latency blowups, vibe-checking: the seven structural mistakes that turn a slick RAG demo into a production nightmare, and the fixes that actually ship.
Published: Sun, 17 May 2026 00:00:00 GMT
Category: ai
Tags: rag, ai, llm, pgvector, reranker, hybrid-search, evals, production

Scaling with RabbitMQ: why message brokers matter
https://ansezz.com/blog/scaling-with-rabbitmq/
Synchronous controllers are how monoliths die. RabbitMQ basics, exchanges and queues, the strangler pattern for going async, idempotent workers, and the Laravel queue setup I use to absorb 100k-row spikes without breaking the login page.
Published: Sat, 16 May 2026 00:00:00 GMT
Category: architecture
Tags: rabbitmq, message-broker, queues, decoupling, scaling, laravel, async, architecture

Mastering event-driven architecture with Google Pub/Sub
https://ansezz.com/blog/event-driven-pubsub/
Decouple your services or drown in latency. Topics, fan-out, push vs pull, dead-letter queues, idempotent consumers, and the Laravel integration I run on Google Cloud: a practical EDA blueprint from a senior engineer.
Published: Sat, 02 May 2026 00:00:00 GMT
Category: architecture
Tags: pub-sub, event-driven, gcp, google-cloud, laravel, messaging, scalability, architecture

Effortless SaaS hosting: the Coolify and Docker deployment guide
https://ansezz.com/blog/coolify-docker-saas-hosting/
Heroku DX, your own server, none of the cloud tax. How Coolify + Docker on a $5 VPS replaces vendor-lock managed platforms with a control plane you actually own: one-click databases, automatic SSL, and zero-downtime deploys.
Published: Sun, 19 Apr 2026 00:00:00 GMT
Category: devops
Tags: coolify, docker, self-hosting, devops, saas, deployment, vps, traefik

MCP and the future of tool-use: building context-aware agents
https://ansezz.com/blog/mcp-context-aware-agents/
The Model Context Protocol kills the era of brittle one-off integrations. Tools, resources, prompts, and the three primitives that let one server talk to any MCP-aware client, with a working TypeScript example you can ship today.
Published: Sun, 05 Apr 2026 00:00:00 GMT
Category: ai
Tags: mcp, anthropic, claude, agentic-ai, typescript, tool-use, context

Vibe coding and the architectural shift to agentic workflows
https://ansezz.com/blog/agentic-workflows-vibe-coding/
MCP, agentic loops, and intent-based engineering. How vibe coding becomes a real architecture pattern when AI stops being a chat sidebar and starts owning stateful loops against your tools. The practical Laravel + MCP stack I run today.
Published: Sun, 22 Mar 2026 00:00:00 GMT
Category: architecture
Tags: vibe-coding, agentic-ai, mcp, anthropic, claude, laravel, architecture

Why your RAG implementation is failing in production (and how to fix it)
https://ansezz.com/blog/why-your-rag-is-failing/
Vector-only retrieval is the silent killer of production RAG. Hybrid search with BM25, reciprocal rank fusion, smarter chunking, re-rankers, and an evaluation harness: the production checklist that turns a flaky demo into a reliable system.
Published: Sun, 08 Mar 2026 00:00:00 GMT
Category: ai
Tags: rag, ai, vector-search, bm25, hybrid-search, re-ranker, production

From monolith to micro-services: a senior dev's guide to pragmatic scaling
https://ansezz.com/blog/monolith-to-microservices/
Skip the big-bang rewrite. The strangler fig pattern, anti-corruption layers, Docker-first migration, and GKE/Coolify operations: how I peel services off a Laravel monolith one endpoint at a time without breaking revenue.
Published: Sun, 22 Feb 2026 00:00:00 GMT
Category: architecture
Tags: monolith, micro-services, scaling, strangler-fig, docker, kubernetes, devops, laravel

Laravel multi-tenancy: how I built a scalable SaaS architecture
https://ansezz.com/blog/laravel-multi-tenancy/
Single DB vs multi-DB, global scopes that stop data leaks, stancl/tenancy in production, isolated storage, automated migrations, and the Docker + Google Cloud setup I run for high-trust SaaS clients.
Published: Sun, 08 Feb 2026 00:00:00 GMT
Category: laravel
Tags: laravel, multi-tenancy, saas, architecture, stancl-tenancy, docker, postgres

AI integration vs traditional development: which is better for your business in 2026?
https://ansezz.com/blog/ai-vs-traditional-development/
Speed, control, or a hybrid path? When AI-assisted development pays off, when traditional engineering is non-negotiable, and the hybrid workflow I recommend most often to founders and tech leads.
Published: Sun, 25 Jan 2026 00:00:00 GMT
Category: architecture
Tags: ai, strategy, hybrid, business, decision-making, productivity

Scaling with confidence: advanced Coolify deployment strategies
https://ansezz.com/blog/scaling-with-coolify/
Move past the single-server trap. Multi-node Coolify setups, zero-downtime rolling deploys with health checks, dedicated build servers, managed databases, and GitHub Actions wiring: production-grade self-hosting without a DevOps team.
Published: Sun, 11 Jan 2026 00:00:00 GMT
Category: devops
Tags: coolify, deployment, devops, docker, self-hosting, ci-cd, scaling

Shopify Liquid vs. headless: choosing the right stack for scale
https://ansezz.com/blog/shopify-liquid-vs-headless/
Hydrogen looks great on paper. Liquid still ships more revenue per week. A practical decision framework for picking between Liquid, headless Hydrogen, and the messy middle, based on what your team can actually operate long-term.
Published: Sun, 28 Dec 2025 00:00:00 GMT
Category: shopify
Tags: shopify, liquid, headless, hydrogen, performance, architecture, ecommerce

Picking the right RAG stack: vector databases for AI engineering
https://ansezz.com/blog/picking-the-right-rag-stack/
pgvector, Pinecone, Weaviate, Qdrant: a 2026 field guide. Which vector store to pick for your AI app, why hybrid search matters, and how to ship without painting yourself into a corner.
Published: Sun, 14 Dec 2025 00:00:00 GMT
Category: ai
Tags: rag, vector-databases, pgvector, pinecone, weaviate, qdrant, laravel

Vibe coding: why your next project needs more than just logic
https://ansezz.com/blog/vibe-coding/
Logic is the skeleton. Vibe is the soul. Why taste, intent, and feel are the new senior-engineer superpowers in the Cursor + Claude era, and how to keep the codebase from turning into a ball of mud while you chase it.
Published: Sun, 30 Nov 2025 00:00:00 GMT
Category: ai
Tags: vibe-coding, ai, cursor, claude, taste, dx, laravel

Hello, world. Yes, another developer blog.
https://ansezz.com/blog/hello-world/
Why this site exists, what I'll write about, and why neobrutalism is the right call for an engineer's portfolio in 2026.
Published: Sun, 16 Nov 2025 00:00:00 GMT
Category: career
Tags: meta, intro