<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Anass Ez-zouaine — Senior Backend Engineer · Software Architect · AI Engineer</title><description>Senior Lead Backend Engineer, Software Architect, and AI Engineer. 12+ years building production Laravel SaaS, Shopify Plus apps, and AI features (Claude, MCP, RAG, agentic systems). Remote-first since 2014.</description><link>https://ansezz.com/</link><item><title>Caching for speed: Redis and semantic layers in RAG</title><link>https://ansezz.com/blog/redis-semantic-caching-rag/</link><guid isPermaLink="true">https://ansezz.com/blog/redis-semantic-caching-rag/</guid><description>Stop paying for the same LLM call twice. Two-tier caching — exact-match Redis keys plus semantic vector lookups via RedisVL — that cuts RAG latency from seconds to milliseconds and slashes API spend by up to 80%. With tenant isolation, TTL tiers, and the precision metrics that keep it honest.</description><pubDate>Tue, 26 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>redis</category><category>caching</category><category>ai</category><category>rag</category><category>semantic-cache</category><category>redisvl</category><category>vector-search</category><category>performance</category><category>infrastructure</category></item><item><title>Scaling on demand: smart auto-scaling for modern AI apps</title><link>https://ansezz.com/blog/smart-auto-scaling-ai/</link><guid isPermaLink="true">https://ansezz.com/blog/smart-auto-scaling-ai/</guid><description>CPU autoscaling is a lie for GPU workloads. 
Why queue depth, KV-cache pressure, and TTFT beat CPU as scaling triggers — KEDA-driven patterns, ARIMA forecasting, and composite metrics that scale your AI SaaS before users hit the spinner.</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>auto-scaling</category><category>ai</category><category>llm</category><category>kubernetes</category><category>keda</category><category>gpu</category><category>kv-cache</category><category>queue-depth</category><category>ttft</category><category>infrastructure</category></item><item><title>GPU-aware load balancing: managing AI compute like a pro</title><link>https://ansezz.com/blog/gpu-aware-load-balancing/</link><guid isPermaLink="true">https://ansezz.com/blog/gpu-aware-load-balancing/</guid><description>Round-robin is a relic when LLM requests span 50 tokens to 50,000. Prefill vs decode disaggregation, KV-cache-aware routing, prefix matching, and the four metrics that matter — how to route AI traffic so your P99 stops bleeding.</description><pubDate>Sun, 24 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>gpu</category><category>load-balancing</category><category>ai</category><category>llm</category><category>inference</category><category>vllm</category><category>kv-cache</category><category>prefill</category><category>decode</category><category>infrastructure</category></item><item><title>Circuit breakers: preventing cascading failures in your vector DB</title><link>https://ansezz.com/blog/circuit-breakers-vector-db/</link><guid isPermaLink="true">https://ansezz.com/blog/circuit-breakers-vector-db/</guid><description>A slow vector DB kills SaaS faster than a dead one. 
The circuit-breaker pattern for AI infrastructure — closed/open/half-open states, fallback tiers, semantic caches, LLM-only mode, and Laravel-friendly wiring to keep production from melting under one bad dependency.</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>circuit-breakers</category><category>ai</category><category>rag</category><category>resilience</category><category>vector-db</category><category>fallback</category><category>laravel</category><category>observability</category><category>infrastructure</category></item><item><title>Message queues: handling the heavy lifting of document processing</title><link>https://ansezz.com/blog/message-queues-document-processing/</link><guid isPermaLink="true">https://ansezz.com/blog/message-queues-document-processing/</guid><description>Stop running embeddings inside the request-response cycle. A production-grade document ingestion pipeline — staged workers, exponential backoff, dead-letter quarantines, batched embeddings, and queue-depth autoscaling that keeps your AI app from melting under a 500-page PDF.</description><pubDate>Fri, 22 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>message-queues</category><category>ai</category><category>rag</category><category>document-processing</category><category>redis</category><category>bullmq</category><category>laravel-queues</category><category>dlq</category><category>batching</category><category>infrastructure</category></item><item><title>Rate limiting: protecting your AI wallet</title><link>https://ansezz.com/blog/rate-limiting-ai-wallet/</link><guid isPermaLink="true">https://ansezz.com/blog/rate-limiting-ai-wallet/</guid><description>One runaway agent loop = $5,000 OpenAI bill. 
Why request-per-second limits lie for LLM apps, how to architect hierarchical token-bucket limits across global / tenant / user layers, and adaptive throttling patterns that protect margins without breaking UX.</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>rate-limiting</category><category>ai</category><category>llm</category><category>rag</category><category>multi-tenancy</category><category>redis</category><category>laravel</category><category>denial-of-wallet</category><category>api-gateway</category></item><item><title>API Gateway: the front door of your AI stack</title><link>https://ansezz.com/blog/api-gateway-ai-stack/</link><guid isPermaLink="true">https://ansezz.com/blog/api-gateway-ai-stack/</guid><description>Stop exposing LLM providers directly to the frontend. The gateway pattern for AI apps — JWT-scoped tenant isolation, model aliases, denial-of-wallet rate limiting, streaming-safe timeouts, and the wallet-saving guardrails every senior engineer needs.</description><pubDate>Wed, 20 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>api-gateway</category><category>ai</category><category>rag</category><category>security</category><category>rate-limiting</category><category>multi-tenancy</category><category>streaming</category><category>infrastructure</category></item><item><title>Shopify Storefront Web Components: headless commerce for the rest of us</title><link>https://ansezz.com/blog/shopify-storefront-web-components/</link><guid isPermaLink="true">https://ansezz.com/blog/shopify-storefront-web-components/</guid><description>Headless used to mean six engineers and a Hydrogen rebuild. 
Shopify Storefront Web Components let you drop products, collections, and cart into any HTML page with a script tag — no React, no build step, no DevOps tax.</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate><category>shopify</category><category>shopify</category><category>web-components</category><category>headless</category><category>llms-txt</category><category>hydrogen</category><category>agentic</category><category>storefront-api</category></item><item><title>Why agentic commerce will change the way you build Shopify stores</title><link>https://ansezz.com/blog/agentic-commerce-shopify/</link><guid isPermaLink="true">https://ansezz.com/blog/agentic-commerce-shopify/</guid><description>AI agents don&apos;t browse — they query. The shift from human-centric Shopify themes to agent-ready infrastructure: Shopify Catalog, UCP, Agentic Storefronts, MCP servers, and why structured data is the new CSS.</description><pubDate>Mon, 18 May 2026 00:00:00 GMT</pubDate><category>shopify</category><category>shopify</category><category>agentic-commerce</category><category>ai</category><category>mcp</category><category>hydrogen</category><category>graphql</category><category>ucp</category><category>metaobjects</category></item><item><title>7 mistakes you&apos;re making with your production RAG stack (and how to fix them)</title><link>https://ansezz.com/blog/7-rag-mistakes-production/</link><guid isPermaLink="true">https://ansezz.com/blog/7-rag-mistakes-production/</guid><description>Naive chunking, no reranker, embedding drift, latency blowups, vibe-checking — the seven structural mistakes that turn a slick RAG demo into a production nightmare, and the fixes that actually ship.</description><pubDate>Sun, 17 May 2026 00:00:00 
GMT</pubDate><category>ai</category><category>rag</category><category>ai</category><category>llm</category><category>pgvector</category><category>reranker</category><category>hybrid-search</category><category>evals</category><category>production</category></item><item><title>Scaling with RabbitMQ: why message brokers matter</title><link>https://ansezz.com/blog/scaling-with-rabbitmq/</link><guid isPermaLink="true">https://ansezz.com/blog/scaling-with-rabbitmq/</guid><description>Synchronous controllers are how monoliths die. RabbitMQ basics, exchanges and queues, the strangler pattern for going async, idempotent workers, and the Laravel queue setup I use to absorb 100k-row spikes without breaking the login page.</description><pubDate>Sat, 16 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>rabbitmq</category><category>message-broker</category><category>queues</category><category>decoupling</category><category>scaling</category><category>laravel</category><category>async</category><category>architecture</category></item><item><title>Mastering event-driven architecture with Google Pub/Sub</title><link>https://ansezz.com/blog/event-driven-pubsub/</link><guid isPermaLink="true">https://ansezz.com/blog/event-driven-pubsub/</guid><description>Decouple your services or drown in latency. 
Topics, fan-out, push vs pull, dead-letter queues, idempotent consumers, and the Laravel integration I run on Google Cloud — a practical EDA blueprint from a senior engineer.</description><pubDate>Sat, 02 May 2026 00:00:00 GMT</pubDate><category>architecture</category><category>pub-sub</category><category>event-driven</category><category>gcp</category><category>google-cloud</category><category>laravel</category><category>messaging</category><category>scalability</category><category>architecture</category></item><item><title>Effortless SaaS hosting: the Coolify and Docker deployment guide</title><link>https://ansezz.com/blog/coolify-docker-saas-hosting/</link><guid isPermaLink="true">https://ansezz.com/blog/coolify-docker-saas-hosting/</guid><description>Heroku DX, your own server, none of the cloud tax. How Coolify + Docker on a $5 VPS replaces vendor-lock managed platforms with a control plane you actually own — one-click databases, automatic SSL, and zero-downtime deploys.</description><pubDate>Sun, 19 Apr 2026 00:00:00 GMT</pubDate><category>devops</category><category>coolify</category><category>docker</category><category>self-hosting</category><category>devops</category><category>saas</category><category>deployment</category><category>vps</category><category>traefik</category></item><item><title>MCP and the future of tool-use: building context-aware agents</title><link>https://ansezz.com/blog/mcp-context-aware-agents/</link><guid isPermaLink="true">https://ansezz.com/blog/mcp-context-aware-agents/</guid><description>The Model Context Protocol kills the era of brittle one-off integrations. 
Tools, resources, and prompts: the three primitives that let one server talk to any MCP-aware client — with a working TypeScript example you can ship today.</description><pubDate>Sun, 05 Apr 2026 00:00:00 GMT</pubDate><category>ai</category><category>mcp</category><category>anthropic</category><category>claude</category><category>agentic-ai</category><category>typescript</category><category>tool-use</category><category>context</category></item><item><title>Vibe coding and the architectural shift to agentic workflows</title><link>https://ansezz.com/blog/agentic-workflows-vibe-coding/</link><guid isPermaLink="true">https://ansezz.com/blog/agentic-workflows-vibe-coding/</guid><description>MCP, agentic loops, and intent-based engineering. How vibe coding becomes a real architecture pattern when AI stops being a chat sidebar and starts owning stateful loops against your tools. The practical Laravel + MCP stack I run today.</description><pubDate>Sun, 22 Mar 2026 00:00:00 GMT</pubDate><category>architecture</category><category>vibe-coding</category><category>agentic-ai</category><category>mcp</category><category>anthropic</category><category>claude</category><category>laravel</category><category>architecture</category></item><item><title>Why your RAG implementation is failing in production (and how to fix it)</title><link>https://ansezz.com/blog/why-your-rag-is-failing/</link><guid isPermaLink="true">https://ansezz.com/blog/why-your-rag-is-failing/</guid><description>Vector-only retrieval is the silent killer of production RAG. 
Hybrid search with BM25, reciprocal rank fusion, smarter chunking, re-rankers, and an evaluation harness — the production checklist that turns a flaky demo into a reliable system.</description><pubDate>Sun, 08 Mar 2026 00:00:00 GMT</pubDate><category>ai</category><category>rag</category><category>ai</category><category>vector-search</category><category>bm25</category><category>hybrid-search</category><category>re-ranker</category><category>production</category></item><item><title>From monolith to micro-services: a senior dev&apos;s guide to pragmatic scaling</title><link>https://ansezz.com/blog/monolith-to-microservices/</link><guid isPermaLink="true">https://ansezz.com/blog/monolith-to-microservices/</guid><description>Skip the big-bang rewrite. The strangler fig pattern, anti-corruption layers, Docker-first migration, and GKE/Coolify operations — how I peel services off a Laravel monolith one endpoint at a time without breaking revenue.</description><pubDate>Sun, 22 Feb 2026 00:00:00 GMT</pubDate><category>architecture</category><category>monolith</category><category>micro-services</category><category>scaling</category><category>strangler-fig</category><category>docker</category><category>kubernetes</category><category>devops</category><category>laravel</category></item><item><title>Laravel multi-tenancy: how I built a scalable SaaS architecture</title><link>https://ansezz.com/blog/laravel-multi-tenancy/</link><guid isPermaLink="true">https://ansezz.com/blog/laravel-multi-tenancy/</guid><description>Single DB vs multi-DB, global scopes that stop data leaks, stancl/tenancy in production, isolated storage, automated migrations, and the Docker + Google Cloud setup I run for high-trust SaaS clients.</description><pubDate>Sun, 08 Feb 2026 00:00:00 
GMT</pubDate><category>laravel</category><category>laravel</category><category>multi-tenancy</category><category>saas</category><category>architecture</category><category>stancl-tenancy</category><category>docker</category><category>postgres</category></item><item><title>AI integration vs traditional development: which is better for your business in 2026?</title><link>https://ansezz.com/blog/ai-vs-traditional-development/</link><guid isPermaLink="true">https://ansezz.com/blog/ai-vs-traditional-development/</guid><description>Speed, control, or a hybrid path? When AI-assisted development pays off, when traditional engineering is non-negotiable, and the hybrid workflow I recommend most often to founders and tech leads.</description><pubDate>Sun, 25 Jan 2026 00:00:00 GMT</pubDate><category>architecture</category><category>ai</category><category>strategy</category><category>hybrid</category><category>business</category><category>decision-making</category><category>productivity</category></item><item><title>Scaling with confidence: advanced Coolify deployment strategies</title><link>https://ansezz.com/blog/scaling-with-coolify/</link><guid isPermaLink="true">https://ansezz.com/blog/scaling-with-coolify/</guid><description>Move past the single-server trap. Multi-node Coolify setups, zero-downtime rolling deploys with health checks, dedicated build servers, managed databases, and GitHub Actions wiring — production-grade self-hosting without a DevOps team.</description><pubDate>Sun, 11 Jan 2026 00:00:00 GMT</pubDate><category>devops</category><category>coolify</category><category>deployment</category><category>devops</category><category>docker</category><category>self-hosting</category><category>ci-cd</category><category>scaling</category></item><item><title>Shopify Liquid vs. 
headless: choosing the right stack for scale</title><link>https://ansezz.com/blog/shopify-liquid-vs-headless/</link><guid isPermaLink="true">https://ansezz.com/blog/shopify-liquid-vs-headless/</guid><description>Hydrogen looks great on paper. Liquid still ships more revenue per week. A practical decision framework for picking between Liquid, headless Hydrogen, and the messy middle — based on what your team can actually operate long-term.</description><pubDate>Sun, 28 Dec 2025 00:00:00 GMT</pubDate><category>shopify</category><category>shopify</category><category>liquid</category><category>headless</category><category>hydrogen</category><category>performance</category><category>architecture</category><category>ecommerce</category></item><item><title>Picking the right RAG stack: vector databases for AI engineering</title><link>https://ansezz.com/blog/picking-the-right-rag-stack/</link><guid isPermaLink="true">https://ansezz.com/blog/picking-the-right-rag-stack/</guid><description>pgvector, Pinecone, Weaviate, Qdrant — a 2026 field guide. Which vector store to pick for your AI app, why hybrid search matters, and how to ship without painting yourself into a corner.</description><pubDate>Sun, 14 Dec 2025 00:00:00 GMT</pubDate><category>ai</category><category>rag</category><category>vector-databases</category><category>pgvector</category><category>pinecone</category><category>weaviate</category><category>qdrant</category><category>laravel</category></item><item><title>Vibe coding: why your next project needs more than just logic</title><link>https://ansezz.com/blog/vibe-coding/</link><guid isPermaLink="true">https://ansezz.com/blog/vibe-coding/</guid><description>Logic is the skeleton. Vibe is the soul. 
Why taste, intent, and feel are the new senior-engineer superpowers in the Cursor + Claude era — and how to keep the codebase from turning into a ball of mud while you chase it.</description><pubDate>Sun, 30 Nov 2025 00:00:00 GMT</pubDate><category>ai</category><category>vibe-coding</category><category>ai</category><category>cursor</category><category>claude</category><category>taste</category><category>dx</category><category>laravel</category></item><item><title>Hello, world. Yes, another developer blog.</title><link>https://ansezz.com/blog/hello-world/</link><guid isPermaLink="true">https://ansezz.com/blog/hello-world/</guid><description>Why this site exists, what I&apos;ll write about, and why neobrutalism is the right call for an engineer&apos;s portfolio in 2026.</description><pubDate>Sun, 16 Nov 2025 00:00:00 GMT</pubDate><category>career</category><category>meta</category><category>intro</category></item></channel></rss>