# Bollywood Box Robots.txt
# https://www.bollywoodboxcalgary.com

# Main directives for all bots
User-agent: *
Allow: /
Disallow: /config/
Disallow: /components/
Disallow: /api/
Disallow: /*?*author=*
Disallow: /*?*tag=*
Disallow: /*?*month=*
Disallow: /*?*view=*
Disallow: /*?*format=*

# Block legal pages from all crawlers
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# AI Bot specific directives - Allow AI crawlers to access llms.txt
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: CCBot
User-agent: anthropic-ai
User-agent: Claude-Web
User-agent: cohere-ai
User-agent: PerplexityBot
User-agent: YouBot
Allow: /
Allow: /llms.txt
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# Google Extended (AI training) - Allow access
User-agent: Google-Extended
Allow: /
Allow: /llms.txt
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# Facebook Bot
User-agent: FacebookBot
Allow: /
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html

# Google Ads Bot
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
User-agent: AdsBot-Google-Mobile-Apps
Allow: /

# Search engine crawlers - Block llms.txt from indexing
User-agent: Googlebot
User-agent: Bingbot
User-agent: Slurp
User-agent: DuckDuckBot
User-agent: Yandex
Allow: /
Disallow: /privacy-policy.html
Disallow: /terms-of-use.html
# Note: llms.txt noindex is handled via X-Robots-Tag header

# Crawl delay for heavy bots
User-agent: Baiduspider
Crawl-delay: 10

User-agent: SemrushBot
Crawl-delay: 5

User-agent: AhrefsBot
Crawl-delay: 5

# Sitemap location
Sitemap: https://www.bollywoodboxcalgary.com/sitemap.xml