Customising robots.txt.liquid on Shopify is one of those tasks that looks small in code but can carry an outsized SEO consequence.
What we have seen in StoreBuilt Shopify work is this: teams rarely break robots.txt because they are careless. They break it because the change feels too simple to deserve a proper release process. A developer adds one rule for filters, an SEO adds a rule for search pages, another team inherits the theme, and soon nobody knows whether the file is protecting crawl budget or hiding commercial pages.
Before editing anything, run the free Shopify robots.txt validator. If the store already has risky custom rules and you want StoreBuilt to review the safest next step, contact StoreBuilt.
Table of contents
- When Shopify robots.txt.liquid should be customised
- The default-first principle
- Pre-edit checklist for ecommerce teams
- Rules that deserve extra caution
- Testing, release, and rollback process
- StoreBuilt example from a custom robots review
- Robots.txt.liquid decision table
- 45-day customisation workflow
- Final StoreBuilt point of view
When Shopify robots.txt.liquid should be customised
Shopify’s default robots.txt setup works for many stores. That is a feature, not a limitation. Customisation should happen when the store has a specific crawl-control need that the default file does not handle well enough.
Common reasons include:
- controlling crawl access to specific filter or parameter patterns
- managing Shopify Markets domain differences
- preventing crawl waste after a migration
- handling unusual app-generated URLs
- documenting intentional crawler access for store-specific architecture
Bad reasons include:
- “we heard robots.txt improves rankings”
- “we want Google to remove these pages from the index immediately”
- “a tool gave us a scary generic warning”
- “we want to block every URL that is not commercial”
The decision should start with evidence. Run the validator, review Search Console, inspect internal links, and understand whether Google is actually crawling or indexing the URLs you are worried about.
The default-first principle
Shopify’s developer documentation recommends using the provided Liquid objects where possible because default rules can be updated over time. In practice, that means a safe custom file should output Shopify’s default groups (exposed in the template through the robots.default_groups object) before layering store-specific rules.
The simplest mindset is:
- Preserve the Shopify baseline.
- Add only the rules you can justify.
- Document the reason for every custom rule.
- Retest after theme changes.
This prevents a common failure pattern: a custom file replaces useful default behaviour, then future Shopify updates or app changes are not reflected in the store’s crawl setup.
If your team cannot explain why a rule exists, it should be reviewed before it becomes permanent.
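One way to check that a custom file still preserves the baseline is to compare the live output against a saved copy of the default output. The sketch below is a minimal illustration, not a Shopify tool: the `BASELINE` and `LIVE` strings are made-up samples, and the helper names are hypothetical.

```python
def parse_groups(robots_text: str) -> dict[str, set[str]]:
    """Map each user-agent to the set of Allow/Disallow lines in its group."""
    groups: dict[str, set[str]] = {}
    current: list[str] = []   # user-agents of the group being read
    in_rules = False          # True once the current group has a rule
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:      # a user-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), set())
        elif field in ("allow", "disallow") and current:
            in_rules = True
            for agent in current:
                groups[agent].add(f"{field}: {value}")
    return groups


def missing_baseline_rules(baseline: str, live: str) -> dict[str, set[str]]:
    """Return baseline rules that no longer appear in the live file, per agent."""
    base, now = parse_groups(baseline), parse_groups(live)
    missing = {agent: rules - now.get(agent, set()) for agent, rules in base.items()}
    return {agent: rules for agent, rules in missing.items() if rules}


# Illustrative samples only; real files are much longer.
BASELINE = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Sitemap: https://example.com/sitemap.xml
"""
LIVE = """\
User-agent: *
Disallow: /cart
Disallow: /collections/*filter*
"""

print(missing_baseline_rules(BASELINE, LIVE))  # {'*': {'disallow: /checkout'}}
```

An empty result means every baseline rule is still present. A non-empty result is a prompt for review, not proof of a mistake, because a rule may have been removed on purpose.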
Pre-edit checklist for ecommerce teams
Before you edit robots.txt.liquid, answer these questions:
- Does the store already have a custom robots.txt.liquid file?
- Is the current public /robots.txt reachable?
- Does the current file declare the correct sitemap?
- Are product, collection, blog, and page URLs crawlable?
- Are utility paths such as cart, checkout, account, and search controlled?
- Which URLs are actually appearing in Search Console?
- Which rules are temporary migration controls, and which are permanent?
- Who owns future robots.txt changes?
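Some of the checklist can be answered from a saved copy of the live file. The following is a coarse sketch under stated assumptions: it ignores user-agent grouping, Allow overrides, and wildcards, and the path lists and function name are illustrative choices rather than anything Shopify defines.

```python
CORE_PATHS = ("/products/", "/collections/", "/blogs/", "/pages/")
UTILITY_PATHS = ("/cart", "/checkout", "/account", "/search")


def checklist_snapshot(robots_text: str) -> dict:
    """Summarise sitemap lines and obviously risky or missing Disallow rules."""
    sitemaps, disallows = [], []
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        field, sep, value = line.partition(":")
        if not sep:
            continue
        field, value = field.strip().lower(), value.strip()
        if field == "sitemap":
            sitemaps.append(value)
        elif field == "disallow" and value:
            disallows.append(value)
    return {
        "sitemaps": sitemaps,
        # A plain prefix rule such as "/" or "/products" covers the whole section.
        "core_paths_broadly_blocked": sorted(
            {core for core in CORE_PATHS for rule in disallows if core.startswith(rule)}
        ),
        # Utility paths with no plain prefix rule covering them.
        "utility_paths_uncontrolled": [
            path for path in UTILITY_PATHS
            if not any(path.startswith(rule) for rule in disallows)
        ],
    }


# Illustrative sample, not a real store's file.
SAMPLE = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /account
Disallow: /products
Sitemap: https://example.com/sitemap.xml
"""

snapshot = checklist_snapshot(SAMPLE)
```

For this sample the snapshot lists one sitemap, flags `/products/` as broadly blocked, and reports `/search` as uncontrolled. Questions about Search Console data, rule ownership, and temporary versus permanent rules still need a human answer.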
If those answers are unclear, the right move is usually a technical SEO cleanup rather than a quick code edit. StoreBuilt would normally connect this to Shopify SEO & AI Search Readiness because crawl control affects content, collection architecture, structured data, and migration safety.
Rules that deserve extra caution
Some rules are more dangerous than they look.
| Rule pattern | Why it is risky | Safer review question |
|---|---|---|
| Disallow: / | can block the whole store from crawling | is this a staging-only rule accidentally live? |
| Disallow: /products | can block core commercial pages | are product URLs meant to rank? |
| Disallow: /collections | can block category visibility | are collection URLs the main SEO landing pages? |
| broad wildcard filters | can catch URLs beyond the intended pattern | have we tested representative examples? |
| blocking pages with noindex tags | a blocked crawler cannot fetch the page, so it never sees the noindex | should this be allowed and noindexed instead? |
| blocking migrated legacy URLs | can interfere with redirect discovery | are redirects being crawled and validated? |
The safest rule is the narrowest rule that solves a proven problem.
That does not mean robots.txt should be timid. It means crawl control should be precise.
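To make "narrow" testable, a draft rule can be run against representative paths before it goes live. The sketch below approximates the matching behaviour described in RFC 9309 and Google's robots.txt documentation (`*` wildcard, `$` end anchor, longest matching pattern wins, Allow wins a tie). It evaluates one user-agent group only, and the rules and URLs are illustrative assumptions, not recommendations for any store.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored-at-start regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """rules holds (directive, pattern) pairs for ONE user-agent group."""
    best_len, allowed = -1, True   # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:            # an empty pattern matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and is_allow
            if longer or tie_for_allow:
                best_len, allowed = len(pattern), is_allow
    return allowed


# A broad wildcard versus a narrower one that requires a query string.
BROAD = [("allow", "/collections/"), ("disallow", "/collections/*filter*")]
NARROW = [("allow", "/collections/"), ("disallow", "/collections/*?*filter.")]

FILTER_URL = "/collections/shoes?filter.v.price.gte=10"
REAL_COLLECTION = "/collections/filtered-water-bottles"

for label, rules in (("broad", BROAD), ("narrow", NARROW)):
    print(label, is_allowed(FILTER_URL, rules), is_allowed(REAL_COLLECTION, rules))
```

The broad rule blocks both paths, including a genuine collection whose handle happens to contain "filter". The narrower rule blocks only the filtered URL. This is a local approximation, so results should still be confirmed against the live file with Search Console.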
Testing, release, and rollback process
A robots.txt edit deserves a lightweight release process.
Before release
Run the StoreBuilt robots.txt validator, export the current file, and write down which rule is changing. Test examples of URLs that should remain crawlable and URLs that should be blocked.
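Writing down which rule is changing is easier with a diff of the exported file against the proposed output. A minimal sketch using Python's standard `difflib`; the file labels and sample contents are placeholders.

```python
import difflib


def rule_changes(before: str, after: str) -> list[str]:
    """Return only the added (+) and removed (-) lines between two versions."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="robots.before.txt", tofile="robots.after.txt",
        lineterm="", n=0,
    ))
    # The first two lines are the ---/+++ file headers; hunk markers start with @@.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Placeholder contents standing in for the exported and proposed files.
BEFORE = "User-agent: *\nDisallow: /cart\n"
AFTER = "User-agent: *\nDisallow: /cart\nDisallow: /search\n"

for change in rule_changes(BEFORE, AFTER):
    print(change)
```

Keeping the exported "before" file alongside this list also gives the team a ready rollback copy.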
During release
Make the change in the right theme, confirm the live /robots.txt, and check the sitemap reference. Avoid bundling the robots edit with unrelated theme changes if the store is already dealing with crawl or indexation problems.
After release
Retest the validator, use Search Console URL Inspection for representative pages, and monitor crawl/indexation changes. If the change creates unexpected blocking, revert quickly.
This is especially important for stores with international domains. Shopify Markets can create legitimate differences in crawl strategy, but each domain still needs a testable outcome.
StoreBuilt example from a custom robots review
One Shopify store came to StoreBuilt with a custom robots file that had been edited by multiple teams over several years. Nobody had made a single catastrophic mistake. The issue was accumulated uncertainty.
Some rules were related to an old migration, some appeared to target search pages, and some had no clear owner. The store’s commercial pages were mostly crawlable, but the team had lost confidence in whether the setup was intentional.
The useful fix was to rebuild the file around a default-first structure, remove historical rules that no longer had evidence, keep a few narrow crawl controls, and document the owner. The technical improvement was modest. The operational improvement was bigger: future SEO and theme changes no longer started from confusion.
Robots.txt.liquid decision table
| Scenario | Customise now? | Better first step |
|---|---|---|
| default Shopify file, no crawl issues | usually no | monitor Search Console and keep defaults |
| missing sitemap declaration | yes, after confirming cause | restore sitemap reference and retest |
| product pages blocked | urgent review | remove accidental block and inspect URLs |
| filter URLs flooding crawl reports | possibly | confirm patterns before narrow rules |
| old migration URLs still appearing | maybe | inspect redirects before blocking |
| app-generated URLs discovered | maybe | check app settings, internal links, and theme output |
| pages need removal from index | not by robots alone | use noindex where crawlable, then validate |
The key is matching the control to the problem. Robots.txt is powerful when used for crawl access. It becomes clumsy when used to solve indexing, content, or architecture problems that need other tools.
45-day customisation workflow
Days 1-10: audit and evidence
Validate the live file, compare it with Shopify’s expected default structure, and collect Search Console examples. Identify whether the issue is crawl access, indexation, duplicate discovery, or internal linking.
Days 11-25: design the rule set
Preserve default groups, draft only the required custom rules, and map each rule to example URLs. Add documentation beside the implementation plan so future teams understand the logic.
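The mapping from rule to reason, owner, and example URLs can live in a small register that is reviewed with every release. The structure below is one possible shape, with hypothetical field names and sample entries; a spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class CustomRule:
    rule: str                                   # the robots.txt line as written
    reason: str                                 # evidence that justified it
    owner: str                                  # who approves future changes
    should_block: list[str] = field(default_factory=list)   # example URLs
    should_allow: list[str] = field(default_factory=list)   # example URLs
    review_by: Optional[date] = None            # set for temporary migration rules


def review_problems(rules: list[CustomRule], today: date) -> list[str]:
    """List documentation gaps and temporary rules that are overdue for review."""
    problems = []
    for r in rules:
        if not r.reason.strip() or not r.owner.strip():
            problems.append(f"{r.rule}: missing reason or owner")
        if not r.should_block or not r.should_allow:
            problems.append(f"{r.rule}: needs example URLs for both outcomes")
        if r.review_by is not None and r.review_by < today:
            problems.append(f"{r.rule}: temporary rule is past its review date")
    return problems


# Hypothetical entries for illustration.
REGISTER = [
    CustomRule(
        rule="Disallow: /collections/*?*filter.",
        reason="Filter URLs dominate crawl stats in Search Console",
        owner="SEO lead",
        should_block=["/collections/shoes?filter.v.price.gte=10"],
        should_allow=["/collections/shoes"],
    ),
    CustomRule(rule="Disallow: /old-catalog/", reason="", owner="",
               review_by=date(2024, 1, 31)),
]

for problem in review_problems(REGISTER, today=date(2024, 6, 1)):
    print(problem)
```

Here the first entry passes and the second is flagged three times: no reason or owner, no example URLs, and an expired review date.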
Days 26-45: release and monitor
Deploy the change, retest live output, inspect priority URLs, and monitor crawl/indexation reports. Keep a rollback copy and avoid judging the change from one isolated warning.
If this feels heavier than expected, that is the point. The code is small, but the business consequence can be large.
For hands-on help, use the free validator first, then contact StoreBuilt with the store URL and current robots output.
Final StoreBuilt point of view
Robots.txt customisation should be boring, documented, and evidence-led. The best Shopify teams do not edit it because a generic audit tool shouted at them. They edit it when a specific crawl problem has been proven and a narrow rule can solve it safely.
StoreBuilt’s view is simple: preserve Shopify’s useful defaults, avoid broad blocking, test representative URLs, and treat robots.txt as part of the technical SEO release process. That is how a small file stays helpful instead of becoming a hidden risk.