Trouble began with Google and ended with a shuffle.
In the process of moving a site from Dreamweaver to WordPress, I moved many pages to new URLs. What was http://sitename/About/index.html became http://sitename/about. It made semantic sense, but caused a problem for anyone who linked to the old pages. This site had more than 400 inbound links, and my changes meant those links produced a “page not found” error.
The solution, as I can hear developers shouting, is .htaccess redirects.
Apache web servers look to files named .htaccess for instructions on what to do with browser requests. Among other things, instructions in .htaccess determine which files the server can or can’t reveal, how to interpret dynamic URLs, and where to send browsers that request files that don’t exist.
That last task kept me occupied for half an hour as I listed where the old site’s content had moved to. For example:
redirect 301 /About_this_site http://sitename/about
The code above says, “If someone asks for http://sitename/About_this_site, send them to http://sitename/about.” The 301 code means, “By the way, this move is permanent. Update your records.”
After listing the pages, I used Google to find existing links to the site and followed them to make sure the redirects worked. The first one worked. The next three didn’t.
- http://sitename/about went to sitename/program, just as it should.
- http://sitename/about/index.html tried to find sitename/program/index.html, then gave a “page not found” error.
- http://sitename/About (uppercase A) gave a “page not found” error.
- http://sitename/About/index.html (uppercase A) gave a “page not found” error.
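One way to read these four results is that a redirect rule matches a URL-path prefix, case-sensitively and on whole path segments, and appends whatever follows the prefix to the target. The Python sketch below is a rough model of that behavior, not Apache’s code. The function name `apply_redirect` and the single rule it is called with (/about to http://sitename/program) are assumptions made for illustration.

```python
from typing import Optional


def apply_redirect(rule_path: str, target: str, request_path: str) -> Optional[str]:
    """Model one prefix-style redirect rule; return the new URL or None."""
    # Exact match: send the visitor straight to the target.
    if request_path == rule_path:
        return target
    # Prefix match on a whole path segment: keep the rest of the path.
    prefix = rule_path.rstrip("/")
    if request_path.startswith(prefix + "/"):
        return target + request_path[len(prefix):]
    # No match. The comparison is case-sensitive, so /About is not /about.
    return None


rule = ("/about", "http://sitename/program")
for path in ["/about", "/about/index.html", "/About", "/About/index.html"]:
    print(path, "->", apply_redirect(rule[0], rule[1], path))
```

In this model the second request becomes http://sitename/program/index.html and both uppercase requests match nothing, which is consistent with the three errors listed above.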
The problems looked easy enough to solve. I added redirects to handle index.html and About, and tried again.
redirect 301 /about http://sitename/program
redirect 301 /About http://sitename/program
redirect 301 /about/index.html http://sitename/program
redirect 301 /About/index.html http://sitename/program
All of these redirects should have sent me to http://sitename/program, but they didn’t. After a few minutes’ tinkering I had a hypothesis: since neither instruction with “index.html” worked, but similar redirects had worked before, the redirects _might_ be order-sensitive. I moved the more specific ones to the top and tried again.
redirect 301 /about/index.html http://sitename/program
redirect 301 /About/index.html http://sitename/program
redirect 301 /about http://sitename/program
redirect 301 /About http://sitename/program
Sure enough, it worked.
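A likely explanation, consistent with Apache’s mod_alias documentation, is that redirect rules are tried in the order they appear and the first match wins, so a short prefix such as /about captures /about/index.html before the longer rule gets a chance. The Python sketch below is a simplified model of that behavior rather than Apache’s implementation; `resolve` and `most_specific_first` are hypothetical helper names.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str]  # (old path prefix, new URL)


def resolve(rules: List[Rule], path: str) -> Optional[str]:
    """Return the redirect produced by the first rule that matches."""
    for prefix, target in rules:
        if path == prefix or path.startswith(prefix + "/"):
            # Whatever follows the matched prefix is appended to the target.
            return target + path[len(prefix):]
    return None


def most_specific_first(rules: List[Rule]) -> List[Rule]:
    """Order rules so that longer (more specific) paths come first."""
    return sorted(rules, key=lambda rule: len(rule[0]), reverse=True)


program = "http://sitename/program"
general_first = [
    ("/about", program),
    ("/About", program),
    ("/about/index.html", program),
    ("/About/index.html", program),
]

# General rules first: the /About prefix wins and drags index.html along.
print(resolve(general_first, "/About/index.html"))
# Specific rules first: the exact /About/index.html rule wins.
print(resolve(most_specific_first(general_first), "/About/index.html"))
```

The first call prints http://sitename/program/index.html, the broken address from the earlier test, and the second prints http://sitename/program.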
The morals of the story: keep more specific URLs above more general ones in .htaccess redirect lists, Google isn’t always right, and test with real data.
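Testing with real data can be partly automated. The Python sketch below requests each old path without following the redirect, then compares the status code and Location header with what was expected. The host name `sitename`, the helper names `fetch_head` and `failed_redirects`, and the shape of the case list are placeholders assumed for illustration.

```python
import http.client
from typing import Callable, List, Optional, Tuple

Response = Tuple[int, Optional[str]]  # (status code, Location header)


def fetch_head(host: str, path: str) -> Response:
    """Send a HEAD request and report the status and Location header."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def failed_redirects(
    cases: List[Tuple[str, str]],
    fetch: Callable[[str], Response],
) -> List[str]:
    """Return the old paths that do not answer 301 with the expected URL."""
    failures = []
    for old_path, expected in cases:
        status, location = fetch(old_path)
        if status != 301 or location != expected:
            failures.append(old_path)
    return failures


# Example call (needs network access; "sitename" is a placeholder host):
# cases = [("/About/index.html", "http://sitename/program")]
# failed_redirects(cases, lambda path: fetch_head("sitename", path))
```

Passing the fetch function in as an argument keeps the comparison logic separate from the network call, so the list of old URLs can be checked against a live server or against canned responses.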
