Converting a big site to a new friendly URL structure

If you have ever taken on an SEO project that involves converting a large site (meaning many thousands of pages) to a new architecture, you know how difficult this can be. The temptation is to skip the conversion and try to work with the existing pages. Mapping the old URLs to the new ones with valid 301 redirects is a challenge as well.

Here are some steps to follow when taking on this task.

1) First, make sure to get a good crawl of the existing site. This will tell you what’s out there. I recently had to do this with a site that had cleverly programmed over 1.5 million pages. Of course, Google only had around 1,000 of them in its index, but the rest were still ‘accessible’, and this had to change because they were doorways into the old site architecture. Left open, they could create duplicate content issues.

2) Once you have the URLs, you’ll need to determine the unique IDs that drive the existing architecture. You’ll need these either to use in your rewrites or, as in my most recent case, to carry over into a new system. Mine was a conversion from ASP to PHP, which meant I could not use ANY of the existing logic since it was all built on recordsets and the like. So I had to create an export based on the crawl that kept the product and category IDs intact, and used it to import into the new database that would drive the site.

3) Once you have the tables, you can either design the new database from them or, as I did, programmatically create rewrites in the .htaccess that will handle all of these pages. Be sure to check for long URLs with multiple query parameters, and forward all of them either to their corresponding pages or to the parent IDs of those pages.

4) When testing on the new site you can use the so that the rewrites will work while testing.

5) Run a new crawl on the new site and verify that everything works.

6) Check your redirects in something like Redirect Checker so you can be sure you get 301s instead of 404s or 302s. The 301 (permanent) redirect is the accepted method for redirecting moved pages. It also allows the PageRank (PR) from the old pages to pass on to the new pages.
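
To make steps 2 and 3 a bit more concrete, here is a rough Python sketch of pulling the IDs out of crawled URLs and turning them into .htaccess rules. It is only an illustration, not the script I used: the old script names (`product.asp`, `category.asp`), the parameter names (`prodid`, `catid`), the `new_paths` lookup and the friendly paths are all made up for the example, and it assumes Apache with mod_rewrite enabled.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical old scripts and the query parameter that carries each ID.
ID_PARAMS = {"/product.asp": "prodid", "/category.asp": "catid"}


def extract_id(old_url):
    """Return (script_path, id) for a crawled URL, or None if no ID is found."""
    parts = urlsplit(old_url)
    path = parts.path.lower()
    param = ID_PARAMS.get(path)
    if param is None:
        return None
    # Compare parameter names case-insensitively (ProdID vs prodid).
    query = {k.lower(): v for k, v in parse_qs(parts.query).items()}
    values = query.get(param, [])
    if not values or not values[0].isdigit():
        return None
    return path, values[0]


def htaccess_rules(crawled_urls, new_paths):
    """Build 301 rules for every crawled URL whose ID has a new friendly path.

    new_paths maps (script_path, id) -> new URL path, for example exported
    from the new database. Extra query parameters on the old URL are ignored,
    so long URLs with several parameters collapse onto the same target.
    """
    lines = ["RewriteEngine On"]
    seen = set()
    for url in crawled_urls:
        key = extract_id(url)
        if key is None or key in seen or key not in new_paths:
            continue
        seen.add(key)
        path, item_id = key
        param = ID_PARAMS[path]
        script = path.lstrip("/").replace(".", r"\.")
        # Match the ID anywhere in the query string, bounded by & or the ends.
        lines.append(
            f"RewriteCond %{{QUERY_STRING}} (?:^|&){param}={item_id}(?:&|$) [NC]"
        )
        # The trailing ? drops the old query string from the redirect target.
        lines.append(f"RewriteRule ^{script}$ {new_paths[key]}? [R=301,L,NC]")
    return "\n".join(lines)


if __name__ == "__main__":
    crawl = [
        "http://www.example.com/product.asp?ProdID=42&sort=price",
        "http://www.example.com/product.asp?prodid=42",
        "http://www.example.com/about.asp",
    ]
    print(htaccess_rules(crawl, {("/product.asp", "42"): "/products/blue-widget/"}))
```

Generating one rule per ID is fine for a few thousand pages. For something closer to the 1.5 million URLs mentioned above, a single pattern rule per page type that captures the ID and hands it to the new site is probably a better fit than a huge .htaccess file.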

The overall process can be pretty involved, but take it one step at a time and stay organized so you don’t end up losing the old pages.
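
One simple way to stay organized is to keep the crawl list and the redirect map side by side and check that nothing falls through. A minimal sketch, assuming the crawl has been saved as a list of old URLs and the redirect map is a plain dictionary of old URL to new URL (both formats are assumptions for the example, not the output of any particular crawler):

```python
def unmapped_urls(crawled_urls, redirect_map):
    """Return crawled old URLs that have no redirect yet (sorted, no duplicates)."""
    return sorted(set(crawled_urls) - set(redirect_map))


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    crawl = ["/product.asp?prodid=1", "/product.asp?prodid=2"]
    redirects = {"/product.asp?prodid=1": "/products/red-widget/"}
    for url in unmapped_urls(crawl, redirects):
        print("No redirect for:", url)
```

Running a check like this after every batch of rewrites gives you a shrinking to-do list of old pages that still need a home.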

The final tip: once you make the change, don’t freak out when traffic to the site drops, as the new pages will take some time to settle. When anything happens ‘en masse’ on a site, the search engines sometimes take a while to catch up. Once they re-crawl all the new pages and figure out what is going on, they will start giving clearer rankings.
