Posted on Nov 10, 2011

Ultimate 301 & 404 handler for Sitecore – Part 2

In Part 1 I went through the process of getting Sitecore pipelines to handle an XML file containing redirects. If no URL match is found in the redirects file, and no actual Sitecore item exists, a 404 status is returned with a friendly 404 page.

In this part, I’m going to show how we manage these redirects within Sitecore, and publish them out to the XML file for handling at runtime.

The Sitecore Items

We use a Global folder within our Sitecore content tree.  This is outwith the site root, and contains various settings, config items, lookup list items and so on.  I’ve created a Redirects folder there containing a “301 Redirects” and an “Aliases” folder.  In reality there’s no difference between an alias and a redirect but it’s probably easier to manage if they’re in separate folders.

Sitecore/Content/SiteName/Global/Redirects/301 Redirects
Sitecore/Content/SiteName/Global/Redirects/Aliases

I’ve also created a “Redirect Item” template which is the default item for these folders.  It’s pretty straightforward, with just two fields:

MatchURL – Single line Text
RedirectTo – Droptree – Source: /Sitecore/Content/SiteName/Website

So, quite easily we can create an item in either the 301 Redirects or the Aliases folder, containing a URL to match (from the leading forward slash onwards), and a Sitecore item to redirect to.

Updating the Redirects XML

It took a bit of thought to work out how to get these items out and into the XML file.  At first I hooked into the item:saved event and checked against the Redirect Item template.  That worked fine, but really wasn’t the right place to do it, as the redirect hadn’t been published yet.  Really I wanted the redirects to respect publishing restrictions, so a redirect would only be added if it was actually publishable.

I then looked at the publish:end event and hooked into that.  Unfortunately, even with a Smart Publish, every redirect item in our tree was creating a match as it was being published.  It seems that this event will iterate over the whole tree regardless of whether the item will be published, so our XML was being written multiple times, which was bad news.

Finally, I came across Alex Shyba’s post on how to get a list of published items.  Perfect!  Alex is well worth following by the way – lots of useful things coming from him. With a quick tweak to the web.config to add the history engine to the web database, I can now get a list of items that have actually been published in the last operation.  If I find a redirect item in that list, we rebuild the XML.
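A sketch of that web.config tweak, assuming a stock Sitecore 6 configuration: the block below mirrors the history engine definition that ships inside the master database node, copied into the web database node.  Treat the type string and the entry lifetime as assumptions and copy the real values from the master definition in your own web.config.

```xml
<!-- Inside <database id="web" ...> in web.config -->
<Engines.HistoryEngine.Storage>
	<!-- Assumed to match the master database definition; check your own config -->
	<obj type="Sitecore.Data.$(database).$(database)HistoryStorage, Sitecore.Kernel">
		<param connectionStringName="$(id)" />
		<EntryLifeTime>30.00:00:00</EntryLifeTime>
	</obj>
</Engines.HistoryEngine.Storage>
<Engines.HistoryEngine.SaveDotNetCallStack>false</Engines.HistoryEngine.SaveDotNetCallStack>
```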

Here is the processor we add to our DogRedirector.config:

<publish>
	<processor type="Dog.Code.Dog.DogPublishProcessor, Dog" patch:after="processor[@type='Sitecore.Publishing.Pipelines.Publish.ProcessQueue, Sitecore.Kernel']" />
</publish>

And here is a minor tweak to Alex’s ProcessPublishedItems method to identify our redirects and trigger our XML builder:

bool bRedirectPublished = false;

foreach (var id in cacheQueue)
{

	// Check if this is a redirect item, and if so, rebuild the XML
	Item itm = db.GetItem(id);
	if (itm != null && itm.TemplateID.ToString() == Settings.IDRedirectsTpl)
	{
		Log.Info("*** Redirect item published: " + itm.Name, this);
		bRedirectPublished = true;
	}
}

// We have a redirect change - write the XML
if (bRedirectPublished)
{
	RedirectHandlerUtils redirUtils = new RedirectHandlerUtils();
	redirUtils.buildRedirectXML();
}

Now I know there are more efficient ways to do this than performing a db.GetItem on each ID to check its template, but as usual with Sitecore, things like this seem to take no time at all, so I’ve left it as is for now.  I may revisit this if it ends up not scaling.

Building the XML

It’s now a pretty straightforward case of writing our XML file.  We query all our redirect items from the web database, iterate over them and build our simple XML file:

public void buildRedirectXML()
{
	String strXMLFile = Sitecore.IO.FileUtil.MapPath(Settings.RedirectsXMLPath);
	String strRootRedirectItem = Settings.GlobalPath+"/redirects";

	try
	{
		// We're running from shell, so tell the linkmanager which site to use
		UrlOptions options = new UrlOptions();
		options.Site = Sitecore.Configuration.Factory.GetSite("website");
		options.LanguageEmbedding = LanguageEmbedding.Never;
		options.AddAspxExtension = true;

		// Get redirect items
		Database db = Database.GetDatabase("web");
		Item[] redirItems = db.SelectItems(strRootRedirectItem + "//*[@@templateid='" + Settings.IDRedirectsTpl + "']");

		StringBuilder sitemap = new StringBuilder();
		//add initial bits for XML file
		String sN = Environment.NewLine;
		sitemap.Append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + sN);
		sitemap.Append("<redirects xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"+sN);

		Log.Debug("Processing Redirect Loop");

		foreach (Item itm in redirItems)
		{
			// Is this redirect publishable?
			if (itm.Publishing.IsPublishable(DateTime.Now, false))
			{
				Log.Debug("Processing Redirect for: " + itm.Name);
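				// RedirectTo is a Droptree, so its raw value is the target item's ID.
				// Guard against an empty or invalid value so one bad item doesn't abort the whole file.
				String strTargetId = itm["RedirectTo"];
				Item itmLink = ID.IsID(strTargetId) ? db.GetItem(new ID(strTargetId)) : null;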

				// Does the target item exist, and is it publishable
				if (itmLink != null && itmLink.Publishing.IsPublishable(DateTime.Now, false))
				{
					String strUrl = Utils.stripXMLIllegalCharacters(LinkManager.GetItemUrl(itmLink, options).Replace(" ", "-").ToLower());
					String strMatch = Utils.stripXMLIllegalCharacters(itm["MatchURL"]);
					sitemap.Append(String.Format("  <redirect>" + sN + "    <from>{0}</from>" + sN + "    <to>{1}</to>" + sN + "  </redirect>" + sN + "", strMatch, strUrl));
				}
				else
				{
					Log.Debug("No link found for " + itm.Name + " - " + itm["RedirectTo"]);
				}
			}
		}

		sitemap.Append("</redirects>");

		// The using block disposes the writer even if the write throws
		using (TextWriter tw = new StreamWriter(strXMLFile))
		{
			tw.WriteLine(sitemap.ToString());
		}

	}
	catch (Exception ex)
	{
		Log.Error("Problem writing redirect XML", ex, this);
	}
}

Notice we need to set up UrlOptions as we’re running this method from the shell website (i.e. Sitecore’s admin). Left to its defaults there, the LinkManager will write out paths from the root Sitecore node. We tell it to use the “website” context (or whatever you’ve named your site in your web.config) and set whatever other options match the LinkManager config you’re using.
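For reference, the format strings in buildRedirectXML produce a file shaped like this.  The two entries are made-up examples (hypothetical paths), not values from our site:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<redirects xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <redirect>
    <from>/old-widgets</from>
    <to>/products/widgets.aspx</to>
  </redirect>
  <redirect>
    <from>/sale</from>
    <to>/offers/summer-sale.aspx</to>
  </redirect>
</redirects>
```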

I could probably have used an XmlWriter or some other .NET methods to build the XML but hey, I’m a scripter at heart and I have a tendency to build strings the old-school way…
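For anyone who prefers the XmlWriter route, here is a minimal sketch of what it might look like.  RedirectXmlSketch and WriteRedirects are hypothetical names, the from/to pairs are assumed to have been collected from the redirect items already, and the xmlns:xsi attribute is left out for brevity.  The main gain is that XmlWriter escapes characters such as & and < itself.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class RedirectXmlSketch
{
    // Hypothetical helper: writes from/to pairs in the same
    // <redirects>/<redirect> layout as the string-built version.
    public static void WriteRedirects(Stream output, IEnumerable<KeyValuePair<string, string>> redirects)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,
            Encoding = new UTF8Encoding(false) // UTF-8 without a byte order mark
        };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("redirects");

            foreach (KeyValuePair<string, string> redirect in redirects)
            {
                writer.WriteStartElement("redirect");
                // WriteElementString escapes &, < and > in the values
                writer.WriteElementString("from", redirect.Key);
                writer.WriteElementString("to", redirect.Value);
                writer.WriteEndElement();
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

To write the file you would open a stream on the mapped path, for example with File.Create(strXMLFile), and pass it in along with the pairs.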

The Result

Now you should be able to manage your aliases and redirects on a per-site basis purely from Sitecore with a few tweaks to this code.  The pipelines detailed in Part 1 should provide a pretty efficient way of forwarding users without excessive reads of the XML.  If you were in a high-traffic scenario then you could look at holding the XML file in cache.
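As a rough illustration of the caching idea, the sketch below keeps the redirects in an in-memory dictionary and reloads it only when the file’s last-write time changes.  RedirectCacheSketch is a hypothetical class, not part of the download, and it assumes exact, case-insensitive matching on the from value, which may differ from the matching done in Part 1.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

public class RedirectCacheSketch
{
    private readonly string _path;
    private readonly object _lock = new object();
    private DateTime _loadedStamp = DateTime.MinValue;
    private Dictionary<string, string> _map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    public RedirectCacheSketch(string path)
    {
        _path = path;
    }

    // Returns the target URL for a requested path, or null when nothing matches.
    public string Lookup(string requestedPath)
    {
        if (!File.Exists(_path))
        {
            return null;
        }

        DateTime stamp = File.GetLastWriteTimeUtc(_path);
        lock (_lock)
        {
            // Reload only when the file has been rewritten since the last load
            if (stamp != _loadedStamp)
            {
                _map = Load(_path);
                _loadedStamp = stamp;
            }

            string target;
            return _map.TryGetValue(requestedPath, out target) ? target : null;
        }
    }

    private static Dictionary<string, string> Load(string path)
    {
        var map = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (XElement redirect in XDocument.Load(path).Root.Elements("redirect"))
        {
            string from = (string)redirect.Element("from");
            string to = (string)redirect.Element("to");
            if (!String.IsNullOrEmpty(from) && to != null)
            {
                map[from] = to; // a later duplicate overwrites an earlier one
            }
        }
        return map;
    }
}
```

Because the publish processor rewrites the file on each redirect change, the timestamp check picks up new redirects without an application restart.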

You also might want to make some further enhancements, like making sure each redirect is unique on creation, and tidying up redirects if the target page is deleted.  At least Sitecore will warn that there are links to the item if you try that.

So here are the files again, as detailed in Part 1.  These have been extracted from our solution and modified for sharing so may need some tweaks to get them to build.

Download DogRedirector

I should make a final note that getting this together was prompted by Euan Blair, our SEO guru, and I should credit input from and send thanks to various devs at Dog Digital, especially Ian Etherington.  Cheers!