2018 Search Engine Optimization with AngularJS 1.x Single Page Application

Search engine optimization with single page applications

Creating a Single Page Application (SPA) comes with many advantages over a traditional web site that can improve the user experience. The main one is speed: the application does not have to reload the entire page in order to display changes in the content. This leads to an experience that can be very similar, or identical, to that of a traditional desktop application. One area where SPAs tend to fall short is that web crawlers and search engines haven’t quite kept up with the ability to crawl and index them. By default, Single Page Applications using AngularJS put their routes after a hash bang (i.e., #!) in the URL, so that navigating between views updates the browser’s history without reloading the page. Everything after the hash is a URL fragment, which is never sent to the web server (e.g., Apache or Tomcat); it is only interpreted by the browser or client, so a crawler that does not execute JavaScript never sees that content.
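To make the last point concrete, here is a minimal sketch using the standard `URL` class (available in browsers and in Node.js). The host `example.com` is a placeholder, not a real site from this post. It shows which part of a hash bang URL would go into an HTTP request and which part stays on the client.

```javascript
// Hypothetical hash bang URL; example.com is a placeholder host.
const url = new URL('https://example.com/#!/restaurants/list');

// The HTTP request is built from the path plus the query string only.
const sentToServer = url.pathname + url.search;

// The fragment stays in the browser and is only visible to client-side code.
const clientOnly = url.hash;

console.log(sentToServer); // "/"
console.log(clientOnly);   // "#!/restaurants/list"
```

In other words, every hash bang route looks like the same request for "/" from the server’s point of view.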

Now, Google claims to be able to index SPAs; however, in my experience it has not been able to index three of my AngularJS web sites. Google Webmaster/Search Console has consistently failed to crawl anything more than a few JavaScript files, which is not good for Search Engine Optimization (SEO). Even if you provide a sitemap.xml with all permutations of your dynamic pages listed, it will still fail to crawl your site, in my experience. Also, for your sanity, you can check what your web site looks like when Google crawls it by going to Google Webmaster/Search Console and, under “Crawl”, choosing “Fetch as Google”, then “Fetch and Render”.

Then, if it works, you can click the links in the table to compare the rendered page as seen by the Googlebot web crawler with the page as seen by a user. You will likely see that Googlebot fails to load your web site correctly.

If you notice that Google, or other search engines, are failing to crawl your site correctly, you do have options. When searching for how to do SEO on AngularJS and/or SPAs, you will find many posts about having to pre-render all of your pages, as a sort of cache, to serve to search engines by re-routing the hash bang URL path to a more traditional URL (i.e., eliminating the signal to not load what comes after the hash bang). This can involve using services such as prerender.io or tools such as PhantomJS. These solutions have been used by many, but in my opinion they are too complicated, and they are not really acceptable for a technology like SPA, which is becoming the de facto standard for modern web applications. Luckily, we have a better option.

Instead of going through the chore of pre-rendering every single page in your web application, we can leverage HTML5. This is a two-part solution. First, we will leverage some AngularJS code to automatically rewrite hash bang URLs to normal URLs without the hash bang. For instance, a URL such as /#!/restaurants/list becomes /restaurants/list.

This will result in a much more crawlable web site and can be done with minimal code changes. Here is the highlighted code change needed in your index.html:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Wing Specials Today</title>

  <!-- This is needed for HTML5 Mode -->
  <base href="/"><!-- Make sure the base is above your stylesheet -->

  <link rel="stylesheet" href="../node_modules/bootstrap/dist/css/bootstrap.css">
  <link rel="stylesheet" href="../node_modules/bootstrap/dist/css/bootstrap-theme.css">
  <link rel="stylesheet" href="css/wingnight.css">

  . . .

</head>

. . .

Note: This change must be above your stylesheet links!
Note: Locally, I test a few applications from the same root folder using npm start. So, I need to change my base to <base href="/wingnight/"> for this to work locally.
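Why the base matters can be sketched with the standard `URL` class, which resolves relative references the same way a browser does. The host and the deep-link path below are placeholders for illustration. Without a base, a relative stylesheet path is resolved against the current page URL, which breaks as soon as pages have nested paths.

```javascript
// Placeholder origin and deep link, for illustration only.
const pageUrl = 'https://example.com/restaurants/detail/some-restaurant';
const baseUrl = 'https://example.com/'; // what <base href="/"> resolves to

// No <base>: relative paths resolve against the current page URL.
const withoutBase = new URL('css/wingnight.css', pageUrl).href;

// With <base href="/">: relative paths resolve against the application root.
const withBase = new URL('css/wingnight.css', baseUrl).href;

console.log(withoutBase); // https://example.com/restaurants/detail/css/wingnight.css
console.log(withBase);    // https://example.com/css/wingnight.css
```

The base only applies to URLs that appear after it in the document, which is why it has to sit above the stylesheet links.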

Next, you will need to change your main module’s config.js file. For instance:

(function( angular ) {
  angular.module( 'wingnight' )
    .config( wingnightConfig );

  function wingnightConfig( $stateProvider, $urlRouterProvider, $locationProvider ) {
    $urlRouterProvider.otherwise( '/restaurants/list' );
    $stateProvider
      .state( 'restaurants', {
        url     : '/restaurants',
        template: '<restaurant-main></restaurant-main>'
      } )
      .state( 'about-us', {
        url     : '/about-us',
        template: '<about-us-main></about-us-main>'
      } )
      .state( 'contact-us', {
        url     : '/contact-us',
        template: '<contact-us-main></contact-us-main>'
      } );

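    // Use HTML5 History API URLs (no hash) in browsers that support them.
    $locationProvider.html5Mode(true);
    // Keep "!" as the hash prefix, so #! URLs remain the fallback in browsers without History API support.
    $locationProvider.hashPrefix('!');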
  }

})( angular );

Secondly, we need a solution that helps serve the pages when they are linked to directly or when a page is refreshed. Why? Well, when we refresh the page, the web server tries to load that given page; however, it doesn’t exist, because this is a Single Page Application and the routes are created dynamically on the client. This will result in an error like the following:

Cannot GET /wingnight/restaurants/detail/Bigham-Tavern-Mt.-Washington

Therefore, we need to rewrite all URLs that are not at the root of the application so that they first load the root AngularJS application (i.e., index.html), which has the references to your module(s). This allows your Single Page Application to load correctly again after a browser refresh or via direct linking. Even better, all already established links using the hash bang format should continue to work.
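As a rough illustration of how an old hash bang link corresponds to its HTML5-mode equivalent, the mapping can be sketched in plain JavaScript. This is not AngularJS’s internal code: `hashBangToHtml5` is a hypothetical helper written for this example, and the host is a placeholder.

```javascript
// Hypothetical helper: converts "base/#!/route" into "base/route".
// Illustrative only; AngularJS performs its own rewrite internally.
function hashBangToHtml5(href, hashPrefix = '!') {
  const url = new URL(href);
  const marker = '#' + hashPrefix;
  if (!url.hash.startsWith(marker + '/')) {
    return url.href; // not a hash bang route, leave it untouched
  }
  // Route portion after "#!", e.g. "/restaurants/list".
  const route = url.hash.slice(marker.length);
  // Join onto the application base path without doubling the slash.
  const basePath = url.pathname.replace(/\/$/, '');
  return url.origin + basePath + route;
}

console.log(hashBangToHtml5('https://example.com/wingnight/#!/restaurants/list'));
// https://example.com/wingnight/restaurants/list
```

The rewritten form is the one that the web server must be able to answer on a refresh or a direct link.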

There are several options for getting your web server to route non-base links back to index.html; however, the simplest one is an Apache .htaccess file added to the root of your single page application:

<IfModule mod_rewrite.c>
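    RewriteEngine on
    # If the request matches an existing non-empty file, symlink, or directory...
    RewriteCond %{REQUEST_FILENAME} -s [OR]
    RewriteCond %{REQUEST_FILENAME} -l [OR]
    RewriteCond %{REQUEST_FILENAME} -d
    # ...serve it unchanged and stop processing rules.
    RewriteRule ^.*$ - [NC,L]
    # Otherwise, hand the request to the SPA entry point.
    RewriteRule ^(.*) index.html [NC,L]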
</IfModule>

With this change, you will be able to refresh your page and have it work correctly. You will also be able to link directly to any of the dynamically created routes generated by your SPA (for example, the ones listed in your sitemap.xml). If you are still experiencing issues, please double check that you put your base href above your stylesheets!
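For setups without Apache, the same fallback idea can be sketched in Node.js using only built-in modules. This is an illustration under stated assumptions rather than part of the original setup: the `public` directory, the port, and the `resolveFile` and `createSpaServer` helpers are all hypothetical names.

```javascript
const fs = require('fs');
const http = require('http');
const path = require('path');

// Hypothetical web root that holds index.html and the static assets.
const ROOT = path.resolve('public');

const TYPES = { '.html': 'text/html', '.js': 'text/javascript', '.css': 'text/css' };

function isFile(p) {
  return fs.existsSync(p) && fs.statSync(p).isFile();
}

// Return the requested file if it exists under root; otherwise fall back
// to index.html so the client-side router can handle the path.
function resolveFile(requestUrl, root = ROOT, exists = isFile) {
  // Parsing drops the query string and resolves "../" segments.
  const pathname = new URL(requestUrl, 'http://localhost').pathname;
  const candidate = path.join(root, pathname);
  return exists(candidate) ? candidate : path.join(root, 'index.html');
}

function createSpaServer(root = ROOT) {
  return http.createServer((req, res) => {
    const file = resolveFile(req.url, root);
    const type = TYPES[path.extname(file)] || 'application/octet-stream';
    fs.readFile(file, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end('Not found');
        return;
      }
      res.writeHead(200, { 'Content-Type': type });
      res.end(data);
    });
  });
}

// To try it locally: createSpaServer().listen(8080);
```

As with the .htaccess rules, real files are served as they are and every other path falls through to index.html.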

Now you should be able to verify, using Google Search Console’s Fetch as Google with the Render option, that Googlebot can indeed process your AngularJS 1.x Single Page Application. You can even request indexing to see what Googlebot crawls.
