WCSEA 2015: Panel — WordPress at Scale: Enterprise, Media, and Education

I’m at WordCamp Seattle today and will be posting notes from sessions throughout the day. These are posted right after the session, and could be a little rough.

This was a panel discussion moderated by Grant Landram, with Evan Cordulack, Jeremy Felt, Josh Kadis, and Nathan Letsinger. Follow those links to their Twitter pages. Here is the WordCamp.org session description.

Introductions

Moderator Grant Landram: Senior project manager at 10up.com, previously with FreshMuse.

Jeremy Felt: Platform Manager at Washington State University. Previously senior engineer at 10up.

Josh Kadis: Internal and external product development at Alley Interactive, and Seattle dev meetup organizer. Previously the senior technologist at Quartz.

Evan Cordulack: Engineering manager for the Seattle Times.

Nathan Letsinger: Product lead at Grist, designer and developer.

Topic 1: What is “Scale”?

What does scale mean or look like for you and your organization?

Josh: Scale means traffic, pageviews, load on server. But there’s also internal users, scale of code base, database.

Evan: Scale means that as the database gets bigger, the things we thought WP was good at out of the box leave us with some hangups. We still fight with that every day.

Nathan: Scale means we needed to deal with just the sheer number of users. Whether it’s 2 dozen or 600, that many users who need to log in raises a lot of concerns, security among them. Many people logging in at the admin level means you should evaluate security. We have 700 bylines for posts, and didn’t want that many fake users in WP.

What special considerations do you have?

Jeremy: When you’re no longer running a single WP site, a lot of caching considerations get really complicated. Also a lot depends on your budget. With more money, have a sysadmin work on this for you. If you don’t, there’s a lot you can get away with on a single Linode with Memcached. Even if you’re paying someone to do this for you, it makes sense to understand what they’re doing for you, so you can write code that works well in sync with the systems you have in place.

Josh: The use case most people have for WP is one where every non-logged-in user gets the same HTML markup. If you’re working on a BuddyPress site, then that’s a totally different story. But normally, traffic can quickly become a non-issue with full page caching.

Evan: We try to use caching at every level. We use Akamai, then Varnish, Memcached… and at the WP API level, we use transients everywhere to take advantage of the Memcached object cache. We managed to totally kill page loading speed early on in development. Once we started taking advantage of caching APIs, it shaved off seconds. There are things built in that get you really far, really fast, whatever level you’re at, even if you don’t have object systems in place. And it’ll make you feel like a pro.
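As a rough illustration of the transient pattern Evan describes, here is a minimal sketch of a "compute once, then serve from cache" helper. The helper name `demo_cached()` and the cache key are hypothetical, not from the panel. `get_transient()` and `set_transient()` are the real WordPress functions; the fallback stubs at the top exist only so the sketch also runs outside WordPress.

```php
<?php
// Fallback stubs so this sketch runs outside WordPress (assumption: demo only).
// Inside WordPress, the real get_transient()/set_transient() are used instead.
if ( ! function_exists( 'get_transient' ) ) {
    $GLOBALS['demo_transients'] = array();

    function get_transient( $key ) {
        return isset( $GLOBALS['demo_transients'][ $key ] )
            ? $GLOBALS['demo_transients'][ $key ]
            : false; // Like WordPress, a miss is reported as false.
    }

    function set_transient( $key, $value, $expiration = 0 ) {
        $GLOBALS['demo_transients'][ $key ] = $value; // This stub ignores expiration.
        return true;
    }
}

/**
 * Return a cached value, computing and storing it only on a cache miss.
 * Hypothetical helper name, used here for illustration.
 */
function demo_cached( $key, $ttl, $compute ) {
    $value = get_transient( $key );
    if ( false === $value ) {
        $value = call_user_func( $compute ); // The expensive work happens here.
        set_transient( $key, $value, $ttl );
    }
    return $value;
}
```

One caveat with this pattern: because a miss is signalled by `false`, a value that is legitimately `false` is recomputed on every call.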

A lot of managed hosts disable caching for logged-in users. So if you have high logged-in user traffic, you need to be more strategic about your caching strategy.

Nathan: You want to worry about caching. For us that meant full page caching. A story about this: Before we were on WP, we had that problem. Our best days for editors had huge traffic spikes. Those were our worst days for our developers, when we were having to spin up new servers, etc. We learned about Batcache from the WordPress community, and ported it over to our old system. But in the process we learned enough about the WP ecosystem that it convinced us to embrace WP generally.

Nathan: You need to know the constraints your caching systems should have in place, so that you know what not to do to break things. For example, we can’t add database tables without also having to modify our caching systems to accommodate those. I actually found these constraints freeing.

What other tools are you using?

How do you choose which systems to use?

Nathan: It also depends on your passion. We’re not sysadmins, but we were playing that role part time. Can we hire someone to do this for us? We decided to pay someone to just solve this for us. Even if we felt we were paying too much, there’s a lot of value in peace of mind. So when traffic is spiking on a Saturday, I can keep sipping my Manhattan and not worry about it.

Topic 2: Scaling Strategies

Dev Team Sizes for your organizations

Jeremy: 20 or so on the team doing content, design, dev, everything. We’re supporting 600 sites and around 1,000 users. Central univ support.

Josh: We have 40 people at Alley Interactive including non devs (PMs, design, etc.). On a given project we may have as few as a dev and a PM. But sites we support with caching strategies are as large as the NY Post, which is one of the largest WP sites anywhere. They’re hosted on WP.com VIP which handles scaling, traffic, caching in the same way for us as it does for anyone else. That enables us to be the dev team for them with a relatively small team because we don’t need to worry about it.

Evan: A dozen devs. While there are a lot of users in the system at a time, there is also content being injected from various places constantly.

Nathan: It’s 2 devs and myself at Grist.

Key parts of your tool set (tech, process, organizational) for scaling

Jeremy: Having a local environment that matches production as closely as possible is important. After that I rely on Nginx and MySQL (although Zach Brown would recommend alternatives to MySQL like MariaDB).

Josh: For sites with very large databases, e.g. the New York Post, which has a million database rows, we use Elasticsearch for queries (not just user-facing search). Even if you already know the ID of a post, in a large database a post-specific query can be expensive.

Nathan: Lots of users writing means you need something like a calendar for scheduling, and some kind of chat program so everyone’s in touch w/ each other. This is pretty essential for the overall team.

The Reliability of Elasticsearch

[A question from me, hence the greater detail in this document:] Josh mentioned depending on Elasticsearch for Queries. Kyle Kingsbury posted some research into Elasticsearch last year that raised questions about its reliability. Have you encountered that or had to deal with it in production?

Jeremy: Automattic makes heavy use of Elasticsearch, with people doing the work full time. If they can trust it and they’re using it at insane scale, even though they have to rebuild their indexes occasionally, then it’s probably fine.

Evan: There are a lot of queries that are surprisingly taxing at scale. Times Elasticsearch has failed for us are usually times that we did something wrong. You can write integration stuff that’ll make your life easier. Inevitably you’ll have to reindex and it’ll be a drag.

Zach Brown, from the audience: Basically, don’t use it as canonical. It’s just a way to access info quickly.

Workflow for moving content from outside of WP into WP

Jeremy: Even though we think TinyMCE and DFW are neat, nobody uses them, because people pass things around in Word and then copy/paste them into WP.

Josh: At Quartz it became the policy that stories had to be written in WP. I could still tell who wasn’t doing that. But we were able to help allay concerns from journalists who were a little put off by the interface: even if it’s off-putting at first, it’ll become familiar eventually. “Word was new to you too, once, and this will get better eventually.”

Evan: We don’t have anything like an editorial calendar, and we don’t have a good way to do this.

[After a follow-up question from the audience about content/workflow policy governance and enforcement:] We rely on WP for curation: what shows up where. That’s where WP does the most work for us. Revisions of the story will happen in a different editing platform, the reason being that we have to feed the printing press. Some would like to do digital first, but “that is not my department”.

Nathan: Autosave has been great for the concern about browsers crashing.

Do you run your sites through performance analyzer tools?

Jeremy: I obsess over websitetest.org. On launch days I’m refreshing every 10 minutes trying new things, trying to get others out.

Josh: We’re taking the opposite approach: Starting a project by structuring build tasks, so that when you get to the end, Grunt or Gulp or whatever you’re using has already been done. [In other words, this is a philosophy of beginning with high performance as your baseline, rather than revising to get it right at the end.]

Evan: Since these tools are easy to use once you show someone, then everyone cares about performance. This is good and bad because meetings get really weird really fast (talking about which DOM events matter with non-devs is strange), but good to have everyone considering it.

Do you use tools to find performance issues specifically in code?

Jeremy: John Blackbourn’s Query Monitor plugin. Using things like Xdebug or PhpStorm with step debugging can identify loops that are running multiple times and slowing things down.

Josh: Peer code review is huge for performance because you can rely on the experience of your team, seeing what things are slow, what’s caused problems for them in the past, etc. Leverage your embodied experience.

Besides plugin updates and installs, what changes when you move to a multiserver load-balanced environment?

Jeremy: I’m only running one VM, but… it becomes more to manage, more pipes to keep connected, so more things to worry about going down. But, scaling horizontally by having more redundant things can be a less painful thing where everything’s beefy and fast…

Evan: We have a ton of internal conversations about this, and everyone has an opinion. Whatever it is, take your time. If you’re using a cloud platform, you can experiment with scaling things up and down. (If you’re having to do this in production, that’s another story.)

For large editorial teams, how do you manage training as you make continual changes?

Jeremy: Every Friday morning we have an open lab on campus for people to come talk. We say “we’re working on this, here’s a preview” and we get good feedback.

Josh: It’s important to treat dev of internal WP functionality as a UX problem in the same way you would treat a design change on the front-end. Without completely redesigning the admin interface there is still a lot you can do to make a feature intuitive and easy for people to figure out (or totally baffling and unintuitive). There’s a degree of design thinking you can apply to internal plugin dev that I’ve found helpful. Applying it even briefly and semi-formally to an internal plugin can really help speed up adoption.

Evan: If new changes require one-time actions before you can use them or before the UX returns to normal, and people don’t know about it, that upsets people. It’s important to give people a heads-up about changes that are coming.

Architecture constraints migrating to WP from other platforms

Josh: The constraint is less about the total number of content objects than it is about the differences in information architecture. The great thing about migrations is that you’re typically not that concerned with performance (within reason). So when you start the migration script if you’re talking about 6 hours or 10 hours to finish, it doesn’t matter so much… But if you can’t get a clean map of information architecture, that can be difficult.

Evan: We had a lot of problems with DB performance. It would get to a certain size where we were timing out, particularly around post_meta. It was failing in mysterious ways. We were in a hosted environment where there were settings we didn’t have access to. Lots of little roadblocks to clear, remove one, hit another, so we had to get through a bunch of that.

Nathan: We use Edit Flow, which includes the editorial calendar. There are many media companies and even mid-sized blogs using this who need a calendar. This is an area for growth. We’re testing CoSchedule, which is a monthly paid service, and has great features, but is off of WP’s architecture. I think this is an area where media companies have a need.

Have you encountered bloat from leaving revision management active?

Josh: On one occasion I’ve run a wp-cli script to clear out revisions on very old posts.

Nathan: That’s WordPress.com VIP’s problem. :)

What do we need to know about what’s coming?

Jeremy: The easy answer is the JSON REST API and having everything available to you via JSON.

Nathan: Improvements to taxonomy terms will open a lot of opportunities.

This post is part of the thread: 2015 WordCamp Seattle Live Notes – an ongoing story on this site. View the thread timeline for more context on this post.

WCSEA 2015: Taylor McCaslin — Multisite Network Do’s and Don’ts — Experience from an Enterprise Solution


This is a talk from Taylor McCaslin, Product Manager for WP Engine. Find Taylor at @taylor4844 on Twitter and find WP Engine on the web at wpengine.com (affiliate link). The WordCamp.org session description is here.

I missed the first few minutes of the talk, unfortunately, so these are less comprehensive notes than I would like.

Clarifying What is Multisite

Multisite is Dr. Jekyll & Mr. Hyde. Your best friend and your worst enemy. People generally don’t actually know what it is, or what’s different about it.

A multisite network is a collection of sub-sites that all share the same single WP installation.

What is Multisite Not?

It is not a network that can be moved to separate hosts. It’s 1 host, 1 install, many sites.

It is not a group of sites that can be easily separated into their own separate installs later. You can do this, but you don’t want to have to. (Modifying serialized data is not fun.) When people think they want to go there, talk really hard about why.

Multisite is not a set of sites with different IP addresses. They all share the same IP address. This can ruin a multisite: one bad actor can get every site on the network blacklisted. The shared IP address is a single point of failure for every site on the network.

Configuring Multisite

Different Multisite Modes: Open, or closed?

Two options. This terminology is hard, because the words mean different things in different contexts. Here’s the current consensus terminology among the core developer community:

Public Network / Untrusted

Anyone can sign up and create a site (sometimes paid). e.g. WordPress.com, Happytables.com, university student blogs (which are great for giving your faculty and students the power of .edu to easily rank highly in search results).

Concerns: file types / uploads, scripts / embeds, copyright. Your users can harm you and each other.

Private Network / Trusted

Limited sites and user creation. WordCamp.org is an example. Company intranets for different departments, etc.

Concerns:

  • Too many cooks. “Just make me a superadmin”. Uh, no! Just, no.
  • Do you have someone to manage all of this? One WP site on its own is a lot to manage. Updating a plugin that breaks on the network breaks everywhere.
  • Code changes affect all sites!

Sub folders vs Sub domains

[I missed some notes here, but I feel this is fairly well documented elsewhere.]

Domain Mapping

Plugin for this: wordpress-mu-domain-mapping

Use CNAMES! Not wildcards. Allows you to manage the whole thing with a single DNS when you need to migrate [I think that is what he said.]

wpmudev.org has premium plugins for selling domains to users. Taylor recommends these.

Benefits of Multisite

Super Admin Role

The super admin has many WP capabilities that regular administrator-role users don’t have, including unfiltered_html. The super admin role is super dangerous for this reason. DO NOT give people access to it. They will ask for it, but just don’t.

Shared Users

  • All blogs have central user management. This is one of the biggest reasons to use Multisite.
  • This can have weird consequences. e.g. on New York Times blogs, which are hosted on WordPress.com’s VIP network, you see the WP.com toolbar. This is a bit weird for NYT’s branding. This also makes privacy a bit harder.
  • Doesn’t play well with alternative login plugins for 2 factor auth (Duo2, 2FA, Google Authenticator). You’re logged in everywhere (perhaps not the intention if you’re using 2FA), but you still need to 2FA into each site individually (probably not expected if you’re using MS).
  • User profiles are the same for all sites.

Shared Themes

  • Add a theme
    • Network enable (all sites)
    • Restrict themes available to use per site (Site -> Edit Site menu)
  • Remember to add Child Themes.
    • If users have file editor access, their changes to themes are network wide. (Whoa.)
  • If you network enable the theme, activate it on a site, and then network disable the theme, it is not deactivated on the site that has activated it.

Shared Plugins

  • Install plugins on the network
    1. Activate per site
    2. Network Activate
    • note that this is distinct from how themes work. This is probably a better system than how it works with themes.
  • Must use plugins
    • Are enabled and active network-wide
    • Can’t be deactivated through the admin
  • Some plugins have their own network settings
    • Not all plugins do this the same way or the recommended way.

Structural Differences

File structure:

[My markdown rendering doesn’t handle indentation for file structure diagrams very well, so please bear with me using code blocks to preserve indentation for these.]

* /
    *   wp-config.php — has extra lines
    *  .htaccess — has extra lines
    *  wp-content — has extra subfolders, including site-specific folders
        *  Rely on the host to allow you to grant your site admins access to specific site directories — and only those for their own sites.

Uploads folders:

* wp-content/
    * uploads/
        * 2015/ — this is for your primary (first) site on the network
        * 2014/ — this is for your primary (first) site on the network
        * /sites/
            * 2/ — site ID
                * 2015/
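
To make the layout above concrete, here is a small sketch that maps a site ID and year to its uploads folder. The function name is hypothetical, and it assumes the primary site has ID 1 and that folders are shown at the year level, as in the diagram.

```php
<?php
/**
 * Build the uploads path for a site, following the layout sketched above.
 * Hypothetical helper; assumes the primary (first) site has ID 1.
 */
function demo_uploads_dir( $site_id, $year ) {
    $base = 'wp-content/uploads';

    // The primary site keeps the classic, un-nested location.
    if ( 1 === (int) $site_id ) {
        return $base . '/' . $year;
    }

    // Every other site gets its own folder under sites/<site ID>/.
    return $base . '/sites/' . (int) $site_id . '/' . $year;
}
```

For example, `demo_uploads_dir( 2, 2015 )` yields `wp-content/uploads/sites/2/2015`, while site 1 stays at `wp-content/uploads/2015`.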

Multisite databases have at least 17 tables, instead of the standard 11. The 6 extras are:

  • wp_blogs
  • wp_blog_versions
  • wp_sitemeta
  • wp_site
  • wp_signups
  • wp_registration_log

Here’s the kicker: there are 9 prefixed tables per every new site!!!

  • wp_6_posts — prefixed with site id
  • wp_posts — first site (not prefixed)

Formula for number of tables in the database: 8 + ( 9 * n ), where n is the number of sites.
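
The naming rule and the formula above can be sketched in a few lines. Both function names are hypothetical, the `wp_` prefix is just the default, and the sketch assumes the first site has ID 1 and that no plugins add tables of their own.

```php
<?php
/**
 * Name of a per-site table: unprefixed for the first site,
 * prefixed with the site ID for every other site.
 */
function demo_site_table( $site_id, $table, $prefix = 'wp_' ) {
    if ( 1 === (int) $site_id ) {
        return $prefix . $table;
    }
    return $prefix . (int) $site_id . '_' . $table;
}

/**
 * Table count from the formula in these notes: 8 global tables
 * plus 9 per-site tables for each of the n sites.
 */
function demo_table_count( $sites ) {
    if ( $sites < 1 ) {
        throw new InvalidArgumentException( 'A network has at least one site.' );
    }
    return 8 + ( 9 * $sites );
}
```

With one site this gives the 17 tables mentioned above. By the same formula, a network of 600 sites would have 5,408 tables, which is part of why shared hosting struggles with large networks.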

Recommendations

Use a managed host. Multisite is already hard enough as it is, and doesn’t perform on shared hosting.

  • Ask for automatic backups with 1 click restore (ability to download backup)
  • built in staging sites (that magically work with the networked sites)
  • granular deploy to production controls (deploy only specific sites)
  • look for extra security features (automatic ip blacklisting, whitelisting, etc.)
    • Jetpack includes some new shiny of leveraging wp.com’s blacklist data for bad actors.

Recommended wp-config.php additions:

define( 'DISALLOW_FILE_MODS', true ); // disable the Admin File Editor (this also blocks plugin and theme installs and updates from the admin)
define( 'DO_NOT_UPGRADE_GLOBAL_TABLES', true ); // prevent upgrade functions from doing expensive database queries
// [I missed one here.]

Do not ever do this anywhere in your code base: current_user_can( 'unfiltered_html' ); You will regret this.

Don’t loop through your network sites. Unless you know what you’re doing, you’ll cripple your site performance, if not crash your site altogether.

Note Regarding Q & A

I didn’t capture Taylor’s use cases for Multisite during Q & A, with thumbs-ups and thumbs-downs on using multisite for various reasons. This is probably worth catching the wordpress.tv talk when it is available.


WCSEA 2015: Matt Johnson — Content Migration: Beyond WXR


This is a talk from Matt Johnson, team lead at Alley Interactive. Find Matt at @xmatt on Twitter and at alleyinteractive.com on the web. Here’s the WordCamp.org session description.

What is migration

There’s an old site, and you’re making a new one. Your old site has content; you need to move it to the new site.

Client attitudes vary. Some are obsessed with migration. These are great because they give you a clear heads-up. Others don’t think about it at all unless you do.

Clients are often surprised when things get complicated, because they imagined it would be really simple. That got laughter in the audience, but Matt pointed out it makes sense: Content migration isn’t something you notice unless it goes wrong, so most people don’t know to think about it.

Content migration can be the fun part of a project

Content migration can involve some of the more interesting problems to solve, such as reverse-engineering weird legacy systems. You may get to write code just to extract it, and then clean up old content from bad code. Then there’s the satisfaction of processing hundreds of thousands of posts with a single CLI command.

Content migration can be the least fun part of a project

Bad things can happen, such as legacy content in Windows-1252 content encoding, when WP speaks UTF-8. Sometimes meta data is completely missing.
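
For the encoding problem, one possible approach (a sketch, not something shown in the talk) is to convert only strings that are not already valid UTF-8, so content that was stored correctly is not double-encoded. The function name is hypothetical; `preg_match()` with the `u` modifier and `iconv()` are standard PHP.

```php
<?php
/**
 * Convert legacy Windows-1252 text to UTF-8, leaving valid UTF-8 untouched.
 * Hypothetical helper name. Assumes anything that is not valid UTF-8 is
 * Windows-1252, which should be confirmed for each legacy source.
 */
function demo_legacy_to_utf8( $text ) {
    // With the "u" modifier, preg_match() only succeeds on valid UTF-8.
    if ( 1 === preg_match( '//u', $text ) ) {
        return $text;
    }

    // iconv() returns false if it meets a byte it cannot map.
    return iconv( 'Windows-1252', 'UTF-8', $text );
}
```

The heuristic is imperfect: a Windows-1252 string whose bytes happen to form valid UTF-8 is passed through unchanged, so spot-checking migrated content is still worthwhile.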

“Oh hey, we have this microsite we forgot to tell you about until right now, a week before the launch. Can we just merge it into the main site’s migration?”

With techniques from this presentation, the answer to that can be “Yes (here’s a change order) and yes!”

The basics

Content migration is moving all of your user-generated content from one place to another, accurately. Sometimes the old data maps to the new data really easily. Other times, migration is part of a project that also overhauls the site’s information architecture, as opposed to just a “face lift”. In these cases, newly migrated data needs its structure changed from the old information architecture to the new one.

The scale of the project does not necessarily correlate with the difficulty of migration. A project with major information architecture changes and relatively little content can have a much more difficult content migration than a project with a lot of content and very few information architecture changes.

The types of migration approaches

The easiest: WXR out, and WXR in. WXR stands for WordPress eXtended RSS. WXR files are generated by the Export menu item in the WordPress Tools menu, and ingested by the Import tool.

You can almost never use this method. Common reasons include that the old site isn’t WordPress; the new site handles images differently; or the new site has a new information architecture, such as:

  • When switching from users-as-authors to Co-Authors Plus (a plugin that allows you to have authors on the byline who are not users in WordPress, which news organizations frequently want)
  • When loading custom metadata into Fieldmanager [?]
  • Re-mapping taxonomies
  • Re-mapping content types (custom post types)

Approaches to migration

Plan A: Make your own WXR

This is unwieldy. It requires writing custom code, and it’s code that must write XML, which is hard. You’re also still limited by the format of WXR.

As long as you’re writing custom migration code, why not take total control?

Plan B: Fix up your data after WXR

Run a WXR import, and see what went wrong or is missing, then troubleshoot using a WP-CLI script to finish this up.

A detour into using WP-CLI: WP-CLI is perfect for this, especially its extensibility, which allows you to write custom WordPress code to run on demand from the command line. You could write custom code from a tool page, but there are runtime limits, and you need to work harder to create (even limited) UI that you just don’t need on the command line.

Doing this is easy:

if ( defined( 'WP_CLI' ) && WP_CLI ) {
    require_once( MY_THEME_DIR . '/inc/class-migration-cli.php' );
    // Register the class so each public method becomes a `wp migration <method>` subcommand.
    WP_CLI::add_command( 'migration', 'Migration_CLI' );
}

/**
 * /inc/class-migration-cli.php
 *
 * In this example, for the sake of brevity, we're omitting
 * output for debugging, which unless you're an evil genius,
 * you need.
 */
class Migration_CLI extends WP_CLI_Command {
    public function fix_my_data( $args, $assoc_args ) {
        $per_page = 100; // Work in batches to keep memory use bounded.
        $page = 0;
        do {
            $posts = get_posts( array(
                // ... Your WP_Query arguments here.
                'posts_per_page' => $per_page,
                'offset' => $per_page * $page++,
            ) );
            foreach ( $posts as $post ) {
                // Do your stuff here.
                wp_update_post( $post );
            }
        // A short (or empty) batch means we've reached the end.
        } while ( $per_page === count( $posts ) );
    }
}

To run this command, just do:

$ cd /var/www/my_wp_site.com
$ wp migration fix_my_data

Plan C: Goodbye WXR, Hello ETL

The advantage here is to get away completely from the WXR limitations.

ETL is a venerable computer science term that means extract, transform, load. It’s the most common pattern for custom migration scripts.

Another custom WP-CLI migration class example:

/**
 * /inc/class-migration-cli.php
 */
class Migration_CLI extends WP_CLI_Command {
    public function migrate_data( $args, $assoc_args ) {
        $this->connect_to_legacy_source();
        // has_legacy_data() returns true if there's more to process
        while ( $this->has_legacy_data() ) {
            // Extract:
            // get_legacy_post() gets the next item, and increments the counter
            // This can also do a lot of your heavy lifting in extraction
            $row = $this->get_legacy_post();
            $post = array(
                'post_type' => 'post',
                'post_title' => $row['title'],
                'post_content' => $row['content'],
                'post_date' => date( 'Y-m-d H:i:s', strtotime( $row['date'] ) ),
            );
            // Transform:
            if ( $row['is_slideshow'] ) {
                $post['post_type'] = 'slideshow';
            }
            // Load:
            $post_id = wp_insert_post( $post );
            update_post_meta( $post_id, 'legacy_id', $row['id'] );
            if ( $row['is_slideshow'] ) {
                update_post_meta( $post_id, 'slides', $this->get_legacy_slides( … ) );
            }
        }
    }
}

Important: update_post_meta() is idempotent, which means you can run it multiple times and get the same result as running it once. wp_insert_post() is not: it creates a duplicate post each time you run it. To make the whole script idempotent, check whether a WP post already exists for this legacy item, and update it if it does:

if ( $post_id = $this->new_post_exists( $row['id'] ) ) {
    $post['ID'] = $post_id;
    wp_update_post( $post );
} else {
    $post_id = wp_insert_post( $post );
}

[This is brilliant!] This means you can iteratively improve your data if you need to add to it, instead of starting from scratch each time with a clean database.

  • has_legacy_data(): returns true until there are no legacy items left.
  • get_legacy_post(): returns an array with the next legacy item.
  • get_legacy_slides(): returns some special structured data (like slides in a slideshow).
  • new_post_exists(): returns the post_id of the WP post with this legacy id, or false if there isn’t one.
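
To show why the existence check makes re-runs safe, here is a self-contained sketch that swaps the WordPress database for in-memory arrays. The class name, its properties, and the sample rows are hypothetical scaffolding. In a real migration, `new_post_exists()` would presumably look up the `legacy_id` post meta that the script stores.

```php
<?php
/**
 * In-memory stand-in for an idempotent extract/transform/load run.
 * Hypothetical demo class; arrays replace the WordPress posts table.
 */
class Demo_Migration {
    public $posts = array();          // new post ID => post data
    private $legacy_index = array();  // legacy ID => new post ID
    private $next_id = 1;

    // Returns the post ID already created for this legacy ID, or false.
    public function new_post_exists( $legacy_id ) {
        return isset( $this->legacy_index[ $legacy_id ] )
            ? $this->legacy_index[ $legacy_id ]
            : false;
    }

    public function migrate( array $legacy_rows ) {
        foreach ( $legacy_rows as $row ) {                    // Extract
            $post = array(                                    // Transform
                'post_type'  => empty( $row['is_slideshow'] ) ? 'post' : 'slideshow',
                'post_title' => $row['title'],
            );

            $post_id = $this->new_post_exists( $row['id'] );  // Load
            if ( false === $post_id ) {
                // First time we see this legacy item: insert.
                $post_id = $this->next_id++;
                $this->legacy_index[ $row['id'] ] = $post_id;
            }
            // Insert and update both end here, so re-runs overwrite in place.
            $this->posts[ $post_id ] = $post;
        }
    }
}
```

Running `migrate()` twice over the same rows leaves the same number of posts, and a row corrected between runs simply updates the post it already created.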

Typical Legacy data formats to work with

  • A MySQL database (any weird schema)
  • A pile of XML or JSON files
  • An RSS feed
  • A REST API (yay!)
