Unexpected Situation

Over the weekend, WishList Products experienced an unexpected situation that affected roughly 10% of our customers. It has since been resolved.

We’d like to inform you of what happened along with what we have done to correct it and prevent it from happening again.

When WishList Member is purchased, each customer is issued a “license key” to verify the purchase.

With that in mind, WishList Member checks the license key on an ongoing basis to verify that each site is using the correct license and to issue product updates to those sites accordingly.

The problem that occurred this weekend arose because the “activation server” that verifies the license key went down.

So when WishList Member on those sites tried to verify the license, there was nothing available to confirm it was valid (because our server was down).

This caused roughly 10% of our customers to receive a “500 internal error”.

As soon as we heard about our server being down (we heard first from our customers rather than our hosting provider), we immediately sprang into action to resolve the issue.

This being the first time something like this has ever happened in our company’s history (3+ years), we weren’t sure what was all of a sudden causing this server to crash.

Nevertheless, we got the server back up, everything was fixed and after monitoring it for several hours, it looked to be running smoothly.

Then, the following day it happened again.

This was baffling to us as this was two times in as many days and prior to that, it had never occurred.

Why was this all of a sudden happening?

We immediately got the server back up and then went into “investigation mode”.

After hours of analyzing the situation, we found out that it had to do with specific limitations of the server we were using (something we didn’t initially foresee).

That’s when we decided to make some significant changes in the way our verification process operates.

We have now implemented a situation whereby the “verification check” must fail multiple times before the WishList Member license is deactivated.

Plus, even when it is going to be deactivated, the customer will have a number of days prior to the deactivation and will be made aware of that with several messages.

This will eliminate the situation we had this weekend because the sites would continue working fine until the next check. If the next check also failed, the customer would still have plenty of time to prepare.

The time for this whole process to complete before the final deactivation would be roughly a couple of weeks. So, as you can see, this changes the situation completely from what happened this weekend when our server was down for a few hours.
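
To make the idea concrete, here is a minimal sketch in Python of how a “fail multiple times, then allow a grace period” check could be tracked. It is an illustration only, not WishList Member's actual code: the names LicenseState and record_check, the threshold of 3 failed checks and the 14-day grace period are all assumptions chosen to match “multiple times” and “roughly a couple of weeks”.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed values for illustration; the post only says "multiple" failed
# checks and "roughly a couple of weeks" before final deactivation.
MAX_FAILED_CHECKS = 3
GRACE_PERIOD = timedelta(days=14)


@dataclass
class LicenseState:
    """Hypothetical per-site record of recent verification results."""
    failed_checks: int = 0
    first_failure: Optional[datetime] = None

    def record_check(self, succeeded: bool, now: datetime) -> str:
        """Record one verification attempt and return the license status."""
        if succeeded:
            # Any successful check clears the failure history.
            self.failed_checks = 0
            self.first_failure = None
            return "active"

        self.failed_checks += 1
        if self.first_failure is None:
            self.first_failure = now

        if self.failed_checks < MAX_FAILED_CHECKS:
            # A short outage on the server side changes nothing for the site.
            return "active"
        if now - self.first_failure < GRACE_PERIOD:
            # Still working, but the customer should be shown a message.
            return "warning"
        return "deactivated"
```

With a design like this, a server that is down for a few hours produces at most a failed check or two, and the site keeps running. Only repeated failures spread over the whole grace period would lead to deactivation.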

Furthermore, we have made significant upgrades to our server environment, introducing load balancers and mirrored servers (so if one goes down, the other kicks in to verify licenses), among other changes.
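
The “if one goes down, the other kicks in” behaviour can be sketched as follows. This is a simplified illustration in Python under stated assumptions, not a description of our real setup: a load balancer would normally do this on the server side, whereas the sketch shows the same fallback logic from the caller's point of view. The function name verify_with_failover and the check callback are hypothetical.

```python
from typing import Callable, Iterable, Optional


def verify_with_failover(
    license_key: str,
    servers: Iterable[str],
    check: Callable[[str, str], bool],
) -> Optional[bool]:
    """Ask each server in turn whether the key is valid.

    Returns the first answer received, or None if no server could be
    reached. `check(server, license_key)` is a caller-supplied function
    (hypothetical here) that raises ConnectionError when a server is down.
    """
    for server in servers:
        try:
            return check(server, license_key)
        except ConnectionError:
            # This server is unavailable; fall through to the next mirror.
            continue
    return None
```

Returning None when every server is unreachable keeps “could not verify” separate from “license is invalid”, which is the distinction that matters when the outage is on the verifying side.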

We are deeply sorry to those customers who were affected.

We take this very seriously as we pride ourselves on our products and the service we provide to our customers.

In addition, we too have sites using our own software (and we were affected as well – and we don’t want anything like that happening either).

Moving forward, we are very confident in the changes we have made and look forward to continuing to hear about your success.

Take care,

Stu McLaren
Co-Founder, WishList Products


About Stu McLaren

Stu McLaren is the Co-Founder of WishList Products. His passion for building online communities served as the catalyst for the development of the membership platform WishList Member. Today he uses WishList Member for his own membership sites as well as to serve World Teacher Aid (www.WorldTeacherAid.org), the charity he co-founded with his wife.


30 Responses to Unexpected Situation

  1. Cortney Wanca January 25, 2012 at 2:20 pm #

    Just to be clear, is the new verification check process handled entirely on the activation server side? Or is a Wishlist Member plugin update also going to be necessary on our end?

    Thanks!

    • WPWL Team January 25, 2012 at 2:33 pm #

      Everything is up and running with no issues now, so no update is required :)

  2. Chris Zavadowski January 25, 2012 at 2:21 pm #

    Hey guys,

    As one of the folks affected, I appreciate how quickly you jumped on fixing things. And I also appreciate the transparency and honesty in your update above. Kudos on handling this well!

    Keep up the good work!

    Chris :)

    • Stu McLaren January 25, 2012 at 4:11 pm #

      Thanks Chris – that means a lot.

      Seriously.

  3. Nigel Merrick January 25, 2012 at 2:23 pm #

    Thanks, Stu, for clarifying this for us – I did notice this on my site over the weekend, but it didn’t cause me too much disruption.

    As always, you guys do a great job of keeping on top of things and, just as importantly, keeping us in the loop about what happened and how you solved it.

    Many thanks

    Nigel

    • Stu McLaren January 25, 2012 at 4:14 pm #

      We appreciate that Nigel.

      It was one of those fluke situations but it’s comforting to know that we have understanding customers such as yourself.

  4. Darren January 25, 2012 at 2:27 pm #

    Hey, yes, my site was affected; however, as this is the internet, ***** does happen. No biggie, the problem was rectified fairly fast.

    Thanks for the post :)

    • Stu McLaren January 25, 2012 at 4:15 pm #

      Thanks Darren.

      I couldn’t agree more.

  5. Rob January 25, 2012 at 2:42 pm #

    I am just wondering why, after you ping the site once for a license check, sites would fail if you are just pinging for updates/alerts.

    The license had already been verified and authenticated. It doesn’t make sense to make a software stop working because it can’t fetch news alerts or updates. It’s been activated so should be able to continue running as normal, not shut a whole business down.

    Also what other information is being sent back? Number of members? Logins? Financial data? There are privacy concerns with any software that “phones home” on a periodic basis and I don’t see this listed anywhere on the site.

    Please clarify as this is confusing and I want to make sure I understand things before I build out my membership site.

    Looks like you’re on top of things and have solved the issue but maybe I am missing something. Wishlist is an impressive piece of software and I’m sure more good things and features are in store.

    • Stu McLaren January 25, 2012 at 5:05 pm #

      @Rob – There is a need to verify the license on an ongoing basis.

      When someone installs WLM and activates it, what then happens if they ask for a refund?

      How would we be able to verify that the people using WLM are those that have paid for it?

      Furthermore, each time WLM checks our system, it also looks for updates (which appear in your WLM dashboard). Not everyone is entitled to those updates (they are free for a year and after that they are $47/yr).

      As far as information being “sent back”, there isn’t any.

      We don’t have access to number of members, logins, financial data or anything like that.

      Hopefully that helps clarify a few things for you.

      Take care,

      Stu

      • Rob January 25, 2012 at 5:48 pm #

        Hi Stu, many thanks for the quick response. That makes sense, now I understand. It’s good to know that Wishlist is on top of things and addresses issues quickly.

        It’s nice to deal with a company where the owners are involved and have a personal interest in seeing us succeed!

  6. Pamela Rose Anders January 25, 2012 at 2:45 pm #

    Thanks for fixing the problem, and thanks for explaining things to us.

    Pam

    • Stu McLaren January 25, 2012 at 4:15 pm #

      You’re welcome, Pam.

      Hopefully it helped to understand what was happening “behind the scenes” :)

  7. K. Lendi January 25, 2012 at 2:54 pm #

    Thank you so much everybody at Wishlist for helping me this weekend. I was TOTALLY freaked out and I thought it was my fault. Your explanation and prompt service even on the weekend really helped me to feel much much better. THANK YOU THANK YOU THANK YOU!

    Kerry

    • Stu McLaren January 25, 2012 at 4:16 pm #

      Glad we were able to get everything figured out :)

      Now go get that membership site up and let us know about it! LOL

  8. Jan from faalangst January 25, 2012 at 3:22 pm #

    Hi Stu.

    Thanks for being clear on this issue. I’m in the building phase of my wishlist site and am amazed how well everything does work. And I was not amused when I found out about the problems this weekend. I’m glad everything does work fine now.
    Thanks

  9. Pierre Nouaille-Degorce January 25, 2012 at 3:32 pm #

    Thanks for those explanations. What you have done to correct the issues seems to be good.

  10. Stu McLaren January 25, 2012 at 4:19 pm #

    We appreciate it Pierre.

  11. Victoria Cook January 25, 2012 at 4:54 pm #

    Thankfully I wasn’t affected but appreciate the update and transparency! Keep up the great work!

  12. Jeff Pfau January 25, 2012 at 4:58 pm #

    Thanks, Stu.

    As a professional programmer, your solution to prevent this from happening again is exactly what I’d do too. Good job!

    In other words, your solution gives the benefit of the doubt to the site owner by letting the license test fail several times before deactivation of the license.

    Hopefully, you’ll space the licensing tests out (like once a day over 4 days in case things fail on a Friday night, no one’s site will go down over the weekend—even a holiday weekend—when help isn’t always available).

    Prompt (and automatic) notification of the issue is extremely helpful to your customers (the site owners) and a necessary part of the solution. This is true for affected customers and non-affected customers as well.

    I appreciate that you are very serious about this event never occurring again (as demonstrated by your physical server upgrades, and implementation of backup, rollover, and disaster recovery systems).

    Having worked in the Information Technology (IT) industry for nearly 30 years, I know that some disasters (such as this) simply cannot be foreseen because of the complexity between the different system components, hardware, data volumes, infrastructures, and business entities involved.

    I appreciate the speed at which the issue was addressed and your transparency in discussing the problem publicly.

    I have a few suggestions that I’m offering (only because they weren’t specifically addressed in your explanation above…and I feel confident that you probably have already implemented these suggestions or at least discussed them as part of your solution):

    1) Make sure your team members are notified automatically (by a separate system that monitors your critical system/server);

    2) have someone “on call” at all times in case of emergencies (we took turns carrying a beeper overnight as the “on-call” person who would be the first response to any problems, if needed, they would call in others to actually fix the issue); the on-call person was beeped automatically when the monitoring system discovered a problem;

    3) Make sure that your automatic notifications to your customers list an emergency phone number that your customers can call. You might tie this in to a special “outage” or “emergency” bulletin board where customers can go for hour-by-hour updates on the situation status (unless you LIKE getting continual calls from frantic people while you’re trying to fix the problem). ;) This emergency phone number and bulletin board URL should be posted on your main site also so that it’s easily findable.

    4) Also, your hosting company should know who to call when an emergency arises. If their hardware goes down, do they know how to contact your people if questions arise about how they should handle the emergency? You DON’T want them to make the judgement call—they’ll take whatever response is easiest and least costly to them (like “Just overwrite the existing database with last month’s (old) member data.”) You may need to sign a contract, a Service Level Agreement (SLA), which we often did, for this extra layer of service. Their role was specifically defined as to what THEY need to do during your emergencies when the issue is related to their hardware or services.

    Anyway, those are my recommendations. Some are immediately implementable, and the others are highly recommended to be put in place. And I’m sure you’ve got other, alternative solutions that you can implement.

    Again, I think your response to the whole situation was immediate and professional. Thank you for that. And I think that in the future, these types of events will be rare (or even completely avoided).

    ~Jeff

    • Stu McLaren January 25, 2012 at 5:10 pm #

      All good points Jeff.

      One of the changes that we made was moving to a new hosting company precisely for the reason you listed above – so someone could contact us when the server goes down.

      Our OLD hosting company wasn’t even aware that it was down.

      Secondly, after the first blip, they informed us of a service ($7.50/month) that would notify us if anything went wrong.

      We immediately upgraded.

      It didn’t work the next day when the server went down again.

      That’s when we moved to a different company.

  13. Tom January 25, 2012 at 5:09 pm #

    I am very much with Rob. I find it actually quite concerning that the software continually verifies the license and possibly sends data back to Wishlist Products.

    I can’t recall reading about this anywhere in the license agreement (maybe I overlooked that part).

    In addition, how can you fix the “call home feature” without installing any update on my end? Or does Wishlist download and install updates automatically?

    Can someone from Wishlist please clarify these issues? It’s all nice to write feel-good LOL messages in response to customers, but please address the more pressing questions raised by Rob.

    • Stu McLaren January 26, 2012 at 10:42 am #

      @Tom – Please see my response to Rob.

      I think it will clarify a number of things for you.

      Take care,

      Stu

  14. Jennifer Hoffman January 25, 2012 at 6:19 pm #

    My website wasn’t affected but I did notice some glitches in performance over the weekend and a page that wouldn’t display properly. Thanks for the info, it helps to know you are aware of the problem and acted immediately. Having worked in corporate IT as a sysadmin for a mission critical system (payroll!) with over 15k users, I know how disastrous a system failure can be and it often takes a system failure to create new processes that can handle a larger user base. If this had not happened you would not have been aware of this issue and your expanding user base will be better served by the processes you put in place to handle this crisis. Good work, I have been so impressed to date by the level of service and information I have received from WishList.

    • Stu McLaren January 26, 2012 at 10:44 am #

      @Jennifer – I think anyone that has dealt with “tech” on a large scale can appreciate the “oh crap” factor when/if it ever happens ;)

      Thanks for your kind words… we really do like receiving them from our customers.

  15. Rodney February 27, 2012 at 2:20 pm #

    I’d just like to say that I really like the way you handled this. First, you admitted that there was a problem. Second, you took very specific actions to make sure it doesn’t happen again.

    There are a few other companies out there, especially certain hosting companies who will remain nameless, that should really take a cue from you.

  16. Amin July 29, 2012 at 5:04 am #

    “We have now implemented a situation whereby the “verification check” must fail multiple times before the WishList Member license is deactivated.”

    That’s all great, but now, because your site was down a few times over the last few days (not your fault, I know), it made my sites fail to even load.

    I have no problem with you verifying that I’m a valid member. That’s just good business sense. But doing it in a way that can turn off all my sites through no fault of mine, *or yours*, is a serious weakness.

    There must be some model you can use for checking validity of a license that does not require 100% uptime for *your* site.

  17. David Hunt August 9, 2012 at 3:08 pm #

    Stu,

    This is well done, very admirable.

    I only wish I was getting the same response to the major problem I have been having and reporting to WishList for the past month, if not more. It happens whenever anybody posts on our site: WishList uses up all the resources of a dedicated server and craters the site, bringing down MySQL.

    Your support in general tells us that we should get a dedicated server, which we don’t have.

    It does this when posting just five or six simple text posts.

    We have been told by others that they could program this to fix it. Why can’t WishList?
