
Site Stability


RadioRob
This topic is 989 days old and is no longer open for new replies.  Replies are automatically disabled after two years of inactivity.  Please create a new topic instead of posting here.  


Over the past month, you may have noticed that the site would randomly choke up and stop working for a period of time...  typically anywhere from 5 minutes to an hour.  

In the background when this was happening there would be a large number of connections to the server that were just "stuck".  They were not closed, but not doing anything either.  Eventually this would reach a point where there would be thousands of these stuck connections that would just cause the server to run out of resources and fail.  

This was challenging to get to the bottom of because no error logs were generated and there was no indication of what was actually causing these connections to become "stuck".  It meant that I would literally have to keep checking the site as often as possible to make sure it was actually online.  I would wake up in the middle of the night to go to the bathroom and check to see if everything was still running.  I would check first thing when I woke up, during lunch, or in between Zoom meetings to ensure the site responded.  If not, I would have to restart the web service, which would reset everything and bring the site back online.  Restarting the server, however, was just a band-aid and did not address the root cause of the problem.  That's why we saw this happen over and over.  
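
The manual "is the site still up?" check described above could be automated.  Here is a minimal sketch in Python, assuming a plain HTTP request to the homepage is a good enough signal.  The function name `site_is_up`, the timeout value, and the injectable `opener` parameter are illustrative assumptions, not what was actually used on the server.

```python
import urllib.request


def site_is_up(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if `url` answers with a 2xx/3xx status within `timeout` seconds.

    `opener` defaults to urllib's urlopen; it is injectable (an assumption of
    this sketch) so the check can be exercised without touching the network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        # A server full of "stuck" connections typically ends up here because
        # it never answers before the timeout expires.
        return False
```

Something like this could run every few minutes from cron and send an alert (or restart the web service) when it returns `False`.  Like the manual restarts, that would only be a band-aid; it does not explain why the connections got stuck.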

I spent a lot of time reconstructing each failure, looking at what was happening in the time leading up to it.  At first, nothing out of the ordinary stood out when looking at any single event.  However, once I started comparing each incident to the others, I noticed a couple of things:

  • Just before each failure, there were a large number of requests from search engines such as Google and Bing.  They're trying to crawl the site so that our pages show up in search results.  While this can be good, the search engines tend to make a LOT of requests in a short period of time from a lot of different locations.  These requests, when added to our normal traffic, would cause a bottleneck.
  • While the search engine was crawling the site, it would look for many things that were no longer there.  This could be old profile pictures, or content that was deleted, etc.  Each time one of these "not found" objects was triggered, instead of the web server serving a standard "404 NOT FOUND" response, it would instead route to the forum software itself since it had to figure out if there was a different address for the content that should be returned.  (THIS TAKES UP A LOT OF RESOURCES!)

As a result, it looks like when search engines started crawling our site, they would trigger much more usage than normal, and the things they were doing consumed even MORE resources than the activities of a real person.  
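
For anyone curious how this kind of pattern can be spotted, here is a rough sketch of comparing the traffic before each incident.  It assumes the web server writes the common "combined" access log format and that crawlers identify themselves in the user agent; the regular expression, the `CRAWLER_MARKERS` list, and the `summarize` function are all illustrative, not the actual tooling used.

```python
import re
from collections import Counter

# Assumption: "combined" access log lines, e.g.
# 203.0.113.5 - - [01/Jul/2021:14:26:00 +0000] "GET /x HTTP/1.1" 404 512 "-" "agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative user-agent substrings; a real list would be longer.
CRAWLER_MARKERS = ("googlebot", "bingbot")


def summarize(lines):
    """Count total requests, crawler requests, and 404s in a slice of log lines."""
    stats = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like access log entries
        stats["requests"] += 1
        is_crawler = any(m in match["agent"].lower() for m in CRAWLER_MARKERS)
        if is_crawler:
            stats["crawler_requests"] += 1
        if match["status"] == "404":
            stats["not_found"] += 1
            if is_crawler:
                stats["crawler_not_found"] += 1
    return stats
```

Running `summarize` over the log lines from the window before each failure, and then over a quiet period, would make a spike in crawler traffic and "not found" requests easy to see side by side.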

To fix this, I've made several changes:

  • For users that are not logged in (such as Google, Bing, etc), content is not updated in "real time" like it is when you're logged in.  Instead, users who are not logged in will see a "cached" version of each page that is refreshed every 5-10 minutes.  
  • If a file is not found by the server, instead of letting IPB also look to see if there is a different address available, the web server will just return a message that it was not found.  
  • I've implemented more caching of common system files such as images and JavaScript.  The good part of this is that there are fewer requests to the server.  The downside is that if an IPB file changes, your browser might still have the old one cached, since it no longer fetches a new version on EVERY SINGLE page request.  (This is why you might have seen me tell people to clear their browser cache from time to time when a problem is reported.)
  • I've split the search system from the main site database.  By having these functions separated, several people searching can't hang up the site waiting for search results while others are waiting to load topics/posts, etc.  They can be done concurrently as separate tasks.  
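
To make the first two changes a little more concrete, here is a simplified sketch of the idea in Python.  The real changes live in the web server and IPB configuration, so everything here is an assumption for illustration: the `GuestPageCache` class, the `handle` function, the `STATIC_PREFIXES` paths, and the 5-minute lifetime (the low end of the 5-10 minute range).

```python
import time


class GuestPageCache:
    """Serve guests a stored copy of a page until it is `ttl_seconds` old."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the cache can be tested
        self._pages = {}        # path -> (expires_at, html)

    def get(self, path, render):
        now = self.clock()
        entry = self._pages.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]     # still fresh: no forum software involved
        html = render(path)     # expired or missing: rebuild once
        self._pages[path] = (now + self.ttl, html)
        return html


# Hypothetical URL prefixes for files that live on disk.
STATIC_PREFIXES = ("/uploads/", "/static/")


def handle(path, logged_in, cache, render, static_files):
    """Return (status, body) for a request."""
    if path.startswith(STATIC_PREFIXES):
        if path not in static_files:
            # Missing file: answer 404 right away instead of asking the
            # forum software to look for an alternate address.
            return 404, "Not Found"
        return 200, static_files[path]
    if logged_in:
        return 200, render(path)            # members get real-time pages
    return 200, cache.get(path, render)     # guests and crawlers get the cached copy
```

The point of the design is that the expensive `render` step runs at most once per page per cache lifetime for guests, and never runs at all for a missing file, no matter how many crawler requests arrive.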

Since making these changes and a few others, the server load has dropped by more than 75% and memory usage has dropped by almost half.  In addition, I have not seen any situations in which the server has locked up when being crawled by a search engine.  Finally, it should be making the site faster and more responsive.  For each page you view, there are fewer things that have to be requested from our server since your browser will now reuse certain "static" pieces of content.  When I benchmark the site's performance against a test from a month ago, it's about 40% more responsive.  This is pretty impressive considering most of our users are in the US and our server itself is located on the other side of the world in Amsterdam due to political/legal issues.  

At the end of the day, what I'm saying with all of this techno mumbo jumbo is that I think I have gotten to the bottom of what was randomly choking up the site.  And I think the fix should make the site a little faster and more responsive than it was before.  

Edited by RadioRob

4 hours ago, RadioRob said:

Since making these changes and a few others, the server load has dropped by more than 75% and memory usage has dropped by almost half.  In addition, I have not seen any situations in which the server has locked up when being crawled by a search engine.  Finally, it should be making the site faster and more responsive.  For each page you view, there are fewer things that have to be requested from our server since your browser will now reuse certain "static" pieces of content.  When I benchmark the site's performance against a test from a month ago, it's about 40% more responsive.  This is pretty impressive considering most of our users are in the US and our server itself is located on the other side of the world in Amsterdam due to political/legal issues.  

At the end of the day, what I'm saying with all of this techno mumbo jumbo is that I think I have gotten to the bottom of what was randomly choking up the site.  And I think the fix should make the site a little faster and more responsive than it was before.  

You da man...!

Thank you for all your tireless efforts...  You deserve all the brownie points dear sir!

Cheers!


On 7/1/2021 at 2:26 PM, RadioRob said:

I actually don’t want to block or limit the search engines. I’m trying to get them to crawl us MORE to catch people who don’t visit frequently and might not have caught the domain change. 

After reading your message, @RadioRob, I believe that there is notably less hesitation scrolling between screens and changing topics.  Your efforts have not gone unnoticed.  Very much appreciated!  Have a great 4th of July!


Guest MikeThomas

07/05/21, 4:08pm CT.  Went to www.companyofmen.org and was told page doesn’t exist.  There was a log in option on the message that enabled me to log in.

Just passing on an observation… no need to respond.

Thanks for all of your hard work!


2 hours ago, MikeThomas said:

07/05/21, 4:08pm CT.  Went to www.companyofmen.org and was told page doesn’t exist.  There was a log in option on the message that enabled me to log in.

Just passing on an observation… no need to respond.

Thanks for all of your hard work!

Certain pages require being logged in to access. If you were trying to access the new review section for example, you would receive a message that it’s not found. You would only be able to access it if you are logged in. 

EDIT:  For clarification, the long-term intent (at the moment) is not to have the review feature be limited to members only.  It's set that way since there are bugs and quirks that can still exist.  I don't want bots or non-members playing around with my virtual sandbox just yet.  Once the review system is "factory reset", I'll update permissions appropriately.  


  • 2 weeks later...
Guest MikeThomas
Just now, MikeThomas said:

Curious why at the main page it shows me as "online" even though I am not logged in.

Now that I am logged in, it shows me at the top of the online list.  I guess online means currently and in the recent past.


The online list on the homepage is not updated in "real time", meaning it is not rebuilt every single time you load the page.  To reduce the amount of resources used by the server, that widget updates every 1-2 minutes.

In addition, on the homepage, an "online user" includes anyone who has accessed the site in the last 20 minutes.  So if you were online, then logged out...  it would not reflect immediately on the home page.  

The online list (https://www.companyofmen.org/online/) updates in real time.  If you are on that page and log out, it would remove you from the online list and instead reflect you as a guest.  
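
Here is a small sketch of how those two rules interact, in Python.  The 20-minute window and the 1-2 minute refresh come from the explanation above; the names `online_members` and `HomepageWidget`, and the way "last seen" times are stored, are assumptions for illustration and not how IPB actually implements it.

```python
from datetime import timedelta

ONLINE_WINDOW = timedelta(minutes=20)   # "online" = seen within the last 20 minutes
WIDGET_REFRESH = timedelta(minutes=2)   # upper end of the 1-2 minute refresh


def online_members(last_seen, now, window=ONLINE_WINDOW):
    """Names of everyone whose last activity falls inside the window."""
    return sorted(name for name, seen in last_seen.items() if now - seen <= window)


class HomepageWidget:
    """Rebuilds the online list only when the previous copy is too old."""

    def __init__(self, refresh=WIDGET_REFRESH):
        self.refresh = refresh
        self._built_at = None
        self._names = []

    def render(self, last_seen, now):
        if self._built_at is None or now - self._built_at >= self.refresh:
            self._names = online_members(last_seen, now)
            self._built_at = now
        return self._names  # possibly a slightly stale copy
```

This is why someone can still appear "online" on the homepage after logging out: their last activity is still inside the 20-minute window, and the widget may also be showing a copy that is up to a couple of minutes old.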


Guest MikeThomas
1 minute ago, RadioRob said:

The online list on the homepage is not updated "real time" meaning every single time you load the page.  To reduce the amount of resources generated by the server, that widget updates every 1-2 minutes.

The online list (https://www.companyofmen.org/online/) updates in real time.  

Thanks.  Got it.


Thanks for this explanation of how the system works and about some of the glitches. Whenever something goes wrong I immediately assume it is me or my computer that is causing the problem, as I am not all that tech savvy. Nice to know sometimes the source of the problem lies elsewhere and is someone else's concern. Thanks for your good work, as always.


  • 3 weeks later...

I missed this entry of @RadioRob "genius-ness." 
 

Wow! The care and detail. I would ask you to run for higher office, but I won't say which for fear this would wind up in the political forum, and for the most part, I try to stay out of there! 😜🤣🤣

Bottom line, sir, you continue to be a most benevolent and steady friend to this community. Thank you! 

