Date:  12/11/2012 11:31:11 AM Msg ID:  004557
From:  David Hempy Thread:  004544
Subject:  Re: HTTP 500 errors / channel issues
I've got a Good News update to share.  
 
We switched over to IIS last week, running on a fresh VMWare machine.  Two school days later, it does not appear that we've seen any more errors. Additionally, page loads are *much* zippier on the new server.  
 
We have not switched to ISAPI, and probably won't so long as we have smooth sailing. That is my next step, should it be needed. 
 
I can't say for sure what change made the difference, but my money is on moving from iPlanet 7 to IIS 6.  We had run iPlanet 6 in the past, and I'm guessing something in 7 disagreed.  In any case, it feels good to have FoxWeb running under the web server software you guys seem to encourage the most.
 
Thanks for the help, FoxWeb!
 
-dave
 
 
--
David Hempy
Director of Broadcast and Educational Technology
Kentucky Educational Television
(859)258-7164 - (800)333-9764
Sent by David Hempy on 10/29/2012 11:23:21 PM:
Thanks for the insight.
 
Yes, this machine has been rebooted lately.  I applied a backlog of Windows updates over the past few days, involving many reboots, and it does get rebooted occasionally...maybe once a month.  
 
Today I've been setting up IIS on our edit server.  I'm a bit slow, as I have no experience with it, but it seems to be pretty easy once you put on your Microsoft hat.  I think I've got it fully configured and will have our staff test it in the morning.  If all goes well, I hope to have the live server running on IIS late Tuesday.  
 
.........
 
I think the Windows updates have done some good.  We were averaging 500 of these errors/day last week, and did not have any show in the logs today.  I'm hesitant, though, as there are some new errors I've never seen before, so maybe the same old error just has a new label now.  I'll be canvassing my users tomorrow for their personal observations from today.  
 
Encouraging, but I'm still moving ahead with the IIS change.  I'll follow up.
 
Thank you VERY MUCH for your feedback.  
 
-dave
 
 
--David Hempy
Director of Broadcast and Educational Technology
Kentucky Educational Television
(859)258-7164 - (800)333-9764
Sent by FoxWeb Support on 10/29/2012 08:59:26 PM:
My guess is that you are facing two separate issues.
 
The 500 error is an indication that foxweb.exe was either not launched properly, or that the process crashed badly at an early stage. You are right in that the error is generated by the Web server, when it detects that the CGI process is no longer there (or was never there to begin with), without the server having received any response. If you were using IIS, I would recommend that you switch to the ISAPI module (at least for testing purposes), but I don't think this is possible with the Sun Web server.
 
The rest of the symptoms point to a problem with the communication between the FoxWeb channels (fwserver.exe -- hosted in VFP) and the FoxWeb Channel Controller (fwstart.exe). The two components utilize sockets for communication, but it seems that lately they have been having issues on your server. I have seen similar issues caused by lack of sufficient sockets, but in some other cases the problem went away inexplicably, when FoxWeb was configured to not run as service (or vice versa!). You may want to temporarily disable the "Run as Service" option in the FoxWeb configuration, or experiment with the system service user. Of course if you haven't rebooted the server in the last few days, you should try that first.
 
A couple of things that will explain some questions you have:
 
The 30 second timeout is unrelated to the script timeout.  It is enforced in the FoxWeb Channel Controller, which is responsible for starting and maintaining channel processes. The Controller refrains from checking on channels for the first 30 seconds after it launches them, so as to give them time to start and establish communications.
 
The fact that show_channels work even when no channels are present is because, this particular request is not passed through to a channel. The data is provided by the FoxWeb Channel Controller and there is no need for VFP to be involved.
FoxWeb Support Team
support@foxweb.com email
Sent by David Hempy on 10/26/2012 04:33:02 PM:
 
(Sorry in advance this is so long...I'm trying to braindump as much of my troubleshooting notes as possible in case any of it clicks with anyone...)
 
I have been troubleshooting numerous server errors being returned to the browser.  This has been happening occasionally for a while -- perhaps a year -- but has increased greatly in the past month or so, to the point of causing substantial problems for our K-12 students' achievement.  
 
Students occasionally get a 500 status code and this web page:
 

Server Error

This server has encountered an internal error which prevents it from fulfilling your request. The most likely cause is a misconfiguration. Please ask the administrator to look for messages in the server's error log.

 
I'm fairly certain this error is generated by the web server software (Sun Java System Web Server 7.0) It shows a line such as this in the error log every time the error occurs:
 
[26/Oct/2012:12:44:32] failure ( 1460): for host 168.10.184.100 trying to POST /cgi-stu/foxweb.exe/quiz@/db/stu/stu, cgi_scan_headers reports: HTTP4044: the CGI program e:/cgi/pub/foxweb.exe did not produce a valid header (program terminated without a valid CGI header. Check for core dump or other abnormal termination)

 
...and this in the access log:
 
168.10.184.100 - [26/Oct/2012:12:44:32 -0400] "POST /cgi-stu/foxweb.exe/quiz@/db/stu/stu HTTP/1.1" 500 305 "http://dl.ket.org/cgi-stu/foxweb.exe/quiz@/db/stu/stu" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.94 Safari/537.4" POST /cgi-stu/foxweb.exe/quiz@/db/stu/stu - HTTP/1.1 www.dl.ket.org
 
These two logs are the only place I can reliably detect the error has occurred, and unfortunately they're not very helpful in exploring the cause of the error. 
 
The problem seems loosely related to traffic load, but even then, we are not heavily loaded.  If a class of 20 students are all told to open a particular page at once, there are fair odds the error will occur for at least some of them, and it seems to snowball when it does occur. The problem is sporadic -- a reload will often succeed.
 
I cannot recreate it on demand, but do occasionally see it myself. 
 
There are no entries in FWStart.log, the errors table, or Windows event log that correspond to the appearance of these errors.  I do some logging ASAP in fw_enter.prg, and no activity registers there for these hits.  I suspect foxweb.exe is not getting far enough to connect to vfp, or perhaps foxweb.exe is not even being called by the web server correctly in the first place.  
 
We lose FoxWeb channels over time.  We used to run a dozen channels, and would occasionally find only five present in fwstatus a week later.  Early this week, I increased this to 20 channels and 3,000MB just to ensure more channels will survive.
 
 
The first entry in our FWStart.log is about two years old: 11/30/2010 16:28:02 Starting FoxWeb . Two days ago, this error appeared for the first time in FWStart.log:
 

10/23/2012 12:49:08 Error: 1429 Message: "OLE IDispatch exception code 61704 from FoxWebCtrl: Internal application error..." Program: "PROCEDURE FWMAIN C:\PROGRAM FILES\FOXWEB\FWSERVER.EXE" Line Number: 176 Line: "..." Channel: 16

 
It has happened  274 times since then.
 
Later that day, I saw this error for the first time ever:
 
10/23/2012 13:42:35 Error: 2005 Message: "Error loading file - record number 3.  FRMSCK <or one of its members>.  Loading form or the data environment : There is not enough memory to complete this operation" Program: "PROCEDURE FWMAIN C:\PROGRAM FILES\FOXWEB\FWSERVER.EXE" Line Number: 165 Line: "..." Channel: 7
 
That error has occurred 118 times since then.
 
 I've also recently started seeing various channels get stuck in 30-second restart cycles:
 
10/25/2012 05:18:48 Restarting Channel 20
10/25/2012 05:18:52 Restarting Channel 15
10/25/2012 05:18:57 Restarting Channel 19
10/25/2012 05:19:14 Restarting Channel 6
10/25/2012 05:19:18 Restarting Channel 20
10/25/2012 05:19:22 Restarting Channel 15
10/25/2012 05:19:27 Restarting Channel 19
10/25/2012 05:19:44 Restarting Channel 6
10/25/2012 05:19:48 Restarting Channel 20
10/25/2012 05:19:52 Restarting Channel 15
10/25/2012 05:19:57 Restarting Channel 19
10/25/2012 05:20:14 Restarting Channel 6
10/25/2012 05:20:18 Restarting Channel 20
10/25/2012 05:20:22 Restarting Channel 15
10/25/2012 05:20:27 Restarting Channel 19
10/25/2012 05:20:44 Restarting Channel 6
10/25/2012 05:20:48 Restarting Channel 20
10/25/2012 05:20:52 Restarting Channel 15
10/25/2012 05:20:57 Restarting Channel 19
10/25/2012 05:21:14 Restarting Channel 6
10/25/2012 05:21:18 Restarting Channel 20
10/25/2012 05:21:22 Restarting Channel 15
10/25/2012 05:21:27 Restarting Channel 19
10/25/2012 05:21:44 Restarting Channel 6
10/25/2012 05:21:48 Restarting Channel 20
10/25/2012 05:21:52 Restarting Channel 15
10/25/2012 05:21:57 Restarting Channel 19
10/25/2012 05:22:14 Restarting Channel 6
 
Our FoxWeb script timeout is 30 seconds.  As a test, I changed it to 60 seconds and restarted FoxWeb.  Surprisingly, the restart interval is still 30 seconds:
 
10/26/2012 15:50:31 Restarting Channel 4
10/26/2012 15:50:59 Restarting Channel 20
10/26/2012 15:51:01 Restarting Channel 4
10/26/2012 15:51:29 Restarting Channel 20
10/26/2012 15:51:31 Restarting Channel 4
10/26/2012 15:51:59 Restarting Channel 20
10/26/2012 15:52:01 Restarting Channel 4
10/26/2012 15:52:29 Restarting Channel 20
10/26/2012 15:52:31 Restarting Channel 4
10/26/2012 15:52:50 Closing FoxWeb
10/26/2012 15:52:55 Starting FoxWeb
10/26/2012 15:53:34 Restarting Channel 18
10/26/2012 15:53:35 Restarting Channel 19
10/26/2012 15:53:35 Restarting Channel 20
10/26/2012 15:54:04 Restarting Channel 18
10/26/2012 15:54:05 Restarting Channel 19
10/26/2012 15:54:05 Restarting Channel 20
10/26/2012 15:54:35 Restarting Channel 19
10/26/2012 15:54:35 Restarting Channel 20
10/26/2012 15:55:05 Restarting Channel 19 
 
Looking in Windows Event Log has not revealed much.  I do see these:
 

From the Windows Event Log, Application Log:

Event Type:    Error
Event Source:    Application Error
Event Category:    (100)
Event ID:    1000
Date:        10/25/2012
Time:        7:41:35 AM
User:        N/A
Computer:    3DL
Description:
Faulting application vfp9.exe, version 9.0.0.2412, faulting module vfp9.exe, version 9.0.0.2412, fault address 0x0002f454.

I see lots of these, but not necessarily in synch with anything in FWStart.log, or the observed "Server Error" messages appearing in the browser.  I also rarely see a dialogue box on the server's desktop announcing VFP has died and asking if I want to report it to Microsoft.  I've seen that a handful of times this week -- normally I won't see those for months. 
This is probably not relevant, but if I have FWStatus running when I restart FoxWeb, the FWStatus window will sometimes (not always) start flickering, redrawing itself constantly and ignoring input.  It taps CPU utilization, and I have to kill it via Task Manager.  I've gotten into the habit of stopping FWStatus before restarting FoxWeb.
 As an odd twist, fwadmin.fwx works, except for show_channels.fwx.  In fact, you can see the Server Error I'm chasing by hitting that page: http://www.dl.ket.org/cgi-sta/foxweb.exe/show_channels.fwx  That is the *only* link on the site that consistently generates the Server Error...but I do not think it is failing for the same reason as the sporadic errors our users are seeing.
Here is our Server Info page (with some data redacted)
About this Server Logout

System Information
FoxWeb Version  4.3 
Channel 
Total Channels  20 
FoxWeb Path  c:\program files\foxweb\ 
Virtual Program Root  e:\ 
Main Program Root  e:\ 
Program Path  e:\ 
Operating System  Windows 5.02 
Web server  Sun-Java-System-Web-Server/7.0 
Date/Time  10/26/2012 04:18:12 PM 
VFP Version  Visual FoxPro 09.00.0000.2412 for Windows 
VFP Compile Version 
VFP Temporary Path  [snip]
Buffer Memory Size  150 MB 
License Flags  Servers: 1 

Virtual Roots
www.dl.ket.org/[snip]
[default] 

Licenses
040-002[snip] Company: KET Servers: 1 

© Copyright Aegis Group
 
 
 
Well...that's about it for my observations.I can share more complete logs upon request.
 Here is what I'm doing to try to solve this:
 
 
In the past few days, I have forced a recompile of every .prg and .fwx file on the site (some were been compiled ten years ago, under who-knows-what version of VFP) after reading the FoxWeb release notes.  No improvement. 
I've been applying some long-overdue Windows Updates.  Not finished yet, but the "critical" updates are all applied at least. 
I even ran a memory test suite (memtest86) with no errors.  (I'm willing to pursue *any* option right now!)
Last year, we moved to new/fresh/virtual/physical servers in various configurations without improvement.  The problem was less pronounced then, so we shrugged through it.  The errors are now to a level that is not acceptable to our students.
Things I have on my roadmap to try:
  • Upgrade FoxWeb 4.3 to the latest (6.6.3)
  • Continue all Windows Updates
  • Check for VFP updates
  • Update Sun Server to latest version
  • Replace Sun server with IIS (hope to avoid this...we have no expertise with IIS)

Whew...if you're still reading this, thanks for your persistence.  If anyone has any suggestions, I'd GREATLY appreciate any insight. 

 

-dave

 

--
David Hempy
Director of Broadcast and Educational Technology
Kentucky Educational Television
(859)258-7164 - (800)333-9764