WinCGI Package 2.0

WinCGI provides access to the submitted CGI data on a web server supporting
the Windows CGI (WinCGI) specification.

The available routines are:

CGIRead filename
    Process the WinCGI ini file, setting up all submitted variables.
    When the TCL script is started, the first argument is the WinCGI ini
    file.

CGIValue key
    Return the value of a submitted data item.
    Key is the submitted data name.

CGIAccept key
    Return yes/no settings for what the client can support.
    Key is the MIME type.

CGISystem key
    Return system information such as the client browser identifier.
    Key is the argument name.

CGIDebug local | remote
    A mechanism to display all of the submitted information.
    local  - display to the standard output of TCL
    remote - display back to the client browser. No HTML tags are present
             in this display, so the programmer must ensure correct HTML
             wrappers are included.

CGIWrite string
    Place the string into the data stream going back to the client browser.

---------------
Author - Evan Rempel   erempel@UVic.CA

I will attempt to provide support for this package, and to extend its
capabilities when requested and appropriate. Feedback is welcome.

----------------------------------------------------------------------------------

The following sections are defined in the Windows CGI 1.3 specification. The
complete specification is provided for reference at the end of this document.

[CGI]            - all kinds of system and server settings
[Accept]         - lists the MIME types that the client browser can receive
[System]         - special values specific to Windows CGI 1.3
[Extra Headers]  - headers that the browser sent that are not part of the
                   HTTP 1.1 spec
[Form Literal]   - variables and values submitted by client (small)
[Form External]  - variables and values submitted by client (large < 64Kbytes)
[Form File]      - file uploads (MIME) from client
[Form Huge]      - variables and values submitted by client ( > 64Kbytes)

The WinCGI Package rearranges these a little as described below, and leaves
out the sections:

[Extra Headers]  - if they aren't part of the spec, they can't really be
                   counted on.
[Form File]      - this package is not for file uploads
[Form Huge]      - if >64K, probably a file upload, so not handled by this
                   package.
[System]         - only needed during WinCGI parsing (states where data
                   files are etc). This section is processed, but only the
                   GMT time zone info is maintained.

-------------------

WinCGI Package specification

This package parses a Windows CGI file (.ini) and creates three (3) sections.

SYSTEM - all settings for the current server/client/script interaction.
ACCEPT - all of the MIME types that the client will accept.
VALUES - all of the variables and values that were submitted by the client.
         This includes the "Form Literal", "Form External" and "Query
         String" from the Windows CGI Specification.

Section SYSTEM
--------------

Request Protocol
    The name and revision of the information protocol this request came in
    with. Format: protocol/revision. Example: "HTTP/1.0".

Request Method
    The method with which the request was made. For HTTP, this is "GET",
    "HEAD", "POST", etc.

Executable Path
    The logical path to the CGI program executable, as needed for
    self-referencing URLs. This may vary if the server supports
    multi-homing with separate logical path spaces.
    The server must provide the physical path equivalent using the logical
    to physical mapping for the identity on which the current request was
    received.

Document Root
    The physical path to the logical root "/". This may vary if the server
    supports multi-homing with separate logical path spaces. The server
    must provide the physical path to the logical root for the identity on
    which the current request was received.

Logical Path
    A request may specify a path to a resource needed to complete that
    request. This path may be in a logical pathname space. This item
    contains the pathname exactly as received by the server, without
    logical-to-physical translation.

Physical Path
    If the request contained logical path information, the server provides
    the path in physical form, in the native object (e.g., file) access
    syntax of the operating system. This may vary if the server supports
    multi-homing with separate logical path spaces. The server must provide
    the physical path equivalent using the logical to physical mapping for
    the identity on which the current request was received.

Query String
    The information which follows the ? in the URL that generated the
    request is the "query" information. The server furnishes this to the
    back end whenever it is present on the request URL, without any
    decoding or translation. The WinCGI Package then performs the URL
    decoding, so the string is fully URL decoded when returned from the
    WinCGI Package.

Request Range
    Byte-range specification received with request (if any). See the
    current Internet Draft (or RFC) describing the byte-range extension to
    HTTP for more information. The server must support CGI program
    participation in byte-ranging to be compliant with this Specification.

Referer
    The URL of the document that contained the link pointing to this CGI
    program. Note that in some browsers the implementation of this is
    broken, and cannot be relied on.

From
    The e-mail address of the browser user.
    Note that this is in the HTTP specification but is not implemented in
    some browsers due to privacy concerns.

User Agent
    A string description of the client (browser) software. Not generated by
    all browsers.

Content Type
    For requests which have attached data, this is the MIME content type of
    that data. Format: type/subtype.

Content Length
    For requests which have attached data, this is the length of the
    content in bytes.

Content File
    For requests which have attached data, the server makes the data
    available to the CGI program by putting it into this file. The value of
    this item is the complete pathname of that file.

Server Software
    The name and version of the information server software answering the
    request (and running the CGI program). Format: name/version.

Server Name
    The network host name or alias of the server, as needed for
    self-referencing URLs. This (in combination with the Server Port) could
    be used to manufacture a full URL to the server, for URL fixups. This
    may vary if the server supports multi-homing. The value of this item
    must be the host name on which the current request was received.

Server Port
    The network port number on which the server is listening. This is also
    needed for self-referencing URLs.

Server Admin
    The e-mail address of the server's administrator. This is used in error
    messages, and might be used to send MAPI mail to the administrator, or
    to form "mailto:" URLs in generated documents.

CGI Version
    The revision of the CGI specification to which this server complies.
    Format: CGI/revision. For this version, "CGI/1.2 (Win)".

Remote Host
    The network host name of the client (requestor) system, if available.
    This item is used for logging.

Remote Address
    The network (IP) address of the client (requestor) system. This item is
    used for logging if the host name is not available.

Authentication Method
    The protocol-specific authentication method specified in the request.
    If present, this is normally Basic.
    The server must provide this whether or not it was used by the server
    for authentication.

Authentication Realm
    The method-specific authentication realm specified in the request. If
    present in the request, the server must provide this whether or not it
    was used by the server for authentication.

Authenticated Username
    The username (in the indicated realm) that the client used to attempt
    authentication, as specified in the request. If present in the request,
    the server must provide this whether or not it was used by the server
    for authentication.

Authenticated Password
    The password that the client used to attempt authentication, as
    specified in the request. If present in the request, the server must
    provide this whether or not it was used by the server for
    authentication.

GMT Offset
    The number of seconds to be added to GMT time to reach local time. For
    Pacific Standard Time, this number is -28,800. Useful for computing GMT
    times.

Unique
    A file name without an extension or path that is unique within the
    scope of all outstanding CGI requests.

Output File
    The TCL channel ID of the open output file. This channel must not be
    closed before the end of processing; it may also be left to close
    automatically when the TCL shell terminates. This is the channel that
    CGIWrite writes to. If the parent script needs to write to the outgoing
    data stream itself, it should write to this channel ID.

Section ACCEPT
--------------

This section contains the client's acceptable data types found in the
request header as

    Accept: type/subtype {parameters}

If the parameters (e.g., "q=0.100") are present, they are passed as the
value of the item. If there are no parameters, the value is "Yes".

Section VALUE
-------------

If the request is an HTTP POST or GET from an HTTP form (with content type
of application/x-www-form-urlencoded or multipart/form-data), the server
will decode the form data and put it into the VALUE section. Both the
variable names and the submitted data are URL Decoded before being placed
into this section.
All TCL escape sequences are generated so that the TCL operations work with
the intended data.

If the form contains any SELECT MULTIPLE elements, there will be multiple
occurrences of the same key. In this case, the server generates a normal
"key=value" pair for the first occurrence, and it appends a sequence number
to subsequent occurrences in the form name_X where X is an integer. The
WinCGI Package decodes all of these and generates a TCL list as the value
for the variable. It should be noted that if only one selection is made, the
submitted field only occurs once, and cannot be detected as a SELECT
MULTIPLE field. The result is that the value of the field is NOT a TCL list
until the second value gets added.

WARNING: If only one selection is made in a SELECT MULTIPLE field, AND the
value of the single selection contains spaces, it is impossible to
differentiate between this single multi-word selection and a multiple
selection of these single words. It is recommended that no SELECT MULTIPLE
value contain spaces.

The "Query String" from the section SYSTEM is URL decoded and any resulting
values are placed in this section as well.

--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------

Windows CGI 1.3a Specification - Version of 18-Feb-96
Written by O'Reilly Software and included with WebSite 1.1 Webserver.

Overview

A large class of World Wide Web applications are best implemented using
external programs that are controlled by a web server. Examples include
front-ends to business applications which are themselves subject to frequent
changes in business rules. The broad acceptance of rapid-application
development (RAD) tools such as Visual Basic and Delphi has given rise to
the need to use these tools to Web-enable many kinds of business
applications.
The widely used Common Gateway Interface (CGI) uses techniques well suited
to the Unix environment. A different sort of interface is needed to support
common Windows RAD tools for CGI. It is the purpose of this specification to
define such an interface.

I/O Spooling

A key feature of Windows CGI is its spooled exchange of data between the
server and the CGI program. It is essential that the server provide
efficient transfer of data between the spool files and the network. This
means that the server should use memory-mapped techniques, and minimize the
number of separate network I/O requests used. The reasons for using spooled
I/O are:

  - Most RAD packages do not have native network (socket) I/O capabilities.
  - Socket I/O techniques are relatively exotic, and efficient results
    require a thorough knowledge of the Win32 network interface. All input
    and output would require complex buffering to achieve acceptable
    network efficiency.
  - Sockets cannot be inherited by a 16-bit program.
  - Spooled input (e.g. POST content) can be memory mapped and thus
    processed far more efficiently than is possible using stream-oriented
    techniques.
  - A reference set of spool files may be used for regression testing and
    debugging in the RAD development environment.
  - Spool files may be retained after a CGI program runs, for "post-mortem"
    analysis, also using the RAD environment.

HTML Form Data Decoding

Windows CGI requires that the web server decode HTML form data if present in
a POST request. It is not required that the server decode form data if it
appears in the "query string" portion of a request URL. There are two ways
in which form data may be sent by a browser to the server:

URL-Encoded
    This is the most common form data format. The contents of form fields
    are "escaped" according to the rules in the HTML 1.0 Specification,
    then concatenated using unescaped ampersand characters. This
    URL-encoded data is sent as a stream to the server, with a content type
    of application/x-www-form-urlencoded.

Multipart Form Data
    This format has been introduced to permit efficient file uploading with
    forms. It may be used without explicitly including a file upload form
    field, however. The contents of the form fields are sent as a MIME
    multipart message. Each field is contained within a single part. The
    content type indicated by the browser is multipart/form-data.

Compliant servers must decode both form data types.

Launching the CGI program

The server uses the CreateProcess() service to launch the CGI program. The
server maintains synchronization with the CGI program so it can detect when
the CGI program exits. This is done using the Win32 WaitForSingleObject()
service, waiting for the CGI process handle to become signalled, indicating
program exit.

The server must never use a shell to execute the CGI program. This can
create serious security risks.

NOTE: The CGI program's process handle becomes signalled before the process
rundown is complete. Reliance on rundown to close files, inherited handles,
etc., can cause obscure synchronization problems.

Command Line

The server must execute a CGI program request by doing a CreateProcess()
with a command line in the following form:

    WinCGI-exe cgi-data-file

WinCGI-exe
    The complete path to the CGI program executable. The server does not
    depend on the "current directory" or the PATH environment variable.
    Note that the "executable" need not be a .EXE file. It may be a
    document, provided an "association" with a corresponding executable has
    been established.

cgi-data-file
    The complete path to the CGI data file.

Launch Method

The server issues the CreateProcess() such that the process being launched
has its main window hidden. The launched process itself should not cause the
appearance of a window nor a change in the Z-order of the windows on the
desktop.

The server supports a CGI program/script debugging mode. If that mode is
enabled, the CGI program is launched such that its window shows and is made
active.
This can assist in debugging CGI applications.

Document Associations

The server must honor document associations. If the target of a Windows CGI
request is a document (not an executable), the server must attempt to find
the associated application for the document and launch the application such
that the document is "processed".

The CGI Data File

The server passes data to the CGI program via a Windows "private profile"
file, in key-value format. The CGI program may then use the standard Windows
API services for enumerating and retrieving the key-value pairs in the data
file.

The CGI data file contains the following sections:

    [CGI]
    [Accept]
    [System]
    [Extra Headers]
    [Form Literal]
    [Form External]
    [Form File]
    [Form Huge]

The [CGI] Section

This section contains most of the CGI data items (accept types, content, and
extra headers are defined in separate sections). Each item is provided as a
string value. If the value is an empty string, the keyword is omitted. The
keywords are listed below:

Request Protocol
    The name and revision of the information protocol this request came in
    with. Format: protocol/revision. Example: "HTTP/1.0".

Request Method
    The method with which the request was made. For HTTP, this is "GET",
    "HEAD", "POST", etc.

Executable Path
    The logical path to the CGI program executable, as needed for
    self-referencing URLs. This may vary if the server supports
    multi-homing with separate logical path spaces. The server must provide
    the physical path equivalent using the logical to physical mapping for
    the identity on which the current request was received.

Document Root
    The physical path to the logical root "/". This may vary if the server
    supports multi-homing with separate logical path spaces. The server
    must provide the physical path to the logical root for the identity on
    which the current request was received.

Logical Path
    A request may specify a path to a resource needed to complete that
    request. This path may be in a logical pathname space.
    This item contains the pathname exactly as received by the server,
    without logical-to-physical translation.

Physical Path
    If the request contained logical path information, the server provides
    the path in physical form, in the native object (e.g., file) access
    syntax of the operating system. This may vary if the server supports
    multi-homing with separate logical path spaces. The server must provide
    the physical path equivalent using the logical to physical mapping for
    the identity on which the current request was received.

Query String
    The information which follows the ? in the URL that generated the
    request is the "query" information. The server furnishes this to the
    back end whenever it is present on the request URL, without any
    decoding or translation.

Request Range
    Byte-range specification received with request (if any). See the
    current Internet Draft (or RFC) describing the byte-range extension to
    HTTP for more information. The server must support CGI program
    participation in byte-ranging to be compliant with this Specification.

Referer
    The URL of the document that contained the link pointing to this CGI
    program. Note that in some browsers the implementation of this is
    broken, and cannot be relied on.

From
    The e-mail address of the browser user. Note that this is in the HTTP
    specification but is not implemented in some browsers due to privacy
    concerns.

User Agent
    A string description of the client (browser) software. Not generated by
    all browsers.

Content Type
    For requests which have attached data, this is the MIME content type of
    that data. Format: type/subtype.

Content Length
    For requests which have attached data, this is the length of the
    content in bytes.

Content File
    For requests which have attached data, the server makes the data
    available to the CGI program by putting it into this file. The value of
    this item is the complete pathname of that file.

Server Software
    The name and version of the information server software answering the
    request (and running the CGI program). Format: name/version.

Server Name
    The network host name or alias of the server, as needed for
    self-referencing URLs. This (in combination with the Server Port) could
    be used to manufacture a full URL to the server, for URL fixups. This
    may vary if the server supports multi-homing. The value of this item
    must be the host name on which the current request was received.

Server Port
    The network port number on which the server is listening. This is also
    needed for self-referencing URLs.

Server Admin
    The e-mail address of the server's administrator. This is used in error
    messages, and might be used to send MAPI mail to the administrator, or
    to form "mailto:" URLs in generated documents.

CGI Version
    The revision of the CGI specification to which this server complies.
    Format: CGI/revision. For this version, "CGI/1.2 (Win)".

Remote Host
    The network host name of the client (requestor) system, if available.
    This item is used for logging.

Remote Address
    The network (IP) address of the client (requestor) system. This item is
    used for logging if the host name is not available.

Authentication Method
    The protocol-specific authentication method specified in the request.
    If present, this is normally Basic. The server must provide this
    whether or not it was used by the server for authentication.

Authentication Realm
    The method-specific authentication realm specified in the request. If
    present in the request, the server must provide this whether or not it
    was used by the server for authentication.

Authenticated Username
    The username (in the indicated realm) that the client used to attempt
    authentication, as specified in the request. If present in the request,
    the server must provide this whether or not it was used by the server
    for authentication.

Authenticated Password
    The password that the client used to attempt authentication, as
    specified in the request.
    If present in the request, the server must provide this whether or not
    it was used by the server for authentication.

    NOTE - Current practice on the O'Reilly WebSite servers requires that
    the CGI program's name begin with a dollar sign ($) to have the
    password supplied through the CGI interface. This is not required by
    this specification. It is recommended, however, as it forces the CGI
    programmer to do something special to have the password info exported
    from within the server's internal environment.

The [Accept] Section

This section contains the client's acceptable data types found in the
request header as

    Accept: type/subtype {parameters}

If the parameters (e.g., "q=0.100") are present, they are passed as the
value of the item. If there are no parameters, the value is "Yes".

Note: The accept types may easily be enumerated by the CGI program with a
call to GetPrivateProfileString() with NULL for the key name. This returns
all of the keys in the section as a null-delimited string with a double-null
terminator.

The [System] Section

This section contains items that are specific to the Windows implementation
of CGI. The following keys are used:

GMT Offset
    The number of seconds to be added to GMT time to reach local time. For
    Pacific Standard Time, this number is -28,800. Useful for computing GMT
    times.

Debug Mode
    This is No unless the server's "CGI/script tracing" mode is enabled,
    then it is Yes. Useful for providing conditional tracing within the CGI
    program.

Output File
    The full path/name of the file in which the server expects to receive
    the CGI program's results.

Content File
    The full path/name of the file that contains the content (if any) that
    came with the request.

The [Extra Headers] Section

This section contains the "extra" headers that were included with the
request, in "key=value" form. The server must URL-unescape both the key and
the value prior to writing them to the CGI data file.
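
The URL-unescaping mentioned above (which the WinCGI Package also applies to
the Query String) can be sketched in TCL roughly as follows. This is only an
illustrative sketch, not the code used by the server or by the WinCGI
Package; the proc name urlDecode and its optional plusAsSpace argument are
hypothetical. Decoded bytes above 127 are mapped one-to-one to characters,
which is an assumption about the character set.

```tcl
# Hypothetical helper: decode %XX escapes in a URL-encoded string.
# If plusAsSpace is true, '+' is first turned into a space, as is usual
# for form data; leave it false for values where '+' is literal.
proc urlDecode {str {plusAsSpace 0}} {
    if {$plusAsSpace} {
        regsub -all {\+} $str { } str
    }
    set out ""
    set len [string length $str]
    set i 0
    while {$i < $len} {
        set c   [string index $str $i]
        set hex [string range $str [expr {$i + 1}] [expr {$i + 2}]]
        if {[string compare $c "%"] == 0 &&
            [regexp {^[0-9A-Fa-f][0-9A-Fa-f]$} $hex]} {
            # A valid escape: convert the two hex digits to one character.
            scan $hex %x code
            append out [format %c $code]
            incr i 3
        } else {
            # Ordinary character, or a '%' not followed by two hex digits.
            append out $c
            incr i
        }
    }
    return $out
}
```

A malformed escape (a '%' without two hex digits after it) is passed through
unchanged rather than raising an error; a stricter decoder could reject it.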

Note: The extra headers may easily be enumerated by the CGI program with a
call to GetPrivateProfileString() with NULL for the key name. This returns
all of the keys in the section as a null-delimited string with a double-null
terminator.

The [Form Literal] Section

If the request is an HTTP POST from an HTTP form (with content type of
application/x-www-form-urlencoded or multipart/form-data), the server will
decode the form data and put it into the [Form Literal] section.

For URL-encoded form data, raw form input is of the form
"key=value&key=value&...", with the value parts in url-encoded format. The
server splits the key=value pairs at the '&', then splits the key and value
at the '=', url-decodes the value string, and puts the result into
key=(decoded)value form in the [Form Literal] section.

For multipart form data, raw form input is in a MIME-style multipart format,
with each field in a separate part. The server extracts the field name and
value from each part and puts the result into key=value form in the
[Form Literal] section.

If the form contains any SELECT MULTIPLE elements, there will be multiple
occurrences of the same key. In this case, the server generates a normal
"key=value" pair for the first occurrence, and it appends a sequence number
to subsequent occurrences. It is up to the CGI program to know about this
possibility and to properly recognize the tagged keys.

The [Form External] Section

If the decoded value string is more than 254 characters long, or if the
decoded value string contains any control characters or double-quotes, the
server puts the decoded value into an external tempfile and lists the field
in the [Form External] section as:

    key=pathname length

where pathname is the path and name of the tempfile containing the decoded
value string, and length is the length in bytes of the decoded value string.

Note: Be sure to open this file in binary mode unless you are certain that
the form data is text!
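
For a TCL program that reads the CGI data file directly, handling of a
[Form External] value could look something like the sketch below. (The
WinCGI Package already folds "Form External" into its VALUES section, so
scripts using the package should not need this.) The proc names
parseExternalEntry and readExternalValue are hypothetical, and treating the
last space-separated token as the length is an assumption made so that
pathnames containing spaces still parse.

```tcl
# Hypothetical helper: split a [Form External] value of the form
# "pathname length" into a two-element list {pathname length}.
proc parseExternalEntry {value} {
    set value [string trim $value]
    # Greedy match: everything up to the final run of digits is the path.
    if {![regexp {^(.*[^ ]) +([0-9]+) *$} $value dummy path length]} {
        error "malformed Form External entry: $value"
    }
    return [list $path $length]
}

# Hypothetical helper: read the decoded field value from the tempfile.
# The channel is switched to binary translation, as the note above advises.
proc readExternalValue {value} {
    foreach {path length} [parseExternalEntry $value] break
    set chan [open $path r]
    fconfigure $chan -translation binary
    set failed [catch {read $chan $length} data]
    close $chan
    if {$failed} {
        error $data
    }
    return $data
}
```

With the sample entry from the specification, parseExternalEntry would be
called as parseExternalEntry {C:\TEMP\HS19AF6C.000 300}; the braces keep TCL
from interpreting the backslashes in the Windows path.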

The [Form Huge] Section

If the raw value string is more than 65,535 bytes long, the server does no
decoding, but it does get the keyword and mark the location and size of the
value in the Content File. The server lists the huge field in the
[Form Huge] section as:

    key=offset length

where offset is the offset from the beginning of the Content File at which
the raw value string for this key is located, and length is the length in
bytes of the raw value string. You can use the offset to perform a "Seek" to
the start of the raw value string, and use the length to know when you have
read the entire raw string into your decoder.

Note: Be sure to open this file in binary mode unless you are certain that
the form data is text!

The [Form File] Section

If the request is in the multipart/form-data format, it may contain one or
more file uploads. In this case, each file upload is placed into an external
tempfile similar to the form external data. Each such file upload is listed
in the [Form File] section as:

    key=[pathname] length type xfer [filename]

where pathname is the pathname of the external tempfile containing the
uploaded file, length is the length in bytes of the uploaded file, type is
the MIME content type of the uploaded file, xfer is the content-transfer
encoding of the uploaded file, and filename is the original name of the
uploaded file. The square brackets must be included. They are used to
delimit the file and pathnames, which may contain spaces.

Example of Form Decoding

In the following sample, the form contained a small field, a SELECT MULTIPLE
with 2 small selections, a field with 300 characters in it, one with line
breaks (a text area), and a 230KB field.

    [Form Literal]
    smallfield=123 Main St. #122
    multiple=first selection
    multiple_1=second selection

    [Form External]
    field300chars=C:\TEMP\HS19AF6C.000 300
    fieldwithlinebreaks=C:\TEMP\HS19AF6C.001 43

    [Form Huge]
    field230K=C:\TEMP\HS19AF6C.002 276920

Results Processing

The CGI program returns its results to the server as a data stream
representing (directly or indirectly) the goal of the request. The server is
responsible for "packaging" the data stream according to HTTP, and for using
HTTP to transport the data stream to the requesting client. This means that
the server normally adds the needed HTTP headers to the CGI program's
results.

The data stream consists of two parts: the header and the body. The header
consists of one or more lines of text, and is separated from the body by a
blank line. The body contains MIME-conforming data whose content type must
be reflected in the header.

The server does not interpret or modify the body in any way. It is essential
that the client receive exactly the data that was generated by the back end.

Special Header Lines

The server recognizes the following header lines in the results data stream:

Content-Type:
    Indicates that the body contains data of the specified MIME content
    type. The value must be a MIME content type/subtype.

URI: (value enclosed in angle brackets)
    The value is either a full URL or a local file reference, either of
    which points to an object to be returned to the client in lieu of the
    body (which the server shall ignore in this type of result). If the
    value is a local file, the server sends it as the results of the
    request, as though the client issued a GET for that object. If the
    value is a full URL, the server returns a "302 redirect" to the client
    to retrieve the specified object directly.

Location:
    Same as URI, but this form is now deprecated. The value must not be
    enclosed in angle brackets with this form.

Other Headers

Any other headers in the result stream are passed (unmodified) by the server
to the client.
It is the responsibility of the CGI program to avoid including headers that clash with those used by HTTP.
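
As an illustration of the result format described under Results Processing
(header lines, a blank line, then the body), a TCL script could assemble its
output along the lines of the sketch below. The proc name buildResult and
its arguments are hypothetical and not part of the WinCGI Package; the
sketch only builds the string and does not check for headers that clash with
those used by HTTP.

```tcl
# Hypothetical helper: build a Windows CGI result stream.
# contentType  - MIME type/subtype for the Content-Type: header line
# body         - the data to return, passed through unchanged
# extraHeaders - optional flat list of name/value pairs, e.g. {Pragma no-cache}
proc buildResult {contentType body {extraHeaders {}}} {
    set lines [list "Content-Type: $contentType"]
    foreach {name value} $extraHeaders {
        lappend lines "${name}: $value"
    }
    # The double newline leaves the blank line that separates the header
    # from the body.
    return "[join $lines \n]\n\n$body"
}

set page [buildResult text/html "<html><body><p>Hello</p></body></html>"]
```

The resulting string can then be handed to CGIWrite, which places it into
the data stream going back to the client browser. Whether CGIWrite appends a
newline of its own is not stated in this document, so that detail should be
checked against the package source.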