CGI: URL Encoding

From The Uniform Server Wiki

URL Encoding

Dynamic pages are not all about flashing images and changing content. They also can include user interaction, such as user data sent to your web page for processing. Data is sent in a special format referred to as URL encoding. This page provides an overview of URL encoding and how to extract data sent to the server.

Encoding

Data arrives with a page request tagged with either GET or POST markers; it may have been initiated by a user filling in a form or by a crafted link from a previous request. The data is a string of ASCII characters included with the request: appended to the URL for GET, or carried in the request body for POST. Certain characters have a special meaning and are encoded before being sent.

Table of common special characters

Character  URL Encoded  Character  URL Encoded  Character  URL Encoded  Character  URL Encoded
;          %3B          ?          %3F          /          %2F          :          %3A
#          %23          &          %26          =          %3D          +          %2B
$          %24          ,          %2C          space      %20 or +     %          %25
<          %3C          >          %3E          ~          %7E

The number following the % sign is the hexadecimal ASCII code of the character being encoded.
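
The rule above can be checked by hand. The following is a minimal sketch in JavaScript (one of the scripting languages this guide covers); the helper name percentEncodeChar is hypothetical and only illustrates the "% plus two hexadecimal digits" rule for single ASCII characters. It is not a replacement for a language's built-in encoding function.

```javascript
// Hypothetical helper: percent-encode one ASCII character by hand.
function percentEncodeChar(ch) {
  const code = ch.charCodeAt(0);                // ASCII code, e.g. 38 for "&"
  const hex = code.toString(16).toUpperCase();  // hexadecimal form, e.g. "26"
  return "%" + hex.padStart(2, "0");            // always two hex digits after "%"
}

// A few characters taken from the table above.
const samples = [";", "?", "/", ":", "#", "&", "=", "+", " ", "%"];
for (const ch of samples) {
  console.log(JSON.stringify(ch) + " -> " + percentEncodeChar(ch));
}
```

For each of these sample characters the result matches what JavaScript's built-in encodeURIComponent produces.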

Browser encoding

A browser automatically encodes data from a form, but data included with links must be pre-encoded by the page author or by the script that generates the link.
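
As an illustration of pre-encoding link data, here is a small JavaScript sketch. The helper name buildQuery is hypothetical; the target test7.vbs is the raw-data test script introduced later on this page. Note one difference from form submission: encodeURIComponent writes a space as %20, whereas a browser submitting a form writes it as "+". Both forms are listed in the table above.

```javascript
// Hypothetical helper: build a pre-encoded query string from name/value pairs.
function buildQuery(params) {
  return Object.keys(params)
    .map((name) => encodeURIComponent(name) + "=" + encodeURIComponent(params[name]))
    .join("&");                                  // pairs are joined with "&"
}

// Link data that contains special characters (&, =, %, space).
const href = "test7.vbs?" + buildQuery({ text: "Cat & dog=50a%", Language: "English" });
console.log(href);
```

The resulting string can be placed directly in the href attribute of a link.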

Scripting language support

Scripting languages provide URL Encoding and Decoding functions.

The table below lists three scripting languages and their functions:

Language URL Encoding URL Decoding
VBScript escape(string) unescape(string)
JavaScript escape(string) unescape(string)
PHP urlencode(string) urldecode(string)
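
The JavaScript pair from the table can be tried out as follows. This is only a sketch: escape and unescape are legacy functions in JavaScript, and the comparison with encodeURIComponent and decodeURIComponent is an addition that the table does not list. One difference worth knowing is that escape leaves a literal "+" unencoded, so it can later be mistaken for an encoded space.

```javascript
const original = "Cat & dog=50a%";

// Legacy pair listed in the table above.
const legacy = escape(original);
console.log(legacy);
console.log(unescape(legacy) === original);           // round trip restores the text

// Newer pair (an addition, not listed in the table).
const modern = encodeURIComponent(original);
console.log(decodeURIComponent(modern) === original); // round trip restores the text

// A literal plus sign is treated differently by the two encoders.
console.log(escape("1+1"));              // "+" is left as is
console.log(encodeURIComponent("1+1"));  // "+" is encoded as %2B
```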


Environment variables

When Apache runs a CGI script it passes its current set of environment variables to the script. Included in these are the data sent and method (GET or POST) used for sending this data. Using these environment variables, name-value pairs can be extracted.

Test Script 6 - Display environment variables

This script accesses the process environment variables and displays them in your default browser.

  • Create a new file test6.vbs with the content shown below.
  • Save to test folder \www\vbs_test
  • Start Apache if not already running
  • Enter: http://localhost:8081/vbs_test/test6.vbs into browser.
'!c:/windows/system32/cscript //nologo
Wscript.Echo "Content-type: text/html" & vbLF & vbLF

'-- List Process Environment Variables
Set wshShell = CreateObject( "WScript.Shell" )     'Create shell object
Set wshUserEnv = wshShell.Environment( "Process" ) 'Read process collection
For Each strItem In wshUserEnv                     'Scan returned collection
  WScript.Echo strItem & "<br />"                  'and output each item
Next                                               'get next item
Set wshUserEnv = Nothing                           'clean-up
Set wshShell   = Nothing

Wscript.Quit 0                                     'return exit code

Script output

The script produces output similar to that shown below.

Your data displayed will differ but what is of interest are the variable names passed to a script. Of particular importance are the variables REQUEST_METHOD and QUERY_STRING. These allow data sent with the page request to be extracted.

Remember that a query string is appended to the URL; a plain link or a form using the GET method sends its data this way. Both the query string and the method are available via the environment variables.

POST data can be relatively long and exceed the allowed environment variable length. Because of this it is treated differently and is not accessible via an environment variable. Instead it is sent directly to the standard input stream, allowing it to be captured in a user defined variable.

HTTP_HOST=localhost:8081
HTTP_USER_AGENT=Mozilla/5.0 (Windows; U; Windows ....
HTTP_ACCEPT=text/html,application/xhtml+xml,....
HTTP_ACCEPT_LANGUAGE=en-gb,en;q=0.5
HTTP_ACCEPT_ENCODING=gzip,deflate
HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_KEEP_ALIVE=115
HTTP_COOKIE=cookie_test=cookie_value
HTTP_CONNECTION=keep-alive
PATH=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem
SystemRoot=C:\WINDOWS
COMSPEC=C:\WINDOWS\system32\cmd.exe
PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH
WINDIR=C:\WINDOWS
SERVER_SIGNATURE=
SERVER_SOFTWARE=Apache
SERVER_NAME=localhost
SERVER_ADDR=127.0.0.1
SERVER_PORT=8081
REMOTE_ADDR=127.0.0.1
DOCUMENT_ROOT=F:/coral_mini/MiniServer/www
SERVER_ADMIN=fred@www.somedomain.com
SCRIPT_FILENAME=F:/coral_mini/MiniServer/www/vbs_test/test6.vbs
REMOTE_PORT=1762
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
QUERY_STRING=
REQUEST_URI=/vbs_test/test6.vbs
SCRIPT_NAME=/vbs_test/test6.vbs

Access environment variables

This code snippet shows how to access environment variables and post data:

Set wshShell = CreateObject( "WScript.Shell" )'Create shell
Set objEnv = wshShell.Environment( "Process" )'Access environment
method       = objEnv("REQUEST_METHOD")       'Get method
query_string = objEnv("QUERY_STRING")         'Get query string
If method="POST" Then                         'Check method flag
  post_data    = WScript.Stdin.ReadAll        'Get post data
End If

Note: The default method is GET. Attempting to read POST data when the method is set to GET results in the following error: "Input past end of file". Therefore, before reading POST data, always check that the method is set to POST.
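
The same "check the method before reading standard input" logic can be sketched in JavaScript. The helper name readCgiInput and its two parameters are hypothetical; the environment and the standard input reader are passed in as arguments so the sketch can be run with simulated requests, without a web server.

```javascript
// Hypothetical helper mirroring the VBScript snippet above.
// env       - object holding the CGI environment variables
// readStdin - function returning everything on standard input as a string
function readCgiInput(env, readStdin) {
  const method = env.REQUEST_METHOD || "GET";  // the default method is GET
  const queryString = env.QUERY_STRING || "";  // data appended to the URL
  let postData = "";
  if (method === "POST") {                     // only read stdin for POST
    postData = readStdin();
  }
  return { method, queryString, postData };
}

// Simulated GET request: standard input must not be touched.
const getRequest = readCgiInput(
  { REQUEST_METHOD: "GET", QUERY_STRING: "Language=English" },
  () => { throw new Error("stdin must not be read for GET"); }
);
console.log(getRequest);

// Simulated POST request: the data arrives on standard input.
const postRequest = readCgiInput(
  { REQUEST_METHOD: "POST", QUERY_STRING: "" },
  () => "text=Cat&Language=English"
);
console.log(postRequest);
```

Under a real CGI host running Node.js, one might pass process.env as the first argument and a function that reads standard input as the second; that wiring is an assumption and is not tested here.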


Test Script 7 - Raw data form example

Raw data test script:

This script contains two forms. These send data either using GET or POST. It displays the method used and data sent. Data is displayed raw, meaning that it is not decoded or separated into variables.

  • Create a new file test7.vbs with the content shown below.
  • Save to test folder \www\vbs_test
  • Start Apache if not already running
  • Enter: http://localhost:8081/vbs_test/test7.vbs into browser.

Click either the Get or Post buttons. Encoded data is sent to the server and displayed.

Default string entered:

Cat  & dog=50a%

There is a hidden string:

English

This gives the encoded string as shown below:

text=Cat++%26+dog%3D50a%25&Language=English
'!c:/windows/system32/cscript //nologo
Wscript.Echo "Content-type: text/html" & vbLF & vbLF
WScript.Echo "<title>Test 7</title>"

'--Access environment variables and post data
Set wshShell   = CreateObject( "WScript.Shell" )   'Create shell
Set objEnv     = wshShell.Environment( "Process" ) 'Access environment
method         = objEnv("REQUEST_METHOD")          'Get method
query_string   = objEnv("QUERY_STRING")            'Get query string
If method="POST" Then
  post_data    = WScript.Stdin.ReadAll             'Get post data
End IF
Set wshShell   = Nothing                           'clean-up
Set objEnv     = Nothing

'--Display environment variables and post data
WScript.Echo  "Method:      " & method       & "<br />"
WScript.Echo  "QueryString: " & query_string & "<br />"
WScript.Echo  "Post data:   " & post_data    & "<br />" & "<br />"

'--Form method post
WScript.Echo  "<form method=""POST"" action=""test7.vbs"" >"
WScript.Echo  "  <input type=""text"" name=""text"" size=""50"" value=""Cat  & dog=50a%"">"
WScript.Echo  "  <input type=""hidden"" name=""Language"" value=""English"">"
WScript.Echo  "  <input type=""submit"" value=""Post"">"
WScript.Echo  "</form>"

'--Form method get
WScript.Echo  "<form method=""GET"" action=""test7.vbs"" >"
WScript.Echo  "  <input type=""text"" name=""text"" size=""50"" value=""Cat  & dog=50a%"">"
WScript.Echo  "  <input type=""hidden"" name=""Language"" value=""English"">"
WScript.Echo  "  <input type=""submit"" value=""Get"">"
WScript.Echo  "</form>"

Wscript.Quit 0                                     'return exit code

Data organisation

Data is organised as variable-value pairs, with each variable joined to its value by the special character "=". These pairs are concatenated using the special character "&". Any special characters appearing in the value string are encoded. It’s an elegant structure, making it easy to extract variables and their associated data.

  • Using the above string as an example: text=Cat++%26+dog%3D50a%25&Language=English
  • Split the string at "&" to give variable-value pairs
    • text=Cat++%26+dog%3D50a%25
    • Language=English
  • Split each variable-value pair at the "=" character
    • Variable: text
    • Value: Cat++%26+dog%3D50a%25
    • Variable: Language
    • Value: English
  • Decode each value (replace each "+" with a space, then decode the % sequences)
    • Variable: text
    • Value: Cat  & dog=50a%
    • Variable: Language
    • Value: English
  • Save results in a dictionary (associative array)
    • array_data[text] = "Cat  & dog=50a%"
    • array_data[Language] = "English"
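
The steps above can be sketched in JavaScript as follows. The function name parseQuery is hypothetical, and a Map stands in for the dictionary. This is a minimal illustration under those assumptions; for example, decodeURIComponent raises an error on a malformed % sequence, which the sketch does not handle.

```javascript
// Hypothetical helper: split an encoded data string into decoded name/value pairs.
function parseQuery(dataString) {
  const result = new Map();                      // plays the role of the dictionary
  for (const pair of dataString.split("&")) {    // split at "&" into pairs
    if (pair === "") continue;                   // skip empty pairs
    const eq = pair.indexOf("=");                // split each pair at the first "="
    const rawName = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    // Replace "+" with a space first, then decode the % sequences.
    const name = decodeURIComponent(rawName.replace(/\+/g, " "));
    const value = decodeURIComponent(rawValue.replace(/\+/g, " "));
    result.set(name, value);                     // save in the dictionary
  }
  return result;
}

const data = parseQuery("text=Cat++%26+dog%3D50a%25&Language=English");
console.log(data.get("text"));      // Cat  & dog=50a%
console.log(data.get("Language"));  // English
```

Replacing "+" before decoding matters: an encoded %2B then decodes to a literal plus sign rather than being turned into a space.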


Test Script 8 - Function Get CGI variables

Get CGI variables test script:

This script contains two forms. These send data either using GET or POST. It uses a function to separate variables sent and decodes their corresponding values which are then displayed.

  • Create a new file test8.vbs with the content shown below.
  • Save to test folder \www\vbs_test
  • Start Apache if not already running
  • Enter: http://localhost:8081/vbs_test/test8.vbs into browser.

Click either the Get or Post buttons; encoded and decoded data are displayed.

Note: After testing, the encoded data display (the lines marked "Test code ... Remove") can be removed from the function.


Function cgi_variables():

  • Received data is split using split(data_string,"&") and the result saved to array variable_array.
  • This array is scanned using a for loop. Each variable-value pair is split using split(variable_array(i), "=").
  • Both parts require further processing: cgi_variable(0) holds the variable name and cgi_variable(1) holds its value.
  • In each part the special character "+" is first replaced with a space and the result is then decoded, giving decoded_value1 (name) and decoded_value2 (value). The order matters: an encoded %2B must decode to a literal "+", not to a space.
  • The index decoded_value1 and value decoded_value2 are added to the dictionary.
  • Finally, the function returns the dictionary.
'!c:/windows/system32/cscript //nologo
Wscript.Echo "Content-type: text/html" & vbLF & vbLF
WScript.Echo "<title>Test 8</title>"

'=== Function Get CGI variables ===============================================
Function cgi_variables()
 Set cgi_hash = CreateObject("Scripting.Dictionary")'Create associative array

 '--Access environment variables and post data
 Set wshShell   = CreateObject( "WScript.Shell" )   'Create shell
 Set objEnv     = wshShell.Environment( "Process" ) 'Access environment
 method         = objEnv("REQUEST_METHOD")          'Get method
 data_string    = objEnv("QUERY_STRING")            'Get query string
 If method="POST" Then                              'If post data read stdin
   data_string  = WScript.Stdin.ReadAll             'Get post data
 End IF
 Set wshShell   = Nothing                           'clean-up
 Set objEnv     = Nothing

'===Test code ===Remove================================
WScript.Echo  "Method:      " & method       & "<br />"
WScript.Echo  "QueryString: " & data_string  & "<br />"
'================================Test code ===Remove===

 '--Save variable-value pairs to array
 variable_array = split (data_string, "&")          'split and save to array

 '--Populate the scripting dictionary
 for i = 0 to ubound(variable_array)                 'Scan array
  cgi_variable = split (variable_array(i), "=")      'Split variable-value pairs
  If ubound(cgi_variable) >= 0 Then                  'Skip empty pairs (e.g. "a=1&&b=2")
   decoded_value1 = replace(cgi_variable(0), "+", " ")'Replace special character
   decoded_value1 = unescape(decoded_value1)         'Decode encoded name
   decoded_value2 = ""                               'Default when no value was sent
   If ubound(cgi_variable) > 0 Then                  'Value part present
    decoded_value2 = replace(cgi_variable(1), "+", " ")'Replace special character
    decoded_value2 = unescape(decoded_value2)        'Decode encoded value
   End If
   cgi_hash(decoded_value1) = decoded_value2         'Add to dictionary (repeated name: last wins)
  End If
 next                                                'Get next pair

 Set cgi_variables = cgi_hash                       'Return dictionary
End Function
'=========================================== End Function Get CGI variables ===

'--Example of cgi dictionary function use.
Set cgi = cgi_variables()                                    'Get cgi dictionary 
wscript.echo "Text:    "  & cgi.item("text")      & "<br />" 'Display text var
wscript.echo "Language: " & cgi.item("Language")  & "<br />" 'Display language var
wscript.echo "<br />" 'Display blank line

'--Form method post
WScript.Echo  "<form method=""POST"" action=""test8.vbs"" >"
WScript.Echo  "  <input type=""text"" name=""text"" size=""50"" value=""Cat  & dog=50a%"">"
WScript.Echo  "  <input type=""hidden"" name=""Language"" value=""English"">"
WScript.Echo  "  <input type=""submit"" value=""Post"">"
WScript.Echo  "</form>"

'--Form method get
WScript.Echo  "<form method=""GET"" action=""test8.vbs"" >"
WScript.Echo  "  <input type=""text"" name=""text"" size=""50"" value=""Cat  & dog=50a%"">"
WScript.Echo  "  <input type=""hidden"" name=""Language"" value=""English"">"
WScript.Echo  "  <input type=""submit"" value=""Get"">"
WScript.Echo  "</form>"

Wscript.Quit 0                                     'return exit code

Summary

URL encoding and decoding is a very important part of creating dynamic web sites. Although the above is not a definitive guide, it provides enough detail to get you up and running.

Of particular importance is the Get CGI variables function. You can use it as is, roll your own, or use a third-party version. A better solution is to use a class; this is covered on the VBScript CGI Property Class page.

At this stage you have enough information to create dynamic web pages written in VBScript! Test script eight is a working example of a dynamic web page. Perhaps you would prefer writing your scripts in JavaScript (JScript); this is covered next.

Note: What has been previously covered applies equally to JavaScript.

Where to next

The next page covers JavaScript (JScript) CGI.