HISTORY.TXT
(for DosLynx v0.34b, June 2007)
by Fred C. Macall
Introduction
In this document, I'll try to take up where I left off in the last
history.txt and not rehash the history of the previous versions.
However, there is a cumulative list of unresolved bugs, issues, and To Do(s)
at the end. If you want to know everything else there is to know about my
previous releases of DosLynx, you can get much of that history from:
http://users.ohiohills.com/fmacall/dlx2xdoc.zip .
That archive collects the info.htm and history.txt documents for all eight
of my DosLynx v0.2xb releases. For the rest, the documents for
DosLynx v0.30b, v0.31b, v0.32b, and v0.33b are available at:
http://users.ohiohills.com/fmacall/dlx30/info.htm
http://users.ohiohills.com/fmacall/dlx30/history.txt
http://users.ohiohills.com/fmacall/dlx31/info.htm
http://users.ohiohills.com/fmacall/dlx31/history.txt
http://users.ohiohills.com/fmacall/dlx32/info.htm
http://users.ohiohills.com/fmacall/dlx32/history.txt
http://users.ohiohills.com/fmacall/dlx33/info.htm
and
http://users.ohiohills.com/fmacall/dlx33/history.txt .
Old Business
Remember that HTHnSGut( ), in HTFWRITE.CPP, concerns itself with (among other
things) providing local filename suggestions for objects being "downloaded".
As many times as I already have worked to improve its filename suggestions,
I had missed noticing that such suggestions were quite confusing when they
contained any space(s). Perhaps my oversight resulted from an expectation
that URL(s) won't contain any space(s)! Anyway, once I finally realized that
this occasionally was an issue, I soon dealt with it. As with extra
period(s) in filename suggestions, the solution was to scan tentative
suggestions for space(s) and change them to underbars.
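
The space-to-underbar step can be sketched as follows. This is only an
illustration of the idea, not the code in HTFWRITE.CPP; the function name
fixspaces( ) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper (not from HTFWRITE.CPP): change each space in a
   tentative filename suggestion to an underbar, in place. */
void fixspaces(char *name)
{
    if (name == NULL)
        return;
    for (; *name != '\0'; ++name)
        if (*name == ' ')
            *name = '_';
}
```

Given "my file name.txt", this leaves "my_file_name.txt" in the buffer.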
Apparently, previous work, in HTTP.C and HTAABROW.C on issues associated
with authentication, had paid off. I found DosLynx had come to be able to
heartily and robustly loop on an authentication request arising in connection
with an access to a root directory. That is, an access for a URL containing
only a minimum path and no filename part. Here, I put in a two part fix.
The first part added loop recognition and counting to HTLoadHTTP( ), in
HTTP.C, and HTAA_shouldRetryWithAuth( ), in HTAABROW.C. This part is
intended to break any new authentication loop(s) that might be uncovered in
the future. I've added a global, int httprtct, to HTTP.C. It gets cleared
just ahead of HTLoadHTTP( )'s retry label. Then, just ahead of
HTLoadHTTP( )'s HTAA_shouldRetryWithAuth( ) call, it gets incremented and
checked, for a value of six or more. Once that value has been reached, a
message is produced and the loop is exited. HTAA_shouldRetryWithAuth( )
clears httprtct if the user enters an(other) ID and password. So, a loop
won't be broken as long as the user wants to keep retrying their "Login".
Of course, that wasn't a consideration with the loops discussed here.
These loops didn't bother offering the user a Login retry.
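
A minimal sketch of the loop counting idea follows. It is not the HTTP.C
code: load_with_limit( ), HTTPRTMAX, and the two stub functions are
hypothetical names, and the step where a user supplied ID and password
clears httprtct is left out.

```c
#define HTTPRTMAX 6   /* assumed name; the loop is exited at six or more */

int httprtct;         /* global retry counter, named as in the text */

/* Demo stubs: a server that always, or never, asks for authentication. */
int always_asks(void) { return 1; }
int never_asks(void)  { return 0; }

/* Hypothetical stand-in for the retry loop.  asks_for_auth() returns
   non-zero when the response is another authentication request.
   Returns the number of load attempts made. */
int load_with_limit(int (*asks_for_auth)(void))
{
    int attempts = 0;

    httprtct = 0;                     /* cleared ahead of the retry point */
    for (;;) {
        ++attempts;
        if (!asks_for_auth())         /* no authentication request: done */
            break;
        if (++httprtct >= HTTPRTMAX)  /* incremented and checked */
            break;                    /* loop recognized: give up */
        /* a real browser would compose credentials and retry here */
    }
    return attempts;
}
```

With the always_asks( ) stub, the sketch gives up after six attempts instead
of looping indefinitely.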
This time, the trouble stemmed from HTLoadHTTP( )'s call to
HTAA_composeAuth( ), also in HTAABROW.C. When the third parameter in that
call (HTLoadHTTP( )'s docname) is null or points to an empty string,
HTAA_composeAuth( ) isn't able to do its job. So, the second part of the fix
was to give HTLoadHTTP( ) checking for this troublesome case. When it finds
this case, it now substitutes a wild card string ("*") for the third
parameter in its HTAA_composeAuth( ) call. Then, HTAA_composeAuth( ) does
its job. And, any remaining authentication loop only persists for as long as
the user's response(s) will sustain it.
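
The substitution described above amounts to a small guard ahead of the call.
The helper name authdocname( ) below is hypothetical; in DosLynx the check
sits inside HTLoadHTTP( ) itself.

```c
#include <stddef.h>

/* Hypothetical helper: pick the string to hand on as the third parameter.
   A null or empty docname is replaced with a wild card string. */
const char *authdocname(const char *docname)
{
    if (docname == NULL || *docname == '\0')
        return "*";
    return docname;
}
```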
Over the years, I've gradually improved DosLynx' accuracy in distinguishing
or shading anchor or link text. However, an occasional inaccuracy was still
showing up, usually near the beginning of a line. This was attributable to
an anchor's established class TextAttribute object content being made
inaccurate by a TURLView::split_line( ) operation. The whole process,
conducted by TURLView::FormatHTML( ) in TURLVIEW.CPP, goes roughly as
follows:
Driven by intermediate file content, 'FormatHTML( ) assembles a display line
until it realizes that the desired display width of eighty characters has
been reached or exceeded. If the desired width has been exceeded, the line
in process gets split at some point prior to the eighty characters mark.
That gets done by 'split_line( ), also in TURLVIEW.CPP. It places a four
octet struct Line object in the display file at each split point.
And, at the same time, it drops any already assembled space character(s)
adjacent to the split point. As a result of this process, display file
offsets, for points beyond the split point, change by a few character(s).
So, any anchor in process and any anchor(s) recently processed, at the time
of a split, are more or less likely to need adjustment(s). That is, they're
likely to need their sli_StreamOffset and/or their sli_StreamExtent display
file offset value(s) adjusted.
As described under the heading Forms Support in the history.txt for
DosLynx v0.24b, I have already dealt with the case of an anchor in process at
the time of a split. So, the work of accumulating a net display file offset
adjustment value, in 'split_line( ), had already been accomplished.
Only the case(s) of recently processed anchor(s), at the time of a split,
remained to be treated.
Global TextAttribute * TAp_newanc identifies any anchor being developed by
'FormatHTML( ). 'split_line( ) uses it to identify any anchor in process at
the time of a split. I've added a new global: TextAttribute * TAp_freshanc,
to identify the most recently completed anchor. TURLView::endAnchor( ), also
in TURLVIEW.CPP, now copies TAp_newanc into TAp_freshanc, just before
clearing TAp_newanc, when it completes an anchor. So, 'split_line( ) is now
able to check any TAp_freshanc anchor's location, with respect to its present
split point, as well. When the TAp_freshanc anchor follows or straddles the
present split point, 'split_line( ) now adjusts its display file offset
value(s). Whether there is much need to carry this kind of anchor adjustment
back to any earlier anchor(s) remains to be seen.
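
The adjustment rule can be sketched roughly as below. This is an assumption
laden illustration, not the TURLVIEW.CPP code: struct anchor_span stands in
for class TextAttribute, both members are assumed to hold absolute display
file offsets (start and end), and adjust_anchor( ) is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical stand-in for the anchor data kept in a TextAttribute. */
struct anchor_span {
    long sli_StreamOffset;   /* assumed: offset where the anchor starts */
    long sli_StreamExtent;   /* assumed: offset where the anchor ends */
};

/* Apply a split's net offset change (delta) to one anchor.
   An anchor wholly before the split point is left alone.
   A straddling anchor only has its end moved.
   A following anchor has both values moved. */
void adjust_anchor(struct anchor_span *a, long split, long delta)
{
    if (a == NULL)
        return;
    if (a->sli_StreamOffset >= split)
        a->sli_StreamOffset += delta;
    if (a->sli_StreamExtent > split)
        a->sli_StreamExtent += delta;
}
```

The same routine would serve for both the anchor in process and the most
recently completed one.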
At some point, early this year, it became clear to me that my kludge, for
identifying the TURLView instance to process, in case cmDownload in
TURLWindow::handleEvent( ) in TURLWIN4.CPP, wasn't working reliably.
To fix this, I've added a case cmDownload to TURLView::handleEvent( ), in
TURLVIE9.CPP. It relays its event in a new cmDownloadChild event initiated
via a 'putEvent(TE_load) call. This call's TE_load.message.infoPtr conveys
"this" TURLView's instance pointer. I've defined cmDownloadChild, in
GLOBALS.H, and changed case cmDownload to case cmDownloadChild, in
TURLWIN4.CPP.
Limited HTML <DIV> tag support. The DosLynx ... or ... tag. That is a ... or
...-like tag with a </DIV> end tag expected to follow. Next, came work in
HTML.C.
At about this time, I also made some reductions in HTML.C.
I commented-out all references to in_word.
Now, HTML_put_character( )'s default case will always change Line Feed to
Space. HTML_put_character( ) also was revised to change Form Feed,
Carriage Return, and Horizontal Tab to Space. In addition, I reduced
HTML_put_string( ) to simply call HTML_put_character( ) in a loop.
I needed a new DosLynx "style" for align=right alignment.
(DosLynx internal "styles" should be distinguished from HTML styles.)
Providing this involved making an addition to DEFAULTS.C and to
get_styles( ), in HTML.C. HTML_DIV cases for HTML.C's HTML_start_element( )
and HTML_end_element( ) and an update for HTML_start_element( )'s HTML_P case
completed the parsing work for the new support. It then developed that
TURLView::split_line( )'s i_spare calculation also needed correction, before
align=right processing worked correctly.
SMTP Host Port Number Configuration
In the fall of 2006, ISPs in Europe and the Middle East were reported
changing their SMTP servers, for mail origination, from listening on TCP/IP
port 25 to listening on TCP/IP port 587. To accommodate the possibility that
your SMTP server will be listening on TCP/IP port 587, or some other, I've
made this configurable. You add :portnumber to the end of your smtphost=
configuration, in DOSLYNX.CFG, if your SMTP Host isn't using the traditional
port 25.
Providing this new configuration involved: Changing const int i_SMTPPort, in
TDOSLY15.CPP, to a (global) unsigned int. Adding a declaration for
i_SMTPPort to GLOBALS.H. And, extending the smtphost= configuration data
processing clause, in TDosLynx::EvalConfigFile( ) in TDOSLY14.CPP, to
identify any :portnumber configuration and copy its value into i_SMTPPort.
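
A minimal sketch of :portnumber handling follows. It is not the TDOSLY14.CPP
code: parsesmtphost( ) is a hypothetical helper, and the range check is an
added assumption.

```c
#include <stdlib.h>
#include <string.h>

unsigned int i_SMTPPort = 25;   /* traditional port, unless configured */

/* Hypothetical helper: given the value of a smtphost= line, such as
   "mail.example.com:587", copy any port number into i_SMTPPort and cut
   the value back to the bare host name. */
void parsesmtphost(char *value)
{
    char *colon = strrchr(value, ':');

    if (colon != NULL && colon[1] != '\0') {
        unsigned long port = strtoul(colon + 1, NULL, 10);

        if (port > 0 && port <= 65535UL) {
            i_SMTPPort = (unsigned int) port;
            *colon = '\0';
        }
    }
}
```

So a configuration line such as smtphost=mail.example.com:587 would select
port 587, while a value without :portnumber leaves port 25 in effect.
(mail.example.com is only a placeholder name.)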
Making Way for Cookies Support
The time had come to add Cookies Support to DosLynx. And, I knew I would be
adding some new module(s) for that. So, to keep the number of modules making
up DosLynx well below 255, I decided to consolidate some existing small
modules.
I started by changing TURLWindow::rvrefer( )'s declaration, in class
TURLWindow in TURLWIND.H, to make its parameter optional, with a default of
zero. I've also removed the declaration for TURLWindow::rlrefer( ).
'rvrefer( ) and 'rlrefer( ) were introduced in the history.txt for
DosLynx v0.32b. Once 'rvrefer( ), defined in TURLWIN4.CPP, had been extended
to act like 'rlrefer( ) when called with no parameter, I was able to change
all the 'rlrefer( ) calls to 'rvrefer( ) calls. These were in TURLWIN4.CPP,
TURLWIN5.CPP, and TURLWI10.CPP. That let me eliminate 'rlrefer( ) and the
small module, TURLWI15.CPP, that contained it.
Next, I turned to consolidating small overlay modules, each containing a
single function. I looked for their caller(s) and consolidated each of them
in the module containing its only caller, where possible.
I've consolidated: TURLVI21.CPP with TURLVIE9.CPP, into TURLVIE9.CPP.
THEAPVI4.CPP with THEAPVIE.CPP, into THEAPVIE.CPP. TSOCKVI4.CPP with
TSOCKVIE.CPP, into TSOCKVIE.CPP. TDISKVI3.CPP with TDISKVI2.CPP, into
TDISKVI2.CPP. TDOSLY11.CPP with TDOSLYNX.CPP, into TDOSLYNX.CPP.
And, MEMSTRAT.CPP with MAIN.CPP, into MAIN.CPP. If I had this to do over,
I might skip that consolidation with MAIN.CPP. Because, I suppose MAIN.CPP
may have to remain loaded in overlay memory through the whole run.
But, MEMSTRAT.CPP was small enough that I don't feel much need, now, to undo
its consolidation.
Here, I came to realize that there seems to be only one downside to
consolidating resident or non-overlay modules. No matter what their size(s),
up to a point. The one downside I can see in this is that a little more
effort would be needed to change a consolidated function to an overlay
module, if that should ever be desired. Following this view, I've
consolidated: TEXTATT6.CPP, TEXTATT7.CPP, TEXTATT8.CPP, TEXTATT9.CPP,
TEXTAT10.CPP, and TEXTAT12.CPP, all into TEXTATT6.CPP.
TDOSLYN2.CPP, TDOSLYN8.CPP, TDOSLY13.CPP, and TTEMPNA3.CPP, all into
TDOSLYN2.CPP. And, TURLVIE8.CPP, TURLVI10.CPP, TURLVI11.CPP, and
TURLVI25.CPP, all into TURLVIE8.CPP.
Specifications for Cookies Support
RFCs 2109, 2964, and 2965 all deal with Cookies support. In particular,
RFC2109 and RFC2965 have been described as "Internet standards track"
specifications. (RFC2109 has been Obsoleted by RFC2965.)
However, the Max-Age= and Version= attributes, introduced in RFC2109 and
confirmed in RFC2965, appear in very few of the HTTP Set-Cookie: Header
fields that I've collected over the past five years. And, I think I have
seen only one real example of the Set-Cookie2: Header field, introduced in
RFC2965! So, what's going on here? As best as I have been able to tell, it
appears that the HTTP server developers have been satisfied with the original
Netscape HTTP Cookies Specification, contained in:
http://wp.netscape.com/newsref/std/cookie_spec.html
Well, if the server developers are ignoring the RFCs, should I worry about
supporting them in DosLynx? I've taken the approach of essentially following
the Netscape Specification. However, in areas where the
Netscape Specification seems vague or allows Web server(s) to be extra
aggressive, I've incorporated some of the defensive provisions of the RFCs.
In particular, DosLynx requires all hosts attempting to share Cookie(s)
across a "domain" to have exactly one additional qualification in their
names. That is, DosLynx allows www.yahoo.com and groups.yahoo.com to share
Cookie(s), as they share the domain: yahoo.com. However, DosLynx doesn't
support attempts by tech.groups.yahoo.com to participate in that same
yahoo.com domain. (Sharing remains possible for the latter host, via the
"domain": groups.yahoo.com. However, Cookie(s) for that domain won't be
sent to the "host": groups.yahoo.com, by DosLynx.) Also, DosLynx is careful
to neither accept nor send Cookie(s) from/to "third-party hosts" that might
be introduced to it in the course of redirection. (As DosLynx doesn't
automatically fetch inline images nor embedded objects, those kinds of
"unverifiable transactions" aren't issue(s) for it.)
User Interfaces for Cookies Support
These are the configuration interface, a new Options|Manage HTTP Cookie Mode
menu entry or command and its dialog, some new audit: messages, and some new
error messages. (A Cookie(s) file viewing and, possibly, editing facility or
tool remains to be provided.) This section primarily deals with the new
configuration and the new menu entry or command and its dialog. I thought I
would make these and the new audit: messages parallel those for
HTTP Referer: Header Field sending, added to DosLynx in version 0.32b.
With only a few exceptions:
For Cookies support, a new cookiedir= configuration item is needed, in
addition to the new cookiemode= configuration item.
cookiedir= configuration might be thought to be dispensable.
However, having it ensures that the user will always know where her Cookie(s)
are! Next, there was an aspect of the Options|Manage HTTP Referer Mode
dialog that I had found I didn't like much. That was its Toggle Mode
push button, that also exited the dialog. I never seemed to get away from
feeling a need to revisit the dialog after using that push button, to see if
the Mode had been toggled! Also, I decided that HTTP Cookie Mode needed a
third state, for Monitoring, in addition to its OFF and ON states.
The simple Toggle Mode push button wasn't appropriate for managing a set of
three possible states. So, a replacement was needed for it.
As usual for new configuration items and dialogs, the first step was to
declare globals for holding the configured values. These are
extern char * cpckydir, extern int ckymode, and
const unsigned short int cmCkyMode, all declared in GLOBALS.H.
Next, GLOBALS.CPP was provided with instantiations for cpckydir and ckymode.
And, TDosLynx::EvalConfigFile( ), in TDOSLY14.CPP, was extended to give these
globals any cookiedir= and cookiemode= configuration found. At the end of
'EvalConfigFile( ), a test was added for cpckydir. If it remains zero, or
unconfigured, all of ckymode's Enable bits (but not its audit bit(s)) get
zeroed, too. A new entry has been added to TDosLynx::initMenuBar( ), in
TDOSLYN9.CPP, for invoking the new Manage HTTP Cookie Mode dialog via the
new cmCkyMode command or event.
In spite of the three differences between HTTP Referer Mode and
HTTP Cookie Mode that I've already introduced, I wanted to consolidate their
dialogs as much as possible. I started with the code added to TDOSLYN7.CPP,
for HTTP Referer Mode, in DosLynx v0.32b. I've moved that into a new
TDosLynx::mHTTPmo( ) function, in new module MHTTPDLG.CPP.
In TDosLynx::handleEvent( ), in TDOSLYN7.CPP, former case cmRefMode has been
replaced with a call to 'mHTTPmo( ). And, new case cmCkyMode has been given
another call to 'mHTTPmo( ). These calls provide the distinctive data needed
by 'mHTTPmo( ) for customizing its dialog, for either Referer Mode or
Cookie Mode management.
I decided to stay with a push button for stepping through the options'
available modes or states, without leaving the dialog when pressed.
As Turbo Vision class TDialog seems to provide push buttons that always exit
a dialog, I've found it necessary to define a class mHTTPDlg, which extends
or inherits Turbo Vision class TDialog. class mHTTPDlg is declared in new
header file MHTTPDLG.H. Following the example of Borland
Technical Information document TI1158, it declares only constructor and
'handleEvent( ) member functions, of its own. These, too, are defined in new
module MHTTPDLG.CPP. mHTTPDlg::mHTTPDlg( ), which gets invoked by
'mHTTPmo( ), provides most of the new consolidated dialog's construction.
When refmode= or cookiemode= configuration enables an ON or Monitoring mode
or state, 'mHTTPDlg( ) provides a Change Mode push button. It's in the upper
right quadrant of the dialog's window. Near the Open button's relative
location, in the local file dialogs. (The Open button doesn't end its local
file dialog when used on a directory name. So, it provides something of a
precedent for the new Change Mode push button.) The Change Mode button
provides for stepping through the two or three modes or states available.
The new 'handleEvent( ) member function handles Change Mode button press
events without ending the dialog's run. A standard OK push button, at the
bottom of the dialog's window, provides for ending the dialog's run, and is
its default button.
In order to make the new dialog's status reporting line respond to
Change Mode button presses, a new class TVarText, which extends or inherits
Turbo Vision class TStaticText, was needed. class TVarText is declared in
new header file TVARTEXT.H. Following the example of Borland
Technical Information document TI1532, it declares only inline constructor
and 'setText( ) member functions, of its own.
A new "glue" function: TDosLynx::upmHTTPD( ) gets called from 'mHTTPmo( )
and mHTTPDlg::handleEvent( ) when the dialog's status reporting line needs to
be updated. (The status line gets established with only a space, during
construction. It gets its first update as soon as the dialog is executed.)
'upmHTTPD( ) provides the text needed and calls TVarText::setText( ) for
setting it into place. 'mHTTPmo( ) and 'upmHTTPD( ) have been added to the
declaration of class TDosLynx, in TDOSLYNX.H, of course.
Receiving Cookies
An HTTP server offers to initiate a "session" with a user or her browser by
including Set-Cookie: field(s) in Response Header(s) being sent to the
browser. (That is, in its response to a user's request for a URL.)
So, the first part of the additions for Cookies handling came in HTMIME.C,
where DosLynx interprets the Response Header(s) that it receives.
I extended HTMIME_put_character( ), there, to recognize and accumulate
Set-Cookie: Header line(s). Doing this involved adding SET_COOKIE and
SET_CKY_FIELD values to (global) enum MIME_state, in HTMIME.C. I also felt a
need to add a variation of my favorite Carriage Return character filter to
HTMIME_put_character( ), in order to improve HTMIME_put_character( )'s
predictability in cases of non-standard end-of-line sequence(s).
This filter was displayed under the heading:
Carriage Return Filtering for Presented Documents, in the history.txt for
DosLynx v0.31b. This filter's addition led to having to clarify exactly what
constitutes the "blank line" that marks the end of the HTTP Headers, in a
received document. I've settled on the following four sequences: x CR CR ,
x CR LF CR y , x LF CR , and x LF LF . (Where x stands for any character,
other than CR or LF . And, y stands for any character at all.
Though, y might be expected to be a LF , normally.)
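
The four sequences can be checked with a scan like the one sketched below.
This is only an illustration: HTMIME_put_character( ) works a character at a
time, whereas the hypothetical endofheaders( ) here scans a whole buffer.

```c
/* Hypothetical helper: return the offset just past the "blank line" that
   ends the Headers, or -1 if none of the four sequences is found.
   x is any character other than CR or LF; y is any character at all. */
long endofheaders(const char *buf, long len)
{
    long i;

    for (i = 0; i + 2 < len; ++i) {
        if (buf[i] == '\r' || buf[i] == '\n')
            continue;                          /* not a valid x */
        if (buf[i + 1] == '\r') {
            if (buf[i + 2] == '\r')
                return i + 3;                  /* x CR CR */
            if (buf[i + 2] == '\n' && i + 4 < len && buf[i + 3] == '\r')
                return i + 5;                  /* x CR LF CR y */
        } else if (buf[i + 1] == '\n') {
            if (buf[i + 2] == '\r' || buf[i + 2] == '\n')
                return i + 3;                  /* x LF CR or x LF LF */
        }
    }
    return -1;
}
```

For the usual CR LF CR LF ending, the x CR LF CR y case applies, with the
final LF taken as y.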
While working on Carriage Return filtering in HTMIME.C, I discovered that
DosLynx had no good way to deal with a TCP/IP end-of-file indication coming
while in HTTP Headers processing! The most disruptive thing that happens in
this case is that TURLView::HTp_HyperDoc doesn't get set, beyond its initial
value of zero. To deal with this issue, I've added a check of 'HTp_HyperDoc,
to TURLView::TURLView( ) in TURLVIE2.CPP, after its TURLView::loadURL( )
call. If 'HTp_HyperDoc is found to be zero while TURLView::B_valid and
B_considerValid are both True, a new doslynx: Unable to load . . . message
gets issued. And, 'B_valid gets cleared to False. Clearing 'B_valid keeps
DosLynx from attempting any 'HTp_HyperDoc based presentation processing.
And, the new message prepares the user for the resulting absence of the
expected presentation.
If any mode or status bit(s) are set in ckymode when the beginning of a
Set-Cookie: Header field is encountered, that Header line gets extracted from
the received Headers. So, little new effort is required here as long as
HTTP Cookie Mode remains OFF. Once the end of an extracted
Set-Cookie: Header line is reached, ckymode gets rechecked.
If HTTP Cookie Mode is (Enabled and) ON, the received Set-Cookie: Header line
gets passed to new function savecky( ), in new module CKIESRCV.CPP.
So, the effort required for loading savecky( ), in the DosLynx Real Mode
version, doesn't get expended unless HTTP Cookie Mode is ON.
savecky( ) begins by rechecking ckymode's bits. And, by verifying: That the
present transaction's "original request host" has a domain levels score of
three or more. And, that the present "request host" is the original request
host or a host with a domain that matches the original request host's domain.
This verification is accomplished with the help of URLoader( ), in
TURLWIND.CPP, HTLoadHTTP( ), in HTTP.C, and several new globals declared in
GLOBALS.H and instantiated in GLOBALS.CPP. These new globals are
int URLdrect, char * cprqhost, and char * cpnvhost. A related new global is
char * cprqpath. It is used for checking Cookie Path= attribute fields, as
described below. All four are initialized to zero. Another related new
global is unsigned long int ulisessn. It is initialized, with a
time(& (long int) ulisessn) call, by TDosLynx::TDosLynx( ) in TDOSLYNX.CPP.
As long as the user doesn't try to do anything tricky with their system's
date and time settings, ulisessn's value will provide a unique DosLynx run or
session identification. ulisessn is discussed further, below, as well.
URLoader( )'s recursive or self-referential calls have been bracketed with
++URLdrect and --URLdrect statements. As redirection is handled by
recursion in URLoader( ), URLdrect's resulting value provides an indication
that a redirected transaction is in progress. This is an
"unverifiable transaction". In such a transaction, the latest request host
is deemed a "third-party host" if its domain doesn't match the (verifiable)
original request host's domain.
HTLoadHTTP( ) has been extended to make further use of the hostname string
it develops in the course of developing its Host: Request Header field.
When URLdrect is clear, HTLoadHTTP( ) copies the hostname string into
cprqhost, and clears cpnvhost. When URLdrect is set, it copies the hostname
string into cpnvhost. So, cprqhost always contains the latest original
request host. And, cpnvhost contains any request host that savecky( ) needs
to verify is not a third-party host. HTLoadHTTP( ) has also been extended
to make further use of the URL string that it uses in developing its GET or
POST request. It copies the URL's complete path, from the URL string's first
post-host name slash to its last slash, into cprqpath.
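
The path extraction can be sketched as follows. requestpath( ) is a
hypothetical helper, not the HTTP.C code; it assumes the URL has the usual
scheme://host/path form and makes no special allowance for query strings.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'ed copy of the URL's complete path,
   from the first slash after the host name through the last slash.
   Returns NULL if there is no such slash or no memory.  The caller frees
   the result. */
char *requestpath(const char *url)
{
    const char *host = strstr(url, "://");
    const char *first, *last;
    size_t n;
    char *path;

    host = (host != NULL) ? host + 3 : url;   /* skip any scheme part */
    first = strchr(host, '/');
    if (first == NULL)
        return NULL;
    last = strrchr(first, '/');               /* never NULL here */
    n = (size_t) (last - first) + 1;
    path = (char *) malloc(n + 1);
    if (path == NULL)
        return NULL;
    memcpy(path, first, n);
    path[n] = '\0';
    return path;
}
```

For http://www.example.com/a/b/c.html this yields /a/b/ (www.example.com is
only a placeholder host).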
savecky( ) checks cprqhost's domain levels score by calling new function
ckdomlvl( ), also in new module CKIESRCV.CPP. ckdomlvl( ) figures its score
by counting the periods in the given string and adding a bonus point for a
suffix length of three or more characters, e.g.: .com, .net, .info, etc.
But, not .uk nor .hu . savecky( ) requires a domain levels score of three or
more. So, for Cookies support purposes, www.google.com and www.google.co.uk
are acceptable request host names. But, google.com and google.co.uk aren't.
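
The scoring rule can be sketched as below. domscore( ) is a hypothetical
name used for illustration; it is not the ckdomlvl( ) source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of the domain levels score: one point per period,
   plus a bonus point when the suffix after the last period has three or
   more characters. */
int domscore(const char *name)
{
    const char *p;
    const char *lastdot = NULL;
    int score = 0;

    for (p = name; *p != '\0'; ++p) {
        if (*p == '.') {
            ++score;
            lastdot = p;
        }
    }
    if (lastdot != NULL && strlen(lastdot + 1) >= 3)
        ++score;
    return score;
}
```

Under this rule www.google.com and www.google.co.uk both score three, while
google.com and google.co.uk both score two.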
If cpnvhost isn't zero, indicating a possible non-original request host,
savecky( ) calls new function ck43rdpy( ), also in new module CKIESRCV.CPP.
ck43rdpy( ) always works on cprqhost and cpnvhost. If they match fully, in a
case insensitive compare, ck43rdpy( ) indicates no third-party host
involvement. Otherwise, it gives cprqhost and cpnvhost to new function
dommatch( ), also in new module CKIESRCV.CPP, for the decision.
To check for a domain match, dommatch( ) uses ckdomlvl( ) calls to verify
that both names it's been given have matching domain levels scores of three
or more. Then, it uses calls to isaddr( ), in WATTCP module UDP_NDS.C, to
verify that neither string is a TCP/IP address. Finally, it uses a case
insensitive compare to check for a match in everything following the first
period in each given string.
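
A sketch of the three checks follows. It is not the CKIESRCV.CPP code:
domains_match( ) and its helpers are hypothetical, "matching scores" is
taken to mean equal scores, and looks_like_addr( ) is a crude stand-in for
WATTCP's isaddr( ).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* One point per period, plus a bonus for a suffix of three or more. */
static int score(const char *s)
{
    const char *lastdot = strrchr(s, '.');
    int n = 0;

    for (; *s != '\0'; ++s)
        if (*s == '.')
            ++n;
    if (lastdot != NULL && strlen(lastdot + 1) >= 3)
        ++n;
    return n;
}

/* Stand-in for isaddr( ): true if only digits and periods are present. */
static int looks_like_addr(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; ++s)
        if (!isdigit((unsigned char) *s) && *s != '.')
            return 0;
    return 1;
}

/* Case insensitive compare of two whole strings. */
static int same_nocase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        ++a;
        ++b;
    }
    return *a == *b;
}

/* Returns 1 when the two names share a domain, 0 otherwise. */
int domains_match(const char *h1, const char *h2)
{
    const char *d1, *d2;

    if (score(h1) < 3 || score(h1) != score(h2))
        return 0;
    if (looks_like_addr(h1) || looks_like_addr(h2))
        return 0;
    d1 = strchr(h1, '.');
    d2 = strchr(h2, '.');
    return d1 != NULL && d2 != NULL && same_nocase(d1 + 1, d2 + 1);
}
```

The address check matters because two dotted addresses such as 192.168.1.10
and 10.168.1.10 would otherwise pass the compare of everything following the
first period.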
If savecky( ) detects an unsuitable original request host domain levels score
or a third-party host transaction, as described just above, it returns with
an indication that no Cookie has been saved. That indication is used, as
described below, for audit(ing) purposes. Otherwise, a suitable request host
has been verified. savecky( ) saves its name in char * cprhost and proceeds
to looking for Cookie(s) in the given Set-Cookie: Header line.
At the highest level, a Cookie is defined as a list of name(s) or name=value
field(s). Where the list's first name is one that isn't on a specified list
of "attribute" names. But, any additional name(s) in a Cookie are attribute
names from that specified list of attribute names. "Which specified list?"
You might be wondering. Well, the list of attribute names used in
CKIESRCV.CPP includes all the attribute names mentioned in the
Netscape Specification, all the additional attribute names mentioned in
the RFCs listed earlier, $ prefixed versions of some of these, and
$DLXsessn, for a total of fifteen attribute names. ($DLXsessn is explained
below.)
savecky( ) uses new function ckattrn( ), also in new module CKIESRCV.CPP, for
stepping through the names in the Set-Cookie: Header line it is given.
ckattrn( ) locates the first name at, or following white space at, a given
starting point in a string. It tries to find the located name in
CKIESRCV.CPP's list of fifteen attribute names, using case insensitive
compares. And, it scans for a comma (unless the Expires attribute name has
been located), semi-colon, or null ending the located name or name=value
field. ckattrn( ) makes only one kind of change in the string segment it
processes. It changes any field ending comma it finds to a semi-colon.
When it finds a Domain= or Path= attribute, savecky( ) uses new function
locvalue( ), also in new module CKIESRCV.CPP, for locating its value string.
Then it checks the located value string as follows. For Domain='s value:
If the string doesn't begin with a period, one is added. Then, the possibly
augmented string is given, together with cprhost, to dommatch( ) for
checking. For Path='s value: The string is compared with the cprqpath
string using a case sensitive comparison. The string must have a length less
than or equal to the length of the cprqpath string. The whole string must
exactly match the beginning of the cprqpath string.
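
The Path= test described above is a case sensitive prefix check. A minimal
sketch, with the hypothetical name pathok( ):

```c
#include <string.h>

/* Hypothetical helper: returns 1 when the Cookie's Path= value is no
   longer than the request path and exactly matches its beginning. */
int pathok(const char *cookiepath, const char *reqpath)
{
    size_t n = strlen(cookiepath);

    return n <= strlen(reqpath) && strncmp(cookiepath, reqpath, n) == 0;
}
```

So a Path= value of /a/ is accepted for a request path of /a/b/, while
/a/b/c/ and /A/ are not.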
Having found a subsequent unlisted name, or a null, in its given
Set-Cookie: Header line, savecky( ) considers all the Domain= and Path=
attribute(s) that it may have seen while scanning. If any were found
to be invalid, as described above, savecky( ) discards the tentative Cookie
it has just scanned and issues a new doslynx: Rejecting Cookie. message, to
the Messages window. Otherwise, savecky( ) makes a copy of the Cookie it has
just scanned and adds a prefix and suffix. These additions make a Cookie
record to be written in a Cookie(s) file.
The added prefix consists of the string: "$ddddd; ". That provides a place
for a leading zero(s) padded record length value. That gets copied in, over
the ddddd, once the Cookie's preparation has been completed.
The Cookie record length field(s) in DosLynx Cookie files simplify matching
buffer allocations to actual record(s), when these files are read back, a
record at a time, later.
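
Filling in the length prefix can be sketched as below. makerecord( ) is a
hypothetical helper, and it assumes the length value counts the octets that
follow the eight octet prefix. The text doesn't say exactly what the value
counts, and the suffix fields are left out here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "$ddddd; " + cookie in out, then copy the
   zero padded length in over the ddddd.  Returns 0 on success, or -1 if
   the cookie is too long for five digits or for the buffer. */
int makerecord(char *out, size_t outsize, const char *cookie)
{
    size_t len = strlen(cookie);
    char digits[6];

    if (len > 99999UL || len + 9 > outsize)
        return -1;
    strcpy(out, "$ddddd; ");                 /* placeholder prefix */
    strcat(out, cookie);
    sprintf(digits, "%05lu", (unsigned long) len);
    memcpy(out + 1, digits, 5);              /* overwrite the ddddd */
    return 0;
}
```

A thirteen octet Cookie such as id=42; Path=/ would then be recorded as
$00013; id=42; Path=/ under this assumption.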
The added suffix consists of two or three attribute=value fields.
These always include $DLXsessn= and $Domain= fields.
A Path= attribute field also may be added. But, only if the
received Cookie doesn't include a Path= attribute field of its own.
This ensures that every saved Cookie will include a Path= attribute field.
The added $DLXsessn= attribute field identifies the DosLynx session writing
this Cookie record. This is for checking "non-persistent" or "session"
Cookies. These are Cookies that don't contain an Expires= attribute field.
They are defined to expire as soon as the browser's present run, or session,
ends. When these Cookies get read back from Cookie(s) files, anytime later,
DosLynx compares their $DLXsessn= attribute value with its then present
ulisessn value. If these values don't match, the read back Cookie is stale
and will get discarded. The need for the added $Domain= attribute
field will become clear a few paragraphs further on.
The final test for a Cookie record is a check of its length.
Should an augmented Cookie record be found to be longer than 4090 octets, it
will be dropped. And, the new doslynx: Rejecting Cookie. message will be
issued, to the Messages window.
When it is ready to save a received and accepted Cookie, savecky( ) calls new
function ckypname( ), in new module CKIESRCV.CPP, with the cprhost value.
ckypname( ) develops a complete DOS pathname, from the hostname it is given,
as follows: ckypname( ) uses isaddr( ) to see if the given name is a TCP/IP
address. If so, ckypname( ) takes as much of this as will fit into a DOS 8.3
format filename, for a DOS filename. If not, ckypname( ) takes as much of
the given name's domain as will fit into a DOS 8.3 format filename, for a DOS
filename. Any truncation occurs on the left in the prefix and on the right
in the suffix, of the DOS filename developed. Of course, a given name's
domain is everything after the first period in the given name.
Any period(s), other than the rightmost one in the resulting filename, are
changed to underbar(s). Any leading underbar, in the resulting filename,
gets discarded. ckypname( ) develops a complete DOS pathname by appending
the DOS filename it has developed to a copy of the cookiedir= configuration
value.
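
The filename part of this can be sketched as follows. cookyname( ) is a
hypothetical helper, not the ckypname( ) source. It assumes TCP/IP addresses
get the same truncation as domains, it uses a crude stand-in for isaddr( ),
and it leaves out the appending to the cookiedir= value.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for isaddr( ): true if only digits and periods are present. */
static int looks_like_addr(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; ++s)
        if (!isdigit((unsigned char) *s) && *s != '.')
            return 0;
    return 1;
}

/* Hypothetical helper: derive a DOS 8.3 filename from a host name.
   out must have room for at least 13 characters. */
void cookyname(const char *host, char *out)
{
    const char *base = host;
    const char *dot, *ext;
    size_t plen, elen, i, n = 0;

    if (!looks_like_addr(host)) {
        dot = strchr(host, '.');
        if (dot != NULL)
            base = dot + 1;            /* the domain: after first period */
    }
    ext = strrchr(base, '.');          /* rightmost period, if any */
    plen = (ext != NULL) ? (size_t) (ext - base) : strlen(base);
    if (plen > 8) {                    /* truncate the prefix on the left */
        base += plen - 8;
        plen = 8;
    }
    if (plen > 0 && (*base == '.' || *base == '_')) {
        ++base;                        /* discard a leading underbar */
        --plen;
    }
    for (i = 0; i < plen; ++i)
        out[n++] = (base[i] == '.') ? '_' : base[i];
    if (ext != NULL) {
        elen = strlen(ext + 1);
        if (elen > 3)
            elen = 3;                  /* truncate the suffix on the right */
        out[n++] = '.';
        memcpy(out + n, ext + 1, elen);
        n += elen;
    }
    out[n] = '\0';
}
```

Under these assumptions groups.yahoo.com maps to yahoo.com, and
www.google.co.uk maps to oogle_co.uk.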
It may be seen that a result of ckypname( )'s approach is that DosLynx
dedicates each of its Cookie(s) file(s) to holding Cookie(s) from a
relatively small set of server(s). However, it is by no means assured that
any given Cookie(s) file will contain Cookie(s) from only a single server.
The $Domain= attribute field that gets added to each received and
accepted Cookie makes the source of each saved Cookie clear.
Something that is assured is that any and all Cookie(s) available for sending
to a given host will be found in a single DosLynx Cookie(s) file.
ckypname( ) will always be able to determine that file's name from the given
host's name.
In spite of the ambiguities in this approach, relatively fast DOS file
locating facilities will find the single needed file quickly. And, that file
will be relatively short. Because, all of the saved Cookie(s) will have been
dispersed over a relatively large number of Cookie(s) files.
The relatively meaningful filenames being used will enable users to employ a
simple directory report or dialog, for the cookiedir= directory, for
assessing and accessing their saved Cookies.
Having obtained ckypname( )'s determination of the Cookie(s) file that must
include the Cookie in process, savecky( ) tries to open that file for raw, or
binary, input. If that open succeeds, that file will have to be updated.
If not, the Cookie in process needs to be written into a new file with the
determined name.
If a Cookie(s) file update is needed, savecky( ) gives ckypname( ) the name:
www.cookydom.tmp and tries to open the pathname determined for raw, or
binary, output. Having successfully opened both an input file and an output
file, savecky( ) begins reading from the input file. First, it reads the
first record's eight octets long length field. Having obtained a
recognizable length field, it allocates a memory buffer of the indicated
length plus nine. And, reads a record of the indicated length plus eight.
This read should result in an actual read length of either the requested
length or the requested length minus eight. When the requested length is
obtained, the next record's length field has become available at the end of
the buffer. Otherwise, the input file's last record has (now) been read.
Once the first record has been read, in two reads, each additional record is
read in a single read. Perhaps needless to say, all of the file I/O
operations are checked as much as possible for all kinds of error(s).
Two new DosLynx error messages will inform the user of any error(s) detected
in these file I/O operations.
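The read-ahead scheme described above can be sketched roughly as follows.
This is only an illustration, not DosLynx's actual code. It assumes the
eight-octet length field holds ASCII decimal digits, which the text above
does not state, and it reads from a std::istream rather than a DOS file.
The names readRecords and parseLength are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assumption: each record is preceded by an 8-octet length field holding
// ASCII decimal digits. The real on-disk encoding may differ.
static const std::size_t kLenField = 8;

// Parse an 8-octet length field; returns false if it is not recognizable.
static bool parseLength(const char *field, std::size_t &length)
{
    length = 0;
    for (std::size_t i = 0; i < kLenField; ++i) {
        if (!std::isdigit(static_cast<unsigned char>(field[i])))
            return false;
        length = length * 10 + static_cast<std::size_t>(field[i] - '0');
    }
    return true;
}

// Read every record from in. Returns false on a malformed or short file.
bool readRecords(std::istream &in, std::vector<std::string> &records)
{
    char field[kLenField];
    // First read: only the first record's length field.
    in.read(field, static_cast<std::streamsize>(kLenField));
    if (static_cast<std::size_t>(in.gcount()) != kLenField)
        return false;
    std::size_t length = 0;
    if (!parseLength(field, length))
        return false;

    for (;;) {
        // One read fetches this record plus the next record's length field.
        std::vector<char> buffer(length + kLenField + 1);
        in.read(&buffer[0], static_cast<std::streamsize>(length + kLenField));
        std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got != length && got != length + kLenField)
            return false;                 // damaged or truncated record
        records.push_back(std::string(&buffer[0], length));
        if (got == length)
            return true;                  // that was the last record
        if (!parseLength(&buffer[0] + length, length))
            return false;                 // next length field unrecognizable
    }
}
```

For example, a stream holding 00000003abc00000002de would yield the two
records abc and de. The first record costs two reads and each later record
costs one, as described above.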
savecky( ) uses new function readcky( ), also in new module CKIESRCV.CPP,
to extract the desired content from each Cookie record read.
The desired content consists of the Cookie's name and the contents of five of
its attribute fields. The five attribute field values sought are those from
the Path, Domain, $Domain, $DLXsessn, and Expires fields. Values for the
Path, $Domain, and $DLXsessn attribute fields are expected to be available in
every record. readcky( ) uses ckattrn( ) and locvalue( ) for parsing the
record it is given. While copying any Domain= attribute field's value it
finds, readcky( ) adds a leading period, if necessary to make the value begin
with a period. readcky( ) also returns an indication that the Cookie record
it has been given is either fresh or stale. It makes this determination as
explained in the following five paragraphs:
If it finds an Expires= attribute field, readcky( ) checks to see if the date
and time its value specifies have been reached. The exact syntax for this
field has been found to be somewhat variable! At one point, the
Netscape Specification calls for: Expires=Wdy, DD-Mon-YYYY HH:MM:SS GMT
Unfortunately, the Netscape Specification's only example of an Expires=
attribute field reads as follows: expires=Wednesday, 09-Nov-99 23:12:40 GMT
So, is the Wdy three letters, or more? Is YYYY four digits, or only two?
I have seen all four of the combinations, of short and full weekday length
with short and full year length, in real world examples!
Other variations I've seen, in real world examples, include: No comma!
Spaces instead of dashes in the date. And, PST instead of GMT at the end.
I've tried to handle all but the no comma and no GMT variations.
readcky( ) does this by: Ignoring the Wdy field. But, it does look for a
leading comma and an ending GMT. It uses a case insensitive compare for
checking for the ending GMT. It ignores the separators in the date.
It uses case insensitive compares for identifying the three character month
name. And, it treats any small YYYY value with the following sequence, after
parsing the YYYY field into int years:
    if (years <= 39)
        years += 2000 ;
    if (years <= 99)
        years += 1900 ;
readcky( ) tends to treat unrecognizable Expires= attribute field values as
having been reached or expired. A lot of code is needed for processing this
one field.
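A tolerant Expires= parser along the lines just described might look roughly
like the sketch below. It is an illustration only and not readcky( )'s
actual code. The names ExpiresTime and parseExpires are hypothetical, no
calendar range checking is done, and a false return stands for an
"unrecognizable" value that the caller would treat as expired.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Broken-down Expires= value, exactly as read (no range validation).
struct ExpiresTime {
    int year, month, day, hour, minute, second;
};

// Case insensitive compare of n characters of s, at pos, with word.
static bool matchesNoCase(const std::string &s, std::size_t pos,
                          const char *word, std::size_t n)
{
    if (pos + n > s.size())
        return false;
    for (std::size_t i = 0; i < n; ++i) {
        if (std::tolower(static_cast<unsigned char>(s[pos + i])) !=
            std::tolower(static_cast<unsigned char>(word[i])))
            return false;
    }
    return true;
}

// Read up to four decimal digits at pos; fails when none are present.
static bool readNumber(const std::string &s, std::size_t &pos, int &value)
{
    std::size_t start = pos;
    value = 0;
    while (pos < s.size() && pos - start < 4 &&
           std::isdigit(static_cast<unsigned char>(s[pos]))) {
        value = value * 10 + (s[pos] - '0');
        ++pos;
    }
    return pos > start;
}

// Spaces and dashes are both accepted, and ignored, as separators.
static void skipSeparators(const std::string &s, std::size_t &pos)
{
    while (pos < s.size() && (s[pos] == ' ' || s[pos] == '-'))
        ++pos;
}

static bool readColon(const std::string &s, std::size_t &pos)
{
    if (pos >= s.size() || s[pos] != ':')
        return false;
    ++pos;
    return true;
}

bool parseExpires(const std::string &value, ExpiresTime &out)
{
    static const char *const months[12] = {
        "jan", "feb", "mar", "apr", "may", "jun",
        "jul", "aug", "sep", "oct", "nov", "dec"
    };
    std::size_t pos = value.find(',');    // Wdy is ignored; comma is needed
    if (pos == std::string::npos)
        return false;
    ++pos;
    skipSeparators(value, pos);
    if (!readNumber(value, pos, out.day))
        return false;
    skipSeparators(value, pos);
    out.month = 0;
    for (int m = 0; m < 12; ++m)
        if (matchesNoCase(value, pos, months[m], 3))
            out.month = m + 1;
    if (out.month == 0)
        return false;
    pos += 3;
    skipSeparators(value, pos);
    if (!readNumber(value, pos, out.year))
        return false;
    if (out.year <= 39)                   // two digit years 00..39
        out.year += 2000;
    if (out.year <= 99)                   // two digit years 40..99
        out.year += 1900;
    skipSeparators(value, pos);
    if (!readNumber(value, pos, out.hour) || !readColon(value, pos) ||
        !readNumber(value, pos, out.minute) || !readColon(value, pos) ||
        !readNumber(value, pos, out.second))
        return false;
    skipSeparators(value, pos);
    return matchesNoCase(value, pos, "GMT", 3);
}
```

With this sketch, both Wednesday, 09-Nov-99 23:12:40 GMT and
Wed, 01 jan 07 10:20:30 GMT are recognized, while a value lacking the comma,
or ending in PST, is rejected.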
If the Cookie record it has been given doesn't contain an Expires= attribute
field, readcky( ) checks the expected $DLXsessn= attribute field's value
against ulisessn's present value. This has already been described, above.
If readcky( ) indicates that a given Cookie record is stale, savecky( ) drops
it and goes on to the next record from its input file.
Otherwise, savecky( ) compares the name and Path, Domain, and $Domain
attribute field values of the just read back Cookie record with those of the
pending received Cookie. name and Path= values are compared with case
sensitive compares. Domain= and $Domain= values are compared with case
insensitive compares. If the names and all three value pairs match, the
received Cookie is replacing the read back Cookie. So, the read back Cookie
is simply dropped, as if it had been found to be stale.
If the read back Cookie hasn't been dropped, savecky( ) compares the length
of its Path= attribute field value with the length of the pending received
Cookie's Path= attribute field value. If the pending received Cookie's Path=
value is longer, savecky( ) writes it to its output file ahead of the read
back Cookie, if it hasn't done that already. This eliminates any need for
reordering Cookie(s) when selecting them, later, for sending.
The received Cookie will get written exactly once. Whether it has just
written the received Cookie, or not, savecky( ) then writes the read back
Cookie. The process described above continues until the input file has been
exhausted. Finally, savecky( ) writes the received and accepted Cookie to
its output file, if it hasn't done that already.
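The update pass described in the last few paragraphs can be sketched, over
in-memory records instead of files, roughly as follows. This is an
illustration under assumptions, not savecky( )'s code. CookieRec and
mergeCookie are hypothetical names, and the stale flag stands in for the
result readcky( ) would return.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory view of one Cookie record.
struct CookieRec {
    std::string name;      // Cookie name
    std::string path;      // Path= value
    std::string domain;    // Domain= value (may be empty)
    std::string sdomain;   // $Domain= value
    bool stale;            // freshness result, as readcky( ) would report
};

static bool equalNoCase(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// name and Path= compare case sensitively; the domains do not.
static bool sameCookie(const CookieRec &a, const CookieRec &b)
{
    return a.name == b.name && a.path == b.path &&
           equalNoCase(a.domain, b.domain) &&
           equalNoCase(a.sdomain, b.sdomain);
}

// Returns what would be written to the output file, in order.
std::vector<CookieRec> mergeCookie(const std::vector<CookieRec> &readBack,
                                   const CookieRec &received)
{
    std::vector<CookieRec> output;
    bool written = false;
    for (std::size_t i = 0; i < readBack.size(); ++i) {
        const CookieRec &old = readBack[i];
        if (old.stale || sameCookie(old, received))
            continue;                     // dropped: stale, or being replaced
        if (!written && received.path.size() > old.path.size()) {
            output.push_back(received);   // longer Path= goes ahead
            written = true;
        }
        output.push_back(old);
    }
    if (!written)
        output.push_back(received);       // written exactly once, at the end
    return output;
}
```

Because the received Cookie is always written, the result is never empty,
and records stay ordered from longer to shorter Path= values as long as the
input was ordered that way.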
Once it has written its received and accepted Cookie and processed all of the
Cookie(s) from any input file it has open, savecky( ) closes its output file
and any input file it has open. If it processed an input file, savecky( )
then removes that file. And, it renames its COOKYDOM.TMP output file to
replace the removed file. If the received Set-Cookie: Response Header line
hasn't been exhausted, yet, savecky( ) then returns to parsing, checking, and
writing another received Cookie.
It may be noted that savecky( ) writes each accepted Cookie to a Cookie(s)
file, whether it is fresh or not. The server may have sent a "pre-expired"
Cookie in order to clear the browser's record of a previous Cookie, without
sending any replacement session state information. In that case, an
expired Cookie may be left in a Cookie(s) file. This isn't much of a bug
because Cookie(s) read back from a DosLynx Cookie(s) file always get checked
for freshness before being used or reused. As received Cookies get processed
one at a time, there won't be any accumulation of stale Cookie(s), because
all stale Cookie(s) found in a Cookie(s) file get dropped during Cookie(s)
file updating, as explained above. This is true for unmatched, as well as
for matched, Cookie(s) in the file.
Perhaps, an "undocumented feature" here is that savecky( )'s process will
never result in an established Cookie(s) file going empty or disappearing.
It will always contain, at least, the last Cookie accepted for that file.
However, Cookie(s) file(s) shouldn't be expected to contain a record of all
servers that ever sent DosLynx an accepted Cookie, except in possibly rare
cases. To the extent that each Cookie(s) file may receive Cookie(s) from
multiple servers, any and all of a given server's Cookie(s) are subject to
disappearing from the file upon becoming stale.
Once savecky( ) has finished with the Set-Cookie: Response Header line that
it has been given, it returns a count of the number of received Cookie
record(s) it has written. At this point, all aspects of Receiving Cookie(s)
have been described, except for auditing and Monitoring.
Auditing received Cookies is complicated by the provision of Monitoring mode.
And, by the fact that a presented or received Cookie's acceptance is not
assured. As explained above, four kinds of issues may block a received
Cookie's acceptance. These are third-party host transactions, unacceptable
Cookie attribute field value(s), unacceptable Cookie length, and the
possibility for file I/O error(s) encountered in the course of Cookie(s) file
updating.
It seems that many users would or should be bothered or alarmed by audit
report(s) about unaccepted Cookie(s), while HTTP Cookie Mode is (Enabled and)
ON. Therefore, when HTTP Cookie Mode is ON and auditing is (also) enabled,
it seems appropriate to audit or report on accepted Cookie(s), only.
But, no Cookie(s) are accepted when Cookie Mode isn't ON. So, applying that
approach to Cookie Monitoring would keep it from doing anything.
Instead, it seems most appropriate to audit or report on all presented or
received Cookie(s), when Monitoring. After all, if the user doesn't want to
be aware of Cookie(s) only being presented, she probably won't select the
Cookie Monitoring Mode. These decisions result in some Cookie(s) being
reported, while Monitoring, that might go unaccepted and unaudited were
Cookie Mode to be ON, as well.
So, HTMIME_put_character( ) has been extended to check for the HTTP Cookie
auditing state, in combination with the Cookie Mode ON state, after it
extracts a received Set-Cookie: Response Header line. When HTTP Cookie
Monitoring (only) is in effect, HTMIME_put_character( ) produces an
audit: Set-Cookie: . . . message for every Set-Cookie: Response Header line
it finds. But, it doesn't bother calling savecky( ) for any of them.
When HTTP Cookie Mode is ON, HTMIME_put_character( ) calls savecky( ) for
every Set-Cookie: Response Header line it finds. If auditing is (also)
enabled, it may produce an audit: Set-Cookie: . . . message, for each of
these. But, only when savecky( ) returns with a non-zero count of Cookie(s)
written.
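The auditing decision just described boils down to a small piece of logic.
The sketch below is an illustration only. It assumes that "Monitoring"
means auditing is enabled while Cookie Mode is not ON, models the ckymode
state as two plain booleans (its real encoding is not shown here), and takes
a callback standing in for savecky( ). The name shouldAuditSetCookie is
hypothetical.

```cpp
#include <functional>
#include <string>

// Returns true when an audit: Set-Cookie: message should be produced for
// the given Set-Cookie: line. save stands in for savecky( ) and returns
// the count of Cookie record(s) written.
bool shouldAuditSetCookie(bool cookieModeOn, bool auditing,
                          const std::string &line,
                          const std::function<int(const std::string &)> &save)
{
    if (!cookieModeOn)
        return auditing;       // Monitoring only: report all, save none
    int written = save(line);  // Cookie Mode ON: always try to save
    return auditing && written > 0;
}
```

Note that the callback is never invoked while Cookie Mode is off, which
matches the statement that no Cookie(s) are accepted in that state.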
Sending Cookies
An HTTP browser continues a "session" with an HTTP server, by including
Cookie: field(s) in all Request Header(s) it sends to that server.
(That is, when relaying the user's request(s) for additional URL(s), from
that server or other server(s) in the same domain.)
The Cookie: Request Header field(s) sent are composed from the name or
name=value field(s) that begin each of the still fresh and relevant Cookie(s)
the browser has on hand. The relevant Cookie(s) are those that contain a
Path= attribute field value matching the beginning of the additional URL's
path. These include Cookie(s) that already have been received from the same
server. As well as Cookie(s) that already have been received from other
server(s) in the present server's domain, which also contain an accepted
Domain= attribute field. (To have been accepted, a Cookie's Domain=
attribute field value must match the domain of the server that sent it.)
So, the additions for sending Cookies start in HTLoadHTTP( ), in HTTP.C,
where the Request Headers, for each request for a URL, get constructed.
This comes after cprqhost, cpnvhost, and cprqpath, all introduced in the
preceding section, have been updated for the URL being requested.
I've extended HTLoadHTTP( ) to check (global) ckymode for HTTP Cookie Mode
(Enabled and) ON state. Finding ON status, HTLoadHTTP( ) calls new function
getckies( ), in new module CKIESRCV.CPP. getckies( ) uses much of the new
infrastructure described in the preceding section. So, the following will
assume a knowledge of that and not reintroduce all of its functions.
We'll move along quite a bit faster from here!
In short, getckies( ) rechecks ckymode and checks cprqhost and cpnvhost, as
savecky( ) does. If the present request host isn't a "third-party host",
getckies( ) sets its name into its own cprhost value. Then, it calls
ckypname( ), with that value, to obtain the name of the Cookie(s) file that
must contain any Cookie(s) on hand for that host. Next, it tries to open the
determined Cookie(s) file for raw, or binary, input, as savecky( ) does for
a Cookie(s) file it is updating. With the needed Cookie(s) file open,
getckies( ) proceeds to read each of its Cookie record(s).
getckies( ) calls readcky( ), with each Cookie record read, in order to check
it for freshness and to locate the attribute field values, in it, that need
to be checked. Finding a fresh Cookie record, getckies( ) uses a case
insensitive compare to check its $Domain= attribute field value for a match
with the cprhost value. If no match is found there, but a Domain= attribute
field value is available in this Cookie, getckies( ) calls dommatch( ) with
that and the cprhost value. If one of these checks has succeeded,
getckies( ) uses a case sensitive compare to check the Cookie's Path=
attribute field value for a match with the beginning of the requested URL's
path.
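The per-record checks above can be sketched as a single predicate. This is
an illustration only. dommatch( ) is described in an earlier section and is
not reproduced here, so the sketch substitutes a simple, hypothetical
case insensitive tail match (domainTailMatches) for it. The real function's
rules may well be stricter.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

static bool equalNoCase(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Hypothetical stand-in for dommatch( ): does host end with domain?
static bool domainTailMatches(const std::string &domain,
                              const std::string &host)
{
    if (host.size() < domain.size())
        return false;
    return equalNoCase(host.substr(host.size() - domain.size()), domain);
}

// True when a fresh Cookie should be sent with a request for urlPath
// on host. sdomain is the $Domain= value; domain may be empty.
bool cookieApplies(const std::string &sdomain, const std::string &domain,
                   const std::string &host, const std::string &urlPath,
                   const std::string &cookiePath)
{
    bool hostOk = equalNoCase(sdomain, host);
    if (!hostOk && !domain.empty())
        hostOk = domainTailMatches(domain, host);
    if (!hostOk)
        return false;
    // Case sensitive check that urlPath begins with cookiePath.
    return urlPath.compare(0, cookiePath.size(), cookiePath) == 0;
}
```

For instance, a Cookie saved from www.example.com with Path=/shop applies to
a request for /shop/cart on WWW.Example.com, but not to one for /Shop.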
Having found a Cookie record passing all of the above checks, getckies( )
uses ckattrn( ) to locate its entire name or name=value field.
It normalizes this field's ending, and uses a call to StrAllocCat( ),
actually HTSACat( ) in HTSTRING.C, to append it to a Cookie being built up
from a "Cookie: " string. So, the whole Cookie: line built looks like:
Cookie: name1[=value1][; name2[=value2]] . . . [; namen[=valuen]]CRLF
That is, without the square brackets, which only denote parts that may or
may not appear.
getckies( ) continues the above process until its Cookie(s) file has been
exhausted.
After closing any Cookie(s) file it has opened, getckies( ) returns any
Cookie: Request Header line it has built.
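Building the Cookie: line from the selected name or name=value fields can be
sketched as below. This is an illustration only, using std::string in place
of StrAllocCat( ). What "normalizing" a field's ending involves is not
spelled out above, so the sketch assumes it means stripping trailing
white space and semicolons. Both function names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption: normalizing means dropping trailing blanks and semicolons.
static std::string normalizeEnding(const std::string &field)
{
    std::size_t end = field.size();
    while (end > 0 && (field[end - 1] == ' ' || field[end - 1] == '\t' ||
                       field[end - 1] == ';'))
        --end;
    return field.substr(0, end);
}

// Returns an empty string when there is nothing to send.
std::string buildCookieLine(const std::vector<std::string> &fields)
{
    if (fields.empty())
        return std::string();
    std::string line = "Cookie: ";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i > 0)
            line += "; ";
        line += normalizeEnding(fields[i]);
    }
    line += "\r\n";                       // CRLF ends the Header line
    return line;
}
```

Fields are appended in the order they were read, so the Path= ordering
established when the Cookie(s) file was written carries over unchanged.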
Back in HTLoadHTTP( ), there is a simple audit process, because Monitoring
only pertains to receiving Cookies. If getckies( ) has been called and
returns with a Cookie: Request Header line, HTLoadHTTP( ) appends that to the
block of Request Headers it will send and checks ckymode for HTTP Cookie
auditing. If auditing is enabled, HTLoadHTTP( ) sends an
audit: Cookie: . . . message, mostly made from getckies( )'s Cookie: line,
to the Messages window.
Cookies Support and Redirection
It seems that HTTP servers that use Cookies send redirect responses much more
frequently than HTTP servers that don't. Perhaps they are
"walking" their sessions through intermediate state(s) that might not be
known or otherwise apparent to the users. Extra redirect cycle(s) might be
seen as needed to get the server another Request from the browser and, with
it, an updated Cookie. And then, a Set-Cookie: Response update from the
server to the browser again. Or, are they simply checking to see how
faithfully a browser is returning Cookie: Request Header line(s)
corresponding to the Set-Cookie: Response Header line(s) that have just been
sent to it?
That a block of Response Headers can get quite large when filled with several
long Set-Cookie: lines compounds this section's issue. I am trying to keep
HTLoadHTTP( )'s two Response Headers buffers from getting much larger than
about 3 KB, each. This was explained in the history.txt for DosLynx v0.31b,
under the heading: Precautions Against Buffer Overrun in HTTP.C .
However, it may be seen that 3 KB may not be enough to hold a complete set of
Response Headers that include several very long Set-Cookie: lines.
That means the complete Location: Response Header, sent in a redirection
response, might not have been found in its buffers when HTLoadHTTP( ) went to
look for it. What to do about this?
Given that I wouldn't allow HTLoadHTTP( )'s buffers to grow much, I might
have extended it to adopt HTAA_shouldRetryWithAuth( )'s HTAA_setUpReader( )
and HTAA_getUnfoldedLine( ) utilities, in HTAAUTIL.C.
These utilities provide for reading unbuffered Response Headers as they are
needed. However, I wasn't looking forward to doing that! Then, I thought:
Why not move redirection handling, from HTLoadHTTP( ), into
HTMIME_put_character( ), in HTMIME.C? Doing that would involve little more
than adding recognition, for Location: , to HTMIME_put_character( ).
Wouldn't it?
Actually doing that involved adding LOCATION to enum MIME_state, in HTMIME.C.
I've also changed int server_status to a global in HTTP.C. And, I've added
extern(s) for it and int redirects and char * cp_URLRedirect, to HTMIME.C.
After HTMIME_put_character( ) discovers Location:, it sets SET_CKY_FIELD
state. That state serves for accumulating either a Set-Cookie: or a
Location: Header line. A single if ((me->field) == SET_COOKIE) check, in
case SET_CKY_FIELD, takes care of accumulating the Header field in process
into either cpcookie or cp_URLRedirect. I also added an
if (((server_status/100) == 3) && (!cp_URLRedirect))
check to case '\n', in HTMIME_put_character( ), to see if the old familiar
"Redirection error or probable loop." message needs to be issued.
Finally (I then thought), TURLView::loadURL( ), in TURLVIE5.CPP, needed a
small change to check cp_URLRedirect, no matter what ::HTLoadAbsolute( )
returns.
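The idea of sharing one accumulation path between Set-Cookie: and Location:
can be sketched at the level of whole Header lines, as below.
HTMIME_put_character( ) itself works one character at a time through a state
machine, so this is a simplification for illustration only. HeaderSink,
accumulateHeader, and redirectionError are hypothetical names. The two
string members play the roles of cpcookie and cp_URLRedirect.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct HeaderSink {
    std::string cookie;     // plays the role of cpcookie
    std::string redirect;   // plays the role of cp_URLRedirect
};

static bool equalNoCase(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Stores the value of a Set-Cookie: or Location: line; other lines are
// left alone and reported as not handled.
bool accumulateHeader(const std::string &line, HeaderSink &sink)
{
    std::size_t colon = line.find(':');
    if (colon == std::string::npos)
        return false;
    std::string name = line.substr(0, colon);
    std::size_t start = colon + 1;
    while (start < line.size() && (line[start] == ' ' || line[start] == '\t'))
        ++start;
    // One shared path; a single check picks the destination.
    std::string *target = 0;
    if (equalNoCase(name, "Set-Cookie"))
        target = &sink.cookie;
    else if (equalNoCase(name, "Location"))
        target = &sink.redirect;
    if (target == 0)
        return false;
    *target = line.substr(start);
    return true;
}

// A 3xx status with no Location: found is a redirection error.
bool redirectionError(int serverStatus, const HeaderSink &sink)
{
    return (serverStatus / 100) == 3 && sink.redirect.empty();
}
```

The second function mirrors the status check quoted above. It would be
consulted once the end of the Response Headers has been reached.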
Well, if (save_this) clauses in HTTP.C and HTMIME.C provided additional
complications. They needed to include new checks of server_status or
cp_URLRedirect. Those were needed to keep from giving the user an unexpected
and unnecessary prompt, for a local filename, when a "Save Source" type
request got redirected! If you've already used a previous version of
DosLynx, you'll notice WWW: Content-Length: . . . message(s) newly
appearing amid redirection sequences, when you start using DosLynx v0.34b.
Why servers seem to be more faithful in sending Content-Length: Headers
during redirection, than they are at other times, is beyond me!
We'll designate those additional messages an "undocumented feature" of this
change. Perhaps they will strengthen this story's credibility.
Cookies Support To Do
I've already mentioned the lack (so far) of any provision for convenient
viewing of one's DosLynx Cookie(s) files.
(In the User Interfaces for Cookies Support section, way up above.)
Like handling redirection well, observing caching directives is another area
that is quite important to the smooth operation of Cookies based sessions.
This means forgoing use of cached document copies, when accompanying Headers
and/or HTML directives call for that. And/or, providing a menu entry or
command for designating all cached document(s) "stale". (Bypass: Use the
Navigate|Reload Current menu entry or command, when necessary.)
- Find some way to guard against making duplicate Post requests.
(Bypass: Move the anchor cursor off the submit button after using it,
if/when you resume, or first return to, a window containing a submitted
Form!)
- Provide some way for retrying a temporarily failed e-mail send attempt?
- Find ways to improve support for UTF-8 and other document encodings!
- Support the HTML Form