modern Tcl and correct quoting

Poor Yorick org.macosforge.lists.macports-dev at pooryorick.com
Wed Jun 12 15:18:54 PDT 2013


On Wed, Jun 12, 2013 at 02:00:17PM +0200, Rainer Müller wrote:
[SNIP]
> Good work! Nice to see you digged right into one of the core source file
> there.
> 
> However, many different things are changed in this single patch. With
> regard to my comment above, could you please split it into the parts
> that can be applied without breaking compatibility with 8.4 and
> introducing features of 8.5? This would help a lot to get the easier
> fixes in.

[SNIP]

I'm working on breaking the patch up as suggested.

Putting [expr] arguments in braces can result in much better performance as it
allows Tcl to cache a byte-coded version of the expression.  It can also avoid
unintended processing that might occur via double substitution.  Ref
http://wiki.tcl.tk/10225.
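
To make the double substitution concrete, here is a small sketch.  The variable
names and the "hostile" value are made up for illustration; the point is only
that an unbraced [expr] argument gets substituted once by the Tcl parser and
then again by [expr] itself:

    set hit 0
    set input {[set hit 1]}      ;# hypothetical value from outside

    # Braced: [expr] reads $input itself and treats it as a plain string.
    set same [expr {$input eq "abc"}]
    set hit_after_braced $hit

    # Unbraced: the parser substitutes $input, then [expr] parses the
    # resulting text as an expression and runs the embedded command.
    catch {expr $input}
    set hit_after_unbraced $hit

After the braced call the embedded command has not run; after the unbraced call
it has, which is exactly the kind of surprise the braces prevent.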

[eval] concatenates its arguments and passes them through the interpreter again
to be parsed and executed as a script.  This results in double substitution.
Here's an example from the patch:

    $workername eval "package ifneeded $pkgName $pkgVers {$pkgLoadScript}"

The first issue with this is that if, e.g., $pkgName contained a space, the
constructed script would no longer be syntactically correct, as it would pass
too many arguments to [package ifneeded].  The second issue is that simply
putting braces around a substituted variable like $pkgLoadScript is not
guaranteed to result in a well-formed single value.  If $pkgLoadScript contains
an unbalanced left or right curly bracket, or if it ends in a backslash, the
script will be corrupted.  The robust way to write the line above is like this:

    $workername eval [list package ifneeded $pkgName $pkgVers $pkgLoadScript]

http://wiki.tcl.tk/1535 has more details.
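
A quick way to see the difference is to try both forms against a throwaway
interpreter.  This is only a sketch: the package name, version and load script
below are invented values chosen to trigger the two problems described above,
and $worker stands in for $workername:

    set worker [interp create]
    set pkgName {my pkg}            ;# hypothetical name containing a space
    set pkgVers 1.0
    set pkgLoadScript "puts \{"     ;# hypothetical value with an unbalanced brace

    # String-built script: the worker cannot parse it.
    set naiveFailed [catch {
        $worker eval "package ifneeded $pkgName $pkgVers {$pkgLoadScript}"
    }]

    # List-built script: each value arrives as exactly one argument.
    $worker eval [list package ifneeded $pkgName $pkgVers $pkgLoadScript]
    set stored [$worker eval [list package ifneeded $pkgName $pkgVers]]

    interp delete $worker

With these values the string-built call raises an error, while the list-built
call stores the load script unchanged.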

When substituting values into a script template, the key is to substitute in
the value as a well-formed list.  [string map] is one common way to accomplish
that.  In one example from the patch, this was the original code:

    $workername eval \
        "proc trace_$opt {name1 name2 op} { \n\
            trace remove variable ::$opt read ::trace_$opt \n\
            global $opt \n\
            set $opt \[getoption $opt\] \n\
        }"

First, this looks more like Makefile or shell syntax than Tcl.  Newlines in
double-quoted strings are preserved, so we can get rid of some of that ugly and
unnecessary escaping:

    $workername eval "
        proc trace_$opt {name1 name2 op} {
            trace remove variable ::$opt read ::trace_$opt
            global $opt
            set $opt \[getoption $opt\]
        }"

The next problem is that since $opt is being directly substituted into the
string value, the script would be corrupted if $opt contained special
characters.  In practice, $opt is probably always a safe value, so it has
worked so far, but the standard way of avoiding those problems is to either
enclose $opt in a [list]:

    $workername eval "
        proc trace_[list $opt] {name1 name2 op} {
            trace remove variable ::[list $opt] read ::trace_[list $opt]
            global [list $opt]
            set [list $opt] \[getoption [list $opt]\]
        }"

or to use [string map] to get the same result, with the bonus that the script
template remains more readable since special characters don't have to be
escaped:

    set script {
        proc trace_${opt} {name1 name2 op} {
            trace remove variable ::${opt} read ::trace_${opt}
            global ${opt}
            set ${opt} [getoption ${opt}]
        }
    }
    set script [string map [list \${opt} [list $opt]] $script]
    $workername eval $script

The ${} form is used here to reduce the chances of [string map] making an
unintended replacement, perhaps of the beginning of another variable that also
starts with "$opt".
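
One caveat: a value quoted by [list] is only guaranteed to be parsed as a
single word when it stands as a whole word in the script.  Glued onto a prefix
such as "trace_" or "::", it can still break if the value contains whitespace
or braces.  Here is a sketch of one way around that, mapping whole words
instead.  The @PROC@ and @OPT@ placeholders, the option name and the
simplified proc body are all made up for illustration and are not what the
patch does:

    set worker [interp create]
    set opt {my opt}    ;# hypothetical option name containing a space

    set template {
        proc @PROC@ {name1 name2 op} {
            return [list @OPT@ $name1 $op]
        }
    }
    # Each placeholder is replaced by a complete, list-quoted word.
    set script [string map [list \
        @PROC@ [list trace_$opt] \
        @OPT@  [list $opt]] $template]
    $worker eval $script

    set result [$worker eval [list trace_$opt n1 {} read]]
    interp delete $worker

The first element of $result is the original $opt, space included, so the
value survived the trip through the template intact.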

Regarding the question of {} or "", I prefer {} because I'm interested in
helping people really understand Tcl.  Because Tcl's syntax is so flexible, one
thing that happens a lot is that instead of learning how Tcl syntax really
works, people fall into the habit of using some variation of Tcl quoting that
borrows heavily from what they knew before.  On one level, Tcl is pretty
forgiving about this, but eventually, such habits can lead to frustration,
misunderstanding and quoting hell (http://wiki.tcl.tk/1726).  One way to get
past this is to hit them with constructs like {}, which illustrate that curly
brackets in Tcl are merely escaping operators and not some sort of mechanism
for constructing formally-scoped code blocks as in C.
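
A tiny sketch of what I mean, with made-up variable names: a braced "block" is
just a string value, and it can be stored in a variable and handed to [if]
like any other string:

    # {} and "" both produce the empty string.
    set same [expr {{} eq ""}]

    set greeting hello
    # Braces only suppress substitution; $body is an ordinary string.
    set body {set result "$greeting world"}

    # [if] receives that string and evaluates it as a script.
    if 1 $body

Here $greeting is only substituted when [if] evaluates the string, not when
$body is assigned.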
 
On to another topic: An alternative to maintaining source-code compatibility
with Tcl-8.4 would be for MacPorts to include in its distribution its own
bootstrap version of a newer Tcl.  This could have additional advantages.  For
example, I'm aware of at least one bug in the Tcl-8.5.7 shipped with Snow
Leopard that causes silent data corruption in the running interpreter, and
I've seen some very perplexing MacPorts bug reports that may in fact be caused
by this same bug.

-- 
Yorick

