
Tcl Source Code Encryption System
=================================

Document Id: $Id: README,v 2.0 1994/12/23 20:04:00 karl Exp $

Designed and implemented for NeoSoft by Karl Lehenbauer

Contents
========

    Overview
    Installing the NeoSoft Tcl Source Code Encryption System
    Tcl Implementation
    Operating the NeoSoft Tcl Source Code Encryption System
        Setting up the "ef" array
    Goals of the encryption system
    Theory of Implementation
    Potential problems
    Security
    Misc Notes

Overview
========

The NeoSoft Tcl Source Code Encryption System provides Tcl developers
with the ability to create and ship Tcl applications with the application's
Tcl/Tk source code stored in an encrypted form.

The encryption system functions quietly and unobtrusively: encrypted files
are decrypted by Tcl as they are sourced during normal execution.


Installing the NeoSoft Tcl Source Code Encryption System
========================================================

There are two files in the distribution, tcl.patch and tclx.patch.

Apply tcl.patch to the baseline Tcl 7.3 sources by cd'ing to tcl7.3
and doing a "patch -p0 <tcl.patch".

Apply tclx.patch to the Extended Tcl 7.3a patchlevel 2 sources by
cd'ing to tclX7.3a/src and doing a "patch -p0 <tclx.patch".

Put the "decrypt.c" file in tclX7.3a/src and edit Makefile.in and
Makefile to include "decrypt.o" in the list of object files.

Also add the following declarations to tclExtdInt.h and initialization
code to tclXAppInit.c:

------------------- add this to tclExtdInt.h --------------------
#ifdef NEOSOFT_TCL_ENCRYPTION

EXTERN void
NeoSoft_Decrypt _ANSI_ARGS_((Tcl_Interp *interp,
                             char       *fileName,
                             char       *dataBuffer,
                             long       fileOffset));


EXTERN int
NeoSoft_InitEncrypt _ANSI_ARGS_((Tcl_Interp *interp));

#endif /* NEOSOFT_TCL_ENCRYPTION */
------------------- end of add to tclExtdInt.h --------------------


------------------- add this to tclXAppInit.c -------------------
#ifdef NEOSOFT_TCL_ENCRYPTION
    if (NeoSoft_InitEncrypt (interp) == TCL_ERROR) {
        return TCL_ERROR;
    }
#endif
------------------- end of add to tclXAppInit.c -------------------


Tcl Implementation
==================

The C portion of the encryption/decryption code is in tclX7.3a/src/decrypt.c.

Two new C-coded commands are added at the Tcl level:  neo_encrypt and
neo_decrypt.  Both take a string to be encrypted (or decrypted) and one
or more keys as arguments.  Each key is used in turn to encrypt (decrypt)
the string.  Encryption is better when the keys are fairly long and their
lengths share no common factors.  If the exact keys used to encrypt the
string are not also used to decrypt it, a garbled string will be produced.

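One way to see why key lengths without common factors help: assuming each
key is applied cyclically and the keys are applied one after another, the
combined key pattern repeats only after the least common multiple of the
key lengths.  The sketch below computes that period.  The function names
are hypothetical, and the cyclic-key model is an assumption about
decrypt.c rather than a description of it.

```c
#include <string.h>

/* Greatest common divisor by Euclid's algorithm. */
static unsigned long gcd_ul(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical helper: number of characters after which the combined
 * pattern of several cyclically applied keys repeats, i.e. the least
 * common multiple of the key lengths.  Empty keys are ignored. */
unsigned long combined_key_period(const char *const keys[], int nkeys)
{
    unsigned long period = 1;
    int i;

    for (i = 0; i < nkeys; i++) {
        unsigned long len = (unsigned long)strlen(keys[i]);
        if (len == 0)
            continue;
        period = period / gcd_ul(period, len) * len;
    }
    return period;
}
```

With keys of lengths 7, 8 and 23 (such as "thecars", "foo.tlib" and
"This Is My Product Name") the period is 7 * 8 * 23 = 1288 characters,
while keys of lengths 8, 4 and 2 repeat after only 8.
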
The C code implementing the "source" and "sourcepart" commands has been
modified to make an additional call, to NeoSoft_DecryptFile.
NeoSoft_DecryptFile is a C routine that looks up the base name of the
file being sourced (the filename minus any directory names) in a global
Tcl array named "ef" and uses the value of that entry as a key for
decrypting the file.  It uses two additional keys: a global C char *
array called fileEncryptionKey, and the file base name itself.  In the
absence of an entry in the "ef" array, the file is sourced without
decryption.

In other words, a file is decrypted only if its base name (filename.tcl
without any preceding directories) is found in the global "ef" array.
Decryption uses three keys: the base name of the file, the char *
fileEncryptionKey array, and the value of the "ef" array entry
corresponding to the file.
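
The lookup rule above can be sketched in C as follows.  The ef_entry
table stands in for the Tcl "ef" array, and sketch_basename and
sketch_lookup_key are hypothetical names; the real routine reads the
array from the Tcl interpreter, which is not shown here.  The sketch also
assumes '/' is the only directory separator.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for one element of the Tcl "ef" array (an assumption made
 * for illustration; the real keys live in the interpreter). */
struct ef_entry {
    const char *base_name;   /* array index, e.g. "foo.tlib" */
    const char *key;         /* per-file key, e.g. "thecars"  */
};

/* Return the part of path after the last '/', or path itself. */
const char *sketch_basename(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}

/* Return the per-file key for path, or NULL if its base name has no
 * entry, in which case the file would be sourced without decryption. */
const char *sketch_lookup_key(const struct ef_entry *ef, size_t count,
                              const char *path)
{
    const char *base = sketch_basename(path);
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp(ef[i].base_name, base) == 0)
            return ef[i].key;
    }
    return NULL;
}
```

A NULL result corresponds to the case where the file is sourced without
decryption.  Because only the base name is compared, /opt/app/foo.tlib
and /usr/lib/other/foo.tlib resolve to the same entry.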

For example, suppose "foo.tlib" was encrypted using the strings "thecars",
"foo.tlib" and "This Is My Product Name".  If, upon execution, the Tcl
global array "ef" has an element named "foo.tlib" containing the string
"thecars", the last part of the filename is still "foo.tlib", and the
fileEncryptionKey C array contains "This Is My Product Name", then you
can "source" or "sourcepart" any region of the file and it will work.

A file or library encrypted by the "neo_encrypt" command (by reading the file
in as a string, encrypting the string, then writing it out) can be read
by "source" and "sourcepart", as long as the "ef" array contains the correct 
key and the file basename and value of the global C fileEncryptionKey 
remain the same.

Operating the NeoSoft Tcl Source Code Encryption System
=======================================================

The Source Code Encryption System works in two phases.  One is the
generation of the encrypted Tcl files -- this activity is of course
performed by the developer who is cutting a release of a Tcl
application.

The second phase is the end-user decryption phase.  In this phase
encrypted files are decrypted by Tcl during its normal execution.

In the first phase the program encryptdir.tcl is used to encrypt
the files that you want to have encrypted in your release.

You set it up by editing encryptconfig.tcl to set the Tcl variable
"encryptedFileString" to be exactly the same as the "fileEncryptionKey"
C char * array in decrypt.c.

Next, in encryptconfig.tcl, add a "set" command for an element of the
"fileKeys" array for the base name of each file to be encrypted.

Finally run the encryptor using "tcl encryptdir.tcl sourceDir destDir"
where sourceDir contains the unencrypted source and destDir is where
you want the encrypted source written.

Example contents of encryptconfig.tcl:

    set encryptedFileString "Solid Systems CD-ROM Jukebox"

    set fileKeys(encryptconfig.tcl) "alpha beetle is my name"
    set fileKeys(encryptdir.tcl) "once upon a time"
    set fileKeys(tencrypt.tcl) "hey is this a key or what"


Setting up the "ef" array
-------------------------

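The "ef" array of the decryptor runtime system needs to be loaded with
the encryption key for each encrypted file, indexed by the file's base
name (for the foo.tlib example in the Tcl Implementation section,
set ef(foo.tlib) "thecars").  The place to do this is in the TclInit.tcl
you ship with your app, or in the startup file that gets sourced to run
your app.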


Goals of the encryption system
==============================

    To not change the length of a file once it's encrypted.

    To not change the filenames of encrypted files.

    To support libraries unchanged, which requires:

	Being able to seek into the middle of a file and start
	reading and correctly decrypting.

    To not unduly hinder legitimate developers:

	The existing autoload mechanisms should still be supported.

	The encryption system should be unobtrusive.

	Files should encrypt into something benign that won't screw up a
	terminal if viewed.

	A mixed-mode environment with some encrypted and some unencrypted
	source should be supported.


Theory of Implementation
========================

The encryption algorithm is a variation of the one I designed and
implemented for a multiuser chat system [Lehenbauer-91].  It
encrypts printable characters to printable characters, and passes
the rest unchanged.  This makes encrypted files OK to inspect without 
screwing up your terminal, yet visually results in a heavily modified 
file that bears little resemblance to its unencrypted counterpart.

The algorithm has been modified from the original version to advance the
key pointer for every input character seen, regardless of whether the
character required encryption or not.  (The old one only advanced the key
pointer when it saw characters it needed to encrypt or decrypt.)  This
change was necessary to allow us to seek into a file and start decrypting
in the middle, a requirement if libraries are to be supported.
It also visually scrambles the characters quite a bit more than the
original.
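
The property described above (every character advances the key pointer,
so the key position depends only on the offset within the file) can be
illustrated with a small C sketch.  This is not the code from decrypt.c:
the shift rule, the 95-character printable ASCII range, and the names
shift_char and sketch_crypt are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define FIRST_PRINTABLE 0x20   /* space */
#define PRINTABLE_COUNT 95     /* 0x20 through 0x7E */

/* Hypothetical helper: shift one printable character by a key
 * character, wrapping within the printable range.  Characters outside
 * that range are returned unchanged. */
static char shift_char(char c, char k, int decrypting)
{
    unsigned char uc = (unsigned char)c;
    int shift, pos;

    if (uc < FIRST_PRINTABLE || uc >= FIRST_PRINTABLE + PRINTABLE_COUNT)
        return c;

    shift = (unsigned char)k % PRINTABLE_COUNT;
    if (decrypting)
        shift = PRINTABLE_COUNT - shift;      /* inverse rotation */
    pos = (uc - FIRST_PRINTABLE + shift) % PRINTABLE_COUNT;
    return (char)(FIRST_PRINTABLE + pos);
}

/* Transform len bytes of buf in place.  file_offset is the (nonnegative)
 * position of buf[0] within the file.  The key index is derived from the
 * absolute position, so it advances for every character, printable or
 * not, and decryption can begin at any offset. */
void sketch_crypt(char *buf, size_t len, long file_offset,
                  const char *key, int decrypting)
{
    size_t key_len = strlen(key);
    size_t i;

    if (key_len == 0)
        return;
    for (i = 0; i < len; i++) {
        size_t k = ((size_t)file_offset + i) % key_len;
        buf[i] = shift_char(buf[i], key[k], decrypting);
    }
}
```

Because the key index is computed from file_offset + i, decrypting a
region that starts at byte 16 of the file only requires passing 16 as
file_offset; nothing before that point has to be read.  The output is the
same length as the input, and newlines and tabs pass through untouched.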

Potential problems
==================

Files whose base names match between our code and a third-party
developer's will not work correctly, because we don't differentiate
beyond the base name.  Using more of the filename raises installation
issues: if we allow users to install us wherever they want, we have to
write or rewrite the file that sets the encryption key array with those
pathnames.

Security
========

This encryption algorithm will not withstand *serious* cryptanalysis.
Particularly, there is a technique called differential cryptanalysis
that would probably be effective at decrypting the files.
Expending a great deal of effort trying to attain a stronger algorithm is
pointless in any case, as all loaded procedures are available in almost 
plaintext by forcing the application/Tcl to generate a coredump while it 
is running.  Single-stepping Tcl startup with an assembly-level debugger
will also reveal what's going on, although someone doing that could be
in for a good deal of work.

The theory is that the encryption needs to be hard enough to stop most people
from getting around it and, for the rare individual who does get around it,
to ensure that they have clearly violated the licensing agreement.

Misc Notes
==========

It is important that you strip the symbol tables off of the C programs you 
ship that contain the Tcl core and the encryption software using the 
"strip" command, or much useful information will be left around for
the potential cracker.

