diff -rcN AFAIRE.txt.old AFAIRE.txt *** AFAIRE.txt.old Tue Jul 20 10:59:40 1999 --- AFAIRE.txt Fri Sep 03 15:34:54 1999 *************** *** 40,51 **** peut-être aussi pour Windows, pour du développement multi-plateforme), et fournir des binaires Macintosh ! - Modifier le comportement de " begin initial" afin qu'il vide la pile ! des conditions courantes. Pour l'instant il place la condition "initial" en ! tête de la pile, ce qui peut provoquer des piles sans fin quand on convertit ! des lexeurs (f)lex. Ceux-ci utilisent typiquement BEGIN(INITIAL) pour ! retourner à l'état initial, car elles n'ont pas de commande END. Peut-être ! ajouter une option aux lexeurs pour désactiver la pile de conditions (par ! exemple -nocs, pour "no conditions stack"), pour une meilleure compatibilité ! avec (f)lex. Il n'y a pas de réel équivalent aux piles de conditions dans ! (f)lex. --- 40,46 ---- peut-être aussi pour Windows, pour du développement multi-plateforme), et fournir des binaires Macintosh ! - Peut-être ajouter une option aux lexeurs pour désactiver la pile de conditions ! (par exemple -nocs, pour "no conditions stack"), pour une meilleure ! compatibilité avec (f)lex. Il n'y a pas de réel équivalent aux piles de ! conditions dans (f)lex. diff -rcN ANNONCE.txt.old ANNONCE.txt *** ANNONCE.txt.old Tue Jul 20 11:32:02 1999 --- ANNONCE.txt Fri Sep 03 15:34:54 1999 *************** *** 1,9 **** Je suis heureux d'annoncer la nouvelle version corrigée de tcLex : ! tcLex v1.1.4: un générateur d'analyseur lexical pour Tcl Par Frédéric BONNET (frederic.bonnet@ciril.fr) ! Mis à jour le 20 juillet 1999, 11:15 La page Web dédiée à cette extension est : http://www.multimania.com/fbonnet/Tcl/tcLex/index.htm --- 1,9 ---- Je suis heureux d'annoncer la nouvelle version corrigée de tcLex : ! tcLex v1.2a1: un générateur d'analyseur lexical pour Tcl Par Frédéric BONNET (frederic.bonnet@ciril.fr) ! 
Mis à jour le 3 septembre 1999, 15:08 La page Web dédiée à cette extension est : http://www.multimania.com/fbonnet/Tcl/tcLex/index.htm *************** *** 12,18 **** QUOI DE NEUF: ! - Bug corrigé sous Tcl8.1 qui causait un dépassement de chaîne. Voir le fichier changements.txt pour les détails. --- 12,22 ---- QUOI DE NEUF: ! - Adapté pour Tcl8.2. ! - Modifié le comportement de "lexer begin initial" pour qu'il vide la pile des ! conditions. ! - Supprimé les sources de la distribution Windows, qui devient une distribution ! exclusivement binaire. La distribution .tar.gz est multi-plateforme. Voir le fichier changements.txt pour les détails. *************** *** 63,75 **** VERSION ! La version actuelle de tcLex est 1.1.4 et suit la convention de Tcl, ! c'est-à-dire majeur.mineur.correctif. Cela signifie que cette version est le ! quatrième correctif (suffixe .4) de la dernière version stable 1.1, qui ! n'apporte pas de fonctions supplémentaires mais corriges différents problèmes. ! La plupart des informations utiles sont dans le fichier changements.txt. Le ! fichier AFAIRE.txt contiens les fonctions qui sont à implémenter dans de futures ! versions. OU OBTENIR TCLEX --- 67,82 ---- VERSION ! La version actuelle de tcLex est 1.2a1. Le suffixe "a1" signifie "alpha 1", ce ! qui veut dire que cette version est une version incomplète par rapport aux ! fonctions de la future 1.2, qui étend et corrige la précédente. Le fichier ! changements.txt décrit les changements effectués entre la première version de ! tcLex et la version actuelle. Bien que ce soit une version alpha, elle apporte ! plus de corrections de bugs que de nouveaux ;-). Dans ce cas, alpha signifie que ! de nombreuses fonctions prévues ne sont pas encore implémentées, et la ! documentation peut être incomplête. La plupart des informations utiles sont dans ! le fichier changements.txt. Le fichier AFAIRE.txt contiens les fonctions prévues ! qui ne sont pas encore implémentées. 
OU OBTENIR TCLEX *************** *** 78,91 **** http://www.multimania.com/fbonnet/Tcl/tcLex/index.htm Fichiers de distribution : ! - http://www.multimania.com/fbonnet/pub/tcLex114.zip ! (sources et binaires Windows pour Tcl8.0.5 et Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.tar.gz ! (sources Unix pour Tcl8.0.5 et Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.patch ! (fichier correctif pour la version 1.1.3) SUPPORT --- 85,98 ---- http://www.multimania.com/fbonnet/Tcl/tcLex/index.htm Fichiers de distribution : ! - http://www.multimania.com/fbonnet/pub/tcLex12a1.zip ! (binaires Windows pour Tcl8.0.5, Tcl8.1.1 et Tcl8.2) ! - http://www.multimania.com/fbonnet/pub/tcLex1.2a1.tar.gz ! (sources Windows/Unix pour Tcl8.0.5, Tcl8.1.1 et Tcl8.2) ! - http://www.multimania.com/fbonnet/pub/tcLex1.2a1.patch ! (fichier correctif pour la version 1.1.4) SUPPORT diff -rcN ANNOUNCE.txt.old ANNOUNCE.txt *** ANNOUNCE.txt.old Tue Jul 20 11:17:22 1999 --- ANNOUNCE.txt Fri Sep 03 15:34:54 1999 *************** *** 1,9 **** ! I am pleased to announce the new patched version of tcLex: ! tcLex v1.1.4: a lexical analyzer generator for Tcl by Frédéric BONNET (frederic.bonnet@ciril.fr) ! Updated 20 July 1999, 11:15 The home page for this package is: http://www.multimania.com/fbonnet/Tcl/tcLex/index.en.htm --- 1,9 ---- ! I am pleased to announce the new development version of tcLex: ! tcLex v1.2a1: a lexical analyzer generator for Tcl by Frédéric BONNET (frederic.bonnet@ciril.fr) ! Updated 3 September 1999, 15:08 The home page for this package is: http://www.multimania.com/fbonnet/Tcl/tcLex/index.en.htm *************** *** 12,18 **** WHAT'S NEW: ! - Corrected bug under Tcl8.1 that caused string overflow. See the file changes.txt for details --- 12,22 ---- WHAT'S NEW: ! - Adapted for Tcl8.2. ! - Modified the behavior of "lexer begin initial" so that it empties the ! conditions stack. ! - Removed sources from the Windows distribution, becoming a binary-only ! 
distribution. The .tar.gz distribution is cross-platform. See the file changes.txt for details *************** *** 59,70 **** VERSION ! The current tcLex version is 1.1.4 and follows the Tcl convention, that is ! major.minor.patchlevel. It means that this version is the fourth correction ! (.4 suffix) to the last stable version 1.1, and provides no new function but ! corrects several problems. Most of the useful info is in the file ! changes.txt. The file TODO.txt contains features that have to be implemented ! in future versions. WHERE TO GET TCLEX --- 63,77 ---- VERSION ! The current tcLex version is 1.2a1. The suffix "a1" means "alpha 1", meaning ! that this version is the first feature-incomplete release of the future 1.2, ! extending and correcting the previous 1.1. The file changes.txt describes the ! changes made between the first version of tcLex and the current version. ! Although it is alpha software, it brings more bug corrections than new bugs ! ;-). In this case, alpha means that many planned features are not yet ! implemented, and documentation may be incomplete. Most of the useful info is in ! the file changes.txt. The file TODO.txt contains planned features that need to ! be implemented. WHERE TO GET TCLEX *************** *** 73,86 **** http://www.multimania.com/fbonnet/Tcl/tcLex/index.en.htm Distribution files: ! - http://www.multimania.com/fbonnet/pub/tcLex114.zip ! (sources and Windows binaries for Tcl8.0.5 and Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.tar.gz ! (Unix sources for Tcl8.0.5 and Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.patch ! (patch file for version 1.1.3) SUPPORT --- 80,93 ---- http://www.multimania.com/fbonnet/Tcl/tcLex/index.en.htm Distribution files: ! - http://www.multimania.com/fbonnet/pub/tcLex12a1.zip ! (Windows binaries for Tcl8.0.5, Tcl8.1.1 and Tcl8.2) ! - http://www.multimania.com/fbonnet/pub/tcLex1.2a1.tar.gz ! (Windows/Unix sources for Tcl8.0.5, Tcl8.1.1 and Tcl8.2) !
- http://www.multimania.com/fbonnet/pub/tcLex1.2a1.patch ! (patch file for version 1.1.4) SUPPORT diff -rcN LISEZMOI.txt.old LISEZMOI.txt *** LISEZMOI.txt.old Tue Jul 20 11:34:10 1999 --- LISEZMOI.txt Fri Sep 03 15:34:54 1999 *************** *** 47,59 **** VERSION ! La version actuelle de tcLex est 1.1.4 et suit la convention de Tcl, ! c'est-à-dire majeur.mineur.correctif. Cela signifie que cette version est le ! quatrième correctif (suffixe .4) de la dernière version stable 1.1, qui ! n'apporte pas de fonctions supplémentaires mais corriges différents problèmes. ! La plupart des informations utiles sont dans le fichier changements.txt. Le ! fichier AFAIRE.txt contiens les fonctions qui sont à implémenter dans de futures ! versions. POURQUOI TCLEX ? --- 47,62 ---- VERSION ! La version actuelle de tcLex est 1.2a1. Le suffixe "a1" signifie "alpha 1", ce ! qui veut dire que cette version est une version incomplète par rapport aux ! fonctions de la future 1.2, qui étend et corrige la précédente. Le fichier ! changements.txt décrit les changements effectués entre la première version de ! tcLex et la version actuelle. Bien que ce soit une version alpha, elle apporte ! plus de corrections de bugs que de nouveaux ;-). Dans ce cas, alpha signifie que ! de nombreuses fonctions prévues ne sont pas encore implémentées, et la ! documentation peut être incomplête. La plupart des informations utiles sont dans ! le fichier changements.txt. Le fichier AFAIRE.txt contiens les fonctions prévues ! qui ne sont pas encore implémentées. POURQUOI TCLEX ? *************** *** 97,110 **** http://www.multimania.com/fbonnet/Tcl/tcLex/index.htm Fichiers de distribution : ! - http://www.multimania.com/fbonnet/pub/tcLex114.zip ! (sources et binaires Windows pour Tcl8.0.5 et Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.tar.gz ! (sources Unix pour Tcl8.0.5 et Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.patch ! 
(fichier correctif pour la version 1.1.3) SUPPORT --- 100,113 ---- http://www.multimania.com/fbonnet/Tcl/tcLex/index.htm Fichiers de distribution : ! - http://www.multimania.com/fbonnet/pub/tcLex12a1.zip ! (binaires Windows pour Tcl8.0.5, Tcl8.1.1 et Tcl8.2) ! - http://www.multimania.com/fbonnet/pub/tcLex1.2a1.tar.gz ! (sources Windows/Unix pour Tcl8.0.5, Tcl8.1.1 et Tcl8.2) ! - http://www.multimania.com/fbonnet/pub/tcLex1.2a1.patch ! (fichier correctif pour la version 1.1.4) SUPPORT *************** *** 121,134 **** Si vous voulez compiler tcLex vous-même, vous devez savoir qu'elle a besoin des sources Tcl pour compiler car elle utilise quelques structures internes. Elle ! peut être compilée avec Tcl 8.0 ou 8.1. * Windows: ! Des bibliothèques précompilées sont disponibles dans le répertoire "win". ! Cependant, vous pouvez compiler l'extension vous-même. Allez dans le répertoire ! "src", éditez le fichier "makefile.vc" pour Microsoft Visual C++ (pas de Borland ! pour l'instant, des volontaires :-) et éditez les différentes variables pour ! refléter votre propre installation (compilateur, Tcl...). Ensuite, tapez sur la ligne de commande : --- 124,137 ---- Si vous voulez compiler tcLex vous-même, vous devez savoir qu'elle a besoin des sources Tcl pour compiler car elle utilise quelques structures internes. Elle ! peut être compilée avec Tcl 8.0, 8.1 ou 8.2. * Windows: ! Des bibliothèques précompilées sont disponibles dans une distribution binaire ! distincte. Cependant, vous pouvez compiler l'extension vous-même. Allez dans le ! répertoire "src", éditez le fichier "makefile.vc" pour Microsoft Visual C++ (pas ! de Borland pour l'instant, des volontaires :-) et éditez les différentes ! variables pour refléter votre propre installation (compilateur, Tcl...). Ensuite, tapez sur la ligne de commande : *************** *** 167,184 **** * MacOS: Il n'y a pas de makefile pour cette plateforme pour l'instant, cependant la ! 
compilation devrait être facile, il n'y a qu'un seul fichier C. La seule chose ! dont le source a besoin est la variable TCLEX_VERSION définie à la compilation. ! Vous pouvez jeter un oeil au makefile pour Windows. INSTALLATION DES BINAIRES Windows: ! Deux bibliothèques précompilées sont fournies avec la distribution, nommés ! tcLex80.dll et tcLex81.dll, respectivement pour Tcl 8.0 et 8.1, dans le ! répertoire "win". Copiez les seulement avec le fichier pkgIndex.tcl dans un ! répertoire de votre choix dand le répertoire "lib" de Tcl. MacOS, Unix: Pas de distribution binaire pour l'instant. --- 170,188 ---- * MacOS: Il n'y a pas de makefile pour cette plateforme pour l'instant, cependant la ! compilation devrait être facile, il n'y a que deux fichiers C. Les seules choses ! dont le source a besoin sont les variables TCLEX_VERSION, BUILD_tcLex et ! USE_TCL_STUBS (si applicable) définies à la compilation. Vous pouvez jeter un ! oeil au makefile pour Windows. INSTALLATION DES BINAIRES Windows: ! Trois bibliothèques précompilées sont fournies avec la distribution binaire, ! nommés tcLex80.dll, tcLex81.dll et tcLex82.dll, respectivement pour Tcl 8.0, 8.1 ! et 8.2. Copiez-les simplement avec le fichier pkgIndex.tcl dans un ! sous-répertoire de votre choix dans le répertoire "lib" de Tcl. MacOS, Unix: Pas de distribution binaire pour l'instant. diff -rcN README.txt.old README.txt *** README.txt.old Tue Jul 20 11:18:46 1999 --- README.txt Fri Sep 03 15:34:54 1999 *************** *** 40,51 **** VERSION ! The current tcLex version is 1.1.4 and follows the Tcl convention, that is ! major.minor.patchlevel. It means that this version is the fourth correction ! (.4 suffix) to the last stable version 1.1, and provides no new function but ! corrects several problems. Most of the useful info is in the file ! changes.txt. The file TODO.txt contains features that have to be implemented ! in future versions. WHY TCLEX? --- 40,54 ---- VERSION ! The current tcLex version is 1.2a1. 
The suffix "a1" means "alpha 1", meaning ! that this version is the first feature-incomplete release of the future 1.2, ! extending and correcting the previous 1.1. The file changes.txt describes the ! changes made between the first version of tcLex and the current version. ! Although it is alpha software, it brings more bug corrections than new bugs ! ;-). In this case, alpha means that many planned features are not yet ! implemented, and documentation may be incomplete. Most of the useful info is in ! the file changes.txt. The file TODO.txt contains planned features that need to ! be implemented. WHY TCLEX? *************** *** 86,99 **** http://www.multimania.com/fbonnet/Tcl/tcLex/index.en.htm Distribution files: ! - http://www.multimania.com/fbonnet/pub/tcLex114.zip ! (sources and Windows binaries for Tcl8.0.5 and Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.tar.gz ! (Unix sources for Tcl8.0.5 and Tcl8.1.1) ! - http://www.multimania.com/fbonnet/pub/tcLex1.1.4.patch ! (patch file for version 1.1.3) SUPPORT --- 89,102 ---- http://www.multimania.com/fbonnet/Tcl/tcLex/index.en.htm Distribution files: ! - http://www.multimania.com/fbonnet/pub/tcLex12a1.zip ! (Windows binaries for Tcl8.0.5, Tcl8.1.1 and Tcl8.2) ! - http://www.multimania.com/fbonnet/pub/tcLex1.2a1.tar.gz ! (Windows/Unix sources for Tcl8.0.5, Tcl8.1.1 and Tcl8.2) ! - http://www.multimania.com/fbonnet/pub/tcLex1.2a1.patch ! (patch file for version 1.1.4) SUPPORT *************** *** 109,122 **** If you want to compile tcLex yourself, you must know that it needs the Tcl source to compile because it makes use of some internal structures. It will ! compile with Tcl 8.0 or 8.1. * Windows: ! Precompiled libraries are available in the "win" subdir. However, you can ! compile the extension yourself. Go to the "src" directory, edit the file ! "makefile.vc" for Microsoft Visual C++ (no Borland file yet, volunteers :-) and !
edit the different variables to reflect your own installation (compiler, ! Tcl...). Next, type on the command line: --- 112,125 ---- If you want to compile tcLex yourself, you must know that it needs the Tcl source to compile because it makes use of some internal structures. It will ! compile with Tcl 8.0, 8.1 or 8.2. * Windows: ! Precompiled libraries are available in a separate binary distribution. However, ! you can compile the extension yourself. Go to the "src" directory of the source ! distribution, edit the file "makefile.vc" for Microsoft Visual C++ (no Borland ! file yet, volunteers :-) and edit the different variables to reflect your own ! installation (compiler, Tcl...). Next, type on the command line: *************** *** 154,171 **** * MacOS: There are no makefiles for this platform yet, however compilation should be ! easy, there is only two C file. The only thing the source needs is the variable ! TCLEX_VERSION being defined at compile time. You can take a look at the makefile ! for Windows. BINARY INSTALLATION Windows: ! Two precompiled libraries are provided with the distribution, named tcLex80.dll ! and tcLex81.dll, respectively for Tcl 8.0 and 8.1, in the "win" dir. Just copy ! them and the file pkgIndex.tcl in a directory of your choice in the Tcl "lib" ! dir. MacOS, Unix: No binary distribution for now. --- 157,174 ---- * MacOS: There are no makefiles for this platform yet, however compilation should be ! easy, there are only two C files. The only things the source needs are the ! variables TCLEX_VERSION, BUILD_tcLex and USE_TCL_STUBS (if applicable) being ! defined at compile time. You can take a look at the makefile for Windows. BINARY INSTALLATION Windows: ! Three precompiled libraries are provided in the binary distribution, named ! tcLex80.dll, tcLex81.dll and tcLex82.dll, respectively for Tcl 8.0, 8.1 and 8.2. ! Just copy them and the file pkgIndex.tcl in a sub-directory of your choice in ! the Tcl "lib" dir.
MacOS, Unix: No binary distribution for now. diff -rcN TODO.txt.old TODO.txt *** TODO.txt.old Tue Jul 20 11:00:56 1999 --- TODO.txt Fri Sep 03 15:34:54 1999 *************** *** 39,48 **** maybe also for Windows, for cross-platform development), and provide Macintosh binaries ! - Modify " begin initial" behavior so that it empties the current ! conditions stack. For now it pushes the "initial" condition on top of the ! stack, which can result in endless stacks when converting (f)lex lexers. ! These typically use BEGIN(INITIAL) to return to the initial state, as they ! have no END statement. Maybe add an option to lexers to disable the conditions ! stack (eg. -nocs, for "no conditions stack"), for a better (f)lex ! compatibility. There is no real equivalent to conditions stacks in (f)lex. --- 39,44 ---- maybe also for Windows, for cross-platform development), and provide Macintosh binaries ! - Maybe add an option to lexers to disable the conditions stack (eg. -nocs, for ! "no conditions stack"), for a better (f)lex compatibility. There is no real ! equivalent to conditions stacks in (f)lex. diff -rcN changements.txt.old changements.txt *** changements.txt.old Tue Jul 20 11:14:28 1999 --- changements.txt Fri Sep 03 15:34:54 1999 *************** *** 262,272 **** (http://www.scriptics.com/support/howto/regexp81.html). ! -------- 20/07/1999 tcLex version 1.1.3 -------- 1. Bug majeur corrigé avec Tcl 8.1. Les fonctions BufferNotStarving() et BufferAtEnd() mélangeaient des index en caractères et en octets, ce qui ! entrainait des dépassements de chaine. Bug rapporté par Neil Walker. Il est étonnant que ce bug ne se soit pas déclaré avant car les risques de dépassement étaient quasi systématiques, or il ne plantait tcLex que dans des cas bien précis (difficiles à reproduire sous Windows). --- 262,297 ---- (http://www.scriptics.com/support/howto/regexp81.html). ! -------- 20/07/1999 tcLex version 1.1.4 -------- 1. Bug majeur corrigé avec Tcl 8.1. 
Les fonctions BufferNotStarving() et BufferAtEnd() mélangeaient des index en caractères et en octets, ce qui ! entrainait des dépassements de chaîne. Bug rapporté par Neil Walker. Il est étonnant que ce bug ne se soit pas déclaré avant car les risques de dépassement étaient quasi systématiques, or il ne plantait tcLex que dans des cas bien précis (difficiles à reproduire sous Windows). + + + -------- 03/09/1999 tcLex version 1.2a1 -------- + + 1. Ajouté le support pour Tcl8.2 et supérieur. Maintenant que le moteur regexp + de Tcl8.2 procure les fonctions dont tcLex a besoin (cad. détection de + dépassement de chaîne et recherche limitée au début chaîne), tcLex ne nécessite + plus de modification de ce moteur. Ceci rend le code plus simple car il utilise + maintenalnt les fonctions standard de la bibliothèque Tcl. Ajouté le fichier + RE82.c + + 2. La chaîne d'entrée est maintenant stockée sous forme de Tcl_Obj et non plus + de Tcl_DString. Retravaillé le code correspondant en conséquence (RuleTry(), + RuleExec(), RuleGetRange()). Sous Tcl8.0, on utilise la chaîne 8bits de l'objet. + Sous Tcl8.2, la chaîne Unicode (pas UTF-8) (en fait, on passe la chaîne aux + procédures de bibliothèque Tcl, qui à leur tour utilisent la représentation + Unicode de l'objet). Sous Tcl8.1, ajout d'un type d'objet Unicode et des + procédures relatives (par exemple Tcl_NewUnicodeObj(), Tcl_GetUnicode() et + Tcl_GetCharLength()) afin d'être compatible source avec Tcl8.2. Ces nouveaux + objets Unicode utilisent des Tcl_DStrings Unicode comme représentation interne. + + 3. Modifié le comportement de "lexer begin initial" afin qu'il vide la pile de + conditions au lieu d'empiler la condition "initial" à son sommet. Cela rend + l'écriture de certains lexeurs plus facile (par exemple, les exemples flex de + Neil Walker). 
diff -rcN changes.txt.old changes.txt *** changes.txt.old Tue Jul 20 11:13:38 1999 --- changes.txt Fri Sep 03 15:34:54 1999 *************** *** 244,252 **** -------- 07/20/1999 tcLex version 1.1.4 -------- ! 1. Corrected major bug majeur with Tcl 8.1. The functions BufferNotStarving() and BufferAtEnd() mixed character and byte indices. which resulted in string overflows. Bug reported by Neil Walker. It is surprising that this bug did not show up earlier because the string overflows occured eventually in virtually any case, however it only crashed tcLex in very precise cases (hard to reproduce on Windows). --- 244,274 ---- -------- 07/20/1999 tcLex version 1.1.4 -------- ! 1. Corrected major bug with Tcl 8.1. The functions BufferNotStarving() and BufferAtEnd() mixed character and byte indices. which resulted in string overflows. Bug reported by Neil Walker. It is surprising that this bug did not show up earlier because the string overflows occured eventually in virtually any case, however it only crashed tcLex in very precise cases (hard to reproduce on Windows). + + + -------- 09/03/1999 tcLex version 1.2a1 -------- + + 1. Added support for Tcl8.2 and higher. Now that Tcl8.2's regexp engine provides + the features needed by tcLex (ie string overrun detection and matching at the + beginning of the string), tcLex no longer needs a patched version of this + engine. This makes the code much simpler as it now uses standard Tcl library + functions. Added file RE82.c + + 2. The input string is now stored as a Tcl_Obj instead of a Tcl_DString. + Reworked the related code in consequence (RuleTry(), RuleExec(), + RuleGetRange()). Under Tcl8.0, use the obj's 8bits string. Under Tcl8.2, use the + obj's Unicode (not UTF-8) string (actually, only pass the string obj to the Tcl + library procs, which in turn use the obj's Unicode representation). Under + Tcl8.1, added a Unicode object type and related procs (eg. 
Tcl_NewUnicodeObj(), + Tcl_GetUnicode() and Tcl_GetCharLength()) to be source compatible with Tcl8.2. + These new Unicode objects use Unicode Tcl_DStrings as their internal rep. + + 3. Modified "lexer begin initial" behavior so that it empties the conditions + stack rather than pushing the "initial" condition on top of it. This makes some + lexers easier to write (eg. Neil Walker's flex examples). Binary files doc/en/light_slate.jpg.old and doc/en/light_slate.jpg differ diff -rcN src/Makefile.in.old src/Makefile.in *** src/Makefile.in.old Thu Jun 17 15:18:40 1999 --- src/Makefile.in Fri Sep 03 15:34:52 1999 *************** *** 9,27 **** # # Project setting -- version and stuff # ! # PROJECT -- name of the project. Used in the file and package names ! # PROJECT_VERSION -- version of the project, in major.minor format ! # VERSION_DEFINE -- symbol used in code that hold the version number ! PROJECT = tcLex ! PROJECT_VERSION = 1.1 VERSION_DEFINE = TCLEX_VERSION # Object files ! OBJS = tcLex.o \ ! tcLexRE.o ! # MANS = tclex.n #------------------------------------------------- INSTALL = @INSTALL@ --- 9,27 ---- # # Project setting -- version and stuff # ! # PROJECT -- name of the project. Used in the file and package names ! # PROJECT_VERSION -- version of the project, in major.minor format ! # VERSION_DEFINE -- symbol used in code that hold the version number ! PROJECT = tcLex ! PROJECT_VERSION = 1.2 VERSION_DEFINE = TCLEX_VERSION # Object files ! OBJS = tcLex.o \ ! tcLexRE.o ! # MANS = tclex.n #------------------------------------------------- INSTALL = @INSTALL@ diff -rcN src/RE80.c.old src/RE80.c *** src/RE80.c.old Thu Jun 24 16:10:16 1999 --- src/RE80.c Fri Sep 03 15:34:52 1999 *************** *** 487,501 **** that only matches at the beginning of the strings */ int ! RuleExec(interp, lexer, rule, string, start, numChars, pOverrun) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! char *string; ! char *start; ! 
int numChars; /* Unused, strings are null-terminated */ int *pOverrun; /* Used to report overrun conditions */ { int match; #if 0 /* Deactivate regmust -- see below */ --- 487,502 ---- that only matches at the beginning of the strings */ int ! RuleExec(interp, lexer, rule, stringObj, index, start, pOverrun) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! Tcl_Obj *stringObj; ! int index; ! int start; int *pOverrun; /* Used to report overrun conditions */ { + char *string = Tcl_GetStringFromObj(stringObj, NULL); int match; #if 0 /* Deactivate regmust -- see below */ *************** *** 535,541 **** /* If there is a "must appear" string, look for it. */ if (prog->regmust != NULL) { ! s = string; /* Modification */ /* while ((s = strchr(s, prog->regmust[0])) != NULL) {*/ while ((s = findChar(s, prog->regmust[0], restate)) != NULL) { --- 536,542 ---- /* If there is a "must appear" string, look for it. */ if (prog->regmust != NULL) { ! s = string+index; /* Modification */ /* while ((s = strchr(s, prog->regmust[0])) != NULL) {*/ while ((s = findChar(s, prog->regmust[0], restate)) != NULL) { *************** *** 555,567 **** /* Modification */ /* restate->regbol = start;*/ ! restate->regbol = ( (string == start) || ( (restate->bLines) && (*(string-1) == '\n'))) ! ? string : start; /* End modification */ ! match = regtry(prog, string, restate); *pOverrun = restate->overrun; if (TclGetRegError() != NULL) { --- 556,568 ---- /* Modification */ /* restate->regbol = start;*/ ! restate->regbol = ( (index == start) || ( (restate->bLines) && (*(string-1) == '\n'))) ! ? string+index : string+start; /* End modification */ ! match = regtry(prog, string+index, restate); *pOverrun = restate->overrun; if (TclGetRegError() != NULL) { *************** *** 592,609 **** */ void ! RuleGetRange(interp, lexer, rule, string, stringIndex, index, startPtr, endPtr) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! char *string; ! 
int stringIndex; /* Unused */ int index; int *startPtr; int *endPtr; { char *s, *e; ! Tcl_RegExpRange(rule->re, index, &s, &e); if (s == NULL) { *startPtr = *endPtr = -1; --- 593,611 ---- */ void ! RuleGetRange(interp, lexer, rule, stringObj, index, ruleIndex, startPtr, endPtr) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! Tcl_Obj *stringObj; int index; + int ruleIndex; int *startPtr; int *endPtr; { + char *string = Tcl_GetStringFromObj(stringObj, NULL); char *s, *e; ! Tcl_RegExpRange(rule->re, ruleIndex, &s, &e); if (s == NULL) { *startPtr = *endPtr = -1; diff -rcN src/RE81.c.old src/RE81.c *** src/RE81.c.old Thu Jun 24 16:04:32 1999 --- src/RE81.c Fri Sep 03 15:34:54 1999 *************** *** 20,25 **** --- 20,26 ---- * regexp compile (we still use Tcl's). Debug options have been stripped. */ + #include "tcLexInt.h" /************************************** * Modified regexec.c from Tcl source * *************** *** 1701,1720 **** that only matches at the beginning of the strings */ int ! RuleExec(interp, lexer, rule, string, start, numChars, pOverrun) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! Tcl_UniChar *string; ! Tcl_UniChar *start; ! int numChars; int *pOverrun; /* Used to report overrun conditions */ { int status, flags = 0; TclRegexp *regexpPtr = (TclRegexp *)getRegexp(interp, rule->reObj, lexer->flags); /* Addition */ - struct guts *g = (struct guts *)regexpPtr->re.re_guts; rm_detail_t details; /* Error checking */ --- 1702,1722 ---- that only matches at the beginning of the strings */ int ! RuleExec(interp, lexer, rule, stringObj, index, start, pOverrun) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! Tcl_Obj *stringObj; ! int index; ! 
int start; int *pOverrun; /* Used to report overrun conditions */ { int status, flags = 0; TclRegexp *regexpPtr = (TclRegexp *)getRegexp(interp, rule->reObj, lexer->flags); + int numChars = Tcl_GetCharLength(stringObj); + Tcl_UniChar *string = Tcl_GetUnicode(stringObj); /* Addition */ rm_detail_t details; /* Error checking */ *************** *** 1728,1736 **** flags |= REG_REPORTEOS; /* Check for beginning of (line|string) */ ! flags |= ( (string == start) ! || ( (g->cflags & REG_NLANCH) ! && (*(string-1) == '\n'))) ? 0 : REG_NOTBOL; /* End addition */ --- 1730,1738 ---- flags |= REG_REPORTEOS; /* Check for beginning of (line|string) */ ! flags |= ( (index == start) ! || ( (regexpPtr->flags & REG_NLANCH) ! && (string[index-1] == '\n'))) ? 0 : REG_NOTBOL; /* End addition */ *************** *** 1744,1750 **** (rm_detail_t*)NULL, regexpPtr->re.re_nsub + 1, regexpPtr->matches, ((string > start) ? REG_NOTBOL : 0)); */ ! status = exec(&regexpPtr->re, string, (size_t) numChars, &details, regexpPtr->re.re_nsub + 1, regexpPtr->matches, flags); *pOverrun = details.endReached; --- 1746,1752 ---- (rm_detail_t*)NULL, regexpPtr->re.re_nsub + 1, regexpPtr->matches, ((string > start) ? REG_NOTBOL : 0)); */ ! status = exec(&regexpPtr->re, string+index, (size_t) numChars-index, &details, regexpPtr->re.re_nsub + 1, regexpPtr->matches, flags); *pOverrun = details.endReached; *************** *** 1792,1804 **** */ void ! RuleGetRange(interp, lexer, rule, string, stringIndex, index, startPtr, endPtr) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! Tcl_UniChar *string; /* Unused */ ! int stringIndex; int index; int *startPtr; int *endPtr; { --- 1794,1806 ---- */ void ! RuleGetRange(interp, lexer, rule, stringObj, index, rangeIndex, startPtr, endPtr) Tcl_Interp *interp; TcLex_Lexer *lexer; TcLex_Rule *rule; ! Tcl_Obj *stringObj; /* Unused */ int index; + int rangeIndex; int *startPtr; int *endPtr; { *************** *** 1810,1820 **** return; } !
if ((size_t) index > regexpPtr->re.re_nsub) { *startPtr = *endPtr = -1; } else { ! *startPtr = regexpPtr->matches[index].rm_so+stringIndex; ! *endPtr = regexpPtr->matches[index].rm_eo+stringIndex-1; } } --- 1812,1822 ---- return; } ! if ((size_t) rangeIndex > regexpPtr->re.re_nsub) { *startPtr = *endPtr = -1; } else { ! *startPtr = regexpPtr->matches[rangeIndex].rm_so+index; ! *endPtr = regexpPtr->matches[rangeIndex].rm_eo+index-1; } } diff -rcN src/RE82.c.old src/RE82.c *** src/RE82.c.old Thu Jan 01 01:00:00 1970 --- src/RE82.c Fri Sep 03 15:34:54 1999 *************** *** 0 **** --- 1,214 ---- + /* + * As Tcl 8.2 introduced some changes needed by Expect and also by tcLex + * (ie matching at the beginning of string and partial match reporting, + * we don't need to include a patched version of the regexp engine. + */ + + + static Tcl_RegExp + getRegexp(interp, reObj, flags) + Tcl_Interp *interp; + Tcl_Obj *reObj; + int flags; + { + /* Hack: Corrects a bug in Tcl handling of REG_BOSONLY: REs must be + * enclosed between non-capturing parentheses */ + static Tcl_ObjType *regexpObjType = NULL; + Tcl_RegExp regexp; + TclRegexp *regexpPtr = (TclRegexp *) reObj->internalRep.otherValuePtr; + int cflags = REG_ADVANCED | REG_BOSONLY | REG_EXPECT + | ((flags & LEXER_FLAG_LINES) ? REG_NEWLINE : 0) + | ((flags & LEXER_FLAG_NOCASE) ? 
REG_ICASE : 0); + + + if (regexpObjType == NULL || reObj->typePtr != regexpObjType || regexpPtr->flags != cflags) { + /* Build a temporary string holding the modified regexp */ + int tmpi; + char *tmpc; + char *re; + + re = Tcl_Alloc(reObj->length+5); + strcpy(re, "(?:"); + strncat(re, reObj->bytes, reObj->length); + strcat(re, ")"); + + tmpi = reObj->length; + tmpc = reObj->bytes; + reObj->bytes = re; + reObj->length = reObj->length+4; + + regexp = Tcl_GetRegExpFromObj(interp, reObj, cflags); + + reObj->bytes = tmpc; + reObj->length = tmpi; + + Tcl_Free(re); + + regexpObjType = reObj->typePtr; + } else { + regexp = (Tcl_RegExp) regexpPtr; + } + + return regexp; + } + + + + + int + RuleExec(interp, lexer, rule, stringObj, index, start, pOverrun) + Tcl_Interp *interp; + TcLex_Lexer *lexer; + TcLex_Rule *rule; + Tcl_Obj *stringObj; + int index; + int start; + int *pOverrun; /* Used to report overrun conditions */ + { + int status, flags = 0; + Tcl_UniChar *string = Tcl_GetUnicode(stringObj); + int numChars = Tcl_GetCharLength(stringObj); + TclRegexp *regexpPtr = (TclRegexp *)getRegexp(interp, rule->reObj, lexer->flags); + + /* Error checking */ + if (regexpPtr == NULL) + return -1; + + /* Check for beginning of (line|string) */ + flags |= ( (index == start) + || ( (regexpPtr->flags & REG_NLANCH) + && (string[index-1] == '\n'))) + ? 0 : REG_NOTBOL; + + /* + * Perform the regexp match. + */ + + status = Tcl_RegExpExecObj(interp, (Tcl_RegExp)regexpPtr, stringObj, index, + regexpPtr->re.re_nsub + 1, flags); + + /* Check overrun */ + if (regexpPtr->details.rm_extend.rm_so == 0 + && regexpPtr->details.rm_extend.rm_eo == numChars-index) { + *pOverrun = 1; + } + + return status; + } + + + + /* + *-------------------------------------------------------------- + * + * RuleGetRange -- + * + * This procedure returns the matching range of re's i-th + * subexpression. + * + * Results: + * None. + * + * Side effects: + * Returns range indices in the given pointers. 
+ * + *-------------------------------------------------------------- + */ + + void + RuleGetRange(interp, lexer, rule, stringObj, index, rangeIndex, startPtr, endPtr) + Tcl_Interp *interp; + TcLex_Lexer *lexer; + TcLex_Rule *rule; + Tcl_Obj *stringObj; /* Unused */ + int index; + int rangeIndex; + int *startPtr; + int *endPtr; + { + TclRegexp *regexpPtr = (TclRegexp *)getRegexp(interp, rule->reObj, lexer->flags); + + /* Error checking */ + if (regexpPtr == NULL) { + *startPtr = *endPtr = -1; + return; + } + + if ((size_t) rangeIndex > regexpPtr->re.re_nsub) { + *startPtr = *endPtr = -1; + } else { + *startPtr = regexpPtr->matches[rangeIndex].rm_so+index; + *endPtr = regexpPtr->matches[rangeIndex].rm_eo+index-1; + } + } + + + /* + *-------------------------------------------------------------- + * + * RuleCompileRegexp -- + * + * This procedure compiles the given regexp. + * + * Results: + * A standard Tcl result. + * + * Side effects: + * Modifies the given rule. + * + *-------------------------------------------------------------- + */ + + Tcl_ObjCmdProc *regexpObjCmd = NULL; + Tcl_Obj *regexpObj = NULL, + *lineObj = NULL, + *nocaseObj = NULL, + *dashObj = NULL, + *stringObj = NULL; + int + RuleCompileRegexp(interp, rule, reObj, flags) + Tcl_Interp *interp; + TcLex_Rule *rule; + Tcl_Obj *reObj; + int flags; + { + int i, reLength; + char *reString = Tcl_GetStringFromObj(reObj, &reLength); + Tcl_RegExp re = getRegexp(interp, reObj, flags); /* Compiled regexp */ + + /* Error checking */ + if (re == NULL) { + return TCL_ERROR; + } + + /* + * Initialize the rule + */ + + /* Roughly count the number of ranges, ie the parentheses */ + rule->nbRanges = 1; + for (i=0; i < reLength; i++) { + if (reString[i] == '(') rule->nbRanges++; + } + + rule->re = re; + rule->reObj = reObj; + Tcl_IncrRefCount(reObj); + + return TCL_OK; + } + + + void + RuleFree(rule) + TcLex_Rule *rule; + { + Tcl_Free((char*)rule->conditionsIndices); + if (rule->reObj) + Tcl_DecrRefCount(rule->reObj); 
+ /* Do not free re because it belongs to reObj */ + if (rule->matchVars) + Tcl_DecrRefCount(rule->matchVars); + if (rule->script) + Tcl_DecrRefCount(rule->script); + } diff -rcN src/makefile.vc.old src/makefile.vc *** src/makefile.vc.old Tue Jul 06 11:43:58 1999 --- src/makefile.vc Fri Sep 03 15:34:52 1999 *************** *** 14,20 **** # VERSION_DEFINE -- symbol used in code that hold the version number PROJECT = tcLex ! PROJECT_VERSION = 1.1 VERSION_DEFINE = TCLEX_VERSION # Set NODEBUG to 0 to compile with symbols --- 14,20 ---- # VERSION_DEFINE -- symbol used in code that hold the version number PROJECT = tcLex ! PROJECT_VERSION = 1.2 VERSION_DEFINE = TCLEX_VERSION # Set NODEBUG to 0 to compile with symbols *************** *** 41,54 **** TCL_V = 80 !ENDIF ! !IF "$(TCL_V)" == "81" TCL_DIR = c:\usr\local\development\tcl8.1.1 TCL_SRC = $(TCL_DIR)\src\tcl8.1.1 STUBS = 1 !ELSE TCL_DIR = c:\usr\local\development\tcl8.0.5 TCL_SRC = $(TCL_DIR)\src\tcl8.0.5 - TCL_LIB = $(TCL_DIR)\lib\tcl$(TCL_V).lib STUBS = 0 !ENDIF --- 41,57 ---- TCL_V = 80 !ENDIF ! !IF "$(TCL_V)" == "82" ! TCL_DIR = c:\usr\local\development\tcl8.2.0 ! TCL_SRC = $(TCL_DIR)\src\tcl8.2.0 ! STUBS = 1 ! !ELSEIF "$(TCL_V)" == "81" TCL_DIR = c:\usr\local\development\tcl8.1.1 TCL_SRC = $(TCL_DIR)\src\tcl8.1.1 STUBS = 1 !ELSE TCL_DIR = c:\usr\local\development\tcl8.0.5 TCL_SRC = $(TCL_DIR)\src\tcl8.0.5 STUBS = 0 !ENDIF *************** *** 147,153 **** ###################################################################### tcLex.c: tcLex.h tcLexInt.h tcLexRE.h ! !IF "$(TCL_V)" == "81" tcLexRE.c: tcLex.h tcLexRE.h RE81.c !ELSE tcLexRE.c: tcLex.h tcLexRE.h RE80.c --- 150,158 ---- ###################################################################### tcLex.c: tcLex.h tcLexInt.h tcLexRE.h ! !IF "$(TCL_V)" == "82" ! tcLexRE.c: tcLex.h tcLexRE.h RE82.c ! 
!ELSEIF "$(TCL_V)" == "81" tcLexRE.c: tcLex.h tcLexRE.h RE81.c !ELSE tcLexRE.c: tcLex.h tcLexRE.h RE80.c diff -rcN src/pkgIndex.tcl.win.old src/pkgIndex.tcl.win *** src/pkgIndex.tcl.win.old Wed Nov 11 17:46:28 1998 --- src/pkgIndex.tcl.win Fri Sep 03 15:34:52 1999 *************** *** 1,4 **** # Tcl package index file, version 1.1 ! package ifneeded tcLex 1.1 [list tclPkgSetup $dir tcLex 1.1 "{tclex[join [split [info tclversion] .] ""].dll load lexer}"] --- 1,4 ---- # Tcl package index file, version 1.1 ! package ifneeded tcLex 1.2 [list tclPkgSetup $dir tcLex 1.2 "{tclex[join [split [info tclversion] .] ""].dll load lexer}"] diff -rcN src/tcLex.c.old src/tcLex.c *** src/tcLex.c.old Fri Jul 16 12:04:24 1999 --- src/tcLex.c Fri Sep 03 15:34:52 1999 *************** *** 272,278 **** for (i=0; i < lexer->nbRules; i++) statePtr->bFailed[i] = 0; ! Tcl_DStringInit(&statePtr->inputBuffer.chars); StateSetString(lexer, n, string); statePtr->inputBuffer.index = 0; --- 272,278 ---- for (i=0; i < lexer->nbRules; i++) statePtr->bFailed[i] = 0; ! statePtr->inputBuffer.chars = NULL; StateSetString(lexer, n, string); statePtr->inputBuffer.index = 0; *************** *** 291,296 **** --- 291,451 ---- /* + * Tcl8.1 compatibility procs: Unicode object type. 
+ * This object uses a Unicode string in a Tcl_DString as their + * internal representation + */ + + #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 1) + + Tcl_FreeInternalRepProc FreeUnicode; + Tcl_DupInternalRepProc DupUnicode; + Tcl_UpdateStringProc UpdateUnicode; + Tcl_SetFromAnyProc SetUnicodeFromAny; + + Tcl_ObjType UnicodeObjType = { + "unicode", + + FreeUnicode, /* freeIntRepProc */ + DupUnicode, /* dupIntRepProc */ + UpdateUnicode, /* updateIntRepProc */ + SetUnicodeFromAny /* setFromAnyProc */ + }; + + void + FreeUnicode(objPtr) + Tcl_Obj *objPtr; + { + register Tcl_DString *string = (Tcl_DString *) objPtr->internalRep.otherValuePtr; + + /* Free the DString */ + Tcl_DStringFree(string); + Tcl_Free((char *) string); + } + + void + DupUnicode(srcPtr, dupPtr) + Tcl_Obj *srcPtr; + Tcl_Obj *dupPtr; + { + register Tcl_DString *srcString = (Tcl_DString *) srcPtr->internalRep.otherValuePtr; + register Tcl_DString *dupString; + + /* Duplicate the DString */ + dupString = (Tcl_DString *) Tcl_Alloc(sizeof(Tcl_DString)); + Tcl_DStringInit(dupString); + Tcl_DStringAppend(dupString, Tcl_DStringValue(srcString), + Tcl_DStringLength(srcString)); + + dupPtr->internalRep.otherValuePtr = (char *) dupString; + dupPtr->typePtr = &UnicodeObjType; + } + + void + UpdateUnicode(objPtr) + Tcl_Obj *objPtr; + { + register Tcl_DString *string = (Tcl_DString *) objPtr->internalRep.otherValuePtr; + Tcl_DString strUtf; + + /* + * Set the UTF string from the internal Unicode DString + */ + + /* Generate an UTF representation in a DString */ + Tcl_DStringInit(&strUtf); + Tcl_UniCharToUtfDString((Tcl_UniChar *) Tcl_DStringValue(string), + Tcl_DStringLength(string)/sizeof(Tcl_UniChar), &strUtf); + + /* Allocate and fill the UTF string */ + objPtr->length = Tcl_DStringLength(&strUtf); + objPtr->bytes = Tcl_Alloc((unsigned) objPtr->length + 1); + strcpy(objPtr->bytes, Tcl_DStringValue(&strUtf)); + + Tcl_DStringFree(&strUtf); + } + + int + SetUnicodeFromAny(interp, objPtr) + Tcl_Interp 
*interp; + Tcl_Obj *objPtr; + { + register Tcl_ObjType *oldTypePtr = objPtr->typePtr; + register Tcl_DString *string; + + /* + * Free the old internalRep before setting the new one. + */ + + if ((oldTypePtr != NULL) && (oldTypePtr->freeIntRepProc != NULL)) { + oldTypePtr->freeIntRepProc(objPtr); + } + + /* Set the internal Unicode string from the UTF string */ + string = (Tcl_DString *) Tcl_Alloc(sizeof(Tcl_DString)); + Tcl_DStringInit(string); + Tcl_UtfToUniCharDString(objPtr->bytes, objPtr->length, string); + + objPtr->internalRep.otherValuePtr = (char *) string; + objPtr->typePtr = &UnicodeObjType; + return TCL_OK; + } + + int + Tcl_GetCharLength(objPtr) + Tcl_Obj *objPtr; + { + register Tcl_DString *string; + + if (objPtr->typePtr != &UnicodeObjType) { + SetUnicodeFromAny(NULL, objPtr); + } + + string = (Tcl_DString *) objPtr->internalRep.otherValuePtr; + return Tcl_DStringLength(string)/sizeof(Tcl_UniChar); + } + + Tcl_UniChar * + Tcl_GetUnicode(objPtr) + Tcl_Obj *objPtr; + { + register Tcl_DString *string; + + if (objPtr->typePtr != &UnicodeObjType) { + SetUnicodeFromAny(NULL, objPtr); + } + + string = (Tcl_DString *) objPtr->internalRep.otherValuePtr; + return (Tcl_UniChar *) Tcl_DStringValue(string); + } + + Tcl_Obj * + Tcl_NewUnicodeObj(unicode, numChars) + Tcl_UniChar *unicode; + int numChars; + { + register Tcl_Obj *objPtr; + register Tcl_DString *string; + + objPtr = Tcl_NewObj(); + objPtr->bytes = NULL; + + string = (Tcl_DString *) Tcl_Alloc(sizeof(Tcl_DString)); + Tcl_DStringInit(string); + Tcl_DStringAppend(string, (char *) unicode, numChars*sizeof(Tcl_UniChar)); + + objPtr->internalRep.otherValuePtr = (char *) string; + objPtr->typePtr = &UnicodeObjType; + return objPtr; + } + + #endif + + + + + + /* *-------------------------------------------------------------- * * StateSetString -- *************** *** 313,326 **** Tcl_Obj *string; /* New string being lexed */ { TcLex_State *statePtr = lexer->states[n]; ! int len; ! 
char *str = Tcl_GetStringFromObj(string, &len); - Tcl_DStringFree(&statePtr->inputBuffer.chars); #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! Tcl_DStringAppend(&statePtr->inputBuffer.chars, str, len); #else ! Tcl_UtfToUniCharDString(str, len, &statePtr->inputBuffer.chars); #endif } --- 468,510 ---- Tcl_Obj *string; /* New string being lexed */ { TcLex_State *statePtr = lexer->states[n]; ! if (statePtr->inputBuffer.chars) ! Tcl_DecrRefCount(statePtr->inputBuffer.chars); ! statePtr->inputBuffer.chars = string; ! Tcl_IncrRefCount(statePtr->inputBuffer.chars); ! } ! ! ! /* ! *-------------------------------------------------------------- ! * ! * StateGetString -- ! * ! * Given its index, return a lexer state's current string. ! * ! * Results: ! * The string's characters; its length is stored in *lengthPtr if non-NULL. ! * ! * Side effects: ! * The string object may be converted to Unicode (Tcl 8.1 and later). ! * ! *-------------------------------------------------------------- ! */ ! ! static Char * ! StateGetString(lexer, n, lengthPtr) ! TcLex_Lexer *lexer; ! int n; /* Index of the state */ ! int *lengthPtr; /* Length of string */ ! { ! TcLex_State *statePtr = lexer->states[n]; #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! return Tcl_GetStringFromObj(statePtr->currentBuffer->chars, lengthPtr); #else ! if (lengthPtr) ! *lengthPtr = Tcl_GetCharLength(statePtr->currentBuffer->chars); ! return Tcl_GetUnicode(statePtr->currentBuffer->chars); #endif } *************** *** 355,361 **** Tcl_Free((char*)statePtr->conditionsStack); Tcl_Free((char*)statePtr->bFailed); ! Tcl_DStringFree(&statePtr->inputBuffer.chars); Tcl_Free((char*)statePtr->startIndices); Tcl_Free((char*)statePtr->endIndices); if (statePtr->pendingResult) --- 539,545 ---- Tcl_Free((char*)statePtr->conditionsStack); Tcl_Free((char*)statePtr->bFailed); ! 
Tcl_DecrRefCount(statePtr->inputBuffer.chars); Tcl_Free((char*)statePtr->startIndices); Tcl_Free((char*)statePtr->endIndices); if (statePtr->pendingResult) *************** *** 468,474 **** if (statePtr = lexer->states[i]) { Tcl_Free((char*)statePtr->conditionsStack); Tcl_Free((char*)statePtr->bFailed); ! Tcl_DStringFree(&statePtr->inputBuffer.chars); Tcl_Free((char*)statePtr->startIndices); Tcl_Free((char*)statePtr->endIndices); if (statePtr->pendingResult) --- 652,658 ---- if (statePtr = lexer->states[i]) { Tcl_Free((char*)statePtr->conditionsStack); Tcl_Free((char*)statePtr->bFailed); ! Tcl_DecrRefCount(statePtr->inputBuffer.chars); Tcl_Free((char*)statePtr->startIndices); Tcl_Free((char*)statePtr->endIndices); if (statePtr->pendingResult) *************** *** 1357,1376 **** int *pMinLength; /* if -longest, minimum # chara to match */ { TcLex_State *statePtr = lexer->states[lexer->curState]; ! Char *str = (Char *)Tcl_DStringValue(&statePtr->currentBuffer->chars); ! int len = Tcl_DStringLength(&statePtr->currentBuffer->chars)/sizeof(Char); ! Char *bol; TcLex_Rule *rule = &lexer->rules[iRule]; int i, s, e; int overrun; if ( (lexer->flags & LEXER_FLAG_LINES) ! && (*(str+statePtr->currentBuffer->index-1) == '\n')) ! bol = str+statePtr->currentBuffer->index; else ! bol = str; switch (RuleExec(interp, lexer, rule, ! str+statePtr->currentBuffer->index, bol, len-statePtr->currentBuffer->index, &overrun)) { case -1: return LEXER_RULETRY_ERROR; --- 1541,1561 ---- int *pMinLength; /* if -longest, minimum # chara to match */ { TcLex_State *statePtr = lexer->states[lexer->curState]; ! int len; ! Char *str = StateGetString(lexer, lexer->curState, &len); ! int bol; TcLex_Rule *rule = &lexer->rules[iRule]; int i, s, e; int overrun; if ( (lexer->flags & LEXER_FLAG_LINES) ! && (str[statePtr->currentBuffer->index-1] == '\n')) ! bol = statePtr->currentBuffer->index; else ! bol = 0; ! switch (RuleExec(interp, lexer, rule, ! 
statePtr->currentBuffer->chars, statePtr->currentBuffer->index, bol, &overrun)) { case -1: return LEXER_RULETRY_ERROR; *************** *** 1388,1394 **** * Get info about the matched string */ ! RuleGetRange(interp, lexer, rule, str, statePtr->currentBuffer->index, 0, &s, &e); if ( (lexer->flags & LEXER_FLAG_LONGEST) && (e-s+1 <= *pMinLength)) /* We want a longer match */ --- 1573,1579 ---- * Get info about the matched string */ ! RuleGetRange(interp, lexer, rule, statePtr->currentBuffer->chars, statePtr->currentBuffer->index, 0, &s, &e); if ( (lexer->flags & LEXER_FLAG_LONGEST) && (e-s+1 <= *pMinLength)) /* We want a longer match */ *************** *** 1417,1423 **** statePtr->endIndices = (int *)Tcl_Realloc((char *)statePtr->endIndices, statePtr->nbRanges * sizeof(int)); for (i=0; i < statePtr->nbRanges; i++) { ! RuleGetRange(interp, lexer, rule, str, statePtr->currentBuffer->index, i, &statePtr->startIndices[i], &statePtr->endIndices[i]); } statePtr->currentBuffer->nextIndex = statePtr->endIndices[0]+1; --- 1602,1608 ---- statePtr->endIndices = (int *)Tcl_Realloc((char *)statePtr->endIndices, statePtr->nbRanges * sizeof(int)); for (i=0; i < statePtr->nbRanges; i++) { ! RuleGetRange(interp, lexer, rule, statePtr->currentBuffer->chars, statePtr->currentBuffer->index, i, &statePtr->startIndices[i], &statePtr->endIndices[i]); } statePtr->currentBuffer->nextIndex = statePtr->endIndices[0]+1; *************** *** 1499,1519 **** { TcLex_State *statePtr = lexer->states[lexer->curState]; int i, result; ! char *str; ! #if (TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION >= 1) ! Tcl_DString strUtf; ! #endif if (!bIndices) { #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! str = Tcl_DStringValue(&statePtr->currentBuffer->chars)+statePtr->startIndices[0]; #else ! Tcl_DStringInit(&strUtf); ! Tcl_UniCharToUtfDString( ! (Char *)Tcl_DStringValue(&statePtr->currentBuffer->chars)+statePtr->startIndices[0], ! statePtr->endIndices[0]-statePtr->startIndices[0]+1, ! &strUtf); ! 
str = Tcl_DStringValue(&strUtf); #endif } --- 1684,1696 ---- { TcLex_State *statePtr = lexer->states[lexer->curState]; int i, result; ! Char *str; if (!bIndices) { #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! str = Tcl_GetStringFromObj(statePtr->currentBuffer->chars, NULL)+statePtr->startIndices[0]; #else ! str = Tcl_GetUnicode(statePtr->currentBuffer->chars)+statePtr->startIndices[0]; #endif } *************** *** 1541,1568 **** /* Substring not matched, return empty string */ val = Tcl_NewObj(); } else { #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! char *s = str+statePtr->startIndices[i]-statePtr->startIndices[0]; ! char *e = str+statePtr->endIndices[i]-statePtr->startIndices[0]+1; #else ! char *s = Tcl_UtfAtIndex(str, statePtr->startIndices[i]-statePtr->startIndices[0]); ! char *e = Tcl_UtfAtIndex(str, statePtr->endIndices[i]-statePtr->startIndices[0]+1); #endif - val = Tcl_NewStringObj(s, e-s); } } if (Tcl_ObjSetVar2(interp, matchv[i], NULL, val, TCL_PARSE_PART1 | TCL_LEAVE_ERR_MSG) == NULL) { ! result = TCL_ERROR; ! goto cleanup; } } - cleanup: - #if (TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION >= 1) - if (!bIndices) { - Tcl_DStringFree(&strUtf); - } - #endif return result; } --- 1718,1737 ---- /* Substring not matched, return empty string */ val = Tcl_NewObj(); } else { + Char *s = str+statePtr->startIndices[i]-statePtr->startIndices[0]; + Char *e = str+statePtr->endIndices[i]-statePtr->startIndices[0]+1; #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! val = Tcl_NewStringObj(s, e-s); #else ! val = Tcl_NewUnicodeObj(s, e-s); #endif } } if (Tcl_ObjSetVar2(interp, matchv[i], NULL, val, TCL_PARSE_PART1 | TCL_LEAVE_ERR_MSG) == NULL) { ! return TCL_ERROR; } } return result; } *************** *** 1588,1601 **** BufferNotStarving(buffer) TcLex_Buffer *buffer; { ! return (buffer->index*(int)sizeof(Char) <= Tcl_DStringLength(&buffer->chars)); } int BufferAtEnd(buffer) TcLex_Buffer *buffer; { ! 
return (buffer->index*(int)sizeof(Char) >= Tcl_DStringLength(&buffer->chars)); } int --- 1757,1784 ---- BufferNotStarving(buffer) TcLex_Buffer *buffer; { ! int length; ! ! #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! Tcl_GetStringFromObj(buffer->chars, &length); ! #else ! length = Tcl_GetCharLength(buffer->chars); ! #endif ! return (buffer->index <= length); } int BufferAtEnd(buffer) TcLex_Buffer *buffer; { ! int length; ! ! #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! Tcl_GetStringFromObj(buffer->chars, &length); ! #else ! length = Tcl_GetCharLength(buffer->chars); ! #endif ! return (buffer->index >= length); } int *************** *** 1607,1613 **** { TcLex_State *statePtr = lexer->states[lexer->curState]; int result; - Char *str; /* String to lex */ int nbRules = lexer->nbRules; int i; int iRule, iRule2; --- 1790,1795 ---- *************** *** 1624,1632 **** result = TCL_OK; /* Variable holding the result */ - str = (Char*)Tcl_DStringValue(&statePtr->currentBuffer->chars); - - /* * Get info about the conditions */ --- 1806,1811 ---- *************** *** 2228,2233 **** --- 2407,2424 ---- * Add the new condition to the stack: allocate or grow the array */ + if (ci == 0) { + /* + * Initial condition + * Empty the whole conditions stack, that way " begin initial" + * will reset the lexer to the original condition state. + */ + statePtr->conditionsStackLength = 0; + Tcl_Free((char *) statePtr->conditionsStack); + statePtr->conditionsStack = NULL; + return TCL_OK; + } + statePtr->conditionsStackLength++; if (statePtr->conditionsStack) *************** *** 2511,2524 **** { TcLex_Lexer *lexer = (TcLex_Lexer *)clientData; TcLex_State *statePtr = lexer->states[lexer->curState]; ! Char *str = (Char*)Tcl_DStringValue(&statePtr->currentBuffer->chars); ! 
int strLen = Tcl_DStringLength(&statePtr->currentBuffer->chars); int nbChars = 1; int oldIndex; Tcl_Obj *val; - #if (TCL_MAJOR_VERSION > 8 || TCL_MINOR_VERSION >= 1) - Tcl_DString strUtf; - #endif /* * Sanity check --- 2702,2712 ---- { TcLex_Lexer *lexer = (TcLex_Lexer *)clientData; TcLex_State *statePtr = lexer->states[lexer->curState]; ! int strLen; ! Char *str = StateGetString(lexer, lexer->curState, &strLen); int nbChars = 1; int oldIndex; Tcl_Obj *val; /* * Sanity check *************** *** 2546,2555 **** #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) val = Tcl_NewStringObj(str+oldIndex, statePtr->currentBuffer->nextIndex-oldIndex); #else ! Tcl_DStringInit(&strUtf); ! Tcl_UniCharToUtfDString(str+oldIndex, statePtr->currentBuffer->nextIndex-oldIndex, &strUtf); ! val = Tcl_NewStringObj(Tcl_DStringValue(&strUtf), -1); ! Tcl_DStringFree(&strUtf); #endif Tcl_SetObjResult(interp, val); --- 2734,2740 ---- #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) val = Tcl_NewStringObj(str+oldIndex, statePtr->currentBuffer->nextIndex-oldIndex); #else ! val = Tcl_NewUnicodeObj(str+oldIndex, statePtr->currentBuffer->nextIndex-oldIndex); #endif Tcl_SetObjResult(interp, val); *************** *** 2593,2600 **** { TcLex_Lexer *lexer = (TcLex_Lexer *)clientData; TcLex_State *statePtr = lexer->states[lexer->curState]; - Char *str = (Char*)Tcl_DStringValue(&statePtr->currentBuffer->chars); - int strLen = Tcl_DStringLength(&statePtr->currentBuffer->chars); int nbChars = 1; int oldIndex; --- 2778,2783 ---- *************** *** 2617,2623 **** oldIndex = statePtr->currentBuffer->nextIndex; statePtr->currentBuffer->nextIndex -= nbChars; if (statePtr->currentBuffer->nextIndex < statePtr->currentBuffer->index) { ! 
statePtr->currentBuffer->nextIndex = statePtr->currentBuffer->index; } return TCL_OK; --- 2800,2806 ---- oldIndex = statePtr->currentBuffer->nextIndex; statePtr->currentBuffer->nextIndex -= nbChars; if (statePtr->currentBuffer->nextIndex < statePtr->currentBuffer->index) { ! statePtr->currentBuffer->nextIndex = statePtr->currentBuffer->index; } return TCL_OK; diff -rcN src/tcLex.h.old src/tcLex.h *** src/tcLex.h.old Fri Jul 16 12:03:22 1999 --- src/tcLex.h Fri Sep 03 15:34:52 1999 *************** *** 98,104 **** #endif typedef struct TcLex_Buffer { ! Tcl_DString chars; int index; /* Current character position within */ int nextIndex; --- 98,104 ---- #endif typedef struct TcLex_Buffer { ! Tcl_Obj *chars; int index; /* Current character position within */ int nextIndex; diff -rcN src/tcLexInt.h.old src/tcLexInt.h *** src/tcLexInt.h.old Fri Apr 30 23:05:30 1999 --- src/tcLexInt.h Fri Sep 03 15:34:52 1999 *************** *** 41,46 **** --- 41,57 ---- */ /* + * Tcl8.1 compatibility procs: Unicode object type. + * This object uses a Unicode string in a Tcl_DString as its + * internal representation + */ + + #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 1) + int Tcl_GetCharLength _ANSI_ARGS_((Tcl_Obj *objPtr)); + Tcl_UniChar* Tcl_GetUnicode _ANSI_ARGS_((Tcl_Obj *objPtr)); + #endif + + /* * Lexer info functions */ void LexerSetCurrent _ANSI_ARGS_((Tcl_Interp *interp, TcLex_Lexer *lexer)); *************** *** 56,64 **** /* * State management functions */ ! static int StateNew _ANSI_ARGS_((TcLex_Lexer *lexer, Tcl_Obj *string)); ! static void StateSetString _ANSI_ARGS_((TcLex_Lexer *lexer, int n, Tcl_Obj *string)); ! static void StateDelete _ANSI_ARGS_((TcLex_Lexer *lexer, int n)); /* * Rule functions --- 67,76 ---- /* * State management functions */ ! static int StateNew _ANSI_ARGS_((TcLex_Lexer *lexer, Tcl_Obj *string)); ! static void StateSetString _ANSI_ARGS_((TcLex_Lexer *lexer, int n, Tcl_Obj *string)); ! 
static Char * StateGetString _ANSI_ARGS_((TcLex_Lexer *lexer, int n, int *length)); ! static void StateDelete _ANSI_ARGS_((TcLex_Lexer *lexer, int n)); /* * Rule functions diff -rcN src/tcLexRE.c.old src/tcLexRE.c *** src/tcLexRE.c.old Thu Jun 24 15:14:00 1999 --- src/tcLexRE.c Fri Sep 03 15:34:52 1999 *************** *** 7,14 **** #include "tcLexRE.h" #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! #include "RE80.c" #else ! #include "RE81.c" #endif --- 7,18 ---- #include "tcLexRE.h" #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 0) ! #include "RE80.c" #else ! #if (TCL_MAJOR_VERSION == 8 && TCL_MINOR_VERSION == 1) ! #include "RE81.c" ! #else ! #include "RE82.c" ! #endif #endif diff -rcN src/tcLexRE.h.old src/tcLexRE.h *** src/tcLexRE.h.old Thu Jun 24 15:39:32 1999 --- src/tcLexRE.h Fri Sep 03 15:34:52 1999 *************** *** 1,4 **** int RuleCompileRegexp _ANSI_ARGS_((Tcl_Interp *interp, TcLex_Rule *rule, Tcl_Obj *reObj, int flags)); ! int RuleExec _ANSI_ARGS_((Tcl_Interp *interp, TcLex_Lexer *lexer, TcLex_Rule *rule, Char *string, Char *start, int numChars, int *pOverrun)); ! void RuleGetRange _ANSI_ARGS_((Tcl_Interp *interp, TcLex_Lexer *lexer, TcLex_Rule *rule, Char *string, int stringIndex, int index, int *start, int *end)); void RuleFree _ANSI_ARGS_((TcLex_Rule *rule)); --- 1,4 ---- int RuleCompileRegexp _ANSI_ARGS_((Tcl_Interp *interp, TcLex_Rule *rule, Tcl_Obj *reObj, int flags)); ! int RuleExec _ANSI_ARGS_((Tcl_Interp *interp, TcLex_Lexer *lexer, TcLex_Rule *rule, Tcl_Obj *stringObj, int index, int start, int *pOverrun)); ! void RuleGetRange _ANSI_ARGS_((Tcl_Interp *interp, TcLex_Lexer *lexer, TcLex_Rule *rule, Tcl_Obj *stringObj, int index, int rangeIndex, int *start, int *end)); void RuleFree _ANSI_ARGS_((TcLex_Rule *rule));
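
Note on the conditions-stack change in src/tcLex.c (the hunk that special-cases condition index 0): the sketch below is a minimal standalone illustration of that behavior in plain C. It is not the tcLex code. CondStack, cond_begin, cond_current and COND_INITIAL are hypothetical names introduced only for this note, and realloc/free stand in for Tcl_Realloc/Tcl_Free. It assumes, as the hunk does, that index 0 denotes the "initial" condition.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the conditions-stack fields of TcLex_State. */
typedef struct {
    int *items;   /* stacked condition indices, NULL when the stack is empty */
    int  length;  /* number of stacked conditions */
} CondStack;

/* Assumption taken from the patch: condition index 0 is "initial". */
enum { COND_INITIAL = 0 };

/*
 * Enter condition ci.
 * Beginning the initial condition empties the whole stack instead of
 * pushing one more entry; any other condition is pushed on top.
 * Returns 0 on success, -1 if the stack could not be grown.
 */
int cond_begin(CondStack *s, int ci)
{
    int *grown;

    assert(s != NULL);

    if (ci == COND_INITIAL) {
        free(s->items);          /* free(NULL) is a no-op in standard C */
        s->items = NULL;
        s->length = 0;
        return 0;
    }

    grown = realloc(s->items, (size_t)(s->length + 1) * sizeof(int));
    if (grown == NULL) {
        return -1;               /* the old stack is left untouched */
    }
    grown[s->length] = ci;
    s->items = grown;
    s->length++;
    return 0;
}

/* Current condition: top of the stack, or "initial" when it is empty. */
int cond_current(const CondStack *s)
{
    assert(s != NULL);
    return (s->length > 0) ? s->items[s->length - 1] : COND_INITIAL;
}
```

With this shape, a lexer converted from (f)lex that keeps calling BEGIN(INITIAL) to return to its start state leaves the stack at length 0 each time, rather than growing it by one entry per call as the previous behavior did.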