This is part of a private discussion I had with John Collins, the author of the first LaTeX2LyX converter. Attached is also a document describing his ideas about the subject. Such ideas could be useful for our project.

Alejandro

------------------------------------------------------------
From: John Collins <John.Collins@cern.ch>
Date: Wed, 4 Sep 1996 11:39:45 +0200 (MET DST)
To: Alejandro Aguilar Sierra <asierra@servidor.unam.mx>
Subject: Re: TeX2LyX document

[Removed non-related stuff, AAS]

I am not convinced devising a lyx format as a subset of latex is ideal. It would take some time to explain my thoughts. But the problem I see with latex is that its syntax is not pleasant, so that it is relatively time-consuming and/or memory intensive to parse it. Most importantly from the point of view of an editor, it cannot be locally and bidirectionally parsed. (I can explain further.)

IMHO, the important matter in the relation to latex is that the format should be in (1-1) correspondence with a suitably rational subset of latex.

Is your format supposed to be just a file format, or will it also be used for the representation of a document in memory? If the latter is true, then I would particularly stay away from a directly latex based format. A file format could be closer to latex, since it might need to be read by humans.

But you have obviously thought about the issue and have come to different conclusions.

My experience in writing ET convinced me that a good internal format is important to speed and to robustness. ET will run with good speed on an 8088 processor(!), whereas the scientific word processor Scientific Word, which has somewhat of the same philosophy as ET and lyx, needs a minimum of a Pentium to be acceptable, and even then it frequently crashes the operating system. (Scientific Word appears to use a latex-based format internally.)

John Collins

----------------------------------------------------
From: John Collins <collins@phys.psu.edu>
Date: Fri, 12 Jan 1996 11:57:23 -0500
To: lyx@via.ecp.fr
Subject: Latex2lyx

Although I have been working on the latex2lyx program, I do not have something ready for use yet. The trouble is that I am staring at several other priority projects including a research grant renewal. I realize a quick-and-dirty version would be useful, so perhaps I ought to get that out. That could probably be done quickly from my preexisting code. Several messages have provoked me to think that would be worth doing.

What I realized over the Christmas break is that by a suitable design of C++ classes I can fairly easily make a powerful, efficient and robust (La)TeX parser. (It would also give useful error messages.) The tricky bit was to figure out the right design. But I have that now, and the basic program. I would estimate about a month to get something presentable to the lyx collaboration.

I have appended some of my thoughts about the overall ideas. They correspond to what I did for my ET editor. I'd appreciate comments.

John Collins

============================================================

Latex2lyx convertor
===================

1. Latex2lyx and the lyx2latex convertor in the lyx program should be as close to 1-1 as reasonable. This is particularly important for collaborations where some people are using lyx and some are not. Then ordinary TeX files serve as the communications medium.

2. Perfect 1-1 conversion is not possible, because there are several different TeX constructs that are equivalent (for example a blank line and \par).

3. Parts of a TeX file not translated by latex2lyx should be converted into something like LyX TeX style. One then has raw TeX in the lyx file, and that is reconstructed when making the TeX file.

4. Some enhancements to lyx will be necessary to handle converted files sensibly.
(E.g., (a) multiline pieces of raw TeX, including arbitrary numbers of blank lines, (b) comments.)

5. It appears necessary to change some of the lyx2latex conversion done by lyx to approach the 1-1 ideal better. (My preference, for example, would be for paragraphs not to be enclosed in braces, normally. Paragraphs with font changes, etc would need different treatment, of course.) User preferences will undoubtedly vary here, so some configuration options are likely to be needed.

6. In particular, white space and new lines are often used to prettify TeX files to make them convenient for humans to edit. I feel it would be useful to provide a mechanism for entering extra space and newlines in a lyx file, not for the purpose of getting the corresponding spaces and newlines in the hard copy, but for making the TeX file pretty. (User preferences will vary here, of course.) This will not matter much if lyx is the only editor being used, but it will be important if the .tex file is being worked on by a collaborator. Automatic prettyprinting of the output .tex file would help here, but lyx's preprogrammed concept of prettyprinting may not agree with a user's need-of-the-moment.

7. The most general case of a TeX file would be very hard to deal with, since catcodes and TeX macros may be redefined in an arbitrary way. Moreover the definitions may be hidden away in a TeX format file. So it is necessary to restrict latex2lyx's scope to files where catcodes don't change, and where the macros don't lose their standard meanings.

8. I think all TeX files that I have ever seen satisfy this requirement, except possibly in their preambles where commands are defined or redefined.

9. The program should handle incorrect TeX gracefully, and with useful error messages. At worst it should bundle up the incorrect/misunderstood TeX as uninterpreted raw TeX that would be recreated in the .tex file.
(I emphasize this because one commercial competitor to lyx, Scientific Word (for MS-Windows), does a particularly bad job: it sometimes crashes the operating system when reading correct Latex files, and often loses bits of a file when the structure doesn't correspond to Scientific Word's standards for LaTeX. That is sufficiently unfriendly that I stopped using it.)

10. latex2lyx is aimed at converting document contents, not TeX programming, so it need not attempt to understand the semantics of \defs, \newcommands etc. (Although it should know the syntax, to parse the commands.)

11. Since the most general case of a TeX macro definition is likely to cause indigestion to latex2lyx (and quite often to a human reader, even a TeXpert), there needs to be a mechanism to force latex2lyx to treat sections of the tex file as raw TeX which is not to be converted. I propose that the conversion be controlled by metacommands (like preprocessor metacommands in C). These must be TeX comments, so that they are invisible to TeX.

12. It would be useful to recognize that other programs might want to use the same mechanism. I propose the following kind of format:

        %#{}raw
        %#{}end.raw

    The signature of a metacommand is

        %#{program_name}command ....

    The # is reminiscent of a C metacommand. To allow a mechanism for metacommand sets for different programs, I allow a place for an identifier for a program. For example we MIGHT have

        %#{lyx}begin.preamble
        %#{lyx}end.preamble

    for stuff that's to be bundled up in the LaTeX preamble used by lyx. However, for that case, my preference is for latex2lyx to bundle up ALL of the preamble automatically, except for those commands that are specifically used by lyx (e.g., \usepackage).
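
[Illustration, not part of the original note.] To make the metacommand proposal in items 11 and 12 concrete, here is a minimal Python sketch of how a converter might recognize the signature %#{program_name}command and set aside the regions marked as raw TeX. It is only a sketch under assumptions: the function names (parse_metacommand, split_raw_regions) are hypothetical, raw regions are assumed not to nest, an empty program name is assumed to mean "any program", and metacommands addressed to other programs are simply passed through as ordinary comment lines. None of this is taken from the actual latex2lyx code.

```python
import re

# Assumed pattern for the proposed signature: %#{program_name}command ...
# The program name may be empty, as in %#{}raw.
METACOMMAND = re.compile(
    r"^%#\{(?P<program>[^{}]*)\}(?P<command>\S+)\s*(?P<args>.*)$"
)


def parse_metacommand(line):
    """Return (program, command, args) for a metacommand line, else None."""
    m = METACOMMAND.match(line.strip())
    if m is None:
        return None
    return m.group("program"), m.group("command"), m.group("args")


def split_raw_regions(lines):
    """Split lines into ("tex", [...]) and ("raw", [...]) chunks.

    Assumption: a raw region is opened by %#{}raw and closed by
    %#{}end.raw, regions do not nest, and the two marker lines
    themselves are dropped from the output.
    """
    chunks = []
    kind, current = "tex", []
    for line in lines:
        meta = parse_metacommand(line)
        if meta is not None and meta[0] == "" and meta[1] in ("raw", "end.raw"):
            # A marker ends the current chunk and switches mode.
            if current:
                chunks.append((kind, current))
            kind = "raw" if meta[1] == "raw" else "tex"
            current = []
        else:
            current.append(line)
    if current:
        chunks.append((kind, current))
    return chunks
```

A converter built this way would hand the "tex" chunks to its parser and copy the "raw" chunks through untouched, which is also the fallback that item 9 asks for when the TeX is not understood.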