Prepare for languages that use CJK with TeX fonts and Polyglossia
with non-TeX fonts.
Korean is already supported by Polyglossia;
LyX support will follow (file version change).
* Fix macro termination if \textcyrillic or \textgreek is not required
for a Greek or Cyrillic letter.
* Replace "writeScriptChars" with conditionals in the character-output loop in
"Paragraph::latex" (solves "FIXME: modifying i here is not very nice...").
The font changing commands \textcyrillic and \textgreek are no longer
part of the textcommand in "lib/unicodesymbols" but added when required
in Paragraph::Private::latexSpecialChar.
This is not possible for '$', because in LaTeX it starts or ends
a math inset.
Therefore, if not ignoring format, we still have to use
[\\][\$] in regex in order to find '$' in text.
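A minimal sketch of such a search, written in Python for illustration only (LyX's own search code is C++). It assumes the searched string is LaTeX-formatted text in which a literal dollar sign appears as \$; the function name and the sample string are made up.

```python
import re

# Pattern quoted in the text above: a backslash followed by a dollar sign.
DOLLAR_IN_TEXT = re.compile(r"[\\][\$]")

def find_literal_dollars(latex_text):
    """Return the start offsets of escaped dollar signs in LaTeX-formatted text."""
    return [m.start() for m in DOLLAR_IN_TEXT.finditer(latex_text)]

# The unescaped '$' characters delimit math and are not reported.
sample = r"Price: 5\$ and $x^2$ is math"
print(find_literal_dollars(sample))
```

The bare '$' characters around x^2 are skipped because the pattern requires the preceding backslash.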
This revives a ten-year-old idea (and patch) by Dov.
You can now mark text in the character dialog to exclude it from spell
checking.
Fixes: #1042
File format change
Remaining issue: The instant spell checking marks are not immediately
removed, but only after some editing.
A feature can now be required only for specific input or font encodings:
- <feature>=enc1;enc2... Require the feature <feature> only if the
character is used in one of the specified font
or input encodings.
- <feature>!=enc1;enc2... Require the feature <feature> only if the
character is used in a font or input encoding
that is not among those specified.
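The two forms can be read as in the following sketch (Python, illustrative only; this is not the parser used by LyX). The function names and the feature and encoding names in the usage lines are assumptions made up for the example.

```python
def parse_feature_spec(spec):
    """Split '<feature>=enc1;enc2' or '<feature>!=enc1;enc2' into parts.

    Returns (feature, negated, encodings); a bare '<feature>' has no encodings.
    """
    if "!=" in spec:
        feature, encs = spec.split("!=", 1)
        return feature, True, encs.split(";")
    if "=" in spec:
        feature, encs = spec.split("=", 1)
        return feature, False, encs.split(";")
    return spec, False, []

def feature_required(spec, encoding):
    """Decide whether the feature is needed for a character used in `encoding`."""
    feature, negated, encodings = parse_feature_spec(spec)
    if not encodings:
        return feature, True  # unconditional requirement
    listed = encoding in encodings
    return feature, (not listed) if negated else listed

# Hypothetical feature and encoding names, for illustration only.
print(feature_required("somefeature=T1;OT1", "T1"))
print(feature_required("somefeature!=LGR", "LGR"))
```

The "!=" case is tested first because a plain split on "=" would otherwise leave the "!" attached to the feature name.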
Following a request by Günter, we consider the document fonts (only rm
for now) when selecting an appropriate font encoding.
See #9741
The new default font encoding setting "auto"
* considers the font encoding needed by the language(s), which can now
have fallback alternatives
* considers which font encoding is provided by the document font
Thus, cm will now result in OT1 fontenc, if the language can deal with
that.
The font_enc pref is ditched: it is no longer needed.
The automatism is still very basic and is subject to extension.
File format and prefs format change.
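A rough sketch of the selection idea, in Python for illustration only. The function name, the `default` parameter and the behaviour when nothing matches are assumptions, not taken from the LyX sources, and the encoding lists in the usage line are made up.

```python
def auto_font_encoding(language_encodings, font_encodings, default="T1"):
    """Pick the first encoding acceptable to the language that the font provides.

    language_encodings: preferred encoding first, then fallback alternatives.
    font_encodings: encodings provided by the roman document font.
    """
    for enc in language_encodings:
        if enc in font_encodings:
            return enc
    # Assumption: with no overlap, fall back to the language's preferred encoding.
    return language_encodings[0] if language_encodings else default

# A language that prefers T1 but can deal with OT1, and a font that only
# provides OT1 (as assumed here for cm), ends up with OT1.
print(auto_font_encoding(["T1", "OT1"], ["OT1"]))
```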
Use the command as defined by Babel. This allows us to use the (more
advanced) Babel command if provided instead of rolling our own.
I add a dummy file format change in case it turns out we need to
do something here for old documents (e.g. with user preamble definitions)
This allows (some) verbatim contents in macros, such as \url's with
specific chars (#, % etc.) in section headings or footnotes (#449)
or comments in captions (#9313).
The mentioned two bugs are fixed by this commit.
Note that the implementation is still rather basic and might need
extension for other cases.
Ligation of -- to en dashes also occurs in teletype fonts.
With some 8-bit fonts, em and en dashes are not copied/exported
correctly from the PDF (but this is not limited to dashes in teletype).
With LatinModern, PDF export works fine.
MWE:
\documentclass[]{article}
%\usepackage{lmodern}
\usepackage[T1]{fontenc}
\begin{document}
Hallo \texttt{Welt --Welt ---Welt}
Hallo Welt --Welt ---Welt
\end{document}
- If a display math not starting a new paragraph is deleted, the
current \lyxdeleted macro (if any) must be closed and a new one
started, otherwise the display math will be shifted up.
- Use \linewidth instead of \columnwidth because the former will adapt
to the reduced horizontal width in list environments, avoiding shifting
the display math to the right.
This commit does a bulk fix of incorrect annotations (comments) at the
end of namespaces.
The commit was generated by initially running clang-format, and then
from the diff of the result extracting the hunks corresponding to
fixes of namespace comments. The changes being applied and all the
results have been manually reviewed. The source code successfully
builds on macOS.
Further details on the steps below, in case they're of interest to
someone else in the future.
1. Checkout a fresh and up to date version of src/
git pull && git checkout -- src && git status src
2. Ensure there's a suitable .clang-format in place, i.e. with options
to fix the comment at the end of namespaces, including:
FixNamespaceComments: true
SpacesBeforeTrailingComments: 1
and that clang-format is >= 5.0.0, by doing e.g.:
clang-format -dump-config | grep Comments:
clang-format --version
3. Apply clang-format to the source:
clang-format -i $(find src -name "*.cpp" -or -name "*.h")
4. Create and filter out hunks related to fixing the namespace comments:
git diff -U0 src > tmp.patch
grepdiff '^} // namespace' --output-matching=hunk tmp.patch > fix_namespace.patch
5. Filter out hunks corresponding to simple fixes into a separate patch:
pcregrep -M -e '^diff[^\n]+\nindex[^\n]+\n--- [^\n]+\n\+\+\+ [^\n]+\n' \
-e '^@@ -[0-9]+ \+[0-9]+ @@[^\n]*\n-\}[^\n]*\n\+\}[^\n]*\n' \
fix_namespace.patch > fix_namespace_simple.patch
6. Manually review the simple patch and then apply it, after first
restoring the source.
git checkout -- src
patch -p1 < fix_namespace_simple.patch
7. Manually review the (simple) changes and then stage the changes
git diff src
git add src
8. Again apply clang-format and filter out hunks related to any
remaining fixes to the namespace, this time filter with more
context. There will be fewer hunks as all the simple cases have
already been handled:
clang-format -i $(find src -name "*.cpp" -or -name "*.h")
git diff src > tmp.patch
grepdiff '^} // namespace' --output-matching=hunk tmp.patch > fix_namespace2.patch
9. Manually review/edit the resulting patch file to remove hunks for files
which need to be dealt with manually, noting the file names and
line numbers. Then restore files to as before applying clang-format
and apply the patch:
git checkout src
patch -p1 < fix_namespace2.patch
10. Manually fix the files noted in the previous step. Stage files,
review changes and commit.
Make sure to properly nest \begin{lang} and \end{lang} tags even
when no language package is selected. In this case, LyX assumes
that babel is being used, so the language names might be wrong
if the user arranged for using polyglossia in the preamble.
Nevertheless, we ensure that the produced output is syntactically
correct, so that by adding proper preamble code a correct output
is still possible.
This commit fixes the regression introduced in 2.2 about the
output of en- and em-dashes. In 2.2 en- and em-dashes are output as
the \textendash and \textemdash macros when using TeX fonts, causing
changed output in old documents and also bugs (for example, #10490).
Now documents produced with older versions work again as intended,
while documents produced with 2.2 can be made to produce the exact
same output by simply checking "Don't use ligatures for en-and
em-dashes" in Document->Settings->Fonts.
When exporting documents using TeX fonts to earlier versions, in order
to avoid changed output, a zero-width space character is inserted after
each en/em-dash if dash ligatures are allowed. These characters are
removed when reloading documents with 2.3, so that they don't accumulate.
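The round trip can be pictured as follows (a Python sketch for illustration; the real conversion is not implemented this way in LyX, and the function names are made up). It assumes the inserted character is U+200B ZERO WIDTH SPACE.

```python
import re

ZWSP = "\u200b"          # ZERO WIDTH SPACE
DASHES = "\u2013\u2014"  # EN DASH, EM DASH

def export_to_older_format(text, dash_ligatures_allowed=True):
    """Insert a zero-width space after each en/em dash when ligatures are allowed."""
    if not dash_ligatures_allowed:
        return text
    return re.sub(f"([{DASHES}])", r"\1" + ZWSP, text)

def reload_in_new_format(text):
    """Remove the zero-width space that follows an en/em dash."""
    return re.sub(f"([{DASHES}]){ZWSP}", r"\1", text)

original = "pages 3\u20135 \u2014 see above"
exported = export_to_older_format(original)
# Reloading restores the original text, so repeated export and reload
# cycles do not accumulate zero-width spaces.
print(reload_in_new_format(exported) == original)
```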
This avoids some duplicate code. Note that the return value of
Paragraph::getAlign had to be changed. I suspect it was set to char to
avoid reading one header file in Paragraph.h.
LyX assumes that everything in \lyxdeleted is struck out by ulem
and increases the corresponding counter. However, deleted display
math material is struck out using tikz. As we also take into
account the deletion of underlined display math (in order to
properly position such material vertically), we have to take
care that the count is correct.
It should now be possible to underline or strike out any kind
of math inset containing any math construct indigestible to ulem.
While this was already possible for inline math insets, they could
break if an aligned environment was used, for example.
This is now also possible for display math. Even if this can be
nonsensical and not visually perfect, at least no latex errors
should be generated if one tries to.
Font changes are brought inside the \lyxdeleted macro, just before
outputting the latex code for the math inset. The inset writes a
signature before itself and this is checked by \lyxsout for recognizing
a display math. So, the font changes confuse \lyxsout, which also
swallows the first macro at the very start of \lyxdeleted. The result
is that the font changing command is not seen by latex and \sout is also
used to further strike out the formula already struck out by tikz.
This commit makes sure that the expected signature actually appears
just after the opening brace of \lyxdeleted. It also accounts for a
paragraph break occurring just before the math inset, in order to not
introduce too much vertical space, which is noticeable when using
larger font sizes.
Showing deleted display math by enabling "Show Changes in Output" was
only possible with dvi (through dvipost). Although LyX strikes out
such formulas on screen, it was impossible to obtain an output
directly using pdflatex (or other engines producing pdf) because
ulem cannot cope with display math material and gives errors.
The solution is to strike out such deleted formulas ourselves.
I took into account several options. One of them would produce
an output similar to dvipost (which strikes out each element), but
would have required much more changes in the output routines.
Eventually, I opted for using tikz, which gives a cleaner
output (as it simply requires adding a preamble and a postamble
to the latex code of any displayed math, instead of a mark up
tailored to each particular math construct). The look of the pdf
output is similar to the way LyX strikes out the equations on screen.
This enables error reporting for the preamble, provided the preamble is written
using the new InPreamble layouts.
In the future, I find it preferable to deprecate the usual preamble in favour of
InPreamble layouts rather than implementing error reporting for the usual
preamble. This requires some improvements to code editing in the buffer view
first (line breaking behaviour, syntax highlighting).
texstring is a pair of a docstring and a corresponding TexRow. The row count in
the TexRow has to match the number of lines in the docstring.
otexstringstream is an output string stream that can be used to create
texstrings (i.e. it's an odocstringstream that records the TexRow information
and lets us extract a texstring from it).
texstrings can be passed around and output to otexstream and otexrowstream,
which produce accurate TexRow information by concatenating TexRows.
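The relation between the three pieces can be sketched as below. This is a Python illustration only; the class and method names are hypothetical and do not mirror the C++ API. It assumes "number of lines" means the number of newlines plus one.

```python
from dataclasses import dataclass, field

@dataclass
class TexString:
    """A string plus one list of source entries per output line."""
    text: str = ""
    rows: list = field(default_factory=lambda: [[]])

    def consistent(self):
        # Assumed invariant: one row per line, i.e. newlines + 1.
        return len(self.rows) == self.text.count("\n") + 1

class TexStringStream:
    """Accumulates text and row information side by side."""

    def __init__(self):
        self.text = ""
        self.rows = [[]]

    def write(self, s, entry=None):
        if entry is not None:
            self.rows[-1].append(entry)  # entry belongs to the current line
        self.text += s
        # Each newline written opens a new, empty row.
        self.rows.extend([] for _ in range(s.count("\n")))

    def write_texstring(self, ts):
        # Concatenate rows: the first row of ts continues the current line.
        self.rows[-1].extend(ts.rows[0])
        self.rows.extend(list(r) for r in ts.rows[1:])
        self.text += ts.text

    def release(self):
        return TexString(self.text, self.rows)

inner = TexStringStream()
inner.write("a\n", entry=("par1", 0))
inner.write("b", entry=("par1", 2))

outer = TexStringStream()
outer.write("x")
outer.write_texstring(inner.release())
print(outer.release().consistent())
```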
dealing properly with the paragraph separator tag.
We really need to use that tag as a kind of general marker for which
tags we're responsible for in a given paragraph and which tags we are
not. So the changes to InsetText.cpp use the tag as that kind of marker.
Note that, as of this commit, the User Guide again exports without any
kind of error. I haven't yet checked the other manuals.
This fixes bug #8022.
The initial values for maxasc and maxdes (renamed from maxdesc) are obtained as the maximum of the max ascents/descents of all row elements.
This allows getting rid of Paragraph::highestFontInRange and FontList::highestInRange.
Some auxiliary variable declarations are also moved to where they are needed.
(#8738)
For efficiency, we add a new flag to the buffer indicating when changes are
present. This flag is updated at each buffer update, and also when explicitly
requested via a dispatch result flag.
This was a regression of 8aa37c43. I did not take into account that end_pos
could be -1, so the code that checked whether a pair of braces needs to be
inserted between two hyphens did not work for that case. Now we check for
the length of text_, which should be done anyway, and only take end_pos into
account when it is not -1.
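The corrected check can be pictured like this (a Python sketch under stated assumptions, not the actual code: the function name is made up, `text` stands in for text_, and the position arguments are simplified).

```python
def needs_braces_between_hyphens(text, i, end_pos=-1):
    """True if braces should follow text[i] to separate it from a following hyphen."""
    nxt = i + 1
    if nxt >= len(text):  # always check the length of the text
        return False
    if end_pos != -1 and nxt >= end_pos:  # honour end_pos only when it is set
        return False
    return text[i] == "-" and text[nxt] == "-"

print(needs_braces_between_hyphens("a--b", 1))             # end_pos == -1: whole text
print(needs_braces_between_hyphens("a--b", 1, end_pos=2))  # second hyphen out of range
```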
There are two regressions that are fixed here:
* empty rows at the end of a paragraph (think after newline at end of
paragraph or empty line in Verbatim) do not have an end-of-par
marker. This is fixed by removing the early return in breakRow and
letting the whole function be executed. This requires relaxing an
assertion in Paragraph::fontSpan. It makes sense here to query
position at the end of the paragraph.
* a newline at the end of a paragraph will be followed by an
end-of-par marker. This is fixed by skipping the end-of-par marker
when a new row has been requested.
Unfortunately the footmisc package does not work together with hyperref:
Before 0bf8b8a1, a footnote in a section title was created as a link in pdf
output, after 0bf8b8a1 it was no link anymore. For now we revert to the old
code, and wait until the footmisc and hyperref packages are made compatible.
Prevent encoding changes whenever the TeX engine is XeTeX or LuaTeX,
as XeTeX/LuaTeX use only one encoding per document:
* with useNonTeXFonts: "utf8plain",
* with XeTeX and TeX fonts: "ascii" (inputenc fails),
* with LuaTeX and TeX fonts: only one encoding accepted by luainputenc.
+1 no needless encoding switches
+1 runparams.encoding matches the correct encoding at any time
+1 less complicated code.
-1 there may still be problems with CJK (possibly impossible to
solve for Xe/LuaTeX with TeX fonts).
For LuaTeX & TeX fonts, the complete document uses the encoding
of the global document language.
See also #9740.
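The three cases listed above can be summarised in a small decision function (Python, for illustration; the function and parameter names are hypothetical, and returning None for other engines is an assumption meaning "encoding switches are allowed").

```python
def fixed_document_encoding(engine, use_non_tex_fonts, language_encoding):
    """Return the single encoding used for the whole document, or None
    when the engine is neither XeTeX nor LuaTeX."""
    if engine not in ("xetex", "luatex"):
        return None
    if use_non_tex_fonts:
        return "utf8plain"
    if engine == "xetex":
        return "ascii"  # inputenc fails with XeTeX
    # LuaTeX with TeX fonts: encoding of the global document language.
    return language_encoding

# "latin9" is just a sample value for the document language's encoding.
print(fixed_document_encoding("luatex", False, "latin9"))
```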
Actually, the changed tests were used to prevent overwriting the encoding
changed in Buffer::writeLaTeX with a language-default encoding.
This is still required for XeTeX with TeX-fonts unless a proper solution is found.
Documents with more than one encoding and TeX-fonts fail with LuaTeX,
as "luainputenc" can only handle one encoding.
With inputenc == "auto" or "default", the encoding changes with
the language and must be reset after a possible language switch in insets
or environments (see #6216).
However, whether we need to do this does not depend on 8-bit TeX vs. LuaTeX
but on the possible use of more than one encoding for the document.
With "nonTeXFonts", the encoding is utf8,
LuaTeX with TeX fonts requires encoding handling similar to 8-bit TeX.
(Additionally, the value of "params.inputenc" could be tested: if it is
not "auto" or "default", we have just one common encoding and could skip
the reset as well.) Not sure how much time this saves, though.
Fixes output for 3 of the 4 test lyx-files.
Includes "FIXME"s at places where further action is required to get the XeTeX
export right but I don't know how.
This is preliminary work for extending the cursor<->row tracking to math.
TexRow used to associate, to each row, a location id/pos where id determines a
paragraph and pos the position in the paragraph.
TexRow now associates to each row a list of entries, text or math. A math is a
pair uid/idx where uid will determine a math inset and idx is the number of the
cell.
The analogy id/pos<->inset/idx works better than the analogy id/pos<->idx/pos,
because what matters for the TexRow algorithm(TM) is the behaviour in terms of
line breaks.
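As a rough picture of the new data layout (Python, illustrative only; the class and method names are hypothetical and the real TexRow is a C++ class with a different interface):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TextEntry:
    id: int   # determines a paragraph
    pos: int  # position in the paragraph

@dataclass(frozen=True)
class MathEntry:
    uid: int  # determines a math inset
    idx: int  # number of the cell

class TexRowSketch:
    """Each output row holds a list of entries, text or math."""

    def __init__(self):
        self.rows = [[]]

    def start(self, entry):
        self.rows[-1].append(entry)

    def newline(self):
        self.rows.append([])

    def row_of(self, entry):
        """1-based number of the first row containing entry, or None."""
        for n, row in enumerate(self.rows, start=1):
            if entry in row:
                return n
        return None

tr = TexRowSketch()
tr.start(TextEntry(id=7, pos=0))
tr.newline()
tr.start(MathEntry(uid=3, idx=0))
tr.newline()
tr.start(MathEntry(uid=3, idx=1))
print(tr.row_of(MathEntry(uid=3, idx=1)))
```

A lookup of this kind is what a forward search from a math cell to an output row would need.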
This only improves the source view and the forward search, not the error report
and the reverse search (though this could be easily added now).
Use the function support::truncateWithEllipsis() to shorten a docstring with
... at the end. Actually we use U+2026 HORIZONTAL ELLIPSIS instead of "..." when
automatically shortening strings. This is to be consistent with Qt's own
truncation and is much nicer on the screen.
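A minimal sketch of the idea in Python. It is not the signature or exact behaviour of the LyX function; in particular, counting the ellipsis towards the maximum length is an assumption.

```python
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS

def truncate_with_ellipsis(s, max_len):
    """Shorten s to at most max_len characters, ending in U+2026 if cut."""
    if len(s) <= max_len:
        return s
    if max_len < 1:
        return ""
    return s[:max_len - 1] + ELLIPSIS

print(truncate_with_ellipsis("Introduction", 6))
```

Using the single character U+2026 instead of three dots leaves two more characters of the original string visible for the same length.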
This includes the bugs #9575 and #9572 regarding broken text elision in the
outliner.
Known issues (non-regressions):
* TocBackend::updateItem() should be rewritten to update all TOCs. (#8386)
* "..." should be replaced with … everywhere else on the interface (including
translation strings).
* We should prefer to rely on QFontMetrics::elidedText() to truncate strings
with an ellipsis whenever possible, or an equivalent for the buffer view
dependent on the font metrics. See the warning in src/support/lstrings.h.
These were all flagged by "(style) The scope of the variable 'x' can be reduced."
Narrowing the scope improves readability, and if it is in a loop then the
compiler will be clever enough to produce efficient code; we do not need
manual optimization for POD types.
There is a mismatch between the way text is tokenized in Row objects
and the way it is shown on screen. When metrics are computed,
continuous spell checking has not been done yet. Yet, the row painter
explicitly breaks words at spell status boundaries. This creates a
problem with a text like "PMP," (see bug #9649), where there is a
negative kerning before the comma.
This is solved by not taking into account spell status when drawing
text, and drawing spell underlines separately.
* replace Paragraph::isSameSpellRange with new method getSpellRange.
* merge RowPainter::paintChars into RowPainter::paintFromPos
* move the actual text painting code into the new paintTextAndSel.
* merge some code from paintFromPos to paintMisspelledMark
* in paintMisspelledMark, scan the string which needs to be annotated
and add dashed line below text marked as misspelled.
Fixes bug #9649.
Previously, LyX replaced some words with typeset logos, and there was no
way to prevent this except putting them in ERT (bug #4752). Now we have
special insets for these words, and standard text is left alone.
Previously, consecutive dashes in .lyx files were combined to endash and emdash
in some cases, and in other cases they were output as is. This made the code
complicated, and resulted in inconsistencies (bug #3647).
Now, a dash in a .lyx file is always a dash in the output, for all flavours.
The special handling is moved to the input side, so that you still get an
endash if you type two hyphens. If needed, this can be changed or made
customizable without the need to update the file format again. Many thanks
for the fruitful mailing list discussion, which contributed significantly to
the final version.
For exports based on LaTeX, consecutive hyphens are only converted to endash
and emdash if the current font family is not typewriter, and if none of the
parent insets is an IPA inset. Now this is done for XHTML export as well.
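The rule can be sketched as follows (Python, illustrative only; the function and parameter names are made up, and the handling of runs longer than three hyphens is not specified by the text above).

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"

def convert_hyphens(text, typewriter=False, inside_ipa=False):
    """Convert '--' and '---' to dashes unless the context forbids it."""
    if typewriter or inside_ipa:
        return text
    # Replace the longer run first so '---' does not become en dash + hyphen.
    return text.replace("---", EM_DASH).replace("--", EN_DASH)

print(convert_hyphens("a--b---c") == "a\u2013b\u2014c")
print(convert_hyphens("a--b", typewriter=True))
```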
This is a patch I originally sent to lyx-devel in 2012 with subject
'Load footmisc.sty instead of using copied code from obsolete stblftnt.sty'.
It now takes all comments into account: It works also if the user loads the
package herself, it can be disabled by providing the footmisc feature in a
layout, and it does not use the ugly \AtBeginDocument{}.
This branch implements string-wise metrics computation. The goal is to
have both good metrics computation (and font with proper kerning and
ligatures) and better performance than what we have with
force_paint_single_char. Moreover there has been some code
factorization in TextMetrics, where the same row-breaking algorithm
was basically implemented 3 times.
Globally, the new code is a bit shorter than the existing one, and it
is much cleaner. There is still a lot of potential for code removal,
especially in the RowPainter, which should be rewritten to use the new
Row information.
The bugs fixed and caused by this branch are tracked at ticket #9003:
http://www.lyx.org/trac/ticket/9003
What is done:
* Make TextMetrics methods operate on Row objects: breakRow and
setRowHeight instead of rowBreakPoint and rowHeight.
* Change breakRow operation to operate at strings level to compute
metrics. The list of elements is stored in the row object in visual
ordering, not logical. This will eventually allow to get rid of the
Bidi class.
* rename getColumnNearX to getPosNearX (and change code accordingly).
It does not make sense to return a position relative to the start of
row, since nobody needs this.
* Re-implement cursorX and getPosNearX using row elements.
* Get rid of lyxrc.force_paint_single_char. This was a workaround that
is not necessary anymore.
* Implement proper string metrics computation (with cache). Remove
useless workarounds which disable kerning and ligatures.
* Draw also RtL text string-wise. This speeds up drawing.
* Do not cut strings at selection boundary in RowPainter. This avoids
ligature/kerning breaking in latin text, and bad rendering problems
in Arabic.
* Remove homebrew Arabic and Hebrew support from Encoding.cpp. We now
rely on Qt to handle complex scripts.
* Get rid of LyXRC::rtl_support, which does not have a real use case.
* Fix display of [] and {} delimiters in Arabic scripts.
This variable was introduced to guard against any bad consequence of the then-new right-to-left
languages support. Let's be bold and get rid of it altogether!
Now right to left support is always enabled.
This is handled by Qt now.
Note that a small optimization (do not draw text that is to the left
of WorkArea) is removed because it cannot be guaranteed to be exact
anymore. It was probably not very useful anyway, and would become
useless once the RowPainter is rewritten to use Row information.
Update 00README_STR_METRICS_BRANCH.
The statement
if (pos < from + lyxrc.completion_minlength)
triggers a signed vs. unsigned warning. I don't know why this happens: it
could be an MSVC bug, or related to LLP64 (windows) vs. LP64 (unix)
programming model, or the C++ standard might be ambiguous in the section
defining the "usual arithmetic conversions". However, using a temporary
variable is safe and works on all compilers.