Commit Graph

360 Commits

Author SHA1 Message Date
Stephan Witt
225de7830e Remove useless assignments to local variables that are never read later. 2020-02-18 08:55:00 +01:00
Kornel Benko
ae7a7fa882 Adv search: fix handling of multiple params of a latex command
Fix the case of possibly nested parentheses
2020-01-03 13:11:47 +01:00
Kornel Benko
49aaf95894 Fix handling of doRemove in advanced search
Amend 11c47ddf
2020-01-01 14:03:21 +01:00
Kornel Benko
48c7d9b028 Do not search in deleted text in change tracking mode 2019-12-29 17:42:18 +01:00
Jean-Marc Lasgouttes
c73d397d32 Do not use same name for members and arguments
Spotted by cppcheck.
2019-10-27 00:06:54 +02:00
Jean-Marc Lasgouttes
714113655a Follow some of the performance advice from cppcheck
Most of that is changing string to string const &.
2019-09-13 16:23:49 +02:00
Kornel Benko
8acbcebf11 Findadv: Add some missing accents.
They are defined in lib/unicodesymbols, but were not handled yet.
2019-07-30 15:21:56 +02:00
Kornel Benko
ebc7105c36 FindAdv: Cosmetics
Remove parentheses from return statements,
add '_' to private members
2019-03-21 12:58:16 +01:00
Kornel Benko
e55244ccd8 FindAdv: Added remaining accents(2) dgrave, textdoublegrave, rcap, textroundcap 2019-03-20 23:22:29 +01:00
Jean-Marc Lasgouttes
1c755fefa5 Initialize hasTitle in Intervall constructor
I also moved around some things while I was at it.

Spotted by coverity scan.
2019-03-20 17:26:56 +01:00
Kornel Benko
d7354a1a09 FindAdv: Polishing
1.) Use vector for borders, because any value may be too small
  if there are plenty of accented characters in a paragraph
2.) use '[\S]' instead of '.' in regex for 'accre'. The regex would
  otherwise find also patterns like '\ {some text}'
2019-03-18 18:28:49 +01:00
Kornel Benko
9e825d5035 FindAdv: Added remaining accents cedilla, subring, subhat subtilde 2019-03-18 12:59:40 +01:00
Kornel Benko
13b3808aa0 FindAdv: Casting to satisfy Windows compiler
Thanks to Jean-Marc Lasgouttes
2019-03-18 09:38:34 +01:00
Kornel Benko
bf4394e282 FindAdv: Expand the list of handled chars for ogonek 2019-03-17 13:06:56 +01:00
Kornel Benko
9a1a806b60 FindAdv: Correct start of search if not using regex
Do not try to find pattern inside the leading string.
2019-03-16 11:26:20 +01:00
Kornel Benko
8e7c427c7c Amend 7ac04a2b: Count and display number of replaced strings in FindAdv
We have to know if the previous call to search was a single replace or not,
so that we can correctly initialize the number of replaced strings.
2019-03-16 08:17:09 +01:00
Kornel Benko
4eacc492a3 Typo 2019-03-13 14:14:35 +01:00
Kornel Benko
7ac04a2b75 Fix #11505. Count and display number of replaced strings in FindAdv 2019-03-13 14:06:18 +01:00
Kornel Benko
c041439c51 FindAdv: Special handling for \dot{i} and 'ß'
Different behaviour in regexp{..} for 'İ' and 'ß':
1.) lowercase routine for 'İ' gives 'İ', so that if we are searching
  while ignoring case, the string '\dot{I}' is converted to '\dot{i}'.
  In this case we have to change it to 'İ' (instead of 'i', as one would expect).

2.) If 'ß' is inserted via keyboard in a freshly created regexp box it appears as \lyxmathsym{ß},
  if pasted from the lyx-screen it appears as \text{ß}
2019-03-10 00:29:56 +01:00
Kornel Benko
f848183fa8 FindAdv: Expand the list of handled chars for dot below and ring above 2019-03-08 22:44:00 +01:00
Kornel Benko
b702eda4ed FindAdv: Amend cd4ae51f
Prevent matching only part of a macro.
For instance, we want to find '\imath' but not '\imathxxxx'
while checking for accents.
2019-03-04 14:37:10 +01:00
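A minimal sketch of the "whole macro only" rule from b702eda4ed, written with Python's `re` module as a stand-in. The pattern `IMATH` is hypothetical and is not taken from lyxfind.cpp; it only shows one way to reject '\imathxxxx' while accepting '\imath'.

```python
import re

# Hypothetical pattern: match \imath only when it is not followed by
# another letter, so a longer macro name such as \imathxxxx is rejected.
IMATH = re.compile(r"\\imath(?![A-Za-z])")

def has_imath(latex: str) -> bool:
    """Return True if the string contains the complete macro \\imath."""
    return IMATH.search(latex) is not None
```

The negative lookahead consumes nothing, so the character after the macro (for example the closing brace in `\hat{\imath}`) stays available to the rest of a larger pattern.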
Kornel Benko
cd4ae51f77 FindAdv: Amend b21c8b21: Expand the list for handled latin characters
1.) Added for 'breve' and 'grave' accents
2.) Corrected handling for 'i'-accents (allowed \hat{i} _and_ \hat{\imath})
	because of problems with ignoring case
3.) Spaces: Changed some indents in source
2019-03-04 14:05:44 +01:00
Kornel Benko
99bacf006e FindAdv: Handle some more accented latin characters.
Also try to use UTF8 encoded chars instead of their
latex equivalent if possible.
2019-03-03 14:08:27 +01:00
Kornel Benko
b21c8b214d FindAdv: Expand the list for handled latin characters 2019-03-02 22:00:20 +01:00
Kornel Benko
3541a49db4 FindAdv: Try to add the possibility to search for accented characters in regex
The problem is the handling of regex as using math-mode. That is
any accented character is converted to a math macro.
For instance "ä" --> "\\ddot{a}".
Outside of math or regex it is not converted (if used xetex flavour),
but there are other chars which are converted in math and in text (but differently)
For instance "ů"
	in math --> "\\mathring{u}"
	in text --> "\\r{u}"

TODO: determine the still not handled conversions.
It would be nice, if we could persuade math factory to not convert
these characters, but I was unable to find the place where the
conversion actually takes place.
2019-03-02 15:42:38 +01:00
Kornel Benko
9d6b71c6b3 FindAdv: Use isAlnumASCII() instead of std::isalnum()
Thanks Jean-Marc
2019-02-28 13:00:12 +01:00
Kornel Benko
2c5c397afa Amend aaffcd0b: Remove some remnants ... 2019-02-27 10:33:25 +01:00
Kornel Benko
aaffcd0b39 FindAdv: Do not use data from included listing if in search mode
Fixes #11496 	"Find and replace (advanced)" is too slow
2019-02-27 10:17:56 +01:00
Kornel Benko
b1f93e0982 FindAdv: Try to use a better algorithm to find begin of a searched string 2019-02-26 23:00:31 +01:00
Kornel Benko
babb291ef3 FindAdv: Partially revert e69f7022
The slowness returns, but the search works again
2019-02-26 13:24:36 +01:00
Kornel Benko
e69f702275 FindAdv: Fix #11496 -- too slow find
Also added some more macros to handle
2019-02-25 12:12:19 +01:00
Kornel Benko
6a6b670bbd FindAdv: Correctly match '\[' and '\]' in regular expressions with format enabled
We have to check for instances of '{[}' and '{]}' and
omit removing the enclosing parentheses
2019-02-23 13:11:34 +01:00
Kornel Benko
44a06adb6c FindAdv: debug info
1.) Fill the 'head'-member to easier recognize the macro. May be discarded
 later, although it does not take too much run-time
2.) Add some comment
3.) Ignore any macro inside the regex.
2019-02-22 13:21:23 +01:00
Kornel Benko
01fd1f7679 FindAdv: Discard \parbox, \input macros
The language of these macros does not matter. What's more,
without removing them we may obtain wrong matching.
2019-02-21 20:32:08 +01:00
Kornel Benko
8a0db92523 FindAdv: Handle \shortcut
Essentially remove the header and handle the language inside
\shortcut{...} appropriately
2019-02-21 14:45:41 +01:00
Kornel Benko
a298fc55d9 FindAdv: Added handling for latex environments
1.) Make sure the environment is mentioned in the string for search
  (Added the keyword \latexenvironment{...})
2.) Handle it similar to \textcolor{}

That way we can also search for 'conclusion*' or 'summary' etc
in Additional.lyx.
2019-02-20 14:14:50 +01:00
Kornel Benko
0a2dda0904 Findadv: Added handling for frontmatter macros
title, subtitle, author etc.
2019-02-19 23:11:09 +01:00
Kornel Benko
96ca66d664 FindAdv: Handle more cases
Some macros need:
1.) Take care of case sensitivity
2.) Better handling of used argument values
3.) Cleaner list-environment search
4.) Remove superfluous '~' if searching for description or labeling env
2019-02-18 00:40:55 +01:00
Kornel Benko
dfbe29317d FindAdv: Even more fine tuning 2019-02-16 18:39:10 +01:00
Kornel Benko
faf7f0666f FindAdv: More fine tuning 2019-02-13 13:41:57 +01:00
Kornel Benko
8d752b68e9 FindAdv: Fine tuning 2019-02-12 14:21:14 +01:00
Kornel Benko
a47dbed6bd FindAdv: Try to find real start of found match
Sometimes the selection contains an area which was skipped
in splitOnKnownMacros().
So we check if a shorter selection would give the same match size.
2019-02-11 13:13:28 +01:00
Kornel Benko
f19bd163de FindAdv: Add handling of begin{multicols}[...][...]{...}
Also
a.) try to speed up regex search using non-greedy mode (.*?)
b.) remove '\n' completely in searched strings if it is not surrounded with
	alphanumerical chars
2019-02-10 18:00:55 +01:00
Kornel Benko
7fa1244dd8 Findadv: Add handling for powerdot macros
With format enabled, these macros were hard to process:
	lyxslide, twocolumn
2019-02-07 13:35:47 +01:00
Kornel Benko
50550a215f Findadv: Handle \lettrine{} in initials.module
The problem here is, that selecting any subset of a \lettrine{}
line always creates an initials header. That makes it impossible
to our search engine to find strings, because the regex does not
contain that info. So we have to discard the leading \lettrine part
completely.
We place now a marker (\endarguments) to determine that removable
part.
2019-02-05 08:04:47 +01:00
Kornel Benko
187b518648 FindAdv: to please cppcheck ...
Initialize class elements
Removed unused method
Added 'explicit' keyword
Optimize handling for sizes ( \tiny, \small, etc)
2018-12-18 06:53:58 +01:00
Kornel Benko
a1a7c21871 FindAdv: Amend 4276e1b0 2018-12-17 10:33:23 +01:00
Kornel Benko
4276e1b01e FindAdv: Handle also sizes of characters 2018-12-16 14:50:38 +01:00
Kornel Benko
2682170556 FindAdv: Comments 2018-12-14 19:51:24 +01:00
Kornel Benko
358626b735 FindAdv: Add handling spaces, dots, quotes ...
Treat spaces, dots and quotes as ordinary characters
Also discard length values for hspace,vspace and mspace
2018-12-13 17:12:57 +01:00
Kornel Benko
8a29bdb3d1 FindAdv: Added code, href, url and footnote to handled search formats
Remark: Inside code{} and footnote{} the language settings are ignored.
2018-12-11 17:27:50 +01:00
Kornel Benko
cd94180492 FindAdv: Simplify search for chars '&', '%', '#' and '_'
This is not possible for '$', because of the latex-meaning to
start/end math inset.
Therefore, if not ignoring format, we still have to use
[\\][\$] in regex in order to find '$' in text.
2018-12-05 13:36:43 +01:00
Kornel Benko
1cd80ff6c8 FindAdv: Eliminate a corner case in the binary search
Given the regex 'r.*r\b' and a string
"abc regular something cursor currently"
we expect to find "regular something cursor".
But while searching we may be confronted with input
"regular something cursor curr"
and so the searched string would be seen longer.
2018-11-27 19:10:27 +01:00
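The corner case in 1cd80ff6c8 can be reproduced with any regex engine that treats the end of input as a word boundary. The sketch below uses Python's `re` as a stand-in; it does not reproduce LyX's binary search, only the difference between matching against the full string and against a truncated one.

```python
import re

PATTERN = re.compile(r"r.*r\b")

def match_text(text: str) -> str:
    """Return the text matched by 'r.*r\\b', or an empty string if none."""
    m = PATTERN.search(text)
    return m.group(0) if m else ""

full = "abc regular something cursor currently"
truncated = "regular something cursor curr"

# In the full string no 'r' inside "currently" sits at a word boundary,
# so the greedy match has to stop at the end of "cursor".
full_match = match_text(full)

# In the truncated input the final 'r' is followed by end of input, which
# counts as a word boundary, so the match grows to cover "curr" as well.
truncated_match = match_text(truncated)
```

This is why a search that probes shortened copies of the paragraph can report a longer match than the one that exists in the complete text.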
Kornel Benko
8549fbb326 FindAdv: Avoid crash finding char at end of inset
Testcase without this patch:
1.) open de/Additional.lyx
2.) goto 6.1 Astronomy & Astrophysics
3.) open the index
4.) find advanced
	a.) not ignoring format
	b.) regex = .+
	c.) language of regex: English
	d.) search next
The search finds the next break (which is outside of the index).
The subsequent attempt to display the selection leads to a crash.
2018-11-26 12:37:18 +01:00
Kornel Benko
578a4b6fb0 find: This change was not intended, amend e96a9d6329 2018-11-25 18:25:14 +01:00
Kornel Benko
e96a9d6329 Find: Use greedy behaviour
This change is valid for findadv too.
Patterns like '.*' now are greedy, like it is normal in regex
Searching for whole words is corrected, but can be slow.
One can speed up the search with adapted pattern.
So for instance searching for words starting and ending with 'r'
the normal pattern is 'r.*r'. The speed-up pattern could be
'\br[^\s]*r\b'. This halves the search time.

Search results are now different to that of lyx2.3, because the greedy
'.*' is now really greedy.
To achieve the same results, we have to use '.*?' instead.
2018-11-25 17:51:20 +01:00
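The difference between the greedy and non-greedy forms mentioned in e96a9d6329, and the word-bounded speed-up pattern, can be illustrated with Python's `re` as a stand-in for the engine LyX uses. The sample text is an assumption chosen for illustration; no timing claim is made here.

```python
import re

text = "a regular error"

# Greedy: '.*' runs to the last 'r' in the text.
greedy = re.search(r"r.*r", text).group(0)

# Non-greedy: '.*?' stops at the first 'r' that completes the match,
# which corresponds to the lyx2.3 results described above.
lazy = re.search(r"r.*?r", text).group(0)

# Word-bounded variant from the commit message: only whole words that
# start and end with 'r', with no whitespace inside the match.
words = re.findall(r"\br[^\s]*r\b", text)
```

Because `[^\s]*` cannot cross a space, the engine has far fewer candidate match ends to try than with `.*`, which is one plausible reason for the speed-up the commit reports.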
Jean-Marc Lasgouttes
e2a3dd1167 Fix compilation with msvc 2015
Without this, the compiler does not know whether 0 is a size_t or char
const *.
2018-11-24 19:17:31 +01:00
Kornel Benko
e9e3c50c65 FindAdv: Optimization
An attempt to reduce the number of tests for a match.

Also an attempt to handle Hebrew documents. Unfortunately
the latex output is missing the language specification
(only the change of encoding is available there).
I failed to find a proper place to add the lang.
That means, searching for e.g. English text in Hebrew documents
is not satisfying.
2018-11-20 14:36:11 +01:00
Kornel Benko
17ee4cafb1 FindAdv: Enable search for different languages in Korean documents too
The problem here was that for European languages only the encoding
was visible in latex output. Now also the language is provided.
2018-11-18 10:40:42 +01:00
Kornel Benko
0964ffb266 FindAdv: Remove left over comment character
Sometimes language spec starts with "% ". This happens in Japanese documents
containing English text at start of paragraph.
2018-11-16 12:12:06 +01:00
Kornel Benko
06c05430d9 FindAdv: Added lyx-function search-ignore
Enable/disable ignoring the specified type
	language: e.g. british, slovak, latin, ...
	color:	blue, red, ...
	sectioning: part, chapter, ..
	font:
		series: bold, ...
		shape: upright, italic, slanted
		family: serif, monospace ...
	markup: emphasize, noun
	underline:
	strike:

Examples:
	search-ignore language true
	search-ignore shape true
2018-11-15 14:20:50 +01:00
Kornel Benko
702c495e98 FindAdv: Significantly increase the search speed
The time needed to find a simple string, as a function of
paragraph length, was O(n^2).
Now it is down to O(n).
Before:
	To determine if the pattern matches we compared the
	paragraph from current position to its end.
	Increment current position if no match
Now:
	Check if the character at current position has at least
	the needed features (text, color, language etc)
	If not, Increment current position
	else proceed as before
2018-11-13 12:11:33 +01:00
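The "check features first" idea from 702c495e98 can be sketched as follows. Everything here is hypothetical: the `(character, feature set)` representation, the name `find_first`, and the use of `str.startswith` as a stand-in for the expensive comparison are assumptions made for illustration, not the data structures used in lyxfind.cpp.

```python
def find_first(chars, needed, pattern):
    """Return the first position where `pattern` matches with all `needed`
    features present, or -1.

    chars:   list of (character, set_of_features) pairs for one paragraph
    needed:  set of features the match must carry (text colour, language, ...)
    pattern: plain string to look for
    """
    text = "".join(c for c, _ in chars)
    for pos, (_, feats) in enumerate(chars):
        # Cheap test: skip positions whose character lacks a needed feature,
        # without running the expensive comparison at all.
        if not needed <= feats:
            continue
        # Stand-in for the expensive match against the rest of the paragraph.
        if text.startswith(pattern, pos) and all(
            needed <= f for _, f in chars[pos:pos + len(pattern)]
        ):
            return pos
    return -1

# "red " without colour, followed by "red" typeset in red.
paragraph = [(c, set()) for c in "red "] + [(c, {"color:red"}) for c in "red"]
```

With `needed = {"color:red"}` the first four positions are rejected by the set comparison alone, so the full comparison only runs where a match is possible at all.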
Kornel Benko
636bb6c2d9 FindAdv: Polishing search with regex containing '.'
Also added missing math env alignat
Modified handling of longtable/tabular
Added a routine to count for valid chars. This is needed
for detection of word boundaries.

Due to detection conflicts
	regex '.*' vs match of word-boundaries in MatchStringAdv::operator()
we need to use '\b' in regex explicitly. E.g. '\b.*\b'

The backward search works, but
1.) only in current paragraph (this is the same as before)
2.) only in the same language environment.
2018-11-12 12:28:31 +01:00
Jean-Marc Lasgouttes
7055bb0098 Change IgnoreFormats to a proper class
Instantiate a global variable holding the formats and allow to modify
it using the helper function setIgnoreFormat.
2018-11-09 16:05:09 +00:00
Kornel Benko
f5d5777a86 FindAdv: Polishing
1.) Added \textmd to be ignored (sometimes it is used and sometimes not)
2.) Typo: multiline --> multline. Searching in 'multline' caused a crash
	because processing all of the '{' and '}' in the content of this math
	exceeded the size of the interval field.
2018-11-09 13:49:05 +01:00
Kornel Benko
0c05432284 FindAdv: Polishing
1.) Handle some unclosed parentheses
	Sometimes \shortcut is not correctly closed
2.) Added \ldots as known char
3.) Discard some shapes (circlepar, droppar, ...)
4.) Omit resulting empty string and use some value
	which cannot be matched instead
2018-11-08 09:59:51 +01:00
Kornel Benko
88428123ea FindAdv: Optimize for long matches
Still, if the matched string is at a rear part of a very long
paragraph, the search is way too slow.
2018-11-07 13:14:50 +01:00
Kornel Benko
9d304c0a1d FindAdv: Discard table decorations
That way we do not match the whole table but only the cell contents.
The problem I had was
1.) Document language Spanish
2.) Table (copied from English doc) => language English
3.) All cell contents Spanish

Now search for English text led to a selection of the whole
table, although there was no English content in any cell.
2018-11-07 09:35:16 +01:00
Kornel Benko
4f1cd00b02 Findadv: Initialize the position of first unprocessed open parentheses
Not initializing caused some wrong matches.
2018-11-06 15:28:43 +01:00
Kornel Benko
e487274ff4 Findadv: Polishing
1.) Do not remove '{}' unconditionally from \item parameter
2.) Do not output last empty entry
2018-11-05 12:58:45 +01:00
Kornel Benko
70e2f09c4f Findadv: Some glitches found while searching for English text in fr/UserGuide.lyx
Ignore \index
Handle \og and \bg as characters
Remove space in empty list-item (description or labeling)
2018-11-04 21:41:04 +01:00
Kornel Benko
aa68dcefa0 Findadv: 'Optimized' detection of matched string
This is clearly a hack, because I don't understand why the
previous code did not work.
2018-11-04 14:57:40 +01:00
Kornel Benko
e6418431dd Findadv: Handle the problem with list environments
The problem was that the different list environments
did not look different in the latex output used for
search.
So the input of "\item ..." did not give information
if it is description, lyxlist, enumeration or labeling.

In search mode we now use "\item{enumeration}" etc.
2018-11-03 11:15:12 +01:00
Kornel Benko
5c83ad37d0 Findadv: Use '\n' as delimiter for end of data with same features
This allows to use '.' in regex without matching also wrong data.

Also added modified patch from ajd (see #11241).
2018-11-02 10:32:28 +01:00
Kornel Benko
5af8ec3240 Findadv: Allow multiple math statements in a line 2018-10-30 20:54:32 +01:00
Kornel Benko
f0d7432608 Findadv: Remaining findadv tests pass now
Exception: findadv-21, but it is not a regression,
because this one never passed.
The problem here is, that we cannot differentiate
between enumeration, itemize, description and labeling
environment here.
2018-10-29 13:17:54 +01:00
Kornel Benko
aecd98dc46 Findadv: Adapt search for special chars '[', ']', '%' and '#' 2018-10-29 07:37:32 +01:00
Kornel Benko
2d477c5e0a Changes to match math equations
Now tests findadv-01 ... findadv-20 pass too.

keytest.py: Expanded time for control keys (like \[Return])
findadv*: expanded time for normal keys
lyxfind.cpp: Handle math equations
2018-10-28 19:40:14 +01:00
Scott Kostyshak
645d42f451 Revert "Comment out unused functions to restore -Werror"
This reverts commit bceb2390b4.

For details, see:

https://www.mail-archive.com/search?l=mid&q=4724814.5HqUF52VLN%40amd64
2018-10-28 11:43:47 -04:00
Scott Kostyshak
bceb2390b4 Comment out unused functions to restore -Werror
This commit restores compilation with -Werror and g++ version 7.3.0.

Consistent with 6dc450bc.
2018-10-27 20:59:26 -04:00
Kornel Benko
0f72179a07 Amend(4) 74c849d Advanced search with format
Prepare for use with func request. For instance to
ignore language while searching:
	setIgnoreFormat("language", true);
2018-10-27 16:57:42 +02:00
Kornel Benko
0b3f644469 Amend(3) 74c849d Advanced search with format
* Discard now unneeded code
* Remove macro '\uldepth=...'
2018-10-24 11:07:11 +02:00
Kornel Benko
2dd522472a Amend(2) 74c849d Advanced search with format
Added missing handling for chapter/chapter*
Also added frontmatter (title, author etc), but disabled ATM
2018-10-23 21:12:22 +02:00
Kornel Benko
ff9c32b382 Amend 74c849d, Advanced search with format
Remove macros like '\tiny ' or '\tiny{}' or '\tiny' followed by
any other non-alpha char correctly
2018-10-23 20:03:50 +02:00
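The three removal cases listed in ff9c32b382 ('\tiny ', '\tiny{}', and '\tiny' followed by a non-alpha character) can be expressed in a single pattern. This is a sketch in Python's `re` under the assumption that only `\tiny` is of interest; the pattern and the helper name `strip_tiny` are hypothetical and not taken from the LyX sources.

```python
import re

# Alternatives are tried in order: an empty brace group, a single space,
# or nothing at all as long as the next character is not a letter.
TINY = re.compile(r"\\tiny(?:\{\}| |(?![A-Za-z]))")

def strip_tiny(latex: str) -> str:
    """Remove \\tiny together with its trailing '{}' or space, if any."""
    return TINY.sub("", latex)
```

The final alternative is what keeps a longer macro name intact: in `\tinyfoo` every alternative fails, so nothing is removed.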
Kornel Benko
74c849d651 Advanced search with format, preparing for selective searching
As it is now, searching with format needs ALL the features set
in order to match the pattern.
What needs to be done is a GUI specifying which of the features are
important.
1.) language
2.) font (series, shape)
3.) markup, underline, strikeout
4.) color
Having this info, the implementation is easy. Set
some variables and be done
2018-10-22 20:19:36 +02:00
Kornel Benko
140f843690 Advanced search with format, consider also sectioning macros 2018-10-20 12:47:37 +02:00
Kornel Benko
0be01de61f Advanced search with format, refactoring 2018-10-19 19:11:20 +02:00
Kornel Benko
6dc450bc46 Commented out an unused function to please a picky compiler 2018-10-19 11:10:54 +02:00
Kornel Benko
d6cc58f4a3 Amend(4) 7a03fa6: Advanced search with format:
Further normalize the latex input in case of enabled format search.

It was not enough to split the latex input on \foreignlanguage and \textcolor
macros only.
Instead, macros like \texttt or \noun etc. also had to be accounted for.

This patch uses therefore a different algorithm.
2018-10-18 17:37:15 +02:00
Kornel Benko
162c1f316b Amend(3) 7a03fa6: Advanced search with format:
Grrr... enable the search without format again
2018-10-15 08:09:19 +02:00
Kornel Benko
8b21b2f8fb Amend(2) 7a03fa6: Advanced search with format:
Further tweaking.
2018-10-14 20:39:13 +02:00
Kornel Benko
1967d5411c Amend edca2e0: copy && paste error 2018-10-13 22:22:48 +02:00
Kornel Benko
edca2e0c4a Amend 7a03fa6: Advanced search with format:
In the latexified text:
* Check and handle contained regex properly
* Discard superfluous '{' preventing our search engine
  to match with the search pattern
2018-10-13 21:08:26 +02:00
Kornel Benko
7a03fa6f1d Advanced search with format: Prepare latex for find
Our findadv expects something like
	prefix + 'search'
so that the regex (which is latexified too)
can work on 'search'
(In the source, the prefix is denoted by lead_as_string)

The latex output contains structs like
	\foreignlanguage{abc}{xx\textbf{boldxx\textcolor{blue}{blue 1 blue 2} XX}}
which would never match the simple prefix.

Now the above is converted to
	\foreignlanguage{abc}{xx}\\
	\foreignlanguage{abc}{\textbf{boldxx}}
	\foreignlanguage{abc}{\textbf{\textcolor{blue}{blue 1 blue 2}}}\\
	\foreignlanguage{abc}{\textbf{ XX}}
Of course, more than one language or color in an inset can be searched for now.
2018-10-12 16:47:07 +02:00
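The conversion described in 7a03fa6f1d, where each text run is re-wrapped in every macro that encloses it, can be sketched as a small recursive splitter. This is an illustration only: the macro name is spelled `\foreignlanguage`, the set `TWO_ARG` and all function names are hypothetical, and the sketch omits the trailing `\\` separators and everything else the real code in lyxfind.cpp handles.

```python
import re

TWO_ARG = {"foreignlanguage", "textcolor"}  # assumed: macros with a value plus a body
MACRO = re.compile(r"\\([A-Za-z]+)")

def read_group(s, i):
    """Return (content, next_index) for the brace group opening at s[i]."""
    assert s[i] == "{"
    depth = 0
    for j in range(i, len(s)):
        if s[j] == "{":
            depth += 1
        elif s[j] == "}":
            depth -= 1
            if depth == 0:
                return s[i + 1:j], j + 1
    raise ValueError("unbalanced braces")

def wrap(text, stack):
    """Wrap a text run in every enclosing macro head, innermost first."""
    for head in reversed(stack):
        text = head + "{" + text + "}"
    return text

def flatten(s, stack=()):
    """Yield one string per text run, each carrying its full macro context."""
    i, text = 0, ""
    while i < len(s):
        m = MACRO.match(s, i)
        if m and m.end() < len(s) and s[m.end()] == "{":
            if text:                      # emit the run collected so far
                yield wrap(text, stack)
                text = ""
            name, i = m.group(1), m.end()
            head = "\\" + name
            if name in TWO_ARG:           # keep the value as part of the head
                value, i = read_group(s, i)
                head += "{" + value + "}"
            body, i = read_group(s, i)
            yield from flatten(body, stack + (head,))
        else:
            text += s[i]
            i += 1
    if text:
        yield wrap(text, stack)
```

Calling `list(flatten(...))` on the nested example from the commit message yields four strings, one per run, so a simple prefix comparison against each of them becomes possible.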
Kornel Benko
ff933b52f5 Amend(2) b78bdf8
Modified language handling

Still, there are problems, because sometimes the search pattern
does not contain the requested info. So the 'find' often fails
for strings inside a list environment.
2018-10-06 23:50:50 +02:00
Kornel Benko
f2d82f879e Amend(1) b78bdf8
In advanced search:
* Ignore font sizes
* ignore \\[a-z]+par{} macros
* ignore \\inputencoding{...} macros
2018-10-06 09:58:29 +02:00
Kornel Benko
b78bdf80a8 Added better handling for languages and colors for advanced F&R
The change is significant if the search format is not disabled.
We try to analyze the pattern string first to get needed features
for the search.
We try to analyse the searched string and if it does not
contain all expected features (color, language, char style, char decoration)

Still some problems though
2018-10-05 20:26:44 +02:00
Kornel Benko
2348e0b615 Advanced search: Added handling to search for colored text
if used with format enabled
2018-10-02 11:53:01 +02:00
Kornel Benko
4eb5ac9a2c Amend(4) 73188e3
* Added textsl, texttt, uline, uuline, sout, xout to the list of possible
  leading strings.
* Account for correct number of open braces in regex.
  Now the search works for enabled format too.

This is hopefully the last amend
2018-10-01 12:06:42 +02:00
Kornel Benko
cc0c58839f Amend(3) 73188e3
Adapt the positional references in regex supplied by user
so that for instance '([a-z]+)\s\1' to find identical words in sequence
is changed to '([a-z]+)\s\2'.
2018-09-30 18:37:55 +02:00
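The renumbering in cc0c58839f can be sketched as a textual rewrite of numeric backreferences. The sketch assumes the shift is needed because the user's pattern ends up inside one additional capturing group; that reason, the helper name `shift_backrefs`, and the rewrite pattern are assumptions for illustration, not the code used in LyX.

```python
import re

# A backslash-digit reference that is not itself escaped: an even number
# of preceding backslashes (captured in group 1) means the final
# backslash really starts a backreference.
BACKREF = re.compile(r"(?<!\\)((?:\\\\)*)\\([1-9][0-9]*)")

def shift_backrefs(pattern: str, offset: int = 1) -> str:
    """Add `offset` to every numeric backreference in `pattern`."""
    def bump(m):
        return m.group(1) + "\\" + str(int(m.group(2)) + offset)
    return BACKREF.sub(bump, pattern)

user_pattern = r"([a-z]+)\s\1"
shifted = shift_backrefs(user_pattern)

# Wrapping in one extra group makes the user's first group number 2.
wrapped = re.compile("(" + shifted + ")")
```

Without the shift, `\1` inside the wrapper would refer to the outer group itself rather than to the word the user captured.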