309 Commits

Author SHA1 Message Date
Kornel Benko
cd94180492 FindAdv: Simplify search for chars '&', '%', '#' and '_'
This is not possible for '$', because in LaTeX it starts and
ends a math inset.
Therefore, when format is not ignored, we still have to use
[\\][\$] in the regex in order to find '$' in text.
2018-12-05 13:36:43 +01:00
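Note on cd94180492 (illustrative sketch, not taken from the commit): assuming the searched text is LaTeX in which a literal dollar sign is written as `\$`, the pattern `[\\][\$]` matches exactly that two-character sequence and ignores the unescaped `$` that delimits math. The helper name `containsLiteralDollar` is made up for this example, and `std::regex` stands in for whatever engine LyX actually uses.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the LaTeX-like text contains an escaped dollar "\$".
bool containsLiteralDollar(const std::string& latexText) {
    // [\\] matches one backslash, [\$] matches one dollar sign.
    static const std::regex pattern(R"([\\][\$])");
    return std::regex_search(latexText, pattern);
}

// Usage example.
void demo() {
    assert(containsLiteralDollar("costs 5\\$ today"));
    assert(!containsLiteralDollar("$x+y$")); // math delimiters only
}
```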
Kornel Benko
1cd80ff6c8 FindAdv: Eliminate a corner case in the binary search
Given the regex 'r.*r\b' and a string
"abc regular something cursor currently"
we expect to find "regular something cursor".
But during the search we may be confronted with the input
"regular something cursor curr",
in which case the match would appear longer than it really is.
2018-11-27 19:10:27 +01:00
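Note on 1cd80ff6c8 (illustrative sketch, not taken from the commit): the corner case can be reproduced with plain `std::regex`, used here only as a stand-in for the real search code. On the full string the last 'r' that is followed by a word boundary is the one ending "cursor"; on the truncated string the end of input itself acts as a word boundary, so the match grows. `firstMatch` is a hypothetical helper.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: first match of `pattern` in `text`, or "" if there is none.
std::string firstMatch(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Usage example.
void demo() {
    const std::string pattern = "r.*r\\b";
    // Full paragraph: the match ends at "cursor".
    assert(firstMatch(pattern, "abc regular something cursor currently")
           == "regular something cursor");
    // Truncated input: "curr" now ends at a word boundary, so the match is longer.
    assert(firstMatch(pattern, "regular something cursor curr")
           == "regular something cursor curr");
}
```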
Kornel Benko
8549fbb326 FindAdv: Avoid crash finding char at end of inset
Testcase without this patch:
1.) open de/Additional.lyx
2.) goto 6.1 Astronomy & Astrophysics
3.) open the index
4.) find advanced
	a.) not ignoring format
	b.) regex = .+
	c.) language of regex: English
	d.) search next
The search finds the next break (which is outside of the index).
The subsequent attempt to display the selection leads to a crash.
2018-11-26 12:37:18 +01:00
Kornel Benko
578a4b6fb0 find: This change was not intended, amend e96a9d63294b 2018-11-25 18:25:14 +01:00
Kornel Benko
e96a9d6329 Find: Use greedy behaviour
This change is valid for findadv too.
Patterns like '.*' are now greedy, as is normal in regex.
Searching for whole words is corrected, but can be slow.
One can speed up the search with an adapted pattern.
So for instance, when searching for words starting and ending with 'r',
the normal pattern is 'r.*r'. The speed-up pattern could be
'\br[^\s]*r\b'. This halves the search time.

Search results now differ from those of lyx2.3, because the greedy
'.*' is now really greedy.
To achieve the same results, we have to use '.*?' instead.
2018-11-25 17:51:20 +01:00
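Note on e96a9d6329 (illustrative sketch, not taken from the commit): the three patterns mentioned in the message behave as follows under `std::regex` with its default ECMAScript grammar, which is assumed here to be close enough to the engine LyX uses. `firstMatch` is a hypothetical helper.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: first match of `pattern` in `text`, or "" if there is none.
std::string firstMatch(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Usage example.
void demo() {
    const std::string text = "abc regular something cursor currently";
    // Greedy: runs to the last 'r' in the text.
    assert(firstMatch("r.*r", text) == "regular something cursor curr");
    // Lazy: stops at the first possible closing 'r'.
    assert(firstMatch("r.*?r", text) == "regular");
    // Word-bounded pattern that cannot cross whitespace.
    assert(firstMatch("\\br[^\\s]*r\\b", text) == "regular");
}
```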
Jean-Marc Lasgouttes
e2a3dd1167 Fix compilation with msvc 2015
Without this, the compiler does not know whether 0 is a size_t or char
const *.
2018-11-24 19:17:31 +01:00
Kornel Benko
e9e3c50c65 FindAdv: Optimization
An attempt to reduce the number of tests for a match.

Also an attempt to handle Hebrew documents. Unfortunately
the latex output is missing the language specification
(only the change of encoding is available there).
I failed to find a proper place to add the lang.
That means that searching for e.g. English text in Hebrew documents
is not satisfactory.
2018-11-20 14:36:11 +01:00
Kornel Benko
17ee4cafb1 FindAdv: Enable search for different languages in Korean documents too
The problem here was that for European languages only the encoding
was visible in the latex output. Now the language is provided as well.
2018-11-18 10:40:42 +01:00
Kornel Benko
0964ffb266 FindAdv: Remove left over comment character
Sometimes the language spec starts with "% ". This happens in Japanese documents
containing English text at the start of a paragraph.
2018-11-16 12:12:06 +01:00
Kornel Benko
06c05430d9 FindAdv: Added lyx-function search-ignore
Enable/disable ignoring the specified type
	language: e.g. british, slovak, latin, ...
	color:	blue, red, ...
	sectioning: part, chapter, ..
	font:
		series: bold, ...
		shape: upright, italic, slanted
		family: serif, monospace ...
	markup: emphasize, noun
	underline:
	strike:

Examples:
	search-ignore language true
	search-ignore shape true
2018-11-15 14:20:50 +01:00
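Note on 06c05430d9 (illustrative sketch, not taken from the commit): one way to split an argument string such as "language true" into a type and a flag. The function name `parseSearchIgnoreArg` is hypothetical, and accepting "false" as the second word is an assumption; the message only shows examples with "true".

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical parser for an argument of the form "<type> <true|false>".
// On success, fills `type` and `ignore` and returns true; otherwise leaves them untouched.
bool parseSearchIgnoreArg(const std::string& arg, std::string& type, bool& ignore) {
    std::istringstream in(arg);
    std::string name, flag, extra;
    if (!(in >> name >> flag) || (in >> extra))
        return false; // too few or too many words
    if (flag != "true" && flag != "false")
        return false;
    type = name;
    ignore = (flag == "true");
    return true;
}

// Usage example.
void demo() {
    std::string type;
    bool ignore = false;
    assert(parseSearchIgnoreArg("language true", type, ignore));
    assert(type == "language" && ignore);
}
```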
Kornel Benko
702c495e98 FindAdv: Significantly increase the search speed
The time needed to find a simple string depended on the
paragraph length n and was O(n^2).
Now it is down to O(n).
Before:
	To determine if the pattern matches, we compared the
	paragraph from the current position to its end.
	Increment the current position if there is no match.
Now:
	Check if the character at the current position has at least
	the needed features (text, color, language etc).
	If not, increment the current position,
	else proceed as before.
2018-11-13 12:11:33 +01:00
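Note on 702c495e98 (illustrative sketch, not taken from the commit): the idea of a cheap per-character feature test in front of the expensive match can be sketched as below. The feature bits, the `SearchResult` struct and `findWithPrefilter` are all hypothetical names; the expensive matcher is passed in as a callback so that the number of calls can be counted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical feature bits attached to each character of a paragraph.
enum Feature : unsigned { Bold = 1u << 0, Blue = 1u << 1, English = 1u << 2 };

struct SearchResult {
    std::size_t position;       // index of the match, or paragraph length if none
    std::size_t expensiveTests; // how often the full matcher was run
};

SearchResult findWithPrefilter(const std::vector<unsigned>& charFeatures,
                               unsigned needed,
                               const std::function<bool(std::size_t)>& fullMatch) {
    SearchResult result{charFeatures.size(), 0};
    for (std::size_t pos = 0; pos < charFeatures.size(); ++pos) {
        // Cheap test first: skip positions that lack a required feature.
        if ((charFeatures[pos] & needed) != needed)
            continue;
        ++result.expensiveTests;
        if (fullMatch(pos)) {
            result.position = pos;
            break;
        }
    }
    return result;
}

// Usage example: only positions 2 and 3 carry the needed features.
void demo() {
    const std::vector<unsigned> features = {0u, 0u, Bold | English, Bold | English, 0u};
    const SearchResult r = findWithPrefilter(
        features, Bold | English, [](std::size_t pos) { return pos == 3; });
    assert(r.position == 3 && r.expensiveTests == 2);
}
```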
Kornel Benko
636bb6c2d9 FindAdv: Polishing search with regex containing '.'
Also added missing math env alignat
Modified handling of longtable/tabular
Added a routine to count valid chars. This is needed
for the detection of word boundaries.

Due to detection conflicts
	regex '.*' vs match of word-boundaries in MatchStringAdv::operator()
we need to use '\b' in regex explicitly. E.g. '\b.*\b'

The backward search works, but
1.) only in current paragraph (this is the same as before)
2.) only in the same language environment.
2018-11-12 12:28:31 +01:00
Jean-Marc Lasgouttes
7055bb0098 Change IgnoreFormats to a proper class
Instantiate a global variable holding the formats and allow modifying
it using the helper function setIgnoreFormat.
2018-11-09 16:05:09 +00:00
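Note on 7055bb0098 (illustrative sketch, not taken from the commit): a class holding the ignore flags, one global instance, and a helper that modifies it. Only the names `IgnoreFormats` and `setIgnoreFormat` appear in the log; the members `isIgnored`, `set`, `flags_`, the global name `ignoreFormats`, the exact signature of the helper and the default of "not ignored" are assumptions made for this example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch; the real class may store and expose its flags differently.
class IgnoreFormats {
public:
    // Whether the given format type is currently ignored (assumed default: false).
    bool isIgnored(const std::string& type) const {
        const std::map<std::string, bool>::const_iterator it = flags_.find(type);
        return it != flags_.end() && it->second;
    }
    void set(const std::string& type, bool ignore) { flags_[type] = ignore; }

private:
    std::map<std::string, bool> flags_;
};

// One global instance, modified only through the helper below.
IgnoreFormats ignoreFormats;

void setIgnoreFormat(const std::string& type, bool ignore) {
    ignoreFormats.set(type, ignore);
}

// Usage example.
void demo() {
    setIgnoreFormat("language", true);
    assert(ignoreFormats.isIgnored("language"));
}
```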
Kornel Benko
f5d5777a86 FindAdv: Polishing
1.) Added \textmd to be ignored (sometimes it is used and sometimes not)
2.) Typo: multiline --> multline. Searching in 'multline' caused a crash
	because processing all of the '{' and '}' in the content of this math
	exceeded the size of the interval field.
2018-11-09 13:49:05 +01:00
Kornel Benko
0c05432284 FindAdv: Polishing
1.) Handle some unclosed parentheses
	Sometimes \shortcut is not correctly closed
2.) Added \ldots as known char
3.) Discard some shapes (circlepar, droppar, ...)
4.) Omit resulting empty string and use some value
	which cannot be matched instead
2018-11-08 09:59:51 +01:00
Kornel Benko
88428123ea FindAdv: Optimize for long matches
Still, if the matched string is at a rear part of a very long
paragraph, the search is way too slow.
2018-11-07 13:14:50 +01:00
Kornel Benko
9d304c0a1d FindAdv: Discard table decorations
That way we do not match the whole table but only the cell contents.
The problem I had was
1.) Document language Spanish
2.) Table (copied from English doc) => language English
3.) All cell contents Spanish

Now search for English text led to a selection of the whole
table, although there was no English content in any cell.
2018-11-07 09:35:16 +01:00
Kornel Benko
4f1cd00b02 Findadv: Initialize the position of first unprocessed open parentheses
Not initializing caused some wrong matches.
2018-11-06 15:28:43 +01:00
Kornel Benko
e487274ff4 Findadv: Polishing
1.) Do not remove '{}' unconditionally from \item parameter
2.) Do not output last empty entry
2018-11-05 12:58:45 +01:00
Kornel Benko
70e2f09c4f Findadv: Some glitches found while searching for English text in fr/UserGuide.lyx
Ignore \index
Handle \og and \bg as characters
Remove space in empty list-item (description or labeling)
2018-11-04 21:41:04 +01:00
Kornel Benko
aa68dcefa0 Findadv: 'Optimized' detection of matched string
This is clearly a hack, because I don't understand why the
previous code did not work.
2018-11-04 14:57:40 +01:00
Kornel Benko
e6418431dd Findadv: Handle the problem with list environments
The problem was that the different list environments
did not look different in the latex output used for
search.
So the input of "\item ..." gave no information about
whether it is description, lyxlist, enumeration or labeling.

In search mode we now use "\item{enumeration}" etc.
2018-11-03 11:15:12 +01:00
Kornel Benko
5c83ad37d0 Findadv: Use '\n' as delimiter for end of data with same features
This allows using '.' in a regex without also matching wrong data.

Also added modified patch from ajd (see #11241).
2018-11-02 10:32:28 +01:00
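Note on 5c83ad37d0 (illustrative sketch, not taken from the commit): with the ECMAScript grammar of `std::regex`, '.' does not match a newline, so a '\n' between two runs of text keeps '.*' from spilling over into the next run. The sample strings and the helper `firstMatch` are made up; whether LyX's engine treats '.' the same way is assumed from the commit message.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: first match of `pattern` in `text`, or "" if there is none.
std::string firstMatch(const std::string& pattern, const std::string& text) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(text, m, re) ? m.str(0) : std::string();
}

// Usage example.
void demo() {
    // Without a delimiter, '.*' runs across both pieces of text.
    assert(firstMatch("t.*t", "red text blue text") == "text blue text");
    // With '\n' as delimiter, the match stays inside the first piece.
    assert(firstMatch("t.*t", "red text\nblue text") == "text");
}
```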
Kornel Benko
5af8ec3240 Findadv: Allow multiple math statements in a line 2018-10-30 20:54:32 +01:00
Kornel Benko
f0d7432608 Findadv: Remaining findadv tests pass now
Exception: findadv-21, but it is not a regression,
because this one never passed.
The problem here is that we cannot differentiate
between enumeration, itemize, description and labeling
environments.
2018-10-29 13:17:54 +01:00
Kornel Benko
aecd98dc46 Findadv: Adapt search for special chars '[', ']', '%' and '#' 2018-10-29 07:37:32 +01:00
Kornel Benko
2d477c5e0a Changes to match math equations
Now tests findadv-01 ... findadv-20 pass too.

keytest.py: Expanded time for control keys (like \[Return])
findadv*: expanded time for normal keys
lyxfind.cpp: Handle math equations
2018-10-28 19:40:14 +01:00
Scott Kostyshak
645d42f451 Revert "Comment out unused functions to restore -Werror"
This reverts commit bceb2390b473db347b72f9fc16f63a60223daa9d.

For details, see:

https://www.mail-archive.com/search?l=mid&q=4724814.5HqUF52VLN%40amd64
2018-10-28 11:43:47 -04:00
Scott Kostyshak
bceb2390b4 Comment out unused functions to restore -Werror
This commit restores compilation with -Werror and g++ version 7.3.0.

Consistent with 6dc450bc.
2018-10-27 20:59:26 -04:00
Kornel Benko
0f72179a07 Amend(4) 74c849d Advanced search with format
Prepare for use with func request. For instance to
ignore language while searching:
	setIgnoreFormat("language", true);
2018-10-27 16:57:42 +02:00
Kornel Benko
0b3f644469 Amend(3) 74c849d Advanced search with format
* Discard now unneeded code
* Remove macro '\uldepth=...'
2018-10-24 11:07:11 +02:00
Kornel Benko
2dd522472a Amend(2) 74c849d Advanced search with format
Added missing handling for chapter/chapter*
Also added frontmatter (title, author etc), but disabled ATM
2018-10-23 21:12:22 +02:00
Kornel Benko
ff9c32b382 Amend 74c849d, Advanced search with format
Remove macros like '\tiny ' or '\tiny{}' or '\tiny' followed by
any other non-alpha char correctly
2018-10-23 20:03:50 +02:00
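Note on ff9c32b382 (illustrative sketch, not taken from the commit): the three cases named in the message ('\tiny ', '\tiny{}' and '\tiny' followed by a non-alpha char) can be covered by one regex with a negative lookahead. The function `stripTiny` is hypothetical and handles only `\tiny`; the real code presumably treats other size macros as well.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: remove "\tiny{}", "\tiny " and "\tiny" before a non-letter.
std::string stripTiny(const std::string& latex) {
    // The lookahead (?![A-Za-z]) keeps longer macro names such as "\tinyfoo" intact.
    static const std::regex macro(R"(\\tiny(?:\{\}| |(?![A-Za-z])))");
    return std::regex_replace(latex, macro, "");
}

// Usage example.
void demo() {
    assert(stripTiny("\\tiny text") == "text");
    assert(stripTiny("\\tiny{}text") == "text");
    assert(stripTiny("\\tiny\\textbf{x}") == "\\textbf{x}");
    assert(stripTiny("\\tinyfoo") == "\\tinyfoo");
}
```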
Kornel Benko
74c849d651 Advanced search with format, preparing for selective searching
As it is now, searching with format needs ALL the features set
in order to match the pattern.
What needs to be done is a GUI specifying which of the features are
important.
1.) language
2.) font (series, shape)
3.) markup, underline, strikeout
4.) color
Having this info, the implementation is easy. Set
some variables and be done
2018-10-22 20:19:36 +02:00
Kornel Benko
140f843690 Advanced search with format, consider also sectioning macros 2018-10-20 12:47:37 +02:00
Kornel Benko
0be01de61f Advanced search with format, refactoring 2018-10-19 19:11:20 +02:00
Kornel Benko
6dc450bc46 Commented out an unused function to please a picky compiler 2018-10-19 11:10:54 +02:00
Kornel Benko
d6cc58f4a3 Amend(4) 7a03fa6: Advanced search with format:
Further normalize the latex input in case of enabled format search.

It was not enough to split the latex input on \foreignlanguage and \textcolor
macros only.
Instead, macros like \texttt or \noun etc. also had to be accounted for.

This patch therefore uses a different algorithm.
2018-10-18 17:37:15 +02:00
Kornel Benko
162c1f316b Amend(3) 7a03fa6: Advanced search with format:
Grrr... enable the search without format again
2018-10-15 08:09:19 +02:00
Kornel Benko
8b21b2f8fb Amend(2) 7a03fa6: Advanced search with format:
Further tweaking.
2018-10-14 20:39:13 +02:00
Kornel Benko
1967d5411c Amend edca2e0: copy && paste error 2018-10-13 22:22:48 +02:00
Kornel Benko
edca2e0c4a Amend 7a03fa6: Advanced search with format:
In the latexified text:
* Check and handle contained regex properly
* Discard superfluous '{' that prevented our search engine
  from matching the search pattern
2018-10-13 21:08:26 +02:00
Kornel Benko
7a03fa6f1d Advanced search with format: Prepare latex for find
Our findadv expects something like
	prefix + 'search'
so that the regex (which is latexified too)
can work on 'search'
(In the source, the prefix is denoted by lead_as_string)

The latex output contains constructs like
	\foreignlanguage{abc}{xx\textbf{boldxx\textcolor{blue}{blue 1 blue 2} XX}}
which would never match the simple prefix.

Now the above is converted to
	\foreignlanguage{abc}{xx}\\
	\foreignlanguage{abc}{\textbf{boldxx}}
	\foreignlanguage{abc}{\textbf{\textcolor{blue}{blue 1 blue 2}}}\\
	\foreignlanguage{abc}{\textbf{ XX}}
Of course, more than one language or color in an inset can be searched for now.
2018-10-12 16:47:07 +02:00
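Note on 7a03fa6f1d (illustrative sketch, not taken from the commit): the conversion described in the message can be approximated by walking the string with a stack of open macros and emitting each run of plain text wrapped in all macros active at that point. This is a simplified reconstruction, not the algorithm in lyxfind.cpp. The name `flattenLatex`, the helper `takesLeadingArg` and the short list of two-argument macros are assumptions, and the trailing `\\` shown in the message is not reproduced.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Macros with an extra leading argument, e.g. \textcolor{blue}{...} (assumed list).
static bool takesLeadingArg(const std::string& name) {
    return name == "\\foreignlanguage" || name == "\\textcolor";
}

// Split nested LaTeX into pieces, each wrapped in all macros active at that point.
std::vector<std::string> flattenLatex(const std::string& in) {
    std::vector<std::string> pieces;
    std::vector<std::string> open; // openers of the currently active macros
    std::string run;               // plain text collected since the last flush

    auto flush = [&]() {
        if (run.empty())
            return;
        std::string piece;
        for (const std::string& opener : open)
            piece += opener;
        piece += run;
        piece.append(open.size(), '}');
        pieces.push_back(piece);
        run.clear();
    };

    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '}' && !open.empty()) { // end of the innermost macro
            flush();
            open.pop_back();
            ++i;
            continue;
        }
        if (in[i] != '\\') { // ordinary character
            run += in[i++];
            continue;
        }
        std::size_t j = i + 1;
        while (j < in.size() && std::isalpha(static_cast<unsigned char>(in[j])))
            ++j;
        if (j == i + 1) { // escaped character such as \$ or \{: copy both chars
            run += in.substr(i, 2);
            i += 2;
            continue;
        }
        std::string opener = in.substr(i, j - i);
        if (takesLeadingArg(opener) && j < in.size() && in[j] == '{') {
            const std::size_t close = in.find('}', j);
            if (close != std::string::npos) { // keep "{abc}" as part of the opener
                opener += in.substr(j, close - j + 1);
                j = close + 1;
            }
        }
        if (j < in.size() && in[j] == '{') { // macro wrapping a group
            flush();
            open.push_back(opener + "{");
            i = j + 1;
        } else { // macro without a group: keep it as text
            run += in.substr(i, j - i);
            i = j;
        }
    }
    flush();
    return pieces;
}

// Usage example.
void demo() {
    const std::vector<std::string> p = flattenLatex("\\textbf{a\\emph{b}}");
    assert(p.size() == 2 && p[0] == "\\textbf{a}" && p[1] == "\\textbf{\\emph{b}}");
}
```

Applied to the example from the commit message, this sketch yields the same four pieces, one per run of text.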
Kornel Benko
ff933b52f5 Amend(2) b78bdf8
Modified language handling

Still, there are problems, because sometimes the search pattern
does not contain the requested info. So the 'find' often fails
for strings inside a list environment.
2018-10-06 23:50:50 +02:00
Kornel Benko
f2d82f879e Amend(1) b78bdf8
In advanced search:
* Ignore font sizes
* ignore \\[a-z]+par{} macros
* ignore \\inputencoding{...} macros
2018-10-06 09:58:29 +02:00
Kornel Benko
b78bdf80a8 Added better handling for languages and colors for advanced F&R
The change is significant if the search format is not disabled.
We try to analyze the pattern string first to get needed features
for the search.
We try to analyse the searched string and if it does not
contain all expected features (color, language, char style, char decoration)

Still some problems though
2018-10-05 20:26:44 +02:00
Kornel Benko
2348e0b615 Advanced search: Added handling to search for colored text
if used with format enabled
2018-10-02 11:53:01 +02:00
Kornel Benko
4eb5ac9a2c Amend(4) 73188e3
* Added textsl, texttt, uline, uuline, sout, xout to the list of possible
  leading strings.
* Account for correct number of open braces in regex.
  Now the search works with format enabled too.

This is hopefully the last amend
2018-10-01 12:06:42 +02:00
Kornel Benko
cc0c58839f Amend(3) 73188e3
Adapt the positional references in the regex supplied by the user,
so that for instance '([a-z]+)\s\1' (to find identical words in sequence)
is changed to '([a-z]+)\s\2'.
2018-09-30 18:37:55 +02:00
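Note on cc0c58839f (illustrative sketch, not taken from the commit): renumbering back-references amounts to adding an offset to every `\N` in the pattern. The shift is presumably needed because the user pattern ends up inside a larger expression that adds a capture group in front of it; that reasoning, the name `shiftBackrefs` and the use of `std::regex` in the example are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: add `offset` to every numeric back-reference (\1, \2, ...).
std::string shiftBackrefs(const std::string& pattern, int offset) {
    std::string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] != '\\' || i + 1 == pattern.size()) {
            out += pattern[i++];
            continue;
        }
        std::size_t j = i + 1;
        if (pattern[j] < '1' || pattern[j] > '9') {
            // Some other escape such as \s, \\ or \0: copy both characters unchanged.
            out += pattern.substr(i, 2);
            i += 2;
            continue;
        }
        int number = 0;
        while (j < pattern.size() && pattern[j] >= '0' && pattern[j] <= '9')
            number = number * 10 + (pattern[j++] - '0');
        out += '\\' + std::to_string(number + offset);
        i = j;
    }
    return out;
}

// Usage example: wrap the shifted pattern in one extra group and search with it.
void demo() {
    const std::string shifted = shiftBackrefs("([a-z]+)\\s\\1", 1);
    assert(shifted == "([a-z]+)\\s\\2");
    const std::regex re("(" + shifted + ")");
    const std::string text = "say hello hello now";
    std::smatch m;
    assert(std::regex_search(text, m, re) && m.str(0) == "hello hello");
}
```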
Kornel Benko
2fdc52df19 Amend(2) 73188e3.
Added noun, textsf and textit to the list of possible
leading strings when searching with format enabled.
Searching seems to work as intended now.
2018-09-30 16:15:45 +02:00