You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
notepad-plus-plus/lexilla/lexlib/CharacterSet.h

196 lines
5.2 KiB

// Scintilla source code edit control
/** @file CharacterSet.h
** Encapsulates a set of characters. Used to test if a character is within a set.
**/
// Copyright 2007 by Neil Hodgson <neilh@scintilla.org>
// The License.txt file describes the conditions under which this software may be distributed.
#ifndef CHARACTERSET_H
#define CHARACTERSET_H
namespace Lexilla {
template<int N>
class CharacterSetArray {
unsigned char bset[(N-1)/8 + 1] = {};
bool valueAfter = false;
public:
enum setBase {
setNone=0,
setLower=1,
setUpper=2,
setDigits=4,
setAlpha=setLower|setUpper,
setAlphaNum=setAlpha|setDigits
};
CharacterSetArray(setBase base=setNone, const char *initialSet="", bool valueAfter_=false) noexcept {
valueAfter = valueAfter_;
AddString(initialSet);
if (base & setLower)
AddString("abcdefghijklmnopqrstuvwxyz");
if (base & setUpper)
AddString("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
if (base & setDigits)
AddString("0123456789");
}
Update: Scintilla 5.3.5 Lexilla 5.2.5 update to Scinitlla Release 5.3.5 (https://www.scintilla.org/scintilla535.zip) Released 31 May 2023. On Win32, implement IME context sensitivity with IMR_DOCUMENTFEED. Feature #1310. On Win32 remove dependence on MSIMG32.DLL by replacing AlphaBlend by GdiAlphaBlend. Bug #1923. On Qt, stop movement of IME candidate box. On Qt, report correct caret position within paragraph for IME retrieve surrounding text. On Qt for Cocoa, fix crash in entry of multi-character strings with IME. and Lexilla Release 5.2.5 (https://www.scintilla.org/lexilla525.zip) Released 31 May 2023. Add CharacterSetArray constructor without setBase initial argument for common case where this is setNone and the initialSet argument completely defines the characters. This shortens and clarifies use of CharacterSetArray. Bash: implement highlighting inside quoted elements and here-docs. Controlled with properties lexer.bash.styling.inside.string, lexer.bash.styling.inside.backticks, lexer.bash.styling.inside.parameter, and lexer.bash.styling.inside.heredoc. Issue #154, Issue #153, Feature #1033. Bash: add property lexer.bash.command.substitution to choose how to style command substitutions. 0 → SCE_SH_BACKTICKS; 1 → surrounding "$(" and ")" as operators and contents styled as bash code; 2 → use distinct styles (base style + 64) for contents. Choice (2) is a provisional feature and details may change before it is finalized. Issue #153. Bash: fix nesting of parameters (SCE_SH_PARAM) like ${var/$sub/"${rep}}"}. Issue #154. Bash: fix single character special parameters like $? by limiting style. Issue #154. Bash: treat "$$" as special parameter and end scalars before "$". Issue #154. Bash: treat "<<" in arithmetic contexts as left bitwise shift operator instead of here-doc. Issue #137. Batch: style SCE_BAT_AFTER_LABEL used for rest of line after label which is not executed. Issue #148. F#: Lex interpolated verbatim strings as verbatim. Issue #156. VB: allow multiline strings when lexer.vb.strings.multiline set. Issue #151. Close #13729
2 years ago
CharacterSetArray(const char *initialSet, bool valueAfter_=false) noexcept :
CharacterSetArray(setNone, initialSet, valueAfter_) {
}
// For compatibility with previous version but should not be used in new code.
CharacterSetArray(setBase base, const char *initialSet, [[maybe_unused]]int size_, bool valueAfter_=false) noexcept :
CharacterSetArray(base, initialSet, valueAfter_) {
assert(size_ == N);
}
void Add(int val) noexcept {
assert(val >= 0);
assert(val < N);
bset[val >> 3] |= 1 << (val & 7);
}
void AddString(const char *setToAdd) noexcept {
for (const char *cp=setToAdd; *cp; cp++) {
const unsigned char uch = *cp;
assert(uch < N);
Add(uch);
}
}
bool Contains(int val) const noexcept {
assert(val >= 0);
if (val < 0) return false;
if (val >= N) return valueAfter;
return bset[val >> 3] & (1 << (val & 7));
}
bool Contains(char ch) const noexcept {
// Overload char as char may be signed
const unsigned char uch = ch;
return Contains(uch);
}
};
using CharacterSet = CharacterSetArray<0x80>;
// Functions for classifying characters
template <typename T, typename... Args>
constexpr bool AnyOf(T t, Args... args) noexcept {
#if defined(__clang__)
Update: Scintilla 5.3.6 and Lexilla 5.2.6 update to Scinitlla Release 5.3.6 (https://www.scintilla.org/scintilla536.zip) Released 26 July 2023. Redraw calltip after showing as didn't update when size of new text exactly same as previous. Feature #1486. On Win32 fix reverse arrow cursor when scaled. Bug #2382. On Win32 hide cursor when typing if that system preference has been chosen. Bug #2333. On Win32 and Qt, stop aligning IME candidate window to target. It is now always aligned to start of composition string. This undoes part of feature #1300. Feature #1488, Bug #2391, Feature #1300. On Qt, for IMEs, update micro focus when selection changes. This may move the location of IME popups to align with the caret. On Qt, implement replacement for IMEs which may help with actions like reconversion. This is similar to delete-surrounding on GTK. and Lexilla Release 5.2.6 (https://www.scintilla.org/lexilla526.zip) Released 26 July 2023. Include empty word list names in value returned by DescribeWordListSets and SCI_DESCRIBEKEYWORDSETS. Issue #175, Pull request #176. Bash: style here-doc end delimiters as SCE_SH_HERE_DELIM instead of SCE_SH_HERE_Q. Issue #177. Bash: allow '$' as last character in string. Issue #180, Pull request #181. Bash: fix state after expansion. Highlight all numeric and file test operators. Don't highlight dash in long option as operator. Issue #182, Pull request #183. Bash: strict checking of special parameters ($*, $@, $$, ...) with property lexer.bash.special.parameter to specify valid parameters. Issue #184, Pull request #186. Bash: recognize keyword before redirection operators (< and >). Issue #188, Pull request #189. Errorlist: recognize Bash diagnostic messages. HTML: allow ASP block to terminate inside line comment. Issue #185. HTML: fix folding with JSP/ASP.NET <%-- comment. Issue #191. HTML: fix incremental styling of multi-line ASP.NET directive. Issue #191. Matlab: improve arguments blocks. Add support for multiple arguments blocks. Prevent "arguments" from being keyword in function declaration line. Fix semicolon handling. Pull request #179. Visual Prolog: add support for embedded syntax with SCE_VISUALPROLOG_EMBEDDED and SCE_VISUALPROLOG_PLACEHOLDER. Styling of string literals changed with no differentiation between literals with quotes and those that are prefixed with "@". Quote characters are in a separate style (SCE_VISUALPROLOG_STRING_QUOTE) to contents (SCE_VISUALPROLOG_STRING). SCE_VISUALPROLOG_CHARACTER, SCE_VISUALPROLOG_CHARACTER_TOO_MANY, SCE_VISUALPROLOG_CHARACTER_ESCAPE_ERROR, SCE_VISUALPROLOG_STRING_EOL_OPEN, and SCE_VISUALPROLOG_STRING_VERBATIM_SPECIAL were removed (replaced with SCE_VISUALPROLOG_UNUSED[1-5]). Pull request #178. Fix #13901, fix #13911, fix #13943, close #13940
1 year ago
static_assert(__is_integral(T) || __is_enum(T));
#endif
return ((t == args) || ...);
}
// prevent pointer without <type_traits>
template <typename T, typename... Args>
constexpr void AnyOf([[maybe_unused]] T *t, [[maybe_unused]] Args... args) noexcept {}
template <typename T, typename... Args>
constexpr void AnyOf([[maybe_unused]] const T *t, [[maybe_unused]] Args... args) noexcept {}
constexpr bool IsASpace(int ch) noexcept {
return (ch == ' ') || ((ch >= 0x09) && (ch <= 0x0d));
}
constexpr bool IsASpaceOrTab(int ch) noexcept {
return (ch == ' ') || (ch == '\t');
}
constexpr bool IsADigit(int ch) noexcept {
return (ch >= '0') && (ch <= '9');
}
constexpr bool IsAHeXDigit(int ch) noexcept {
return (ch >= '0' && ch <= '9')
|| (ch >= 'A' && ch <= 'F')
|| (ch >= 'a' && ch <= 'f');
}
constexpr bool IsAnOctalDigit(int ch) noexcept {
return ch >= '0' && ch <= '7';
}
constexpr bool IsADigit(int ch, int base) noexcept {
if (base <= 10) {
return (ch >= '0') && (ch < '0' + base);
} else {
return ((ch >= '0') && (ch <= '9')) ||
((ch >= 'A') && (ch < 'A' + base - 10)) ||
((ch >= 'a') && (ch < 'a' + base - 10));
}
}
constexpr bool IsASCII(int ch) noexcept {
return (ch >= 0) && (ch < 0x80);
}
constexpr bool IsLowerCase(int ch) noexcept {
return (ch >= 'a') && (ch <= 'z');
}
constexpr bool IsUpperCase(int ch) noexcept {
return (ch >= 'A') && (ch <= 'Z');
}
constexpr bool IsUpperOrLowerCase(int ch) noexcept {
return IsUpperCase(ch) || IsLowerCase(ch);
}
constexpr bool IsAlphaNumeric(int ch) noexcept {
return
((ch >= '0') && (ch <= '9')) ||
((ch >= 'a') && (ch <= 'z')) ||
((ch >= 'A') && (ch <= 'Z'));
}
/**
* Check if a character is a space.
* This is ASCII specific but is safe with chars >= 0x80.
*/
constexpr bool isspacechar(int ch) noexcept {
return (ch == ' ') || ((ch >= 0x09) && (ch <= 0x0d));
}
constexpr bool iswordchar(int ch) noexcept {
return IsAlphaNumeric(ch) || ch == '.' || ch == '_';
}
constexpr bool iswordstart(int ch) noexcept {
return IsAlphaNumeric(ch) || ch == '_';
}
constexpr bool isoperator(int ch) noexcept {
if (IsAlphaNumeric(ch))
return false;
if (ch == '%' || ch == '^' || ch == '&' || ch == '*' ||
ch == '(' || ch == ')' || ch == '-' || ch == '+' ||
ch == '=' || ch == '|' || ch == '{' || ch == '}' ||
ch == '[' || ch == ']' || ch == ':' || ch == ';' ||
ch == '<' || ch == '>' || ch == ',' || ch == '/' ||
ch == '?' || ch == '!' || ch == '.' || ch == '~')
return true;
return false;
}
// Simple case functions for ASCII supersets.
template <typename T>
constexpr T MakeUpperCase(T ch) noexcept {
if (ch < 'a' || ch > 'z')
return ch;
else
return ch - 'a' + 'A';
}
template <typename T>
constexpr T MakeLowerCase(T ch) noexcept {
if (ch < 'A' || ch > 'Z')
return ch;
else
return ch - 'A' + 'a';
}
int CompareCaseInsensitive(const char *a, const char *b) noexcept;
Updated to Scintilla 5.4.2 & Lexilla 5.3.1 https://www.scintilla.org/scintilla542.zip Release 5.4.2 Released 5 March 2024. Significantly reduce memory used for undo actions, often to a half or quarter of previous versions. Feature #1458. Add APIs for saving and restoring undo history. For GTK, when laying out text, detect runs with both left-to-right and right-to-left ranges and divide into an ASCII prefix and more complex suffix. Lay out the ASCII prefix in the standard manner but, for the suffix, measure the whole width and spread that over the suffix bytes. This produces more usable results where the caret moves over the ASCII prefix correctly and over the suffix reasonably but not accurately. For ScintillaEdit on Qt, fix reference from ScintillaDocument to Document to match change in 5.4.1 using IDocumentEditable for SCI_GETDOCPOINTER and SCI_SETDOCPOINTER. For Direct2D on Win32, use the multi-threaded option to avoid crashes when Scintilla instances created on different threads. There may be more problems with this scenario so it should be avoided. Bug #2420. For Win32, ensure keyboard-initiated context menu appears in multi-screen situations. https://www.scintilla.org/lexilla531.zip Release 5.3.1 Released 5 March 2024. Assembler: After comments, treat \r\n line ends the same as \n. This makes testing easier. Bash: Fix folding when line changed to/from comment and previous line is comment. Issue #224. Batch: Fix handling ':' next to keywords. Issue #222. JavaScript: in cpp lexer, add lexer.cpp.backquoted.strings=2 mode to treat ` back-quoted strings as template literals which allow embedded ${expressions}. Issue #94. Python: fix lexing of rb'' and rf'' strings. Issue #223, Pull request #227. Ruby: fix lexing of methods on numeric literals like '3.times' so the '.' and method name do not appear in numeric style. Issue #225.
9 months ago
bool EqualCaseInsensitive(std::string_view a, std::string_view b) noexcept;
int CompareNCaseInsensitive(const char *a, const char *b, size_t len) noexcept;
}
#endif