A library of language lexers for use with Scintilla

Overview
README for Lexilla library.

The Lexilla library contains a set of lexers and folders that provides support for
programming, mark-up, and data languages for the Scintilla source code editing
component.

Lexilla is made available as both a shared library and static library.
The shared library is called liblexilla.so / liblexilla.dylib / lexilla.dll on Linux / macOS /
Windows.
The static library is called liblexilla.a when built with GCC or Clang and liblexilla.lib
when built with MSVC.

Lexilla is developed on Windows, Linux, and macOS and requires a C++17 compiler.
It may work on other Unix platforms like BSD but that is not a development focus.
MSVC 2019.4, GCC 9.0, Clang 9.0, and Apple Clang 11.0 are known to work.

MSVC is only available on Windows.

GCC and Clang work on Windows and Linux.

On macOS, only Apple Clang is available.

To use GCC, run lexilla/src/makefile:
	make

To use Clang, run lexilla/src/makefile:
	make CLANG=1
On macOS, CLANG is set automatically so this can just be
	make

To use MSVC, run lexilla/src/lexilla.mak:
	nmake -f lexilla.mak

To build a debugging version of the library, add DEBUG=1 to the command:
	make DEBUG=1
	
The built libraries are copied into scintilla/bin.

Lexilla relies on a list of lexers from the scintilla/lexers directory. If any changes are
made to the set of lexers then source and build files can be regenerated with the
lexilla/scripts/LexillaGen.py script which requires Python 3 and is tested with 3.7+.
Unix:
	python3 LexillaGen.py
Windows:
	pyw LexillaGen.py
Comments
  • Markdown header > color entire line

    Hi,

    In case of a header, only the '#' character has a specific color, but not the entire header line.

    Below an example:

    # Header 1  
    
    Test
    

    I already posted this question in 2020, but at that time nobody was maintaining the Markdown lexer; I guess that has changed now :) Is it possible to have a look at this?

    Thank you

    markdown 
    opened by cdbdev 40
  • Improve LexJulia

    From https://sourceforge.net/p/scintilla/feature-requests/1380/

    • Prefix lexer options name with lexer.julia.
    • [Edit] Prefix fold options name with fold.julia.
    • Remove unsafe glib calls to isdigit and isspace
    • Add string.interpolation option
    • [Edit] Add fold.julia.syntax.based option
    • Strip trailing spaces
    • [Edit] correct bug with inline comments

    The end of the line of a single-line comment is not styled as a comment. This is a bug: the whole line should be styled as a comment. I have not managed to make it work. The problem is likely at line 970, in the condition if (sc.atLineEnd || sc.ch == '\r' || sc.ch == '\n').

    julia 
    opened by getzze 29
  • [R lexer] Support all string styles

    See the document at https://search.r-project.org/R/refmans/base/html/Quotes.html. Currently the lexer does not handle backticks or raw string literals. Single-quoted and double-quoted strings are also handled differently with regard to multiline (both are single line, see https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Literal-constants). I think the following would be some steps to implement this:

    • add AllStyles.r test file.
    • refactor LexR.cxx (convert the big if-else to switch (sc.state), use anonymous namespace, use C++ includes, etc.).
    • highlight backtick (https://sourceforge.net/p/scintilla/feature-requests/1387/).
    • highlight the C++ like raw string.
    • (optional) highlight escape sequence.
    committed r 
    opened by zufuliu 26
  • 7z is not considered a word in PowerShell

    https://github.com/notepad-plus-plus/notepad-plus-plus/issues/12465

    In Notepad++, when I write the word 7z it is not highlighted. I reported this to the Notepad++ developers and they told me to report it here.

    committed powershell 
    opened by GabrielFrigo4 15
  • Allow setting properties inside test example files

    Requiring a new subdirectory for each property change test is unwieldy. It would be better to allow test example files to set properties that differ from SciTE.properties. These could be placed in comments at the start of the test example file similar to:

    # Test with f-strings disabled: [|lexer.python.strings.f=0|]
    

    The [|...|] syntax was chosen as these sequences are unusual so should avoid unexpected matches.

    An implementation is available from 62InlineProperties.patch.

    Adding a subdirectory may still be worthwhile for properties that have complex effects or interactions.
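
    A minimal Python sketch of how such markers could be pulled out of the text of a test example file. The helper name inline_properties and the regular expression are illustrative assumptions; they are not taken from 62InlineProperties.patch.

```python
import re

# Hypothetical pattern: one key=value pair written as [|key=value|].
# This is an assumption for illustration, not the patch's code.
MARKER = re.compile(r"\[\|([^=|]+)=([^|]*)\|\]")


def inline_properties(text):
    """Return a dict of the properties found in [|key=value|] markers."""
    return {key.strip(): value.strip() for key, value in MARKER.findall(text)}


example = "# Test with f-strings disabled: [|lexer.python.strings.f=0|]\nprint(f'x')\n"
print(inline_properties(example))
```

    Properties found this way would override the values read from SciTE.properties for that one file.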

    enhancement testing 
    opened by nyamatongwe 14
  • LexInno: Fix style handling of unterminated strings, message sections, and avoid terminating comments between CR and LF

    Fixed issues with Parameters: and with Keywords= with anchoring the words to trailing : and = characters.

    Tested with https://github.com/XhmikosR/notepad2-mod/blob/master/distrib/notepad2_setup.iss

    Note the emphasised bolding in fixes.

    Fixed in Setup section:

    WizardSmallImageFile=WizardSmallImageFile.bmp

    to

    WizardSmallImageFile=WizardSmallImageFile.bmp

    Value should be default style in this case.

    Fixed in CustomMessages section:

    en.msg_DeleteSettings =Do you also want to delete {#app_name}***'s settings?%n%nIf you plan on installing {#app_name} again then you do not have to delete them.***

    and

    en.tsk_Other =Other tasks:

    and

    en.tsk_ResetSettings =Reset {#app_name}***'s settings***

    Unterminated strings should be Default style in CustomMessages and Messages sections. tasks should be Default style, not Parameter style.

    Fixed in Tasks and InstallDelete sections:

    Name: quicklaunchicon; Description: {cm:CreateQuickLaunchIcon}; GroupDescription: {cm:AdditionalIcons}; Flags: unchecked; OnlyBelowVersion: 6.01

    and

    Type: files; Name: {#quick_launch}{#app_name}.lnk; Check: not IsTaskSelected('quicklaunchicon') and IsUpgrade(); OnlyBelowVersion: 6.01

    OnlyBelowVersion should be Parameter style, not Keyword style, even though the two may be coloured the same in the latest inno.properties.

    Note

    The test Inno script has no Registry section to show the fix of String: as Parameter style in INI section and string; as Default style in Registry section, which is what led to these fixes as mentioned at SourceForge.

    inno 
    opened by mpheath 13
  • gcc/g++ compiled 32 bits dll doesn't work

    opened by lifenjoiner 13
  • `ifdef windir` doesn't work with MSYS2 make

    Testing makefile:

    # Works with MinGW make, but doesn't work with MSYS2 make.
    ifdef windir
        Value1 = 1
    else
        Value1 = 0
    endif
    
    # Works with both.
    ifneq ("$windir", "")
        Value2 = 1
    else
        Value2 = 0
    endif
    
    all:
    	echo $(Value1)
    	echo $(Value2)
    

    Patch: Improve-windir-variable-testing-for-makefile.zip

    BTW: Have you considered adding GitHub Actions CI workflows and then reviewing/merging PRs? GitHub Actions can automatically build and test on each push and/or PR. That would make the whole process easier.

    building not-a-problem 
    opened by lifenjoiner 13
  • Nested block comments handled incorrectly for Transact-SQL (Microsoft SQL Server)

    Description of the Issue

    SQL, C++, or Java code commented out with nested block comments (example below) is not highlighted correctly. Please note that highlighting works well in the development environments I tested, which I suppose don't use ScintillaOrg: Microsoft SQL Server Management Studio, KDevelop, and QtCreator.

    /* aaaaaaaaaa /* bbbbbbbbbbbbbbb */ aaaaaaaaaa */

    Steps to Reproduce the Issue

    Put below into editor, for example in Notepad++
    /* aaaaaaaaaa
    /* bbbbbbbbbbbbbbb */
    aaaaaaaaaa */
    
    Change highlighting with SQL or C++ or Java
    

    Expected Behavior

    All lines of the example should be highlighted as comments.

    Actual Behavior

    The last line is highlighted incorrectly, as if it were not commented.

    Other information

    I reported this issue also in Notepad++ bug tracker: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11718
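
    The expected behaviour for Transact-SQL can be sketched with a nesting depth counter. This is a minimal Python illustration under the assumption that every /* raises the depth and every */ lowers it; the function name comment_mask is hypothetical and this is not the lexer's code.

```python
def comment_mask(text):
    """Return one boolean per character, True where it is inside a comment.

    Assumes nested block comments: each /* raises the depth, each */ lowers it.
    """
    mask = []
    depth = 0
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "/*":
            depth += 1
            mask.extend([True, True])
            i += 2
        elif pair == "*/" and depth > 0:
            # The closing delimiter itself still belongs to the comment.
            mask.extend([True, True])
            depth -= 1
            i += 2
        else:
            mask.append(depth > 0)
            i += 1
    return mask


sample = "/* aaaaaaaaaa\n/* bbbbbbbbbbbbbbb */\naaaaaaaaaa */\nSELECT 1"
mask = comment_mask(sample)
```

    With a single "inside comment" flag instead of a depth, the first */ would end the comment, which matches the behaviour reported for the last line.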

    mssql committed 
    opened by piomiq 12
  • Matlab folding bugfix

    Lexilla doesn't know a couple of MATLAB's keywords, so it cannot do the proper folding when these keywords are used.
    This PR adds the absent keywords into the MATLAB lexer.
    It also includes the code processing the "arguments" keyword edge case.

    matlab 
    opened by oswald3141 12
  • TestLexers can't associate more than one file extension per module

    Lexilla's test runner can't parse a lexer.* key containing multiple file extensions, in the manner of a file.patterns definition: lexer.*.ext1;*.ext2;*.ext3=lexmodule

    While acknowledging the runner's necessarily limited functionality, I think even a rudimentary substitute for file.patterns may prove useful when testing lexers that specify multiple languages, or mutually incompatible dialects and standards of a single language. A suite of test files with various extensions could then share a common set of static properties, optionally supplemented by conditional ones.

    For example, a downstream application may want to implement an option suppressing JSON-LD constructs in plain JSON files. A contributor writes a pair of test files with similar content, one with the IANA-designated *.jsonld extension, the other ending in *.json. A single *.properties file would be sufficient if both extensions could be mapped to the JSON lexer:

    lexer.*.json;*.jsonld=json
    
    # common properties
    fold=1
    # etc.
    
    match x.jsonld
        # properties unique to JSON-LD
    
    match x.json
        # properties unique to JSON
    

    This patch extends PropertyMap::GetPropertyForFile to also check for a delimited list of file extensions: 0001-TestLexers-Multi-File-Exts.diff.txt
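
    A minimal Python sketch of the lookup idea, assuming the part of the key after lexer. is a ;-separated list of glob patterns. The function name lexer_for_file is hypothetical, and the code does not reproduce PropertyMap::GetPropertyForFile or the attached patch.

```python
from fnmatch import fnmatchcase


def lexer_for_file(properties, file_name):
    """Return the lexer whose 'lexer.<patterns>' key matches file_name.

    <patterns> is a ';'-separated list of globs such as '*.json;*.jsonld'.
    """
    prefix = "lexer."
    for key, lexer in properties.items():
        if not key.startswith(prefix):
            continue
        patterns = key[len(prefix):].split(";")
        if any(fnmatchcase(file_name, pattern) for pattern in patterns):
            return lexer
    return None


props = {"lexer.*.json;*.jsonld": "json", "fold": "1"}
print(lexer_for_file(props, "x.jsonld"))
```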

    testing 
    opened by rdipardo 11
  • Don't highlight match and case as keywords in contexts where they probably aren't used as keywords

    Changes the Python lexer to classify match and case as keywords if they are in the keywords set and are in a position where they are probably keywords and not identifiers -- at the start of the line and not followed by a '=' or '.'.

    There are ambiguous cases where match / case could be a keyword or an identifier, and these are classified as a keyword. This could be resolved by scanning ahead for a terminating ':', but it would require something close to a Python tokenizer, and match / case should probably be classified as keywords while the line is being entered. An example is the pair of lines 'match (x)' and 'match (x):' -- match is an identifier in the first, but a keyword in the second.

    There are cases where the statement starts on the prior line that aren't handled correctly, but should be rare and would be difficult to fix within the current approach.

    I opted not to add an option to control this, but could if someone wants to highlight all match and case symbols as keywords.

    Note that match and case were added to python/SciTE.properties for the tests; this doesn't seem to affect the other tests.
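
    The rule described above can be sketched in a few lines of Python. This is only an illustration of the heuristic (first word on the line, not followed by '=' or '.'); the name soft_keyword_at_line_start is hypothetical and the code is not taken from the lexer.

```python
import re

SOFT_KEYWORDS = {"match", "case"}

# First word on the line, then the first non-blank character after it (if any).
FIRST_WORD = re.compile(r"\s*([A-Za-z_]\w*)\s*(.?)")


def soft_keyword_at_line_start(line):
    """Return 'match' or 'case' if the line probably starts with that keyword."""
    found = FIRST_WORD.match(line)
    if not found:
        return None
    word, following = found.groups()
    if word in SOFT_KEYWORDS and following not in ("=", "."):
        return word
    return None


for line in ["match command:", "match = 3", "match.group(0)", "match (x)"]:
    print(repr(line), soft_keyword_at_line_start(line))
```

    As the description notes, 'match (x)' is still reported as a keyword because the sketch does not scan ahead for a terminating ':'.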

    python committed 
    opened by jpe 2
  • Add Objeck lexer

    Objeck is similar to C/C++ in syntax, but there are minor differences that are annoying enough to make the C/C++ lexer unusable for it. Objeck's variables have the prefix @ (just like PHP's variables have the prefix $). Objeck uses # for single-line comments like Ruby, but for multi-line comments it uses #~ and ~#, inspired by the /* and */ syntax of C/C++.

    This is Objeck: https://github.com/objeck/objeck-lang

    new lexer 
    opened by nsgnkhibdk2cls0f 0
  • (Matlab) Wrong syntax highlighting for numeric literals

    When using Notepad++ v8.4.7 (I don't know which version of lexilla it uses), and editing a file with Matlab syntax highlighting, the numeric literals are not properly colored:

    d    = 123;
    x    = 0x123ABC;
    b    = 0b010101;
    xs64 = 0x2As64;
    xs32 = 0x2As32;
    xs16 = 0x2As16;
    xs8  = 0x2As8;
    xu64 = 0x2Au64;
    xu32 = 0x2Au32;
    xu16 = 0x2Au16;
    xu8  = 0x2Au8;
    bs64 = 0b10s64;
    bs32 = 0b10s32;
    bs16 = 0b10s16;
    bs8  = 0b10s8;
    bu64 = 0b10u64;
    bu32 = 0b10u32;
    bu16 = 0b10u16;
    bu8  = 0b10u8;
    c = .1;
    c = 1.1;
    c = .1e1;
    c = 1.1e1;
    c = 1e1;
    c = 1i;
    c = 1j;
    c = .1e2j;
    c = 1e2j;
    

    For non-decimal literals, only the leading 0 is colored, whereas the whole literal should be - maybe including the optional suffix. The suffixes above (eight for fixed-width integers, and two for imaginary numbers) are all the ones I know about. There are no suffixes for 128-bit literals, nor a prefix for octal.

    For the documentation, see here: https://www.mathworks.com/help/matlab/matlab_prog/specify-hexadecimal-and-binary-numbers.html I've tried all the literals shown above, and indeed all of them are recognized by MATLAB, though Matlab's editor doesn't highlight numbers.

    Octave seems to support the prefixes as well: https://docs.octave.org/v6.3.0/Numeric-Data-Types.html but I don't know about its support for the suffixes.

    Regarding complex literals: any decimal or floating point literal - but not binary/hex literals - can have an "i" or "j" suffix, which makes the number imaginary. Complex numbers are just the sum of a real part and an imaginary part; they're an expression rather than a literal, so they don't need to be treated specially. Also, a plain "i" or "j" evaluates to "1i", but since they can be reassigned as regular variables, I wouldn't color them as numeric literals. E.g.

    a = i; % a == 1i
    b = 1i; % b == 1i
    i = 1;
    a = i; % a == 1  "surprising"
    b = 1i; % b == 1i  still working
    c = 0x23i % causes "Error: Invalid digit in hexadecimal literal. Supported hexadecimal digits are 0-9 and A-F. Supported type suffixes are u8, u16, u32, u64, and s8, s16, s32, s64."
    

    Despite the error message, hex literals also accept lowercase a-f. The suffix and prefix are both case-insensitive too.
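
    The literal forms listed so far can be summarised with two regular expressions. This Python sketch only covers the forms shown in this report and is an assumption about how they could be recognised, not the MATLAB grammar or the lexer's code; the name is_numeric_literal is hypothetical.

```python
import re

# Hex or binary digits with an optional fixed-width integer suffix.
# Prefix, digits and suffix are matched case-insensitively.
NON_DECIMAL = re.compile(
    r"0(?:x[0-9a-f]+|b[01]+)(?:[su](?:8|16|32|64))?", re.IGNORECASE
)

# Decimal or floating point literal with an optional imaginary suffix.
DECIMAL = re.compile(r"(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?[ij]?")


def is_numeric_literal(text):
    """Return True if text is one of the literal forms listed in the report."""
    return bool(NON_DECIMAL.fullmatch(text) or DECIMAL.fullmatch(text))


for text in ["0x2As64", "0b10u8", ".1e2j", "0x23i", "i"]:
    print(text, is_numeric_literal(text))
```

    The imaginary suffix is deliberately left out of the non-decimal pattern, so 0x23i is rejected, and a plain i or j never matches.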

    Then there's the topic of "constants" (which are miscellaneous "literals" - though more properly they are re-assignable variables that come pre-bound to default values):

    eps
    Inf
    inf
    intmax
    intmin
    namelengthmax
    realmax
    realmin
    pi
    NaN
    nan
    NaT
    false
    true
    

    with the added complication that some but not all of them are also functions that can be used to build matrices, e.g.

    n = nan(2, 3, 'single');
    f = false(5, 2);
    p = pi(3,4); % error "Too many input arguments."
    

    and also the fact that they are case-sensitive:

    k = inf;
    k = +inf;
    k = -inf;
    k = Inf;
    k = +Inf;
    k = -Inf;
    k = inF; % error
    k = INF; % error
    k = InF; % error
    t = NaT; % ok, it's "not-a-datetime"
    t = nat; % error
    

    All of these are reassignable too:

    a = inf; % a is infinity
    inf = 5;
    a = inf; % a == 5
    a = Inf; % a is infinity
    

    So it's probably better to just treat them as regular identifiers.

    In my opinion, the correct behavior for lexilla should be to either 1) color the whole literal without the suffix, or 2) color the whole literal including the suffix, but not a plain "i" or "j".

    You might want to look here for a more detailed description of Matlab's grammar: https://github.com/mathworks/MATLAB-Language-grammar/blob/master/Matlab.tmbundle/Syntaxes/MATLAB.tmLanguage

    matlab committed 
    opened by stefano-zanotti-88 5
  • [JS] exclude ${} from backtick highlighting

    Is it possible to not highlight the ${} and text inside of it, in a backtick string in Javascript? To improve readability.

    Example of Notepad++ (and the double quote string for comparison): image

    Example of how it could look like: image

    duplicate cpp javascript 
    opened by TomRobert 1
  • Gcode lexer

    Hello, I am thinking of creating a lexer for gcode. It has a fairly straightforward format: single-character commands followed directly by a number. The main commands (e.g. G0 G1 M3 M5) are consistent, so they could easily be handled with a set of keywords. However, coordinates, feed speeds, and other commands are different. These look like Z-11.009 X1.1 Y0.23 F250. Comment-only blocks start with ';' and end with EOL. Please correct me if I'm wrong, but I don't think there is an existing lexer that would handle this type of variable situation.

    Gcode is very simple however the code can be huge and traversing it can be awkward. Syntax highlighting could help immensely. Would this be something acceptable to add to the existing lexers?
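
    The format described above can be sketched as a small Python tokenizer. The token kinds, the choice to treat G and M words as commands, and the name tokenize_gcode_line are assumptions made for illustration; this is not a Lexilla lexer.

```python
import re

# A word is one letter followed directly by a signed decimal number.
WORD = re.compile(r"([A-Za-z])([+-]?(?:\d+\.?\d*|\.\d+))")


def tokenize_gcode_line(line):
    """Split one line into (kind, text) tokens.

    Kinds are 'command' (G and M words), 'parameter' (any other word)
    and 'comment' (from ';' to the end of the line).
    """
    code, semicolon, comment = line.partition(";")
    tokens = []
    for letter, number in WORD.findall(code):
        kind = "command" if letter.upper() in "GM" else "parameter"
        tokens.append((kind, letter + number))
    if semicolon:
        tokens.append(("comment", semicolon + comment))
    return tokens


print(tokenize_gcode_line("G1 Z-11.009 X1.1 Y0.23 F250 ; move"))
```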

    new lexer 
    opened by Yardie- 7
Owner
Scintilla
Source code editing and lexical analysis software components