Own syntax schemes

Postby Taneeda » Thu Jul 16, 2009 12:47 pm

Hi @ all,

Is it possible to create your own syntax highlighting schemes for RSyntaxTextArea?

Greetz Taneeda

Re: Own syntax schemes

Postby robert » Fri Jul 17, 2009 9:01 pm

It's not straightforward to add your own syntax schemes at the moment, unfortunately. RSTA's design does not lend itself well to the easy creation of syntax schemes (actually adding an external syntax scheme at runtime is easy, it's just the creation of the scheme that's the problem).

If it's a common language that others in the "real world" would find beneficial, I'd be happy to look into creating it and adding it to the official distribution. When I find time, I'd like to add support for wiki markup, JavaFX, ActionScript, Scala and Clojure. If it's an "in-house" language things get stickier.

I'd like to create a small app that would allow you to generate syntax scheme classes. It would be GPL and come with JFlex pre-packaged. You'd input parameters into a UI describing your language and it would pop out a .java (and optionally compiled .class) that implemented your language. Initially you'd only be able to define simple, common things - strings, single and multi-line comments, keywords. Definitely enough to do C-style languages. If it met with success I could work on adding support for more complex language constructs.

But if you just have a suggestion for a language to add to RSTA - just let me know. :)

Also, just an FYI - if you think RSTA is getting too big as a library for your project, you can remove the *TokenMaker classes that your app doesn't need from the jar file. It won't hurt RSTA - all you should need to keep is PlainTextTokenMaker, any Abstract*TokenMakers and the specific TokenMakers for the languages you want to support. The bulk of RSTA's size is in the *TokenMaker classes, which define the language support.
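The keep/remove rule described above can be sketched as a small filter that a jar-repacking script might call for each entry. This is only an illustration under assumptions: the class name TokenMakerJarFilter and the method keep(...) are made up for this sketch, and it assumes that every class whose simple name ends in "TokenMaker" is a language scanner (except the TokenMaker interface itself).

```java
import java.util.Set;

public class TokenMakerJarFilter {

    /**
     * Decides whether a jar entry should survive trimming.
     * entryName is a jar path ending in e.g. "modes/PerlTokenMaker.class";
     * wanted holds the simple class names of the scanners the app really uses.
     * (Hypothetical helper, not part of RSTA.)
     */
    public static boolean keep(String entryName, Set<String> wanted) {
        if (!entryName.endsWith(".class")) {
            return true; // resources, manifest, directory entries
        }
        // Reduce the jar path to a simple class name.
        String simple = entryName.substring(entryName.lastIndexOf('/') + 1,
                entryName.length() - ".class".length());
        // Treat inner classes (Foo$Bar, Foo$1) like their outer class.
        int dollar = simple.indexOf('$');
        if (dollar >= 0) {
            simple = simple.substring(0, dollar);
        }
        if (!simple.endsWith("TokenMaker")) {
            return true; // not a language scanner, so leave it alone
        }
        return simple.equals("TokenMaker")           // the interface itself
                || simple.equals("PlainTextTokenMaker")
                || simple.startsWith("Abstract")     // Abstract*TokenMaker base classes
                || wanted.contains(simple);          // languages the app supports
    }
}
```

Running this against a copy of the jar first and testing the app afterwards is advisable, since the naming assumption above is not guaranteed to hold for every RSTA release.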

Re: Own syntax schemes

Postby hookahey » Thu Jul 23, 2009 8:48 am

Hi

I have just gone through the process of creating my own syntax scheme, so I'll briefly describe my experience.
First, have a look at JFlex (http://jflex.de/manual.html) and some tutorials (http://www.javaranch.com/journal/2008/04/handling-file-formats-introducing-lexers.html and maybe http://bmrc.berkeley.edu/courseware/cs164/spring98/proj/jlex/tutorial.html) to get accustomed to it. Luckily, if the language is popular you may even find an existing JFlex implementation for it. Then you only need to change it to return RSyntaxTextArea Tokens.
Then look at the examples that come with the RSyntaxTextArea source in the org.fife.ui.rsyntaxtextarea.modes package and take one of them as a starting point (copy over the Java methods, and maybe some of the regex macros, or just copy everything :) ). With a fairly simple language you just need to look at the predefined token types in the Token class and return the right one in the action associated with the regex for each language element. A more complicated syntax needs several JFlex states and logic to switch from one to another.
Next, use JFlex to translate the file into a Java file and follow the instructions in the class Javadoc of the existing syntax styles (e.g. remove the zzRefill() and yyreset() methods generated by JFlex). Don't forget to declare the right package in the lexer file. That's it, mostly. Now you can compile the file and test it by adding it to your RSyntaxTextArea with this code:

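java code:

import org.fife.ui.rsyntaxtextarea.AbstractTokenMakerFactory;
import org.fife.ui.rsyntaxtextarea.TokenMakerFactory;

// Register the new scanner class under a style key of your choosing.
AbstractTokenMakerFactory atmf = (AbstractTokenMakerFactory) TokenMakerFactory.getDefaultInstance();
atmf.putMapping("a key string denoting your style", "fully.qualified.classNameOfYourSyntaxStyle");
TokenMakerFactory.setDefaultInstance(atmf);
// setSyntaxEditingStyle() is an instance method; textArea is your RSyntaxTextArea.
textArea.setSyntaxEditingStyle("a key string denoting your style");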

That covers a simple language. It gets a bit more complicated as soon as you want to use states with JFlex, because then you have to change the switch in the getTokenList(...) method in the lexer file to keep track of the state transitions. For that it is best to define your own token types that start at the int value -1 and count downward, so they don't clash with the ones defined in RSyntaxTextArea. Look at XMLTokenMaker for a good example of how and where to return your own tokens and how to implement the transitions in the method mentioned above.
The weird thing about using states in JFlex with RSyntaxTextArea is that BOL (beginning of line) and EOL (end of line) don't seem to exist; EOF (end of file) is returned instead. Maybe I didn't get it right, but it was a big annoyance because it worked in plain JFlex but not in RSyntaxTextArea.
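To make the state-tracking idea concrete, here is a library-free sketch of the technique: each line is scanned on its own, and a negative "internal" token type carries the lexer state from the end of one line to the start of the next. Everything here (LineStateSketch, scanLine, LineResult, the string token labels) is hypothetical and is not RSTA's API; in a real scanner the equivalent decision lives in the switch inside getTokenList(...). Scanning one line at a time is also one plausible reason, stated as an assumption, why a scanner would see EOF where a line end was expected.

```java
import java.util.ArrayList;
import java.util.List;

public class LineStateSketch {

    // Hypothetical internal token type: negative, so it cannot clash
    // with the library's own token type constants.
    public static final int INTERNAL_IN_COMMENT = -1;
    // Stand-in for "the previous line ended in the default state".
    public static final int NULL_STATE = 0;

    /** Tokens found on one line plus the state to hand to the next line. */
    public static final class LineResult {
        public final List<String> tokens = new ArrayList<>();
        public int endState = NULL_STATE;
    }

    /** Scans one line; initialState says how the previous line ended. */
    public static LineResult scanLine(String line, int initialState) {
        LineResult result = new LineResult();
        int pos = 0;
        if (initialState == INTERNAL_IN_COMMENT) {
            // Resume inside a multi-line comment opened on an earlier line.
            int end = line.indexOf("*/");
            if (end < 0) {
                if (!line.isEmpty()) {
                    result.tokens.add("COMMENT:" + line);
                }
                result.endState = INTERNAL_IN_COMMENT; // still open at end of input
                return result;
            }
            result.tokens.add("COMMENT:" + line.substring(0, end + 2));
            pos = end + 2;
        }
        while (pos < line.length()) {
            int start = line.indexOf("/*", pos);
            if (start < 0) {
                result.tokens.add("TEXT:" + line.substring(pos));
                break;
            }
            if (start > pos) {
                result.tokens.add("TEXT:" + line.substring(pos, start));
            }
            int end = line.indexOf("*/", start + 2);
            if (end < 0) {
                // Comment is not closed on this line: remember that for the next one.
                result.tokens.add("COMMENT:" + line.substring(start));
                result.endState = INTERNAL_IN_COMMENT;
                break;
            }
            result.tokens.add("COMMENT:" + line.substring(start, end + 2));
            pos = end + 2;
        }
        return result;
    }
}
```

A caller would feed each line's endState into the next call as initialState, which is the role the initial token type plays in the switch mentioned above.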

Hope that helped
Hookahey

Re: Own syntax schemes

Postby abgorn » Fri Jul 24, 2009 7:25 pm

hookahey wrote:I have just gone through the process of creating an own syntax scheme, so I will tell shortly about my experience. [...]

I love you.

Re: Own syntax schemes

Postby robert » Fri Jul 24, 2009 8:59 pm

Wow, hookahey hit it right on the head! His instructions are right on the money.

Unfortunately, because of the way RSTA bastardizes standard JFlex functionality, simply writing a standard JFlex scanner isn't sufficient. Follow hookahey's advice - start with a standard RSTA scanner, modify it to suit your needs, generate a .java from it with JFlex, and remove the 2 duplicated methods as described in the Javadoc comment at the top of the class. Then just add it to your project. You can start using it at runtime with the method he describes.

And I actually have started working on a GUI frontend for this whole business. For those that don't want to learn JFlex, this little app will allow you to generate scanners for your own languages with ease. It'll even let you test it out, by launching an RSTA instance using your scanner in a window, so you can interactively test at build time! It'll generate the .java for you (e.g. the JFlex output, with any needed modifications), so you can just take that and drop it into your project.

Expect an SVN repository for this side-project soon, as well as some screenshots on the blog site!

Re: Own syntax schemes

Postby syntax keyword additions » Wed Aug 19, 2009 12:21 pm

greetings,

i apologize if this is a duplicate question, but i need to add additional keywords that the text area will recognize as commands, in addition to those of java. i presently use jedit's editor and add additional keywords to be highlighted as my commands. would i need to recreate a complete new language parser to add these keywords to those of the java parser, or by using the gui you intend to create to generate the java files could i just add the new keyword additions? other than this one tiny limitation this component is a perfect solution for my needs. The last post on this subject is recent, and i checked the blog, but has there been any additional coding that has gone into this effort? as soon as this functionality is exposed i plan on cutting over to your component.....very nice work!

thanks much,

mike

Re: Own syntax schemes

Postby hubersn » Wed Aug 19, 2009 3:47 pm

Just a quick hint for people with the need for simple syntax colouring that does not need the complexity of a full parser.

Start with the *TokenMaker.java implementations that are not created by JFlex. WindowsBatchTokenMaker is a very simple one, UnixShellTokenMaker is slightly more advanced. It is very easy to start from there and "roll your own". Just change the list of reserved words and change the comment handling, and - hey presto! - here's your own syntax colouring scheme.
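As a rough, library-free illustration of the "reserved words plus comment handling" approach, the sketch below classifies the words on a line against a configurable keyword set and a configurable line-comment prefix. It does not extend any RSTA class; SimpleSchemeSketch, classify(...) and the string labels are invented for this example, and a real hand-written TokenMaker would emit RSTA tokens instead of strings.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SimpleSchemeSketch {

    // The two things to adapt for a new language:
    private final Set<String> reservedWords;   // its keywords
    private final String lineCommentStart;     // its line-comment prefix

    public SimpleSchemeSketch(Set<String> reservedWords, String lineCommentStart) {
        if (lineCommentStart.isEmpty()) {
            throw new IllegalArgumentException("comment prefix must not be empty");
        }
        this.reservedWords = new HashSet<>(reservedWords);
        this.lineCommentStart = lineCommentStart;
    }

    /** Returns labelled tokens such as "RESERVED_WORD:if" for one line of text. */
    public List<String> classify(String line) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < line.length()) {
            if (line.startsWith(lineCommentStart, pos)) {
                out.add("COMMENT:" + line.substring(pos)); // rest of line is a comment
                break;
            }
            char c = line.charAt(pos);
            if (Character.isLetter(c) || c == '_') {
                int end = pos + 1;
                while (end < line.length()
                        && (Character.isLetterOrDigit(line.charAt(end)) || line.charAt(end) == '_')) {
                    end++;
                }
                String word = line.substring(pos, end);
                out.add((reservedWords.contains(word) ? "RESERVED_WORD:" : "IDENTIFIER:") + word);
                pos = end;
            } else {
                pos++; // whitespace and punctuation are skipped in this sketch
            }
        }
        return out;
    }
}
```

Swapping the keyword set and the comment prefix is all it takes to get a different "language" out of this sketch, which mirrors the suggestion above.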

hubersn

Re: Own syntax schemes

Postby robert » Wed Aug 19, 2009 5:29 pm

syntax keyword additions wrote:would i need to recreate a complete new language parser to add these keywords to those of the java parser


Yes, at present, short of hubersn's suggestion, you'd unfortunately have to create an entirely new parser class that mimicked the Java one, but with more keywords. This request (to add keywords to an already-done language) does come up every so often, so I've added an RFE on SourceForge so this is tracked.

syntax keyword additions wrote:or by using the gui you intend to create to generate the java files could i just add the new keyword additions?


With that tool you would be able to create a scanner that mimicked Java, but with new keywords, quite easily. Luckily Java isn't a complicated language to parse, so a basic scanner is easy to generate. :D

syntax keyword additions wrote:other than this one tiny limitation this component is a perfect solution for my needs.


Awesome! I'm glad you find it useful. :D

syntax keyword additions wrote:The last post on this subject is recent, and i checked the blog, but has there been any additional coding that has gone into this effort?


It's been worked on a little bit, but recently I've gotten a little over-zealous with the focusable tooltips and the spelling parser. There's currently a proof-of-concept app that generates a Java source scanner for RSTA. It allows you to define keywords, multi-line comments, and EOL comments. Thanks for the ping about the blog post; I meant to write about it earlier, so I'll put up what I have soon.

Re: Own syntax schemes

Postby mike » Thu Aug 20, 2009 1:52 pm

thx so much to all for suggestions and advice.

i was thinking that if i had a JavaTokenMaker class that was not generated by JFlex i could adapt it to parse my beanshell folder and get each beanshell command name, adding it to the reserved word list. this would essentially allow reserved words to be loaded at start up...still a little less flexible than jedit, where they can be added dynamically as the user adds a new command, but should be good enough. the ideal solution would be to have the TokenMaker class accept dynamic additions, but i suspect it is read at load time, not allowing this...is this a correct assumption?
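The folder-scanning half of this idea can be sketched without touching RSTA at all. The example below turns script file names into a set of command names that a hand-written scanner could treat as extra reserved words. The names CommandNameScanner, commandNames(...) and scan(...) are hypothetical, and the ".bsh" extension used in the usage note is an assumption about how the command files are named; how the resulting set is fed into a TokenMaker is not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class CommandNameScanner {

    /** Maps file names to command names, e.g. "print.bsh" -> "print". */
    public static Set<String> commandNames(Iterable<String> fileNames, String extension) {
        Set<String> names = new TreeSet<>();
        for (String name : fileNames) {
            // Ignore other file types and files that consist of the extension only.
            if (name.endsWith(extension) && name.length() > extension.length()) {
                names.add(name.substring(0, name.length() - extension.length()));
            }
        }
        return names;
    }

    /** Lists the regular files directly inside dir and extracts command names. */
    public static Set<String> scan(Path dir, String extension) {
        List<String> fileNames = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    fileNames.add(entry.getFileName().toString());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return commandNames(fileNames, extension);
    }
}
```

Calling scan(folder, ".bsh") at start-up, and again whenever the folder changes, would yield the current command list under these assumptions.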

so many thanks for the help. it has worried me for some time to be using the dead jedit code, and replacing it with this component will be great :)

mike

Re: Own syntax schemes

Postby mike » Thu Aug 20, 2009 1:57 pm

or....

if my TokenMaker class dynamically parses a folder for file names, adding them to the reserved word list, and that folder changes, can i just call setSyntaxEditingStyle() to reload the TokenMaker and pick up the new tokens? if so it will be exactly what i need....

thx,

mike
