Nested multiline-comments

Postby Stefan » Fri Jan 29, 2010 12:28 pm

Hello,

The programming language I want to edit with RSyntaxTextArea supports nested multiline comments. Here is an example:

/* outer multiline comment
/* inner multiline comment */
*/

Currently I can't find an easy way to track the nesting level of the comments.

Has anyone already encountered this requirement, and do you have any suggestions for how it could be implemented?

Greetings
Stefan

Re: Nested multiline-comments

Postby robert » Sun Jan 31, 2010 4:44 am

The only way to implement this currently is to "cheat" and have one state in your TokenMaker implementation per level of "nested comment." So, for example, you could create 10 states of multiline comments and support comments nested up to 10 levels deep. Anything deeper than that would cause a highlighting error (once the 10 tracked levels have been closed, the remaining "end" markers of the outermost MLCs wouldn't be colored as comments, if that makes sense). In practice this would probably be more than sufficient: the common case would likely be comments nested 2 deep, but not much more.
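To make the limitation concrete, here is a minimal sketch of a depth counter that saturates at 10 levels. It is not RSTA code: the class `BoundedCommentDepth` and its methods are hypothetical names invented for illustration, and a real TokenMaker would express the levels as lexer states rather than as an int.

```java
/** Hypothetical illustration, not part of RSTA: comment depth capped at a fixed number of states. */
final class BoundedCommentDepth {

    /** Number of distinct "nested comment" states, as in the example above. */
    static final int MAX_DEPTH = 10;

    private BoundedCommentDepth() {}

    /** Entering a comment moves one state deeper, but never past the last state. */
    static int open(int depth) {
        return Math.min(depth + 1, MAX_DEPTH);
    }

    /** Leaving a comment moves one state back toward "not in a comment" (0). */
    static int close(int depth) {
        return Math.max(depth - 1, 0);
    }

    /**
     * Opens 'nesting' comments and then closes them all again. Returns how many
     * closing delimiters arrive after the counter has already dropped back to 0,
     * i.e. how many would be highlighted as ordinary code instead of comment.
     */
    static int strayClosers(int nesting) {
        int depth = 0;
        for (int i = 0; i < nesting; i++) {
            depth = open(depth);
        }
        int stray = 0;
        for (int i = 0; i < nesting; i++) {
            if (depth == 0) {
                stray++;
            } else {
                depth = close(depth);
            }
        }
        return stray;
    }
}
```

With this sketch, nesting up to 10 levels closes cleanly, while 12 levels leaves the last 2 closing delimiters outside any comment state.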

If you're creating your own TokenMaker implementation, feel free to email me and I can help you get it implemented. There are languages I would like to add proper support for in RSTA that need this feature.

Arbitrary nesting of comments is something RSTA should properly support, however, so please add a Feature Request so it can be tracked and I don't forget about it.

Re: Nested multiline-comments

Postby Guest » Wed Feb 03, 2010 9:38 am

Hello Robert,

you are right, support for arbitrary nesting is best implemented directly in RSTA.

One could add nesting-level information to multiline-comment tokens (in the Token class) and pass it to the getTokenList() method in addition to the initialTokenType, so that the appropriate nesting information can be set.

Thanks for your great work, I will add a Feature Request.

Greetings
Stefan

Re: Nested multiline-comments

Postby robert » Wed Feb 03, 2010 1:47 pm

I'm thinking that even better than explicit nested comment support would be a field that a TokenMaker can use for arbitrary per-line information, besides its last token type. Each TokenMaker could use this field for whatever info it wants. I hate to make the API any more complex than it already is (not that most users create their own TokenMakers, but still), so I'll have to think over the best way to implement this.

Thanks for the feedback!

Re: Nested multiline-comments

Postby Guest » Wed Feb 03, 2010 8:56 pm

Hi Robert,

By "field," do you mean a field of the Java type Object?

Greetings from Munich
Stefan

Re: Nested multiline-comments

Postby robert » Thu Feb 04, 2010 1:47 pm

I was thinking along the lines of a new method to go alongside getLastTokenTypeOnLine(), something like "getExtraDataForLine(int)". This would return arbitrary data that is meaningful to the current TokenMaker, but could vary from one TokenMaker to the next. The current TokenMakers, for example, wouldn't need it. This information could be used to specify things such as nested comment depth, the current "section" if a language's source code is divided into discrete sections, etc.

The implementation wouldn't use a new int per line; rather, it would use space in the current "lastTokenTypeOnLine" list of ints. There would be a new limit on the number of states (say 256, which should be more than enough), taking up 8 bits, and the remaining 24 bits would be used for the "extra data." So there would be no extra space or time overhead for languages that don't use the feature.
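A rough sketch of that packing, for anyone following along. Nothing here exists in RSTA: `LineStatePacking`, `pack`, `state` and `extra` are hypothetical names, and the sketch assumes the states have been renumbered to 0..255 (today's internal states are negative ints, so they would need remapping first).

```java
/** Hypothetical illustration, not part of RSTA: one int holding an 8-bit state plus 24 bits of extra data. */
final class LineStatePacking {

    private static final int STATE_BITS = 8;
    private static final int STATE_MASK = (1 << STATE_BITS) - 1;   // 0xFF, i.e. 256 states
    private static final int EXTRA_MAX = (1 << 24) - 1;            // largest 24-bit value

    private LineStatePacking() {}

    /** Packs a state (0..255) and extra data (0..2^24-1) into a single int. */
    static int pack(int state, int extra) {
        if (state < 0 || state > STATE_MASK) {
            throw new IllegalArgumentException("state out of range: " + state);
        }
        if (extra < 0 || extra > EXTRA_MAX) {
            throw new IllegalArgumentException("extra data out of range: " + extra);
        }
        return (extra << STATE_BITS) | state;
    }

    /** Extracts the low 8 bits: the end-of-line state. */
    static int state(int packed) {
        return packed & STATE_MASK;
    }

    /** Extracts the high 24 bits; the unsigned shift keeps the result non-negative. */
    static int extra(int packed) {
        return packed >>> STATE_BITS;
    }
}
```

A TokenMaker that never stores extra data would always see 0 in the upper bits, so the packed value equals the plain state and nothing changes for it.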

I'm starting to question the need for this, though. Implementing it would require slightly modifying, recompiling and re-testing several current TokenMakers, and the thing is, I'm still not convinced this cannot be done with the current implementation. For example, instead of my previous proposal of a fixed number of states representing "comment depth," say you had a single state (i.e. token type) mean "comment, 1 deep," with each successively lower value meaning one extra comment layer. For example, since negative token types are currently used for states internal to a particular TokenMaker:

Code:
/**
 * Token type this TokenMaker uses as the "last token type on line" for multi-line comments 1 level
 * deep.  Anything less than this is used to specify more layers; i.e. "-11" means "2 levels deep,"
 * "-12" means "3 levels deep," etc.  This allows arbitrary nested comment depth.  Any other internal
 * states would have to have values in the range -1..-9 in this case.
 */
public static final int INTERNAL_MLC_DEPTH_1 = -10;


Then, just end un-ended MLC lines with (INTERNAL_MLC_DEPTH_1-depth+1) instead of COMMENT_MULTILINE. Your parsing code would decode the lastTokenType for the previous line, set a "depth" field, then parse the current line with this knowledge.
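As a minimal sketch of that encode/decode step, assuming the value -10 from the constant above: the class `NestedCommentState` and its methods are hypothetical and not part of RSTA, and the line scanner deliberately ignores string literals and single-line comments, which a real TokenMaker would have to handle.

```java
/** Hypothetical helper, not part of RSTA: maps nested-comment depth to an internal token type and back. */
final class NestedCommentState {

    /** Internal token type for a multi-line comment 1 level deep (value taken from the post above). */
    static final int INTERNAL_MLC_DEPTH_1 = -10;

    private NestedCommentState() {}

    /** Encodes a depth >= 1 as an end-of-line token type: 1 -> -10, 2 -> -11, 3 -> -12, ... */
    static int encode(int depth) {
        if (depth < 1) {
            throw new IllegalArgumentException("depth must be >= 1: " + depth);
        }
        return INTERNAL_MLC_DEPTH_1 - depth + 1;
    }

    /** Decodes the previous line's last token type; returns 0 if it is not a nested-comment state. */
    static int decode(int lastTokenType) {
        return lastTokenType <= INTERNAL_MLC_DEPTH_1
                ? INTERNAL_MLC_DEPTH_1 - lastTokenType + 1
                : 0;
    }

    /** Scans one line, starting at the given depth, and returns the depth at the end of the line. */
    static int depthAfterLine(String line, int depth) {
        int i = 0;
        while (i + 1 < line.length()) {
            char c = line.charAt(i);
            char next = line.charAt(i + 1);
            if (c == '/' && next == '*') {
                depth++;            // comment opener: one level deeper
                i += 2;
            } else if (c == '*' && next == '/' && depth > 0) {
                depth--;            // comment closer: one level back out
                i += 2;
            } else {
                i++;
            }
        }
        return depth;
    }
}
```

For the three-line example from the first post, the depth at the end of each line would be 1, 1 and 0, so the first two lines would end with the token type encode(1) and the third with whatever "normal" end-of-line type the TokenMaker uses.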

You mentioned earlier that you felt this functionality should be "built in" to RSTA, and while I agree to a certain extent, if a language supports nested comments, its TokenMaker will need some code to support them whether RSTA has any built-in support or not. So I'm not sure how having support for a "comment depth" field, or an arbitrary info-per-line field, really helps over my proposal above.

Or am I missing something? :D Suggestions are welcome of course.
