Individual rules exist for the following scopes:
Definition | Indented? | Quoted? | Scope | Notes |
---|---|---|---|---|
<<"HTML" | No | Double | text.html.embedded.perl | - |
<<"XML" | No | Double | text.xml.embedded.perl | - |
<<"CSS" | No | Double | text.css.embedded.perl | - |
<<"JAVASCRIPT" | No | Double | text.js.embedded.perl | - |
<<"SQL" | No | Double | source.sql.embedded.perl | - |
<<"POSTSCRIPT" | No | Double | text.postscript.embedded.perl | - |
<<"OTHER" | No | Double | string.unquoted.heredoc.doublequote.perl | This rule assigns $self to the incorrect capture. |
<<'HTML' | No | Single | text.html.embedded.perl | - |
<<'XML' | No | Single | text.xml.embedded.perl | - |
<<'CSS' | No | Single | text.css.embedded.perl | - |
<<'JAVASCRIPT' | No | Single | text.js.embedded.perl | - |
<<'SQL' | No | Single | source.sql.embedded.perl | - |
<<'POSTSCRIPT' | No | Single | text.postscript.embedded.perl | - |
<<'OTHER' | No | Single | string.unquoted.heredoc.quote.perl | This rule assigns $self to the incorrect capture. |
<<\::: | No | Single | string.unquoted.heredoc.quote.perl | This rule assigns $self to the incorrect capture. |
<<`OTHER` | No | Backticks | string.unquoted.heredoc.backtick.perl | This rule assigns $self to the incorrect capture. |
<<HTML | No | None | text.html.embedded.perl | - |
<<XML | No | None | text.xml.embedded.perl | - |
<<CSS | No | None | text.css.embedded.perl | This rule is not present. |
<<JAVASCRIPT | No | None | source.js.embedded.perl | This scope uses "source.js" while its siblings use "text.js". Why? |
<<SQL | No | None | source.sql.embedded.perl | - |
<<POSTSCRIPT | No | None | source.postscript.embedded.perl | This scope uses "source.postscript" while its siblings use "text.postscript". Why? |
<<OTHER | No | None | string.unquoted.heredoc.doublequote.perl | - |
- Fix the 4 rules that assign `$self` to the incorrect capture. It should always match the `(.*)` in the first line after the heredoc part (e.g. in `my @vals = (<<HTML, 1);`, `$self` should be interested in the `, 1);` portion of text).
- Add the missing unquoted embedded CSS case.
- Confirm whether the scope name differences among siblings (i.e. `text` vs. `source`) are intended, or whether one or the other is in error.
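
To make the first item concrete, here is a minimal sketch of how the capture groups line up. It uses Python's `re` module rather than the Oniguruma engine that TextMate grammars run on, so it only illustrates group numbering. The sample lines and the helper name `trailing_text` are made up for illustration. The `"HTML"` pattern is the `begin` pattern of the rule quoted at the end of this write-up, and the `"OTHER"` pattern is the double-quoted one listed further down.

```python
import re

# begin pattern of the <<"HTML" rule: the trailing text is group 4.
HTML_BEGIN = re.compile(r'(((<<) *"HTML"))(.*)\n?')
# begin pattern of the generic <<"OTHER" rule: the extra group around the
# delimiter name pushes the trailing text to group 5.
OTHER_BEGIN = re.compile(r'(((<<) *"([^"]*)"))(.*)\n?')


def trailing_text(pattern, line):
    """Return the text matched by the pattern's last group, i.e. the (.*)."""
    match = pattern.search(line)
    return match.group(pattern.groups) if match else None


html_line = 'my @vals = (<<"HTML", 1);\n'
other_line = 'my @vals = (<<"OTHER", 1);\n'

print(HTML_BEGIN.groups, repr(trailing_text(HTML_BEGIN, html_line)))
print(OTHER_BEGIN.groups, repr(trailing_text(OTHER_BEGIN, other_line)))
# Group 4 of the generic pattern is the delimiter name, not the trailing text.
print(repr(OTHER_BEGIN.search(other_line).group(4)))
```

If the `OTHER` rules attach `$self` to capture 4, as the `HTML` rule does, it would land on the delimiter name rather than on `, 1);`. Whether that is the actual cause of the 4 mismatches is a guess and should be checked against the grammar itself.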
There are currently 22 rules used to handle heredocs. It would be nice if they could be refactored into fewer rules.
- There are 3 main types of parsing approaches: double-quoted, single-quoted, and bare. It would be nice if all double-quoted approaches could be unified, all single-quoted approaches could be unified, and all bare approaches could be unified. This would require being able to specify the `contentName` attribute dynamically based on a `begin` capture.
- In order to make the existing rules also work for indented heredocs, instead of duplicating each current rule and making a small tweak, we'd need conditional expressions to work, e.g. `<<(~)?HTML(?(1)\s*)HTML` would allow spaces between the two `HTML`s only if the `~` was present. Despite this being apparently possible, it doesn't seem to work. Can it be done?
- Why do the following approaches diverge?

```
# Double-quoted other (<<"OTHER")
(((<<) *"([^"]*)"))(.*)\n?

# Single-quoted other and craziness
# <<'OTHER'
(((<<) *'([^']*)'))(.*)\n?
# <<\:::
(((<<) *\\((?![=\d\$\( ])[^;,'"`\s\)]*)))(.*)\n?

# Un-quoted other (<<OTHER)
(((<<) *((?![=\d\$\( ])[^;,'"`\s\)]*)))(.*)\n?
```

- In order to correctly identify an indented heredoc, we should be checking that the whitespace portion of the end terminator is matched exactly at the beginning of each line within the indented heredoc, and if not, then we should not match it. For example:

```perl
my @sql_and_bind = (<<~SQL, $id);
    SELECT a, b, c
    FROM the_table
    WHERE id = ?
    SQL
```

is a valid indented heredoc, but the following is not:

```perl
my @sql_and_bind = (<<~SQL, $id);
    SELECT a, b, c
    FROM the_table
  WHERE id = ?
    SQL
```

and neither is:

```perl
my @sql_and_bind = (<<~SQL, $id);
	SELECT a, b, c
	FROM the_table
        WHERE id = ?
	SQL
```

Why is this last case not valid? Because the line for the `WHERE` clause uses eight spaces to indent, while the end terminator uses a tab. Yeah, Perl is that picky. Is this level of parsing possible with TextMate grammars?
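
Whether a grammar can express this is the open question. As a reference for what "matched exactly" means, the rule can be sketched outside the grammar. The snippet below is Python, not a TextMate rule, and `indented_heredoc_is_valid` is a hypothetical helper. It assumes the terminator's leading whitespace must be a literal prefix of every body line, and that completely empty lines are exempt (my reading of perlop, worth double-checking).

```python
import re


def indented_heredoc_is_valid(body_lines, terminator_line, tag="SQL"):
    """Check the <<~ indentation rule for one heredoc body."""
    match = re.fullmatch(r"([ \t]*)" + re.escape(tag), terminator_line)
    if match is None:
        return False  # not a terminator line for this tag
    indent = match.group(1)
    # Tabs and spaces are compared character by character, never expanded.
    return all(line == "" or line.startswith(indent) for line in body_lines)


good = ["    SELECT a, b, c", "    FROM the_table", "    WHERE id = ?"]
short = ["    SELECT a, b, c", "    FROM the_table", "  WHERE id = ?"]
mixed = ["\tSELECT a, b, c", "\tFROM the_table", " " * 8 + "WHERE id = ?"]

print(indented_heredoc_is_valid(good, "    SQL"))   # True
print(indented_heredoc_is_valid(short, "    SQL"))  # False
print(indented_heredoc_is_valid(mixed, "\tSQL"))    # False
```

The check needs the terminator's indentation before it can judge any body line, which is the part that looks hard to express in a `begin`/`end` rule that reads top to bottom.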

```xml
<dict>
  <key>begin</key>
  <string>(((&lt;&lt;) *"HTML"))(.*)\n?</string>
  <key>captures</key>
  <dict>
    <key>1</key>
    <dict>
      <key>name</key>
      <string>punctuation.definition.string.perl</string>
    </dict>
    <key>2</key>
    <dict>
      <key>name</key>
      <string>string.unquoted.heredoc.doublequote.perl</string>
    </dict>
    <key>3</key>
    <dict>
      <key>name</key>
      <string>punctuation.definition.heredoc.perl</string>
    </dict>
    <key>4</key>
    <dict>
      <key>patterns</key>
      <array>
        <dict>
          <key>include</key>
          <string>$self</string>
        </dict>
      </array>
    </dict>
  </dict>
  <key>contentName</key>
  <string>text.html.embedded.perl</string>
  <key>end</key>
  <string>(^HTML$)</string>
  <key>patterns</key>
  <array>
    <dict>
      <key>include</key>
      <string>#escaped_char</string>
    </dict>
    <dict>
      <key>include</key>
      <string>#variable</string>
    </dict>
    <dict>
      <key>include</key>
      <string>text.html.basic</string>
    </dict>
  </array>
</dict>
```
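
Returning to the conditional-expression question: the pattern `<<(~)?HTML(?(1)\s*)HTML` can at least be checked in isolation. The sketch below uses Python's `re` module, which supports the `(?(1)...)` form. It says nothing about whether the Oniguruma build behind a given editor accepts it, so treat it only as a check that the pattern means what is intended. The helper name `allows` and the sample strings are made up for illustration.

```python
import re

# Whitespace between the two HTMLs is permitted only when group 1 (the ~) matched.
CONDITIONAL = re.compile(r"<<(~)?HTML(?(1)\s*)HTML")


def allows(text):
    """True if the whole string matches the conditional pattern."""
    return CONDITIONAL.fullmatch(text) is not None


print(allows("<<~HTML   HTML"))  # True: ~ present, spaces allowed
print(allows("<<HTML   HTML"))   # False: no ~, spaces rejected
print(allows("<<HTMLHTML"))      # True
print(allows("<<~HTMLHTML"))     # True: \s* may match nothing
```

If the same pattern fails inside the grammar, the difference is more likely in the engine or in how the grammar loader handles the pattern than in the pattern's logic, though that is speculation until tested against the editor in question.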