This utility script demonstrates how to use Tree-sitter to parse and traverse JavaScript, TypeScript, and React+TypeScript (TSX) source code. It includes:
📦 Installation instructions for tree-sitter and related language bindings.
pip install tree-sitter=="0.23.1"
pip install tree-sitter-javascript=="0.23.0"
pip install tree-sitter-typescript
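🔧 A utility function to read source code from a file.
def read_file(file_path):
    # Read the whole file as text; UTF-8 is assumed so it matches the bytes passed to the parser.
    with open(file_path, "r", encoding="utf-8") as file:
        return file.read()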
🌲 Parsing logic using the Tree-sitter parser for:
- JavaScript files
from tree_sitter import Language, Parser
import tree_sitter_javascript as jsscript
JS_LANGUAGE = Language(jsscript.language())
js_parser = Parser(JS_LANGUAGE)
JS_CODE = read_file("your_javascript_file_path.js")
js_tree = js_parser.parse(bytes(JS_CODE, "utf8"))
for item in js_tree.root_node.children:
    print(item, item.type, item.start_byte, item.end_byte)
- TypeScript files
from tree_sitter import Language, Parser
import tree_sitter_typescript as tsscript
TS_LANGUAGE = Language(tsscript.language_typescript())
ts_parser = Parser(TS_LANGUAGE)
TS_CODE = read_file("your_typescript_file_path.ts")
ts_tree = ts_parser.parse(bytes(TS_CODE, "utf8"))
for item in ts_tree.root_node.children:
    print(item, item.type, item.start_byte, item.end_byte)
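- React+TypeScript (TSX) files
from tree_sitter import Language, Parser
import tree_sitter_typescript as tsscript
# TSX uses its own grammar, separate from plain TypeScript.
TSX_LANGUAGE = Language(tsscript.language_tsx())
tsx_parser = Parser(TSX_LANGUAGE)
REACT_CODE = read_file("your_typescript_file_path.tsx")
tsx_tree = tsx_parser.parse(bytes(REACT_CODE, "utf8"))
for item in tsx_tree.root_node.children:
    print(item, item.type, item.start_byte, item.end_byte)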
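Because TypeScript and TSX are parsed with separate grammars, one way to avoid feeding a file to the wrong parser is to choose the grammar from the file extension. The following is only a minimal sketch: the `GRAMMAR_BY_SUFFIX` table and the `grammar_for` helper are hypothetical names, not part of the original script, and the set of extensions is an assumption.

```python
from pathlib import Path

# Hypothetical mapping (not part of the original script): file suffix -> grammar name.
GRAMMAR_BY_SUFFIX = {
    ".js": "javascript",
    ".jsx": "javascript",  # the JavaScript grammar also covers JSX
    ".ts": "typescript",
    ".tsx": "tsx",
}


def grammar_for(file_path):
    """Return the grammar name for file_path, or None if the suffix is not listed."""
    suffix = Path(file_path).suffix.lower()
    return GRAMMAR_BY_SUFFIX.get(suffix)
```

The returned name could then be used to pick between `js_parser`, `ts_parser`, and `tsx_parser`.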
🖨️ Simple AST node traversal to print each child node with its type and byte positions.
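The loops in the script only visit the direct children of the root node. To reach nested nodes as well, a recursive depth-first walk is one option. This is a sketch, not part of the original script: `walk` and `format_node` are hypothetical helpers that rely only on the `children`, `type`, `start_byte`, and `end_byte` attributes already used in the script.

```python
def walk(node, depth=0):
    """Yield (depth, node) for node and every descendant, depth-first, parents first."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def format_node(depth, node):
    """Render one node as an indented line with its type and half-open byte range."""
    return f"{'  ' * depth}{node.type} [{node.start_byte}, {node.end_byte})"
```

For example, `for depth, node in walk(js_tree.root_node): print(format_node(depth, node))` would print the whole tree, one node per line. Very deeply nested sources could hit Python's recursion limit, in which case an explicit stack would be the safer choice.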
This script is useful as a starting point for building a custom language-sensitive chunker for a codebase.
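As a rough illustration of how such a chunker might use the byte positions printed above, the sketch below groups consecutive top-level nodes into chunks that stay under a size budget, so that no chunk boundary falls inside a node. The `chunk_spans` function, its `max_bytes` parameter, and the default value are assumptions made for illustration, not part of the original script.

```python
def chunk_spans(source, spans, max_bytes=1500):
    """Group consecutive (start_byte, end_byte) spans into chunks of at most max_bytes.

    source is the raw bytes that were parsed; spans are in document order.
    A single span larger than max_bytes is kept whole as its own chunk.
    """
    chunks = []
    chunk_start = chunk_end = None
    for start, end in spans:
        # Close the current chunk if adding this span would exceed the budget.
        if chunk_start is not None and end - chunk_start > max_bytes:
            chunks.append(source[chunk_start:chunk_end])
            chunk_start = None
        if chunk_start is None:
            chunk_start = start
        chunk_end = end
    if chunk_start is not None:
        chunks.append(source[chunk_start:chunk_end])
    return chunks
```

With the script's variables, the spans could be built as `[(n.start_byte, n.end_byte) for n in js_tree.root_node.children]` and the source as `bytes(JS_CODE, "utf8")`.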