Ordering of files alg. is splitted into 3 parts:
- Collecting symbols from AST
- From symbols, find dependencies between files
- Construct oriented graph from dependencies and solve topological order on it
For each file, from untyped tree we collect three types of symbols:
- symbol definition, fully qualified (let bindings, type declaration, DU case constructors, record fields, members)
- symbol usage
- open declarations, including "implicit" ones from module declaration and namespace declaration
For collecting, I copied AstTraverse
code from https://github.com/fsharp/FSharp.Compiler.Service/blob/13ecd8d4d080465bce4f49de72e4c13c6005e842/src/fsharp/vs/ServiceParseTreeWalk.fs and altered it to support going through the whole tree.
This allows us to traverse AST and cherry-pick only certain type of nodes.
Symbols and opens are represented as simple strings.
- only
let
bindings are supported - inner module are not supported -- we will need to create group of open declarations and usages for each submodule
For each symbol usage we try to find corresponding definition.
We do it by combining symbol with all open declarations, and compare it with definitions.
Because identifier can be partially qualified, we must detect common parts in the end of open
and the start of indentifier,
so open A.B.C
+ C.D
-> A.B.C.D
.
From this we get list of file dependencies (file1, file2, symbols)
, where file2
uses symbols
defined in file1
.
Actual implementation uses Map with the last part of the qualified identifier of symbol for performance.
Correct ordering of files is such that for each dependency file1 -> file2
, file1
is before file2
.
This directly translates to Topological order problem in oriented graphs: https://en.wikipedia.org/wiki/Topological_sort
Kahn's algorithm mentioned on wiki is implemented.
Later, I need to find a way how to find ordering with minimal editing distance from original ordering.