Created
September 16, 2019 19:09
Static function that can be used as a map lambda to translate a line of a fixed-width flat file into the columns of a Spark DataFrame.
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.*;

/**
 * Parses a line of a fixed-width flat file.
 * @param pos list of integers describing the width of each column of the flat file
 * @param str line of the fixed-width flat file
 * @return Row containing the data from the fixed-width flat file line
 */
public static Row lSplit(List<Integer> pos, String str) {
    List<String> cols = new ArrayList<>();
    int start = 0; // offset of the current column within the line
    for (Integer col_pos : pos) {
        // parseCol is not defined in this snippet; presumably it appends the
        // col_pos characters of str beginning at start to cols.
        parseCol(str, col_pos, start, cols);
        start += col_pos; // advance to the start of the next column
    }
    return RowFactory.create(cols.toArray());
}
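
The snippet calls parseCol without defining it. Below is a minimal, Spark-free sketch of what that helper might look like, so the splitting loop can be run on its own. The class name FixedWidthSketch, the method name splitFixedWidth, and the body of parseCol are all assumptions made for illustration, not the author's code. In particular, trimming the padding and returning empty strings for lines that are too short are guesses about the intended behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FixedWidthSketch {

    // Hypothetical stand-in for the undefined parseCol helper: appends the
    // `width` characters of `str` beginning at `start` to `cols`.
    // Assumptions: surrounding padding is trimmed, and a line that ends
    // before the column does yields a shorter (possibly empty) value.
    public static void parseCol(String str, int width, int start, List<String> cols) {
        int from = Math.min(start, str.length());
        int to = Math.min(start + width, str.length());
        cols.add(str.substring(from, to).trim());
    }

    // Same loop as lSplit, but returns the column values as a list
    // instead of wrapping them in a Spark Row.
    public static List<String> splitFixedWidth(List<Integer> widths, String str) {
        List<String> cols = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            parseCol(str, width, start, cols);
            start += width;
        }
        return cols;
    }

    public static void main(String[] args) {
        // Column widths 5, 3 and 4 over a 12-character line.
        List<String> cols = splitFixedWidth(Arrays.asList(5, 3, 4), "Alice 30 NYC");
        System.out.println(cols); // prints [Alice, 30, NYC]
    }
}
```

With Spark on the classpath, the resulting list could be wrapped as in the original with RowFactory.create(cols.toArray()), and lSplit itself could be passed to a map over the lines of the file, for example line -> lSplit(pos, line).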