@grundprinzip
Created August 17, 2021 13:10
Parquet Column Names
import pyarrow as pa
import pyarrow.parquet as pq

# Two columns; the first name deliberately contains spaces and punctuation.
col1 = pa.array([1, 2, 3])
col2 = pa.array(["a", "b", "c"])
table = pa.Table.from_arrays([col1, col2], ["This aint; `no (name)", "Isthis"])
pq.write_table(table, "result.parquet")
parquet pages result.parquet
Column: This aint; `no (name)
--------------------------------------------------------------------------------
  page   type  enc  count   avg size   size       rows     nulls   min / max
  0-D    dict  S _  3       8.00 B     24 B
  0-1    data  S R  3       3.33 B     10 B                0       "1" / "3"

Column: Isthis
--------------------------------------------------------------------------------
  page   type  enc  count   avg size   size       rows     nulls   min / max
  0-D    dict  S _  3       5.00 B     15 B
  0-1    data  S R  3       3.33 B     10 B                0       "a" / "c"
parquet schema result.parquet
Unknown error
shaded.parquet.org.apache.avro.SchemaParseException: Illegal character in: This aint; `no (name)
    at shaded.parquet.org.apache.avro.Schema.validateName(Schema.java:1561)
    at shaded.parquet.org.apache.avro.Schema.access$400(Schema.java:87)
    at shaded.parquet.org.apache.avro.Schema$Field.<init>(Schema.java:541)
    at shaded.parquet.org.apache.avro.Schema$Field.<init>(Schema.java:580)
    at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:280)
    at org.apache.parquet.avro.AvroSchemaConverter.convert(AvroSchemaConverter.java:264)
    at org.apache.parquet.cli.util.Schemas.fromParquet(Schemas.java:89)
    at org.apache.parquet.cli.BaseCommand.getAvroSchema(BaseCommand.java:405)
    at org.apache.parquet.cli.commands.SchemaCommand.getSchema(SchemaCommand.java:110)
    at org.apache.parquet.cli.commands.SchemaCommand.run(SchemaCommand.java:87)
    at org.apache.parquet.cli.Main.run(Main.java:155)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.parquet.cli.Main.main(Main.java:185)
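
The trace above fails in Avro's `Schema.validateName`, which `parquet schema` reaches because it converts the Parquet schema to an Avro schema first. The Avro specification restricts names to a leading letter or underscore followed by letters, digits, or underscores. The following is a minimal Python sketch of that rule, not the Java implementation itself (which may differ in detail, for example for non-ASCII letters). The helpers `is_avro_name` and `sanitize` are hypothetical names introduced here for illustration only.

```python
import re

# Name rule as stated in the Avro specification: a leading letter or
# underscore, followed by letters, digits, or underscores.
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_avro_name(name: str) -> bool:
    """Return True if `name` satisfies the Avro naming rule (hypothetical helper)."""
    return AVRO_NAME.fullmatch(name) is not None


def sanitize(name: str) -> str:
    """Replace every disallowed character with '_' (hypothetical helper)."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Names may not start with a digit (or be empty), so prefix an underscore.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


# The two column names written by the script in this gist.
for column in ["This aint; `no (name)", "Isthis"]:
    print(repr(column), is_avro_name(column), sanitize(column))
```

Under these assumptions the first column name is rejected and the second is accepted, which matches the `Illegal character in` message above. Renaming columns with something like `sanitize` before writing is one possible way to keep a file readable by Avro-based tools, at the cost of changing the stored names.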
parquet-tools schema result.parquet
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil (file:/usr/local/Cellar/parquet-tools/1.12.0/libexec/parquet-tools-deprecated-1.12.0.jar) to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
message schema {
  optional int64 This aint; `no (name);
  optional binary Isthis (STRING);
}