Rodrigo Cunha eng-rodrigocunha

@eng-rodrigocunha
eng-rodrigocunha / trata_permissionarios_tse.py
Last active August 15, 2024 21:32
Formats the import layout file of Public Service Permit Holders (permissionários) sent by the Municipal Transportation Secretariat of the City of Rio de Janeiro to the Superior Electoral Court (TSE) (https://www.tse.jus.br/eleicoes/eleicoes-2022/prestacao-de-contas/nota-fiscal-eletronica-e-informacoes-de-permissionarios-fiscalizaje)
"""
Formats the TSE permit-holder import layout file
Formats the import layout file of Public Service Permit Holders sent by the
Municipal Transportation Secretariat of the City of Rio de Janeiro
to the Superior Electoral Court (TSE)
https://www.tse.jus.br/eleicoes/eleicoes-2022/prestacao-de-contas/nota-fiscal-eletronica-e-informacoes-de-permissionarios-fiscalizaje
"""
# import sys
# !{sys.executable} -m pip install --upgrade numpy pandas
eng-rodrigocunha / mergePdf.gs
Last active January 31, 2025 23:04
Merge PDF based on a list of Google Drive PDF links on a spreadsheet
// https://tanaikech.github.io/2023/01/10/merging-multiple-pdf-files-as-a-single-pdf-file-using-google-apps-script/
async function mergePDF() {
  // ID of the spreadsheet that holds the links
  var planilhaId = "1Iq6DNGk8XkfOX3kp2HDmDb1Ac7vK76brkJc6T6oydNc";
  // Name of the sheet that contains the links
  var nomePlanilha = "LINKS";
  // Open the spreadsheet
  var planilha = SpreadsheetApp.openById(planilhaId);
eng-rodrigocunha / download_gcs.py
Created March 17, 2023 02:33
Downloads files from a GCS bucket and finds which of them meet a given condition
import basedosdados as bd
import pandas as pd
import glob
bd.config.project_config_path = "D:\\basedosdados\\staging"
for hour in range(14, 24, 1):
    print(hour)
    st = bd.Storage(dataset_id="br_rj_riodejaneiro_onibus_gps", table_id="registros")
    st.download(savepath=".", partitions=f"data=2023-03-08/hora={hour}", mode="staging")
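The preview stops before the search step. As a rough sketch of how downloaded files could be scanned for a condition, the snippet below uses only the standard library. The helper `files_matching`, the column name `vehicle_id`, and the demo files are hypothetical and are not taken from the original script, which imports pandas and may work differently.

```python
import csv
import glob
import os
import tempfile


def files_matching(pattern, column, value):
    """Return sorted paths of CSV files with at least one row where row[column] == value."""
    matches = []
    for path in sorted(glob.glob(pattern, recursive=True)):
        with open(path, newline="", encoding="utf-8") as f:
            # DictReader uses the first row as the header; rows missing the column yield None.
            if any(row.get(column) == value for row in csv.DictReader(f)):
                matches.append(path)
    return matches


# Demo on throwaway files (hypothetical data, not the real bucket contents).
with tempfile.TemporaryDirectory() as tmp:
    for name, vehicle in [("a.csv", "X1"), ("b.csv", "Y2")]:
        with open(os.path.join(tmp, name), "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["vehicle_id", "hour"])
            writer.writerow([vehicle, "14"])
    hits = [
        os.path.basename(p)
        for p in files_matching(os.path.join(tmp, "*.csv"), "vehicle_id", "Y2")
    ]
```

For partitioned downloads, a recursive pattern such as `"**/*.csv"` would walk the nested partition folders.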
eng-rodrigocunha / get_vaccination_status.gs
Last active March 12, 2023 02:29
Scrapes the Carteira Nacional de Vacinação Digital (National Digital Vaccination Card) or the Certificado Nacional de Vacinação Covid-19 (National Covid-19 Vaccination Certificate) issued through ConecteSUS to identify how many COVID-19 doses were administered
/*
* Convert PDF file to text
* @param {string} fileId - The Google Drive ID of the PDF
* @param {string} language - The language of the PDF text to use for OCR
* @return {string} - The extracted text of the PDF file
* https://www.labnol.org/extract-text-from-pdf-220422
* IMPORTANT! https://www.labnol.org/shared-drives-google-script-220128
*/
const convertPDFToText = (fileId, language) => {
eng-rodrigocunha / get_vaccination_status.py
Last active March 12, 2023 01:58
Scrapes the Carteira Nacional de Vacinação Digital (National Digital Vaccination Card) or the Certificado Nacional de Vacinação Covid-19 (National Covid-19 Vaccination Certificate) issued through ConecteSUS to identify how many COVID-19 doses were administered
#!pip install pdfminer.six
import io
from pdfminer.high_level import extract_text
doses = ["Reforço", "Dose Adicional", "2/2", "1/2"]
# Open the PDF file
with open(r'E:\DOCUMENTOS PESSOAIS\Carteira Nacional de Vacinação Digital_4_DOSE.pdf', 'rb') as f:
    # Extract the text from the PDF
    text = extract_text(f)
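The preview is cut off before the extracted text is inspected. One way the dose markers could be checked against the text is sketched below. The helper `find_dose_markers` and the sample text are hypothetical, and how the original script turns markers into a dose count is not shown in the source.

```python
DOSE_MARKERS = ["Reforço", "Dose Adicional", "2/2", "1/2"]


def find_dose_markers(text, markers=DOSE_MARKERS):
    """Return the markers that occur in text, in the order given by `markers`."""
    return [marker for marker in markers if marker in text]


# Hypothetical text standing in for the output of extract_text().
sample_text = "Vacina COVID-19 1/2\nVacina COVID-19 2/2\nVacina COVID-19 Reforço"
found = find_dose_markers(sample_text)
```

Because the list runs from the latest dose to the earliest, the first element of `found` is the most advanced marker present in the document.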
eng-rodrigocunha / mail_web_scrapping.py
Last active March 12, 2023 01:59
Performs web scraping to collect all e-mail addresses from a given set of web pages
#!pip install requests
#!pip install beautifulsoup4
# https://stackoverflow.com/questions/63533115/extract-valid-email-address-using-regular-expression-and-beautifulsoup
import requests
import re
from bs4 import BeautifulSoup
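# No `{0,}` quantifier on the group: it lets the pattern match the empty string,
# so re.findall would return an empty match at nearly every position.
email = re.compile(r'[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+')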
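The preview ends at the pattern definition. A minimal sketch of applying an e-mail pattern of this kind to page text with `re.findall` follows. The helper `extract_emails` and the sample HTML are hypothetical, and the pattern here is written without a trailing quantifier so that it never matches the empty string.

```python
import re

EMAIL_PATTERN = re.compile(r"[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+")


def extract_emails(text):
    """Return the unique e-mail addresses found in text, in order of first appearance."""
    seen = []
    for match in EMAIL_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical page fragment; a real run would pass response.text from requests.
sample_html = (
    '<p>Contact: <a href="mailto:alice@example.com">alice@example.com</a> '
    "or bob@example.org</p>"
)
emails = extract_emails(sample_html)
```

Deduplicating matters here because an address often appears twice in the markup, once in the `mailto:` link and once in the visible text.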
eng-rodrigocunha / dbt_to_dbdiagram.rb
Created March 1, 2023 22:50
Ruby code to convert dbt yml to dbdiagram.io format
#!/usr/bin/env ruby
# Generate a dbdiagram for dbdiagram.io from a dbt project.
#
# Usage:
# 1. Write your model schema.yml (another file in this gist generates it automatically)
# 2. Run `dbt docs generate` first.
# 3. Run `dbt_to_dbdiagram.rb`
# 4. Paste the output in https://dbdiagram.io/
require 'yaml'
eng-rodrigocunha / bigquery_schema_generator.sql
Last active July 16, 2024 16:32
dbt schema.yml generator query using the information_schema of the generated tables for BigQuery
WITH
  columns AS (
  SELECT
    " " || "- name: " || column_name || "\n" ||
    " " || ' description: "' || column_name || '"' AS column_statement,
    table_name
  FROM
    `rj-smtr.veiculo`.INFORMATION_SCHEMA.COLUMNS ),
  tables AS (
  SELECT
# Summary by fortnight and consortium
WITH
  sumario AS (
  SELECT
    EXTRACT(YEAR FROM DATA) AS ano,
    EXTRACT(MONTH FROM DATA) AS mes,
eng-rodrigocunha / pdf_reduct.py
Created February 18, 2023 17:57
Redact sensitive content from a PDF
#!pip install pdf-redactor
import re
from datetime import datetime
import pdf_redactor
## Set options.
options = pdf_redactor.RedactorOptions()