JONAS COELHO

DATA SCIENTIST

ABOUT ME

I am a data scientist who believes that data can be converted into knowledge to foster progress. I have a multidisciplinary background and have used Python and SQL since 2014 and R since 2018 to explore, clean, and analyze a variety of public administration datasets. From 2020 to 2023, I worked in the third sector at Transparência Brasil, where I honed my skills in using data for transparency and social impact, and in 2022 I was a fellow at the Data Science for Social Good Fellowship. In 2023, I moved to Multiplan, a leader in the Brazilian real estate sector, where I continue to pursue data-driven insights in a new industry. Here is an overview of my experience and skills:

  • Obtaining information from different sets of databases (PostgreSQL, BigQuery)
  • Automated data collection through APIs and Web Scraping (Python)
  • Communicating new findings in accessible language for journalists and the general public
  • Development of predictive Machine Learning models and neural networks (R-caret, Python-scikit-learn, PyTorch)
  • Writing reports and articles based on data analysis
  • Development of internal documents with information on public databases (RMarkdown, ggplot, GoogleVis)

EDUCATION

Carnegie Mellon University

Co-developed a deep learning algorithm using PyTorch to identify collapsing structures in Baltimore City, assisting the city’s Department of Housing and Community Development in mitigating risks.

  • PyTorch
  • Machine Learning
  • Python
  • PostgreSQL

FGV - EBAPE
Fundação Getulio Vargas (FGV) is one of the most prestigious universities in Latin America. The MSc program offered through its School of Administration (EBAPE) achieved the highest score in the evaluation conducted by CAPES, the governmental agency responsible for overseeing educational institutions in Brazil. Only three other universities received the same score.

Classes focused on advanced statistical methods for public policy analysis and political science studies
CAPES Scholarship

  • Statistics
  • Causal Inference
  • R
  • Qualitative methods

FGV - EMAp

Introduction to computing, mathematical modeling and programming languages.

  • Applied Math
  • Python
  • Graph theory
  • SQL

FGV - Direito Rio

Multidisciplinary undergraduate law degree with coursework in economics, programming, and statistics.
Full Scholarship

  • Law
  • Python
  • Economics
  • SQL

SOME OF MY WORK

Here are a few selected projects I have worked on. Click on any of them to read more about it.

Identifying collapsed roofs in Baltimore

Using deep learning and aerial imagery, I co-developed an algorithm to identify collapsing structures in Baltimore City.

  • Deep Learning
  • PostgreSQL
  • Machine Learning
  • GIS
Misuse of public funding in education

By analyzing budgetary data from the Brazilian Federal Government and combining it with other databases, I identified and reported potential irregularities.

  • Data visualization
  • ETL
  • Data analysis
  • Writing
Wage Inequality Among São Paulo's Teachers

Using data from the municipal and federal level, I analyzed the socioeconomic disparities of public schools in São Paulo according to the wages paid to teachers in each school.

  • Data visualization
  • Statistics
  • Tableau
  • Data analysis
Teachers and territories: inequality in education

  • R Tidyverse
  • R Markdown
  • Linux AWK

The city of São Paulo has a vast public education network spread across its extensive territory. However, the city is also the stage for a large scenario of inequality, with the highest incomes in the country occupying the city center while people with high levels of socio-economic vulnerability live in the suburbs.

In this report, I analyzed demographic data of students enrolled in the entire municipal public network, comparing it with the average wages of teachers at each school. The result, although not surprising, is still shocking: schools in the poorest areas of the city, with the highest proportion of black students, also have teachers with considerably lower wages. This scenario is alarming because the municipal education network theoretically has wage equity, meaning all teachers receive the same pay regardless of where they work. However, some practical details end up distorting this principle.

The report received extensive media coverage, including a feature on SPTV (the local news TV show with the highest viewership in the state) and on Folha de São Paulo (one of the largest newspapers in the country).

Scraping information from real estate websites

In this basic personal Python project, I created a scraper for two real estate websites that collects information on available units and stores it in a spreadsheet.

  • Web Scraping
  • Logging
  • YAML
  • Open Source
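
The general shape of such a scraper can be illustrated with a minimal sketch. This is not the project's actual code: the HTML structure, the `SITE_CONFIG` dictionary (standing in for settings that could live in a YAML file), and the names `ListingParser`, `parse_listings`, and `to_csv` are all assumptions made for illustration. Fetching the pages is left out, and CSV is assumed as the spreadsheet format.

```python
import csv
import io
import logging
from html.parser import HTMLParser

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("scraper_sketch")

# Hypothetical per-site settings; a real project could load these from YAML.
SITE_CONFIG = {"card_class": "listing", "fields": ["title", "price"]}


class ListingParser(HTMLParser):
    """Collect <span class="FIELD"> text inside each <div class="listing">.

    Assumes listing cards contain no nested <div> elements.
    """

    def __init__(self, config):
        super().__init__()
        self.config = config
        self.rows = []
        self._current = None  # dict for the card being parsed, if any
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == self.config["card_class"]:
            self._current = {}
        elif (tag == "span" and self._current is not None
              and css_class in self.config["fields"]):
            self._field = css_class

    def handle_data(self, data):
        if self._field is not None:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            text = self._current.get(self._field, "")
            self._current[self._field] = text.strip()
            self._field = None
        elif tag == "div" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def parse_listings(html, config=SITE_CONFIG):
    """Return one dict per listing card found in the HTML string."""
    parser = ListingParser(config)
    parser.feed(html)
    parser.close()
    logger.info("parsed %d listings", len(parser.rows))
    return parser.rows


def to_csv(rows, fields):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `to_csv(parse_listings(page_html), SITE_CONFIG["fields"])` on a downloaded page would produce text that can be saved to a `.csv` file and opened in any spreadsheet tool.
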
Public Transparency in Latin America

In this article, published in Government Information Quarterly, I explored over 300 transparency evaluations in Latin America to highlight patterns and biases in the way that governments are evaluated.

  • Data visualization
  • ETL
  • Data analysis
  • Writing
Are governments complying with transparency?

In this study, my co-authors and I conducted a comprehensive analysis of 265 transparency compliance evaluations carried out by NGOs, academics, and government oversight authorities across Latin America between 2003 and 2018. Our study sheds light on the compliance of Latin American governments with their own transparency statutes. We found that compliance has modestly improved over time, but there is still a significant lack of compliance with passive transparency at the local level compared to the national level. Additionally, our data shows that evaluations done by government oversight agencies tend to yield higher scores.

The study also highlights significant gaps in evaluation efforts, particularly for passive transparency. It is the first large-scale cross-national assessment of transparency compliance in Latin America and was only possible after intensive data collection and standardization work, which allowed us to analyze and visualize this database.

The interaction between transparency and education

In my master's thesis, I used propensity score matching to understand the potential impacts that public transparency could have on education.

  • Causal inference
  • Statistics
  • Academic Writing
  • Data analysis
Different types of transparency, different impacts

  • R Tidyverse

In my master’s thesis, I investigated three distinct forms of transparency and how closely municipal governments adhere to each of them. The primary objective was to assess the influence of transparency on public administration outcomes, with a specific focus on how different types of transparency affect municipal-level public education in Brazil.

To accomplish this, I initially employed Data Envelopment Analysis (DEA) to quantify the efficiency of education expenditures. This required the careful acquisition, cleaning, and processing of multiple sets of public administration databases for over 5,500 Brazilian municipalities. Next, I conducted a Propensity Score Matching analysis to evaluate the effect of transparency on education, utilizing the Brazilian National Index of Primary Education (IDEB) as a metric.
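
The matching step can be illustrated with a minimal Python sketch. It is not the code or the specification used in the thesis: the data is simulated with a single covariate and a known treatment effect of 1.0, and the names `simulate`, `fit_propensity`, `propensity_scores`, and `att_nearest_neighbor` are hypothetical. The sketch fits a logistic regression for the probability of treatment, then pairs each treated unit with the control unit that has the closest score.

```python
import math
import random


def simulate(n, seed=0):
    """Simulated data: treatment is likelier for high x; true effect is 1.0."""
    rng = random.Random(seed)
    X, treated, outcome = [], [], []
    for _ in range(n):
        x = rng.random()
        p = 1.0 / (1.0 + math.exp(-2.0 * (x - 0.5)))
        t = 1 if rng.random() < p else 0
        X.append([x])
        treated.append(t)
        outcome.append(2.0 * x + 1.0 * t + rng.gauss(0.0, 0.1))
    return X, treated, outcome


def _sigmoid_score(weights, row):
    z = weights[-1] + sum(w * x for w, x in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(X, treated, lr=0.5, epochs=300):
    """Fit P(treated | X) as a logistic regression by gradient descent."""
    weights = [0.0] * (len(X[0]) + 1)  # last entry is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for row, t in zip(X, treated):
            err = _sigmoid_score(weights, row) - t
            for j, x in enumerate(row):
                grad[j] += err * x
            grad[-1] += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights


def propensity_scores(X, weights):
    """Estimated probability of treatment for every unit."""
    return [_sigmoid_score(weights, row) for row in X]


def att_nearest_neighbor(scores, treated, outcome):
    """Average effect on the treated, 1-nearest-neighbor with replacement."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    diffs = []
    for i, t in enumerate(treated):
        if t == 1:
            match = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            diffs.append(outcome[i] - outcome[match])
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    X, treated, outcome = simulate(400)
    scores = propensity_scores(X, fit_propensity(X, treated))
    print("Estimated ATT:", att_nearest_neighbor(scores, treated, outcome))
```

In the simulated data, treated units tend to have higher covariate values, so a plain comparison of group means would overstate the effect. Matching on the estimated score compares each treated unit with a similar control instead. A real analysis would also check covariate balance and common support after matching.
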

CONTACT ME

If you wish to get in touch, please fill out the form below.