Data-driven software engineer, always learning, always improving.
Master in Big Data Management & Analytics. I have experience researching, designing and developing software using multiple technologies.
Portfolio
Relevant projects connecting my career path:
Web platform for electrical operators in Colombia (PHP, Drupal)
cvTraffic
We implemented a vehicle classifier using OpenCV (C).
Microkernel app
Plugin for traffic light status visualization in TransModeler (C#)
Intelligent Transportation Systems
I designed the West-Central Metropolitan Area ITS architecture.
smaps
Web platform for Intelligent Transportation Systems.
Bariatrack
Data modeling & collection. Mobile app using Ionic.
Centro Memoria
Colombian armed conflict dashboard.
Geo Tweets
Tweets visualization according to topic and geolocation.
UDAPTOR
Big Data startup business plan & prototype
Sentiment Analysis & Visual Analytics
Perception of Migration on social media
GeoBERT
Named Entity Recognition (NER) System for the Oil&Gas Industry
CREG
Web platform for electrical operators in Colombia (PHP, Drupal)
Before finishing my degree, I started working as a research assistant in a High Performance Computing lab. This was my first project for an external client: I designed and developed a web application (Drupal, PHP) to evaluate strategic plans to reduce non-technical losses in electricity distribution for the Regulation Commission of Energy and Gas (CREG).
Traffic sensor to count and classify vehicles using computer vision.
This project was developed with the Colombian Administrative Department of Science, Technology and Innovation (COLCIENCIAS).
I co-developed (in a team of two) a traffic sensor to count and classify vehicles. We faced every imaginable challenge of an outdoor, uncontrolled environment. The project was developed using early releases of OpenCV. I participated from idea generation to product deployment at traffic intersections. My teammate worked heavily on the hardware (processing unit and camera communication), while I was in charge of the software components, including the sensor communication. We both worked on the computer vision for background subtraction and vehicle classification.
This is how our sensor was integrated with other components to analyze traffic congestion:
We integrated the sensor with traffic controllers and geographic information systems to visualize traffic state in different management systems across the city. Other small teams were responsible for different components, including the traffic controller, the communication system & the Geographic Information System.
We later deployed a similar system for the entire city. For that project, instead of ArcGIS, we used a traffic simulation platform called TransModeler. I was responsible for its integration into the entire system.
Traffic Central
Traffic sensor to count and classify vehicles using computer vision.
I designed the software architecture required for the Traffic Control Center in Pereira.
I developed an application (.NET) using a microkernel architecture to visualize the city's traffic lights over a traffic simulation platform (TransModeler), enabling traffic simulation using real-time information.
I co-developed an application to manage the timing of traffic controllers for the city of Pereira following NTCIP communication protocols.
These products allowed us to be part of the Pereira Innovation and Technological Development Center (Colombia), creating a university spin-off focused on Intelligent Transportation Systems.
I designed the 'Regional Intelligent Transportation Systems Architecture for West Central Metropolitan Area' (Risaralda, Colombia). The local transport authorities adopted the ITS architecture, including agencies for public transport, traffic management, emergency response, traveller information, and data management.
I defined data flows, services, and applications to integrate all transport management stakeholders under a common data integration framework.
Date: 2014
Client: AMCO-CIDT
Category: Software architecture, ITS
smaps
Web platform for Intelligent Transportation Projects
I led the implementation of the 'Regional Intelligent Transportation Systems Architecture for West Central Metropolitan Area' as a web platform (Node.js, OpenStreetMap, Common Alerting Protocol) that integrates regional transport information from multiple data sources, including traffic data, road emergencies and alerts, traffic light status, security cameras and public transport, among others.
Smaps is currently under development by a different team. You can check their website here.
Date: Dec. 2015
Client: CIDT
Category: Backend/web development
Bariatrack
Software engineer for mobile app.
We developed a prototype to evaluate the evolution of bariatric surgery patients. I had to find a way to adapt BAROS (Bariatric Analysis and Reporting Outcome System) for Colombian users in a digital environment. I led the development team and was in charge of direct interaction with the client.
Date: Dec. 2016
Client: Mednovation (Medellín, Antioquia)
Category: Mobile
Colombian Armed Conflict
Interact with Colombian Armed Conflict data from Centro Memoria.
At this point I knew I wanted to pursue a data-driven career. I started studying visual analytics, and not only because it was a missing skill in our team: besides taking responsibility for the visual components in our projects at work, I kept learning in my free time by developing side projects.
This is a data visualization project about warlike actions in the context of the Colombian armed conflict. In 2016 I sent this project to Centro Nacional de Memoria Histórica, asking for more data and proposing new ideas for visualization. They told me they were developing a project covering the topic. It was later published at 'Rutas del Conflicto', which I think lacks data humanism. I agree with Lisa Charlotte Rost when she says we need to create visualizations that make people feel empathy about these topics.
Date: Nov. 2016
Client: Side project
Category: information visualization
Geo Tweets
Tweets visualization according to topic and geolocation.
This is another side project: a tweet visualization developed at BIOS to practice and showcase some technologies. You can search up to four (4) different topics to see the places where people are tweeting about them the most. You can also filter topics by sentiment.
I designed the interface and led the team during project implementation. I developed the d3.js and Leaflet components.
All my side projects in visual analytics would later be crucial to obtaining a scholarship from the European Union for the Master in Big Data Management & Analytics, which I started in 2018 :)
Date: 2017
Client: Internal project at BIOS
Category: information visualization
UDAPTOR
Big Data startup business plan & prototype
During our Master we developed a data portability tool that we called UDAPTOR. The project takes advantage of the General Data Protection Regulation (GDPR) to transfer data between service providers. Our idea was to facilitate data portability by converting collections of files from one provider (e.g. Spotify) to another (Deezer, Apple Music, etc.). I participated from idea conception to prototype implementation, and I was responsible for the Knowledge Graph mappings required to transform the source files into the target files. I also developed the PoC using Spotify, improving music recommendations for users coming from Apple Music.
Later, two members of the original team continued the project with their BDMA Master Thesis. Here is the official website, where you can find more info.
Date: July 2019
Location: Barcelona, Spain
Client: UPC - Class Project
Category: data engineering, data integration, semantics
Twitter sentiment analysis
Sentiment analysis & Visual Analytics.
To learn about the perception of migrants on social media, we gathered Twitter data during August, September and October 2019. We looked for tweets in English that were not retweets and that contained one of the following terms: migrant, immigrant, refugee or asylum seeker.
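The selection criteria above could be sketched roughly as follows. This is an illustrative sketch, not the code used in the project: the tweet field names (`lang`, `is_retweet`, `created_at`, `text`), the exact date bounds and the matching of plural forms are all assumptions.

```python
import re
from datetime import date

# Search terms quoted from the text; everything else is an illustrative assumption.
KEYWORDS = ("migrant", "immigrant", "refugee", "asylum seeker")
# Match whole words, optionally followed by a plural "s" (assumption).
KEYWORD_RE = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)
# Collection window: August to October 2019 (exact bounds assumed).
START, END = date(2019, 8, 1), date(2019, 10, 31)


def keep_tweet(tweet: dict) -> bool:
    """Return True when a tweet satisfies all collection criteria."""
    if tweet.get("lang") != "en":           # English only
        return False
    if tweet.get("is_retweet", False):      # drop retweets
        return False
    created = tweet.get("created_at")
    if created is None or not (START <= created <= END):
        return False
    return KEYWORD_RE.search(tweet.get("text", "")) is not None
```

For example, an English, non-retweet tweet from September 2019 with the text "Refugees welcome" passes the filter, while the same text in a retweet is discarded.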
Visual Analytics Project. Course by Petra Isenberg at CentraleSupélec.
I designed the initial sketch for storytelling and implemented 80% of the visuals using d3.js.
The project is available at this GitHub site, where you can also find a data analysis report.
Date: January 2020
Location: Paris, France
Client: CentraleSupélec - Class Project
Category: information visualization, data science, visual analytics
I built a Named Entity Recognition (NER) System for the Oil&Gas Industry.
First, I developed a distributed NLP pipeline using Apache Spark NLP for weak data labeling with distant resources (dictionaries, RegExp, ontologies), making the pipeline extensible to new named entities and suitable for Big Data scenarios in other domains.
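The idea of weak labeling with dictionaries and regular expressions can be sketched in a few lines of plain Python. This is a simplified, single-machine stand-in and not the Spark NLP pipeline itself: the entity types (`FORMATION`, `WELL`), the dictionary terms, the pattern and the whitespace tokenization are hypothetical choices made only for illustration.

```python
import re

# Hypothetical distant resources: labels, terms and pattern are illustrative only.
DICTIONARY = {"FORMATION": ["bakken", "eagle ford"]}
PATTERNS = {"WELL": re.compile(r"\bwell\s+[A-Z]-\d+\b")}


def weak_label(text):
    """Return (token, tag) pairs with noisy BIO tags from dictionary and regex hits."""
    # Whitespace tokenization, keeping character offsets for span alignment.
    tokens = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]
    tags = ["O"] * len(tokens)

    # Collect character spans proposed by each resource.
    spans = []
    for label, terms in DICTIONARY.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, text, re.IGNORECASE):
                spans.append((m.start(), m.end(), label))
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label))

    # Project spans onto tokens: first token gets B-, the following ones I-.
    for start, end, label in sorted(spans):
        inside = False
        for i, (_, t_start, t_end) in enumerate(tokens):
            if t_start >= start and t_end <= end and tags[i] == "O":
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return [(tok, tag) for (tok, _, _), tag in zip(tokens, tags)]
```

Adding a new entity type in this sketch only requires a new dictionary entry or pattern, which mirrors the extensibility goal described above. The resulting labels are noisy by design and are meant as training input, not as final annotations.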
Using the noisy labels, I trained Deep Neural Network language models (BERT, DistilBERT & DistilRoBERTa) with a two-step fine-tuning process, improving accuracy in hard-to-learn named entities and showing promising results to solve the polysemy problem in our domain.
The project was developed in a cloud environment (Kubernetes, GCP) and was managed using developer tools for Machine Learning (nbdev, Weights & Biases). The team continued the project at Schlumberger to incorporate more entities, aiming to build a complete Language Model for GeoSciences (GeoBERT).
At the beginning, this project was especially challenging because it was developed during the COVID-19 pandemic, but it helped me improve my capacity to stay focused and motivated.