In this blog series, we’re proud to shine a light on some of the top Capstone projects from the fifth graduating class of Data Science for All / Colombia. Capstone projects are a critical component of the program’s curriculum. Teams work on projects together to apply what they learn during the program to a real-world data problem. These projects were sourced directly from public and private entities in Colombia and solve a real problem these entities are facing. Through these projects, our graduates learn practical, job-oriented data skills and give back to their community using the power of data science & AI.

Meet the Team

Vanessa Movilla

David Duque Betancur, a mechanical and aeronautical engineer by training, with a Master's in Mechanics from the Université Grenoble Alpes and a Master's in Business Administration and Management from EUDE Business School.


David Gonzalez

Electronic Engineer with an MBA from Inalde Business School (Bogotá, Colombia), currently working as Country Business Manager for Rockwell Automation. I chose to join DS4A because it is a high-quality program and I wanted to be part of this world-class community. My goal is to use the knowledge I have acquired to support my aspiration to be a leader of digital transformation in Colombia, Latin America, and the world.

Alejandro Soto

Statistician from the Universidad del Valle (Cali, Colombia), currently working at Clinica Nuestra Cali as statistics and analytics coordinator. I chose to join DS4A because it is one of the main complements to statistics: it provides computational and programming skills that will serve me in my professional future.

Julian Santiago Tauta

Software Systems Engineer and Telematics Engineer from Universidad Icesi (Cali, Colombia), currently working at Globant as a Python Developer. I chose to join DS4A because data science, data analytics, and AI are fields that have always interested me, and I saw the course as a great opportunity to explore, learn new things, and reinforce my knowledge.

Juan David Arias Orrego

Electronic Engineer from Universidad Pontificia Bolivariana (Medellín, Colombia), MSc in Information Technology from Universitaet Stuttgart (Stuttgart, Germany), currently working at Telefónica Colombia as regional manager for RF Engineering in Colombia’s Northwest Mobile Network. I chose to join DS4A because data science is the present and the future of the economy, and is also a powerful tool to help us improve our society.


About the Project: How to Reactivate Bogotá’s Tourism Industry

Project Overview

The problem we chose to solve relates to the tourism industry, which was hit hard by the Covid-19 pandemic. We worked with data provided by the “Instituto de Turismo de Bogotá”. We used data science to improve the availability, visualization, and analysis of tourism-related data, so that entities have the relevant information to design public policy initiatives that promote tourism in the city. By applying sentiment analysis tools to information gathered through web scraping, we also identified how attractions could be improved for a better tourist experience. Finally, after testing several models, we used a random forest regression model on an Airbnb database of the city to predict the rental price of a property based on its location, its characteristics, and the tourist attractions nearby. Our product’s name is Trip-City.
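
As an illustration of how a "tourist attractions nearby" input could be prepared for a price model, here is a minimal sketch that counts the attractions within a given radius of a listing using the haversine distance. This is not the team's actual code: the function names, the 1 km radius, and the coordinates are all assumptions made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_attractions_nearby(listing, attractions, radius_km=1.0):
    """Count the attractions within radius_km of a listing.

    listing is a (lat, lon) tuple; attractions is an iterable of (lat, lon) tuples.
    The 1 km default radius is an assumption, not a value from the project.
    """
    lat, lon = listing
    return sum(
        1 for a_lat, a_lon in attractions
        if haversine_km(lat, lon, a_lat, a_lon) <= radius_km
    )


# Hypothetical coordinates, invented for illustration only.
listing = (4.6000, -74.0700)
attractions = [
    (4.6000, -74.0700),  # same spot as the listing
    (4.6050, -74.0700),  # roughly half a kilometre north
    (4.7000, -74.0700),  # roughly eleven kilometres north
]

nearby = count_attractions_nearby(listing, attractions, radius_km=1.0)
print(nearby)
```

A count like this could sit alongside a listing's location and characteristics as one more numeric column fed to a regression model.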


Click to read the datafolio

Learn more about this capstone project:  final report and project presentation

What were the most exciting/surprising findings from your project?

We had several “eureka moments” during the project. One came from the sentiment analysis tools we learned in DS4A, which helped us develop one of the key benefits of Trip-City. Another came when we were trying to include a model in Trip-City and, after testing several, finally got a very good result with the random forest regression model. On the day of the final delivery, the cloud server was not working, but we managed to find the problem and solve it. And to present the results, we used the concept of hitting a “piñata”.

What were some challenges you faced and how did you overcome them?

There were several challenges during the development of the project. One of them was identifying how to address the problem with the data available. We held a brainstorming session and then prioritized the options with our client. Nevertheless, the main challenge was to work efficiently as a team, to identify each member’s strengths, and to assign and keep track of tasks and activities.

Who is your team’s mentor and how did he/she help?

Our Teaching Assistants during the program were Juan García and Wbeimar Ossa. They guided us and gave us recommendations on how to prioritize the main problems to be solved with the available data. From the “Instituto Distrital de Turismo de Bogotá” we had the support of Daniel Valencia, Luis Fernando Pineda, and Diego Rodriguez. Their help and support were an important factor in the success of our project.

What do you view as the impact of your project?

The main benefits of our project are: 1. It gives entities the information, visualizations, and analysis they need to design public policies and run marketing campaigns that promote the tourism industry; 2. It allows public entities to identify the main strengths and problems of tourist attractions, so they know where to invest to improve the tourist experience; and 3. It allows investors in the lodging sector to estimate their income based on the location and characteristics of their property. Trip-City also has several benefits for tourists and for tourist service providers. Moreover, Trip-City can easily be implemented by many other tourism promotion entities nationwide.
If we had had additional time and resources, we could have included other sources of information for sentiment analysis and new data sources to enrich the visualizations, and we could have used more advanced data science techniques, such as different types of models and neural networks. One additional feature that could be added to the tool is a new model to estimate the monthly occupancy rate of each property, to get closer to what the real income would be.
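
To show why an occupancy estimate would bring the prediction closer to real income, here is a minimal sketch of the arithmetic: expected monthly income is the nightly price multiplied by the fraction of nights booked and by the number of nights in the month. The function name and the figures are hypothetical, not results from the project.

```python
def estimate_monthly_income(nightly_price, occupancy_rate, nights_in_month=30):
    """Expected monthly income from a nightly price and an occupancy rate.

    occupancy_rate is the fraction of nights booked, between 0 and 1.
    """
    if not 0.0 <= occupancy_rate <= 1.0:
        raise ValueError("occupancy_rate must be between 0 and 1")
    return nightly_price * occupancy_rate * nights_in_month


# Made-up example: a property priced at 200,000 COP per night, booked half the time.
income = estimate_monthly_income(200_000, 0.5)
print(income)
```

Without the occupancy term, a price prediction alone implicitly assumes the property is booked every night.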


Congratulations to this team, their mentors, and their TAs on this accomplishment!

If you're interested in joining our Data Science for All mission, whether to recruit Data Science for All fellows or to become a Mentor, please get in touch.