Juan Manuel Ortiz de Zarate, Data Scientist and Developer in Ciudad de Buenos Aires, Buenos Aires, Argentina

Member since November 6, 2019
Currently, Juan is a PhD candidate at the University of Buenos Aires, researching the subjects of AI, NLP, and social networks. He has over a decade of professional development experience under his belt. For the last few years, he’s been immersing himself in various types of data science projects and loving every minute of it. Juan relishes taking on data problems, building prediction models, and learning state-of-the-art techniques.

Location

Ciudad de Buenos Aires, Buenos Aires, Argentina

Availability

Part-time

Preferred Environment

RStudio, Jupyter Notebook

The most amazing...

...thing I've coded is an entire web app to monitor social networks. It has a back end in R for statistics and a PHP front end.

Employment

  • Senior Data Scientist

    2022 - PRESENT
    Fundar
    • Managed and supervised external students in their research.
    • Managed different kinds of research projects with internal and external members.
    • Interviewed external collaborators and consultants to hire for specific projects and technical work.
    Technologies: Artificial Intelligence (AI), Big Data, Natural Language Processing (NLP), Team Management, Hiring, Data Analysis
  • Ph.D. Student Researcher

    2017 - PRESENT
    Universidad de Buenos Aires
    • Created new techniques to analyze discussions on social networks with R and Python.
    • Predicted movie reviews using the IMDb database with R.
    • Predicted implication clauses with NLP through Python models.
    • Developed new NLP techniques to predict controversy on social networks with R and Python.
    • Created new NLP techniques to graph clusters on social networks with Python and R.
    Technologies: R, Python
  • Freelance Data Scientist Advisor

    2016 - PRESENT
    Massomedia S.A.
    • Predicted presidential votes via telephone surveys using Python.
    • Analyzed several discussions on Twitter and Facebook using R.
    • Developed a web app using R, PHP, and MySQL to monitor social networks and media.
    • Built a product with Python that could analyze telephone surveys about product sales.
    • Presented the results, conclusions, and explanations for each task to the client.
    Technologies: Python, R
  • Front-end Developer

    2022 - 2022
    Nixtla Inc.
    • Developed new and custom plugins in TypeScript for JupyterLab.
    • Researched JupyterLab documentation to understand how to create new features.
    • Documented the creation of the new features and how they can be extended and installed in new environments.
    Technologies: CSS, TypeScript, Front-end, Jupyter, Jupyter Notebook, Data Science, Forecasting, JupyterLab
  • Technical Editor

    2021 - 2022
    Auth0
    • Corrected and edited technical articles about implementations of the latest technologies.
    • Tested the technical implementations described in the articles.
    • Evaluated writers' applications from all over the world for the company's writers program.
    Technologies: Python 3, PHP, Technical Writing, Writing & Editing, Auth0, Authentication, Authorization
  • Head Teaching Assistant

    2020 - 2022
    University of Buenos Aires
    • Acted as the head teaching assistant for the Data Organization subject, which introduces students to data science and is part of the mandatory curriculum for informatics engineering.
    • Designed all of the subject content together with another professor. Due to the COVID-19 lockdown, all of the online classes (in Spanish) are available on YouTube; if you would like to view them, contact me for the link.
    • Taught half of the theoretical classes and half of the practical lessons. Also coordinated the teachers of the practical lessons and prepared and graded the final exams and practical work.
    • Designed the final test and the practical work that students must pass to complete the subject.
    Technologies: University Teaching, Data Science, Education, Machine Learning
  • Data Scientist

    2021 - 2021
    Carrie Beam Consulting
    • Created new functionality in R that, given a graph of financial relations, suggests new links to improve trade between the actors.
    • Optimized algorithms on graphs so that they can work on large data sets.
    • Found bugs in existing R code and suggested fixes.
    Technologies: Graphs, igraph, R, Networks, Computer Science
  • Teaching Assistant

    2020 - 2021
    Universidad Católica Argentina
    • Composed practical lessons for the Data Science class.
    • Corrected exams and practical homework for the Data Science subject.
    • Tutored students and answered their questions on Data Science topics.
    • Explained machine learning techniques, regularization methods, feature extraction and selection, data visualization methods, and other topics related to data science.
    Technologies: Data Science
  • Statistical Developer

    2020 - 2020
    Decentral Park Advisors LLC (via Toptal)
    • Predicted Bitcoin 1D, 3D, and 7D returns with multinomial regressions and generalized additive models (GAMs).
    • Predicted whether Bitcoin 1D, 3D, and 7D returns would be positive or negative using classification models such as random forest, XGBoost, and bagging.
    • Reported the results through Jupyter Notebooks and R Shiny dynamic graphs.
    Technologies: RStudio Shiny, R, Jupyter, Matplotlib, Pandas, Scikit-learn, Python
  • Data Analyst

    2020 - 2020
    LL Media, LLC (via Toptal)
    • Standardized multiple information sources about leads using Python and Pandas.
    • Scored the performance of each source over different kinds of campaigns using Python, Pandas, and Matplotlib.
    • Predicted good leads from demographic data using scikit-learn machine learning classifiers.
    • Predicted bad leads from demographic data using scikit-learn machine learning classifiers.
    • Analyzed lead data to find simple correlations between good and bad lead performance.
    Technologies: Matplotlib, Pandas, Scikit-learn, Python, Data Engineering
  • Teaching Assistant

    2017 - 2018
    Universidad de Buenos Aires
    • Composed practical lessons for the Computer Structures 1 class.
    • Corrected exams and practical homework for the Computer Structures 1 class.
    • Tutored and answered questions from students in the Computer Structures 1 class.
    Technologies: Structure, Computer
  • Teaching Assistant

    2016 - 2017
    Universidad de Buenos Aires
    • Composed and gave practical lessons for the Network Theory class.
    • Corrected exams and homework for the Network Theory class.
    • Tutored and answered questions from students in the Network Theory class.
    Technologies: Network Theory
  • Senior Full-stack Developer

    2014 - 2015
    Telam
    • Developed and maintained a system for journalists, which allowed them to write different types of notes and publish them on the news site.
    • Built and maintained a system to manage advertising funds.
    • Developed a REST API to connect with other media news sites.
    Technologies: JavaScript, MySQL, PHP, Back-end
  • Senior Full-stack Developer

    2010 - 2014
    Intraway
    • Built and maintained a system for call center assistance.
    • Developed and maintained features to communicate with the STB Systems and reset them.
    • Constructed and supported a dynamic decision tree to give the best answers to clients based on their specific problems and configurations.
    • Created and maintained an internal ticket system to organize tasks and assign them to different teams.
    Technologies: jQuery, JavaScript, SQL, PHP, Back-end, Front-end
  • Principal Developer

    2008 - 2010
    Imprek
    • Developed the company's administrative system.
    • Built a system to administer technical services.
    • Maintained the stock system.
    • Implemented a system to print digital photos from a Kodak machine.
    • Developed the company's financial system.
    Technologies: JavaScript, MySQL, PHP

Experience

  • Data scientist at Fundar

    I am a data science researcher at Fundar, an organization dedicated to studying, researching, and designing public policies focused on developing a sustainable and inclusive Argentina. There, I have worked for different government agencies, such as the Ministry of Tourism and the presidential secretariat, among others. I have also directed research scholarships that the foundation grants to students from different universities.

  • Professor of Data Science at University of Buenos Aires
    https://orga-de-datos.github.io/

    I designed all of the subject content together with another professor.
    I was responsible for giving theoretical and practical classes, designing and grading midterms, and administering final tests. Due to the COVID-19 lockdown, all of the online classes (in Spanish) are available on YouTube; if you would like to view them, contact me for the link.

  • Technical Editor
    https://auth0.com/blog/

    I corrected and reviewed technical articles that different authors wrote for the Auth0 technology blog. The articles covered various fields: machine learning, security, development frameworks, design patterns, and more.
    I also tested the applications or code the authors developed in each article to check that they were well designed and implemented.

  • Social Network Analysis in R and Gephi: Digging Into Twitter
    https://www.toptal.com/r/social-network-analysis-in-r-gephi-tutorial

    In this article, I show how to perform a social network analysis using the Twitter API, R, and Gephi. By downloading a specific conversation, I build the social graph, plot it with a meaningful layout, and identify its principal communities.

  • Understanding Twitter Dynamics With R and Gephi: Text Analysis and Centrality
    https://www.toptal.com/r/social-network-analysis-in-r-gephi-2

    In this second article, I applied centrality measures to detect the principal actors of the discussion and natural language processing techniques to understand what they are talking about in each community.

  • Ensemble Methods: The Kaggle Machine Learning Champion
    https://www.toptal.com/machine-learning/ensemble-methods-kaggle-machine-learn

    Two heads are better than one. This proverb describes the concept behind ensemble methods in machine learning. In this article, I examine why ensembles dominate ML competitions and what makes them so powerful.

  • A Glimpse Into the Future of Data Science
    https://www.pangea.ai/data-science-resources/future-of-data-science/

    Data science is changing the world; it is at the heart of the fourth technological revolution. But how did we get here? How is the world changing? What else does this future hold?
    In this article, I describe the emergence of data science in our lives, how we got here, some representative cases, and where we are going.

  • 10 Best Data Science Development Frameworks to Use in 2021

    In a world where data is more valuable than oil, the demand for data scientists and analysts is skyrocketing. In this article, I present the best tools for tapping into these data reserves. Hands down, Python is the clear choice for any aspiring developer trying to break into the field of data analysis.

  • Application to Monitor Social Networks

    I developed, on my own, an entire application to monitor social networks like Instagram, Facebook, and Twitter. It provides statistics about any public account of interest and sends messages and alerts to the client through Telegram if something important is happening.

    It has a back end in R to download information, process it, and calculate statistics, and a PHP front end to show the data. I also used C to manage sessions; create, delete, and modify searches; and more.

  • Stars Realigned: Improving the IMDb Rating System
    https://www.toptal.com/data-science/improving-imdb-rating-system

    IMDb ratings have genre bias: Dramas tend to score higher, for example. Is there a way to remove such biases and discover what makes a movie unique?

    In this article, I show you how to refine IMDb scores and create a better ranking system through data science and machine learning techniques.

  • Hiring Data Scientists — Best Practices and Job Description Template

    Hiring an IT candidate is one of the hardest tasks that human resources professionals have to accomplish. Demand for IT professionals exceeds the supply available on the market, which produces competition between companies for the scarce qualified developers.
    In this article, I advise you on how to improve your candidate search and hire the best profiles for your team.

  • Mining for Twitter Clusters: Social Network Analysis With R and Gephi (Publication)
    Explore Twitter data clusters to discover keywords that dominate identified groups. Focusing on a politically slanted data set provides an easy target for analysis.
  • Understanding Twitter Dynamics With R and Gephi: Text Analysis and Centrality (Publication)
    Centrality and text analysis allow users to get more out of their social network data. Here’s how you can leverage them using R and Gephi.
  • Social Network Analysis in R and Gephi: Digging Into Twitter (Publication)
    Thanks to rapid advances in technology, large amounts of data generated on social networks can be analyzed with relative ease, especially for those who use the R programming language and Gephi.
  • Ensemble Methods: The Kaggle Machine Learning Champion (Publication)
    Two heads are better than one. This proverb describes the concept behind ensemble methods in machine learning. Let's examine why ensembles dominate ML competitions and what makes them so powerful.
  • Stars Realigned: Improving the IMDb Rating System (Publication)
    IMDb ratings have genre bias: For example, dramas tend to score higher. Removing common feature bias and keeping unique characteristics, it's possible to create a new, refined score based on IMDb information.

Skills

  • Languages

    PHP 7, Python 3, R, SQL, JavaScript, CSS, Python, PHP, TypeScript, HTML
  • Frameworks

    RStudio Shiny, CodeIgniter, .NET
  • Libraries/APIs

    igraph, Scikit-learn, Matplotlib, Pandas, Keras, Ggplot2, NumPy, jQuery, Caret
  • Paradigms

    Data Science, Test-driven Development (TDD)
  • Platforms

    RStudio, Jupyter Notebook, Linux, WordPress, Oracle, Gephi
  • Storage

    MySQL
  • Other

    Data Visualization, Charts, Social Networks, Social Network Analysis, Visualization Tools, OOP Designs, Machine Learning, Data Analytics, Data Analysis, Big Data, Time Series, Time Series Analysis, Clustering, Statistics, Code Review, Source Code Review, Team Management, Interviewing, Network Theory, Computer, Structure, Education, University Teaching, Stock Market, Stock Trading, Graphs, Networks, Computer Science, Writing & Editing, Hiring, Technical Writing, Authentication, Authorization, Back-end, Front-end, Data Engineering, Forecasting, JupyterLab, Natural Language Processing (NLP), Artificial Intelligence (AI), Blogging, Blog Posting, Technical Hiring
  • Tools

    Dplyr, Seaborn, Jupyter, Auth0, R Studio

Education

  • Ph.D. Degree (in Progress) in Computer Science
    2017 - 2022
    Universidad de Buenos Aires - Buenos Aires, Argentina
  • Master's Degree in Computer Science
    2010 - 2016
    Universidad de Buenos Aires - Buenos Aires, Argentina

Certifications

  • Deep Learning
    DECEMBER 2018 - PRESENT
    Coursera
  • Machine Learning
    OCTOBER 2018 - PRESENT
    Coursera
  • Laboratory of Machine Learning
    SEPTEMBER 2018 - PRESENT
    ITBA | Instituto Tecnológico de Buenos Aires
