Maciej Jarka, Data Engineer and Developer in Warsaw, Poland

Member since June 18, 2020
Maciej is an ETL, BI, and big data engineer who's eager to share his knowledge and experience. He has over a decade of international experience, gained on projects ranging from data integration to analytics at industry leaders such as ING (banking) and Roche (pharma). Maciej is also proficient with several tools, including Snowflake, Netezza, Informatica, and DataStage (he’s a certified IBM DataStage trainer).

Experience

  • SQL 10 years
  • Java 5 years
  • IBM InfoSphere (DataStage) 5 years
  • Data Modeling 5 years
  • Netezza 3 years
  • Talend ETL 3 years
  • Snowflake 2 years
  • AWS Lambda 1 year

Location

Warsaw, Poland

Availability

Part-time

Preferred Environment

SQL Server Analysis Services (SSAS), Eclipse, Git, Linux

The most amazing...

...solution I've worked on was a JDBC driver that lets you transform your relational data into an RDF graph; see more at Powerdf.com

Employment

  • Cloud Data Warehouse Architect

    2019 - PRESENT
    Fortune 100 North American Construction Equipment Manufacturer (Toptal Client)
    • Built foundations of enterprise data lake architecture.
    • Designed and developed data pipeline patterns to be used across the enterprise.
    • Trained business and engineering teams to use the product.
    Technologies: Snowflake, SnapLogic, Data Lakes, Data Warehousing, AWS
  • Senior Data Engineer

    2019 - 2021
    SeaChange
    • Designed data pipelines for processing large VOD metadata datasets in Snowflake and Talend.
    • Created data warehouse structures to support advanced analytics.
    • Prepared and masked large data sets for machine-learning activities.
    Technologies: Data Pipelines, Data Architecture, Data Warehouse Design, Data Warehousing, Amazon Kinesis Data Firehose, XSLT, AWS Glue, Azure Blobs, Talend ETL, Data Modeling, Git, Boto 3, Pandas, Python 3, Windows PowerShell, Amazon Athena, AWS Lambda, AWS Kinesis, Azure Blob Storage API, Amazon S3 (AWS S3), Snowflake, SQL, Databases, ETL, Tableau, Docker, Talend
  • Senior Data Engineer | Team Leader | Trainer

    2017 - 2019
    ING (Belgium)
    • Built ETL solutions for the bank's marketing department.
    • Optimized data structures in IBM PureData System for Analytics.
    • Trained team members in the scope of IBM DataStage.
    • Designed data structures in star schema to meet reporting requirements.
    Technologies: Data Pipelines, Data Architecture, Data Warehouse Design, Data Warehousing, IBM Tivoli Workload Scheduler, Kimball Methodology, IBM InfoSphere (DataStage), Data Modeling, Git, Windows PowerShell, AWS Lambda, Amazon S3 (AWS S3), Snowflake, SQL, Databases, ETL, Netezza, Datastage
  • Senior BI Developer

    2016 - 2017
    Robert Half (US)
    • Built management reports in SAP Business Objects.
    • Designed SAP BO universes.
    • Optimized Oracle data warehouse for reporting.
    Technologies: Data Pipelines, Data Architecture, Data Warehouse Design, Data Warehousing, SAP BusinessObjects (BO), Kimball Methodology, Data Modeling, Oracle RDBMS, SQL, Databases, ETL, Oracle, SAP BusinessObjects Data Service (BODS)
  • Senior ETL Developer

    2015 - 2017
    Roche Pharmaceuticals
    • Developed ETL solutions for large market data sets.
    • Optimized Teradata database structures for reporting.
    • Gathered requirements from the data science team.
    • Built database-to-graph solutions with Talend and SPARQL.
    Technologies: Data Pipelines, Data Architecture, Data Warehouse Design, Data Warehousing, XSLT, Talend ETL, Kimball Methodology, Data Modeling, Git, PL/SQL, SQL, Databases, ETL, RDF, SPARQL, Teradata, Talend, Informatica PowerCenter
  • Senior ETL Developer

    2014 - 2015
    Ascen
    • Developed ETL processes with IBM DataStage and Microsoft Integration Services.
    • Generated reports in SAP BusinessObjects for LOT Polish Airlines.
    • Created analytical cubes with Microsoft Analysis Services for LOT Polish Airlines.
    Technologies: Data Pipelines, Data Architecture, Data Warehouse Design, Data Warehousing, Netezza, Kimball Methodology, IBM InfoSphere (DataStage), Data Modeling, Git, PL/SQL, SQL, Databases, ETL, SQL Server Analysis Services (SSAS), Datastage
  • Data Integration Developer

    2013 - 2014
    Roche Pharmaceuticals
    • Integrated enterprise data sources such as Oracle, Salesforce, and LDAP.
    • Built ETL and data virtualization solutions.
    Technologies: Data Pipelines, Data Architecture, Data Warehouse Design, Data Warehousing, Data Modeling, PL/SQL, SQL, Databases, ETL, Java, Oracle RDBMS, JBoss
  • Master Data Management (MDM) Developer

    2011 - 2013
    ValueTank
    • Designed a data model for a master data management solution.
    • Developed Java software components embedded in IBM MDM platform.
    • Collected a client's requirements related to change requests.
    Technologies: Data Pipelines, Data Architecture, XSLT, IBM Tivoli Workload Scheduler, Netezza, Kimball Methodology, IBM InfoSphere (DataStage), Data Modeling, Windows PowerShell, PL/SQL, Master Data Management (MDM), Oracle RDBMS, Web Services, SQL, Databases, Management, Data, Java
  • Software Developer

    2008 - 2011
    Comarch
    • Developed an advanced data mining and integration platform called ADMIRE (Admire-project.eu).
    • Built data-mining models for churn prediction and cross-selling.
    • Constructed Java-based components for the platform.
    Technologies: Data Pipelines, XSLT, Git, Windows PowerShell, PostgreSQL, PL/SQL, Oracle RDBMS, Web Services, SQL, Databases, Data Mining, Weka, Java

Experience

  • PoweRDF

    PoweRDF is a JDBC driver allowing the user to transform their relational data into an RDF graph. It is portable to all ETL tools supporting JDBC. Please refer to the website for more details.

  • Trend Catcher for Twitter

    Trend Catcher is an academic app built for a research team led by Szymin Piatek from the University of Warwick. It aggregates Twitter data per location, enabling trend analysis.

  • The DATA Bonanza (Book)

    I am a guest co-author of a data mining-related book released by Wiley in 2013. This work was done while I was also a member of the ADMIRE project (Admire-project.eu). You can find my chapter about churn prediction in the section on analytical CRMs.

Skills

  • Languages

    Snowflake, SQL, Java, SPARQL, RDF, XSLT, Python 3
  • Tools

    IBM InfoSphere (DataStage), Talend ETL, Informatica PowerCenter, IBM Tivoli Workload Scheduler, Amazon Athena, AWS Glue, Git, Weka, Tableau, Boto 3, SnapLogic
  • Paradigms

    ETL, Kimball Methodology, Management
  • Storage

    Databases, Data Pipelines, Netezza, Master Data Management (MDM), PL/SQL, Azure Blobs, Amazon S3 (AWS S3), PostgreSQL, Oracle RDBMS, Datastage, SQL Server Analysis Services (SSAS), Data Lakes, Teradata
  • Other

    Data Warehousing, Data Architecture, Data Warehouse Design, Data Modeling, SAP BusinessObjects (BO), Web Services, SAP BusinessObjects Data Service (BODS), AWS, Data Mining, Amazon Kinesis Data Firehose
  • Libraries/APIs

    Azure Blob Storage API, Pandas
  • Platforms

    Linux, AWS Kinesis, AWS Lambda, Eclipse, JBoss, Talend, Oracle, Docker
  • Frameworks

    Windows PowerShell

Education

  • Master of Science (MSc) Degree in Computer Science
    2010 - 2012
    Warsaw University of Technology - Warsaw, Poland
  • Engineer's Degree in Computer Science
    2006 - 2010
    Warsaw University of Technology - Warsaw, Poland

Certifications

  • SnowPro Advanced Architect
    DECEMBER 2021 - PRESENT
    Snowflake (Udemy)
  • SnowPro Core
    JANUARY 2021 - PRESENT
    Snowflake
  • C2090-424-EN InfoSphere DataStage v11.3
    JANUARY 2017 - PRESENT
    IBM
  • 00-N50 IBM PureData System for Analytics (Netezza) Technical Mastery
    JANUARY 2014 - PRESENT
    IBM
  • 000-M78 - IBM Initiate Master Data Service Technical Mastery
    JANUARY 2012 - PRESENT
    IBM
  • P2090-086 IBM InfoSphere Master Data Management - PIM Technical Mastery
    JANUARY 2011 - PRESENT
    IBM
  • 1Z0-851 - Oracle Certified Programmer for the Java 2 Platform, SE 6.0
    JANUARY 2010 - PRESENT
    Oracle
