Filip Boltuzic, Developer in Zagreb, Croatia
Filip is available for hire

Filip Boltuzic

Verified Expert in Engineering

Machine Learning Engineer and Developer

Location
Zagreb, Croatia
Member Since
April 30, 2020

Filip is a machine learning engineer with several years of professional experience. He's worked on large-scale problems at Amazon Web Services as a software developer and built natural language processing models as a research associate at the University of Zagreb. Filip's main interests are machine learning and natural language processing, with a focus on building text classification models.

Portfolio

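Online Freelance Agency
Machine Learning, Supervised Machine Learning, Reinforcement Learning...
Aggieland Software
Artificial Intelligence (AI), Machine Learning, Azure Machine Learning...
PD4 Solutions LLC
Artificial Intelligence (AI), Machine Learning, Python...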

Experience

Availability

Part-time

Preferred Environment

Java, Git, Linux, Docker, Apache Solr, Django, PyTorch, Pandas, NumPy, Scikit-learn, Python

The most amazing...

...machine learning model I've developed was an LSTM and CRF model to segment text into argumentative claims as part of my Ph.D. thesis.

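Work Experience

Research Consultant

2022 - Present
Online Freelance Agency
  • Investigated, researched, and documented caching methods in software.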
  • Reproduced the most popular caching methods for predicting time-to-live from research papers.
  • Built a simulator and a reinforcement learning model that tries to solve TTL prediction for object caching.
Technologies: Machine Learning, Supervised Machine Learning, Reinforcement Learning, Deep Reinforcement Learning, Data Science, NumPy
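
To illustrate the kind of TTL simulation described above, here is a minimal sketch of a trace-driven cache simulator with a fixed time-to-live. It is not the project's actual simulator: the class name `TTLCacheSimulator`, the `simulate` helper, and the request trace are all hypothetical, and the cache is assumed to set an object's expiry only when it is inserted (hits do not refresh it).

```python
from dataclasses import dataclass, field


@dataclass
class TTLCacheSimulator:
    """Hypothetical simulator: replays (timestamp, key) requests with a fixed TTL."""
    ttl: float
    expiry: dict = field(default_factory=dict)  # key -> time at which the object expires
    hits: int = 0
    misses: int = 0

    def request(self, now: float, key: str) -> bool:
        """Return True on a cache hit, False on a miss (which inserts the object)."""
        expires_at = self.expiry.get(key)
        if expires_at is not None and now < expires_at:
            self.hits += 1
            return True
        # Miss or expired entry: (re)insert with a fresh expiry time.
        self.misses += 1
        self.expiry[key] = now + self.ttl
        return False

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def simulate(trace, ttl):
    """Replay a whole trace and return the resulting hit ratio."""
    sim = TTLCacheSimulator(ttl=ttl)
    for now, key in trace:
        sim.request(now, key)
    return sim.hit_ratio()


# Made-up request trace: (timestamp in seconds, object key).
trace = [(0, "a"), (1, "a"), (2, "b"), (5, "a"), (6, "b"), (20, "a")]
for ttl in (3, 10):
    print(f"ttl={ttl}: hit ratio {simulate(trace, ttl):.2f}")
```

Comparing hit ratios across candidate TTL values on a replayed trace is one simple way to score a TTL choice; a learned model would presumably replace the fixed `ttl` with a per-object prediction.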

AI and ML Developer

2023 - 2023
Aggieland Software
  • Developed a large language model (LLM) LangChain bot to generate software requirements.
  • Built and deployed to the cloud a multi-process application exposed via an API that can chat with a user to generate software requirements.
  • Collaborated with two teams to integrate the LLM app via APIs to provide both web and mobile application access to the LLM app.
Technologies: Artificial Intelligence (AI), Machine Learning, Azure Machine Learning, Large Language Models (LLMs), Llama 2, FastAPI, LangChain

AI Expert

2023 - 2023
PD4 Solutions LLC
  • Developed an LLM-based solution to determine which scientific articles are related to user-inputted free-text criteria.
  • Evaluated the LLM solution performance and demonstrated metrics proving considerable improvement over the previously implemented solution.
  • Worked with ML engineers to deploy solutions and define an optimal architecture for applying the LLM solution.
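Technologies: Artificial Intelligence (AI), Machine Learning, Python, Natural Language Processing (NLP), Language Models, Text Classification, Unsupervised Learning, LangChain, Amazon Web Services (AWS), Git, GPT, Text Generation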

Senior Data Scientist

2021 - 2023
Lionbridge, Freelance (via Newfire Global Partners)
  • Developed a machine learning sequence labeling model on text data that achieved an F1 score above 0.9.
  • Decreased inference time on a previously developed machine learning model without sacrificing its F1 score.
  • Used PySpark and Databricks to perform a large-scale data analysis that the company employed to drive future business decisions.
  • Developed multiple highly scalable Python web services that are currently serving production traffic.
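Technologies: Python, Agile, Scrum, Web Services, JSON, PyTorch, SpaCy, Natural Language Toolkit (NLTK), PySpark, Jupyter, Databricks, Open Neural Network Exchange (ONNX), Neural Networks, LSTM, Pandas, Data Science, NumPy, Git, Natural Language Processing (NLP), Data Analysis, Azure Databricks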

Data Science Engineer

2022 - 2022
BJS
  • Developed prototype product recommenders which showed customer purchasing patterns.
  • Built simple AWS Lambda functions to run ETL workflows.
  • Worked with PySpark on large sets of data (>100GB of historical purchases).
Technologies: Python, Machine Learning, Spark ML, Scikit-learn, PySpark, Amazon Web Services (AWS), Git

Machine Learning Engineer

2020 - 2021
Alchemy V Ltd (via Toptal)
  • Created a marketing slogan text generator using Hugging Face transformers/text generation pipelines and customer-provided data.
  • Created a data ingestion and reporting process via multiple Google Cloud services: BigQuery, Cloud Functions, Cloud Endpoints, and Dataproc.
  • Ported existing R reporting code to a Python web service.
Technologies: Google Cloud, Google Cloud API, Google BigQuery, R, Python, Text Generation, SQL, Git

Natural Language Processing Consultant

2020 - 2021
Granville Knowledge Management (via Toptal)
  • Developed a scraper to download a large (about 20,000) and diverse set of legal documents (1990 until today) from a European public repository.
  • Used machine learning to build a text classification model to automatically classify categories based on document content.
  • Created a dataset of legal documents and used it to train and evaluate the text classification model. Shared results via Google Colab so that customers could interactively test the model's performance on their held-out data.
Technologies: Python, Scrapy, Web Scraping, PyTorch, Jupyter, Google Colaboratory (Colab), Text Classification, Natural Language Processing (NLP)

Research Associate

2018 - 2020
Laboratory at the University of Zagreb
  • Developed a search engine for Croatian legal documents.
  • Built a named entity recognition model in PyTorch by combining LSTM with a CRF.
  • Mentored several students doing intern projects and wrote my master's thesis on natural language processing.
Technologies: Scikit-learn, PyTorch, Apache Solr, Django, Python, Torch, Pandas, Data Science, Git, Natural Language Processing (NLP)

Software Development Engineer

2014 - 2017
Amazon Web Services (AWS)
  • Contributed to developing a scalable time-series database solution in Java and C++, which handles about one million requests per second.
  • Served as the team's Scrum master and product owner.
  • Designed and implemented a network correlation engine microservice to handle networking events from the entire Amazon network (patent award http://patents.justia.com/inventor/filip-boltuzic).
Technologies: Amazon Web Services (AWS), C++, Python, Java, Algorithms, Programming, Agile, Git, Web Services

Business Intelligence Analyst

2012 - 2014
Zagrebačka banka (UniCredit Group)
  • Developed SQL reports to determine the promising retail strategies in a data warehouse.
  • Built an interactive tool in Java to speed up the processes in Oracle Data Integrator.
  • Developed small web applications for the accounting department, using PL/SQL and Oracle Apex.
Technologies: Java, SQL, Data Science

Croatian Legal Documents Search Engine

A Django and Apache Solr web application.

I was the lead developer on this project and proposed the system's architecture as a set of microservices. Documents were stored and indexed in Solr, whereas the Django front end served requests and communicated with Solr.

Retail Sales Forecasting

The project was to design a model to predict sale amounts based on historical data of orders, previous sales, and regions. The forecasting was done on a regional and global level and was framed as a time-series prediction problem. I experimented with several time-series prediction techniques such as ARIMA and SARIMA models.
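
As a rough illustration of the autoregressive idea behind ARIMA-family models, the sketch below fits a first-order autoregression, y[t] = c + phi * y[t-1], by ordinary least squares and rolls it forward. This is a deliberately simplified stand-in, not the model used in the project: it has no differencing, moving-average, or seasonal terms, and the function names and sales figures are made up for the example.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x = series[:-1]  # lagged values y[t-1]
    y = series[1:]   # current values y[t]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, steps):
    """Iterate the fitted AR(1) recurrence `steps` periods past the last observation."""
    c, phi = fit_ar1(series)
    last = series[-1]
    predictions = []
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical sales history that follows y[t] = 2 + 0.5 * y[t-1] exactly.
sales = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(sales)
print(f"c={c:.3f}, phi={phi:.3f}, next two periods: {forecast(sales, 2)}")
```

In practice, a full ARIMA or SARIMA fit would come from a dedicated time-series library rather than hand-rolled least squares; the sketch only shows how past values drive the forecast.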

Other

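Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Artificial Intelligence (AI), Machine Learning, Back-end, OpenAI GPT-3 API, Data Analysis, Azure Databricks, Retrieval-augmented Generation (RAG), Clustering Algorithms, Clustering, Classification Algorithms, Text Classification, Torch, Web Scraping, Google Colaboratory (Colab), Google BigQuery, Text Generation, Web Services, Open Neural Network Exchange (ONNX), Neural Networks, Research, Student Engagement, Supervised Machine Learning, Time Series, Autoregressive Integrated Moving Average (ARIMA), LangChain, OpenAI, Reinforcement Learning, Deep Reinforcement Learning, Algorithms, Programming, Heuristics, Optimization, Evolutionary Computation, Genetic Algorithms, Convolutional Neural Networks (CNN), Sorting Algorithms, Pattern Recognition, Language Models, Unsupervised Learning, Big Data, Unstructured Data Analysis, Large Language Models (LLMs), Llama 2, FastAPI, Prompt Engineering, OpenAI GPT-4 API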

Languages

Python, SQL, Haskell, Java, C++, R

Libraries/APIs

Scikit-learn, NumPy, Pandas, PyTorch, Google Cloud API, SpaCy, Natural Language Toolkit (NLTK), PySpark, LSTM, Spark ML

Tools

Vim Text Editor, Solr, Apache Solr, Git, Oh My Zsh, Boto, Jupyter, LaTeX, Azure Machine Learning

Paradigms

Data Science, Anomaly Detection, Agile, Scrum, Business Intelligence (BI)

Platforms

Amazon Web Services (AWS), Linux, Docker, Databricks, SolrCloud

Frameworks

Django, Scrapy

Storage

Elasticsearch, Google Cloud, JSON

2012 - 2020

Ph.D. in Natural Language Processing

University of Zagreb - Zagreb, Croatia

2010 - 2012

Master's Degree in Computer Science

University of Zagreb - Zagreb, Croatia

2010 - 2011

Erasmus Exchange Studies in Computer Science

KTH Royal Institute of Technology - Stockholm, Sweden

2007 - 2010

Bachelor's Degree in Computer Science

University of Zagreb - Zagreb, Croatia

November 2017 - Present

Convolutional Neural Networks

Coursera
