The biggest European conference about ML, AI and Deep Learning applications
running in person in Prague and online.

Machine Learning Prague 2023

In cooperation with Kiwi.com


Registration

World class expertise and practical content packed in 3 days!

You can look forward to an excellent lineup of 45 international experts in ML and AI business and academic applications at ML Prague 2023. They will present advanced practical talks, hands-on workshops and other forms of interactive content to you.

What to expect

  • 1000+ Attendees
  • 3 Days
  • 45 Speakers
  • 10 Workshops
  • 1 Party

Phenomenal Speakers

Radovan Kavicky

Data Science Instructor, DataCamp

Radovan Kavicky was among DataCamp's first employees: historically the first Data Science Instructor from the CEE region, and still the only person worldwide to have made the transition from regular student to DataCamp instructor and employee, after being #1 worldwide on the DataCamp platform for a year back in 2017.

Radovan is a Data Science polyglot (R, Python, Julia and more) and a Data Science veteran with 11+ years of experience in Data Science and Applied AI/ML consulting, with extensive knowledge in the area (Data Science consulting, education and community building, in successful cooperation with global leaders in the industry such as H2O.ai, Anaconda or Tableau). Radovan is also a co-founder of Slovak.AI (Slovak Research Center for Artificial Intelligence), a member of AIslovakIA (National platform for the AI development in Slovakia) and of various international professional societies within the Data Science & AI/ML industry, such as the IEEE Computer Society, CLAIRE (Confederation of Laboratories for Artificial Intelligence Research in Europe), European AI Alliance (European Commission/Futurium), TAILOR network (Trustworthy AI - Integrating Learning, Optimisation and Reasoning), UDSC (United Data Science Communities), PyData Global Network, Global Tableau #DataLeader network & The Python Software Foundation (PSF).

Radovan is the founder of PyData Slovakia/Bratislava (#PyDataSK #PyDataBA), R <- Slovakia (#RSlovakia), Julia Users Group Slovakia (#JUGSlovakia) & SK/CZ Tableau User Group (#skczTUG), all of which you are welcome to join.

Present your project

Would you like to present your project or research at Machine Learning Prague 2023?
Apply for the poster session before April 15, 2023.

Submit a poster

Practical & Inspiring Program

Friday
Workshops

O2 Universum, Českomoravská 2345/17a, 190 00, Praha (workshops won't be streamed)

Registration

Room D2 Room D3 Room D4 Room D6 Room D7
coffee break

Operationalizing Responsible AI in Practice

Room D2

Mehrnoosh Sameki, Microsoft
Michal Marusan, Microsoft

Are you a data scientist looking to author machine learning solutions responsibly using the latest tooling? Our brand-new Responsible AI dashboard is designed to help you by providing a single pane of glass, bringing together a variety of model assessment and responsible decision-making capabilities under one roof. The dashboard enables you to easily assess and validate your models by looking into a variety of model performance, fairness, and error analysis components; interpret your models (including blackbox ones) to understand how they are making their predictions; perform perturbations via what-if analysis and counterfactual analysis; and understand and fix data imbalance issues. By the end of this session, you will have gained hands-on experience in using these tools and in using their outputs to identify, diagnose, and mitigate your models’ issues and communicate their value to your stakeholders across the organization.

Gaussian process regression when it comes to numerical simulators

Room D3

Thomas Browne, Kiwi.com
Lucie Blechová, Kiwi.com

While numerical simulators are often used by heavy industries to model complicated phenomena, their complexity sometimes makes them slow and hard to exploit. Gaussian process regression (GPR) provides an accurate framework where, based on a limited number of calls to the simulator, one can predict any of the simulator's outputs together with confidence bounds. GPR can then be extended to solve optimization and sensitivity analysis tasks with a parsimonious approach. In this workshop, attendees will be walked through the basics of GPR in Python. They will also be provided with implemented examples of how GPR can help.
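To give a feel for the idea ahead of the workshop, here is a minimal, standard-library-only sketch of GPR on a handful of simulator calls. It is not the workshop material: `toy_simulator`, the RBF kernel, the length scale, and the noise level are all assumptions chosen for illustration, and a real project would more likely use an established library.

```python
import math

def toy_simulator(x):
    # Hypothetical stand-in for an expensive numerical simulator.
    return math.sin(x)

def rbf(a, b, length_scale=1.0):
    # Squared-exponential (RBF) kernel with unit variance.
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(x_train, y_train, x_new, length_scale=1.0, noise=1e-6):
    # Posterior mean and standard deviation of a zero-mean GP at x_new.
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_new, length_scale) for x in x_train]
    alpha = solve(K, y_train)          # K^-1 y
    v = solve(K, k_star)               # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new, length_scale) - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, 0.0))

# Only four "expensive" simulator calls are used as training data.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [toy_simulator(x) for x in xs]
mean, std = gp_predict(xs, ys, 1.5)
print(f"prediction at 1.5: {mean:.3f} +/- {1.96 * std:.3f} (approx. 95% bounds)")
```

The standard deviation shrinks near the points where the simulator was actually called and grows far away from them, which is what makes the confidence bounds useful for deciding where to call the simulator next.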

Drug discovery using NLP

Room D4

Aisling O’Sullivan, Dataclair

NLP is an important and rapidly growing field. While its application in fields such as language translation and chatbots is well known, the use of NLP in the billion-dollar pharmaceutical industry is less commonly cited. NLP is particularly appealing for drug discovery, since these models are capable of capturing complex medical concepts that are difficult for humans to grasp, as well as understanding the structure of molecules, which is key to discovering novel drugs. In this workshop, you will be introduced to the world of using machine learning for drug discovery, with a focus on NLP. We'll show you how to apply ML techniques to discover novel drug candidates using NLP on text and also by applying NLP to the "language of molecules." We will do this through a use case of classifying molecules that can or cannot cross the blood-brain barrier. This use case is important for developing drugs that target diseases of the central nervous system (such as Alzheimer's) as well as for identifying potentially toxic drugs. We'll also explain the applicability of these approaches to other important problems, such as identifying antibiotics and cancer drugs.

Learning to Learn: Hands-on Tutorial on Using and Improving Few-Shot Language Models

Room D6

Michal Štefánik, Gauss Algorithmic
Nikola Groverová, Gauss Algorithmic

As AI models become an increasingly common element of many applications, we ever more often face the practical limitations of specialized models that work well only for a single training task and dataset. Huge language models like OpenAI's GPT-3 showed that models could be much more versatile and adapt to new tasks without updating the model, provided only with natural-language instructions and a small number of input-output examples of the desired task. In practice, few-shot learners can solve your new task with accuracy comparable to supervised models trained on hundreds to thousands of samples. Our workshop will give you an overview of the existing models capable of few-shot learning, including their limitations. We will experiment with creative ways of utilizing in-context few-shot learning, such as customizing the model's predictions to specific users. Finally, we will provide some recipes for training few-shot learners for new languages or further scaling up the accuracy of existing few-shot models.
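As a rough illustration of what "instructions plus a small number of input-output examples" can look like in practice, the sketch below assembles an in-context few-shot prompt as plain text. The function name `build_few_shot_prompt` and the `Input:`/`Output:` layout are assumptions made for this example, not a format prescribed by the workshop or by any particular model.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an in-context few-shot prompt.

    instruction: natural-language description of the task
    examples:    list of (input_text, output_text) demonstration pairs
    query:       the new input the model should complete
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The prompt ends with an open "Output:" slot for the model to fill in.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [("Great talk, learned a lot.", "positive"),
     ("The room was too crowded.", "negative")],
    "The workshop was well organized.",
)
print(prompt)
```

The resulting string would be sent to a language model as-is; no model weights are updated, which is the point of in-context learning.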

Reproducible, portable, and distributable ML solutions in Python

Room D7

Stepan Kadlec, ForML
Mike Pearmain, VietcomBank

When achieved, the combination of reproducibility, portability, and distributability in ML solutions constitutes a powerful faculty unlocking a number of operational opportunities. While reproducibility is a well-established pathway for conducting scientific research, it does not always receive the same recognition within the data product industry. Similarly, portability and distributability are typically regarded as irrelevant for bespoke solutions and only pursued in case of explicit demands. This might be reasonable given the extra cost incurred by conventional development, but with modern tooling these properties can be achieved without much extra effort. In return, this brings significant benefits in the form of highly collaborative R&D, inherent lifecycle management, effective model troubleshooting, carefree and flexible deployment (latency/throughput-optimal runtime modes), and even potential commoditization (market of turnkey solutions). In this workshop, we will dive deeper into these concepts, carefully examining the available technologies and reviewing some of the existing tools. A significant amount of the time will be spent working with the ForML framework, implementing a practical end-to-end ML solution demonstrating all of these declared principles.

Lunch
coffee break

LIME & SHAP: Explainable AI (xAI)

Room D2

Radovan Kavicky, DataCamp

In this workshop, led by Radovan Kavicky from DataCamp, Basecamp.ai & GapData Institute, you will get familiar with the principles & tools of Explainable AI (xAI) like LIME, SHAP and others. Complex modern-day ML algorithms, where deep learning & ensemble methods dominate, are really hard to fully understand, but the decision process behind them can & needs to be transparent and trustworthy for decision makers within critical domains such as finance, healthcare or the public sector & governmental services, where TRUST is a MUST. In fact, with growing regulatory pressure also outside these areas, Explainable AI (xAI) will soon be a necessity for any organization. You will learn how to understand the inner workings of these ML algorithms & how to design systems that imitate intelligence in a transparent way. You will also get an overview of current trends in Explainable AI/ML & the challenges that are ahead of us.

Bayesian Networks in business planning and risk management

Room D3

Martin Plajner, Logio
Theodor Petřík, Logio

Explore with us a complex and powerful family of models: Bayesian Networks. In our workshop, you will have a chance to i) understand Bayesian Network models and their strengths, drawbacks, and application areas; ii) build a data-based model, which you will use to answer business planning questions and what-if scenarios; and iii) create an expert-knowledge model to handle risk management, infer posterior probabilities, and construct emergency scenarios. You will have an opportunity to get hands-on experience with Bayesian Network modeling using the R language. No prior knowledge of Bayesian Networks is required; bring a laptop with the current R version ready to use.

Predicting weather with deep learning

Room D4

Petr Šimánek, FIT CTU
Jiří Pihrt, FIT CTU
Matej Choma, Meteopress

In this workshop, we will implement, train, and test machine learning models that analyze satellite and weather radar data. You will get hands-on experience with the most common deep neural nets used for spatiotemporal predictions (e.g., UNet with some bells and whistles, and convolutional recurrent nets). You will play with a PyTorch implementation and analyze the results. You will understand the common pitfalls and the reasons why predictions fail.

ML with a Large Set of Variables: Feature Selection Techniques for Regression in Python

Room D6

Aneta Havlínová, Workday
Martin Koryťák, Workday

In many ML applications, we encounter datasets that have a large number of potential features but relatively few observations. Examples range from the analysis of genetics data with thousands of gene expressions, through financial data modelling with voluminous data that flows in from capital markets and economies, to the HR analytics area with extensive data on employees, such as their personal information, skills, job histories, and more. In these cases, feature selection is crucial to prevent overfitting and to improve model performance. This workshop provides participants with an overview of some of the useful feature selection methods, including linear models such as Orthogonal Matching Pursuit and tree-based methods such as Random Forest or Boruta. First, a theoretical background is presented. Afterwards, the participants are guided step by step through the implementation of these methods in Python, with the practical use case tied to the HR data analytics context.

Transform Your Data Game: Mastering Data Modeling and Analytics with dbt

Room D7

Jozef Reginac, STRV
Pavel Jezek, STRV

dbt has gained significant traction in the analytics engineering community and is on a quest to become the go-to tool for data teams. With its latest addition, Python models, it’s becoming relevant even for machine learning engineers. We would like to walk you through the basic project setup and the first data model, all the way up to creating a Python model. Our goal is for you to be confident in using dbt in your team and to help you merge the work of all data team members into one environment.

Saturday
Conference day 1

O2 Universum, Českomoravská 2345/17a, 190 00, Praha (and on-line)

Registration from 9:00

Welcome to ML Prague 2023

Player of Games - Search in Imperfect Information Games

Martin Schmid, DeepMind

From the very dawn of the field, search with value functions was a fundamental concept of computer games research. Turing’s chess algorithm from 1950 was able to think two moves ahead, and Shannon’s work on chess from 1950 includes an extensive section on evaluation functions to be used within a search. Samuel’s checkers program from 1959 already combines search and value functions that are learned through self-play and bootstrapping. TD-Gammon improves upon those ideas and uses neural networks to learn those complex value functions, only to be used again within search. The combination of decision-time search and value functions has been present in the remarkable milestones where computers bested their human counterparts in long-standing challenging games: Deep Blue for chess and AlphaGo for Go. Until recently, this powerful framework of search aided by (learned) value functions has been limited to perfect information games. We will talk about why search matters, and about generalizing search to imperfect information games.

Boosting Investment Decisions with Graph Attention Reinforcement Learning

Michal Dufek, Analytical Platform

Are you tired of using traditional methods for asset pricing? Look no further than our cutting-edge research on Graph Attention Reinforcement Learning! By utilizing graph neural networks with attention mechanisms and a deep reinforcement learning framework, we have developed a new approach that outperforms existing methods in terms of accuracy and efficiency. Our GARL approach is evaluated using synthetic simulated data and shows that it is an effective approach, particularly when the problem is redesigned as a multi-class classification problem. Don't miss out on this exciting new development in asset pricing!

To be announced

LUNCH

LLM-driven game characters

Marek Rosa, GoodAI

We present AI Game, a novel type of role-playing sandbox game that leverages LLM-powered agents to enhance player experience. Our game features agents with long-term memory and autonomous goal pursuit, enabled by large language models that emulate their personalities, behaviors, thoughts, actions, and dialogues. These agents can observe and interact with their environment, communicate with each other, and make decisions based on their individual goals. Our game offers a unique and immersive gameplay experience that challenges traditional notions of game design and opens up exciting new avenues for exploration in AI and game development. In this talk, we will discuss the technical and design aspects of our game, and highlight some of the key challenges and opportunities in this emerging field.

Smart summaries

Martin Neznal, Productboard

In this talk, we would like to focus on the summarization of collections of feedback and describe all its challenges. We will focus on state-of-the-art summarization models, such as GPT-3, open source GPT variants, BART, and other transformers, as well as some extractive approaches such as Gensim. We will show how they perform when summarizing different types of text, such as conversations, reviews, and long and short texts.
We will present the industry-standard methods for evaluating summaries, such as ROUGE, BLEU, BLANC, BERTScore, or SUPERT, and use them to evaluate the summarization models. We will show how we use these approaches at Productboard to evaluate the quality of thousands of summaries daily, automatically and without supervision.
We will talk about techniques that can be applied to summarization models to achieve significantly better summaries, such as fine-tuning, ways of querying GPT models, and text cleaning.
We will also focus on multi-document summarization. We will describe the state-of-the-art models for this task, how to evaluate a multi-document summary, and which techniques we use to preprocess the input documents when we need to summarize a collection comprising hundreds or thousands of texts into one paragraph (such as clustering, text relevancy, or pre-summarization of single documents).
In the last section of our talk, we will share our experience of implementing the summarization feature in Productboard: how we incorporate user feedback into our summarization pipeline, how we connect summaries with other ML features, which tech stack we use, and how we scale it to deploy an independent solution for thousands of companies (each with thousands of texts/pieces of feedback).
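For readers unfamiliar with the evaluation metrics mentioned above, the following is a simplified sketch of ROUGE-1, which scores a candidate summary by its unigram overlap with a reference summary. It is an illustration only: it assumes lowercase whitespace tokenization and skips the stemming and other preprocessing that full ROUGE implementations typically offer, and it does not reflect how Productboard computes its scores.

```python
from collections import Counter

def rouge_1(reference, candidate):
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((ref & cand).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p, r, f = rouge_1("the cat sat on the mat", "the cat lay on the mat")
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy pair, five of the six tokens in each sentence match, so precision and recall coincide; with summaries of different lengths the two would diverge.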

Deep learning approaches to speaker diarization

Mireia Diez Sánchez, Brno University of Technology

Speaker diarization is the task of determining the speaker turns in a recording of a conversation, automatically finding “who speaks when”. 
Speaker diarization is one of the most challenging tasks in the automatic speech processing field: it deals with voice activity detection (VAD), speaker recognition, segmentation of the speech into speaker turns, handling of overlapped speech and it needs to infer the number of speakers in the input conversation.
In this talk, we will focus on the recent neural network-based state-of-the-art methods, such as end-to-end diarization (EEND) and target-speaker VAD systems and will explain how these architectures tackle the speaker diarization problem.

COFFEE BREAK

Multi-Model Machine Learning based Industrial Vision Tool for Assembly Part Quality Control

Aimira Baitieva, Valeo

Creating a visual inspection tool in the automotive industry can be challenging due to the many different types of defects, including ones we have not seen before. Physical setup constraints, the importance of missed defects, and the subjectivity of labels add even more complexity to this task. I will illustrate all this on a Valeo project aimed at helping the operator detect bad sensors on the production line using visual information. To tackle this complex problem, we have combined different computer vision models, extracting features from the segmented anomaly map alongside the supervised classifier score and using them for the final classification.

3D Pose Estimation in Sport

Piotr Skalski, Roboflow

Are you ready to take your sports analysis to the next level? Look no further! In this talk, we will dive deep into the exciting world of 3D pose estimation using multiple cameras and the powerful YOLOv7 model. From detection to post-processing, calibration to visualization, I'll be walking you through every step of the process and providing you with the professional insights you need to improve your analysis. But don't worry, this talk won't be all work and no play - I promise to add some humor to keep things interesting.

Neural fields in aerial 3D reconstruction

Martina Bekrová, Melown Technologies

Methods using neural fields for novel view synthesis are a hot research topic. Training neural fields was initially very computationally demanding: it could take days to create one scene and minutes to render each view. But soon afterwards came Nvidia's Instant NeRF, which sped up the computation dramatically, and we decided to test how it works with images from aerial scanning. With a few modifications, it is also possible to create a mesh of the scene. In this talk, we will share our experiments with various neural-field-based methods applied to aerial data and compare the results with our traditional SfM algorithm for 3D reconstruction.

COFFEE BREAK

To be announced

How to Lead a Data Science Team: Practical solutions for a more streamlined workflow.

Foad Vafaei, JetBrains Datalore

When stakeholders see the tangible benefits derived from large datasets, they are fascinated by data science. At other times, a chasm separates the data science team from business domain experts. How do we cross the chasm and develop a strong vision? How do we foster a collaborative culture so that data science and analytics projects will not run into costly delays?

If data science is to be transformational, it must be democratized in the organization. There are many ways to democratize data science in your organization and foster collaboration.

In this talk, we'll discuss practical solutions to overcome the hurdles and work towards a more efficient data science workflow.

PARTY

Location will be announced.

Sunday
Conference day 2

O2 Universum, Českomoravská 2345/17a, 190 00, Praha (and on-line)

Doors open at 08:30

State and Future of Quantum Computing & Quantum Machine Learning

Alexander Del Toro Barba, Google

In this talk, Alexander will give an overview of recent developments in quantum computing and quantum machine learning, tackle potential applications in the near term and for fault tolerant quantum computers, and provide some tips on how to start in this field.

Probabilistic Precipitation Nowcasting with Deep Physics-Constrained Neural Networks

Matej Murín, Meteopress

At Meteopress, we have developed a neural network for precipitation nowcasting, achieving state-of-the-art quantitative performance in our geographic area but producing blurry predictions. In this contribution, we go over the shortcomings of traditional, regression-based neural networks for nowcasting and showcase why and how they fail to produce realistic-looking and physically sound predictions. We then propose a new type of physics-constrained generative adversarial network, named PhyDGAN, and explain the decisions that led to this architecture's design. We show how this type of network has better probabilistic qualities than a GAN without any physical constraints while still producing accurate, realistic predictions. We study how the introduced physical constraints influence the model and explore the possibility of creating new physically-based output features that would be interesting from a meteorological perspective.

Improving network quality with embeddings

Network quality is a key determinant of the customer experience. Poor network quality can lead to a negative customer experience and a decrease in customer loyalty. Mobile operators are under pressure to monitor and improve their network quality. Traditional methods of predicting network quality, such as signal strength and network coverage, are unable to capture the full range of information available in the network. To address this, we present a novel approach to network quality prediction using embeddings. The approach uses Word2Vec to create embeddings for network elements called BTS (Base Transceiver Stations) and for mobile devices (e.g., smartphones). Data for network element embeddings is generated from logs of communications between a mobile device and BTS stations. The set of these logs for a given mobile device can be represented as a sentence whose tokens are BTS IDs. To generate embeddings of mobile devices, we were inspired by the Node2Vec approach, where the interactions between mobile devices can be seen as a directed graph, so that the sentences for Word2Vec are generated by random walks through the graph. Initial experiments show that our approach captures important information that cannot be obtained using traditional methods and correctly encodes, for example, geospatial information. Together with traditional data, the models using embeddings are able to predict the utilisation of network elements and provide other insights. Therefore, this approach can be used to predict network quality, improve it, and provide a better customer experience. In addition, the model could be used to improve network optimisation, allowing mobile operators to identify network elements that require more attention and allocate resources more efficiently.
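The data preparation described above can be sketched roughly as follows. This is an illustrative guess at the shape of the pipeline, not the speakers' code: the log format `(device_id, timestamp, bts_id)`, the edge list, and both function names are hypothetical, and the walks here are uniform random walks rather than Node2Vec's biased walks.

```python
import random
from collections import defaultdict

def bts_sentences(logs):
    """Turn (device_id, timestamp, bts_id) log rows into one 'sentence'
    of BTS IDs per device, ordered by time."""
    by_device = defaultdict(list)
    for device, timestamp, bts in logs:
        by_device[device].append((timestamp, bts))
    return {d: [bts for _, bts in sorted(events)]
            for d, events in by_device.items()}

def random_walks(edges, walk_length, walks_per_node, seed=0):
    """Generate uniform random walks over a directed graph given as
    (source, target) pairs; each walk is a 'sentence' of node IDs."""
    rng = random.Random(seed)
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            # Stop early when the current node has no outgoing edges.
            while len(walk) < walk_length and graph.get(walk[-1]):
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks

logs = [("dev1", 2, "BTS_B"), ("dev1", 1, "BTS_A"), ("dev2", 1, "BTS_C")]
print(bts_sentences(logs))
print(random_walks([("dev1", "dev2"), ("dev2", "dev3")], walk_length=3, walks_per_node=1))
```

Either set of sentences could then be passed to any Word2Vec implementation to obtain the embeddings.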
 

COFFEE BREAK

Bringing automation and fairness to identity verification on the internet with deep learning

Olivier Koch, Onfido

This talk will cover the technical challenges of leveraging the latest computer vision and machine learning techniques in a context of fraud detection. In particular, we will show how the latest advancements in deep learning allow us to significantly automate industrial processes that heavily rely on manual labeling.

We will start with a presentation of our system leveraging biometrics and identity document data. We will present the key constraints of fraud detection, such as constantly evolving threats, massively unbalanced data, and unreliable labels. We will discuss how we learned to navigate them, while finding the best possible balance between false acceptance and false rejection.

We will then move on to the trade-offs of supervised and unsupervised learning and how deep learning can be used most effectively in the fraud detection setting.

Finally, we will focus on how we address bias in our system. Making identity verification as fair as possible is a core objective for Onfido. We will present concrete steps that practitioners can use in a real-world setting to reduce bias in their systems.

Open Source Explainability - Understanding Model Decisions using Alibi

Alex Athorne, Seldon

Explainable AI, or XAI, is a rapidly expanding field of research that aims to supply methods for understanding model predictions. We will start by providing a general introduction to the field of explainability, introduce the Alibi library and focus on how it helps you to understand trained models. We will then explore the collection of algorithms provided and the types of insight they each provide, looking at a broad range of datasets and models, and discussing the pros and cons of each. In particular, we'll look at methods that apply to any model. The aim is to give the ML practitioner a clear idea of how explainability techniques can be used to justify, explore and enhance their use of ML, especially for models in deployment.

Explainable AI for Computer Vision and NLP models

Uri Rosenberg, Amazon

With organizations leveraging AI/ML solutions to transform their businesses comes the need to ensure that models are trustworthy and understandable. For structured tabular data, known techniques (SHAP, LIME) have proven effective in providing model and inference explainability. However, gaining model interpretability for unstructured computer vision and NLP tasks requires innovative approaches. In this talk, we will take a deep dive into the theory, methods and examples used in AWS Clarify to gain explainability for computer vision and NLP models, and explain how they fit into the MLOps pipeline.

LUNCH

Building a Framework for Easy Model Deployments at Scale

Alexander Hagerf, Emplifi

This talk will go through how we at Emplifi created and use a framework that now enables us to deploy any of our ML models as HTTP endpoints, stream consumers, or batch jobs in Spark. All of this with no (or almost no) custom coding. This in turn has also helped with automatic pipelines for retraining, deployments, and testing for us.

We will show how adopting a standard for all models (MLflow) enabled us to abstract away the models' implementations and write code that works for any and all models, regardless of the underlying technology and/or dependencies. This separation of concerns also makes clear the line between where the data scientist's work ends, and where the data engineer's begins. Having one single code base for all deployments also means that updates or extensions are fast and easy to do.

From Prototype to Production: Best Practices for AI/ML Model Implementation

Petr Šimánek, FIT CTU

As we all know, creating an AI/ML model is just the first step toward developing a successful data product. The real challenge lies in integrating the model into the company's systems and workflows, ensuring sustainability and observability, and simplifying the model production process.

In this presentation, we will focus on practical solutions to these challenges using cutting-edge technologies such as MLflow, Azure ML Studio, SageMaker, Databricks, and Kubeflow. We will explore how to integrate AI/ML models with the rest of the IT ecosystem, how to ensure the quality and functionality of the infrastructure, and how to avoid hidden defects and illegal libraries.

We will also discuss the best practices for creating a data product using feature stores, frameworks, and custom libraries to simplify the model production process.

Furthermore, we will dive into the principles of DevOps and MLOps, discussing how to achieve them in MS Azure environments. We will explore the different environments to use and what to use them for, as well as the tools that can help us achieve these principles.

In short, this presentation will provide practical and professional insights into the world of MLOps and DevOps in AI/ML and show how to create sustainable and observable AI/ML models that integrate seamlessly into the company's systems and workflows.

To be announced

COFFEE BREAK

PANEL DISCUSSION

CLOSING REMARKS

Have a great time

Prague, the city that never sleeps

You can feel centuries of history at every corner in this unique capital. We'll invite you to get a taste of our best pivo (that’s beer in Czech) and then bring you back to the present day to party at one of the local clubs all night long!


Venue

ML Prague 2023 will run hybrid, in person and online!

The main conference as well as the workshops will be held at O2 Universum.

We will also livestream the talks for all those participants who prefer to attend the conference online. Our platform will allow interaction with speakers and other participants too. Workshops require intensive interaction and won't be streamed.

Conference building

O2 Universum
Českomoravská 2345/17a, 190 00, Praha 9

Workshops

O2 Universum
Českomoravská 2345/17a, 190 00, Praha 9

Now or never

Registration

Early Bird

Sold Out

  • Conference days € 240
  • Only workshops € 170
  • Conference + workshops € 390

Standard

Late

Last 100 registrations

  • Conference days € 290
  • Only workshops € 240
  • Conference + workshops € 490

What You Get

  • Practical and advanced level talks led by top experts.
  • Party in the city with people from around the world. Let’s go wild!
  • Delicious food and snacks throughout the conference.

They’re among us

We are in the ML Revolution age

Machines can learn. Incredibly fast. Faster than you. They are getting smarter and smarter every single day, changing the world we’re living in, our business and our life. The artificial intelligence revolution is here. Come, learn and make this threat your biggest advantage.

Our Attendees

What they say about ML Prague

Thank you to Our Partners

Co-organizing Partner

Platinum Partners

Gold Partners

Communities and Further support

Would you like to present your brand to 1000+ Machine Learning enthusiasts? Send us an email at info@mlprague.com to find out how to become an ML Prague 2023 partner.

Become a partner

Happy to help

Contact

If you have any questions about Machine Learning Prague, please e-mail us at
info@mlprague.com

Organizers

Jiří Materna
Scientific program & Co-Founder
jiri@mlprague.com

Teresa Caklova
Event production
teresa@mlprague.com

Gonzalo V. Fernández
Marketing
gonzalo@mlprague.com

Jona Azizaj
Partnerships
jona@mlprague.com