Predicting antibiotic resistance using fusion transformers

dc.contributor.author Olsson, Jesper
dc.contributor.department Chalmers tekniska högskola / Institutionen för matematiska vetenskaper [sv]
dc.contributor.examiner Kristiansson, Erik
dc.contributor.supervisor Johnning, Anna
dc.contributor.supervisor Kristiansson, Erik
dc.date.accessioned 2024-09-05T13:34:29Z
dc.date.available 2024-09-05T13:34:29Z
dc.date.issued 2024
dc.date.submitted
dc.description.abstract Antimicrobial resistance threatens recent gains in global public health by making infections more difficult to treat. Clinicians must administer treatments based on limited diagnostic information, and increasing resistance complicates these decisions. This thesis project explores ways to support this process by developing a framework for training a transformer model that fuses patient and genotype data with phenotype data to make individualized predictions of antibiotic resistance in Escherichia coli. The model was trained in two stages: first, it was pre-trained on large volumes of unimodal data using masked language modeling to learn patterns within the modalities; second, it was fine-tuned on a small multimodal dataset to learn patterns across modalities. To evaluate pre-training strategies, the model was fine-tuned on two clinically relevant tasks and on smaller training sets. To determine the value of introducing multimodality and the effect of genotype data availability on performance, the model was fine-tuned with varying levels of available genotype information. The results show that the model performs well on the fine-tuning tasks, that pre-training on unimodal data improves performance, and that the model can extrapolate well from small training sets and incomplete data. The work has therefore achieved its aim of developing a model that can make accurate predictions based on limited diagnostic information. Importantly, large performance improvements were observed with increasing genotype data availability, especially for difficult antibiotics. Furthermore, the model was better able to utilize available genotype information when pre-trained.
Although no clear conclusion on the best pre-training strategy can be drawn from the results of this work, they indicate that systematic class masking during pre-training yields the highest performance. Future research should further investigate the best strategy for pre-training the model, how the model utilizes genotype data to improve performance, and how genotype data affects performance on limited training data.
dc.identifier.coursecode MVEX03
dc.identifier.uri http://hdl.handle.net/20.500.12380/308525
dc.language.iso eng
dc.setspec.uppsok PhysicsChemistryMaths
dc.subject transformer, multimodal, machine learning, deep learning, data fusion, antibiotic resistance, artificial intelligence, masked language modeling, neural networks
dc.title Predicting antibiotic resistance using fusion transformers
dc.type.degree Examensarbete för masterexamen [sv]
dc.type.degree Master's Thesis [en]
dc.type.uppsok H
local.programme Complex adaptive systems (MPCAS), MSc
Download
Original bundle
Name: Masters_Thesis_Jesper Olsson_2024.pdf
Size: 1.57 MB
Format: Adobe Portable Document Format