Professor Paolo Missier

School of Computer Science
Chair in Computer and Data Science

Contact details

Address
School of Computer Science
University of Birmingham
Edgbaston
Birmingham
B15 2TT
UK

Prof Paolo Missier is Professor of Computer and Data Science in the School of Computer Science, specialising in Data and Knowledge management, Data Science and Engineering, with applications mainly to the Health domain.  His current research covers two main themes (see his publications for a deep dive into these):

  1. "Improving Data Science to improve Science". This includes exploring how to make Data Science responsible and trustworthy by instrumenting complex data engineering and data processing pipelines, for accountability and explanations, using for instance automatically collected Data Provenance.
  2. Addressing the challenges of Health Data Engineering and Data Science at scale for participatory, preventative, personalised care. For example, he is involved in the NIHR-funded AI-MULTIPLY project (https://ai-multiply.co.uk/), focusing on the use of medical records and multi-modal learning to prevent and better manage multiple long-term conditions, and in the NortHFutures Digital Health Hub, creating a health innovation ecosystem that will "facilitate the research, development, and acceleration of responsibly designed, human-centred, and data-rich health-tech" (https://northfutures.org/).

Much of his past research focused on the design of Workflows for Science (aka "scientific workflows"), on optimising the repeated execution of Life Sciences data analytics pipelines (ReComp), on the investigation of digital phenotyping of metabolic diseases (Type 2 Diabetes) from wearable accelerometers, and more.

Between 2011 and 2013, he was one of the editors and authors of the W3C PROV standard data model for Provenance interoperability (https://www.w3.org/TR/2013/REC-prov-dm-20130430/).

Since 2016, he has been Sr. Associate Editor for the ACM Journal on Data and Information Quality (http://jdiq.acm.org/).

He has been leading post-graduate teaching on Data Engineering for AI, and various UG modules including Database Technology and Introduction to Predictive Analytics.

Prior to joining Birmingham in 2024, he was a Lecturer, Reader, and Professor in the School of Computing at Newcastle University from 2011 to 2023, and a Fellow (2018-2023) of the Alan Turing Institute, the UK's National Institute for Data Science and Artificial Intelligence.

Qualifications

  • PhD in Computer Science, 2008
  • MSc in Computer Science, 1993
  • BSc MSc in Computer Science, 1990

Teaching

Throughout his academic career, Paolo has led undergraduate teaching on Databases and Data Engineering, and post-graduate teaching on Data Engineering for AI and distributed architectures for Big Data.

Postgraduate supervision

Paolo supervised 8 PhD students while at Newcastle, all of whom graduated successfully. He is still actively supervising 3 students from Newcastle and is open to new PhD supervision in his new role at Birmingham.

Research

Paolo's current research follows two main strands:

Improving Data Science to improve Science

Exploring how to make Data Science responsible and trustworthy by instrumenting complex data engineering and data processing pipelines, for accountability and explanations. This includes exploring new techniques for collecting, enhancing, and making sense of data provenance and audit traces, specifically within the emerging context of Data-Centric AI. Recently (2021-23) we have been developing DPDS (“Data Provenance for Data Science”), a tool for collecting the provenance of dataframes that are manipulated using Python pandas as part of Data Science pipelines.

Health Data Science at scale for participatory, preventative, personalised care

    1. The future of healthcare is not only data-driven, it is also participatory: we explore health technology and models aimed at motivating and empowering individuals to engage with their own health, enabling disease prevention and early onset detection. We envision "AI-enabled", trusted personal advisors who have current knowledge of our health trajectory and are able to track risk factors without the need for periodic clinical visits.
    2. Exploring the potential of integrated care systems, including Health Records but also data from self-monitoring devices, to prevent and better manage multiple long-term conditions. What models and tools can we provide to best support the patient-clinician-carer ecosystem through the life course, and how can we measure and then improve the Quality of Life of multi-morbid chronic patients?

Past research

His past research focused on (a) Workflows for Science, (b) optimising the repeated execution of data analytics pipelines specifically for Life Sciences / Genomics, (c) porting genomics pipelines to the cloud, (d) digital phenotyping of metabolic diseases (Type 2 Diabetes) from wearable accelerometers.

His activity in the area of data provenance culminated in the publication of the W3C PROV standard data model for Provenance interoperability. See also DPDS, above.

Other activities

Editorial activities

Since 2016 he has been Sr. Associate Editor for the ACM Journal on Data and Information Quality (JDIQ), in addition to serving over many years on multiple programme committees for data management conferences, including CIKM, EDBT, and VLDB. Too much journal reviewing to mention here!

External examiner for the Data Science MSc at Heriot-Watt University (until 2022)

Distinguished Appointments

  • Fellow of the Alan Turing Institute, 2019-2023
  • Visiting Professor, Dept. of Mathematics and Informatics, Dept. of Computer Engineering, Università di Modena e Reggio Emilia, Italy, 2019-present

Publications

Recent publications

Article

González-Zelaya, V, Salas, J, Megías, D & Missier, P 2024, 'Fair and Private Data Preprocessing through Microaggregation', ACM Transactions on Knowledge Discovery from Data, vol. 18, no. 3, 49. https://doi.org/10.1145/3617377

McTeer, M, Henderson, R, Anstee, QM & Missier, P 2024, 'Handling Overlapping Asymmetric Data Sets—A Twice Penalized P-Spline Approach', Mathematics, vol. 12, no. 5, 777. https://doi.org/10.3390/math12050777

The ADMISSION Research Collaborative, Lewis, J, Evison, F, Doal, R, Field, J, Gallier, S, Harris, S, le Roux, P, Osman, M, Plummer, C, Sapey, E, Singer, M, Sayer, AA & Witham, MD 2024, 'How far back do we need to look to capture diagnoses in electronic health records? A retrospective observational study of hospital electronic health record data', BMJ open, vol. 14, no. 2, e080678. https://doi.org/10.1136/bmjopen-2023-080678

LITMUS Consortium Investigators, McTeer, M, Applegate, D, Mesenbrink, P, Ratziu, V, Schattenberg, JM, Bugianesi, E, Geier, A, Romero-Gomez, M, Dufour, J-F, Ekstedt, M, Francque, S, Yki-järvinen, H, Allison, M, Valenti, L, Miele, L, Pavlides, M, Cobbold, J, Papatheodoridis, G, Holleboom, AG, Tiniakos, D, Brass, C, Anstee, QM & Missier, P 2024, 'Machine learning approaches to enhance diagnosis and staging of patients with MASLD using routinely available clinical information', PLOS One, vol. 19, no. 2, e0299487. https://doi.org/10.1371/journal.pone.0299487

Chapman, A, Lauro, L, Missier, P & Torlone, R 2024, 'Supporting Better Insights of Data Science Pipelines with Fine-grained Provenance', ACM Trans. Database Syst. https://doi.org/10.1145/3644385

Motta, F, Milic, J, Gozzi, L, Belli, M, Sighinolfi, L, Cuomo, G, Carli, F, Dolci, G, Iadisernia, V, Burastero, G, Mussini, C, Missier, P, Mandreoli, F & Guaraldi, G 2023, 'A Machine Learning Approach to Predict Weight Change in ART-Experienced People Living with HIV', Journal of Acquired Immune Deficiency Syndromes, vol. 94, no. 5, pp. 474-481. https://doi.org/10.1097/QAI.0000000000003302

Shao, S, Guan, Y, Zhai, B, Missier, P & Plötz, T 2023, 'ConvBoost: Boosting ConvNets for Sensor-based Activity Recognition', Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 7, no. 2, 75. https://doi.org/10.1145/3596234

Eto, F, Samuel, M, Henkin, R, Mahesh, M, Ahmad, T, Angdembe, A, Hamish McAllister-Williams, R, Missier, P, Reynolds, NJ, Barnes, MR, Hull, S, Finer, S & Mathur, R 2023, 'Ethnic differences in early onset multimorbidity and associations with health service use, long-term prescribing, years of life lost, and mortality: A cross-sectional study using clustering in the UK Clinical Practice Research Datalink', PLoS Medicine, vol. 20, no. 10, e1004300. https://doi.org/10.1371/journal.pmed.1004300

Evison, F, Cooper, R, Gallier, S, Missier, P, Sayer, AA, Sapey, E & Witham, MD 2023, 'Mapping inpatient care pathways for patients with COPD: an observational study using routinely collected electronic hospital record data', ERJ Open Research, vol. 9, no. 5, 00110-2023. https://doi.org/10.1183/23120541.00110-2023

Conference article

Calvanese, D, Ferro, N, Diamantini, C, Silvello, G, Marchesin, S, Tanca, L, Atzori, M, Bartolini, I, Bellomarini, L, Buccafurri, F, Cabibbo, L, Calí, A, Caruccio, L, Ceci, M, Chiusano, S, Ciaccia, P, Corradini, E, Crescenzi, V, Di Noia, T, Faggioli, G, Fazzinga, B, Ferrara, A, Ferrari, E, Firmani, D, Garza, P, Golfarelli, M, Guerrini, G, Gullo, F, Guzzi, PH, Lanti, D, Leotta, F, Manco, G, Mandreoli, F, Masciari, E, Maurino, A, Melchiori, M, Mircoli, A, Missier, P, Molinaro, C, Montanelli, S, Moscato, V, Papotti, P, Pensa, RG, Piantella, D, Pugliese, A, Quintarelli, E, Renso, C, Rinzivillo, S, Sartiani, C, Savo, DF, Simonini, G, Storti, E, Tagarelli, A, Amato, F, Baralis, E, Castano, S, Catania, B, De Antonellis, V, Greco, S, Lembo, D, Camporese, A, Giachelle, F, Irrera, O, Leoncini, D, Menotti, L, Pasin, A & Quaggio, E 2023, 'Preface of the 31st Italian Symposium on Advanced Database Systems', CEUR Workshop Proceedings, vol. 3478, pp. i-iv. <https://ceur-ws.org/Vol-3478/preface.pdf>

Conference contribution

Calero-Diaz, H, Hamad, RA, Atallah, C, Casement, J, Canoy, D, Reynolds, NJ, Barnes, MR & Missier, P 2024, Interpretable and robust hospital readmission predictions from Electronic Health Records. in J He, T Palpanas, X Hu, A Cuzzocrea, D Dou, D Slezak, W Wang, A Gruca, JC-W Lin & R Agrawal (eds), 2023 IEEE International Conference on Big Data (BigData). IEEE International Conference on Big Data, Institute of Electrical and Electronics Engineers (IEEE), pp. 3679-3687, 2023 IEEE International Conference on Big Data, BigData 2023, Sorrento, Italy, 15/12/23. https://doi.org/10.1109/BigData59044.2023.10386820

Shan, S, Guan, Y, Guan, X, Missier, P & Plotz, T 2023, On Training Strategies for LSTMs in Sensor-Based Human Activity Recognition. in 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events, PerCom Workshops 2023. IEEE Annual Conference on Pervasive Computing and Communications Workshops (PerCom), Institute of Electrical and Electronics Engineers (IEEE), pp. 653-658, 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events, PerCom Workshops 2023, Atlanta, United States, 13/03/23. https://doi.org/10.1109/PerComWorkshops56833.2023.10150305

González-Zelaya, V, Salas, J, Prangle, D & Missier, P 2023, Preprocessing Matters: Automated Pipeline Selection for Fair Classification. in V Torra & Y Narukawa (eds), Modeling Decisions for Artificial Intelligence - 20th International Conference, MDAI 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13890 LNCS, Springer, pp. 202-213, 20th International Conference on Modeling Decisions for Artificial Intelligence, MDAI 2023, Umeå, Sweden, 19/06/23. https://doi.org/10.1007/978-3-031-33498-6_14

Kremer, R, Raza, SM, Eto, F, Casement, J, Atallah, C, Finer, S, Lendrem, D, Barnes, M, Reynolds, NJ & Missier, P 2023, Tracking trajectories of multiple long-term conditions using dynamic patient-cluster associations. in S Tsumoto, Y Ohsawa, L Chen, D Van den Poel, X Hu, Y Motomura, T Takagi, L Wu, Y Xie, A Abe & V Raghavan (eds), 2022 IEEE International Conference on Big Data. Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022, Institute of Electrical and Electronics Engineers (IEEE), pp. 4390-4399, 2022 IEEE International Conference on Big Data, Big Data 2022, Osaka, Japan, 17/12/22. https://doi.org/10.1109/BigData55660.2022.10021034

Editorial

Witham, MD, Cooper, R, Missier, P, Robinson, SM, Sapey, E & Sayer, AA 2023, 'Researching multimorbidity in hospital: can we deliver on the promise of health informatics?', European Geriatric Medicine, vol. 14, no. 4, pp. 765-768. https://doi.org/10.1007/s41999-023-00753-6
