dc.contributor.author: Nelen, Jochem
dc.contributor.author: Pérez Sánchez, Horacio
dc.contributor.author: De Winter, Hans
dc.contributor.author: Van Rompaey, Dries
dc.date.accessioned: 2025-04-02T10:38:21Z
dc.date.available: 2025-04-02T10:38:21Z
dc.date.issued: 2025-01-20
dc.identifier.citation: Nelen, J., Pérez-Sánchez, H., De Winter, H. et al. Matched pairs demonstrate robustness against inter-assay variability. J Cheminform 17, 8 (2025). https://doi.org/10.1186/s13321-025-00956-y
dc.identifier.uri: http://hdl.handle.net/10952/9482
dc.description: Machine learning models for chemistry require large datasets, often compiled by combining data from multiple assays. However, combining data without careful curation can introduce significant noise. While absolute values from different assays are rarely comparable, trends or differences between compounds are often assumed to be consistent. This study evaluates that assumption by analyzing potency differences between matched compound pairs across assays and by assessing how much assay metadata curation reduces error. We find that potency differences between matched pairs exhibit less variability than individual compound measurements, suggesting that systematic assay differences may partially cancel out in paired data. Metadata curation further improves inter-assay agreement, albeit at the cost of dataset size. For minimally curated compound pairs, agreement within 0.3 pChEMBL units was found to be 44–46% for Ki and IC50 values respectively, which improved to 66–79% after curation. Similarly, the percentage of pairs with differences exceeding 1 pChEMBL unit dropped from 12–15% to 6–8% with extensive curation. These results establish a benchmark for expected noise in matched molecular pair data from the ChEMBL database, offering practical metrics for data quality assessment.
dc.language.iso: en
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Matched structural pairs
dc.subject: Assay noise
dc.subject: Data curation
dc.subject: ChEMBL
dc.subject: Machine learning
dc.title: Matched pairs demonstrate robustness against inter-assay variability
dc.type: journal article
dc.rights.accessRights: open access
dc.journal.title: Journal of Cheminformatics
dc.volume.number: 17
dc.issue.number: 8
dc.description.discipline: Pharmacy
dc.identifier.doi: 10.1186/s13321-025-00956-y
dc.description.faculty: Health Sciences
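
The agreement metrics quoted in the description (share of matched pairs agreeing within 0.3 pChEMBL units, share differing by more than 1 unit) can be illustrated with a minimal sketch. This is not the authors' pipeline: the compound names, pChEMBL values, pair list, and function names below are all hypothetical and chosen only to show how a roughly constant assay offset cancels in pair differences.

```python
# Hypothetical pChEMBL values for the same four compounds in two assays.
# All numbers are invented for illustration; none come from the study.
assay_a = {"cpd1": 7.2, "cpd2": 6.1, "cpd3": 8.0, "cpd4": 5.5}
assay_b = {"cpd1": 7.9, "cpd2": 6.9, "cpd3": 8.5, "cpd4": 7.3}

# Hypothetical matched pairs (structurally related compounds).
pairs = [("cpd1", "cpd2"), ("cpd1", "cpd3"), ("cpd3", "cpd4")]


def single_errors(a, b):
    """Absolute inter-assay difference for each individual compound."""
    return [abs(a[c] - b[c]) for c in a]


def pair_errors(a, b, pair_list):
    """Absolute inter-assay difference of the potency delta of each pair."""
    return [abs((a[x] - a[y]) - (b[x] - b[y])) for x, y in pair_list]


def fraction_within(errors, tol):
    """Fraction of errors that are at most `tol` pChEMBL units."""
    return sum(e <= tol for e in errors) / len(errors)


def fraction_exceeding(errors, limit):
    """Fraction of errors strictly greater than `limit` pChEMBL units."""
    return sum(e > limit for e in errors) / len(errors)


single_err = single_errors(assay_a, assay_b)
pair_err = pair_errors(assay_a, assay_b, pairs)

print("individual compounds within 0.3:", fraction_within(single_err, 0.3))
print("matched pairs within 0.3:", fraction_within(pair_err, 0.3))
print("matched pairs exceeding 1.0:", fraction_exceeding(pair_err, 1.0))
```

In this toy data, assay B reads higher than assay A for every compound, so no individual measurement agrees within 0.3 units, while two of the three pair differences do. The third pair disagrees because cpd4 carries a much larger offset than the others, which is the kind of case that an offset shared across compounds cannot absorb.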


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International