Auditing NLP systems for computational harms like surfacing stereotypes is an elusive goal. Several recent efforts have focused on benchmark datasets consisting of pairs of contrastive sentences, which are often accompanied by metrics that aggregate an NLP system's behavior on these pairs into measurements of harms. We examine four such benchmarks constructed for two NLP tasks: language modeling and coreference resolution. We apply a measurement modeling lens, originating from the social sciences, to inventory a range of pitfalls that threaten these benchmarks' validity as measurement models for stereotyping. We find that these benchmarks frequently lack clear articulations of what is being measured, and we highlight a range of ambiguities and unstated assumptions that affect how these benchmarks conceptualize and operationalize stereotyping.
Hevea brasiliensis hydroxynitrile lyase (HbHNL) and salicylic acid binding protein 2 (SABP2, an esterase) share 45% amino acid sequence identity, the same protein fold, and even the same catalytic triad of Ser-His-Asp. However, they catalyze different reactions: cleavage of hydroxynitriles and hydrolysis of esters, respectively. To understand how other active site differences in the two enzymes enable the same catalytic triad to catalyze different reactions, we substituted amino acid residues in HbHNL with the corresponding residues from SABP2, expecting hydroxynitrile lyase activity to decrease and esterase activity to increase. Previous mechanistic studies and X-ray crystallography suggested that esterase activity requires removal of an active site lysine and threonine from the hydroxynitrile lyase. The Thr11Gly Lys236Gly substitutions in HbHNL reduced hydroxynitrile lyase activity for cleavage of mandelonitrile 100-fold, but increased esterase activity only threefold to kcat ~ 0.1 min−1 for hydrolysis of p-nitrophenyl acetate. Adding a third substitution – Glu79His – increased esterase activity more than tenfold to kcat ~ 1.6 min−1. The specificity constant (kcat/KM) for this triple substitution variant versus wild-type HbHNL shifted more than one million-fold from hydroxynitrile lyase activity (acetone cyanohydrin substrate) to esterase activity (p-nitrophenyl acetate substrate). The contribution of Glu79His to esterase activity was surprising since esterases and lipases contain many different amino acids at this position, including glutamate. Saturation mutagenesis at position 79 showed that 13 of 19 possible amino acid substitutions increased esterase activity, suggesting that removal of glutamate, not addition of histidine, increased esterase activity. Molecular modeling indicates that Glu79 disrupts esterase activity in HbHNL when its negatively charged side chain distorts the orientation of the catalytic histidine.
Naturally occurring glutamate at the corresponding location of Candida lipases is uncharged due to other active site differences and does not cause the same distortion. This example of the fine tuning of the same catalytic triad for different types of catalysis by subtle interactions with other active site residues shows how difficult it is to design new catalytic reactions of enzymes.
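The fold changes quoted for esterase activity can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that uses only the approximate kcat values stated above; the helper name `fold_change` and the back-calculated wild-type value are illustrative assumptions, not figures reported by the authors.

```python
def fold_change(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    if old <= 0:
        raise ValueError("reference value must be positive")
    return new / old


# Approximate kcat values (min^-1) for p-nitrophenyl acetate hydrolysis,
# as quoted in the abstract.
KCAT_DOUBLE = 0.1  # Thr11Gly Lys236Gly variant
KCAT_TRIPLE = 1.6  # Thr11Gly Lys236Gly Glu79His variant

# Assumption: the wild-type kcat is not quoted, so it is back-calculated
# from the stated "threefold" increase for the double variant.
kcat_wild_type_estimate = KCAT_DOUBLE / 3

triple_vs_double = fold_change(KCAT_TRIPLE, KCAT_DOUBLE)
triple_vs_wild_type = fold_change(KCAT_TRIPLE, kcat_wild_type_estimate)

print(f"triple vs double variant: {triple_vs_double:.0f}-fold")
print(f"triple vs wild type (estimated): {triple_vs_wild_type:.0f}-fold")
```

Under these rounded inputs, the third substitution corresponds to roughly a 16-fold gain over the double variant, consistent with "more than tenfold". The million-fold shift in specificity cannot be reproduced this way, because it depends on kcat/KM values for both substrates that the abstract does not list.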