Curé, O. C., Maurer, H., Shah, N. H., Le Pendu, P. Detecting unplanned care from clinician notes in electronic health records. DOI 10.1371/journal.pone.0063499, PubMedCentralID PMC3662653, Web of Science ID 000353637800023. Research indicates that 21% of prescriptions filled are for off-label indications. BibTeX @INPROCEEDINGS{Lee07usingannotations, author = {Woei-jyh Lee and Louiqa Raschid and Padmini Srinivasan and Nigam Shah and Daniel Rubin and Natasha Noy}, title = {Using annotations from controlled vocabularies to find meaningful associations}, booktitle = {In Fourth International Workshop on Data Integration in the Life Sciences (DILS 2007)}, year = {2007}, pages = {27--29}} Our results suggest that metrics based on the ontology only may be preferable to information-content-based metrics, and point to the need for more research on validating the different approaches. recently identified a class of diseases--blood coagulation disorders--that were associated with a 14-fold depletion in substitutions at O-linked glycosylation sites. Curé, O. C., Maurer, H., Shah, N. H., Le Pendu, P. Analyzing search behavior of healthcare professionals for drug safety surveillance. DOI 10.1186/1471-2105-8-296, Web of Science ID 000249734300001, PubMedCentralID PMC1988837. In addition, evidence from the trials frequently rests on narrow patient-inclusion criteria and thus may not generalize well to real clinical situations. 
This study compares the performance of 18 automatic confounder control methods. Methods include propensity scores, direct adjustment by machine learning, similarity matching and resampling in two simulated and one real-world EHR datasets. Direct adjustment by lasso regression and ensemble models involving multiple resamples have performance comparable to expert-based propensity scores and thus may help provide real-time EHR-based evidence for timely clinical decisions. Promoting the use of simple, dictionary-based methods for population level analyses can advance adoption of NLP in practice. We used the syntactic structure of PGx statements to systematically extract commonly occurring relationships and to map them to a common schema. Term occurrences in 2010 i2b2/VA text were also mapped; eight example filters were designed from the Mayo-based statistics and applied to i2b2/VA data. For the corpus analysis, negligible numbers of mapped terms in the Mayo corpus had over six words or 55 characters. PDE3, 4 and 10 inhibitors, including dipyridamole, were found to promote β-cell replication in an adenosine receptor-dependent manner. Chen, R., Ryan, P., Natarajan, K., Falconer, T., Crew, K. D., Reich, C. G., Vashisht, R., Randhawa, G., Shah, N. H., Hripcsak, G. A predictive tool for identification of SARS-CoV-2 PCR-negative emergency department patients using routine test results. Based on our analysis we also suggest areas of potential improvements for Mgrep. Persistent detection of SARS-CoV-2 RNA in patients and healthcare workers with COVID-19. The service makes a decision based on three criteria. 
We have developed analytical and visualization tools to automate the identification of expression profile groups with common gene ontology (GO) annotations based on the sub-cellular localization and function of the proteins encoded by the genes, as well as to automate promoter analysis for such gene groups. The steady rise in healthcare costs has deprived over 45 million Americans of healthcare services (1, 2) and has encouraged healthcare providers to look for opportunities to improve their operational efficiency. Other thought leaders in clinical research suggest that EHRs should be used to lower the cost of trials by integrating point-of-care randomization and data capture into clinical processes. Our approach provides the basis for a data-driven ontology alignment by mapping annotations of experimental data. This survey covers efforts dealing with the automatic recognition of relevant named entities (e.g. To test the association of androgen deprivation therapy (ADT) in the treatment of prostate cancer with subsequent Alzheimer's disease risk. We used a previously validated and implemented text-processing pipeline to analyze electronic medical record data in a retrospective cohort of patients at Stanford University and Mt. To understand the gene networks that underlie plant stress and defense responses, it is necessary to identify and characterize the genes that respond both initially and as the physiological response to the stress or pathogen develops. Verghese, A., Shah, N. H., Harrington, R. Accurate and interpretable intensive care risk adjustment for fused clinical data with generalized additive models. Our goals were: 1. analyze the frequency and syntactic distribution of Metathesaurus terms in MEDLINE; 2. create a filtered UMLS Metathesaurus based on the MEDLINE analysis; 3. augment the UMLS Metathesaurus where each term is associated with metadata on its MEDLINE frequency and syntactic distribution statistics. 
We have used data from the Stanford Hospital to attempt to address these issues. This expansion means that researchers now face a hurdle to extracting the data they need from the large volume of data that is available. Our mapping produces OWL-DL, a Description Logics based subset of OWL with desirable computational properties for efficiency and correctness. Using a clinical text-mining tool, we detected unplanned episodes documented in clinician notes (for non-SHC visits) or in coded encounter data for SHC-delivered care and the most frequent symptoms documented in emergency department (ED) notes. Combined reporting increased the identification of patients with one or more unplanned care visits by 32% (15% using coded data; 20% using all the data) among patients with 3 months of follow-up and by 21% (23% using coded data; 28% using all the data) among those with 1 year of follow-up. Mahalingam, R., Shah, N., Scrymgeour, A., Fedoroff, N. HyBrow: a prototype system for computer-aided hypothesis evaluation. Just as scientists can ask "Which biological process is over-represented in my set of interesting genes or proteins?" Wang, J. K., Schuler, A., Shah, N. H., Baiocchi, M. T., Chen, J. H. Toward multimodal signal detection of adverse drug reactions. Nigam Shah Learning from past patient data to provide better care. After a 2-week washout period, participants were crossed over to receive the alternate treatment for the ensuing 4 weeks. So, let's talk about hospitals. We also propose an approach and demonstrate its application in identifying optimal signaling thresholds, given specific misclassification tolerances. 
Feasibility of Prioritizing Drug-Drug-Event Associations Found in Electronic Health Records. While GO has been the principal target for enrichment analysis, the methods of enrichment analysis are generalizable. We explore this potential risk in the general population via data-mining approaches. Using a novel approach for mining clinical data for pharmacovigilance, we queried over 16 million clinical documents on 2.9 million individuals to examine whether PPI usage was associated with cardiovascular risk in the general population. In multiple data sources, we found gastroesophageal reflux disease (GERD) patients exposed to PPIs to have a 1.16 fold increased association (95% CI 1.09-1.24) with myocardial infarction (MI). Kim, D., Quinn, J., Pinsky, B., Shah, N. H., Brown, I. Lee, W., Shah, N., Sundlass, K., Musen, M. UMLS-Query: a perl module for querying the UMLS. The information explosion in biology makes it difficult for researchers to stay abreast of current biomedical knowledge and to make sense of the massive amounts of online information. Wounds were considered delayed with respect to healing time if they took more than 15 weeks to heal after presentation at a wound care center. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index, a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. A small subset of OBO constructs requires deeper consideration. Liu, V. X., Bates, D. W., Wiens, J., Shah, N. H. Early Detection of Adverse Drug Reactions in Social Health Networks: A Natural Language Processing Pipeline for Signal Detection. Ghebremariam, Y. T., Lee, J. C., LePendu, P., Erlanson, D. A., Slaviero, A., Shah, N. H., Leiper, J. M., Cooke, J. P. Mining clinical text for signals of adverse drug-drug interactions. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service oriented applications. 
Web of Science ID 000281719100019, PubMedCentralID PMC3001121. A., Fleming, S. L., Wilfley, D. E., Terence Wilson, G., Milstein, A., Jurafsky, D., Arnow, B. Wang, Q., Reps, J. M., Kostka, K. F., Ryan, P. B., Zou, Y., Voss, E. A., Rijnbeek, P. R., Chen, R., Rao, G. A., Morgan Stewart, H., Williams, A. E., Williams, R. D., Van Zandt, M., Falconer, T., Fernandez-Chas, M., Vashisht, R., Pfohl, S. R., Shah, N. H., Kasthurirathne, S. N., You, S. C., Jiang, Q., Reich, C., Zhou, Y. To assess the validity of this black box warning, we employed a novel text-analytics pipeline to quantify the adverse events associated with Cilostazol use in a clinical setting, including patients with congestive heart failure (CHF). We analyzed the electronic medical records of 1.8 million subjects from the Stanford clinical data warehouse spanning 18 years using a novel text-mining/statistical analytics pipeline. However, many events are only documented in free-text clinician notes and are labor intensive to detect by manual medical record review. We studied 308,096 free-text machine-readable documents linked to individual entries in our electronic health records, representing care for patients with breast, GI, or thoracic cancer, whose treatment was initiated at one academic medical center, Stanford Health Care (SHC). Schuler, A., Liu, V., Wan, J., Callahan, A., Udell, M., Stark, D. E., Shah, N. H. Rapid identification of slow healing wounds. Of these, 31 306 images for 68 probes on 125 slides have been released to the public. 
Research on Gun Violence vs Other Causes of Death. Because of the intrinsic noisiness of high-throughput measurements, statistical methods have been central to this effort. DOI 10.1038/clpt.2012.50, Web of Science ID 000304245800019, DOI 10.1136/amiajnl-2012-000969, Web of Science ID 000314151400002. –Nigam Shah, Associate Professor of Biomedical Informatics and of Biomedical Data Science. Caswell-Jin, J., Callahan, A., Purington, N., Han, S. S., Itakura, H., Sledge, G. W., Shah, N., Kurian, A. W. Development and validation of phenotype classifiers across multiple sites in the observational health data sciences and informatics network. Bauer-Mehren, A., LePendu, P., Iyer, S. V., Harpaz, R., Leeper, N. J., Shah, N. H. Chapter 9: Analyses Using Disease Ontologies, Mining the pharmacogenomics literature: a survey of the state of the art. Patient-level analyses suggest a clear separation between autism and the other disorders, while revealing significant overlap between schizophrenia and bipolar disorder. This paper demonstrates that this KR shortcoming leads users to interpret the files in ways that can be erroneous. Our goal is to develop and apply general enrichment analysis methods to profile other sets of interest, such as patient cohorts from the electronic medical record, using a variety of ontologies including SNOMED CT, MedDRA, RxNorm, and others. 
This work provides new mechanistic insights into cAMP-dependent growth regulation of β-cells and highlights the potential of commonly prescribed medications to influence β-cell growth. PubMedCentralID PMC6457095. A., Bailenson, J., Hancock, J. We consider that this approach can be extended to support other domains such as cohort building tools. This process, referred to as enrichment analysis, profiles a gene set, and is widely used to make sense of the results of high-throughput experiments. With the availability of tools for automatic annotation of datasets with terms from disease ontologies, there is no reason to restrict enrichment analyses to the GO. A method for systematic discovery of adverse drug events from clinical notes. Shah, N. H., LePendu, P., Bauer-Mehren, A., Ghebremariam, Y. T., Iyer, S. V., Marcus, J., Nead, K. T., Cooke, J. P., Leeper, N. J. Of those, more than 73% lack supporting scientific evidence. DOI 10.1038/sdata.2016.26, PubMedCentralID PMC4872271. Risk of angioedema associated with levetiracetam compared with phenytoin: Findings of the observational health data sciences and informatics research network. These ontologies have been mainly expressed in either the Open Biomedical Ontology (OBO) format or the Web Ontology Language (OWL). The authors have developed a data-mining method for systematic, automated detection of ADEs from electronic medical records. This method uses the text from 9.5 million clinical notes, along with prior knowledge of drug usages and known ADEs, as inputs. Hypotheses as well as the supporting or refuting data are represented in RDF and directly linked to one another, allowing scientists to browse from data to hypothesis and vice versa. 
Workshop on Data Integration in the Life Sciences}, year = {2007}, pages = {247--263}} Specifically, we extracted International Classification of Diseases-9th revision diagnosis and Current Procedural Terminology codes, medication lists, and positive-present mentions of drug and disease concepts from all clinical notes. Vashisht, R., Jung, ., Schuler, A., Banda, . Impact of Predicting Health Care Utilization Via Web Search Behavior: A Data-Driven Analysis. We hypothesize that drug-disease co-occurrences, extracted from ontology-based annotations of the clinical notes, can be examined for statistical enrichment and used for drug safety surveillance. We believe that a successful learning health care system will require both approaches, and we suggest a model that resolves this escalating tension: a "green button" function within EHRs to help clinicians leverage aggregate patient data for decision making at the point of care. Blair, D. R., Lyttle, C. S., Mortensen, J. M., Bearden, C. F., Jensen, A. Bringing cohort studies to the bedside: framework for a "green button" to support clinical decision-making. Parai, G. K., Jonquet, C., Xu, R., Musen, M. A., Shah, N. H. Building a biomedical ontology recommender web service. 4th Int. We have shown that the syntactic and frequency information is useful to identify errors in the Metathesaurus. Toward Automated Detection of Peripheral Artery Disease Using Electronic Health Records. Given the increasing availability of comprehensive clinical data in electronic health records (EHRs), some health system leaders are now advocating for a shift away from traditional trials and toward large-scale retrospective studies, which can use practice-based evidence that is generated as a by-product of clinical processes. 
Web of Science ID 000275419900014, DOI 10.1186/2041-1480-1-S1-I1, Web of Science ID 000297613200031. The system uses textual metadata or a set of keywords describing a domain of interest and suggests appropriate ontologies for annotating or representing the data. Propensity score-matched analysis (hazard ratio, 1.88; 95% CI, 1.10 to 3.20; P = .021) and traditional multivariable-adjusted Cox regression analysis (hazard ratio, 1.66; 95% CI, 1.05 to 2.64; P = .031) both supported a statistically significant association between ADT use and Alzheimer's disease risk. Tamang, S. R., Hernandez-Boussard, T., Ross, E. G., Gaskin, G., Patel, M. I., Shah, N. H. Funding and Publication of Research on Gun Violence and Other Leading Causes of Death. But before we can reliably use a pathway knowledge-base as a data source, we need to proofread it to ensure that it can fully support computer-aided information integration and inference. We design a series of logical tests to detect potential problems we might encounter using a particular knowledge-base, the Reactome database, with a particular computer-aided hypothesis evaluation tool, HyBrow. This filtered and augmented UMLS Metathesaurus can potentially be used to improve efficiency and precision of UMLS-based information retrieval and NLP tasks. In this paper, we quantify this trade-off among text processing systems that make different trade-offs between speed and linguistic understanding. Dr. Nigam Shah is associate professor of Medicine (Biomedical Informatics) at Stanford University, Assistant Director of the Center for Biomedical Informatics Research, and a core member of the Biomedical Informatics Graduate Program. hal-01901566 Nigam H. Shah is a research scientist at the Stanford Center for Biomedical Informatics group and member of the National Center for Biomedical Ontology. 
This paper presents a novel method based on Formal Concept Analysis (FCA) and Semantic Query Expansion (SQE) to assist the end-user in defining their seed queries and in refining the expanded search space that it encompasses. We evaluate our method over a gold-standard corpus from the 2008 i2b2 Obesity Challenge. We can aggregate the annotating GO concepts for each gene in this list, and arrive at a profile of the biological processes or mechanisms affected by the condition under study. Our Web log mining approach has the potential to monitor responses to FDA alerts at a national level. To date, there have not been comparisons of the different semantic-similarity approaches on a single ontology. Giving clinicians such a tool would support patient care decisions in the absence of gold-standard evidence and would help prioritize clinical questions for which EHR-enabled randomization should be carried out. an OBO ontology may be translated to OWL and back without loss of knowledge. Confronted with rapidly accumulating data, researchers currently do not have the software tools to undertake the required information integration tasks. We present HyQue, a Semantic Web tool for querying scientific knowledge bases with the purpose of evaluating user-submitted hypotheses. We have developed methods to detect population level off-label usage using computationally efficient annotation of free text from clinical notes to generate features encoding empirical information about drug-disease mentions. Soldatova, L. N., Sansone, S., Dumontier, M., Shah, N. H. STOP using just GO: a multi-ontology hypothesis generation tool for high throughput experimentation. A reference standard of over 600 established and plausible ADRs was created and used to evaluate the proposed approach against a comparator. The combined signaling system achieved a statistically significant large improvement over AERS (baseline) in the precision of top ranked signals. 
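The aggregation of annotating GO concepts over a gene list, described above, can be illustrated with a minimal sketch. It assumes a toy in-memory mapping from gene to ontology terms and a one-sided hypergeometric test for over-representation; the function names, identifiers, and the choice of test are illustrative assumptions and are not taken from any of the cited tools.

```python
from collections import Counter
from math import comb


def term_profile(genes, annotations):
    """Count how many genes in `genes` carry each ontology term.

    `annotations` is a hypothetical dict: gene id -> iterable of term ids.
    """
    counts = Counter()
    for gene in genes:
        # set() so a term listed twice for one gene is counted once
        counts.update(set(annotations.get(gene, ())))
    return counts


def enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes in the background, K: background genes with the term,
    n: genes in the study list, k: study-list genes with the term.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy data, invented for illustration only.
annotations = {
    "g1": ["GO:A", "GO:B"],
    "g2": ["GO:A"],
    "g3": ["GO:B"],
    "g4": [],
}
study = ["g1", "g2"]
background = term_profile(annotations.keys(), annotations)
profile = term_profile(study, annotations)
p_values = {
    term: enrichment_p(k, len(study), background[term], len(annotations))
    for term, k in profile.items()
}
```

The same counting step would apply to other annotated sets, such as patient cohorts annotated with disease ontology terms; in practice the resulting p-values would also need multiple-testing correction, which this sketch omits.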
In particular, the meeting focused on discussing informatics challenges related to personalizing care through the integration of genomic or other high-volume biomolecular data with data from clinical systems to make health care more efficient and effective.