The Role of Academic Affiliation in the Treatment of Metastatic Castrate-Resistant Prostate Cancer in the Veterans Health Administration

Changed
Fri, 09/08/2017 - 11:54
Abstract 4: 2017 AVAHO Meeting

Background: Cancer care in academically affiliated settings such as teaching hospitals has been associated with improved clinical outcomes. Historically, Veterans Affairs (VA) medical centers have been partnered with academic affiliates; however, few studies have examined how this partnership affects clinical care in the Veterans Health Administration (VHA). We therefore examined the variation in first-line therapy (1L) among patients with metastatic castrate-resistant prostate cancer (mCRPC) in the VHA by degree of academic affiliation.

Methods: Information from the VA Central Cancer Registry was linked to clinical data from the VA Corporate Data Warehouse to identify incident cases of mCRPC, defined as first incidence of radiologic evidence of metastasis and castrate resistance in patients with prostate cancer. Patient demographics, disease characteristics and treatment practices were extracted. The degree of academic affiliation of the treating facility was calculated using the Herfindahl-Hirschman Index (HHI), which reflects how dispersed medical residents are among different specialties and how many specialties are available within a given VA facility.
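
The abstract does not give the formula used for the facility score, so the following is only a minimal sketch of the textbook Herfindahl-Hirschman Index: the sum of squared shares, here applied to resident counts per specialty. The function names and the specialty counts are hypothetical. Under this textbook definition the index falls as residents spread across more specialties, so the authors' affiliation score may be a transformation of it. The 0.374 cut-off is the median reported in the Results.

```python
def hhi(resident_counts):
    """Textbook HHI: sum of squared shares across specialties.

    resident_counts maps a specialty name to its number of residents.
    This is the standard definition, not necessarily the authors' variant.
    """
    total = sum(resident_counts.values())
    if total == 0:
        raise ValueError("facility has no residents")
    return sum((n / total) ** 2 for n in resident_counts.values())


def affiliation_group(score, median=0.374):
    """Median split as described in the abstract: HAA if score >= median."""
    return "HAA" if score >= median else "LAA"


# Hypothetical facilities, for illustration only.
concentrated = {"medicine": 100}
two_way = {"medicine": 50, "surgery": 50}
four_way = {"medicine": 25, "surgery": 25, "urology": 25, "oncology": 25}

for name, counts in [("concentrated", concentrated),
                     ("two_way", two_way),
                     ("four_way", four_way)]:
    score = hhi(counts)
    print(name, score, affiliation_group(score))
```

The split mirrors the abstract's grouping rule only; it makes no claim about which real facilities fall in which group.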

Results: From 2006 to 2015, 3,637 patients received an mCRPC diagnosis and were treated in 123 VA facilities. Median HHI for treating facilities was 0.374. Of these patients, 1,723 (47%) were treated in a facility with higher academic affiliation (HAA; HHI ≥ 0.374) and 1,914 (53%) were treated in a facility with lower academic affiliation (LAA; HHI ≤ 0.373). There was no difference in patient or disease characteristics by academic affiliation; patients treated at HAA and LAA facilities had comparable Gleason scores, stage of disease at diagnosis, primary local therapy, age, and median PSA levels at time of diagnosis. Patients with mCRPC at HAA facilities were more likely to receive 1L (59% vs 55%, P = .015). Regimens frequently used for 1L were comparable: HAA: docetaxel (29%), abiraterone (22%), and enzalutamide (6%); LAA: docetaxel (25%), abiraterone (21%), and enzalutamide (7%).
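
The abstract does not say which statistical test produced P = .015. As a rough plausibility check, the sketch below assumes a pooled two-proportion z-test and uses only the rounded percentages and group sizes reported above; the function name is hypothetical. On these rounded inputs the two-sided P value comes out close to the reported figure.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided P value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Reported values: 59% of 1,723 HAA patients vs 55% of 1,914 LAA patients.
z, p_value = two_proportion_z_test(0.59, 1723, 0.55, 1914)
print(f"z = {z:.2f}, two-sided P = {p_value:.3f}")
```

Because the inputs are rounded percentages rather than exact counts, the result is an approximation and not a reproduction of the authors' analysis.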

Conclusions: Patients with mCRPC had a small but significant increase in likelihood of receiving 1L if treated in HAA vs LAA facilities. Further study will focus on identifying patient, prescriber and facility factors that are associated with the likelihood of initiating 1L and the choice of 1L regimen.

Examining Methods for Systematically Identifying Cytogenetic Testing Among Chronic Lymphoblastic Leukemia Patients

Changed
Fri, 09/08/2017 - 12:05
Abstract 10: 2017 AVAHO Meeting

Purpose: To evaluate data extraction methods for identifying cytogenetic and fluorescence in situ hybridization (FISH) testing among chronic lymphoblastic leukemia (CLL) patients in the Veterans Health Administration (VHA).

Background: Cytogenetic and FISH testing are increasingly important for assessing risk and guiding therapy in patients with CLL. Administrative health data are frequently used to study testing practices; however, they are limited in their sensitivity and reliability. Increasing adoption of electronic health records (EHR) presents an opportunity to describe clinical practices in large patient populations. We compared three different EHR extraction methods to identify cytogenetic/FISH testing in a cohort of CLL patients treated within the VHA.

Methods: CLL patients were identified using the VA Clinical Cancer Registry. Testing information was extracted from time of diagnosis to time of first treatment using three methods: (1) Current Procedural Terminology (CPT) codes; (2) Text mining of healthcare provider orders (HPO); (3) Clinical Lab Information Retrieval (CLIR), a previously validated conceptual framework that incorporates LOINC codes and test names that are then validated using test result information.

Results: 1,363 CLL patients were diagnosed and followed until their first line of therapy at VHA between 2008 and 2016: 635 (47%) had evidence of testing by text mining of HPO, 554 (41%) by CPT, and 399 (29%) by CLIR. Comparing CPT vs combined CLIR+HPO, CPT extraction had a sensitivity of 52.8%, a precision of 73.1% and an F-measure of 0.613. Cytogenetic/FISH testing increased by nearly two-fold from 2008 to 2016, regardless of extraction method: HPO text mining (25% to 51%), CPT (20% to 54%), or CLIR (19% to 32%).
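
The comparison of CPT against the combined CLIR+HPO reference can be sketched with plain sets of patient identifiers. The sketch assumes the F-measure is the usual harmonic mean of sensitivity and precision (F1), which is consistent with the reported numbers. The function names and the toy patient IDs are hypothetical.

```python
def evaluate(predicted, reference):
    """Return (sensitivity, precision) of a predicted set against a reference set."""
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference)
    precision = true_positives / len(predicted)
    return sensitivity, precision


def f_measure(sensitivity, precision):
    """Harmonic mean of sensitivity and precision (F1)."""
    return 2 * sensitivity * precision / (sensitivity + precision)


# Toy patient IDs, for illustration only.
cpt_tested = {1, 2, 3, 4}
clir_tested = {2, 3, 5}
hpo_tested = {3, 4, 6, 7}
reference = clir_tested | hpo_tested  # combined CLIR+HPO

sens, prec = evaluate(cpt_tested, reference)
print(sens, prec, round(f_measure(sens, prec), 3))

# Check of the reported figures: sensitivity 52.8%, precision 73.1%.
print(round(f_measure(0.528, 0.731), 3))
```

The last line recomputes the F-measure from the two reported rates and agrees with the reported 0.613.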

Conclusions: Advanced EHR extraction methods offer a more granular description of testing practices than administrative data alone as they examine multiple components of the EHR including the ordering, processing, and results of testing occurrences. Results suggest that there has been a slow increase in the number of CLL patients undergoing cytogenetic/FISH testing during the past decade, which is comparable to similar reports of testing practices outside the VHA, although approximately half of all CLL patients are not undergoing testing despite established clinical guideline recommendations.

Page Number
S17

Using Natural Language Processing in Radiology Reports to Identify the Presence of Metastatic Disease in Veterans With Prostate Cancer

Changed
Fri, 09/08/2017 - 12:04
Abstract 9: 2017 AVAHO Meeting

Background: Radiographic imaging is important for the diagnosis and management of cancer. Radiology reports contain a wealth of information but are typically formatted as unstructured text, making large-scale information extraction challenging. We validated a natural language processing (NLP) algorithm to identify the presence of metastatic disease in radiographic imaging reports.

Methods: Using the VA Clinical Cancer Registry and Corporate Data Warehouse, we identified approximately 3 million radiology reports for 120,374 patients receiving care for prostate cancer in the VA from 2006 to 2015. We focused on the impression section of CT, PET/CT, X-ray, bone scan, and MRI reports. We expanded on the “ConText” algorithm of Chapman et al. to identify the presence of metastatic disease: (1) using the Unified Medical Language System (UMLS), we identified terms compatible with “metastasis”; (2) report impressions were preprocessed and tokenized both at the sentence level and into parts of sentences; (3) positive and negative trigger phrases were implemented as a series of regular expressions, which were refined over a number of iterations using training data from 2 batches of 600 reports, allowing us to extend trigger identification to a larger set of phrases. The final algorithm was validated using an independent sample of 2,000 reports annotated by a domain expert.
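
The general idea of trigger phrases implemented as regular expressions can be illustrated with a deliberately small sketch. It is not the authors' algorithm and not the ConText implementation: the term list, the negation triggers, the rule that a trigger must precede the term in the same sentence, and all function names are assumptions chosen for illustration.

```python
import re

# Hypothetical, very small vocabularies; a real system would use many more phrases.
METASTASIS = re.compile(r"\bmetasta(?:sis|ses|tic)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:no evidence of|negative for|without|no)\b", re.IGNORECASE)


def classify_sentence(sentence):
    """Label one sentence as 'no_mention', 'negated', or 'affirmed'."""
    match = METASTASIS.search(sentence)
    if match is None:
        return "no_mention"
    # Assumption: a negation trigger anywhere before the term negates it.
    if NEGATION.search(sentence[:match.start()]):
        return "negated"
    return "affirmed"


def classify_impression(text):
    """Split an impression into sentences and return the strongest label found."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    labels = {classify_sentence(s) for s in sentences}
    if "affirmed" in labels:
        return "affirmed"
    if "negated" in labels:
        return "negated"
    return "no_mention"


print(classify_impression("No evidence of osseous metastatic disease."))
print(classify_impression("No acute fracture. New sclerotic lesions compatible with metastases."))
print(classify_impression("Stable degenerative changes."))
```

Splitting by sentence first keeps a negation in one sentence ("No acute fracture.") from suppressing a positive finding in the next, which is one reason the tokenization step matters.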

Results: On the first training set of 600 radiology reports, the algorithm achieved an accuracy of 94% for reports with no mention of metastasis, 85% for negated mentions of metastasis, and 74% for mentions of metastasis without negation. Errors were reviewed, resulting in vocabulary expansion and improved implementation of regular expressions to capture the expanded trigger phrases. Performance of the modified algorithm was tested on a new set of 600 reports, where accuracy increased to 96% for no mention of metastasis, 90% for negated mentions of metastasis, and 89% for mentions of metastasis without negation. After additional modifications were made, the revised algorithm was validated using an independent sample of 2,000 reports. The accuracy was 96% (Cohen’s kappa ~1), with a precision of 98% and a sensitivity of 98%.
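
For readers unfamiliar with the validation metrics named above, the sketch below shows how accuracy, precision, sensitivity, and Cohen's kappa are derived from a 2x2 confusion matrix. The abstract does not report the underlying counts, so the counts here are hypothetical round numbers and do not correspond to the study's results.

```python
def validation_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity, and Cohen's kappa from 2x2 counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    # Expected agreement by chance, from the marginal totals of both raters.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "kappa": kappa,
    }


# Hypothetical counts: algorithm output vs expert annotation.
metrics = validation_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so it depends on how common each label is as well as on accuracy.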

Conclusions: Detecting presence of metastatic disease from radiographic notes is feasible with NLP.

References: (1) Sarkar S, Das S. A review of imaging methods for prostate cancer detection. Biomed Eng Comput Biol. 2016;7(Suppl 1):1-15. (2) Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. A simple algorithm for identifying negated findings and diseases in discharge summaries. J Biomed Inform. 2001;34(5):301-310. (3) Harkema H, Dowling JN, Thornblade T. ConText: An algorithm for determining negation, experiencer, and temporal status from clinical reports. J Biomed Inform. 2009;42(5):839-851.

Page Number
S17

The Clinical Lab Information Retrieval (CLIR) Framework—An R Framework for CDW Clinical Lab Data Extraction and Retrieval

Changed
Tue, 12/13/2016 - 10:27
Abstract 42: 2016 AVAHO Meeting

Purpose: Extract, retrieve, and validate clinical lab information from the VA Corporate Data Warehouse (CDW).

Background: CDW clinical lab information provides a unique opportunity to assess real-world cancer treatment effectiveness and safety with higher granularity and validity compared to administrative data. Unfortunately, there is significant heterogeneity in how this information is encoded across time and geography. Various efforts have been made to clean these data and provide a consistent and reliable mapping; however, the availability and validity of these efforts also vary across lab concepts. This presents a significant barrier to utilization of CDW clinical lab information in comparative effectiveness research.

Methods: We defined a conceptual framework for retrieval of lab information based on 5 features: Logical Observation Identifiers Names and Codes (LOINC) codes, test names, topography, unit, and unit reference ranges. This was then implemented as a framework in R comprising 7 discrete modules. Each module corresponds to a defined task in the conceptual framework: Concept -> LOINC/test name -> cleaned LOINC/test name -> LOINC/test name internal identifier -> fact information retrieval -> topography selection -> unit and reference range cleaning and harmonization. Each module has a defined input and output, allowing implementation transparency, reproducibility, and flexibility.
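
The modular idea, in which each step has a defined input and output, can be sketched in a few lines. CLIR itself is implemented in R and its code is not shown in the abstract, so this Python sketch is not the framework: the record fields, the accepted test names and topographies, the unit table, and all function names are hypothetical. It covers only three of the steps (test-name cleaning, topography selection, unit harmonization) for a white blood cell count reported in 10^9 cells/L.

```python
import re

# Hypothetical unit spellings and the factor converting each to 10^9 cells/L.
UNIT_FACTORS = {
    "k/ul": 1.0,
    "10^3/ul": 1.0,
    "x10e3/ul": 1.0,
    "10^9/l": 1.0,
    "/ul": 0.001,
}
WBC_NAMES = {"wbc", "white blood cell count"}
BLOOD_TOPOGRAPHIES = {"blood", "whole blood", "peripheral blood"}


def clean_name(text):
    """Lowercase a free-text name and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def harmonize(value, unit):
    """Convert a value to 10^9 cells/L, or return None if the unit is unknown."""
    factor = UNIT_FACTORS.get(unit.lower().replace(" ", ""))
    return None if factor is None else value * factor


def retrieve_wbc(records):
    """Run the three steps in order and keep only fully mappable results."""
    results = []
    for rec in records:
        if clean_name(rec["test_name"]) not in WBC_NAMES:
            continue  # test-name step: not a WBC count
        if clean_name(rec["topography"]) not in BLOOD_TOPOGRAPHIES:
            continue  # topography step: wrong specimen type
        value = harmonize(rec["value"], rec["unit"])
        if value is not None:  # unit step: drop unmappable units
            results.append(round(value, 3))
    return results


records = [
    {"test_name": "WBC", "topography": "BLOOD", "value": 6.5, "unit": "K/uL"},
    {"test_name": "White Blood Cell Count", "topography": "Whole Blood", "value": 7200, "unit": "/uL"},
    {"test_name": "WBC", "topography": "URINE", "value": 3.0, "unit": "K/uL"},
    {"test_name": "WBC", "topography": "Blood", "value": 5.0, "unit": "thou"},
]
print(retrieve_wbc(records))
```

Keeping each step as a separate function with a plain input and output is what makes it possible to inspect, test, or replace one step without touching the others.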

Results: Using the CLIR framework, we retrieved peripheral blood total white blood cell (WBC) counts of patients with hematologic malignancies. In a cohort of about 300,000 patients diagnosed with and/or treated for a hematologic malignancy in the VHA between 2001 and 2016, we identified approximately 11 × 10^6 potential total WBC counts based on LOINC codes and lab test names. Of those, approximately 9 × 10^6 were mappable to the correct topography, of which the overwhelming majority (99%) were mappable to a harmonized unit and reference range.

Conclusion: The CLIR framework provides a conceptual framework and an implementation in R for clinical lab information retrieval from the VA CDW. Future efforts will entail refining the methodology across multiple data domains and comparing CLIR output with other ongoing efforts aimed at cleaning and harmonization of clinical lab data in the CDW.

Citation Override
Fed Pract. 2016 September;33 (supp 8):34S