VNR code mapping to RxNorm

Details of the VNR codes in FinnGen R6 and their mapping to standard RxNorm.

VNR codes, also known as Nordic Article Numbers, are Nordic country-specific codes. They are 6-digit codes ranging from 000001 to 199999 and from 370000 to 599999, and are assigned to all human medicines, veterinary medicines, herbal medicines, and traditional herbal medicines. Numbers outside these ranges are called National Article Numbers, which are used differently depending on the country.
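
The range rule above can be sketched as a small check. This is only an illustration: the function names are hypothetical, and zero-padding an integer VNR to six digits is an assumption based on the 6-digit format described above.

```python
def normalize_vnr(value: int) -> str:
    """Zero-pad an integer VNR to six digits (assumed canonical form)."""
    return f"{value:06d}"


def is_valid_vnr(code: str) -> bool:
    """Return True if `code` is a 6-digit string inside the VNR ranges."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        return False
    number = int(code)
    # Ranges quoted in the text: 000001-199999 and 370000-599999.
    return 1 <= number <= 199999 or 370000 <= number <= 599999


print(normalize_vnr(518))
print(is_valid_vnr("000518"), is_valid_vnr("250000"))
```

Codes that fail this check would fall under the National Article Numbers mentioned above.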

RxNorm, on the other hand, is a US-specific terminology. It provides normalized names for medications and allows linking to many drug vocabularies commonly used in the US market.

We have mapped the Nordic country-specific VNR codes to RxNorm in FinnGen R6. Although the initial mapping was performed in R6 and is located in the finngen_R6 folder of library-green, the mapping can be used with any data freeze/release.

The mapping and its readme are located at:

/finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR.tsv

/finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR_readme.txt
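
A minimal sketch of reading the mapping file with Python's standard `csv` module follows. It assumes the file is tab-separated with a header row (suggested by the `.tsv` extension) and that it is UTF-8 encoded, which is not confirmed here. The in-memory sample stands in for the real file and uses only a few of the columns with the example values documented on this page.

```python
import csv
import io

# Stand-in for the real file; the real one has many more columns.
SAMPLE = (
    "VNR\tATC\tMedicineName\tSubstance\n"
    "518\tN05AH04\tSEROQUEL\tquetiapine\n"
)


def read_vnr_rows(handle):
    """Parse a tab-separated file object into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_vnr_rows(io.StringIO(SAMPLE))
print(rows[0]["VNR"], rows[0]["MedicineName"])

# With the real file (encoding is an assumption):
# with open("/finngen/library-green/finngen_R6/finngen_R6_medical_codes/fgVNR.tsv",
#           newline="", encoding="utf-8") as handle:
#     rows = read_vnr_rows(handle)
```

Note that `csv.DictReader` returns every value as a string, so numeric columns need explicit conversion.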

How the VNR code to RxNorm mapping was done:

1. VNR codes originating from different sources within FinnGen were combined into a single table called 'OriginalVNR'.
2. Additional information on VNR codes with missing drug name, strength, or ingredient information was requested from the Pharmaceutical Information Centre (Lääketietokeskus). This information is stored in the table 'ltklVNR'.
3. Missing ingredient information for VNR codes was filled in using ATC codes.
4. Administration routes, dosage forms, and units were created for codes in both the 'OriginalVNR' and 'ltklVNR' tables.
5. We processed the source text of Package, Substance, Substance strength, Administration Route, and Dosage Form for both the 'OriginalVNR' and 'ltklVNR' tables.
6. We used the OHDSI DrugMapping tool to map the parsed VNR code information to standard RxNorm.

More details on the steps up to step 5 can be found in the GitHub repository.
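
The idea of filling missing ingredient information from ATC codes can be sketched as follows. This is not the code from the repository: the function name and the one-entry lookup table are hypothetical, and the real lookup would need to cover every ATC code of interest.

```python
# Hypothetical ATC-to-substance lookup with a single entry for illustration.
ATC_TO_SUBSTANCE = {"N05AH04": "quetiapine"}


def fill_missing_substance(row, lookup):
    """Return a copy of `row` with an empty Substance filled from its ATC code."""
    filled = dict(row)
    if not filled.get("Substance"):
        # Unknown ATC codes leave the substance empty rather than guessing.
        filled["Substance"] = lookup.get(filled.get("ATC", ""), "")
    return filled


incomplete = {"VNR": "518", "ATC": "N05AH04", "Substance": ""}
print(fill_missing_substance(incomplete, ATC_TO_SUBSTANCE))
```

Rows that already carry a substance are left untouched, so source information takes precedence over the ATC-derived value in this sketch.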

The columns of fgVNR.tsv are described in the table below.

| Column Name | Column Type | Description | Example |
| --- | --- | --- | --- |
| VNR | INT64 | Six-digit VNR code. | 518 |
| ATC | STRING | ATC group code. | N05AH04 |
| MedicineName | STRING | Commercial name. | SEROQUEL |
| AdministrationRouteSourceTextFI | STRING | Administration route in text format as in the source. | Suun kautta |
| AdministrationRoute | STRING | Valid value for administration route. | Oral use |
| DosageFormSourceTextFI | STRING | Dosage form in text format as in the source. | tabletti, kalvopäällysteinen |
| DosageForm | STRING | Valid value for dosage form. | film-coated tablet |
| PackageSourceTextFI | STRING | Package info in text format as in the source. | 10 FOL |
| PackageSize | FLOAT64 | Size of the package in float format. | |
| PackageFactor | INT64 | Factor of the package. | |
| PackageUnit | STRING | A valid unit value. | fol |
| SubstanceSourceTextFI | STRING | List of substances as in the source. | quetiapine |
| Substance | STRING | Substance name, one row per substance. | quetiapine |
| SubstanceStrengthTextFI | STRING | Substance's strength in text format as in the source. | 25+100+200 mg |
| Strength | STRING | Mapped, fixed, or split substance strength. Otherwise the source strength is used. | 100 mg |
| SubstanceStrengthNumenatorValue | FLOAT64 | Numerator value of the substance's strength in float format. | 100 |
| SubstanceStrengthNumenatorUnit | STRING | A valid unit value. | |
| SubstanceStrengthDeominatorValue | FLOAT64 | Denominator value of the substance's strength in float format. | |
| SubstanceStrengthDeominatorUnit | STRING | A valid unit value. | |
| ValidRange | BOOL | True if the VNR is in the valid range (less than 200000 or between 370000 and 599999). | TRUE |
| Source | STRING | The table from which the code was taken. | ltklVNR or "originalVNR" |
| Status | STRING | How well the medicine has been processed. | incomplete_dosageForm |
| VNRnew | STRING | A temporary VNR code created for drugs with a single substance and multiple strength values. The temporary VNR code has the letter a, b, or c attached to the end. | 000518a |
| calculateTotalStrength_message | STRING | How well the strength has been processed. | correct or "missmatch" |
| TotalStrength | FLOAT64 | Total strength of the drug, which is PackageSize * PackageFactor * Dosage. | 10 * 1 * 100 = 1000 |
| TotalStrengthUnit | STRING | A valid unit for the total strength. | |
| n_codes | INT64 | Frequency of the VNR code. | 260 |
| Dosage | FLOAT64 | SubstanceStrengthNumenatorValue / SubstanceStrengthDeominatorValue. | 100/1 = 100 |
| DosageUnit | STRING | A valid unit value. | |
| MedicineNameFull | STRING | Commercial name, dosage form, and substance strength. | SEROQUEL 25+100+200 mg |
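
The Dosage and TotalStrength formulas from the column descriptions can be worked through with the example values given there. The function names are hypothetical, and defaulting the denominator to 1 is an assumption for illustration only; how missing denominators are handled in the actual data is not stated here.

```python
def dosage(numerator_value: float, denominator_value: float = 1.0) -> float:
    """Dosage = SubstanceStrengthNumenatorValue / SubstanceStrengthDeominatorValue."""
    return numerator_value / denominator_value


def total_strength(package_size: float, package_factor: int, dosage_value: float) -> float:
    """TotalStrength = PackageSize * PackageFactor * Dosage."""
    return package_size * package_factor * dosage_value


# Example values from the column descriptions: 100/1 = 100, then 10 * 1 * 100 = 1000.
d = dosage(100.0, 1.0)
total = total_strength(10.0, 1, d)
print(d, total)
```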

The fgVNR.tsv file was used as the input for the OHDSI DrugMapping tool. The tool requires the VNR code with substance information, followed by dosage form and drug strength. If no substance information is present, the code is not mapped.

The DrugMapping tool requires a Common Data Model (CDM) database with vocabulary data. To create the CDM, we:
- Extracted the CDM database schema from the OHDSI common data model for version 5.3.2 of the CDM.
- Created the CDM database schema in a PostgreSQL server, version 14.2.
- Modified the CDM v5.3.2 SQL files generated from the OHDSI common data model, because the PostgreSQL server version is newer than 9.
- Downloaded the vocabulary data from Athena: the default vocabulary list plus additional vocabularies for "Dosage Form".
- Uploaded the vocabulary data from Athena to the PostgreSQL CDM v5.3.2 database.
Once the CDM database was up and running in PostgreSQL, we set up the DrugMapping tool:
- The DrugMapping tool can produce several kinds of maps. We produced the "Clinical Drug Map".
- The input file for the DrugMapping tool was formatted to add missing columns.
- The table below shows the total number of input drugs and how many of them could possibly be mapped to a Clinical Drug:

  | Input | Value |
  | --- | --- |
  | Total drugs | 15,928 |
  | Drugs with non-missing VNR codes | 15,902 |
  | Unique VNR codes | 14,655 |
  | Drugs with non-missing VNR codes + ingredient codes | 13,876 |
  | Drugs with non-missing VNR codes + ingredient codes + dosage form | 12,897 |
  | Drugs with non-missing VNR codes + ingredient codes + dosage form + dosage value | 12,644 |
  | Drugs with non-missing VNR codes + ingredient codes + dosage form + dosage value + dosage unit | 12,638 |
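
The cumulative counts above can be thought of as a funnel in which each step keeps only the rows that also have the next field filled in. The sketch below illustrates that idea on toy rows. The function name is hypothetical, and using the `Substance` column as the ingredient field is an assumption; it is not the code that produced the figures above.

```python
# Fields required for a Clinical Drug map, checked cumulatively (assumed column names).
REQUIRED = ["VNR", "Substance", "DosageForm", "Dosage", "DosageUnit"]


def mapping_funnel(rows, required=REQUIRED):
    """Count rows that have every field up to and including each required column."""
    counts = {"total": len(rows)}
    remaining = rows
    for column in required:
        remaining = [r for r in remaining if r.get(column) not in (None, "")]
        counts[f"with_{column}"] = len(remaining)
    return counts


toy_rows = [
    {"VNR": "000518", "Substance": "quetiapine", "DosageForm": "film-coated tablet",
     "Dosage": 100.0, "DosageUnit": "mg"},
    {"VNR": "000519", "Substance": "quetiapine", "DosageForm": "",
     "Dosage": 25.0, "DosageUnit": "mg"},
    {"VNR": "", "Substance": "quetiapine", "DosageForm": "film-coated tablet",
     "Dosage": 25.0, "DosageUnit": "mg"},
]
print(mapping_funnel(toy_rows))
```
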
After we fixed the input file, the DrugMapping tool created three intermediary files:
- Ingredient Name Translation File
- Unit Mapping File
- Dose Form Mapping File
All three intermediary files need to be filled in carefully:
- Ingredient Name Translation File: the simplest to fill in, using the input file
- Unit Mapping File
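  - Source units had to be mapped to standard units such as 'mg' and 'mL'. Example:

    | SourceUnit | DrugCount | RecordCount | Factor | TargetUnit | Comment |
    | --- | --- | --- | --- | --- | --- |
    | % | 303 | 1109084 | 0.01 | mg/mg | |
    | IU | 309 | 400416 | 1 | [U] | |
    | U | 25 | 360 | 1 | [U] | |
    | g | 160 | 1614186 | 1000 | mg | |
    | g/l | 8 | 1112 | 1 | mg/mL | |
    | mg | 20905 | 68649900 | 1 | mg | |
    | mg/days | 111 | 54182 | 0,289 | mg/h | |
    | mg/h | 5 | 455 | 1 | mg/h | |
    | milli.IU | 73 | 478234 | 1000000 | [U] | |
    | ml | 5 | 4885 | 1 | mL | |
    | ug/puffs | 6 | 212536 | 0.001 | mg/{actuat} | |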
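
As a rough illustration of how a unit mapping row is applied, the sketch below multiplies a source value by the Factor to obtain the value in the TargetUnit. That reading of Factor is an assumption, although it is consistent with rows such as g to mg (1000). The names are hypothetical and only three rows from the example are included. The parser also accepts a decimal comma, since one Factor in the example is written as `0,289`.

```python
# (Factor, TargetUnit) per SourceUnit, copied from three rows of the example.
UNIT_MAP = {
    "g": ("1000", "mg"),
    "ml": ("1", "mL"),
    "ug/puffs": ("0.001", "mg/{actuat}"),
}


def parse_factor(text: str) -> float:
    """Parse a factor written with either a decimal point or a decimal comma."""
    return float(text.replace(",", "."))


def to_target_unit(value: float, source_unit: str, unit_map=UNIT_MAP):
    """Convert `value` from `source_unit` to its mapped target unit."""
    factor_text, target_unit = unit_map[source_unit]
    return value * parse_factor(factor_text), target_unit


print(to_target_unit(5, "g"))
print(parse_factor("0,289"))
```
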
- Dose Form Mapping File
  - First, extract all dose forms in domain "Drug" with concept_class "Dose Form" from all the vocabularies in the CDM database.
  - Second, extract the relationships with relationship_id "Source - RxNorm eq" from the CONCEPT_RELATIONSHIP table for all non-standard "Dose Form" concepts from the additional vocabularies.
  - Match the cells in "DoseForm" to the extracted standard dose forms. This matched only 49 of the 147 dose forms.
  - We filled in a further 85 dose forms manually, leaving only 13 low-frequency dose forms unmapped. An example of the filled dose form file:

    | DoseForm | DrugCount | Priority | concept_id | concept_name | Comments |
    | --- | --- | --- | --- | --- | --- |
    | BASIC CREAM | 11 | | 19082224 | Topical Cream | |
    | BATH ADDITIVE | 1 | | 19082228 | Topical Solution | |
    | BODY LOTION | 1 | | | | |
    | CAPSULE | 279 | 0 | 19082168 | Oral Capsule | Standard |
    | CAPSULE | 279 | 1 | 19021887 | Capsule | Non-Standard |
    | CAPSULE, HARD | 664 | | 19082168 | Oral Capsule | |
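
The two extraction steps for dose forms can be sketched as a single query. The example below uses an in-memory SQLite database as a stand-in for the PostgreSQL CDM database, with a reduced subset of the CONCEPT and CONCEPT_RELATIONSHIP columns. The rows, the concept IDs, and the vocabulary name `ToyVocab` are all made up for illustration, and identifying the additional vocabularies as "anything other than RxNorm" is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id INTEGER PRIMARY KEY,
    concept_name TEXT,
    domain_id TEXT,
    vocabulary_id TEXT,
    concept_class_id TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1 INTEGER,
    concept_id_2 INTEGER,
    relationship_id TEXT
);
-- Toy rows only; IDs and names are not real Athena content.
INSERT INTO concept VALUES
    (1, 'Oral Capsule', 'Drug', 'RxNorm', 'Dose Form'),
    (2, 'Capsule, hard', 'Drug', 'ToyVocab', 'Dose Form'),
    (3, 'Body lotion', 'Drug', 'ToyVocab', 'Dose Form');
INSERT INTO concept_relationship VALUES (2, 1, 'Source - RxNorm eq');
""")

# Non-RxNorm dose forms with their RxNorm equivalent, if one is recorded.
QUERY = """
SELECT src.concept_name, std.concept_id, std.concept_name
FROM concept AS src
LEFT JOIN concept_relationship AS rel
       ON rel.concept_id_1 = src.concept_id
      AND rel.relationship_id = 'Source - RxNorm eq'
LEFT JOIN concept AS std ON std.concept_id = rel.concept_id_2
WHERE src.domain_id = 'Drug'
  AND src.concept_class_id = 'Dose Form'
  AND src.vocabulary_id <> 'RxNorm'
ORDER BY src.concept_id
"""
matches = conn.execute(QUERY).fetchall()
for row in matches:
    print(row)
```

Dose forms that come back without an RxNorm equivalent are the ones that would need manual filling in the Dose Form Mapping File.
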
- The results of the DrugMapping tool after all three files were carefully filled in are shown below. Of the 12,638 drugs that could possibly be mapped, 12,089 (95.6%) were mapped.

  | Mapping result | Source drugs |
  | --- | --- |
  | Mapped to Clinical Drug | 12089 of 14692 (82.283%) |
  | Mapped to Clinical Drug Form | 562 of 14692 (3.825%) |
  | Mapped to Clinical Drug Comp | 354 of 14692 (2.409%) |
  | Mapped to Ingredient | 588 of 14692 (4.002%) |
  | Mapped Splitted | 74 of 14692 (0.504%) |
  | Mapped Splitted Incomplete | 3 of 14692 (0.02%) |
  | Mapped Total | 13670 of 14692 (93.044%) |
  | Mapped to None | 1022 of 14692 (6.956%) |
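
The totals and percentages in these results follow directly from the individual counts, which the short check below reproduces. The variable and function names are hypothetical; the counts are the ones reported above.

```python
TOTAL_SOURCE_DRUGS = 14692

# Counts per mapping level, as reported in the results.
MAPPED = {
    "Clinical Drug": 12089,
    "Clinical Drug Form": 562,
    "Clinical Drug Comp": 354,
    "Ingredient": 588,
    "Splitted": 74,
    "Splitted Incomplete": 3,
}
UNMAPPED = 1022


def percent(count: int, total: int = TOTAL_SOURCE_DRUGS) -> float:
    """Share of `total` as a percentage, rounded to three decimals."""
    return round(100 * count / total, 3)


mapped_total = sum(MAPPED.values())
for level, count in MAPPED.items():
    print(f"{level}: {count} of {TOTAL_SOURCE_DRUGS} ({percent(count)}%)")
print(f"Total: {mapped_total} ({percent(mapped_total)}%), none: {UNMAPPED} ({percent(UNMAPPED)}%)")
```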
