Development of a Database for Study Data in Registration Applications for Veterinary Medicinal Products

Objective: This study investigated the feasibility of systematically recording clinical study data from marketing authorisation applications for veterinary medicinal products (VMPs) and the benefits of the selected approach. Background: Registration dossiers for veterinary medicinal products contain extensive data from drug studies, which are not easily accessible to assessors. Evidentiary value: Fast access to these data, including specific search tools, could facilitate meaningful use of the data and allow assessors to compare tests and studies from different dossiers. Methods: First, pivotal test parameters and their mutual relationships were identified. Second, a data model was developed and implemented in a relational database management system, including a data entry form and various reports for database searches. Compilation of study data in the database was demonstrated using all available clinical studies involving VMPs containing the anthelmintic drug Praziquantel. Options for data evaluation, including graphical presentation, were shown by means of descriptive data analysis. The suitability of the database to support meta-analyses was tentatively validated. Results: The data model was designed to cover the specific requirements arising from study data. A total of 308 clinical studies related to 95 VMPs containing Praziquantel (single agent and combination drugs) were selected for prototype testing. The relevant data extracted from these studies were appropriately structured and shown to be suitable in principle for descriptive data analyses as well as for meta-analyses. Conclusion: The database-supported collection of study data would provide users with easy access to the continuously increasing pool of scientific information held by competent authorities and would enable specific data analyses. The database design allows the data model to be expanded to all types of studies and classes of drugs registered in veterinary medicine. The need for detailed data recording must be carefully balanced against the versatility of the data model. Application: The database will be used by regulatory authorities.


INTRODUCTION
Medicinal products for human or veterinary use must have suitable properties in terms of quality, safety and efficacy, which must be shown by scientific studies and other scientific literature. The supporting documents are compiled in registration dossiers, which are the primary source of information for regulatory authorities evaluating medicinal products.
In Germany, veterinary medicinal products (VMPs) are authorised by the Federal Office for Consumer Protection and Food Safety (Bundesamt für Verbraucherschutz und Lebensmittelsicherheit, BVL). This authority keeps a large amount of scientific documentation related to VMPs. Registration dossiers usually consist of four parts, of which Part III Safety and Part IV Efficacy contain the necessary information, including study reports on pharmacology, toxicology, safety, and efficacy of the product or active compound. Table 1 gives an overview of the main studies which must be submitted along with the application for authorisation. The authority has archived all applications for authorisation, including the supporting dossiers, in electronic or paper form. Administrative product data, including the Summaries of Product Characteristics (SPCs), are managed in a central database (Arzneimittel-Informationssystem, AMIS), which is used for management purposes and to provide non-confidential information to the public. Scientific data from registration dossiers have not been stored in a database so far. Therefore, if assessors need to make use of study data, they have to extract them from the paper or electronic dossiers.
In human medicine, various databases for the registration of clinical studies on the efficacy of pharmaceuticals have been established in recent years. In veterinary medicine, no similar database systems for clinical studies have been established so far.
In 2015 an international consortium started the VetAllTrials initiative, dedicated to the development of one or more veterinary clinical trial registries (VetAllTrials, 2015). Some attempts at collecting study data in veterinary medicine with limited focus and user groups have been made. For the systematic collection of data from studies in veterinary medicine, however, only publication databases such as Medline or Veterinary Science Database have been available up to now.
In line with the established practice for studies on human medicinal products, a database for authority use should also be established for registration studies on veterinary medicinal products. Whether parts of it may become publicly available should be decided at a later time.
Nowadays the majority of marketing authorisations for VMPs in the European Union are granted by way of the decentralised procedure (DCP) or the mutual recognition procedure (MRP), both of which involve more than one Member State (MS). This is accompanied by an increasing exchange of information between competent authorities (CAs) in order to reach a consensual decision. A common database that includes all studies submitted to European CAs would be a useful tool to facilitate these procedures within the Network.
The objective of this feasibility study was to develop a database for recording data from drug registration studies. The pivotal steps of this study included the construction of a relational data model, the implementation of the data model in a database, and the testing of functions and application options of the database. Finally, application examples from an authority perspective were presented.

METHODS & MATERIALS
The dossiers for registration applications for VMPs submitted to BVL between 1974 and 2010 served as the data source. From the total number of dossiers, a subset of 95 was selected, encompassing all applications for VMPs containing the anthelmintic agent Praziquantel as either the only or one of several active substance(s). Data from Part I and Part IV of the dossiers, i.e. data concerning the drugs and data on clinical efficacy studies, were included in this feasibility study. Within this publication, no confidential data are disclosed.

Data model design and implementation in a prototype database (MS Access)
Data modelling was based on the entity-relationship method (Chen, 1976). First, the dossiers were screened and the main product data and relevant study data were identified. Entities, attributes and their relationships were determined as the key components of a relational data model.
Rules of normalisation were applied to the entire study and product data. To this end, all tables were split to remove any data redundancy and to resolve dependencies between non-key fields (Klug, 2008). The data model was constructed of data tables, assignment tables and lookup tables. Each table contained an attribute which served as a primary key that uniquely identified a data set. Primary keys were labelled with the suffix "ID" (e.g. Study_ID, Result_ID).
Each data table represented an entity including its attributes. Lookup tables provided selection lists of terms to the user (e.g. list of species or active substances). Some of these lists were based on internationally agreed terminology in the field of drug registration and pharmacovigilance (ATCvet, 2016; EDQM Standard Terms, 2016).
Tables were linked by 1:n relationships, and the primary key of the referenced table served as the foreign key in the associated tables (Bildner, 2008).
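The interplay of primary keys, lookup tables and 1:n relationships described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than MS Access, and the table names Prf_Species and Data_Study as well as all columns except Study_ID are assumptions made for this illustration, not the tables of the prototype.

```python
import sqlite3

# Hypothetical miniature of the model: one lookup table and one data table.
# Only the key name Study_ID is taken from the text; everything else is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE Prf_Species (
    Species_ID  INTEGER PRIMARY KEY,
    SpeciesName TEXT NOT NULL UNIQUE
);
CREATE TABLE Data_Study (
    Study_ID   INTEGER PRIMARY KEY,
    StudyTitle TEXT NOT NULL,
    Species_ID INTEGER NOT NULL REFERENCES Prf_Species(Species_ID)
);
""")
conn.executemany("INSERT INTO Prf_Species VALUES (?, ?)", [(1, "dog"), (2, "cat")])
conn.executemany(
    "INSERT INTO Data_Study VALUES (?, ?, ?)",
    [(1, "Study A", 1), (2, "Study B", 1), (3, "Study C", 2)],
)

# 1:n relationship: one lookup row is referenced by many study rows.
studies_per_species = conn.execute("""
    SELECT s.SpeciesName, COUNT(*)
    FROM Data_Study d
    JOIN Prf_Species s ON s.Species_ID = d.Species_ID
    GROUP BY s.SpeciesName
    ORDER BY s.SpeciesName
""").fetchall()

# A term that is not in the lookup table is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Data_Study VALUES (4, 'Study D', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Storing the species name once in the lookup table and referencing it by key is what removes the redundancy that normalisation targets; the foreign key additionally restricts entries to the agreed list of terms.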
The relational data model was implemented in a prototype database using MS Access 2003.

Statistical evaluation
Following transfer of the data from Access to SAS (Version 9.1.3, Statistical Analysis System, SAS Institute, Cary, NC, USA), descriptive data analyses were performed.
Where applicable, significance testing was performed using the Mann-Whitney U test and the Kruskal-Wallis test (Clauß and Ebner, 1989). The null hypothesis of no differences between the groups under observation was rejected if the p-value was below 0.05.
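For readers unfamiliar with the Mann-Whitney U test, the following sketch shows how the test statistic and a two-sided p-value can be computed. It is a plain-Python illustration using the normal approximation with tie correction and no continuity correction; it is not the SAS procedure used in the study, and the sample values are invented.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    # Pool both samples and remember the group (0 = x, 1 = y) of each value.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical: no evidence of a difference
        return u, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p

# Invented example: animals per study in two groups of studies.
group_a = [10, 12, 15, 20, 22]
group_b = [30, 35, 40, 45, 50]
u_stat, p_value = mann_whitney_u(group_a, group_b)
reject_null = p_value < 0.05
```

For samples as small as in this example, an exact test would usually be preferred over the normal approximation; the sketch is only meant to make the decision rule concrete.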

Data source: choice of compound and study types
In veterinary medicine, the most frequently used medicinal products are indicated for the treatment of parasitic and bacterial infections and the majority of authorised VMPs in Germany as well as in the European Union are antiparasitics and antibiotics (IFAH Annual Report, 2013).
Compounds considered suitable for the purpose of this feasibility study should already have been on the market in Germany for a long time (since the intention was to include product and study data from the early days of German Drug Law until the deadline in December 2010). The products should have been approved for use in food-producing animal species and pet animal species. They should have been authorised by the national, decentralised/mutual recognition or centralised procedure in order to cover the different data requirements involved. These criteria were best met by the anthelmintic compound Praziquantel. This agent is used in dogs, cats, horses and sheep to treat infestations with flatworms (Plathelminthes) such as tapeworms and trematodes. It is available as a single agent VMP or as a combination VMP, i.e. in combination with other anthelmintic agents.
Three subtypes of clinical studies on the efficacy of Praziquantel were identified in the dossiers, namely dose determination/dose titration studies, dose confirmation studies, and clinical evaluation/field studies. All studies meeting the criteria listed in Table 2 were included in the database prototype and subsequently used for data analysis.

Inclusion criteria
- marketing authorisation applications for drugs for veterinary use containing Praziquantel
- applications since 1976
- clinical studies conducted in target species
- current status of VMP: marketable, suspended, withdrawn
- test drugs: finished medicinal products for animals containing Praziquantel and possibly further substances for antiparasitic treatment
- dosage forms: topical, oral, parenteral
- prescription-only, pharmacy-only and over-the-counter VMPs
- type of archiving of dossiers: electronic or paper form
- marketing authorisation granted, refused or application withdrawn

Exclusion criteria
- original study data not included in the registration dossier (e.g. data from publications)

Data model design
The relational data model rests on two pillars: product data, i.e. administrative data and product characteristics, represented by the central entity Data_Product, and data from clinical studies, represented by the central entity Data_StudyAdminData. Product data and study data were linked through an n:m relationship, so that one product can be associated with several studies and one study with several products.
An assignment table (Zuord_StudyAdminData_Product) was placed between these central entities in order to break down their relationship into two 1:n relationships.
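How an assignment table resolves the n:m relationship into two 1:n relationships can be sketched as follows. The three table names are taken from the data model; the key name Product_ID, the reduced column set, the example rows and the two helper functions are assumptions for illustration, and SQLite (via Python's sqlite3 module) stands in for MS Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Data_Product (
    Product_ID INTEGER PRIMARY KEY,   -- assumed key name
    VMP        TEXT NOT NULL
);
CREATE TABLE Data_StudyAdminData (
    Study_ID   INTEGER PRIMARY KEY,
    StudyTitle TEXT NOT NULL
);
-- Assignment table: each row pairs exactly one study with one product.
CREATE TABLE Zuord_StudyAdminData_Product (
    Study_ID   INTEGER NOT NULL REFERENCES Data_StudyAdminData(Study_ID),
    Product_ID INTEGER NOT NULL REFERENCES Data_Product(Product_ID),
    PRIMARY KEY (Study_ID, Product_ID)
);
INSERT INTO Data_Product VALUES (1, 'Product A'), (2, 'Product B');
INSERT INTO Data_StudyAdminData VALUES (10, 'Dose confirmation'), (11, 'Field study');
INSERT INTO Zuord_StudyAdminData_Product VALUES (10, 1), (10, 2), (11, 1);
""")

def studies_for_product(product_id):
    """Study_IDs submitted for one product (hypothetical helper)."""
    rows = conn.execute("""
        SELECT s.Study_ID
        FROM Data_StudyAdminData s
        JOIN Zuord_StudyAdminData_Product z ON z.Study_ID = s.Study_ID
        WHERE z.Product_ID = ?
        ORDER BY s.Study_ID
    """, (product_id,))
    return [r[0] for r in rows]

def products_for_study(study_id):
    """Product_IDs whose dossiers contain one study (hypothetical helper)."""
    rows = conn.execute("""
        SELECT Product_ID
        FROM Zuord_StudyAdminData_Product
        WHERE Study_ID = ?
        ORDER BY Product_ID
    """, (study_id,))
    return [r[0] for r in rows]
```

In this toy data set, study 10 is linked to both products, which is the situation that arises when one study is re-used in more than one dossier.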
The following sections give a brief description of the entities, their attributes and relationships resulting from the requirements analysis and implemented in the data model. An overview of the complete relational data model is given in Figure 1. Terms referring to those used in the data model are printed in italics.

Product data
Product data were mapped in the data model by the central entity Data_Product. This entity contains attributes that are commonly used in the daily work of regulatory authorities and the pharmaceutical industry. These are: (1) name of the drug (VMP); (2) submission number (ENR), which is a unique seven-digit number allocated to each VMP for which a registration application has ever been submitted; (3) authorisation number (AuthorNumb), which is a unique number allocated to approved products; (4) dosage form (DosForm) such as tablet, solution, ointment; (5) ATCvet code (ATCcode) based on the ATC index for the classification of drugs (ATCvet, 2016); and (6) the authorisation procedure (Procedure), which is either national, decentralised or centralised depending on the European Member States involved. Four other entities were connected with the central entity. They provide selection lists (Prf) of active substances, dosage forms, ATCvet codes, and application procedures (Prf_Substance; Prf_DosageForm; Prf_ATC; Prf_Procedure). Of these, the entity Prf_Substance was linked to the central entity via an assignment table (Zuord_Product_Substance), since VMPs may be composed of more than one active substance and each active substance can be used in several products.

Study data
The other part of the data model represents the relevant study-related data and their relationships. The central entity is Data_StudyAdminData, which encompasses attributes of a purely administrative nature and attributes which focus on the study design. These attributes are: (1) the type of the study (StudySubCat) as shown in Table 1; (2) the study identification number (StudyNo), which was allocated by the sponsor; (3) name of the study (StudyTitle); (4) beginning and end of the study (StudyYear1, StudyYear2); (5) place of study (StudyCountry1, StudyCountry2 etc.); (6)-(10) randomisation of the test animals; blinding of the assessors; compliance with Good Clinical Practice (VICH GL 9, 2000); compliance with Good Laboratory Practice (Directive 2004/9/EC, Directive 2004/10/EC); and the inclusion of a control group. Combo boxes (Yes/No/Unknown) were added for the attributes (6) to (10). Data entries for the attributes study type and place of study were made by means of lookup tables. Text fields were added for (11) the description of the study design (StudyDesign); (12) object of the study (TargetParam); (13) methods used (MeasMeth); (14) inclusion and exclusion criteria (CriteriaForInclusion); and the description of the results (ResultsDescript). Lastly, (15) the total number of animals (TotNumbAnim) was included in the central entity. The data model can easily be made applicable to other types of studies (e.g. non-clinical studies) by adding further entities to the basic structure. Thus, extensions can be made without the need for modifications to the original data model.

Implementation in MS Access
Microsoft Access 2003 was used to implement the data model (Figure 1). Additional features were implemented to allow for convenient input of study data and for database queries with data outputs in a comprehensive and clear format.
The main functions for database users are data entry and search. Convenient access to these functions was provided by a top menu (Figure 2), which leads to data entry on the one hand and to queries on the other. The items printed in bold have already been realised in the prototype; these may be supplemented in an extended version. For data entry, a form was created which automatically fills the data tables and includes drop-down lists for the contents of lookup tables. Besides ease of use, this helps to avoid errors in data entry, as data sets in all related tables are generated automatically.
Four queries were implemented which exemplify ad hoc searches on study data. Three of these queries identify the studies available per product, per active substance and per ATCvet code; the main study data are given as tabular outputs. Complete details of study reports can be requested with the fourth query on study details. These basic search functions may be extended by the addition of filters in further database versions. In reports a) to c), filters for study year, target species or study quality (randomisation, blinding, and number of animals included) would produce more specific results, provided that a sufficient number of studies are in the database. The use of filters should be optional. Users are able to obtain an overview of the available studies and then select an appropriate subset of studies.
Due to the feasibility nature of the study, the results shown below serve as examples only and do not allow conclusions to be drawn concerning the entire body of clinical studies submitted to BVL. As only a small subset of the available study data was included in the database prototype, the results should be read as illustrations of how the data can be used rather than interpreted for their content.
Data from 308 clinical studies derived from the registration dossiers of 95 veterinary drugs containing Praziquantel (of which 33 were single agent products and 62 combination products) met the inclusion criteria (Table 2). Data from these clinical studies were largely similar and could be recorded in a structured manner; they were therefore suitable for inclusion in the database.
The dossiers of 33 products (=35%) contained bibliographical data or referred to other authorised drugs. Therefore, no clinical studies were presented in the dossiers of these products.
Seventy-five percent of the dossiers contained between 0 and 8 studies, with 35% containing no studies and 40% containing between 1 and 8 studies. The remaining dossiers contained between 9 and 100 studies. The maximum of 100 studies was found in one dossier from 1993. As most of these studies were not conducted and presented according to current guidelines and included very small numbers of animals per study, they would not be counted as full-value studies nowadays (see Figure 3). Stratifying the number of studies by 5-year period shows how the years in which studies were conducted are distributed, which has to be considered when assessing changes in certain parameters over time. Up to 1975, 69 studies on the single agent products had already been conducted; further studies on different combinations of Praziquantel with other substances have been conducted in recent years. As studies were carried out in all periods, study evaluation and time series analysis with regard to quality-related parameters are possible (see Table 3).
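The stratification by 5-year period can be sketched in a few lines of Python. The period boundaries (1971-1975, 1976-1980, and so on) are an assumption chosen so that one period ends in 1975, the function names are hypothetical, and the years in the example are invented rather than taken from the database.

```python
from collections import Counter

def five_year_period(year, origin=1971):
    """Return (first_year, last_year) of the 5-year period containing `year`.

    `origin` is the assumed first year of one of the periods.
    """
    start = origin + (year - origin) // 5 * 5
    return (start, start + 4)

def studies_per_period(study_years):
    """Count studies per 5-year period, ordered chronologically."""
    counts = Counter(five_year_period(y) for y in study_years)
    return dict(sorted(counts.items()))

# Invented start years of six studies (e.g. values of StudyYear1).
example_years = [1974, 1975, 1976, 1993, 1995, 2008]
period_counts = studies_per_period(example_years)
```

Because Python's floor division rounds towards negative infinity, years before the assumed origin are also assigned to a correct period (for instance, 1969 falls into 1966-1970).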

Table 3: Number of studies stratified for 5-year periods when studies were conducted
The analysis of the distribution of the number of studies per authorisation procedure gives an impression of the amount of knowledge gained in different kinds of procedures. The mean number of studies per drug is largest for centralised procedures, as these are used for new and innovative agents or combinations (Table 4).
Several other analyses were performed, looking at the number of studies used in one or several marketing authorisation applications (Figure 4) and, conversely, at the number of studies per dossier. These give information on the re-use of studies for more than one drug and on the amount of clinical data per dossier. Re-using studies in more than one dossier is possible if a company applies for or is granted more than one marketing authorisation for the same or a similar product.
The total number of animals included in a clinical study is an important criterion for the power of the study and hence its quality. In the selected subset of studies, study size ranged from one (only in very old and pilot studies) to 539 animals per study (median 20 animals). Based on the limited number of studies included, further stratification was performed for study size in the different kinds of authorisation procedures. Other stratifications, e.g. per species or study subtype, would be of interest for larger numbers of clinical studies.
Some parameters associated with the quality of studies were analysed for variation over time. This may be interpreted against the background of changes in legislation and the adoption of guidelines for conducting clinical studies. Whereas study size, measured as the number of animals per study, did not increase significantly, the proportion of studies with blinding, randomisation, and GCP and/or GLP compliance clearly increased over time (Figure 5). For a larger number of studies, this analysis should be performed for single years rather than time intervals, to see whether the implementation of guidelines has led to greater changes in study quality.

Evaluation of the data model and critical assessment of study data quality
The data model developed in this feasibility study has proved to be suitable for presenting clinical study data from registration dossiers. It contains the most relevant information on veterinary medicinal products and their active substances. For the study data, a modular approach has been chosen, consisting of more general data relevant for all study types (administrative data, e.g. study number, date, title, results) and of data with specific information applicable for one study type only. Information from clinical studies has been used as an example here, but tables for other study types can be added without any need for changes to the already existing structure.
No major flaws were noticed during practical testing. However, it became evident that some minor changes to the data model would clearly improve the performance and usability of the database.
3. In the currently implemented version, up to three countries per study can be recorded, which was not sufficient for the dataset used for testing. More veterinary drugs are going to be developed for small patient cohorts, and the animal patients used in the registration studies need to be recruited from all over the world. The current limit of three countries should be removed by introducing a further assignment table linking the data table Data_StudyAdminData and the lookup table Prf_StudyCountry.
4. All active substances included in test and reference products should be searchable in the database. In the current version, active substances are related to the test product only. Other products that are administered in a clinical study, and the active ingredients of these products, are not covered by the current search functions of the database. A more precise mapping of data from study reports containing different products could be realised by adding an assignment table to link the lookup table Prf_Substance to the data table Data_Dosage_ClinicalStudies. This would allow all active substances administered to be recorded in compliance with the first normal form and would allow searches on a test product basis.
5. In the set of studies used for testing the data model, some studies integrate more than one study type within one study, e.g. a prevalence study and an efficacy study. As these include study aims and study details which differ from one another, they could not be fully represented as one study within the prototype data model. This issue was not addressed in our feasibility study but may be covered by hierarchical sets of study type specific substudies.
In one aspect the data model should even undergo some simplification: the data table Data_Dosage_ClinicalStudies contains information on the parasite species a drug is effective against, which is applicable for products for antiparasitic use only. Knowledge gained on the current study population showed that parasite species were the same among tested dosage groups. Thus, combining dosage group and parasite species is considered unnecessary. As the data model should also be suitable for clinical studies on other kinds of drugs, the demands of detailed recording of data should be balanced with the need for versatility of the data model. Therefore, information concerning the parasite species should be integrated in the more general target parameter data field (data field TargetParam in table Data_StudyAdminData). Otherwise, inclusion of different data tables for different classes of drugs would lead to an unreasonably complex data model.
For use of a study database in live operation, constant updating of product data would be necessary. Therefore, an interface between the study database and the product database would have to be implemented.
As shown in the descriptive data analysis, both the quality of studies and the quality of study reports have improved in more recent clinical studies. In the set of studies used for testing, problems in data recording due to insufficient or missing data were found in older studies only.
Examples of problems in data recording due to shortcomings in data quality and presentation are missing study numbers or inconsistencies in content, such as differences in the numbers of parasites under consideration between study plans and final study reports. As only indications based on endpoints predetermined in study plans would be accepted, this may have an impact on the outcome of authorisation procedures. Missing values were found concerning the numbers of animals per treatment group. Also, in some older studies, only summary reports were available, lacking information on aspects such as inclusion or exclusion criteria. In any case, entry of data should be combined with a check of data quality, and it may be useful to have a text field in the administrative part of the data for noting problems in data quality.

Figure 1: Implementation of the entity-relationship model in MS Access; 1:n relations are shown as 1:∞ relations

Figure 2: Top Menu to open forms for Data Input and Queries

Figure 3: Number of studies per drug, comparison of mono-preparations and combination drugs

Figure 4: Number of studies used in one or several applications for approval

Figure 5: Proportion of blinded, randomised, GCP/GLP compliant studies per interval related to all studies conducted

The total number of animals (TotNumbAnim) was included in the central entity. Data_StudyAdminData compiles the quantitative study results (Data_Result). Irrespective of the type of study, all types share the central entity and the entity Data_Result.

Table 4: Distribution of studies per authorisation procedure
1. A second data field for study numbers should be added. Besides the study number assigned by the principal investigator, some studies were allocated further numbers by other parties involved. When searching on the basis of the study number (Query study details), both data fields should be accessed.
2. A duplicate check should be established in order to avoid repeated entry of identical studies. A duplicate check may be achieved by extended query functions (e.g. for total number of animals, TotNumbAnim, and for year(s) of study conduct, StudyYear1/StudyYear2).
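A duplicate check of the kind proposed in item 2 could look like the following sketch. It uses Python's sqlite3 module with a reduced version of Data_StudyAdminData; the function name possible_duplicates, the example rows and the choice to compare all three fields for exact equality are assumptions for illustration, not part of the prototype.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Data_StudyAdminData (
    Study_ID    INTEGER PRIMARY KEY,
    StudyNo     TEXT,
    TotNumbAnim INTEGER,
    StudyYear1  INTEGER,
    StudyYear2  INTEGER
);
-- Invented rows: studies 1 and 2 carry different study numbers
-- but agree in animal count and study years.
INSERT INTO Data_StudyAdminData VALUES
    (1, 'S-001', 20, 1990, 1991),
    (2, 'X-17',  20, 1990, 1991),
    (3, 'S-002', 48, 2004, 2004);
""")

def possible_duplicates(tot_numb_anim, year1, year2):
    """Return Study_IDs already stored with the same animal count and study years."""
    rows = conn.execute("""
        SELECT Study_ID
        FROM Data_StudyAdminData
        WHERE TotNumbAnim = ? AND StudyYear1 = ? AND StudyYear2 = ?
        ORDER BY Study_ID
    """, (tot_numb_anim, year1, year2))
    return [r[0] for r in rows]

# Check a study that is about to be entered against the stored records.
candidates = possible_duplicates(20, 1990, 1991)
```

A match only flags candidates for manual review, since two distinct studies may share animal numbers and years; records with missing values in these fields would not be found by an equality comparison and would need separate handling.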