In a systematic review of the scientific literature, we found 62 studies evaluating 41 marketed or under-development portable devices. The review identified very limited information on device performance (particularly in field settings) and major evidence gaps, such as which APIs and which medicine formulations each device can accurately test, how well the devices quantitate APIs in finished pharmaceutical products, and their ability to identify substandard medicines.
We included 11 devices in our study: four were evaluated in the laboratory only, and seven were also tested by 16 medicine inspectors from the Lao MRA in a field evaluation study. They comprised four handheld spectrometers using infrared (MicroPHAZIR RX, NIRScan) or Raman (Progeny, Truscan RM) spectroscopy; five portable devices using infrared spectroscopy (4500a FTIR, Neospectra 2.5), liquid chromatography (C-Vue), thin-layer chromatography (Minilab), or microfluidic technology with luminescence detection (PharmaChk); and two single-use disposable devices, one using a paper-based colour test (PADs) and one using lateral flow immunoassay technology (RDTs).
In the laboratory evaluation, all devices were tested on simulated and field-collected branded medicines containing seven different anti-infectives (within each device's capability to detect certain APIs). After removal of the samples from their packaging, all devices showed 100% sensitivity in correctly identifying samples containing 0% or the wrong API, except the NIRScan (91.5%). Specificities of 100% were observed for all devices except the C-Vue (60.0%), PharmaChk (50.0%) and Progeny (95.5%). The two devices with a stated ability to quantitate APIs showed high sensitivities in correctly identifying 50%/80% API samples in a pass/fail configuration (C-Vue: 100%; PharmaChk: 83.3%), whereas the RDTs, stated to be able to identify samples containing less API than stated, showed a sensitivity of 17%. The spectrometers included in the evaluation were not stated to be able to identify medicines with less API than stated using their stock built-in algorithms, and accordingly showed limited sensitivities (from 6% to 50%). Of the field-evaluated devices, the Minilab was the most sensitive in correctly identifying 50%/80% API samples in the laboratory evaluation (59.5%), with significantly higher sensitivity than the other devices (p<0.05), except the MicroPHAZIR RX (50%).
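As a reminder of how the sensitivity and specificity figures above are computed, a minimal sketch with hypothetical sample counts (not the study's actual data):

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: proportion of poor-quality samples correctly flagged as 'fail'."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """True negative rate: proportion of good-quality samples correctly passed."""
    return tn / (tn + fp)

# Hypothetical screening run: 40 samples with 0% or wrong API, all flagged
# (tp=40, fn=0), and 20 good-quality samples of which 8 were wrongly failed.
print(sensitivity(tp=40, fn=0))  # 1.0 -> reported as 100% sensitivity
print(specificity(tn=12, fp=8))  # 0.6 -> reported as 60.0% specificity
```

A device such as the C-Vue, with 100% sensitivity but 60.0% specificity, would thus catch every bad sample in this hypothetical set while wrongly failing a substantial share of good ones.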
Results in brief
The NIRScan was the fastest of the field-evaluated devices at testing one sample, followed by the MicroPHAZIR RX, whilst the PADs and the Minilab were the slowest. The time spent inspecting the pharmacy was significantly longer when using the devices compared with visual inspection alone, for all devices except the NIRScan and Truscan RM. The main errors made by medicine inspectors were selection of the wrong reference library when using the Truscan RM, NIRScan and MicroPHAZIR RX (the Truscan RM seemed less prone to this error) and incorrect user interpretation of the PADs and 4500a FTIR results. When testing a set of samples, the PADs showed lower accuracy than the other devices in correctly classifying samples as poor or good quality, except the Progeny and the Minilab (no statistically significant difference observed, p>0.05). A web-based reader of PADs results, currently under development, could reduce sample misclassification.
The Truscan RM had the highest fixed total costs over a 5-year period, followed by the Progeny, MicroPHAZIR RX, 4500a FTIR, NIRScan, and PADs. At the country level, all spectrometers were found to be cost-effective in settings with both 'high' and 'lower' prevalence of falsified and substandard antimalarials, and all were cost-effective compared with the baseline of visual inspection alone. The NIRScan, which had the lowest initial cost per device (below US$5,000), was the most cost-effective in both prevalence scenarios.
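The cost-effectiveness comparison above rests on relating each device's incremental cost to its incremental detections over the visual-inspection baseline. A simplified sketch of that ratio, using entirely hypothetical figures (the study's full cost model is not reproduced here):

```python
def cost_per_extra_detection(device_cost: float, baseline_cost: float,
                             device_detected: int, baseline_detected: int) -> float:
    """Incremental cost-effectiveness ratio: extra cost per additional
    substandard/falsified sample detected versus visual inspection alone."""
    return (device_cost - baseline_cost) / (device_detected - baseline_detected)

# Hypothetical: a spectrometer costing US$5,000 over 5 years vs US$500 of
# inspector time alone, detecting 90 vs 30 poor-quality samples in that period.
icer = cost_per_extra_detection(5000, 500, 90, 30)
print(icer)  # 75.0 US$ per additional detection
```

Under this framing, a lower device price (as for the NIRScan) directly lowers the ratio's numerator, which is consistent with it being the most cost-effective device in both prevalence scenarios.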
The main perceived obstacles to implementation of the field-evaluated spectrometers were the difficulty of assembling batches of quality-assured genuine medicines to create and update reference libraries, the high cost of most devices, maintenance and calibration requirements, and low sensitivity in identifying substandard medicines without highly trained operators using complex API-specific models. The main barriers to the use of the Minilab and PADs were sample preparation and the sourcing of consumables (for the Minilab only), and the level of training required and results felt to be too user-dependent (for the PADs only).
Recommendations and next steps
Although we provide general recommendations on the best strategy for choosing devices adapted to different settings, our work identified major evidence gaps: the level of training required; the effect of potential 'false confidence' in the device versus visual inspection of medicines; the best sampling strategies for field testing (standard operating procedures are required for different contexts in the absence of manufacturer guidelines); the APIs and medicine formulations each device is able to test (except for a few devices such as the Minilab or the PADs); the level of the supply chain at which the devices would be best used (we believe this is highly setting-dependent) and how the health system should adapt to optimise their use; and the impact of tablet coatings, packaging and capsule shells on the performance of spectrometers.
With the current evidence, it is unlikely that any one device would be able to effectively monitor the quality of all medicines.
Much more work is needed to evaluate devices against the great diversity of medicines, and to expand our work through a platform, independent of device manufacturers, that evaluates new devices using standard protocols and samples.