Newsletter Article – Digital Data Collection – Experiences from Uttar Pradesh

By Sitaram Mukherjee, Mudita Tiwari

In partnership with FINO, CMF is evaluating a doorstep banking service. FINO has a network of Business Correspondents, or Bandhus, who open no-frills savings accounts with the Union Bank for clients. They operate in areas where formal banking services are not available and where clients belong to low-income groups.
The Bandhus use hand-held terminals to conduct savings and withdrawal transactions. The baseline data collection was conducted in the Varanasi and Azamgarh districts of Uttar Pradesh to gather household information, income and expenditure habits, and loans and savings information. The clients were also provided basic Financial Literacy Education by FINO. The endline is currently underway, and its most innovative aspect is digital data collection on notebooks and tablets. Currently, 12 notebooks and 12 tablets are being used to survey a sample of 3,000.

This is the first CMF project where digital data collection has been used extensively to conduct a survey, and it has brought interesting challenges. We want to highlight some key lessons:
Training: A dedicated team with prior field experience and basic computer knowledge should be hired. It is advisable to use local flyers and newspapers for recruitment. Each surveyor received specific training in using the tablet and the notebook. Surveyors could choose the device they wanted to work on, and each worked on the device that matched their skill set. Two weeks of extensive training covered use of the devices, the questionnaire and even mock calls.
Basic infrastructure: Setting up infrastructure to support electronic devices is important. Backup generators were required in Varanasi so that the devices could be charged, as the supply of electricity is limited. Voltage fluctuations can also damage devices, so protecting the devices is of utmost priority. Devices need to be monitored daily, and issues with them should be addressed without delay. Data security is also vital: data from the notebooks had to be downloaded daily, and backups were maintained regularly.
Robust Software: Software design for the data entry devices should commence as early as possible; at least one month should be spent developing the software and another month testing it. The following must be kept in mind for the software:
a.    Questionnaire timeline and piloting- Ideally the questionnaire should be finalized before the software is developed. The questionnaire should therefore be piloted rigorously to ensure the questions are pertinent, clear and concise, logical patterns are followed and the appropriate skips are in place.
b.    Questionnaire size and complexity- The software should be able to handle the size and the complexity of the questionnaire, particularly when the survey makes use of many grids (such as household rosters).
c.    Interface – Interface should be user-friendly.  The developers may use tooltips and help buttons to provide instructions.
d.    Software stability- The software should be stable and data loss should not occur in case the device stops working. An auto-save feature should be in place. If the devices are unusable for any reason, backup devices should be available, and should be easily configurable.
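The auto-save idea in point (d) can be illustrated with a minimal Python sketch. It is not the software used in the project; the function names, the JSON file format and the question identifiers are all hypothetical. The sketch saves the partial survey after every answer and writes through a temporary file, so that a device failure in the middle of a write does not corrupt what was already saved.

```python
import json
import os
import tempfile


def autosave(responses, path):
    """Write the partial survey to disk (hypothetical helper, not project code)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first, then swap it in, so an interrupted
    # write cannot leave a half-written survey file behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump(responses, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())
    os.replace(tmp_path, path)


def load_partial(path):
    """Return previously saved answers, or an empty dict for a new survey."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def record_answer(responses, question_id, answer, path):
    """Store one answer and save immediately, so at most one answer is lost."""
    responses[question_id] = answer
    autosave(responses, path)
```

When the device restarts, calling load_partial on the same file would let the surveyor resume from the last saved answer rather than starting the interview again.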
Piloting: The devices should be used to conduct pilots in the field only after the questionnaire has been piloted separately and classroom training on the devices and the questionnaire has been provided to the field staff. The device pilot assesses how familiar the field staff are with the devices and solidifies their data entry skills. Piloting can continue until the data quality is acceptable. Any changes to the software should be applied to every device, and the software team should ensure that the database structure and the software version are consistent across all devices.
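One way to check that every device carries the same software version and database structure is sketched below in Python. This is only an illustration under stated assumptions: each device is assumed to report a software version and a database (schema) version, and the field names and device identifiers are hypothetical.

```python
def find_outdated_devices(devices, expected_software, expected_schema):
    """Return the ids of devices that do not match the expected versions.

    Each device is a dict with the hypothetical keys "device_id",
    "software_version" and "schema_version".
    """
    outdated = []
    for device in devices:
        matches = (
            device["software_version"] == expected_software
            and device["schema_version"] == expected_schema
        )
        if not matches:
            outdated.append(device["device_id"])
    return outdated
```

Running such a check after each software change, and before the devices go back to the field, would flag any device that missed an update.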
Data Validation and Accuracy: Once the data collection starts, it is essential that the supervisors and monitors be involved in accuracy checks of the responses.  The issues are:
Back-checks – Supervisors and monitors should conduct a 100% back-check and ensure that the logical patterns are followed and the skips are appropriate. They must go to the client’s house to compare the responses. If the error rate is below 50%, the errors are resolved in the field and the new answers are recorded. Otherwise, the entire survey is conducted again. The error rates are recorded for each surveyor so that surveyors can be retrained if required. To keep the quality checks random, surveyors are never told in advance which questions will be back-checked.
Machine Issues – There are incidents when the machines freeze, shut down or run out of charge in the middle of a survey. Therefore, it is critical that the data be saved regularly, preferably after each question or section, so that the data for the entire survey is not lost. Paper surveys may be given to the surveyors as a backup so that they can continue the survey if the machine shuts down. These answers can then be entered when the device is operational.
Data Backup – Each night the data from all devices is merged and either sent automatically to a central server for backup (tablets only) or uploaded manually to a backup server (notebooks only). The data can then be downloaded and analysed for accuracy.
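The back-check rule described above can be expressed as a short Python sketch. It rests on an assumption: the article does not define how the error rate is calculated, so the rate here is taken to be the share of back-checked questions whose answers differ from the surveyor's original entry. The function names and question identifiers are hypothetical.

```python
def error_rate(original, back_check):
    """Share of back-checked questions whose answer differs from the original.

    Assumed definition; both arguments map question ids to answers.
    """
    if not back_check:
        raise ValueError("no questions were back-checked")
    mismatches = sum(
        1 for question, answer in back_check.items()
        if original.get(question) != answer
    )
    return mismatches / len(back_check)


def back_check_action(rate, threshold=0.5):
    """Below the threshold, fix errors in the field; otherwise redo the survey."""
    return "resolve_in_field" if rate < threshold else "resurvey"


def surveyor_error_rates(results):
    """Average error rate per surveyor from (surveyor_id, rate) pairs."""
    totals = {}
    for surveyor_id, rate in results:
        total, count = totals.get(surveyor_id, (0.0, 0))
        totals[surveyor_id] = (total + rate, count + 1)
    return {sid: total / count for sid, (total, count) in totals.items()}
```

For example, if one of four back-checked answers differs, the rate is 0.25 and the errors would be resolved in the field. The per-surveyor averages can then be reviewed to decide who needs retraining.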