Policy-Driven, Multimodal Deep Learning for Predicting Visual Fields from the Optic Disc and OCT Imaging

In a new paper published in the journal Ophthalmology, Lee Lab members Yuka Kihara and Aaron Lee report the development of a novel deep learning approach to address a key problem in glaucoma research and treatment: predicting visual function from structural changes detected on optical coherence tomography (OCT). Physicians rely on visual field testing to assess whether glaucoma patients will progress to more severe disease, but the test is notoriously challenging for patients and the results can be inconsistent. Advanced retinal imaging such as OCT can reveal key structural changes in the retina and optic nerve, but linking those structural changes to visual function outcomes has proven difficult with previous methods. The new approach provides a fully automated method for improving the prediction of standard automated perimetry sensitivity directly from structural imaging data and also enables artificial intelligence–derived structure–function mapping.

Figure 2. Prediction examples from the test set. The top of each panel shows the disc and OCT images used for predictions alongside the ground-truth Humphrey visual field (HVF). The bottom of each panel shows, in order, the predictions from the 2 submodels, the choice score of the policy network for each location, and the final prediction of the policy network. A, Both submodels predict a similar location of the defect, but the OCT-to-HVF submodel is more accurate in predicting the magnitude of glaucoma damage; the policy network correctly selects this prediction for the superior hemifield. B, The prediction from either individual submodel is wrong, showing either little damage or diffuse advanced loss; however, the policy network correctly selects predictions for each location to obtain a result very close to the ground truth, better characterizing the spared paracentral visual field. I = inferior; N = nasal; S = superior; T = temporal.

The authors employed a hybrid deep learning method that took advantage of two different types of retinal imaging data: circumpapillary OCT scans of the optic nerve head and infrared reflectance optic disc imaging. A separate deep learning model was trained for each imaging modality, and then a policy deep learning model was constructed to take the feature maps from both models and fuse the 2 models' predictions. The deep learning models predicted visual field pointwise sensitivities agnostically from the OCT and infrared reflectance images, and the predictions were further improved with the policy-based fusion of the 2 results. The deep learning models also generated structure–function maps that were compatible with established anatomic features and therefore were able to capture the functional consequence of relevant structural changes.
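To make the idea of per-location fusion concrete, here is a minimal Python sketch. It is an illustration only and not the authors' implementation: the function name fuse_pointwise, the 0.5 threshold, the soft-blending variant, and the toy numbers are all assumptions. The summary and figure caption say only that the policy network produces a choice score for each visual field location and selects between the 2 submodel predictions.

```python
from typing import List, Sequence


def fuse_pointwise(
    pred_oct: Sequence[float],
    pred_disc: Sequence[float],
    choice_score: Sequence[float],
    hard: bool = True,
) -> List[float]:
    """Fuse two pointwise sensitivity predictions (dB) using per-location scores.

    choice_score[i] in [0, 1] stands in for a hypothetical policy output at
    location i: values near 1 favor the OCT submodel, values near 0 favor the
    disc submodel. This convention is an assumption made for the sketch.
    """
    if not (len(pred_oct) == len(pred_disc) == len(choice_score)):
        raise ValueError("all inputs must cover the same visual field locations")

    fused = []
    for oct_db, disc_db, score in zip(pred_oct, pred_disc, choice_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("choice scores must lie in [0, 1]")
        if hard:
            # Hard selection: take one submodel's prediction per location.
            fused.append(oct_db if score >= 0.5 else disc_db)
        else:
            # Soft alternative: blend the two predictions by the score.
            fused.append(score * oct_db + (1.0 - score) * disc_db)
    return fused


# Toy example with four locations (values are made up for illustration).
oct_pred = [28.0, 12.0, 30.0, 5.0]
disc_pred = [27.0, 25.0, 20.0, 6.0]
scores = [0.9, 0.2, 0.5, 0.0]

hard_fused = fuse_pointwise(oct_pred, disc_pred, scores)
soft_fused = fuse_pointwise(oct_pred, disc_pred, scores, hard=False)
print(hard_fused)
print(soft_fused)
```

In the hard variant, each location takes the value from whichever submodel the score favors, which mirrors the "selects predictions for each location" behavior described in the figure caption. The soft variant is shown only as a contrast and is not claimed to match the published method.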

Visual field estimation from imaging allows the conversion of the information contained in OCT and infrared reflectance images into a more clinically meaningful format. This approach could enable structural data to be seamlessly integrated into analyses of visual field progression or into the visual field test itself, which could reduce between-visit measurement variability when following a patient over time. The latter is appealing because it might improve the power to detect disease progression in a clinical trial, and the authors plan to continue this work with that goal in mind.

Kihara Y, Montesano G, Chen A, Amerasinghe N, Dimitriou C, Jacob A, Chabi A, Crabb DP, Lee AY. Policy-Driven, Multimodal Deep Learning for Predicting Visual Fields from the Optic Disc and OCT Imaging. Ophthalmology. 2022 Feb 21.