Dataset Overview
There are two groups of datasets: de-identified data and retinal images.
The de-identified data cannot be traced to individuals so can be shared with any interested researcher. The retinal images can theoretically be traced to individuals, because the retinal vasculature has a unique pattern in each person, so these can only be shared with research groups at institutions which enter into a data-sharing agreement with Indiana University.
The data files and images provide demographic information and summaries from retinal scans and perimetric measurements from 506 people:
- 77 patients with glaucoma (PWG)
- 381 control subjects free of eye disease (CTR)
- 35 patients classed as glaucoma suspects (SUS)
- 13 patients who do not fit any of these categories (OTHER).
These people were tested as part of research funded by grants from the National Eye Institute: R01EY007716, R01EY024542, R01EY028135. Perimetric data were gathered with two devices, the Humphrey Field Analyzer and the Humphrey Matrix, using the 10-2 and 24-2 test patterns. Retinal images were gathered with a Spectralis optical coherence tomography (OCT) device, using the built-in circumpapillary and Posterior Pole scans as well as custom volume scans.
A. A set of spreadsheets is available to any researcher who is interested, and contains de-identified data:
- 482 people have data in these Excel files:
- 254 performed perimetry
- 438 had circumpapillary scans
- 434 had posterior pole scans
B. Supplemental files contain retinal images that, although de-identified, show the retinal vasculature, which can be used as a biometric identifier. These files can be made available to researchers at institutions that sign a confidentiality agreement with Indiana University.
- 366 people have results of custom retinal scans
- 227 had montages
- 159 had widefield scans
A. Spreadsheets with De-identified Data
The first spreadsheet is Master_Subject_List.xlsx, which provides an identification number and information (diagnosis, axial length, corneal curvature, refraction, race, ethnicity) for each subject in the other sheets and images. In order to provide information about the sequence of images and perimetric data over time, while not revealing personal health information (PHI) such as visit date, this file gives the age at study start date and the other files give the number of days from study start date. This means it is possible to determine for an individual person how many days occurred between different tests, which is important when looking for progression.
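As a minimal sketch of how these two quantities combine, the approximate age at a given test and the interval between two tests could be computed as follows. The function names and the 365.25-day year are illustrative assumptions, not part of the dataset.

```python
DAYS_PER_YEAR = 365.25  # assumption: average length of a year in days


def age_at_test(age_ss_date, days_since_ss):
    """Approximate age in years on the day of a test.

    age_ss_date   : age in years at the study start date (Master_Subject_List.xlsx)
    days_since_ss : days from the study start date to the test (other files)
    """
    return age_ss_date + days_since_ss / DAYS_PER_YEAR


def days_between(days_since_ss_first, days_since_ss_second):
    """Number of days elapsed between two tests for the same person."""
    return days_since_ss_second - days_since_ss_first
```

Because both values are relative to the same per-person reference date, the interval between tests is exact even though the calendar dates are not disclosed.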
The other three spreadsheets contain results of specific tests:
- Visual Fields 24-2_10-2_FDT_Other.xlsx
- Circumpapillary.xlsx
- Posterior Pole.xlsx
Master_Subject_List.xlsx
This list has demographic information for all people who were recruited and tested for any of the datasets or images. Some did not meet the criteria for specific studies, but are included for completeness.
- A: AgeSSdate gives the age in years at a reference date that is different for each person.
- B: ID number is an anonymous ID that is not related to any personal information but generated in the lab
- C: Client type indicates the diagnosis at time of enrollment. Sometimes a review of clinic data found that a person did not meet the criteria for the study, due to extreme refractive error or for an ocular or systemic condition. The code is CTR for control, PWG for patient with glaucoma, SUS for glaucoma suspect and OTHER for someone who did not meet the criteria for one of these three groups, in which case notes about the exclusion are in column D.
- D: Special Considerations gives information for a few people that affects their diagnostic status, such as traumatic brain injury (TBI) that caused a person to stop being considered a control, development of age-related macular degeneration (AMD) while being followed, or a clinic visit with IOP > 21 for the controls.
- E: Gender is how the person identified themself as male or female, without reference to sex assigned at birth.
- F & G: Race and Ethnicity are how the person answered a questionnaire that used the NIH classifications.
- H: Last Comprehensive Eye Exam is the number of days between the reference date and the most recent comprehensive eye exam (CEE) in our review of clinic records. There may have been a CEE since our review, so this is a minimum estimate. In some cases the number of days is negative which means that the most recent CEE in our records was before the reference date. Some published studies required that the most recent CEE for a control was within a specific number of years (e.g., 2 for older controls, 5 for younger controls). In a few cases the last CEE was more than 5 years before the reference date; these were people who volunteered as controls for pilot data but whose data were not used in published data due to the date of the CEE.
- I - L: Axial length (in mm) and corneal curvature (K-reading) for right (OD) and left (OS) eyes as measured with an IOLMaster.
- M-V: Refractive error for OD and OS at most recent CEE, in diopters (D). Spherical error is given as RX_OD_SPH & RX_OS_SPH. When there was cylinder it is given as RX_OD_CYL, RX_OS_CYL for diopters and RX_OD_AXIS, RX_OS_AXIS for axis in degrees; otherwise these cells are blank. Spherical equivalent is computed from sphere and cylinder as SPH_EQUILAV_OD, SPH_EQUILAV_OS. When add was prescribed for near vision, this is given as RX_OD_ADD, RX_OS_ADD; otherwise these cells are blank.
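The spherical equivalent in columns M-V can be reproduced with the conventional formula (sphere plus half the cylinder). The sketch below assumes that formula was used for the file and that a blank cylinder cell means zero cylinder; the function name is hypothetical.

```python
import math


def spherical_equivalent(sphere, cylinder=None):
    """Spherical equivalent in diopters: sphere plus half the cylinder."""
    # Assumption: a blank cylinder cell (None or NaN) means no cylinder.
    if cylinder is None or (isinstance(cylinder, float) and math.isnan(cylinder)):
        cylinder = 0.0
    return sphere + cylinder / 2.0
```

For example, a refraction of -2.00 D sphere with -1.00 D cylinder gives a spherical equivalent of -2.50 D.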
Visual Fields.xlsx
This has 4 sheets for different types of perimetric data, three sheets for data from the Humphrey Field Analyzer (HFA) and one sheet for data from the Humphrey Matrix.
- Sheet “HFA 24-2” has data for 3,687 perimetric tests from 212 people, for the HFA with the 24-2 test pattern and the SITA-Standard algorithm, or else the 24-2C test pattern and the SITA Faster algorithm.
- Sheet “10-2 HFA” has data for 161 perimetric tests from 48 people, for the HFA with the 10-2 test pattern and the SITA-Standard algorithm.
- Sheet “Other HFA” has data for 74 perimetric tests from 26 people, for the HFA with either the 30-2 test pattern, or the 24-2 test pattern with algorithms other than SITA-Standard.
- Sheet “Matrix 24-2” has data for 670 perimetric tests from 177 people, for the Matrix with the 24-2 test pattern.
In all four sheets the data are in decibel (dB) units, for which contrast sensitivity increases by 0.1 log unit per dB for HFA and 0.05 log unit per dB for Matrix. The 24-2 test pattern has 54 locations for HFA and 55 locations for Matrix, within 24° of fixation except for two nasal locations at 27°. The HFA has two other test patterns: 24-2C and 10-2. The 10-2 test pattern has 68 locations within 10° of fixation. The 24-2C test pattern combines features of the 24-2 and 10-2 test patterns, including all 54 locations of the 24-2 test pattern as well as 10 locations within 10° of fixation.
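Since a dB step corresponds to a different number of log units on the two devices (0.1 log unit per dB for HFA, 0.05 log unit per dB for Matrix), comparisons are easier after conversion to log units. The following is a minimal sketch; the dictionary and function names are illustrative assumptions.

```python
# Log units of contrast sensitivity per dB, as stated for each device.
LOG_UNITS_PER_DB = {"HFA": 0.1, "Matrix": 0.05}


def db_to_log_units(db, device):
    """Convert a value (or a difference) in dB to log units for the given device."""
    return db * LOG_UNITS_PER_DB[device]


def db_difference_as_ratio(db_a, db_b, device):
    """Ratio of linear contrast sensitivities implied by two dB values."""
    return 10 ** db_to_log_units(db_a - db_b, device)
```

Under these scalings, a 10 dB difference on the HFA and a 20 dB difference on the Matrix both correspond to 1 log unit, that is, a tenfold change in contrast sensitivity.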
For 24-2 and 10-2 test patterns, the visual field locations are organized in the same manner: starting at the top row with the nasal-most location, then proceeding horizontally in a temporal direction to the end of the row, then continuing from the nasal-most location in the next row, and so on. These are numbered L1, L2, L3 etc. for the sensitivities; then TD1, TD2 and so on for Total Deviation; then PD1, PD2 and so on for Pattern Deviation. For the HFA 24-2C test pattern, the values for the additional 10 locations are listed after the first 54 locations, which are identical to the 54 locations for the 24-2 test pattern.
- Columns A-P are the same for all 4 sheets:
- Column A has age at SS date
- Column B has days since SSdate when the test was administered
- Column C has the subject ID number
- Column D has the eye tested
- Column E has the time of day when the test was administered
- Column F has the type of test
- Column G has the pupil diameter – not recorded for Matrix
- Column H has the spherical trial lens used – not recorded for Matrix
- Column I has the duration of the test
- Column J has the number of times the person responded to a fixation loss trial
- Column K has the total number of fixation loss trials
- Column L has the fixation loss rate
- Column M has the false positive rate
- Column N has the false negative rate
- Column O has the Mean Deviation (MD) in dB
- Column P has the Pattern Standard Deviation (PSD) in dB
For the sheet “HFA 24-2”:
- Columns Q-CB have sensitivity for each location
- Columns CD-EE have Total Deviation (TD) for the first 54 locations
- Locations 26 & 36 are blank due to blind spot
- Columns EG-GH have Pattern Deviation (PD) for the first 54 locations
- Locations 26 & 36 are blank due to blind spot
- Column GJ has any clinical comments (e.g., “Lens Rim Artifact”)
- Column GK has the Visual Field Index (VFI) as percent
- Column GL has results for the Glaucoma Hemifield test (GHT)
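Because the sheets are documented by Excel column letters, it can help to convert letters to positions when reading the data programmatically. The helper names below are hypothetical; this is a sketch for checking the ranges, not a required way of reading the files.

```python
def col_to_index(letters):
    """Convert an Excel column label (e.g. 'CB') to a zero-based column index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def col_range(first, last):
    """Zero-based indices for an inclusive Excel column range such as Q-CB."""
    return list(range(col_to_index(first), col_to_index(last) + 1))


# Columns Q-CB span 64 columns: the 54 locations of the 24-2 test pattern
# plus the 10 additional locations of the 24-2C test pattern.
sensitivity_columns = col_range("Q", "CB")
```

The same check confirms that columns CD-EE and EG-GH each span 54 columns. Readers that accept Excel-style ranges directly (for example, the `usecols` argument of pandas `read_excel` accepts a string such as "Q:CB") make the conversion unnecessary.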
For the sheet “HFA 10-2”:
- Columns Q-CF have sensitivity for each location
- Columns CH-EW have Total Deviation (TD) for the 68 locations
- Columns EY-HN have Pattern Deviation (PD) for the 68 locations
For the sheet “Other HFA”:
- Columns Q-CF have sensitivity for each location
For the sheet “Matrix 24-2”:
- Columns Q-BR have sensitivity for each location
- Columns BT-DU have Total Deviation (TD) for the 54 locations
- Columns DW-FX have Pattern Deviation (PD) for the 54 locations
- Column FZ has sensitivity at the fovea
- Column GA has Total Deviation (TD) at the fovea
- Column GB has Pattern Deviation (PD) at the fovea
- Column GD has the number of false positive trials that were responded to
- Column GE has the total number of false positive trials
- Column GF has the number of false negative trials that were not responded to
- Column GG has the total number of false negative trials
- Column GI has results for the Glaucoma Hemifield test (GHT)
- Column GL has the percentile for MD
- Column GN has the percentile for PSD
The sheet “Test patterns” has the test locations for HFA 24-2 and 10-2, in degrees of visual angle:
- Column A gives location number for 24-2
- Columns B and C give the x values for 24-2 locations for right eye (OD) and left eye (OS)
- Column D gives the y values for 24-2 locations
- Column F gives location number for 10-2
- Columns G and H give the x values for 10-2 locations for right eye (OD) and left eye (OS)
- Column I gives the y values for 10-2 locations
Circumpapillary.xlsx
This file has RNFL thicknesses from scans in a circle around the optic disc, using standard software.
There are 4072 scans from 438 people.
- Column A has age at SS date
- Column B has days since SSdate when scan was taken
- Column C has the fixation target
- Column D has number of A-scans.
- Column E has the number of B-scans
- Column F has the subject ID number
- Column G has the eye tested
- Column H has the scan type
- Column I has the scan diameter in mm
- Column J has the scan diameter in degrees of visual angle
- Column K has 0 when diameter is fixed in degrees, and 1 when diameter is fixed in mm
- Column L has the time of day of the scan
- Column M has the software version
- Column N has the index of scan quality
- Column O has the number of scans averaged
- Columns P through V are the mean RNFL thickness values for Global (G) and the 6 sectors (T, TS, TI, N, NS, NI)
- Columns W-AC give the classifications of the RNFL thicknesses: Within Normal Limits (WNL, p > 5%), Borderline (BO, 5% > p > 1%), Outside Normal Limits (ONL, p < 1%)
- Columns AD-ADQ give the thicknesses for 768 locations around the optic disc
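The three classification codes in columns W-AC correspond to probability ranges relative to the device's normative data. As a minimal sketch of those thresholds, the hypothetical function below maps a percentile to a code; how values exactly at 5% or 1% are assigned is an assumption, since the text gives only strict inequalities.

```python
def classify_rnfl(p_percent):
    """Map a normative percentile (in percent) to the code used in columns W-AC."""
    if p_percent > 5:
        return "WNL"  # within normal limits
    if p_percent > 1:
        return "BO"   # borderline (assumption: exactly 5% falls here)
    return "ONL"      # outside normal limits (assumption: exactly 1% falls here)
```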
Posterior Pole.xlsx
This file has results from scans of a rectangle centered on the fovea and angled along the fovea-disc axis, using standard software.
There are three sheets:
- RNFL-GCL-PP gives ganglion cell layer (GCL) thicknesses
- ILM-RNFL-PP gives retinal nerve fiber layer (RNFL) thicknesses
- ILM-M-PP gives retinal thickness
Each sheet has the same structure:
- Column A has age at SS date
- Column B has when the scan was taken, as number of days after SSdate.
- Column C has the subject ID number
- Column D has the eye tested
- Column E has the time of day of the scan
- Column F has the index of scan quality
- Column G has the number of scans averaged
- Column H has the image type (these are all identical)
- Columns I and J give the upper and lower layers for computing thickness
- Column K gives type of grid (these are all identical)
- Columns L through BW give the mean thicknesses for the 64 boxes in the grid
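For plotting or spatial analysis, the 64 values in columns L through BW can be rearranged into a two-dimensional grid. The sketch below assumes an 8 x 8 grid stored row by row; the source states only that there are 64 boxes, so the grid shape and ordering should be checked against the column headers before use.

```python
def to_grid(values, n_rows=8, n_cols=8):
    """Rearrange a flat list of box thicknesses into rows of a grid.

    Assumption: values are ordered row by row (row-major); verify against
    the column headers in the spreadsheet.
    """
    if len(values) != n_rows * n_cols:
        raise ValueError("expected %d values, got %d" % (n_rows * n_cols, len(values)))
    return [list(values[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]
```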
B. Supplemental Files that Contain Retinal Images
There are three main folders:
1. Spectralis Imaging
The folder has three files:
- “SpectralisImaging.xlsx”
- “SpectralisMontages.zip”
- “SpectralisWidefieldImages.zip”
SpectralisImaging.xlsx
This is a summary file with two sheets, “Widefield” and “Montage”, which give information about the images in the zip files. Each sheet has four columns:
- “Subject” gives the subject ID number
- “Visit” gives the visit number
- “Eye” gives the eye tested
- “Days since ref” gives the number of days since the reference date that was used to compute the subject’s age
SpectralisMontages.zip
This has 648 folders for 227 people
The images are from OCT scans that were gathered using multiple fixation targets so that different portions of the retina were scanned, and then montaged together using i2K software to generate 1 image. There were 4 or 6 scans, depending on the protocol. The i2K software aligned the SLO images with rotation and translation, with no stretching or warping, then applied these to the OCT scans to make the montages.
The folders with images are named “ID_”+ ID number + “_” + visit number. Each of these has one or two subfolders (one if only one eye was tested), with “OD” or “OS” to indicate which eye.
Within the folder for an eye there is a folder “rotTrans” which contains two folders, “SLO” and “Montage”.
The “SLO” folder has the SLO images used for montaging, named “SLO_” + image number. It may also have a file “Single_disk_RPE_1” which shows reflectance at the level of the retinal pigment epithelium (RPE) and a circle to show location of the optic disc.
Most of the images are in the folder “Montages”, where each image name begins with ID number + “_” + visit number + “_” + Eye + “_Max_”, to indicate the person, visit and eye, and that in regions of overlap the maximum of the two pixel values was used. The rest of the name indicates the image. Most of the images show reflectance of the retina at different depths below the inner limiting membrane (ILM) in 4 µm steps from 0 to 160 µm, as indicated by digits at the end of the name: “_depth_0” for at the ILM, “_depth_12” for 12 µm below the ILM, “_depth_160” for 160 µm below the ILM. There are two sets of these images: the raw reflectance values are in images with “ILMimage” and attenuation coefficient values are in images with “NormalizedILM”. For both sets of images, the pixel values are in log unit steps from -2 to +2 log units.
The remaining images are:
- “GCLthickness” for thickness of the ganglion cell layer
- “GCLthicknessCorrected” for thickness of the ganglion cell layer with automated smoothing across segmentation errors.
- “retinalThickness” for retinal thickness from Bruch’s membrane to the ILM
- “Thickness” for thickness of the RNFL, with circles superimposed for locations of disc and fovea, and a line for location of the temporal raphe
- “Thickness_2” for thickness of the RNFL without the circles or lines
- “ThicknessCorrected” and “ThicknessCorrected-2” for these images with automated smoothing across segmentation errors.
- “RapheILM” shows the estimation for location of the temporal raphe (yellow line), based on attenuation coefficient values along the white lines at a range of angles and starting from the fovea (red circle)
- “raphePlot” shows average attenuation coefficients for the white lines, as a function of the angle of the lines.
- “RPEimage” shows reflectance at the level of the retinal pigment epithelium (RPE)
- “SLOimage” shows the montaged SLO images, with a red circle demarcating the fovea and a white circle demarcating the optic disc
- “Vesselimage” shows results of an attempt to outline the retinal vasculature
- “customSeg” shows the average of attenuation coefficients from 0 to 160 µm below the ILM, in a color scale where blue is low and red is high
Other files in this folder that end with “.mat” are Matlab files with numerical values for the thickness measures.
Files that end with “.txt” are about the montaging:
- “xforms-SLO_1-SLO_4.txt” or “xforms-SLO_1-SLO_6.txt” gives the transformation matrices used for the montage; the name has “4” or “6” according to the number of scans.
- “matches-SLO_1-SLO_6.txt” gives information about the regions of overlap
The folder “overlapPlot” has plots of the overlap
There are also folders for the individual scans used in the montage, which have the same naming conventions as for the montages.
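The naming convention for images in the “Montages” folder can be parsed to recover the subject, visit, eye and depth. The sketch below assumes numeric ID and visit numbers and a depth suffix written as “_depth_N”; the example file name is hypothetical and the pattern should be checked against the actual files.

```python
import re

# Depths below the ILM covered by the reflectance images: 0 to 160 µm in 4 µm steps.
DEPTHS_UM = list(range(0, 161, 4))

# Assumed pattern: ID_visit_Eye_Max_<image name>[_depth_<N>]
NAME_PATTERN = re.compile(r"(\d+)_(\d+)_(OD|OS)_Max_(.*?)(?:_depth_(\d+))?")


def parse_montage_name(stem):
    """Split an image name (without extension) into its components."""
    match = NAME_PATTERN.fullmatch(stem)
    if match is None:
        return None
    subject, visit, eye, image, depth = match.groups()
    return {
        "subject": subject,
        "visit": visit,
        "eye": eye,
        "image": image,
        "depth_um": int(depth) if depth is not None else None,
    }
```

For a hypothetical name such as “1006_1_OS_Max_ILMimage_depth_12” this returns subject “1006”, visit “1”, eye “OS”, image “ILMimage” and a depth of 12 µm; names without a depth suffix return a depth of None.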
SpectralisWidefieldImages.zip
This has 364 folders for 159 people
These 55° x 40° images are from OCT scans that were gathered using a Widefield lens and central fixation target. The images are interpolated so that each pixel represents a square region 1/14° wide.
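The stated sampling of 1/14° per pixel implies the image dimensions and allows distances in pixels to be converted to degrees of visual angle. This is a minimal sketch of that arithmetic; the function names are illustrative, and the resulting size is derived from the description rather than read from the files.

```python
PIXELS_PER_DEGREE = 14  # each pixel represents a square region 1/14° wide


def widefield_size_pixels(width_deg=55, height_deg=40):
    """Image size (width, height) in pixels implied by the angular extent."""
    return (width_deg * PIXELS_PER_DEGREE, height_deg * PIXELS_PER_DEGREE)


def pixels_to_degrees(n_pixels):
    """Convert a distance in pixels to degrees of visual angle."""
    return n_pixels / PIXELS_PER_DEGREE
```

Under this sampling a 55° x 40° image is 770 x 560 pixels, and a box 28 pixels wide covers 2°.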
The same file structure and naming conventions are used as in the Montages folder, with these exceptions:
Within the folder for an eye there is a folder whose name begins with “AC_” followed by a number that is unique to the image but not used elsewhere. This has all of the JPEG images listed in “Montages”, but not the .mat or .txt files, the files about montaging, or any subfolders. It also has four more images, “ILMimage”, “ILMSsegCorrected”, “NFLImage” and “NFLsegCorrected”, which were used in the internal computations but have no external meaning.
2. Quantification of Reflectance
This folder contains 3 subfolders generated by custom software that computes attenuation coefficient (AC) values for square regions of images. The first step is to identify pixels that belong to the retinal vasculature and remove those from the computation. Then the remaining pixels are converted from logAC to linear AC, and the average of these pixels is computed and recorded in the corresponding cell in the file “output.xls”. The width of the square (in pixels) is given in cell B3 of this file; in all cases a width of 30 pixels was used, and for the Widefield images widths of 5, 10, 14 and 28 pixels were also used. When it was not possible to use square regions at the bottom edge and right edge of the image, rectangles were used with the remaining pixels. The squares (or rectangles) are shown overlaid on the image in the “scan_box” image, numbered by height (H) and width (W) in the format H1W7 for the 7th square in the first row, although this label was often truncated when it was too large for the square.
In columns A and B of the sheet “Main Output” of the output.xls file, H_num gives row number and W_num gives column number. Column C, “Rectangle Size”, gives the number of pixels in the square (or rectangle); these are shown in the same arrangement as the image in the sheet “Rectangle”. Column D, “Vessels”, gives the number of pixels considered to be vasculature; these are shown in the same arrangement as the image in the sheet “Vessels”. Column E, “Percentage”, expresses this as the fraction of pixels in the square assigned to vasculature; these are shown in the same arrangement as the image in the sheet “Percentage”. The remaining rows give the mean value for AC of the pixels not assigned to vasculature, for depths 0 to 160 µm below the inner limiting membrane (ILM); these are shown in the same arrangement as the image in the sheet “logAC_Depth_N”. Note that the labels for the rows and sheets at different depths have the label “logAC” but the values are actually linear AC.
Because the images code black as -2 log units, the smallest value a pixel can have for AC is 0.01, so when an entire square is black the average is 0.01.
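The averaging procedure described above can be sketched as follows. This is not the custom software itself: the function name, the list-of-lists inputs, and the handling of a box that contains only vessel pixels (returned as None) are assumptions made for illustration.

```python
def box_means(log_ac, vessel_mask, box):
    """Mean linear AC per box, excluding pixels flagged as vasculature.

    log_ac      : 2-D list of log10 AC values (rows of pixels)
    vessel_mask : 2-D list of booleans, True where a pixel is vasculature
    box         : box width in pixels; edge boxes shrink to rectangles
    """
    n_rows, n_cols = len(log_ac), len(log_ac[0])
    results = {}
    for h, top in enumerate(range(0, n_rows, box), start=1):
        for w, left in enumerate(range(0, n_cols, box), start=1):
            # Convert logAC to linear AC before averaging, skipping vessel pixels.
            values = [
                10 ** log_ac[r][c]
                for r in range(top, min(top + box, n_rows))
                for c in range(left, min(left + box, n_cols))
                if not vessel_mask[r][c]
            ]
            # Assumption: a box with no usable pixels is reported as None.
            results["H%dW%d" % (h, w)] = sum(values) / len(values) if values else None
    return results
```

The keys follow the H1W7 labeling used in the “scan_box” image. Averaging is done on linear values, so a box that is entirely black (-2 log units) yields 0.01.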
The first folder, “6-scan Montages”, analyzes OCT images that were gathered using multiple fixation targets so that different portions of the retina were scanned, and then montaged together using i2K software to generate 1 image. The second folder, “Optic Disc Image”, analyzes only a single optic disc image. The last folder, “Widefield Images”, analyzes a single image of a retinal region 55° wide and 40° tall (in degrees of visual angle) with the person fixating in the center.
The folder “Widefield Images” includes subfolders for analysis with several box sizes. There are 311 images analyzed for each of the 5 box sizes (5, 10, 14, 28, and 30). The other two folders, “6-scan Montages” and “Optic Disc Image”, are not organized into subfolders by box size. The box size used for the analysis can be found in the “output.xls” file for each image. The “6-scan Montages” folder contains analysis for 125 images, while “Optic Disc Image” contains analysis for 110 images.
Folder structure:
There is a subfolder for each image analyzed, named with the format “IDnum _VisitNum_Eye_Box” (e.g., “1006_1_OS_Box”). Nested within that folder is another subfolder called “1 Scan”.
Details of the files found in the folders “IDnum _VisitNum_Eye_Box->1 Scan” are as follows:
- “IDnum _VisitNum_Eye_1 Scan_box.jpg”: Box grid overlaid on image
- “IDnum _VisitNum_Eye_1 Scan_original.jpg”: Original image
- “IDnum _VisitNum_Eye_1 Scan_sets.jpg”: Four images side by side: the original image on the left, then the pixels identified as vasculature at two spatial scales (middle two images), and then these pixels combined (right image)
- Output.xls: Quantitative attenuation coefficient data, described in detail in the first paragraph above
- Process subfolder: This has intermediate steps used in identifying pixels as vasculature
- “IDnum _VisitNum_Eye_1 Scan_artery.jpg”
- “IDnum _VisitNum_Eye_1 Scan_binary.jpg”
- “IDnum _VisitNum_Eye_1 Scan_binary2.jpg”
- “IDnum _VisitNum_Eye_1 Scan_blur.jpg”
- “IDnum _VisitNum_Eye_1 Scan_combine.jpg”
- “IDnum _VisitNum_Eye_1 Scan_diff.jpg”
- “IDnum _VisitNum_Eye_1 Scan_out.jpg”
- “IDnum _VisitNum_Eye_1 Scan_overall.jpg”
3. Effects of Shape of the Eye
This has two subfolders:
Test-Retest for 15° x 30° disc B-scans
- This contains an Excel file and a folder
Subject OD, visit and Eye for Test-retest B-scans.xlsx
- This file gives information about the images:
- “Subject” gives the subject ID number
- “Visit” gives the visit number
- “Eye” gives the eye tested
B-scans
The folder contains 20 subfolders, for 2 sets of scans of 10 eyes (one eye per person). Each subfolder has the name ID number + “_” + visit number + “_” + eye.
Within each subfolder are TIFF files for 145 vertical b-scans that were 30° long, with a fixation target at 15° horizontal so that the scan covered the optic disc, with the 145 b-scans covering a horizontal region 15° across. Each subfolder also contains a Matlab file with data for the segmented scans. Each b-scan image is displayed in horizontal view, but the scan was actually vertical.
The en face images for these scans are in the Spectralis Imaging.zip “Montages” folder which has all the 6-scan montages.
The mean reflectance values are given in the Spectralis Imaging.zip “All Images Output” folder, in the “Optic Disc Image” subfolder.
Vary Scan Angle
This contains an Excel file and three subfolders. There are three “visits” per eye, all on the same day with dilated pupil and three different pupil entries to tilt the b-scan (each pupil entry is a different “visit”). The goal was to image the same part of the retina, although in some cases cyclorotation causes a small change in location.
Subject OD, visit and Eye for Varying Scan Angle.xlsx
This file gives information about the images:
- “Subject” gives the subject ID number
- “Visit” gives the visit number – representing pupil position
- “Eye” gives the eye tested
Widefield images - rectangular pixels
This folder has images from 55° x 40° scans, as described above for Spectralis Widefield Images.zip except that the pixels represent rectangular regions, where a region that is 1° wide is covered by 154 pixels horizontally and 16/6 pixels vertically.
Tilted B-scans
This folder has b-scans with an SLO image to show the location for each b-scan. This shows the tilt due to change in scan angle.
Quantified Output
This folder has mean reflectances, in the same format as described above for “2. All Images Output.”