Dr. Klaus Mueller, a professor of computer science, and Dr. Eric Papenhausen, a 2017 computer science alumnus of Stony Brook University, have recently led their technology startup, Akai Kaeru, in applying an algorithm that can predict which counties will have a higher-than-average rate of deaths caused by COVID-19.
Akai Kaeru, Japanese for “red frog,” was created in 2016 and received funding from the National Science Foundation to perform visual data analytics research. Mueller has taught in the Stony Brook Department of Computer Science for over 20 years and has authored more than 200 research papers, amassing over 9,000 total citations.
The technology, developed by Mueller and Papenhausen, can visualize emerging correlations between demographic, economic and infrastructure-related attributes and the rate of COVID-19 deaths.
The team used public data collected from the United States Census Bureau, the Department of Health and Human Services and the Centers for Disease Control and Prevention (CDC), among others, according to Mueller and Papenhausen.
“It was in May that we had enough data and I started to see consistencies from the month prior in patterns that were predicting a high death rate in April and them still being valid in May,” Papenhausen said.
After analyzing data collected from 3,007 counties in the United States, Mueller and Papenhausen discovered discernible patterns contributing to higher-than-average rates of COVID-19 deaths in 985 counties, 80-90% of which are located in Mississippi, Georgia and Louisiana.
The first set of data analyzed consisted of sparsely populated counties with poor and aging populations. Residents of these counties had higher-than-average rates of COVID-19-related deaths.
The second set of data analyzed by the algorithm indicated that COVID-19 deaths were higher than average in counties with large numbers of sleep-deprived and under-educated residents who lacked health insurance. High COVID-19-related death tolls also appeared in a third set, which covered counties with low Asian populations but high minority populations that include Black children living in poverty.
The technology was originally created, before COVID-19, to generate and analyze financial data. That changed in March, when COVID-19 began to escalate in the United States and Mueller recognized the crucial need for his technology.
“When this COVID-19 data came along, we recognized that we could help the CDC and Health and Human Services become better informed about who gets the disease,” Mueller said.
Central to the algorithm is the analysis of county-wide data. “Counties have a certain DNA,” Mueller said. “Counties are more likely to be similar with respect to economic and demographic data.”
The patterns emerging from Mueller’s research may help to determine which counties will be affected by COVID-19 in the future. For example, suppliers such as mask manufacturers could benefit from knowing which counties may be strained for resources.
In the near future, Mueller and Papenhausen hope to launch a web browser that will allow the general public to access their technology. The software used to generate the COVID-19 data analytics can also be used for financial trading and for determining risk factors for diseases like strokes, demonstrating the software’s versatility. Those who are interested in using the software can schedule demo presentations with the startup team.
Dr. Arie Kaufman, Distinguished Professor and chair of the Stony Brook Computer Science Department from 1999 to 2017, spoke highly of Akai Kaeru’s efforts.
“[Mueller] is really a world expert in high dimensional feature analysis and that’s something that he’s used here very successfully,” Kaufman stated.
Kaufman, who is part of a larger pandemic data analytics project currently being pursued at Stony Brook University, explained how algorithms can constantly be retrained and refined, a process that is necessary for developing accurate predictions.
“Machine learning has a training component and a verification component,” Kaufman said. “The more data you have, the better the model because you can train the model. And the model learns from all of this data, and can do a good job.”
Kaufman has also been developing a machine learning classification model using segmented lung computed tomography (CT) images and a large data set containing thousands of patient data entities, including CT scans, X-rays and demographic data.
“This is actually a really intensive project and has a lot of machine learning algorithms that take the CT scan of the chest of the patient and segment the lungs, generating the correlation model between all 1500 parameters, and the COVID-19 outcome,” Kaufman said.
In addition to his classification model, Kaufman is currently developing an app that will facilitate self-quarantining for those who may be coming from micro hotspots, as an alternative to the general state-level restrictions imposed by governors. Micro hotspots can narrow the geospatial locations of outbreaks down to the street level.
“Instead of saying the entire state of Arizona is banned, we will characterize it by this micro hotspot,” Kaufman said.
For Akai Kaeru, however, the next goal is to spread the word about its technology. “I’m going to write a letter to Anthony Fauci,” Mueller said. “I believe he’s responsive to emails.”