0:00
Hello. My name is Haifeng Wang, and I'm an assistant professor in the Industrial and Systems Engineering Department at Mississippi State University. It's my great pleasure to talk about our research. Today, my presentation is about the integration of machine learning with biomedical data.
0:20
I will talk about the challenges of using biomedical data, and then move to several examples of work that has been conducted in our lab. Then I will finish the presentation with conclusions and a few references from our publications.
0:37
As you all may know, machine learning is currently a very popular method for biomedical applications. This slide shows the typical process of building a machine learning model for medical analysis. When we collect the data, it can come in different formats. It could be a signal, such as the heart rate. It could be an image, like an MRI of the brain or a CT scan of the lung. It could be a DNA sequence, an RNA sequence, or proteins. Starting from the raw data, the first step is usually to define the features. These could be clinical features, such as the tumor size, the tumor shape, or a signal pattern that relates to an irregular heart rate. They could be statistical features, such as the average, standard deviation, and skewness, which are based purely on statistical measures. They could also be features extracted by other machine learning models. After we gather those features, we typically prepare the data in a tabular format, so that we have columns and rows, and each row represents a patient. The next step is usually feature preprocessing. The main aim of feature preprocessing is to remove noise and identify outliers, and it also covers feature selection and feature extraction. If the dimension is too high, we reduce the dimension. If we have too many features, we try to figure out which features are valuable and which features do not provide much information. After feature preprocessing, we have a dataset that is in good shape, and we input it into a model. Model development is essentially about deciding which model we are going to use: whether this is a classification problem or a regression problem, or whether this is unsupervised learning, where we have no labels at all and just want to group the data, for example to identify different groups of patients.
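To make the feature definition and tabular steps concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the pipeline used in our lab: the heart-rate readings, the patient identifiers, and the helper name `statistical_features` are all hypothetical.

```python
import statistics


def statistical_features(signal):
    """Summarize one patient's signal as mean, standard deviation, and skewness."""
    n = len(signal)
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # population standard deviation
    if std == 0:
        skew = 0.0  # a constant signal has no asymmetry
    else:
        # Population skewness: the average cubed z-score.
        skew = sum(((x - mean) / std) ** 3 for x in signal) / n
    return {"mean": mean, "std": std, "skewness": skew}


# Hypothetical heart-rate readings (beats per minute), one list per patient.
raw_signals = {
    "patient_a": [72, 75, 71, 74, 73],
    "patient_b": [60, 62, 61, 110, 63],
}

# Tabular format: one row per patient, one column per feature.
table = [
    {"patient": pid, **statistical_features(sig)}
    for pid, sig in raw_signals.items()
]

for row in table:
    print(row)
```

In this toy data, the single high reading for the second patient pulls the skewness above zero, which is the kind of value that an outlier check during feature preprocessing might flag.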
Nowadays, deep learning is very popular, so which deep learning model can we use? For some deep learning models, such as a convolutional neural network or an LSTM, we can even skip the feature definition and feature preprocessing steps and directly use the raw data as input to the model. The aim at this model development step is to get a high-performance model. What we mean by a high-performance model depends on the application. We could focus on accuracy and the AUC. We could focus on efficiency. We could focus on trustworthiness and explainability. We could focus on security. It still depends on the problem, and we want a high-performance, customized model that reaches the goal. Usually these three steps, feature definition, preprocessing, and model development, are done in the lab. Then, after we have the model ready, we need several iterations and different experiments to test whether this model can be used. Usually in this step, we need to integrate the model with the existing infrastructure to get a decision support system. For example, if we develop a machine learning model to detect cancer, we might integrate this model with the existing medical image analysis software so that we can click a button and use the machine learning model to make a prediction. The final product is an expert system. In this process, there are actually many challenges that machine learning models need to solve.
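As a small illustration of two of the performance measures mentioned above, accuracy and the AUC, here is a sketch in plain Python. The labels, scores, and the 0.5 decision threshold are made up for illustration, and the function names are hypothetical. The AUC is computed here as the fraction of positive and negative pairs that the model ranks correctly, which is one standard way to define it.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case scores above a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count correctly ordered pairs; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = disease present) and model scores for four patients.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while the AUC only depends on how the scores rank the patients, which is one reason the two are often reported together.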

Integration of machine learning with biomedical data