Final project

Table of contents

  1. Introduction
  2. Public healthcare datasets
  3. Project proposal
  4. Project mid report
  5. Poster presentation
  6. Final report and code
  7. What we are looking for in a final project:

Introduction

The course requires students to complete a semester-long group project focused on healthcare and AI. The project is a significant component of the course evaluation and consists of four deliverables: a project proposal, a mid-report, a poster presentation, and a final report.

Forming groups is mandatory. Students may form their own groups. However, if you need help, the instructor will run a survey of student interests and share the results so that it is easier for you to find potential group members. For all project deliverables, one submission per group is sufficient.

Public healthcare datasets

Here is a list of publicly available health-related datasets of different types (e.g., imaging, signals, omics, EHR, wearables). Some of these require additional approvals, as is often the case with healthcare data, so please check how soon you can get access before committing to a project on one of them. You are not obligated to use a dataset listed here; you can always choose another dataset that you find or already have access to.

Project proposal

The project proposal accounts for 10% of the final project grade. By the proposal deadline, students should have gained access to their data and be able to demonstrate an in-depth understanding of their dataset. They must also have completed a literature review related to the project and established clear and achievable project goals. Here is a template you can use to write your project proposal.

Project mid report

The mid-report constitutes 20% of the final project grade. Students must present preliminary results and show appropriate progress since the project proposal. A template for the mid-report is available here. Some sections of the template need to be filled in only if you want to change your original proposal.

Poster presentation

Final projects are presented as posters. The poster presentation constitutes 30% of the final project grade.

  1. Please use this template to design your posters. The template has information on what you need to include in the poster. You can modify the template as you see fit (as long as you describe all the requested information in some way). Please do not change font sizes or the final print size.
  2. You will use the CSE-IT poster printing service (free of charge) to print the poster. Please submit the print job using this request form at least two business days ahead of the day of presentation. Mention CSENG CS&E Administration (11108) as the department.
  3. The presentations will be evaluated based on a number of aspects, including your understanding of the clinical problem, whether the solution addresses the problem at hand, whether the evaluations are reasonable, whether the results support the expected outcomes, how the group worked together as a whole, and the clarity in poster design and presentation. Here is the detailed rubric. Your posters and presentations will be evaluated by peers from the class and external graders.

Final report and code

The final report accounts for 40% of the final project grade. We recommend creating a git repo with your code and including a link to it in the report. Alternatively, you can submit a zip file containing your report and code. Please follow this template for your final report.

What we are looking for in a final project:

  1. Your understanding of the clinical problem and its significance: simply downloading a healthcare dataset without a clear understanding of the problem is not recommended. I am assuming you are taking this class because you have some interest in healthcare. So think carefully about a problem that you are interested in. See if you can find a dataset that will help solve the problem of interest. If you cannot find a dataset, then see if you can revise your problem definition to fit a dataset that is available.

  2. Whether you are able to formulate a machine learning-based solution to the problem: think about how you can convert the problem into a machine learning question. For example, if you are thinking about optimizing resource allocation for COVID-19, then the ML question could be predicting COVID-19 severity from some type of data (e.g., chest X-rays).

  3. The ML model you are using: once you define the machine learning task, think about the approach. We have been looking at many different ML approaches in class (and we will look at more in the future). Feel free to choose any of them as long as the model can answer the question you are asking. Note that I am not looking for a very sophisticated ML approach. You can choose any ML algorithm as long as you do a thorough job in model training and evaluation. Keep in mind that you may sometimes have to perform some initial analysis of the data to understand what ML model could fit the problem. You also have to think carefully about your dataset size, the difficulty of implementing the ML model, the computational resources you have, the amount of time you have, etc. Remember that you only have two months and you probably have other deadlines too.

  4. Whether you are able to analyze the model performance, explain why you are getting what you are getting, make modifications to improve performance, and/or discuss its limitations: the ML model is not going to give you the perfect result on the first try (and it may never do so). So you may have to inspect what is happening during training, testing, etc. I will look at what sort of approaches you took to understand the results and to remedy any issues. This could involve visualizing the data, inspecting model learning curves, looking at different evaluation metrics, etc. It could also involve comparisons with other ML approaches, but this is not required.
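
To illustrate what "looking at different evaluation metrics" in item 4 can mean, here is a minimal sketch in plain Python. It assumes a binary task (for example, severe vs. not severe) with labels coded as 1 and 0. The function names and the toy labels are hypothetical and made up for illustration; they do not come from any course dataset, and this is not a required workflow.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Return NaN instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred):
    """Compute several complementary metrics from the same predictions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),  # recall on positive cases
        "specificity": safe_div(tn, tn + fp),  # recall on negative cases
        "precision": safe_div(tp, tp + fp),    # positive predictive value
    }


# Hypothetical, imbalanced test set: 2 positive cases out of 10 (toy data).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

On this toy example the accuracy is 0.80, which looks reasonable, but the sensitivity is only 0.50: the model misses half of the positive cases. With imbalanced clinical data, this kind of gap is the reason a single metric is rarely enough.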

One way to make things easier is to look at publications on the topic you are interested in and see what they have done. It is perfectly OK to reproduce what has been done in a published paper. But remember that you still have to internalize the problem and think carefully about the solution and the results.

If your project idea needs some revision, you will hear from the instructor after you submit the proposal. We will provide some tips on how to reformulate the problem, what other analyses you could do, etc. Also, you are not strictly bound to the things that you propose in a proposal. You have the flexibility to change your project scope as long as the changes are within reason and you keep us in the loop.

We understand that some students have more experience with independent projects than others, and we will take this into account during grading. But please get in touch as soon as possible if you are struggling to make progress. Our advice: start early, make steady progress, and ask for help if needed.