Speech-To-Text
Overview
What is this project about?
ICS' Role
What is ICS' Role in the project?
Notes
-
background info to add to the text to speech summary
- also include these things in the summary
- per surgeon template for each surgery
-
test the tool on 3 procedures one template for each
-
needs to review the summary for inaccuracies
- how to review and correct
- ideally would not need to be edited
-
patient comes in
- do pre-op consult
- how, risks benefits etc
- pretty much a script
- Q&A
- recorded or realtime
- if realtime
- be part of discussion
- find images?
- take small clips and send for running transcription
- summary the transcript send to LLM
- send to AI
- review and correct if needed
- then print to document
- after patient goes home with summary
- do pre-op consult
-
HIPAA compliant
-
integrated with epic
-
installed on a specific device
- streaming transcript as translated
-
review page
-
save page
challenges
- open web ui does not do realtime text-to-speech, requires button clicking to handle sending audio
- not a great review/edit experience
Next Steps
- investigate building a demo env
- investigate other available tools
- send follow up email
- instructions on testing open web ui
-
- https://www.docker.com/products/docker-desktop/
- https://docs.openwebui.com/getting-started/#installing-open-webui-with-bundled-ollama-support
- Sign up with your information
- Go to admin - settings - models
- pull model llama3.2:1b
- go to audio and pull the base whisper model
- next steps
- eval open web ui and/or other tools
- create SOW for custom tool